Put comments in between multi-line statement (with line continuation)
When I write the following PySpark command:
# comment 1
df = df.withColumn('explosion', explode(col('col1'))).filter(col('explosion')['sub_col1'] == 'some_string') \
    # comment 2
    .withColumn('sub_col2', from_unixtime(col('explosion')['sub_col2'])) \
    # comment 3
    .withColumn('sub_col3', from_unixtime(col('explosion')['sub_col3']))
I get the following error:
.withColumn('sub_col2', from_unixtime(col('explosion')['sub_col2']))
^
IndentationError: unexpected indent
Is there a way to write comments in between the lines of multiple-line commands in pyspark?
1 Answer
This is not a PySpark issue, but rather a violation of Python syntax.
Consider the following example:
a, b, c = range(3)
a +\
# add b
b +\
# add c
c
This results in:
a +# add b
^
SyntaxError: invalid syntax
The \ is a continuation character, and Python treats the next physical line as part of the same logical line. A comment on that next line runs to the end of the line, so the statement ends right there, which causes your error.
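To compare the two variants without typing each into a REPL, here is a minimal sketch that asks Python's built-in compile() whether a snippet parses. The helper name parses and the variable total are mine, chosen for illustration; they are not part of the question.

```python
import textwrap


def parses(source):
    """Return True if `source` compiles as Python code, False on a syntax error."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


# Raw string so each backslash stays a literal line-continuation character.
with_backslashes = textwrap.dedent(r"""
    a, b, c = range(3)
    total = a +\
        # add b
        b +\
        # add c
        c
""")

# Same sum, wrapped in parentheses instead of using backslashes.
with_parentheses = textwrap.dedent("""
    a, b, c = range(3)
    total = (a +
             # add b
             b +
             # add c
             c)
""")

print(parses(with_backslashes))   # False
print(parses(with_parentheses))   # True
```

Inside parentheses (or brackets, or braces) Python joins lines implicitly, and comments are allowed on those lines, which is why the second variant compiles.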
One way around this is to use parentheses instead:
(a +
 # add b
 b +
 # add c
 c)
When assigning to a variable this would look like
# do a sum of 3 numbers
addition = (a +
            # add b
            b +
            # add c
            c)
Or in your case:
# comment 1
df = (df.withColumn('explosion', explode(col('col1')))
      .filter(col('explosion')['sub_col1'] == 'some_string')
      # comment 2
      .withColumn('sub_col2', from_unixtime(col('explosion')['sub_col2']))
      # comment 3
      .withColumn('sub_col3', from_unixtime(col('explosion')['sub_col3'])))
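The same pattern can be tried without a Spark session. The sketch below chains plain str methods as a stand-in for the DataFrame calls (the sample string and the variable name cleaned are made up for illustration). The point is only that comments may sit between chained calls once the right-hand side is wrapped in parentheses.

```python
# Only the right-hand side goes inside the parentheses; the assignment stays outside.
cleaned = (
    "  Some_String, 42  "
    # comment 1: drop surrounding whitespace
    .strip()
    # comment 2: normalise case
    .lower()
    # comment 3: split into fields
    .split(", ")
)

print(cleaned)  # ['some_string', '42']
```

Note that writing (cleaned = ...) with the assignment itself inside the parentheses is a syntax error, so keep the variable name and the = sign outside.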
The thing is you can't do assignment within parentheses
– ira
Aug 24 at 6:45
The correct way would be addition = (a + b + c), or in my case df = (df.withColumn... )
– ira
Aug 24 at 9:43
Thank you. For some reason, I am getting:
SyntaxError: invalid syntax
– ira
Aug 23 at 15:06