I'm fairly inexperienced with Python, and I'm having trouble getting some code running.
counts = {key:len(list(group)) for key, group in it.groupby(sorted(topics))}
That line will run in pyspark (interactive mode), but if I attempt to spark-submit it I get a SyntaxError exception. The following code is equivalent and does run in both cases:
counts = {}
for key, group in it.groupby(sorted(topics)):
    counts[key] = len(list(group))
Can anyone tell me why the first version doesn't work with spark-submit? If it makes a difference, the code is executed inside a function, indented one level.
The exception I get using a dictionary comprehension:
Traceback (most recent call last):
  File "./sessions.py", line 24, in <module>
    execfile("./sessionSearch.py")
  File "./sessionSearch.py", line 50
    counts = {poop:len(list(group)) for poop, group in it.groupby(sorted(topics))}
^
SyntaxError: invalid syntax
Your cluster runs Python 2.6, which doesn't support dictionary comprehension syntax.
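If you want to confirm which interpreter spark-submit actually picks up, a small check like the following may help. It is only a sketch: the helper name supports_dict_comprehension is made up for illustration, and the script deliberately avoids any syntax that Python 2.6 would reject.

```python
import sys

def supports_dict_comprehension(version_info=sys.version_info):
    # Hypothetical helper: dict comprehensions exist in Python 2.7+ and in all of 3.x.
    return tuple(version_info[:2]) >= (2, 7)

# Print the major.minor version of the interpreter running this script.
print("Python %d.%d" % sys.version_info[:2])
print(supports_dict_comprehension())
```

Run it once in the pyspark shell and once through spark-submit; if the two report different versions, that would explain why the same line behaves differently.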
Either use a generator expression plus the dict() function (see "Alternative to dict comprehension prior to Python 2.7"), or configure your cluster to deploy Python 2.7.
Using dict(), your line would be:
counts = dict((key, len(list(group))) for key, group in it.groupby(sorted(topics)))
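For a version you can run on its own, here is a minimal sketch of the same idea. The function name count_topics and the sample topics list are invented for illustration; only the dict() call mirrors the line above.

```python
import itertools as it

def count_topics(topics):
    # Build the dict from a generator of (key, count) pairs.
    # Generator expressions are valid in Python 2.6, unlike dict comprehensions.
    return dict((key, len(list(group)))
                for key, group in it.groupby(sorted(topics)))

# Sample data, made up for this example.
topics = ["spark", "python", "spark", "hadoop", "python", "spark"]
counts = count_topics(topics)
print(counts)
```

Sorting first matters because groupby only groups consecutive equal items; without sorted(), repeated topics that are not adjacent would overwrite each other's counts in the resulting dict.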