I have a defaultdict that looks like this:

{"Some text": ["Some date", "Some date", "Some Date"]}
I am trying to access each individual value of each key like so:
for processedTweet, date in tweetsDict.iteritems():
    print date
    for d in date:
        print d
This works fine in a normal Python script: it prints the entire list first, then the inner loop prints each individual date.
But when I run the same code as part of a Map/Reduce job on Hadoop, it breaks the list into individual characters rather than strings, i.e.:
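For reference, here is a minimal, self-contained version of the loop above as it behaves in plain Python (Python 3 syntax; in Python 2 the method is `iteritems()` instead of `items()`):

```python
from collections import defaultdict

tweetsDict = defaultdict(list)
tweetsDict["Some text"] = ["Some date", "Some date", "Some Date"]

printed = []  # collect the inner-loop output, for illustration
for processedTweet, date in tweetsDict.items():
    print(date)         # the whole list: ['Some date', 'Some date', 'Some Date']
    for d in date:
        print(d)        # each full date string in turn
        printed.append(d)
```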
Some date
becomes
S
o
m
etc. Why is this happening and how can I fix it?
When the Map/Reduce job runs under Hadoop, your for-loop is not receiving the list itself: Hadoop Streaming serializes the data to text between stages, so the value that reaches your loop is a plain string rather than a Python list. By default, when Python iterates over a string object, each iteration returns the next character in the string, which is why you see one character per line.
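A short sketch of both the symptom and one common workaround. The mapper/reducer wiring here is an assumption about your streaming setup (the `line` variable stands in for a tab-delimited record on stdin); the idea is to serialize the list to JSON on the map side and deserialize it on the reduce side so you get a real list back:

```python
import json
from collections import defaultdict

# The symptom: iterating over a string yields its characters,
# which is exactly what happens when the "list" arriving from
# Hadoop Streaming is really just a string.
value = "Some date"
chars = [c for c in value]   # ['S', 'o', 'm', 'e', ' ', 'd', 'a', 't', 'e']

# One workaround (assumed setup): JSON-encode the list in the mapper...
tweetsDict = defaultdict(list)
tweetsDict["Some text"] = ["Some date", "Some date", "Some Date"]
line = "%s\t%s" % ("Some text", json.dumps(tweetsDict["Some text"]))

# ...and decode it in the reducer, recovering an actual list of strings.
key, payload = line.split("\t", 1)
dates = json.loads(payload)
for d in dates:
    print(d)   # whole date strings again, not characters
```

Any serialization that round-trips a list (JSON, `ast.literal_eval` on a repr, etc.) works; the essential point is that stdin/stdout between Hadoop Streaming stages only carries text, so structured values must be encoded and decoded explicitly.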