
Sorting a file into groups in Python

The task is to read tweets and separate them into groups based on specific hours (Month-Day-Year-Hour). The tweets for each hour will be stored in a separate file in a folder, with the file name "Mon-Day-Year-Hour.txt".

I am new to Python and only started coding in it a couple of days ago for a class. Right now I have the file that the tweets came from loaded into a list, and I have sorted the list by creation time. I have looked into the itertools.groupby() function, but I'm not sure how to implement it correctly for my purpose.

Here's a bit of what I have so far:

import json

tweets = []
with open("CrimeReport.txt", "r") as f:
    for line in f:
        tweets.append(json.loads(line))

Sorted tweets:

sorted_tweets = sorted(tweets, key=lambda item:datetime.datetime.strptime(item['created_at'],
                                    '%a %b %d %H:%M:%S +0000 %Y'))

I apologize for the poor formatting.

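from itertools import groupby

v = range(10)  # example data only; replace with your own iterable
dic = {}
for key, value in groupby(v, lambda x: x % 2):
    # groupby only merges consecutive items, so the same key can come up
    # more than once; setdefault appends to the existing list in that case.
    dic.setdefault(key, []).extend(value)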

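groupby takes your data and a key function that returns the group id for each item. By iterating over the result and adding the items to a dictionary, you end up with the data completely grouped. Note that groupby only groups consecutive items that share a key, so unsorted data can yield the same key several times, which is why the loop above merges into an existing entry. If you have a large amount of data, though, a dictionary may not be fast enough.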
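Applied to the tweets in the question, the same idea could look roughly like the sketch below. It is only an illustration under some assumptions: the helper names hour_key and write_hourly_files and the out_dir parameter are made up here, and it assumes every tweet has a 'created_at' field in the format shown in the question. Because the tweets are sorted by time first, each hour reaches groupby as one consecutive run, so no dictionary is needed.

```python
import datetime
import json
import os
from itertools import groupby

# Timestamp format taken from the question's strptime call.
TIME_FORMAT = '%a %b %d %H:%M:%S +0000 %Y'


def hour_key(tweet):
    """Return a 'Mon-Day-Year-Hour' label, e.g. 'Jan-05-2014-13' (hypothetical helper)."""
    created = datetime.datetime.strptime(tweet['created_at'], TIME_FORMAT)
    return created.strftime('%b-%d-%Y-%H')


def write_hourly_files(tweets, out_dir):
    """Write one file per hour into out_dir and return the labels written (hypothetical helper)."""
    os.makedirs(out_dir, exist_ok=True)
    # Sort chronologically first: groupby only merges adjacent items.
    ordered = sorted(
        tweets,
        key=lambda t: datetime.datetime.strptime(t['created_at'], TIME_FORMAT),
    )
    labels = []
    for label, group in groupby(ordered, key=hour_key):
        path = os.path.join(out_dir, label + '.txt')
        with open(path, 'w') as handle:
            # One JSON-encoded tweet per line, mirroring the input file.
            for tweet in group:
                handle.write(json.dumps(tweet) + '\n')
        labels.append(label)
    return labels
```

With the list from the question, a call such as write_hourly_files(tweets, "tweets_by_hour") would create the folder if needed and fill it with one file per hour; the folder name is just an example.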

