
Creating a DataFrame from a dictionary of tuples in pandas

I have this data structure:

word1 [('date', freq) , ('date',freq) , ...]
word2 [('date',freq) , ('date',freq) , ...]

and so on. To analyze the time series, I want to create a DataFrame. I can't figure out the best way to do it, as I'm quite new to Python (and I apologize for that). Should I use:

classmethod DataFrame.from_dict(data, orient='index', dtype=None) 

There are many possible ways to start, but assuming words is structured as

words
Out[203]: 
[[('2000-01-01', 1), ('2000-01-02', 5)],
 [('2000-01-01', 2), ('2000-01-02', 4)]]

the following is a natural starting point.

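import pandas as pd

# Start with an empty frame; rows are added one at a time below.
df = pd.DataFrame(index=range(0), columns=['date', 'word', 'freq'])
i = 0  # label of the next row to add
for j, word in enumerate(words):  # j is the position of the word in the list
    for d, f in word:  # d is the date string, f the frequency
        df.loc[i] = [d, j, f]
        i += 1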

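Assigning to df.loc[i] with a label that does not exist yet appends a new row. If you know the total number of entries in advance, you can replace index=range(0) with range(n), where n is that number, so the rows already exist and are only filled in. Next steps would probably be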

df.date = pd.to_datetime(df.date)
df.set_index(['date', 'word'], drop=True)
                freq
date       word     
2000-01-01 0       1
2000-01-02 0       5
2000-01-01 1       2
2000-01-02 1       4
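
Since the question describes a dictionary keyed by word rather than a list, here is a minimal sketch of an alternative that avoids adding rows one by one. It is not the method from the answer above: it flattens everything into a list of records first and builds the frame in a single call. The names data, records, long_df and wide_df and the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data in the shape described in the question:
# each word maps to a list of (date, freq) tuples.
data = {
    "word1": [("2000-01-01", 1), ("2000-01-02", 5)],
    "word2": [("2000-01-01", 2), ("2000-01-02", 4)],
}

# Flatten into one (date, word, freq) record per tuple, then build the frame once.
records = [
    (date, word, freq)
    for word, pairs in data.items()
    for date, freq in pairs
]
long_df = pd.DataFrame(records, columns=["date", "word", "freq"])
long_df["date"] = pd.to_datetime(long_df["date"])

# Reshape so each word becomes a column and each date a row.
wide_df = long_df.pivot(index="date", columns="word", values="freq")
```

The wide layout (one column per word, indexed by date) is often the more convenient one for time series work, but pivot assumes each (date, word) pair occurs only once; with duplicates it raises an error and an aggregation such as pivot_table would be needed instead.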


 