
Converting a pandas DataFrame to contain a dictionary or list of lists

      state      Year  Month  count
0       alabama  2017.0   10.0     31
1       alabama  2017.0   11.0     30
2       alabama  2017.0   12.0     31
3       alabama  2018.0    1.0     31
4       alabama  2018.0    2.0     28
5       alabama  2018.0    3.0     31
6       alabama  2018.0    4.0     30
7       alabama  2018.0    5.0     31
8       alabama  2018.0    6.0     30
9       alabama  2018.0    7.0     14
10     arkansas  2017.0   10.0     31
11     arkansas  2017.0   11.0     30
12     arkansas  2017.0   12.0     31

Can I convert the DataFrame above to:

                                                            Month
state                                                        
alabama         {2017:10.0, 2017:11.0, 2017:12.0, 2018:1.0, 2018:2.0, 2018:3.0, 2018:4.0, 2018:5.0, 2018:6.0, 2018:7.0}
arkansas        {2017:10.0, 2017:11.0, 2017:12.0}

Related to: converting pandas dataframe to contain a list

Based on @Vaishali's comment below, since a dictionary cannot contain duplicate keys, a list of lists would be fine too:

                                                            Month
state                                                        
alabama         [[2017,10.0], [2017,11.0], [2017,12.0], [2018,1.0], [2018,2.0], [2018,3.0], [2018,4.0], [2018,5.0], [2018,6.0], [2018,7.0]]
arkansas        [[2017,10.0], [2017,11.0], [2017,12.0]]
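
To illustrate the duplicate-key point, here is a minimal plain-Python sketch. The names `collapsed`, `pairs`, and `by_year` are made up for illustration, and the year-to-months mapping is only one possible alternative, not something requested in the question.

```python
# Duplicate keys in a dict literal silently collapse: the last value wins.
collapsed = {2017: 10.0, 2017: 11.0, 2017: 12.0}
print(collapsed)  # {2017: 12.0}

# One possible alternative (an assumption, not from the original post):
# map each year to the list of its months instead.
pairs = [(2017, 10.0), (2017, 11.0), (2017, 12.0), (2018, 1.0)]
by_year = {}
for year, month in pairs:
    by_year.setdefault(year, []).append(month)
print(by_year)  # {2017: [10.0, 11.0, 12.0], 2018: [1.0]}
```

This is why a flat `{Year: Month}` dictionary per state cannot hold the requested data, while a list of `[Year, Month]` pairs can.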

Try

df.groupby('state').apply(lambda x: list(zip(x['Year'], x['Month'])))


state
alabama     [(2017.0, 10.0), (2017.0, 11.0), (2017.0, 12.0...
arkansas     [(2017.0, 10.0), (2017.0, 11.0), (2017.0, 12.0)]

In [73]: (df.groupby('state')[['Year','Month']]
            .apply(lambda x: x.values.tolist())
            .to_frame('Month')
            .reset_index())
Out[73]:
      state                                              Month
0   alabama  [[2017.0, 10.0], [2017.0, 11.0], [2017.0, 12.0...
1  arkansas   [[2017.0, 10.0], [2017.0, 11.0], [2017.0, 12.0]]

Something like this should work:

d = {}
for index, row in df.iterrows():
    # Start an empty list the first time a state is seen, then append "Year : Month".
    # Year is cast to int so it prints as 2017 rather than 2017.0.
    d.setdefault(row['state'], []).append(str(int(row['Year'])) + " : " + str(row['Month']))

This produces entries like:

arkansas        ["2017 : 10.0", "2017 : 11.0", "2017 : 12.0"]

Alternatively:

df.groupby('state').apply(lambda x:x[['Year','Month']].values)

state
alabama     [[2017.0, 10.0], [2017.0, 11.0], [2017.0, 12.0...
arkansas     [[2017.0, 10.0], [2017.0, 11.0], [2017.0, 12.0]]
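
For anyone who wants to try the groupby approach end to end, here is a self-contained sketch. It uses a shortened copy of the sample data and casts `Year` to `int` to match the layout asked for in the question; that cast assumes `Year` has no missing values, and the variable names `df` and `result` are just for illustration.

```python
import pandas as pd

# A shortened copy of the sample frame from the question.
df = pd.DataFrame({
    "state": ["alabama", "alabama", "alabama", "arkansas", "arkansas"],
    "Year": [2017.0, 2017.0, 2018.0, 2017.0, 2017.0],
    "Month": [10.0, 11.0, 1.0, 10.0, 11.0],
    "count": [31, 30, 31, 31, 30],
})

# For each state, build a list of [Year, Month] pairs.
# int(y) assumes Year has no NaN values (an assumption about the data).
result = (
    df.groupby("state")[["Year", "Month"]]
      .apply(lambda g: [[int(y), m] for y, m in zip(g["Year"], g["Month"])])
      .to_frame("Month")
)
print(result)
```

The result is a one-column frame indexed by `state`, with each cell holding a plain Python list of lists.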

