
Loop through df and create new df

I have a set of data that looks like:

import pandas as pd
data = pd.DataFrame([['A', 1], ['B', 4, 5], ['C', 7, 8, 9]], columns=['Key', 'Oct', 'Nov', 'Dec'])

Key | Oct | Nov | Dec
A   | 1   |     |
B   | 4   | 5   |
C   | 7   | 8   | 9

and I am trying to reshape it so that, for every value that is present, the Key, the column header, and the value become one row of a new data frame that would look like:

Key | Month | Amt
A   | Oct   | 1
B   | Oct   | 4
B   | Nov   | 5
C   | Oct   | 7
C   | Nov   | 8
C   | Dec   | 9

I'm working with pandas, so I thought using iterrows to loop through the df would work, but it isn't giving me what I'm ultimately after. FYI, the actual file has 20 columns and 500 rows, but the number of columns and rows changes with the day's activity, so I'm looking for a solution that doesn't need the column headers explicitly defined, if possible.

Thanks!

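You could use set_index + stack + reset_index and then rename the columns. stack moves the month headers into the index and, with its default dropna=True, leaves out the empty cells, so no column names need to be hard-coded: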

import pandas as pd

data = pd.DataFrame([['A', 1], ['B', 4, 5], ['C', 7, 8, 9]], columns=['Key', 'Oct', 'Nov', 'Dec'])

result = data.set_index('Key').stack().reset_index()
result.columns = ['Key', 'Month', 'Amt']  # renames the columns

print(result)

Output

  Key Month  Amt
0   A   Oct  1.0
1   B   Oct  4.0
2   B   Nov  5.0
3   C   Oct  7.0
4   C   Nov  8.0
5   C   Dec  9.0
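
If you would rather not rely on stack dropping the empty cells (how it treats missing values can depend on the pandas version), melt followed by an explicit dropna is another option. This is only a sketch under a few assumptions: the names long_df and result_melt are mine, and the final cast to int assumes every amount is a whole number, as in the sample data.

```python
import pandas as pd

data = pd.DataFrame([['A', 1], ['B', 4, 5], ['C', 7, 8, 9]], columns=['Key', 'Oct', 'Nov', 'Dec'])

# melt turns every non-Key column into (Month, Amt) pairs,
# so the month headers never have to be listed by hand
long_df = data.melt(id_vars='Key', var_name='Month', value_name='Amt')

# drop the empty cells explicitly, then group the rows by Key again;
# mergesort is stable, so Oct/Nov/Dec keep their column order within each Key
result_melt = (
    long_df.dropna(subset=['Amt'])
    .sort_values('Key', kind='mergesort')
    .reset_index(drop=True)
)

# optional: the NaNs forced Amt to float; cast back only if all amounts are whole numbers (assumption)
result_melt['Amt'] = result_melt['Amt'].astype(int)

print(result_melt)
```

This should give the same rows as the stack version, with Amt shown as integers rather than floats. Skip the astype step if your real amounts can have decimals.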
