I have a set of data that looks like:
data = pd.DataFrame([['A', 1], ['B', 4, 5], ['C', 7, 8, 9]], columns=['Key', 'Oct', 'Nov', 'Dec'])
Key | Oct | Nov | Dec
A | 1 | |
B | 4 | 5 |
C | 7 | 8 | 9
I am trying to reshape it so that, for every cell that holds a value, the Key, the column header, and the value become one row of a new data frame, like this:
Key | Month | Amt
A | Oct | 1
B | Oct | 4
B | Nov | 5
C | Oct | 7
C | Nov | 8
C | Dec | 9
I'm working with pandas, so I thought looping through the df with iterrows would work, but it isn't giving me what I'm ultimately after. FYI, the actual file has 20 columns and 500 rows, but both counts change with the day's activity, so I'm looking for a solution that doesn't need the column headers explicitly defined, if possible.
Thanks!
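You could use set_index + stack + reset_index and then rename the columns. Stacking moves the month headers into rows, and as the output shows, the empty cells are left out: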
import pandas as pd
data = pd.DataFrame([['A', 1], ['B', 4, 5], ['C', 7, 8, 9]], columns=['Key', 'Oct', 'Nov', 'Dec'])
result = data.set_index('Key').stack().reset_index()
result.columns = ['Key', 'Month', 'Amt'] # renames the columns
print(result)
Output:
  Key Month  Amt
0   A   Oct  1.0
1   B   Oct  4.0
2   B   Nov  5.0
3   C   Oct  7.0
4   C   Nov  8.0
5   C   Dec  9.0
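If you would rather not depend on how stack treats missing values (this may differ between pandas versions), here is a minimal sketch of an alternative using melt with an explicit dropna. The names Month and Amt come from the question. The stable sort and the cast to int are my own additions: the sort restores the Key-by-Key row order, and the cast gives whole numbers instead of 1.0, assuming every amount is a whole number.

```python
import pandas as pd

data = pd.DataFrame(
    [['A', 1], ['B', 4, 5], ['C', 7, 8, 9]],
    columns=['Key', 'Oct', 'Nov', 'Dec'],
)

result = (
    # Every column other than Key is treated as a month column,
    # so the headers never have to be listed explicitly.
    data.melt(id_vars='Key', var_name='Month', value_name='Amt')
        # Drop the cells that had no value.
        .dropna(subset=['Amt'])
        # melt orders rows column by column; a stable sort on Key groups
        # them per Key while keeping the original month order.
        .sort_values('Key', kind='mergesort')
        .reset_index(drop=True)
)

# Assumption: all amounts are whole numbers, so the cast is lossless.
result['Amt'] = result['Amt'].astype(int)

print(result)
```

Because melt picks up every non-Key column on its own, this should keep working when the number of columns or rows changes from day to day.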