How do I take a Python pandas DataFrame and create a new table using the column and row names as the new column
I was hoping someone could point me in the right direction. I have a DataFrame where I would like to take the first column, join each of its values with the names of the remaining columns, and use the result as new column names that hold the corresponding values.
2020-03-20DF.csv
Store,Total Started,2 Week,4 Week,5 Week,6 Week
Boston,9,0,5,1,3
New York,3,0,0,0,3
San Diego,6,0,6,0,0
Tampa Bay,1,0,1,0,0
Houston,14,0,7,0,7
Chicago,2,0,0,0,2
What I have so far:
import pandas as pd
df1 = pd.read_csv('2020-03-20DF.csv')
df1.set_index('Store', inplace=True)
print(df1)
           Total Started  2 Week  4 Week  5 Week  6 Week
Store
Boston                 9       0       5       1       3
New York               3       0       0       0       3
San Diego              6       0       6       0       0
Tampa Bay              1       0       1       0       0
Houston               14       0       7       0       7
Chicago                2       0       0       0       2
What I would like to see is:
Boston-2 Week Boston-4 Week Boston-5 Week Boston-6 Week
0 5 1 3
etc.
For the particular case (here df is the CSV read without set_index, so Store is still a regular column):
>>> df[df['Store'] == 'Boston'].filter(like='Week').add_prefix('Boston-')
Boston-2 Week Boston-4 Week Boston-5 Week Boston-6 Week
0 0 5 1 3
# generally:
>>> for store in df['Store']:
...     print(df[df['Store'] == store].filter(like='Week').add_prefix(f'{store}-'))
Boston-2 Week Boston-4 Week Boston-5 Week Boston-6 Week
0 0 5 1 3
New York-2 Week New York-4 Week New York-5 Week New York-6 Week
1 0 0 0 3
San Diego-2 Week San Diego-4 Week San Diego-5 Week San Diego-6 Week
2 0 6 0 0
Tampa Bay-2 Week Tampa Bay-4 Week Tampa Bay-5 Week Tampa Bay-6 Week
3 0 1 0 0
Houston-2 Week Houston-4 Week Houston-5 Week Houston-6 Week
4 0 7 0 7
Chicago-2 Week Chicago-4 Week Chicago-5 Week Chicago-6 Week
5 0 0 0 2
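The loop above prints one small frame per store. If a single wide row is wanted instead, the pieces can be concatenated side by side. A minimal sketch, assuming the CSV is inlined through io.StringIO so it runs without the file; the names pieces and wide are illustrative and not from the original post:

```python
import io
import pandas as pd

# Inline copy of the question's CSV so the sketch runs without the file.
csv_text = """Store,Total Started,2 Week,4 Week,5 Week,6 Week
Boston,9,0,5,1,3
New York,3,0,0,0,3
San Diego,6,0,6,0,0
Tampa Bay,1,0,1,0,0
Houston,14,0,7,0,7
Chicago,2,0,0,0,2
"""
df = pd.read_csv(io.StringIO(csv_text))

# One single-row frame per store, with the store name prefixed to each column.
pieces = [
    df[df['Store'] == store]
    .filter(like='Week')
    .add_prefix(f'{store}-')
    .reset_index(drop=True)  # put every piece on row 0 so concat lines them up
    for store in df['Store']
]

# Join the pieces column-wise into one wide row.
wide = pd.concat(pieces, axis=1)
print(wide.shape)
```

Without reset_index(drop=True) each piece keeps its original row label, and the concatenation would produce one row per store filled mostly with NaN.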
As mentioned, I used the code example from another post:
import pandas as pd
df1 = pd.read_csv('2020-03-20DF.csv')
df1.set_index('Store', inplace=True)
s = df1.stack()
df2 = pd.DataFrame([s.values], columns=[f'{i}-{j}' for i, j in s.index])
with pd.option_context('display.max_rows', None, 'display.max_columns', None):
    print(df2)
See DataFrame.stack.
Would this be a suitable alternative?
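Note that stacking df1 as-is also yields a "Total Started" column for every store, which the desired output in the question does not show. A sketch of the same stack approach with that column dropped first, assuming the CSV is inlined through io.StringIO instead of read from disk:

```python
import io
import pandas as pd

# Inline copy of the question's CSV so the sketch runs without the file.
csv_text = """Store,Total Started,2 Week,4 Week,5 Week,6 Week
Boston,9,0,5,1,3
New York,3,0,0,0,3
San Diego,6,0,6,0,0
Tampa Bay,1,0,1,0,0
Houston,14,0,7,0,7
Chicago,2,0,0,0,2
"""
df1 = pd.read_csv(io.StringIO(csv_text)).set_index('Store')

# Drop the total, then stack: the result is a Series indexed by (store, column).
s = df1.drop(columns='Total Started').stack()

# Build a single-row frame whose column names are "<store>-<column>".
df2 = pd.DataFrame(
    [s.to_numpy()],
    columns=[f'{store}-{week}' for store, week in s.index],
)
print(df2.iloc[:, :4])
```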
df2 = df1.drop('Total Started', axis=1).stack()
print(df2.head())
Store
Boston    2 Week    0
          4 Week    5
          5 Week    1
          6 Week    3
New York  2 Week    0
dtype: int64
The result is a Series with a multi-index of (store, column name).
You can then use tuples to index the values you want.
E.g.:
df2[('Boston', '4 Week')]
5
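Besides single-value lookups, the same multi-index allows selecting everything for one store or one week column. A short sketch, assuming only the first two stores from the question's data (built inline to keep it brief):

```python
import pandas as pd

# First two stores from the question's CSV, without the "Total Started" column.
df1 = pd.DataFrame(
    {"2 Week": [0, 0], "4 Week": [5, 0], "5 Week": [1, 0], "6 Week": [3, 3]},
    index=pd.Index(["Boston", "New York"], name="Store"),
)
df2 = df1.stack()

print(df2[("Boston", "4 Week")])   # one value, addressed by a (store, column) tuple
print(df2.loc["Boston"])           # all week columns for one store
print(df2.xs("4 Week", level=1))   # one week column across all stores
```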
To get to what you actually asked for (a single-level index with joined strings) you could do:
df2.index = pd.Series(df2.index.values).apply('-'.join)
print(df2.head())
Boston-2 Week 0
Boston-4 Week 5
Boston-5 Week 1
Boston-6 Week 3
New York-2 Week 0
dtype: int64
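The joined labels above are still the index of a Series. If they should be column names of a one-row table, as in the question, the Series can be turned into a frame and transposed. A sketch under the assumption that only the first two stores are used, with Index.map as an equivalent way to join the labels; the name wide is illustrative:

```python
import pandas as pd

# First two stores from the question's CSV, without the "Total Started" column.
df1 = pd.DataFrame(
    {"2 Week": [0, 0], "4 Week": [5, 0], "5 Week": [1, 0], "6 Week": [3, 3]},
    index=pd.Index(["Boston", "New York"], name="Store"),
)
df2 = df1.stack()

# Join each (store, column) tuple into a single "store-column" label.
df2.index = df2.index.map("-".join)

# Turn the Series into a one-row DataFrame with those labels as columns.
wide = df2.to_frame().T
print(wide)
```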