I was hoping someone could point me in the right direction. I have a dataframe, and I would like to join each value in the first column with the names of the remaining columns, then use each joined name as a new column holding the corresponding value.
2020-03-20DF.csv
Store,Total Started,2 Week,4 Week,5 Week,6 Week
Boston,9,0,5,1,3
New York,3,0,0,0,3
San Diego,6,0,6,0,0
Tampa Bay,1,0,1,0,0
Houston,14,0,7,0,7
Chicago,2,0,0,0,2
What I have so far:
import pandas as pd
df1 = pd.read_csv('2020-03-20DF.csv')
df1.set_index('Store', inplace=True)
print(df1)
           Total Started  2 Week  4 Week  5 Week  6 Week
Store
Boston                 9       0       5       1       3
New York               3       0       0       0       3
San Diego              6       0       6       0       0
Tampa Bay              1       0       1       0       0
Houston               14       0       7       0       7
Chicago                2       0       0       0       2
What I would like to see is
Boston-2 Week Boston-4 Week Boston-5 Week Boston-6 Week
0 5 1 3
etc.
For the particular case (here df is the CSV read without set_index, so 'Store' is still a regular column):
>>> df[df['Store'] == 'Boston'].filter(like='Week').add_prefix('Boston-')
   Boston-2 Week  Boston-4 Week  Boston-5 Week  Boston-6 Week
0              0              5              1              3
# generally:
>>> for store in df['Store']:
...     print(df[df['Store'] == store].filter(like='Week').add_prefix(f'{store}-'))
   Boston-2 Week  Boston-4 Week  Boston-5 Week  Boston-6 Week
0              0              5              1              3
   New York-2 Week  New York-4 Week  New York-5 Week  New York-6 Week
1                0                0                0                3
   San Diego-2 Week  San Diego-4 Week  San Diego-5 Week  San Diego-6 Week
2                 0                 6                 0                 0
   Tampa Bay-2 Week  Tampa Bay-4 Week  Tampa Bay-5 Week  Tampa Bay-6 Week
3                 0                 1                 0                 0
   Houston-2 Week  Houston-4 Week  Houston-5 Week  Houston-6 Week
4               0               7               0               7
   Chicago-2 Week  Chicago-4 Week  Chicago-5 Week  Chicago-6 Week
5               0               0               0               2
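The loop above prints one small frame per store. If the goal is a single one-row frame with all the prefixed columns side by side, the pieces could be collected and concatenated. This is only a sketch: the CSV is inlined with io.StringIO so it runs without the file, and the names csv_text, pieces and wide are mine, not from the question.

```python
import io
import pandas as pd

# Inline copy of the question's CSV so the sketch runs without the file.
csv_text = """Store,Total Started,2 Week,4 Week,5 Week,6 Week
Boston,9,0,5,1,3
New York,3,0,0,0,3
San Diego,6,0,6,0,0
Tampa Bay,1,0,1,0,0
Houston,14,0,7,0,7
Chicago,2,0,0,0,2
"""
df = pd.read_csv(io.StringIO(csv_text))  # 'Store' stays a regular column

# One single-row frame per store, each re-indexed to row 0 so that
# concat along the columns lines them up in the same row.
pieces = [
    df[df['Store'] == store]
    .filter(like='Week')
    .add_prefix(f'{store}-')
    .reset_index(drop=True)
    for store in df['Store']
]
wide = pd.concat(pieces, axis=1)
print(wide.shape)  # (1, 24): 6 stores x 4 week columns
```

Without the reset_index(drop=True) step each piece keeps its original row label (0 to 5), and the concat would produce six mostly-NaN rows instead of one.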
As mentioned, I used the code example from another post:
import pandas as pd
df1 = pd.read_csv('2020-03-20DF.csv')
df1.set_index('Store', inplace=True)
s = df1.stack()
df2 = pd.DataFrame([s.values], columns=[f'{i}-{j}' for i, j in s.index])
with pd.option_context('display.max_rows', None, 'display.max_columns', None):
    print(df2)
See DataFrame.stack in the pandas documentation.
Would this be a suitable alternative?
df2 = df1.drop('Total Started', axis=1).stack()
print(df2.head())
Store
Boston    2 Week    0
          4 Week    5
          5 Week    1
          6 Week    3
New York  2 Week    0
dtype: int64
It uses a multi-index.
Then, use tuples to index the values you want.
E.g.:
df2[('Boston', '4 Week')]
5
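Besides full tuples, the multi-index also lets you pull out a whole store or a whole week at once. A small sketch, again with the CSV inlined so it is self-contained (csv_text, boston and week4 are names I made up for illustration):

```python
import io
import pandas as pd

# Inline copy of the question's CSV so the sketch runs without the file.
csv_text = """Store,Total Started,2 Week,4 Week,5 Week,6 Week
Boston,9,0,5,1,3
New York,3,0,0,0,3
San Diego,6,0,6,0,0
Tampa Bay,1,0,1,0,0
Houston,14,0,7,0,7
Chicago,2,0,0,0,2
"""
df1 = pd.read_csv(io.StringIO(csv_text)).set_index('Store')
df2 = df1.drop('Total Started', axis=1).stack()

# Outer level only: every week value for one store.
boston = df2.loc['Boston']

# Inner level only: one week across all stores (level 1 is the week label).
week4 = df2.xs('4 Week', level=1)

print(boston)
print(week4)
```

Here boston is a Series indexed by the week labels and week4 is a Series indexed by Store.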
To get to what you actually asked for (a single-level index with joined strings) you could do:
df2.index = pd.Series(df2.index.values).apply('-'.join)
print(df2.head())
Boston-2 Week 0
Boston-4 Week 5
Boston-5 Week 1
Boston-6 Week 3
New York-2 Week 0
dtype: int64
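That still leaves a Series (one value per row). If you want the one-row layout from the question, with the joined names as column headers, one option is to turn the Series into a frame and transpose it. A sketch under the same assumptions as above (inlined CSV; wide is a hypothetical name; I join the index tuples with a list comprehension here, which should be equivalent to the apply version):

```python
import io
import pandas as pd

# Inline copy of the question's CSV so the sketch runs without the file.
csv_text = """Store,Total Started,2 Week,4 Week,5 Week,6 Week
Boston,9,0,5,1,3
New York,3,0,0,0,3
San Diego,6,0,6,0,0
Tampa Bay,1,0,1,0,0
Houston,14,0,7,0,7
Chicago,2,0,0,0,2
"""
df1 = pd.read_csv(io.StringIO(csv_text)).set_index('Store')
df2 = df1.drop('Total Started', axis=1).stack()

# Flatten the (store, week) tuples into 'Store-Week' strings.
df2.index = ['-'.join(pair) for pair in df2.index]

# Series -> single-column frame -> transpose into a single row.
wide = df2.to_frame().T
print(wide.shape)  # (1, 24)
```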