
Pandas reshaping data from Multiple columns into a single Column

I have a data set and would like to reshape part of it. It always starts with the same few columns, followed by a variable number of columns that group the data. If a key belongs to a group, that group's column is marked with an x. A key may belong to multiple groups, or to none. The data structure looks like this:

Key  Date Added Group1Name Group2Name Group3Name ... GroupXName
1    1/1/2018   x           X
2    1/1/2018               x
3    1/1/2018                          
4    1/1/2018   x 
5    1/1/2018                                         x

I want to reformat as:

Key  Date Added Group
1    1/1/2018   Group1Name,Group2Name
2    1/1/2018   Group2Name           
3    1/1/2018        
4    1/1/2018   Group1Name
5    1/1/2018   GroupXName

It looks like you haven't tried much yet, and it's hard to reproduce your data from what you provided. The idea is to replace each 'x' with the name of its column and then take the DataFrame from wide to long format:

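import pandas as pd

columns_to_consider = ['Group1Name',  'Group2Name', ... ]
for column in columns_to_consider:
    # swap the marker (either 'x' or 'X') for the column's own name
    df[column] = df[column].str.strip().str.lower().replace('x', column)
# wide to long: one row per (Key, group column) pair
reshaped_df = pd.melt(df, id_vars=['Key', 'Date Added'], value_vars=columns_to_consider)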
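
The melt above leaves one row per key and group column, so one more step is needed to reach the layout asked for in the question. Below is a minimal sketch of one way to finish it, not necessarily what the answerer had in mind. The sample data and the names id_cols, group_cols, long_df, marked, groups and result are made up for illustration. It filters on the mark instead of replacing it, then joins the group names per key.

```python
import pandas as pd

# Hypothetical sample data modelled on the question's layout
df = pd.DataFrame({
    "Key": [1, 2, 3, 4, 5],
    "Date Added": ["1/1/2018"] * 5,
    "Group1Name": ["x", None, None, "x", None],
    "Group2Name": ["X", "x", None, None, None],
    "GroupXName": [None, None, None, None, "x"],
})

id_cols = ["Key", "Date Added"]
# Everything that is not an id column is treated as a group column
group_cols = [c for c in df.columns if c not in id_cols]

# Wide to long: one row per (key, group column) pair
long_df = df.melt(id_vars=id_cols, value_vars=group_cols,
                  var_name="Group", value_name="mark")

# Keep only the marked rows, ignoring case and stray whitespace
marked = long_df[long_df["mark"].str.strip().str.lower().eq("x")]

# Collapse back to one row per key, joining the group names with commas
groups = (marked.groupby(id_cols, sort=False)["Group"]
                .agg(",".join)
                .reset_index())

# Left-merge so keys without any group survive, with an empty Group value
result = df[id_cols].merge(groups, on=id_cols, how="left").fillna({"Group": ""})
print(result)
```

The left merge at the end matters: a plain groupby would silently drop key 3, which belongs to no group.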

Use apply with axis=1 so that the function is called once per row:

def group_func(series):
    # collect the names of the columns marked with 'x' (or 'X') in this row
    values = []
    for val, idx in zip(series, series.index.values):
        if str(val).strip().lower() == 'x':
            values.append(str(idx))
    return ",".join(values)

cols_to_agg = ['Group1Name', 'Group2Name', 'Group3Name', 'Group4Name']
df['Group'] = df[cols_to_agg].apply(group_func, axis=1)
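
Since the question says the number of group columns varies, the column list can also be derived instead of hard-coded. The sketch below is a variation on the row-wise idea above, under a few assumptions: the sample data is made up, the id columns are always Key and Date Added, and a cell counts as marked only if it is exactly 'x' or 'X'. It builds a boolean mask with DataFrame.isin and joins the matching column names for each row.

```python
import pandas as pd

# Hypothetical sample data modelled on the question's layout
df = pd.DataFrame({
    "Key": [1, 2, 3, 4, 5],
    "Date Added": ["1/1/2018"] * 5,
    "Group1Name": ["x", None, None, "x", None],
    "Group2Name": ["X", "x", None, None, None],
    "GroupXName": [None, None, None, None, "x"],
})

id_cols = ["Key", "Date Added"]
# Treat every remaining column as a group column, however many there are
group_cols = [c for c in df.columns if c not in id_cols]

# True where a cell holds the marker; empty cells come out False
mask = df[group_cols].isin(["x", "X"])

# For each row, join the names of the columns whose mask value is True
df["Group"] = [
    ",".join(name for name, hit in zip(group_cols, row) if hit)
    for row in mask.to_numpy()
]

result = df[id_cols + ["Group"]]
print(result)
```

Rows with no marks get an empty string, which matches the blank Group shown for key 3 in the desired output.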
