
Transforming a data frame in Python

I have the following dataset:

ID      A     B    C
1      aa     -    -
2      -      bb   -
3      -      -    cc
4      aaa    -    -

that should be transformed to the following data frame:

ID    A
 1    aa
 2    bb
 3    cc
 4    aaa

So essentially, the values in each row should be shifted left so that they all end up in the first column.
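For anyone who wants to try the answers, here is a minimal sketch that rebuilds the dataset from the question. It assumes every value is a plain string and that '-' is a literal dash marking an empty cell; the question does not state the dtypes.

```python
import pandas as pd

# Sample data from the question; '-' marks an empty cell (assumed to be a string).
df = pd.DataFrame(
    {
        "ID": [1, 2, 3, 4],
        "A": ["aa", "-", "-", "aaa"],
        "B": ["-", "bb", "-", "-"],
        "C": ["-", "-", "cc", "-"],
    }
)
print(df)
```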

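You can replace the '-' symbol with NaN and then use bfill along the columns axis (axis=1), so that each row's value is pulled into column A:

import numpy as np

# axis must be passed by keyword in recent pandas versions
df_ = df.replace('-', np.nan).bfill(axis=1)[['ID', 'A']]
print(df_)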
  ID    A
0  1   aa
1  2   bb
2  3   cc
3  4  aaa
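To see why the back-fill works, the same idea can be sketched step by step. This is only an illustration under the assumptions that the value columns are named A, B and C and that each row holds exactly one real value; the names `value_cols`, `filled` and `result` are made up for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "ID": [1, 2, 3, 4],
        "A": ["aa", "-", "-", "aaa"],
        "B": ["-", "bb", "-", "-"],
        "C": ["-", "-", "cc", "-"],
    }
)

value_cols = ["A", "B", "C"]  # assumed names of the columns to collapse

# Step 1: turn the '-' placeholder into a real missing value.
filled = df[value_cols].replace("-", np.nan)

# Step 2: back-fill along each row, so a missing cell takes the next
# non-missing value to its right. Column A ends up with the row's value.
filled = filled.bfill(axis=1)

# Step 3: keep only the ID and the collapsed column.
result = pd.DataFrame({"ID": df["ID"], "A": filled["A"]})
print(result)
```

Restricting the back-fill to the value columns keeps ID out of the operation, so its integer dtype is left untouched.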

You can use df.replace to replace '-' with np.nan, set 'ID' as the index, then call stack (which drops the NaN cells) followed by droplevel to remove the column-name level:

df.replace('-',np.nan).set_index('ID').stack().droplevel(1)

ID
1     aa
2     bb
3     cc
4    aaa
dtype: object

for i in range(len(df)):
    if df.at[i, 'A'] == '-':
        if df.at[i, 'B'] == '-':
            df.at[i, 'A'] = df.at[i, 'C']
        else:
            df.at[i, 'A'] = df.at[i, 'B']

df.drop(['B', 'C'], axis=1, inplace = True)

This just uses nested if statements to find the column that holds something other than '-' and assigns that value to column A.

Output (df):

    ID  A
0   1   aa
1   2   bb
2   3   cc
3   4   aaa
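The nested ifs above are tied to exactly three columns. As a hedged sketch of the same idea for any number of columns, a small helper can pick the first cell in a row that is not the placeholder. The helper name `first_filled` and the list `value_cols` are hypothetical, introduced only for this example.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "ID": [1, 2, 3, 4],
        "A": ["aa", "-", "-", "aaa"],
        "B": ["-", "bb", "-", "-"],
        "C": ["-", "-", "cc", "-"],
    }
)


def first_filled(row, placeholder="-"):
    """Return the first value in the row that is not the placeholder.

    Falls back to the placeholder itself if the whole row is empty.
    """
    return next((v for v in row if v != placeholder), placeholder)


value_cols = ["A", "B", "C"]  # assumed names of the columns to collapse

result = df[["ID"]].copy()
# Apply the helper to each row of the value columns.
result["A"] = df[value_cols].apply(first_filled, axis=1)
print(result)
```

Unlike the loop, this leaves the original df unmodified and does not rely on the index being 0..n-1.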

You can try this:

df.replace('-', np.nan, inplace=True)
df['new'] = df[df.columns[1:]].apply(
    lambda x: ''.join(x.dropna().astype(str)),
    axis=1
)
df = df[['ID', 'new']]
print(df)

Output:

   ID  new
0   1   aa
1   2   bb
2   3   cc
3   4  aaa
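One thing to keep in mind with the join approach: if a row ever holds more than one real value, the values are concatenated rather than the first one being picked. The sketch below illustrates this with made-up data (the row with 'x' and 'y' is not from the question), so a row containing both ends up as 'xy'.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the second row has values in both A and B.
df = pd.DataFrame(
    {
        "ID": [1, 2, 3],
        "A": ["aa", "x", "-"],
        "B": ["-", "y", "-"],
        "C": ["-", "-", "cc"],
    }
)

values = df[["A", "B", "C"]].replace("-", np.nan)

# Same lambda as in the answer: drop missing cells, then join what is left.
joined = values.apply(lambda x: "".join(x.dropna().astype(str)), axis=1)
print(joined.tolist())
```

For the dataset in the question this makes no difference, since every row has exactly one value.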

