I have the data frame below with 5 columns. I need to check for a specific string ("-") in all columns and, wherever "-" is found, put the value from the preceding column into a new column (F). For example, "-" is in column B at rows 0 and 2, so 'a' and 'c' (the values in the preceding column, A) go into column F for those rows, and so on.
Source Data Frame:
Desired Data Frame would be:
I have written the code below, but I get a value length error when I try to create the new column (F). I'd appreciate your support.
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': {0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e'},
'B': {0: '-', 1: 'a', 2: '-', 3: 'b', 4: 'd'}})
df['C'] = np.where(df['B'].isin(df['A'].values), df['B'], np.nan)
df['C'] = df['C'].map(dict(zip(df.A.values, df.B.values)))
df['D'] = np.where(df['C'].isin(df['B'].values), df['C'], np.nan)
df['D'] = df['D'].map(dict(zip(df.B.values, df['C'].values)))
df['E'] = np.where(df['D'].isin(df['C'].values), df['D'], np.nan)
df['E'] = df['E'].map(dict(zip(df['C'].values, df['D'].values)))
a=np.array(df.iloc[:,:5])
g=[]
for index,x in np.ndenumerate(a):
    temp=[]
    if x=="-":
        temp.append(x-1)
    g.append(temp)
df['F']=g
print(df)
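Use DataFrame.where to replace every value with a missing value, except the values that sit immediately before a "-". To find those, compare the frame shifted one column to the left (DataFrame.shift) with "-". Then back-fill the missing values along each row and select the first column by position: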
df['F'] = df.where(df.shift(-1, axis=1).eq('-')).bfill(axis=1).iloc[:, 0]
print (df)
   A  B    F
0  a  -    a
1  b  a  NaN
2  c  -    c
3  d  b  NaN
4  e  d  NaN
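To see what each part of the one-liner does, here is a sketch that splits it into steps on a five-column frame. The values in C, D and E are typed in by hand as an assumption about what the question's code produces, and the intermediate names (next_is_dash, kept, f) are only for illustration.

```python
import numpy as np
import pandas as pd

# Assumed five-column frame; C, D and E are hand-typed for illustration.
df = pd.DataFrame({
    'A': ['a', 'b', 'c', 'd', 'e'],
    'B': ['-', 'a', '-', 'b', 'd'],
    'C': [np.nan, '-', np.nan, 'a', 'b'],
    'D': [np.nan, np.nan, np.nan, '-', 'a'],
    'E': [np.nan, np.nan, np.nan, np.nan, '-'],
}, dtype=object)

# Step 1: shift every column one position to the left, so each cell
# holds its right-hand neighbour, then test that neighbour for '-'.
next_is_dash = df.shift(-1, axis=1).eq('-')

# Step 2: keep only the cells whose right-hand neighbour is '-';
# everything else becomes NaN.
kept = df.where(next_is_dash)

# Step 3: back-fill along each row so the first kept value lands in
# the first column, then select that column by position.
f = kept.bfill(axis=1).iloc[:, 0]

df['F'] = f
print(df)
```

Computing f before assigning df['F'] matters here: if F already existed, it would be shifted and compared along with the other columns.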
You can do:
df['F']=[i[0][-1] if len(i)>1 else np.nan for i in df.fillna('').sum(axis=1).str.split('-') ]
output:
df['F']
Out[41]:
0 a
1 a
2 c
3 a
4 a
Name: F, dtype: object
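List Comprehension Explanation:
Fill NaN with '' and sum across rows, which concatenates each row into one string. Split that string on "-" and take the first piece (i[0]) if the length of the split is > 1; otherwise "-" won't be present in the row, hence fill with np.nan. [-1] then takes the last character of that piece, which is the character just before the first "-".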
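The same idea with the steps separated, as a rough sketch rather than a drop-in replacement. It joins each row with ''.join instead of sum(axis=1) (my substitution, not part of the answer), uses a hand-typed five-column frame as an assumption, and adds a guard for a "-" in the first column, where there is no preceding value.

```python
import numpy as np
import pandas as pd

# Assumed five-column frame; C, D and E are hand-typed for illustration.
df = pd.DataFrame({
    'A': ['a', 'b', 'c', 'd', 'e'],
    'B': ['-', 'a', '-', 'b', 'd'],
    'C': [np.nan, '-', np.nan, 'a', 'b'],
    'D': [np.nan, np.nan, np.nan, '-', 'a'],
    'E': [np.nan, np.nan, np.nan, np.nan, '-'],
}, dtype=object)

# Step 1: replace NaN with '' and concatenate each row into one string.
joined = df.fillna('').apply(''.join, axis=1)

# Step 2: split every row string on '-'.
parts = joined.str.split('-')

# Step 3: if the split produced more than one piece, a '-' was present;
# the last character of the first piece is what preceded it.
# The extra `p[0]` check covers a '-' in the very first column.
f = [p[0][-1] if len(p) > 1 and p[0] else np.nan for p in parts]

df['F'] = f
print(df)
```

Note that taking the last character only gives the preceding value when every cell holds a single character, as in this data; longer cell values would be cut down to their final character.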