I have a pandas DataFrame in which one column of text strings contains newline-separated values. I want to split each such field and create a new row per entry.
My DataFrame looks like this:
Col-1  Col-2
A      Notifications
       Returning Value
       Both
B      mine
       Why Not?
Expected output is:
Col-1  Col-2
A      Notifications
A      Returning Value
A      Both
B      mine
B      Why Not?
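First replace() the empty string '' with np.nan, then forward-fill the gaps with ffill() (the older spelling fillna(method='ffill') is deprecated in recent pandas releases):
import numpy as np
import pandas as pd
df = pd.DataFrame({'Col-1': ['A', '', '', 'B', ''],
                   'Col-2': ['Notifications', 'Returning Value', 'Both', 'mine', 'Why Not?']})
df
   Col-1  Col-2
0  A      Notifications
1         Returning Value
2         Both
3  B      mine
4         Why Not?
# Turn empty strings into NaN, then carry the last valid label forward
df['Col-1'] = df['Col-1'].replace('', np.nan).ffill()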
df
Col-1 Col-2
0 A Notifications
1 A Returning Value
2 A Both
3 B mine
4 B Why Not?
Rebuild the second column as a flattened Series, then concatenate it with the first column:
import pandas as pd
df = pd.DataFrame({'Col-1': ['A', 'B'], 'Col-2': ['Notifications\nReturning Value\nBoth', 'mine\nWhy Not?']})
df
Representation:
Col-1 Col-2
0 A Notifications\nReturning Value\nBoth
1 B mine\nWhy Not?
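Main part:
# Split each cell on newlines, spread the pieces over columns, then stack them into one long Series
series = pd.DataFrame(df['Col-2'].str.split('\n').tolist()).stack()
# Drop the inner level so each piece keeps only its original row label
series.index = series.index.droplevel(1)
series.name = 'Col-2'
# Pair Col-1 with the pieces by row label, then renumber the rows from 0
result = pd.concat([df['Col-1'], series], axis=1).reset_index(drop=True)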
Result:
Col-1 Col-2
0 A Notifications
1 A Returning Value
2 A Both
3 B mine
4 B Why Not?
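On pandas 0.25 or later, the same reshaping can probably be written more directly with DataFrame.explode. The following is only a sketch under that version assumption, not the method used in the answer above; the variable name pieces is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Col-1': ['A', 'B'],
    'Col-2': ['Notifications\nReturning Value\nBoth', 'mine\nWhy Not?'],
})

# Turn each newline-separated string into a list of pieces
# ("pieces" is an illustrative name, not from the original answer)
pieces = df['Col-2'].str.split('\n')

# explode() emits one row per list element and repeats the other columns
result = (
    df.assign(**{'Col-2': pieces})
      .explode('Col-2')
      .reset_index(drop=True)  # renumber rows 0..n-1
)
print(result)
```

The reset_index(drop=True) call is there because explode() keeps the original row label on every piece, so without it the index would contain repeated labels.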
IIUC, you want DataFrame.reset_index() (it is a method on the DataFrame or Series, not a top-level pd function).
Assuming your data is stored in a variable called df:
df = df.reset_index().set_index('Col-1')
A dummy example, since you did not provide an easy way to create the MultiIndex:
import numpy as np
import pandas as pd
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
s = pd.Series(np.random.randn(8), index=index)  # random values, so your numbers will differ
first  second
bar    one       0.792900
       two      -0.070508
baz    one      -0.599464
       two       0.334504
foo    one       0.835464
       two       1.614845
qux    one       0.674623
       two       1.907550
Now if we want the first column to be the index:
s = s.reset_index().set_index('first')
print(s)
second 0
first
bar one 0.792900
bar two -0.070508
baz one -0.599464
baz two 0.334504
foo one 0.835464
foo two 1.614845
qux one 0.674623
qux two 1.907550
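If the goal is only to keep 'first' as the index, a shorter route may be to reset just the 'second' level. This is a sketch rather than part of the answer above; it uses placeholder integer values instead of the random numbers so the output is reproducible.

```python
import pandas as pd

arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
index = pd.MultiIndex.from_arrays(arrays, names=['first', 'second'])

# Placeholder values 0..7 (an assumption for illustration, not the original data)
s = pd.Series(range(8), index=index)

# Move only the 'second' level into a column; 'first' stays as the index
out = s.reset_index(level='second')
print(out)
```

Because the Series has no name, its values end up in a column labelled 0, just as in the reset_index().set_index('first') version.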
More info: see Advanced Indexing in the pandas documentation.