The code below effectively merges all values in a pandas df row before any 4 letter string. This only applies to rows directly underneath X in Col A.
import pandas as pd

df = pd.DataFrame({
    'A' : ['X','Foo','No','','X','Big','No'],
    'B' : ['','Bar','Merge','','','Cat','Merge'],
    'C' : ['','Fubu','XXXX','','','BgCt','YYYY'],
})
maskX = df.iloc[:,0].apply(lambda x: x=='X')
maskX.index += 1
maskX = pd.concat([pd.Series([False]), maskX])
maskX = maskX.drop(len(maskX)-1)
mask = (df.iloc[:, 1:].applymap(len) == 4).cumsum(1) == 0
for i,v in maskX.items():
    mask.iloc[i,:] = mask.iloc[i,:].apply(lambda x: x and v)
df.A[maskX] = df.A + df.iloc[:, 1:][mask].fillna('').apply(lambda x: x.sum(), 1)
df.iloc[:, 1:] = df.iloc[:, 1:][~mask].fillna('')
This works fine unless there are values other than strings in the df. If the df includes floats or integers, it raises an error on that column, e.g.:
df = pd.DataFrame({
    'A' : ['X','Foo','No','','X','Big','No'],
    'B' : ['','Bar','Merge','','','Cat','Merge'],
    'C' : ['','Fubu','XXXX','','','BgCt','YYYY'],
    'D' : ['','',1.0,2.0,3.0,'',''],
})
TypeError: ("object of type 'float' has no len()", 'occurred at index D')
I'm not quite sure why, because the merge only occurs on the rows beneath X in Col A, none of which contain floats?
applymap applies the function len to each element of the dataframe, not only to the rows beneath X, so the floats in column D are passed to len as well. Since floating-point numbers do not have a length, the function cannot be applied to them. If you still want to know their "length," convert them to strings first:
df.iloc[:, 1:].astype(str).applymap(len)
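As a minimal sketch of that conversion on the frame from the question (with column D included), the lengths and the resulting mask could be computed as follows. It uses a per-column Series.map(len) rather than applymap, on the assumption that you may be on a recent pandas release where applymap is deprecated; the names lengths and mask are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['X', 'Foo', 'No', '', 'X', 'Big', 'No'],
    'B': ['', 'Bar', 'Merge', '', '', 'Cat', 'Merge'],
    'C': ['', 'Fubu', 'XXXX', '', '', 'BgCt', 'YYYY'],
    'D': ['', '', 1.0, 2.0, 3.0, '', ''],
})

# Convert every cell to str first so that len() is defined for the floats in D.
# Series.map(len) is applied column by column instead of calling applymap.
lengths = df.iloc[:, 1:].astype(str).apply(lambda col: col.map(len))

# True for the cells that come before the first 4-character value in each row.
mask = (lengths == 4).cumsum(axis=1) == 0

print(lengths)
print(mask)
```

With this data the floats in D become '1.0', '2.0' and '3.0', each of length 3, so none of them is mistaken for a 4 letter string.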
However, be advised that the function str is not guaranteed to produce a particular string representation of a float. For example, len(str(5.0000)) is 3, not 6, as you might expect.
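Because of that caveat, a float such as 12.5 would count as a "4 letter string" after conversion. One possible alternative, not part of the original answer, is to test the type explicitly so that only real strings can match. The sketch below uses a small made-up frame and a hypothetical helper is_four_letter_string to show the difference.

```python
import pandas as pd

# Hypothetical data: 12.5 is a float whose string form '12.5' has 4 characters.
df = pd.DataFrame({
    'B': ['Bar', 12.5],
    'C': ['Fubu', 'XXXX'],
})

def is_four_letter_string(value):
    """Return True only for str values of length 4 (numbers never match)."""
    return isinstance(value, str) and len(value) == 4

# Type-aware test, applied column by column.
hits = df.apply(lambda col: col.map(is_four_letter_string)).astype(bool)
mask = hits.cumsum(axis=1) == 0

# For comparison: the string-conversion approach also flags the float 12.5.
str_hits = df.astype(str).apply(lambda col: col.map(len)) == 4

print(hits)
print(str_hits)
```

Here hits marks only 'Fubu' and 'XXXX', while str_hits additionally marks the 12.5 cell. Which behaviour is right depends on whether numbers should ever stop the merge.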