Pandas column indexing strings
I want to take only the first three characters of a pandas column and match them. This is what I have come up with, but the implementation is incorrect:
df.loc[df[0:2] == 'x, y] = 'x'
You are close. You need the str accessor, and if df is a DataFrame you also have to specify the column to replace. Note too that 'x, y' is 4 characters long, counting the comma and the space, so the slice must be [:4]:
df.loc[df['col'].str[:4] == 'x, y', 'col'] = 'x'
#another solution
#df.loc[df['col'].str.startswith('x, y'), 'col'] = 'x'
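To make the difference from the original attempt concrete, here is a minimal sketch. The column name col and the sample values are assumptions borrowed from the sample further down. It shows that df[0:2] slices rows rather than characters, and that the .str[:4] comparison and startswith give the same mask on this data.

```python
import pandas as pd

# hypothetical data, same values as the sample in this answer
df = pd.DataFrame({'col': ['x, y temp', 'sx, y', 'x, y', 's']})

# df[0:2] selects the first two ROWS of the frame, not characters
rows = df[0:2]

# .str[:4] slices every string in the column to its first 4 characters
prefix = df['col'].str[:4]

# both masks flag the rows whose value begins with 'x, y'
mask_slice = prefix == 'x, y'
mask_starts = df['col'].str.startswith('x, y')

print(rows.shape)
print(mask_slice.tolist())
print(mask_starts.tolist())
```

startswith avoids having to count the characters of the prefix by hand, which is where the original [0:2] slice went wrong.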
If working with a Series:
s[s.str[:4] == 'x, y'] = 'x'
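If the Series may contain missing values (an assumption, since the question does not say), the startswith variant can be given na=False so that missing entries count as non-matches and the mask stays purely boolean. A small sketch with made-up values:

```python
import pandas as pd

# hypothetical Series with one missing value
s = pd.Series(['x, y temp', None, 'x, y', 's'])

# na=False treats missing entries as non-matches, so the mask stays boolean
mask = s.str.startswith('x, y', na=False)

# overwrite only the entries that start with 'x, y'
s[mask] = 'x'

print(mask.tolist())
print(s.tolist())
```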
Sample:
import pandas as pd

df = pd.DataFrame({'col': ['x, y temp', 'sx, y', 'x, y', 's']})
print(df)
         col
0  x, y temp
1      sx, y
2       x, y
3          s
# replace only the matching prefix; pass regex=True so that ^ is treated as an anchor
df['col1'] = df['col'].str.replace(r'^x, y', 'x', regex=True)
# set a new value for the whole cell where the condition holds
df.loc[df['col'].str[:4] == 'x, y', 'col'] = 'x'
print(df)
     col    col1
0      x  x temp  <- col1: only the substring is replaced
1  sx, y   sx, y
2      x       x
3      s       s