How to slice strings in a column by another column in pandas
import pandas as pd
df = pd.DataFrame({'A': ['abcde', 'fghij', 'klmno', 'pqrst'], 'B': [1, 2, 3, 4]})
I want to slice each string in column A by the corresponding value in column B, e.g. abcde[:1] = a, klmno[:3] = klm,
but both of these statements failed:
df['new_column']=df.A.map(lambda x: x.str[:df.B])
df['new_column']=df.apply(lambda x: x.A[:x.B])
TypeError: string indices must be integers
and
df['new_column']=df['A'].str[:df['B']]
new_column returns NaN.
I am trying to get this new_column:
A B new_column
0 abcde 1 a
1 fghij 2 fg
2 klmno 3 klm
3 pqrst 4 pqrs
Thank you so much.
You need to pass axis=1 to the apply method so that the function is called once per row instead of once per column:
df['new_column'] = df.apply(lambda r: r.A[:r.B], axis=1)
df
# A B new_column
#0 abcde 1 a
#1 fghij 2 fg
#2 klmno 3 klm
#3 pqrst 4 pqrs
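To see why axis=1 matters, here is a small sketch (my own illustration, reusing the df from the question) that applies len both ways. With the default axis=0 the function receives each whole column, so there is no single row to pull A and B from; with axis=1 it receives one row at a time.

```python
import pandas as pd

df = pd.DataFrame({'A': ['abcde', 'fghij', 'klmno', 'pqrst'], 'B': [1, 2, 3, 4]})

# Default axis=0: len is called on each column, which has 4 entries.
per_column = df.apply(len)
print(per_column.tolist())   # [4, 4]

# axis=1: len is called on each row, which has 2 entries (A and B).
per_row = df.apply(len, axis=1)
print(per_row.tolist())      # [2, 2, 2, 2]

# With a row in hand, A and B are scalars, so plain string slicing works.
df['new_column'] = df.apply(lambda row: row['A'][:row['B']], axis=1)
print(df['new_column'].tolist())   # ['a', 'fg', 'klm', 'pqrs']
```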
A less idiomatic but usually faster solution is to use zip:
df['new_column'] = [A[:B] for A, B in zip(df.A, df.B)]
df
# A B new_column
#0 abcde 1 a
#1 fghij 2 fg
#2 klmno 3 klm
#3 pqrst 4 pqrs
%timeit df.apply(lambda r: r.A[:r.B], axis=1)
# 1000 loops, best of 3: 440 µs per loop
%timeit [A[:B] for A, B in zip(df.A, df.B)]
# 10000 loops, best of 3: 27.6 µs per loop
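The %timeit lines above only work in IPython or Jupyter. If you want to repeat the comparison in a plain script, something like the following should do, using the standard timeit module. The helper names with_apply and with_zip and the repeat count of 100 are my own choices, and the numbers will differ by machine and pandas version, so no expected timings are shown.

```python
import timeit

import pandas as pd

df = pd.DataFrame({'A': ['abcde', 'fghij', 'klmno', 'pqrst'], 'B': [1, 2, 3, 4]})


def with_apply():
    # Row-wise apply: builds a Series object for every row.
    return df.apply(lambda r: r.A[:r.B], axis=1)


def with_zip():
    # Plain Python loop over the two columns in parallel.
    return [a[:b] for a, b in zip(df.A, df.B)]


# Both approaches should produce the same values before we compare speed.
assert with_apply().tolist() == with_zip()

runs = 100  # arbitrary repeat count for this sketch
apply_seconds = timeit.timeit(with_apply, number=runs)
zip_seconds = timeit.timeit(with_zip, number=runs)

print(f"apply: {apply_seconds / runs * 1e6:.1f} µs per call")
print(f"zip:   {zip_seconds / runs * 1e6:.1f} µs per call")
```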