Pandas make new column from string slice of another column
I want to create a new column in Pandas using a string sliced from another column in the dataframe.
For example:
Sample  Value  New_sample
AAB     23     A
BAB     25     B
Where New_sample is a new column formed from a simple [:1] slice of Sample.
I've tried a number of things to no avail. I feel I'm missing something simple.
What's the most efficient way of doing this?
You can use the str accessor and apply a slice. This will be much quicker than the apply approach shown next, because it is vectorised (thanks @unutbu):
df['New_Sample'] = df.Sample.str[:1]
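For anyone who wants to run this end to end, here is a minimal sketch. It assumes the sample data from the question; the `with_missing` series is a made-up extra to show that missing values simply stay missing instead of raising an error.

```python
import pandas as pd

# Sample data mirroring the question
df = pd.DataFrame({"Sample": ["AAB", "BAB"], "Value": [23, 25]})

# Vectorised slice through the .str accessor: first character of each string
df["New_Sample"] = df["Sample"].str[:1]
print(df)

# Hypothetical extra: a missing value is propagated rather than raising
with_missing = pd.Series(["AAB", None]).str[:1]
print(with_missing)
```

Any ordinary Python slice works inside the brackets, for example `.str[-2:]` for the last two characters.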
You can also apply a lambda function to the column, but this will be slower on larger dataframes:
In [187]:
df['New_Sample'] = df.Sample.apply(lambda x: x[:1])
df
Out[187]:
  Sample  Value New_Sample
0    AAB     23          A
1    BAB     25          B
You can also use slice() to slice the strings of a Series, as follows:
df['New_sample'] = df['Sample'].str.slice(0,1)
From the pandas documentation:
Series.str.slice(start=None, stop=None, step=None)
Slice substrings from each element in the Series/Index
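A short sketch of how the three parameters behave may help. It uses the two sample strings from the question; the variable names are only illustrative.

```python
import pandas as pd

s = pd.Series(["AAB", "BAB"])

first = s.str.slice(0, 1)          # same result as s.str[:1]
last_two = s.str.slice(start=-2)   # same result as s.str[-2:]
every_other = s.str.slice(step=2)  # same result as s.str[::2]

print(first.tolist())
print(last_two.tolist())
print(every_other.tolist())
```

The method form is mostly a matter of taste. It can read more clearly when start, stop, or step come from variables.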
To slice the index (if the index is of string type), you can try:
df.index = df.index.str.slice(0,1)
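A minimal sketch of the index case, assuming a dataframe whose index holds the Sample strings from the question. One thing to watch: shortening labels can produce duplicates, so it may be worth checking `is_unique` afterwards.

```python
import pandas as pd

# Assumed setup: the Sample strings are used as the index
df = pd.DataFrame({"Value": [23, 25]}, index=["AAB", "BAB"])

# Index exposes the same .str accessor as Series
df.index = df.index.str.slice(0, 1)

print(df)
print(df.index.is_unique)  # False would mean the slice created duplicate labels
```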
Adding a solution for a common variation, where the slice width varies across dataframe rows:
# Extract the ID part of the email (the part before '@')
# First, find the position of '@' in each email
d['pos'] = d['Email'].str.find('@')
# Then slice each email up to that position, row by row, with a lambda
d['new_var'] = d.apply(lambda x: x['Email'][0:x['pos']], axis=1)
# Here x['Email'] is a plain string, so ordinary slicing applies
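Here is a runnable sketch of the variable-width case, using made-up email addresses. It also shows one caveat: `str.find` returns -1 when there is no '@', so the row-wise slice silently drops the last character. The `str.split` line is a different technique from the one in this answer, included only as a possible vectorised alternative; `local_part` is a hypothetical column name.

```python
import pandas as pd

# Hypothetical data; the last row deliberately has no '@'
d = pd.DataFrame({"Email": ["alice@example.com", "bob@example.org", "no-at-sign"]})

# Row-wise approach from the answer: find '@', then slice up to it
d["pos"] = d["Email"].str.find("@")
d["new_var"] = d.apply(lambda x: x["Email"][0:x["pos"]], axis=1)

# Alternative (not from the answer): split once on '@' and keep the first piece
d["local_part"] = d["Email"].str.split("@", n=1).str[0]

print(d)
```

For the row without '@', `pos` is -1 and `new_var` loses its final character, while the split version returns the whole string unchanged.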
Hope this helps!