
python pandas substring based on column values

Given the following df:

 import pandas as pd

 data = {
     'Description': ['with lemon', 'lemon', 'and orange', 'orange'],
     'Start': ['6', '1', '5', '1'],
     'Length': ['5', '5', '6', '6'],
 }
 df = pd.DataFrame(data)
 print(df)

I would like to take a substring of "Description" using the start position and length given in the other columns. Here is the expected output:

 data = {
     'Description': ['with lemon', 'lemon', 'and orange', 'orange'],
     'Start': ['6', '1', '5', '1'],
     'Length': ['5', '5', '6', '6'],
     'Res': ['lemon', 'lemon', 'orange', 'orange'],
 }
 df = pd.DataFrame(data)
 print(df)

Is there a way to make the slice below dynamic, or another compact way to do this?

 df['Res'] = df['Description'].str[1:2]

You need to loop; a list comprehension will be the most efficient way to do it. Start is 1-based, so subtract 1 to get a Python index (Python ≥ 3.8 is required because of the walrus operator, thanks @I'mahdi):

 df['Res'] = [s[(start:=int(a)-1):start+int(b)] for (s,a,b) in zip(df['Description'], df['Start'], df['Length'])]

Or using pandas for the conversion to int (thanks @DaniMesejo):

 df['Res'] = [s[a:a+b] for (s,a,b) in zip(df['Description'], df['Start'].astype(int)-1, df['Length'].astype(int))]

Output:

   Description Start Length     Res
 0  with lemon     6      5   lemon
 1       lemon     1      5   lemon
 2  and orange     5      6  orange
 3      orange     1      6  orange
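For readers who prefer a named helper over an inline walrus expression, here is a minimal, self-contained sketch of the same idea. The function name `substring` is an illustration only and is not part of pandas; the data is the sample from the question, with Start treated as a 1-based position stored as a string.

```python
import pandas as pd


def substring(text, start, length):
    """Return `length` characters of `text`, beginning at 1-based position `start`."""
    begin = int(start) - 1  # convert the 1-based start to a 0-based index
    return text[begin:begin + int(length)]


df = pd.DataFrame({
    'Description': ['with lemon', 'lemon', 'and orange', 'orange'],
    'Start': ['6', '1', '5', '1'],
    'Length': ['5', '5', '6', '6'],
})

# Walk the three columns in parallel and slice each description row by row.
df['Res'] = [
    substring(s, a, b)
    for s, a, b in zip(df['Description'], df['Start'], df['Length'])
]
print(df)
```

This version also runs on Python older than 3.8, since it does not use the walrus operator.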

Given that the fruit name of interest always seems to be the final word in the Description column, you might be able to use a regex extract approach here.

 df["Res"] = df["Description"].str.extract(r'(\w+)$', expand=False)
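As a rough, self-contained check of the regex approach on the sample data from the question, the sketch below applies the pattern to the DataFrame (not to the raw dict). It ignores the Start and Length columns entirely, so it only fits when the wanted text really is the last word; a value ending in punctuation or a space would not match and would give NaN.

```python
import pandas as pd

df = pd.DataFrame({
    'Description': ['with lemon', 'lemon', 'and orange', 'orange'],
})

# (\w+)$ captures the run of word characters that ends the string.
# expand=False returns a Series instead of a one-column DataFrame.
df['Res'] = df['Description'].str.extract(r'(\w+)$', expand=False)
print(df)

# A trailing non-word character means there is no match, hence NaN.
no_match = pd.Series(['lemon!']).str.extract(r'(\w+)$', expand=False)
print(no_match.isna().tolist())
```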

You can use .map to go through the Series, call split(' ') to separate the words wherever there is a space, and take the last word in the resulting list with [-1].

 df['RES'] = df['Description'].map(lambda x: x.split(' ')[-1])
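The same last-word idea can also be written with the pandas `.str` accessor instead of a lambda. This is only a sketch on the sample data, and like the `.map` version it assumes the wanted text is always the final space-separated word.

```python
import pandas as pd

df = pd.DataFrame({
    'Description': ['with lemon', 'lemon', 'and orange', 'orange'],
})

# .str.split(' ') turns each string into a list of words;
# .str[-1] then picks the last element of each list.
df['Res'] = df['Description'].str.split(' ').str[-1]
print(df)
```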
