Extracting a value from within a longer string

I am trying to extract a string from a longer string in one of my columns.

Here is a sample of what I have tried:

df['Campaign'] = df.full_utm.str.extract('utm_campaign=([^&]*)')

This is a sample of the string I am referring to:

?utm_source=Facebook&utm_medium=CPC&utm_campaign=April+Merchants+LAL+-+All+SA+-+CAP+250&utm_content=01noprice

The problem is that this only returns:

A

The desired output in this context would be:

April+Merchants+LAL+-+All+SA+-+CAP+250
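
As a quick check outside pandas, the same pattern can be run against the sample string with the standard re module. This is only a sketch; the names sample and campaign are illustrative and not from the original post. If the full value is captured here, the pattern itself is probably not the cause, and it may be worth inspecting what df.full_utm actually contains.

```python
import re

# The sample query string from the question, split across lines for readability.
sample = ("?utm_source=Facebook&utm_medium=CPC"
          "&utm_campaign=April+Merchants+LAL+-+All+SA+-+CAP+250"
          "&utm_content=01noprice")

# Same pattern as in the question: capture everything after
# "utm_campaign=" up to the next "&" (or the end of the string).
match = re.search(r"utm_campaign=([^&]*)", sample)
campaign = match.group(1) if match else None
print(campaign)
```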

Use urlparse from urllib.parse.

Example:

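import urllib.parse as urlparse

# parse_qs maps each key to a list of values, so [0] takes the first one.
# It also decodes the query string, so "+" becomes a space in the result.
df['Campaign'] = df["full_utm"].apply(lambda x: urlparse.parse_qs(urlparse.urlparse(x).query)["utm_campaign"][0])
print(df)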
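
Because parse_qs decodes the value, the result will not contain the "+" characters shown in the desired output. The following is a minimal sketch, using only the standard library, of a helper that can return either the decoded or the raw value and that returns None when the key is absent. The function name get_query_value and its decode parameter are hypothetical, introduced here for illustration.

```python
from urllib.parse import parse_qs, urlparse


def get_query_value(url, key, decode=True):
    """Return the first value of `key` in the URL's query string, or None."""
    query = urlparse(url).query
    if decode:
        # parse_qs maps each key to a list of values and decodes "+" and %XX.
        values = parse_qs(query).get(key)
        return values[0] if values else None
    # Keep the raw, still-encoded text by splitting the query by hand.
    for pair in query.split("&"):
        name, _, value = pair.partition("=")
        if name == key:
            return value
    return None


sample = ("?utm_source=Facebook&utm_medium=CPC"
          "&utm_campaign=April+Merchants+LAL+-+All+SA+-+CAP+250"
          "&utm_content=01noprice")

decoded = get_query_value(sample, "utm_campaign")            # "+" decoded to spaces
raw = get_query_value(sample, "utm_campaign", decode=False)  # "+" kept as-is
print(decoded)
print(raw)
```

Assuming the column holds strings, it could be applied with something like df["full_utm"].apply(lambda x: get_query_value(x, "utm_campaign", decode=False)).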
