Extract non-digit characters before a certain character in a pandas dataframe
I have a pandas dataframe that looks like this:
> row extract_column
> 0 412952266-desiredtext1»randtext-irrelevant
> 1 512952766-desiredtext1»randtext-irrelevant
> 2 212952766-desiredtext1»randtext-irrelevant
> 3 112953066-desiredtext1»randtext-irrelevant
> 4 712953066-desiredtext1»randtext-irrelevant
> 5 612953366-desiredtext1»randtext-irrelevant
> 6 912953366-desiredtext1»randtext-irrelevant
> 7 412954866-desiredtext1»randtext-irrelevant
> 8 312954966-desiredtext1»randtext-irrelevant
> 9 212954966-desiredtext1»randtext-irrelevant
> 10 612955866-desiredtext1»randtext-irrelevant
> 11 912256266-desiredtext1»randtext-irrelevant
> 12 812256366-desiredtext1»randtext-irrelevant
> 13 512256566-desiredtext1»randtext-irrelevant
> 14 412256566-desiredtext1»randtext-irrelevant
> 15 312256566-desiredtext1»randtext-irrelevant
> 16 212256566-desiredtext1»randtext-irrelevant
> 17 612256566-desiredtext1»randtext-irrelevant
> 18 812956666-desiredtext2»randtext-irrelevant
> 19 912957166-desiredtext2»randtext-irrelevant
> 20 012957866-desiredtext2»randtext-irrelevant
> 21 12952966-desiredtext2»randtext-irrelevant
> 22 2012953066-desiredtext2»randtext-irrelevant
> 23 012953066-desiredtext2»randtext-irrelevant
> 24 312953066-desiredtext2»randtext-irrelevant
> 25 112254166-desiredtext2»randtext-irrelevant
> 26 712254166-desiredtext2»randtext-irrelevant
I want to extract the desiredtext1 and desiredtext2 values from extract_column. The desired text is always followed by the » symbol and preceded by 9 digits and a dash.
Try extract:
df.extract_column.str.extract(r'-([^\.]*)\»', expand=False)
df.extract_column.str.extract(r'-(\w+)')
Out[100]:
0
0 desiredtext1
1 desiredtext1
2 desiredtext1
3 desiredtext1
4 desiredtext1
5 desiredtext1
6 desiredtext1
7 desiredtext1
8 desiredtext1
9 desiredtext1
10 desiredtext1
11 desiredtext1
12 desiredtext1
13 desiredtext1
14 desiredtext1
15 desiredtext1
16 desiredtext1
17 desiredtext1
18 desiredtext2
19 desiredtext2
20 desiredtext2
21 desiredtext2
22 desiredtext2
23 desiredtext2
24 desiredtext2
25 desiredtext2
26 desiredtext2
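The \w+ pattern above stops at the first character that is not a letter, digit, or underscore, so it would cut the result short if the desired text ever contained spaces or hyphens. As a hedged alternative, the sketch below anchors the match on the leading digits and the » symbol instead. The sample rows are copied from the question; the `pattern` variable and the `extracted` column name are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Small sample in the same shape as the question's data (rows copied from it).
df = pd.DataFrame({
    "extract_column": [
        "412952266-desiredtext1»randtext-irrelevant",
        "812956666-desiredtext2»randtext-irrelevant",
        "2012953066-desiredtext2»randtext-irrelevant",
    ]
})

# ^\d+-   leading run of digits followed by the dash
# (.*?)   shortest possible capture ...
# »       ... ending at the first » symbol
pattern = r"^\d+-(.*?)»"

# expand=False returns a Series because the pattern has a single capture group.
df["extracted"] = df["extract_column"].str.extract(pattern, expand=False)
print(df["extracted"].tolist())
```

Using \d+ rather than \d{9} is deliberate here: a few rows in the sample have 8 or 10 leading digits, and a strict nine-digit pattern would return NaN for them.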