I have a df:
MinMaleTA
28 888(G2M)
29 888(AAM)
30 888(G2M)
31 888(G2M)
32 888(AAM)
33 888(G2M)
34 888(G2M)
35 888(AAM)
36 888(G2M)
37 888(G2M)
38 888(G2M)
39 888(G2M)
40 888(AAM)
41 888(G2M)
42 888(G2M)
43 888(G2M)
Sometimes the string inside '()' is longer than 3 characters, like:
28 888(G2MPTM)
How can I extract the string between '()' in MinMaleTA?
something like:
result = df['MinMaleTA'].startwith"(" and endwith")"
the output for the first 2 rows should be:
G2M AAM
Use the str.extract method with a regex:
>>> df['MinMaleTA'].str.extract(r'\((.*)\)')
0
28 G2M
29 AAM
30 G2M
31 G2M
32 AAM
33 G2M
34 G2M
35 AAM
36 G2M
37 G2M
38 G2M
39 G2M
40 AAM
41 G2M
42 G2M
43 G2M
\( and \) match the literal characters ( and ).
(.*) is the capturing group; it matches any number of characters.
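To see what each piece of the pattern does, here is a small sketch using the standard re module on plain strings. The first three samples come from the question; the string with two pairs of parentheses is made up for illustration, and the [^)]* variant is an alternative I am suggesting, not part of the answer above.

```python
import re

# Same pattern as in the str.extract call above.
pattern = re.compile(r'\((.*)\)')

samples = ['888(G2M)', '888(AAM)', '888(G2MPTM)']
captured = [pattern.search(s).group(1) for s in samples]
print(captured)

# Hypothetical input with two pairs of parentheses (not in the question).
# .* is greedy, so it runs from the first '(' to the last ')'.
greedy = re.search(r'\((.*)\)', '888(G2M)(AAM)').group(1)

# [^)]* stops at the first ')', so only the first group is captured.
strict = re.search(r'\(([^)]*)\)', '888(G2M)(AAM)').group(1)
print(greedy, strict)
```

For the data shown in the question both patterns give the same result; the difference only matters if a value could contain more than one pair of parentheses.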
If the strings always have the same structure, and the text inside ( ) always has the same length:
result = df['MinMaleTA'].str[-4:-1]
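Note that the question mentions values like 888(G2MPTM), where the text in parentheses is longer than 3 characters. A quick comparison of slicing against a regex on a small made-up frame (three values taken from the question, default index, assuming pandas is installed) might look like this:

```python
import pandas as pd

# Illustrative frame; the values appear in the question, the index does not.
df = pd.DataFrame({'MinMaleTA': ['888(G2M)', '888(AAM)', '888(G2MPTM)']})

# Fixed-width slice: always takes the 3 characters before the final ')'.
sliced = df['MinMaleTA'].str[-4:-1]

# Regex: takes everything between '(' and the next ')'.
# expand=False returns a Series instead of a one-column DataFrame.
extracted = df['MinMaleTA'].str.extract(r'\(([^)]*)\)', expand=False)

print(sliced.tolist())
print(extracted.tolist())
```

The slice returns only PTM for the longer value, so it is safe only when the length inside the parentheses is fixed at 3.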