Get partial string contained in “()” from a pandas DataFrame

Question

I have a df:

      MinMaleTA

28    888(G2M)
29    888(AAM)
30    888(G2M)
31    888(G2M)
32    888(AAM)
33    888(G2M)
34    888(G2M)
35    888(AAM)
36    888(G2M)
37    888(G2M)
38    888(G2M)
39    888(G2M)
40    888(AAM)
41    888(G2M)
42    888(G2M)
43    888(G2M)

Sometimes the string inside '()' is longer than 3 characters, like:

 28 888(G2MPTM)

How can I get the string between '()' in MinMaleTA?

something like:

result = df['MinMaleTA'].startwith"(" and endwith")"

The output for the first 2 rows should be:

G2M AAM

Answer 1

Use the str.extract method with a regex:

>>> df['MinMaleTA'].str.extract(r'\((.*)\)')
      0
28  G2M
29  AAM
30  G2M
31  G2M
32  AAM
33  G2M
34  G2M
35  AAM
36  G2M
37  G2M
38  G2M
39  G2M
40  AAM
41  G2M
42  G2M
43  G2M

\( and \) match the literal characters ( and ).

(.*) is the capturing group; it matches any number of characters, and its content is what str.extract returns.
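As a self-contained sketch of the same idea, the snippet below rebuilds a small sample of the column (the three rows and the name `result` are made up for illustration, not taken from the asker's data). It assumes two optional tweaks that are not part of the answer above: `[^)]*` instead of `.*`, so the match stops at the first closing parenthesis, and `expand=False`, so a Series is returned rather than a one-column DataFrame.

```python
import pandas as pd

# Illustrative sample; the last row has more than 3 characters inside ()
df = pd.DataFrame(
    {"MinMaleTA": ["888(G2M)", "888(AAM)", "888(G2MPTM)"]},
    index=[28, 29, 30],
)

# \( and \) match literal parentheses; ([^)]*) captures everything
# up to the first closing parenthesis
result = df["MinMaleTA"].str.extract(r"\(([^)]*)\)", expand=False)

print(result.tolist())  # ['G2M', 'AAM', 'G2MPTM']
```

Rows that contain no parentheses would come back as NaN rather than raising an error.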

Answer 2

If the strings always have the same structure, and the text inside ( ) always has the same length, you can slice by position:

result = df['MinMaleTA'].str[-4:-1]
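The fixed-width slice only works while the text inside the parentheses is exactly 3 characters long. The sketch below uses a made-up two-row Series to show what happens with a longer value such as 888(G2MPTM), and one possible workaround that splits on the opening parenthesis instead. The names `fixed` and `flexible` are hypothetical, and the workaround assumes every value ends with ")".

```python
import pandas as pd

# Illustrative sample: one 3-character code and one longer code
s = pd.Series(["888(G2M)", "888(G2MPTM)"], name="MinMaleTA")

# Positional slice: always takes the 3 characters before the final ")"
fixed = s.str[-4:-1]

# Split once on "(", keep the part after it, then drop the trailing ")"
flexible = s.str.split("(", n=1).str[1].str.rstrip(")")

print(fixed.tolist())     # ['G2M', 'PTM']
print(flexible.tolist())  # ['G2M', 'G2MPTM']
```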

2 answers

Answer 1
1 ACCEPTED 2021-07-29 21:41:04

Answer 2
0 2021-07-30 02:05:00