How to extract the characters from a string that are inside parentheses?

I have one column named contracting and another named contractor inside a DataFrame.

I need to split a column, for example contractor, into 2 new columns: one containing the Fiscal Number that is inside the parentheses and another containing everything else (the description).

Example:

Contractor: Meo(504615947)

I need it to become:

Contractor_Name: Meo and Contractor_Number: 504615947

I tried to do this:

proc2013[['contractor_description', 'contractor_NIF']] = pd.DataFrame(proc2013['contractor'].str.split('(', n=1).tolist())

proc2013['contractor_NIF'] = proc2013.contractor_NIF.str.extract(r'(\d+)', expand=False)

Problem 1:

The name description can itself contain parentheses, followed by the parenthesized number that I am trying to extract.

Problem 2:

Sometimes, if the contractor is from a foreign country, the Fiscal Number has a letter at the beginning (it is not digits only, as I assumed at first in my second line of code).

All Fiscal Numbers have 9 digits.

To match any alphanumeric character, you can change \d to \w:

proc2013['contractor_NIF'] = proc2013.contractor_NIF.str.extract(r'\((\w+)\)', expand=False)
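
To also cover Problem 1 (a description that contains its own parentheses), one option is to anchor the pattern to the end of the string so that only the last parenthesized group is treated as the number. The following is a minimal sketch, not a definitive solution: it assumes the Fiscal Number is always the final parenthesized token and is purely alphanumeric, and it works on the original contractor column. The second sample row and the group names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; only "Meo(504615947)" comes from the question.
df = pd.DataFrame({
    "contractor": [
        "Meo(504615947)",
        "Acme (Lisbon branch)(A12345678)",
    ]
})

# Greedy ".*" takes everything up to the LAST "(" that is followed by
# an alphanumeric token, a ")" and the end of the string.
pattern = r"^(?P<contractor_name>.*)\((?P<contractor_number>\w+)\)\s*$"

# With named groups, str.extract returns a DataFrame with one column per group.
parts = df["contractor"].str.extract(pattern)
parts["contractor_name"] = parts["contractor_name"].str.strip()

df = df.join(parts)
print(df)
```

Rows that do not end in a parenthesized token come back as NaN in both new columns, which makes unexpected formats easy to spot.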

As far as I understand your question, this is one possible solution:

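# Name: everything before the first '('
df['contractor_name'] = list(map(lambda x: x.split('(')[0], df['contractor']))
# Number: the 9 characters immediately before the closing ')'
df['contractor_number'] = list(map(lambda x: x.split('(')[-1][-10:-1], df['contractor']))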
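
The same split-based idea can be written with pandas string methods instead of map and a fixed-width slice. This is a rough sketch under the assumption that every row contains at least one "(" and that the number sits in the last parenthesized part; the sample rows other than "Meo(504615947)" are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data; only "Meo(504615947)" comes from the question.
df = pd.DataFrame({
    "contractor": [
        "Meo(504615947)",
        "Acme (Lisbon branch)(A12345678)",
    ]
})

# Split once, starting from the right, so earlier parentheses stay in the name.
parts = df["contractor"].str.rsplit("(", n=1, expand=True)

df["contractor_name"] = parts[0].str.strip()
# Drop the trailing ")" rather than relying on a fixed length.
df["contractor_number"] = parts[1].str.rstrip(")")
print(df)
```

Splitting from the right avoids cutting the name at its first parenthesis, and stripping the ")" does not depend on the number being exactly 9 characters long.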

Hope this helps.
