简体   繁体   中英

How to extract specific value from excel column using python pandas dataframe

Need to extract specific value from excel column using python pandas dataframe

The column Product that I am trying to extract looks the below & need to extract only Product # from it. The column also has other numbers but the Product # always comes after the term 'UK Pro' & Product # could be 3 to 4 digit number in a particular row of data.

In[1]:

df['Product'].head()

#Dataframe looks like this:

Out[1]:

Checking center: King 2000: UK Pro 1000: London

Checking center: Queen 321: UK Pro 250: Spain

CC: UK Pro 3000: France

CC: UK Pro 810: Poland

Expected Output:

Product #

1000

250

3000

810

Started with this:

df['Product #'] = df1['Product'].str.split(':').str[1]

But this does split only based on the first two occurrence of: operator.

Then tried this:

df1['Product #'] = df1['Product'].str.split('UK Pro', 1).str[0].str.strip()

You can use pandas.Series.str.extract :

df["Product #"] = df["Product"].str.extract("UK Pro (\d+)", expand=False)

# Output:

print(df)
   Product #
0        NaN
1        NaN
2       1000
3        NaN
4        NaN
5        250
6        NaN
7       3000
8        NaN
9        810
10       NaN

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM