Need to extract specific value from excel column using python pandas dataframe
The column Product that I am trying to extract looks the below & need to extract only Product # from it. The column also has other numbers but the Product # always comes after the term 'UK Pro' & Product # could be 3 to 4 digit number in a particular row of data.
In[1]:
df['Product'].head()
#Dataframe looks like this:
Out[1]:
Checking center: King 2000: UK Pro 1000: London
Checking center: Queen 321: UK Pro 250: Spain
CC: UK Pro 3000: France
CC: UK Pro 810: Poland
Expected Output:
Product #
1000
250
3000
810
Started with this:
df['Product #'] = df1['Product'].str.split(':').str[1]
But this does split only based on the first two occurrence of: operator.
Then tried this:
df1['Product #'] = df1['Product'].str.split('UK Pro', 1).str[0].str.strip()
You can use pandas.Series.str.extract
:
df["Product #"] = df["Product"].str.extract("UK Pro (\d+)", expand=False)
print(df)
Product #
0 NaN
1 NaN
2 1000
3 NaN
4 NaN
5 250
6 NaN
7 3000
8 NaN
9 810
10 NaN
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.