I am trying to extract a substring from the username column, but I am not getting the expected result. My df looks like this:
import pandas as pd
data = {'Name':['inf.negem.netmgmt', 'infbe_cdb', 'inf_igh', 'INF_EONLOG','inf.dkprime.netmgmt','infaus_mgo','infau_abr']}
df = pd.DataFrame(data)
print(df)
Name
0 inf.negem.netmgmt
1 infbe_cdb
2 inf_igh
3 INF_EONLOG
4 inf.dkprime.netmgmt
5 infaus_mgo
6 infau_abr
I tried the following code, but I am not getting the expected result:
df['Country'] = df['Name'].str.slice(3,6)
I would like to see output like this:
output = {'Country':['No_Country', 'be', 'No_Country', 'No_Country','No_Country','aus','au']}
df = pd.DataFrame(output)
print(df)
Country
0 No_Country
1 be
2 No_Country
3 No_Country
4 No_Country
5 aus
6 au
Note: I would like to extract the text between 'inf' and '_' as the country and store it in a new column named Country. If there is nothing after 'inf', the value should be 'No_Country'.
Here's one way using str.extract:
df['Country'] = (df.Name.str.lower()
                   .str.extract(r'inf(.*?)_')
                   .replace('', float('nan'))
                   .fillna('No_Country'))
print(df)
Name Country
0 inf.negem.netmgmt No_Country
1 infbe_cdb be
2 inf_igh No_Country
3 INF_EONLOG No_Country
4 inf.dkprime.netmgmt No_Country
5 infaus_mgo aus
6 infau_abr au
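As a possible variation on the str.extract approach, the sketch below requires at least one non-underscore character between 'inf' and '_', so rows with nothing in between simply fail to match and become NaN without a separate replace step. The pattern `inf([^_]+)_` and the use of `expand=False` (to get a Series back rather than a one-column DataFrame) are my own choices, not part of the answer above, and they assume a country code never contains an underscore.

```python
import pandas as pd

data = {'Name': ['inf.negem.netmgmt', 'infbe_cdb', 'inf_igh', 'INF_EONLOG',
                 'inf.dkprime.netmgmt', 'infaus_mgo', 'infau_abr']}
df = pd.DataFrame(data)

# Lowercase first so 'INF_...' is treated like 'inf_...'.
# [^_]+ needs one or more non-underscore characters, so 'inf_igh'
# does not match at all and yields NaN, which fillna then replaces.
df['Country'] = (df['Name'].str.lower()
                   .str.extract(r'inf([^_]+)_', expand=False)
                   .fillna('No_Country'))

print(df)
```

Note that, like the original pattern, this would still capture text such as '.x' from a name like 'inf.x_y'; tighten the character class if your data can contain that.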
Using list comprehension and re.findall:
import re
df['Country'] = ["".join(re.findall(r'inf(.*?)_', i)) for i in df['Name']]
print(df)
Name Country
0 inf.negem.netmgmt
1 infbe_cdb be
2 inf_igh
3 INF_EONLOG
4 inf.dkprime.netmgmt
5 infaus_mgo aus
6 infau_abr au
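The list comprehension above leaves an empty string where no country is found, rather than the 'No_Country' value the question asks for. One way to close that gap, sketched here with a hypothetical helper named `extract_country` and `re.search` in place of `re.findall`, is to fall back to the default whenever there is no match or the captured group is empty. The case-insensitive flag and the lowercasing of the result are assumptions made to mirror the str.extract answer.

```python
import re
import pandas as pd

# Same pattern as above, compiled once; IGNORECASE is an added assumption.
pattern = re.compile(r'inf(.*?)_', flags=re.IGNORECASE)

def extract_country(name):
    """Hypothetical helper: text between 'inf' and '_', else 'No_Country'."""
    match = pattern.search(name)
    # No match (no underscore after 'inf') or an empty capture both
    # mean there is no country code in this name.
    if match and match.group(1):
        return match.group(1).lower()
    return 'No_Country'

data = {'Name': ['inf.negem.netmgmt', 'infbe_cdb', 'inf_igh', 'INF_EONLOG',
                 'inf.dkprime.netmgmt', 'infaus_mgo', 'infau_abr']}
df = pd.DataFrame(data)
df['Country'] = [extract_country(name) for name in df['Name']]

print(df)
```

Using `re.search` also avoids concatenating several captures when a name happens to contain more than one 'inf..._' segment, which `"".join(re.findall(...))` would do.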