I am new to Python. I have output from a plugin that arrives in an Excel sheet, and I need to extract values from that column.
Plugin Output:
Country:USA State: Virginia Address: 23 xys lane SSN:2345550404 Zip : 22102 City: Fairfax
Country:India State:Virginia SSN:2345550401 ZIP:452002 City: Indore
I need to find the SSN in each row and store it in a separate column of a new pandas DataFrame.
Desired Output:
SSN
2345550404
2345550401
Answer for Serial Number:
def find_serialnumber(x):
    num = re.findall(r'Serial Number:\s*([^\n]+)', x)
    return " ".join(num)
import re

def find_number(x):
    # Match "SSN", allow optional spaces around the colon, then capture the digits
    num = re.findall(r'SSN\s*:\s*(\d+)', x)
    return " ".join(num)

df['SSN'] = df['Output'].apply(find_number)
pandas also has a built-in method for this, Series.str.extract, which takes the same kind of regex.
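As a rough sketch of the extract approach, assuming the column is named 'Output' as in the answer above and that the rows look like the two samples in the question:

```python
import pandas as pd

# Sample rows copied from the question; the 'Output' column name is an assumption
df = pd.DataFrame({
    "Output": [
        "Country:USA State: Virginia Address: 23 xys lane SSN:2345550404 Zip : 22102 City: Fairfax",
        "Country:India State:Virginia SSN:2345550401 ZIP:452002 City: Indore",
    ]
})

# With a single capture group and expand=False, str.extract returns a Series.
# Rows where the pattern does not match get NaN.
df["SSN"] = df["Output"].str.extract(r"SSN\s*:\s*(\d+)", expand=False)

print(df["SSN"].tolist())  # ['2345550404', '2345550401']
```

Note that the extracted values are strings. If real data can contain leading zeros, it is probably safer to leave them as strings than to convert to int.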
Here \d+ matches one or more digits, so the capture group picks up the number that follows SSN.
# One-line variant: keep the first match, or fall back to the original text if no SSN is found
df['SSN'] = df['Output'].apply(lambda x: re.findall(r'SSN\s*:\s*(\d+)', x)[0] if re.findall(r'SSN\s*:\s*(\d+)', x) else x)
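If you would rather get an empty value than the whole original text when a row has no SSN, one possible variant is a small helper that runs the regex once and returns None on a miss. The names SSN_PATTERN and first_ssn are made up for this sketch, not part of pandas or re:

```python
import re
from typing import Optional

# Hypothetical helper names; the pattern assumes the "SSN:<digits>" layout from the question
SSN_PATTERN = re.compile(r"SSN\s*:\s*(\d+)")

def first_ssn(text: str) -> Optional[str]:
    """Return the first SSN found in text, or None if there is none."""
    match = SSN_PATTERN.search(text)
    return match.group(1) if match else None

rows = [
    "Country:USA State: Virginia Address: 23 xys lane SSN:2345550404 Zip : 22102 City: Fairfax",
    "Country:India State:Virginia SSN:2345550401 ZIP:452002 City: Indore",
    "Country:USA State: Virginia City: Fairfax",  # made-up row with no SSN
]

print([first_ssn(r) for r in rows])  # ['2345550404', '2345550401', None]
```

It can be plugged into the DataFrame the same way as above, e.g. df['SSN'] = df['Output'].apply(first_ssn).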