Extract last term after comma into new column

Question

I have a pandas dataframe which is essentially 2 columns and 9000 rows

CompanyName  |  CompanyAddress

and the address is in the form

Line1, Line2, ..LineN, PostCode

i.e. each address is a string (dtype 'object') holding a varying number of comma-separated items, and I just want to pull out the postcode, i.e. the item after the last comma in the field.

I've tried the Dot notation string manipulation suggestions (possibly badly):

df_address['CompanyAddress'] = df_address['CompanyAddress'].str.rsplit(', ')

which just put '[ ]' around the fields. I had no success isolating the last component of any split or partitioned string, and maxsplit kept raising errors.

I had a small degree of success following EdChum's comment on "Pandas split Column into multiple columns by comma":

pd.concat([df_address[['CompanyName']], df_address['CompanyAddress'].str.rsplit(', ', expand=True)], axis=1)

However, whilst this isolates the postcode, it creates multiple columns and the postcode ends up anywhere in columns 3-6, which is equally no good.

It feels incredibly close, please advise.

    EmployerName    Address
0   FAUCET INN LIMITED  [Union, 88-90 George Street, London, W1U 8PA]
1   CITIBANK N.A    [Citigroup Centre,, Canary Wharf, Canada Squar...
2   AGENCY 2000 LIMITED     [Sovereign House, 15 Towcester Road, Old Strat...
3   Transform Trust     [Unit 11 Castlebridge Office Village, Kirtley ...
4   R & R.C.BOND (WHOLESALE) LIMITED    [One General Street, Pocklington Industrial Es...
5   MARKS & SPENCER FINANCIAL SERVICES PLC  [Marks & Spencer Financial, Services Kings Mea...

Answer 1

Given the DataFrame,

df = pd.DataFrame({'Name': ['ABC'], 'Address': ['Line1, Line2, LineN, PostCode']})

    Address                         Name
0   Line1, Line2, LineN, PostCode   ABC

If you need only the postcode, you can extract it using rsplit and reassign it to the Address column. This saves you the concat step. The trailing .str.strip() removes the space left over after the last comma.

df['Address'] = df['Address'].str.rsplit(',').str[-1].str.strip()

You get

    Address     Name
0   PostCode    ABC
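
Since the question title asks for a new column, here is a minimal sketch that leaves Address untouched and writes the result to a separate column. The column name PostCode and the two sample rows are assumptions made for illustration, not part of the original data.

```python
import pandas as pd

# Hypothetical sample rows in the same "Line1, ..., PostCode" shape.
df = pd.DataFrame({
    'Name': ['ABC', 'FAUCET INN LIMITED'],
    'Address': ['Line1, Line2, LineN, PostCode',
                'Union, 88-90 George Street, London, W1U 8PA'],
})

# Split each string on commas, take the last piece of each resulting list,
# then strip the space that followed the comma.
df['PostCode'] = df['Address'].str.rsplit(',').str[-1].str.strip()

print(df['PostCode'].tolist())  # ['PostCode', 'W1U 8PA']
```

The first .str applies rsplit to every string; the second .str indexes into the list that each split produced.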

Edit: Given that you have a DataFrame with the address values in a list

df = pd.DataFrame({'Name': ['FAUCET INN LIMITED'], 'Address': [['Union, 88-90 George Street, London, W1U 8PA']]})

    Address                                         Name
0   [Union, 88-90 George Street, London, W1U 8PA]   FAUCET INN LIMITED

You can get last element using

df['Address'] = df['Address'].apply(lambda x: x[0].split(',')[-1].strip())

You get

    Address     Name
0   W1U 8PA     FAUCET INN LIMITED
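
If the column instead holds the already-split parts (which is what an earlier .str.rsplit(', ') call leaves behind), the .str accessor can index into each list directly. The sketch below assumes that shape; the PostCode column name is again only illustrative.

```python
import pandas as pd

# Assumption: each cell is a list of address parts, as left by a prior rsplit.
df = pd.DataFrame({
    'Name': ['FAUCET INN LIMITED'],
    'Address': [['Union', '88-90 George Street', 'London', 'W1U 8PA']],
})

# .str[-1] picks the last element of each list, just as it would pick the
# last character of each string.
df['PostCode'] = df['Address'].str[-1]

print(df['PostCode'].tolist())  # ['W1U 8PA']
```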

Answer 2

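Just rsplit the existing column into 2 columns: the existing one and a new one (or two new ones if you want to keep the existing column intact). Limiting the split to n=1 means only the last comma is used, and expand=True returns the two pieces as columns. Recent pandas versions require n to be passed by keyword.

df[['Address', 'PostCode']] = df['Address'].str.rsplit(', ', n=1, expand=True)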

Edit: Since OP's Address column is a list with 1 string in it, here is a solution for that specifically:

df[['Address', 'PostCode']] = df['Address'].map(lambda x: x[0]).str.rsplit(', ', n=1, expand=True)
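
To illustrate why capping the split at one helps with addresses of different lengths, here is a small sketch using n=1 together with expand=True, which is one way to get exactly two columns however many commas each row has. The sample rows and the parts variable are made up for the example.

```python
import pandas as pd

# Hypothetical addresses with different numbers of comma-separated items.
df = pd.DataFrame({
    'Name': ['A', 'B'],
    'Address': ['Union, 88-90 George Street, London, W1U 8PA',
                'Line1, PostCode'],
})

# Split once, starting from the right, so only the last ', ' is used.
parts = df['Address'].str.rsplit(', ', n=1, expand=True)
parts.columns = ['Address', 'PostCode']

print(parts.shape)  # (2, 2)
print(parts['PostCode'].tolist())  # ['W1U 8PA', 'PostCode']
```

Without the n=1 limit, expand=True produces one column per item, which is the ragged result described in the question.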

Answer 3

rsplit returns a list; try rsplit(',')[-1] to get the last element of each line.
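
In plain Python, without pandas, the same idea looks like this. The address string is taken from the question's sample output; the variable names are only illustrative.

```python
address = 'Union, 88-90 George Street, London, W1U 8PA'

# rsplit returns a list of all the pieces; index -1 is the last one.
parts = address.rsplit(', ')
last = parts[-1]

# With maxsplit=1 only the final separator is used, giving two pieces.
head, postcode = address.rsplit(', ', 1)

print(last)      # W1U 8PA
print(head)      # Union, 88-90 George Street, London
print(postcode)  # W1U 8PA
```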

3 answers

solution1
2 ACCEPTED 2018-04-04 21:09:56

solution2
0 2018-04-04 21:00:51

solution3
0 2018-04-04 21:02:27
