Transform string to Pandas df

Question

I have a string like this:

'key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47'

How can I transform it into a pandas DataFrame?

	key	age
0
1

Thank you

Answer 1

Use:

In [918]: import pandas as pd
In [919]: s = 'key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47'
In [922]: d = {}

In [927]: for i in s.split(', '):
     ...:     ele, val = i.split('=')
     ...:     if ele in d:
     ...:         d[ele].append(val)
     ...:     else:
     ...:         d[ele] = [val]
     ...: 
In [930]: df = pd.DataFrame(d)

In [931]: df
Out[931]: 
     key age
0  IAfpK  58
1  WNVdi  64
2  jp9zt  47
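
Note that both columns hold strings here, because the values come straight from splitting the text. If age should be numeric, it can be converted afterwards. A minimal sketch of the same loop as a plain script, assuming every age value is a valid integer (dict.setdefault stands in for the if/else above):

```python
import pandas as pd

s = 'key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47'

# Collect the values of each field name into one list per column.
d = {}
for item in s.split(', '):
    name, value = item.split('=')
    d.setdefault(name, []).append(value)

df = pd.DataFrame(d)

# Assumption: every age is a valid integer string; astype raises otherwise.
df['age'] = df['age'].astype(int)
```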

Answer 2

A quick, somewhat manual way to do it is to build a list of dictionaries, one per key/age pair, and then convert that list to a DataFrame (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html):

import pandas as pd

keylist = []
keylist.append({"key": 'IAfpK', "age": '58'})
keylist.append({"key": 'WNVdi', "age": '64'})
keylist.append({"key": 'jp9zt', "age": '47'})

#convert the list of dictionaries into a df
key_df = pd.DataFrame(keylist, columns = ['key', 'age'])

However, this is only practical for the specific string you mentioned. If you need to handle a longer string or more data, a for loop would be more efficient.

Although I think this answers your question, there are probably more optimal ways to go about it :)
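
For a longer string, the list of dictionaries can be filled in a loop instead of by hand. This is only a sketch, and it assumes the pairs are separated by ', ' and that a field name showing up again (here "key") marks the start of a new row:

```python
import pandas as pd

s = 'key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47'

keylist = []
row = {}
for pair in s.split(', '):
    name, value = pair.split('=', 1)
    # Assumption: a field name seen again means the previous row is complete.
    if name in row:
        keylist.append(row)
        row = {}
    row[name] = value

# Append the last row, which the loop leaves unfinished.
if row:
    keylist.append(row)

# Convert the list of dictionaries into a DataFrame.
key_df = pd.DataFrame(keylist, columns=['key', 'age'])
```

The values stay as strings, the same as in the hand-written version.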

Answer 3

Try:

import pandas as pd

s = "key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47"

x = (
    pd.Series(s)
    .str.extractall(r"key=(?P<key>.*?),\s*age=(?P<age>.*?)(?=,|\Z)")
    .reset_index(drop=True)
)
print(x)

Prints:

     key age
0  IAfpK  58
1  WNVdi  64
2  jp9zt  47
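
If you would rather not go through a Series, roughly the same idea can be sketched with the standard re module. This variant uses a simpler pattern than the one above and assumes that neither the key nor the age value contains a comma:

```python
import re

import pandas as pd

s = "key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47"

# With two capture groups, re.findall returns one (key, age) tuple per match.
# Assumption: the values themselves never contain a comma.
rows = re.findall(r"key=([^,]*),\s*age=([^,]*)", s)

df = pd.DataFrame(rows, columns=["key", "age"])
```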

solution1
0 2022-05-30 16:41:07

solution2
0 2022-05-30 16:42:36

solution3
0 2022-05-30 20:04:14
