Spread JSON data in one column to multiple columns

Using pandas, I have created a CSV file that contains data like this (about a hundred rows):

first                            second
{'val1': 'm', 'val2': 'f'}       {'val3': 'm', 'val4': 'f'}
{'val1': 'g', 'val2': 'h'}       {'val3': 'i', 'val4': 'k'}
...

Is there any way to replace the current header with the keys of the JSON data, so that each value sits under the column named after its key? I mean something like this:

   val1    val2   val3    val4
    'm'     'f'    'm'     'f'
    'g'     'h'    'i'     'k'

Here's a way to do that. I'm using eval because the CSV appears to contain string representations of dictionaries. Note that eval executes arbitrary code, so use it only if you are certain the CSV comes from a trusted source and contains no malicious code. For plain dict literals, ast.literal_eval is a safer substitute.

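import pandas as pd

# Turn each dict string into a real dict (eval is for trusted input only).
df["first"] = df["first"].apply(eval)
df["second"] = df["second"].apply(eval)

# Expand each column of dicts into its own columns, then join them side by side.
res = pd.concat([pd.json_normalize(df["first"]), pd.json_normalize(df["second"])], axis=1)
print(res)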

==>
  val1 val2 val3 val4
0    m    f    m    f
1    g    h    i    k
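
For anyone who wants to run this end to end, here is a minimal sketch of the same idea that swaps eval for ast.literal_eval and loops over every column instead of naming them. The sample frame is an assumption rebuilt from the two rows shown in the question, and the helper name expand_dict_columns is hypothetical.

```python
from ast import literal_eval

import pandas as pd

# Sample data assumed from the question: each cell is the string form of a dict.
df = pd.DataFrame({
    "first": ["{'val1': 'm', 'val2': 'f'}", "{'val1': 'g', 'val2': 'h'}"],
    "second": ["{'val3': 'm', 'val4': 'f'}", "{'val3': 'i', 'val4': 'k'}"],
})


def expand_dict_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Parse every column's dict strings and spread the keys into columns."""
    parts = []
    for col in frame.columns:
        # literal_eval only accepts Python literals, so it cannot run code.
        parsed = frame[col].apply(literal_eval)
        # One output column per dict key.
        parts.append(pd.json_normalize(parsed.tolist()))
    return pd.concat(parts, axis=1)


res = expand_dict_columns(df)
print(res)
```

If two source columns share a key, the result will contain duplicate column names, so the keys are assumed to be unique across columns, as they are in the question.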

Let's try this:

from ast import literal_eval

import pandas as pd

# literal_eval is needed only when the cells are strings rather than dicts.
# Each parsed dict is then expanded into one column per key.
res = pd.concat(
    [df[col].apply(literal_eval).apply(pd.Series) for col in df.columns],
    axis=1,
)
print(res)

  val1 val2 val3 val4
0    m    f    m    f
1    g    h    i    k
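
Since the data comes from a CSV file, the parsing can also happen while reading it. The sketch below is one possible variant, not the method from the answers above: it passes literal_eval through the converters parameter of pd.read_csv and builds each block with pd.DataFrame from a list of dicts, which is typically faster than apply(pd.Series). The in-memory CSV text stands in for the real file and is an assumption based on the question's sample rows.

```python
from ast import literal_eval
from io import StringIO

import pandas as pd

# Assumed round trip: a frame of dicts written by to_csv, as in the question.
original = pd.DataFrame({
    "first": [{"val1": "m", "val2": "f"}, {"val1": "g", "val2": "h"}],
    "second": [{"val3": "m", "val4": "f"}, {"val3": "i", "val4": "k"}],
})
csv_text = original.to_csv(index=False)  # each dict is stored as its str() form

# converters parses every cell at read time, so no separate apply step is needed.
df = pd.read_csv(
    StringIO(csv_text),
    converters={"first": literal_eval, "second": literal_eval},
)

# A list of dicts becomes a DataFrame with one column per key.
res = pd.concat(
    [pd.DataFrame(df[col].tolist(), index=df.index) for col in df.columns],
    axis=1,
)
print(res)
```

With a real file, StringIO(csv_text) would be replaced by the file path.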
