Given a JSON string of records where the "schema" for each record is not consistent (e.g. not every record has the full set of "columns"):
s = """[{"a": 3, "b":[]}, {"a": 4, "b": [4]}, {"a": 5}]"""
A pandas DataFrame can be constructed from this string:
import pandas as pd
import json
json_df = pd.DataFrame.from_records(json.loads(s))
This results in:
a b
0 3 []
1 4 [4]
2 5 NaN
How can all NaN instances in a pandas Series column be filled with empty list values? The expected resulting DataFrame would be:
a b
0 3 []
1 4 [4]
2 5 []
I have tried the following, none of which worked:
json_df[json_df.b.isna()] = [[]]*json_df[json_df.b.isna()].shape[0]
from itertools import repeat
json_df[json_df.b.isna()] = repeat([], json_df[json_df.b.isna()].shape[0])
import numpy as np
json_df[json_df.b.isna()] = np.repeat([], json_df[json_df.b.isna()].shape[0])
Thank you in advance for your consideration and response.
First find the NaN entries in column b, then replace them with a Series of the same shape that holds empty lists:
json_df.loc[json_df.b.isnull(), 'b'] = json_df.loc[json_df.b.isnull(), 'b'].apply(lambda x: [])
a b
0 3 []
1 4 [4]
2 5 []
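As an alternative to the answer above, here is a minimal sketch that rebuilds the column element by element. It assumes every valid entry in b is already a Python list, so anything that is not a list (here, the NaN left by the missing key) can be swapped for a fresh empty list. Note that `fillna([])` is not an option, because `fillna` rejects a list as its fill value.

```python
import json

import pandas as pd

s = """[{"a": 3, "b":[]}, {"a": 4, "b": [4]}, {"a": 5}]"""
json_df = pd.DataFrame.from_records(json.loads(s))

# Assumption: every valid entry in "b" is a list. Any other value
# (such as the NaN from a missing key) becomes a new empty list.
json_df["b"] = json_df["b"].apply(lambda x: x if isinstance(x, list) else [])

print(json_df)
```

Because the lambda creates a new `[]` on every call, the filled rows do not share a single list object, so mutating one of them later does not affect the others.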