Python/Pandas - Updating dataframe from JSON with conditions

I'm trying to add a new column to my dataframe like this:

df_precip_avail_rain_hourly['coordE'] = [
        item for item in data["features"] 
        if item["properties"]["cellId"] == df_precip_avail_rain_hourly.SId
    ][0]["geometry"]["coordinates"][0][0][0]

Without the pandas update, this yields a float:

[item for item in data["features"] 
 if item["properties"]["cellId"] == 38][0]["geometry"]["coordinates"][0][0][0]
#returns 10.914622377957983

However, if I try to update my DataFrame with it, I get the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-154-bbdf5e48ffd5> in <module>()
----> 1 df_precip_avail_rain_hourly['coordE'] = [item for item in data["features"] if (item["properties"]["cellId"] == df_precip_avail_rain_hourly.SId).bool()][0]["geometry"]["coordinates"][0][0][0]

<ipython-input-154-bbdf5e48ffd5> in <listcomp>(.0)
----> 1 df_precip_avail_rain_hourly['coordE'] = [item for item in data["features"] if (item["properties"]["cellId"] == df_precip_avail_rain_hourly.SId).bool()][0]["geometry"]["coordinates"][0][0][0]

/usr/local/lib/python3.5/dist-packages/pandas/core/generic.py in bool(self)
    908                              "{0}".format(self.__class__.__name__))
    909 
--> 910         self.__nonzero__()
    911 
    912     def __abs__(self):

/usr/local/lib/python3.5/dist-packages/pandas/core/generic.py in __nonzero__(self)
    890         raise ValueError("The truth value of a {0} is ambiguous. "
    891                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 892                          .format(self.__class__.__name__))
    893 
    894     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I tried to use .bool() like:

df_precip_avail_rain_hourly['coordE'] = [
        item for item in data["features"] 
        if (item["properties"]["cellId"] == df_precip_avail_rain_hourly.SId).bool()
    ][0]["geometry"]["coordinates"][0][0][0]

However, the same error appears. What can I do to resolve this? Thank you!

EDIT df_precip_avail_rain_hourly has data like:

index SId
1     38
2     38
3     46

And data is a JSON with elements like:

{'geometry': {'coordinates': [[[10.914622377957983, 45.682007076150505],
    [10.927456267537572, 45.68179119797432],
    [10.927147329501077, 45.672795442796335],
    [10.914315493899755, 45.67301125363092],
    [10.914622377957983, 45.682007076150505]]],
  'type': 'Polygon'},
 'id': 0,
 'properties': {'cellId': 38},
 'type': 'Feature'}

From this, I'd like to make

index SId coordE
1     38  10.914622377957983
2     38  10.914622377957983
3     46  11.995422377959684

etc.

Pandas does not know how to evaluate this line of your code:

if item["properties"]["cellId"] == df_precip_avail_rain_hourly.SId

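It compares a single value (the feature's cellId) with the entire SId Series, which produces a Series of booleans rather than a single True or False. Passing that Series to if is what causes the ambiguity. Calling .bool() does not help, because it only works on a Series that holds exactly one value.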
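The ambiguity can be reproduced in isolation. The snippet below is a minimal sketch that assumes a three-row SId Series like the one in the question; the variable names `sid`, `mask` and `error_message` are only illustrative.

```python
import pandas as pd

# Assumed sample: the SId column from the question as a standalone Series.
sid = pd.Series([38, 38, 46], name="SId")

# Comparing a scalar with a Series is element-wise and yields a boolean Series.
mask = 38 == sid
print(mask.tolist())  # [True, True, False]

# Using that Series as an `if` condition raises, because pandas refuses to
# collapse several booleans into a single truth value.
try:
    if mask:
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

The list comprehension in the question hits the same error on its first iteration, since every `item` is compared against the whole column at once.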

A better approach would be to convert data into a data frame, then merge the data frames:

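import pandas as pd

# Build a lookup frame: one row per feature, keeping the first longitude of its polygon
df_coords = pd.DataFrame(
    [[item['properties']['cellId'], item['geometry']['coordinates'][0][0][0]]
     for item in data['features']], columns=['SId', 'coordE'])

# merge returns a new frame, so assign the result back
df_precip_avail_rain_hourly = df_precip_avail_rain_hourly.merge(df_coords, how='left', on='SId')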
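As a possible alternative to merging, the same lookup can be sketched with a plain dict and `Series.map`. This is not part of the original answer. The `data` dict below is a trimmed-down stand-in for the real JSON: the cellId 46 polygon is a placeholder, with only its first longitude taken from the expected output in the question, and `df` and `coord_by_cell` are illustrative names.

```python
import pandas as pd

# Sample frame from the question (index 1..3).
df = pd.DataFrame({"SId": [38, 38, 46]}, index=[1, 2, 3])

# Assumed stand-in for the real JSON: each polygon is cut down to one vertex,
# and the latitude for cellId 46 is made up.
data = {
    "features": [
        {"properties": {"cellId": 38},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[10.914622377957983, 45.682007076150505]]]}},
        {"properties": {"cellId": 46},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[11.995422377959684, 45.0]]]}},
    ]
}

# Map each cellId to the first longitude of its polygon.
coord_by_cell = {
    feature["properties"]["cellId"]: feature["geometry"]["coordinates"][0][0][0]
    for feature in data["features"]
}

# Series.map looks up every SId; ids without a matching feature become NaN.
df["coordE"] = df["SId"].map(coord_by_cell)
print(df)
```

One difference worth noting: a column-on-column merge ignores the original index and returns a fresh one, whereas `map` assigns in place and keeps the existing index.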
