How to correctly use .apply(lambda x:) on a dataframe column

The issue I'm having is an error I'm receiving from df_modified['lat'] = df.coordinates.apply(lambda x: x[0]). It returns TypeError: 'float' object is not subscriptable. Since "coordinates" is already a list (see JSON SNIPPET), I was trying to use a lambda to pull out element [0] and place it in a new column named "lat", and to place element [1] in a new column named "long". Any help with this problem would be appreciated. Thank you!

import pandas as pd
import json
import requests
from pandas.io.json import json_normalize

# READS IN JSON
source = requests.get('www.url.com')
data = json.loads(source.text)

# Flattens the JSON data since it had nested dictionaries
df = pd.io.json.json_normalize(data)

# Renamed "lat_long.coordinates" because the "." was confusing .apply() function
df.rename(columns={'lat_long.coordinates': 'coordinates'}, inplace=True)

# Created a new data frame with selected columns
df_modified = df.loc[:, ['county_name', 'arrests', 'incident_count']]

# Attempt to create new columns "lat" and "long" and place the elements accordingly, i.e. [-75.802503, 41.820569]
df_modified['lat'] = df.coordinates.apply(lambda x: x[0])
df_modified['long'] = df.coordinates.apply(lambda x: x[1])

print(df_modified.head(30))

SAMPLE JSON SNIPPET

{
    ":@computed_region_amqz_jbr4": "587",
    ":@computed_region_d3gw_znnf": "18",
    ":@computed_region_nmsq_hqvv": "55",
    ":@computed_region_r6rf_p9et": "36",
    ":@computed_region_rayf_jjgk": "295",
    "arrests": "1",
    "county_code": "44",
    "county_code_text": "44",
    "county_name": "Mifflin",
    "fips_county_code": "087",
    "fips_state_code": "42",
    "incident_count": "1",
    "lat_long": {
      "type": "Point",
      "coordinates": [
        -77.620031,
        40.612749
      ]
    }
}

You can do it the other way around: take the lat and long before filtering the columns.

import json

import pandas as pd

with open('sample.json') as infile:
    data = json.load(infile)

# Flatten the nested JSON. pd.json_normalize is the top-level name since
# pandas 1.0; older versions expose it as pandas.io.json.json_normalize.
df = pd.json_normalize(data)

df.rename(columns={'lat_long.coordinates': 'coordinates'}, inplace=True)

# Extract both elements on the full frame, before any columns are dropped
df['lat'] = df['coordinates'].apply(lambda x: x[0])
df['long'] = df['coordinates'].apply(lambda x: x[1])

# Create a new data frame with the selected columns
df_modified = df.loc[:, ['county_name', 'arrests', 'incident_count', 'lat',
                         'long']]
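
One more thing worth checking: TypeError: 'float' object is not subscriptable means the lambda received a float rather than a list, which typically happens when a record has no "lat_long" entry and the flattened "coordinates" cell is NaN. The following is only a minimal sketch under that assumption. The two-row frame and the helper name element_or_none are made up for illustration and are not part of the original data or of pandas. Also note that in the sample the first element (-77.620031) looks like a longitude, since GeoJSON points are ordered [longitude, latitude], so the column names may need to be swapped.

```python
import pandas as pd

# Hypothetical stand-in for the flattened data: the second row has no
# "lat_long" entry, so its "coordinates" cell is NaN (a float).
df = pd.DataFrame({
    "county_name": ["Mifflin", "Unknown"],
    "coordinates": [[-77.620031, 40.612749], float("nan")],
})


def element_or_none(value, index):
    """Return value[index] for list-like cells, or None for missing ones."""
    if isinstance(value, (list, tuple)) and len(value) > index:
        return value[index]
    return None


# Column names follow the question; swap them if index 0 is the longitude.
df["lat"] = df["coordinates"].apply(lambda x: element_or_none(x, 0))
df["long"] = df["coordinates"].apply(lambda x: element_or_none(x, 1))

print(df[["county_name", "lat", "long"]])
```

Rows without coordinates end up with a missing value in both new columns instead of raising, so they can be dropped or filled later as needed.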

