
Fill DataFrame in for loop with Python involving a function

I want to retrieve geo data (long/lat) from Mapzen for addresses in Germany. Mapzen provides an API that requires a key. Each request returns JSON.

The following code returns the long/lat and address name for one address:

import pandas as pd
import requests

BASE_URL = 'https://search.mapzen.com/v1/search'
txt = 'Stübekamp 33, Hamburg, Germany'
resp = requests.get(BASE_URL, params = {'api_key': "YourKey", 'size': 1, 'text': txt})
data = resp.json()
Full = pd.DataFrame(columns=["Long", "Lat", "Street"])
LongLat = data["bbox"][0:2]
Street = data["features"][0]["properties"]["label"]
Full.loc[1] = pd.Series({"Long": LongLat[1], "Lat": LongLat[0], "Street": Street})

I tried to replace the txt argument in order to loop over it, but as far as I understand, the requests.get method cannot be looped over. Therefore, I followed this approach and defined a function which I use in a for loop.

What I want the for loop to do is pass the string from one row of addresses as the txt argument of the function. This should be done n times, where n is the length of the addresses vector. The retrieved information (long/lat/address) should be added as a new row to the AllAddresses DataFrame. So in the end I have a DataFrame with three columns ("Long", "Lat", "Street") and, in this case, 3 rows.

def Getall(Input):
    resp = requests.get('https://search.mapzen.com/v1/search', params = {'api_key': "YourKey", 'size': 1, 'text': Input})
    data = resp.json()
    LongLat = data["bbox"][0:2]
    Street = data["features"][0]["properties"]["label"]
    Full = pd.DataFrame(columns=["Long", "Lat", "Street"])
    Full.loc[1] = pd.Series({"Long": LongLat[1], "Lat": LongLat[0], "Street": Street})

    return Full


addresses = pd.DataFrame(["Stübekamp 33, Hamburg, Germany", "Mesterfeld 28, Hamburg, Germany","Beutnerring 2, Hamburg, Germany"])


AllAddresses = []
for index, row  in addresses.iterrows(): 
    Input = row("0")
    data = Getall(Input)
    AllAddresses.append = data

This code, however, returns the error:

TypeError: 'Series' object is not callable

I read that iterrows is the way to go, but I am coming from R and feel a little lost here.

addresses is a pandas DataFrame for no apparent reason. Then you iterate over it, which is generally a bad idea even when you do need a DataFrame, and here you don't. Then you take row, which is a Series, and call it like a function: row("0"). As it is not a function, you get an error. Just make addresses a list to solve your first problem.
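
To make the first point concrete, here is a small sketch that reproduces the error and shows two ways around it. The names addresses_df, first_row and addresses_list are only illustrative and do not come from the question.

```python
import pandas as pd

addresses_df = pd.DataFrame(["Stübekamp 33, Hamburg, Germany",
                             "Mesterfeld 28, Hamburg, Germany"])

# iterrows() yields (index, row) pairs, where each row is a pandas Series.
_, first_row = next(addresses_df.iterrows())

# Calling the Series with parentheses reproduces the error from the question.
try:
    first_row("0")
except TypeError as err:
    error_message = str(err)

# Indexing uses square brackets. The unnamed column gets the integer label 0,
# not the string "0".
first_address = first_row.loc[0]

# With a plain list there is no Series involved at all.
addresses_list = ["Stübekamp 33, Hamburg, Germany",
                  "Mesterfeld 28, Hamburg, Germany"]
for address in addresses_list:
    print(address)
```

If the addresses do have to live in a DataFrame, looping over the column directly (for address in addresses_df[0]) also hands you plain strings.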

Then of course you will find you have a problem with Full, which also does not need to be a DataFrame; you cannot add a row like that; and you are returning a DataFrame for each row, which is likely not what you want either.
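
One possible way to restructure it, as a sketch rather than a tested solution: return a plain dict per address, collect the dicts in a list, and build a single DataFrame at the end. The function names parse_response, fetch_from_api and geocode_all are made up for this example. The endpoint and parameters are copied from the question. I am also assuming the response follows the GeoJSON convention in which bbox starts with longitude and then latitude, so check that against a real response.

```python
import pandas as pd

BASE_URL = "https://search.mapzen.com/v1/search"  # endpoint from the question


def parse_response(data):
    """Turn one parsed JSON response into a plain dict (one future row)."""
    # Assumption: bbox is ordered [min_lon, min_lat, ...] as in GeoJSON.
    lon, lat = data["bbox"][0:2]
    street = data["features"][0]["properties"]["label"]
    return {"Long": lon, "Lat": lat, "Street": street}


def fetch_from_api(text, api_key="YourKey"):
    """Request one address and return the parsed JSON."""
    import requests  # imported here so the rest runs without network access

    resp = requests.get(BASE_URL,
                        params={"api_key": api_key, "size": 1, "text": text})
    resp.raise_for_status()
    return resp.json()


def geocode_all(addresses, fetch=fetch_from_api):
    """Geocode every address and build the DataFrame once at the end."""
    rows = [parse_response(fetch(address)) for address in addresses]
    return pd.DataFrame(rows, columns=["Long", "Lat", "Street"])
```

Usage would then be AllAddresses = geocode_all(addresses) with addresses as a list of strings. The fetch argument is only there so that the parsing can be tried with a canned response instead of a live request.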
