How to write a pandas dataframe to CSV file line by line, one line at a time?

I have a list of about 1 million addresses and a function that finds their latitudes and longitudes. Because some of the records are improperly formatted (or for some other reason), the function sometimes fails to return a latitude and longitude for an address, which breaks the for loop. So, for each address whose latitude and longitude are successfully retrieved, I want to write the result to the output CSV file. Writing in small chunks instead of line by line would also work. For this, I am using df.to_csv in "append" mode ( mode='a' ) as shown below:

for i in range(len(df)):
    place = df['ADDRESS'][i]
    try:
        lat, lon, res = gmaps_geoencoder(place)
    except:
        pass

    df['Lat'][i] = lat
    df['Lon'][i] = lon
    df['Result'][i] = res

    df.to_csv(output_csv_file,
          index=False,
          header=False,
          mode='a', #append data to csv file
          chunksize=chunksize) #size of data to append for each loop

The problem is that this writes the whole dataframe on every append. So, for n rows, it writes the whole dataframe n times, which is n^2 rows in total. How can I fix this?

If you really want to write line by line (you should not, since every call reopens the output file), write a one-row slice of the dataframe on each iteration:

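for i in range(len(df)):
    # iloc[[i]] selects row i by position as a one-row DataFrame,
    # so this also works when the index is not 0..n-1
    df.iloc[[i]].to_csv(output_csv_file,
        index=False,
        header=False,
        mode='a')  # append instead of overwriting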
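
Since the question is really about saving each result as soon as it is geocoded, another option is to skip the dataframe for the output and append rows with Python's csv module. The following is only a minimal sketch under some assumptions: geocode_to_csv, its geocode argument, and flush_every are hypothetical names introduced here, and geocode stands in for the question's gmaps_geoencoder (any callable that returns lat, lon, res or raises on failure).

```python
import csv

def geocode_to_csv(addresses, geocode, output_path, flush_every=100):
    """Geocode each address and append every successful row to a CSV file.

    `geocode` is a stand-in (assumption) for the question's gmaps_geoencoder:
    a callable returning (lat, lon, result) or raising on failure.
    Returns the list of addresses that could not be geocoded.
    """
    failed = []
    # Open the file once in append mode instead of once per row.
    with open(output_path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for n, address in enumerate(addresses, start=1):
            try:
                lat, lon, res = geocode(address)
            except Exception:
                # Skip the bad record so stale values are never written.
                failed.append(address)
                continue
            writer.writerow([address, lat, lon, res])
            if n % flush_every == 0:
                # Push buffered rows to the OS so little is lost on a crash.
                f.flush()
    return failed
```

With the names from the question, a call might look like failed = geocode_to_csv(df['ADDRESS'], gmaps_geoencoder, output_csv_file). Unlike the loop in the question, a failed lookup is skipped with continue, so the values from the previous address are not reused for the failed one.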
