
Python: searching and appending 2 CSV files

I have 2 CSV files. The first file has a list of all the states in the US but has missing values in the Longitude and Latitude columns. I found another CSV file that contains the longitude and latitude values for all states in the US.

What I want to do now is loop through the 'Location' column in the first file, match it against the 'Location' column in the 2nd file, and get the corresponding Longitude and Latitude values. After that, I need to append these values to the Longitude and Latitude columns in the first file.

Currently, what I have is this:

import csv

import pandas as pd

aviationdata = pd.read_csv('AviationData.csv', sep=',', header=0, encoding='iso-8859-1')  # this is the first file
location = pd.read_csv('location.csv')  # this is the 2nd file

with open('location.csv', 'r') as loc:
    locationfile = loc.read()

for i in range(len(aviationdata['Location'])):
    currentlocation = aviationdata['Location'].iloc[i]
    axis = []
    for i in currentlocation:
        if i in aviationdata['Location']:
            ...  # I do not know how to continue from here

I do not know how to write the code that compares the Location field, extracts the longitude and latitude from location.csv, and appends them to the corresponding Longitude and Latitude columns in aviationdata.

These are the fields for the first file (aviationdata)

These are the fields for the 2nd file (location)

This looks like a good job for merge: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html
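As a minimal sketch of what merge does with a shared key column, the toy frames below stand in for the two CSV files. The rows and coordinate values are made-up placeholders, not data from the question. Using `how="left"` is an assumption about the desired behavior: it keeps every row of the first frame even when no match exists (the default is an inner join, which drops unmatched rows), and `indicator=True` adds a `_merge` column showing which rows found a partner.

```python
import pandas as pd

# Toy stand-ins for the two files; all values are illustrative placeholders.
left = pd.DataFrame({"Location": ["A", "B", "C"], "Make": ["m1", "m2", "m3"]})
right = pd.DataFrame({
    "Location": ["A", "B"],
    "Latitude": [10.0, 20.0],
    "Longitude": [-10.0, -20.0],
})

# Left join on the shared key; unmatched rows get NaN coordinates.
merged = left.merge(right, on="Location", how="left", indicator=True)
print(merged)
```

Rows marked `left_only` in `_merge` are the locations that had no counterpart in the second frame, which is a quick way to spot keys that need cleaning.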

Assuming that the Location values in both DataFrames match exactly (same case and spacing), then:

1.) Select the columns of interest from the aviation data

aviationdata = aviationdata[["Location", "Country", "Make", "Weather.Condition", "Year", "Month"]]

2.) Now merge the aviation data with the location DataFrame (the 2nd file) on the column name "Location"

aviationdata = aviationdata.merge(location, on=['Location'])

aviationdata.head(10)
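The question asks to fill in the missing Latitude and Longitude values rather than replace the columns wholesale, so here is one hedged sketch of that variant. It assumes both frames have columns named exactly `Location`, `Latitude`, and `Longitude` (the real column names were only shown in screenshots), that the first frame has a unique index, and that differences in case and surrounding spaces are the only mismatches between keys. The helper names `normalize_key` and `fill_missing_coords` are hypothetical, and the sample rows are placeholders.

```python
import pandas as pd


def normalize_key(s: pd.Series) -> pd.Series:
    """Trim surrounding whitespace and upper-case so keys compare equal."""
    return s.str.strip().str.upper()


def fill_missing_coords(aviation: pd.DataFrame, lookup: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `aviation` with NaN coordinates filled from `lookup`."""
    out = aviation.copy()
    key = normalize_key(out["Location"])
    # Build a lookup table with one row per normalized location.
    table = (
        lookup.assign(_key=normalize_key(lookup["Location"]))
        .drop_duplicates("_key")
        .set_index("_key")
    )
    for col in ["Latitude", "Longitude"]:
        # map() looks each key up in the table; fillna() only touches NaNs,
        # so coordinates that were already present are left alone.
        out[col] = out[col].fillna(key.map(table[col]))
    return out


# Placeholder data standing in for AviationData.csv and location.csv.
aviation = pd.DataFrame({
    "Location": ["Town A, XX", " town b, yy ", "Town C, ZZ"],
    "Latitude": [1.5, float("nan"), float("nan")],
    "Longitude": [-1.5, float("nan"), float("nan")],
})
lookup = pd.DataFrame({
    "Location": ["TOWN A, XX", "TOWN B, YY"],
    "Latitude": [99.0, 2.5],
    "Longitude": [-99.0, -2.5],
})

filled = fill_missing_coords(aviation, lookup)
print(filled)
```

Using `map` plus `fillna` keeps the original row count and column names, whereas a plain merge on frames that both contain `Latitude` and `Longitude` would produce suffixed duplicate columns that still need to be combined.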
