csvfile1
status,longitude,latitude,timestamp
ok,10.12,17.45,14569003
ok,11.34,16.78,14569000
csvfile2
weather,timestamp,latitude1,longitude1,latitude2,longitude2
rainy,14569003,17.45,10.12,17.50,11.25
sunny,14569000,13.76,12.44,16.78,11.34
expected output
status,weather,longitude,latitude,timestamp
ok,rainy,10.12,17.45,14569003
ok,sunny,11.34,16.78,14569000
I would like to join the two files on the longitude, latitude and timestamp columns.
csvfile2 contains two longitude-latitude pairs, so a row should match when either pair equals the pair in csvfile1 and the timestamp matches as well.
The column order also differs between the two files.
Any help would be appreciated.
Thank you.
You can use pandas for this.
import pandas as pd
first = pd.read_csv('csvfile1.csv')
second = pd.read_csv('csvfile2.csv')
merged = pd.merge(first, second, how='left', on='timestamp')  # `on` takes a column label, or a list of labels, present in both files
merged.to_csv('merged.csv', index=False)
For more details, see link1 and link2; both are helpful.
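A single merge on shared labels does not by itself cover the "either longitude-latitude pair" requirement from the question. The following is a minimal sketch of one way to handle it, not necessarily what the answer above had in mind: merge once per coordinate pair after renaming that pair to match csvfile1, then concatenate the results. The helper name `merge_on_either_pair` is hypothetical, the sample data is inlined with `io.StringIO` so the snippet runs without files, and coordinates are assumed to match exactly (no tolerance).

```python
import io

import pandas as pd


def merge_on_either_pair(first: pd.DataFrame, second: pd.DataFrame) -> pd.DataFrame:
    """Join rows whose timestamp matches and whose (longitude, latitude)
    equals either coordinate pair of `second` (hypothetical helper)."""
    parts = []
    for suffix in ("1", "2"):
        # Rename one coordinate pair so its column names line up with `first`.
        candidate = second[
            ["weather", "timestamp", f"longitude{suffix}", f"latitude{suffix}"]
        ].rename(
            columns={
                f"longitude{suffix}": "longitude",
                f"latitude{suffix}": "latitude",
            }
        )
        parts.append(
            first.merge(candidate, on=["timestamp", "longitude", "latitude"], how="inner")
        )
    # A row of `second` whose two pairs are identical would match twice.
    merged = pd.concat(parts, ignore_index=True).drop_duplicates()
    return merged[["status", "weather", "longitude", "latitude", "timestamp"]]


# Sample data copied from the question.
first = pd.read_csv(io.StringIO(
    "status,longitude,latitude,timestamp\n"
    "ok,10.12,17.45,14569003\n"
    "ok,11.34,16.78,14569000\n"
))
second = pd.read_csv(io.StringIO(
    "weather,timestamp,latitude1,longitude1,latitude2,longitude2\n"
    "rainy,14569003,17.45,10.12,17.50,11.25\n"
    "sunny,14569000,13.76,12.44,16.78,11.34\n"
))

result = merge_on_either_pair(first, second)
print(result.to_csv(index=False))
```

With real files, `pd.read_csv('csvfile1.csv')` and `pd.read_csv('csvfile2.csv')` would replace the inlined data, and `result.to_csv('merged.csv', index=False)` would write the output.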
awk solution:
join_csv.awk script:
#!/bin/awk -f
BEGIN {
    FS=OFS=",";    # field separator
    print "status,weather,longitude,latitude,timestamp"    # header line
}
NR==FNR {    # processing the first file
    if (FNR>1) a[$4]=$1 FS $2 FS $3    # accumulating the needed values (status, longitude, latitude), keyed by timestamp
    next    # keep the second-file block from running on the first file
}
FNR>1 {    # processing the second file
    if ($2 in a) {    # if `timestamp` matches
        split(a[$2],data,FS);    # extracting items for further comparison
        # longitude and latitude must both come from the same pair of the second file
        if ((data[2]==$4 && data[3]==$3) || (data[2]==$6 && data[3]==$5)) {
            print data[1],$1,data[2],data[3],$2
        }
    }
}
Usage :
awk -f join_csv.awk file1 file2
The output:
status,weather,longitude,latitude,timestamp
ok,rainy,10.12,17.45,14569003
ok,sunny,11.34,16.78,14569000
Hope this answer will help you:
import csv

# Index the first file by (timestamp, longitude, latitude).
# Values are compared as text, exactly as they are written in the files.
with open("csvfile1.csv", newline="") as file1:
    status_by_key = {
        (row["timestamp"], row["longitude"], row["latitude"]): row["status"]
        for row in csv.DictReader(file1)
    }

with open("csvfile2.csv", newline="") as file2, open("new_file.csv", "w", newline="") as new_file:
    csv_writer = csv.writer(new_file)
    csv_writer.writerow(["status", "weather", "longitude", "latitude", "timestamp"])
    for row in csv.DictReader(file2):
        # Try both longitude-latitude pairs of the second file against the first file.
        pairs = ((row["longitude1"], row["latitude1"]), (row["longitude2"], row["latitude2"]))
        for lon, lat in pairs:
            key = (row["timestamp"], lon, lat)
            if key in status_by_key:
                csv_writer.writerow([status_by_key[key], row["weather"], lon, lat, row["timestamp"]])
                break  # one output row per row of the second file
Got output:
status,weather,longitude,latitude,timestamp
ok,rainy,10.12,17.45,14569003
ok,sunny,11.34,16.78,14569000