Python - Iterate each line in two CSV files and compare timestamp values
I have the following two CSV files:
CSV File1:
Range1,2018-05-17 01:50:17+0000,2018-05-17 02:00:17+0000
Range2,2018-05-17 01:50:17+0000,2018-05-17 04:00:17+0000
Range3,2018-05-17 01:50:17+0000,2018-05-17 08:00:17+0000
CSV File2:
TimeStamp1,2018-05-17 01:59:17+0000
TimeStamp2,2018-05-17 03:59:17+0000
TimeStamp3,2018-05-17 07:59:17+0000
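I want to iterate over each Range in File1 and determine which TimeStamps fall within the Range being compared. For example, the output of my Python script would show:
Output: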
TimeStamp1 falls within Range1
TimeStamp1, TimeStamp2 falls within Range2
TimeStamp1, TimeStamp2, TimeStamp3 falls within Range3
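I started writing something like the code below, but I'm having trouble getting the output, and getting the if statement to iterate over all the lines in File2 for the first line of File1, then move on to the next line in File1 and repeat over all the lines in File2 again. Thanks in advance.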
import csv
with open('File1', 'rb') as range, open('File2', 'rb') as timeStamp:
    range_reader = csv.reader(range, quotechar='"')
    timeStamp_reader = csv.reader(timeStamp, quotechar='"')
    for range_row in range_reader:
        print range_row[2]
        print range_row[3]
        for timeStamp_row in timeStamp_reader:
            print timeStamp_row[2]
            if range_row[2] <= timeStamp_row[2] and range_row[3] >= timeStamp_row[2]
                print " %s falls within %s "% (timeStamp_row[1], range_row[1])
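There are a few mistakes in your code. First, your indexes are off by one. Indexing starts at 0, so subtract 1 from every index.
You also can't iterate over a file more than once without rewinding it: once the reader hits the end of the file, it has nothing left to return. So before each run of the inner loop you need to reset the file to the beginning. This is easily done by calling seek(0) on the file object.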
import csv
with open('File1', 'r') as ranges, open('File2', 'r') as timeStamp:
    range_reader = csv.reader(ranges, quotechar='"')
    timeStamp_reader = csv.reader(timeStamp, quotechar='"')
    rangeArray = {}
    for range_row in range_reader:
        print("%s / %s" % ( range_row[1], range_row[2])) # This looks better, and gives more info than just printing both timestamps on each line
        timeStamp.seek(0) # This will set position of cursor in timeStamp back to start, so it can iterate repeatedly
        rangeArray[range_row[0]] = []
        for timeStamp_row in timeStamp_reader:
            if range_row[1] <= timeStamp_row[1] and range_row[2] >= timeStamp_row[1]:
                rangeArray[range_row[0]].append(timeStamp_row[0])
                print (" %s falls within %s " % (timeStamp_row[0], range_row[0]))
    print("\n\n")
    # Desired Output:
    for key in rangeArray:
        print("%s falls within %s" % (', '.join([str(x) for x in rangeArray[key]]), key))
This gives output like this:
2018-05-17 01:50:17+0000 / 2018-05-17 02:00:17+0000
TimeStamp1 falls within Range1
2018-05-17 01:50:17+0000 / 2018-05-17 04:00:17+0000
TimeStamp1 falls within Range2
TimeStamp2 falls within Range2
2018-05-17 01:50:17+0000 / 2018-05-17 08:00:17+0000
TimeStamp1 falls within Range3
TimeStamp2 falls within Range3
TimeStamp3 falls within Range3
TimeStamp1 falls within Range1
TimeStamp1, TimeStamp2 falls within Range2
TimeStamp1, TimeStamp2, TimeStamp3 falls within Range3
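To see why the seek(0) call matters, here is a minimal sketch that uses an in-memory io.StringIO object as a stand-in for File2. The variable names (time_file, first_pass and so on) are illustrative and not part of the answer above; the sample rows come from the question.

```python
import csv
import io

# In-memory stand-in for File2, so the sketch runs without any files on disk.
time_file = io.StringIO(
    "TimeStamp1,2018-05-17 01:59:17+0000\n"
    "TimeStamp2,2018-05-17 03:59:17+0000\n"
)
reader = csv.reader(time_file)

first_pass = [row[0] for row in reader]   # consumes every row
second_pass = [row[0] for row in reader]  # file is at its end: no rows left

time_file.seek(0)                         # rewind the underlying file object
third_pass = [row[0] for row in reader]   # the same reader yields rows again

print(first_pass)
print(second_pass)
print(third_pass)
```

The second pass comes back empty, which is the same situation the inner loop of the question's code runs into for every range after the first one.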
# Python 2 syntax (print statement, files opened in 'rb' mode)
import csv
with open('File1.csv', 'rb') as ranger, open('File2.csv', 'rb') as timeStamp:
    # Read both files into lists once, so the inner loop needs no seek()
    range_reader = [x for x in csv.reader(ranger, quotechar='"')]
    timeStamp_reader = [x for x in csv.reader(timeStamp, quotechar='"')]
    for range_row in range_reader:
        temp = []  # timestamps that fall within the current range
        for timeStamp_row in timeStamp_reader:
            if range_row[1] <= timeStamp_row[1] and range_row[2] >= timeStamp_row[1]:
                temp.append(timeStamp_row[0])
        if temp:
            print " %s falls within %s "% (','.join(temp), range_row[0])
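Lukasas's answer is good, but if your dataset is large, calling seek on every iteration of the for loop may not be a good idea. Just copy the rows of both files into lists at the start. Also, to get the output in the form you asked for, you need to collect the matching timestamps in a list that is reset at the start of each outer-loop iteration. The code above prints: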
TimeStamp1 falls within Range1
TimeStamp1,TimeStamp2 falls within Range2
TimeStamp1,TimeStamp2,TimeStamp3 falls within Range3
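For readers on Python 3, the same "read both files into lists first" idea could be sketched roughly as follows. The function name timestamps_per_range and the in-memory sample files are assumptions made for illustration; they are not from the answer above.

```python
import csv
import io

def timestamps_per_range(range_file, stamp_file):
    """Return {range_name: [timestamp_name, ...]} for two CSV file objects."""
    # Materialise both files once so neither needs to be rewound later.
    range_rows = list(csv.reader(range_file))
    stamp_rows = list(csv.reader(stamp_file))
    result = {}
    for name, start, end in range_rows:
        # Plain string comparison works here only because every value uses
        # the same fixed-width format and the same +0000 offset.
        result[name] = [s_name for s_name, stamp in stamp_rows
                        if start <= stamp <= end]
    return result

# In-memory stand-ins for File1 and File2 (data from the question).
ranges = io.StringIO(
    "Range1,2018-05-17 01:50:17+0000,2018-05-17 02:00:17+0000\n"
    "Range2,2018-05-17 01:50:17+0000,2018-05-17 04:00:17+0000\n"
    "Range3,2018-05-17 01:50:17+0000,2018-05-17 08:00:17+0000\n"
)
stamps = io.StringIO(
    "TimeStamp1,2018-05-17 01:59:17+0000\n"
    "TimeStamp2,2018-05-17 03:59:17+0000\n"
    "TimeStamp3,2018-05-17 07:59:17+0000\n"
)

result = timestamps_per_range(ranges, stamps)
for name, matches in result.items():
    if matches:
        print("%s falls within %s" % (", ".join(matches), name))
```

With real files, the two io.StringIO objects would be replaced by open('File1.csv', newline='') and open('File2.csv', newline='').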
As you'll see, I made quite a few changes, since I wrote the code in Python 3. Are you using Python 2?
Anyway, happy to answer the question. I think this does basically what you want:
import csv
import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

def parse_time(text):
    # Drop the trailing "+0000" offset (last 5 characters) before parsing
    return datetime.datetime.strptime(text[:-5], TIME_FORMAT)

# "ranges" rather than "range" avoids shadowing the built-in range()
with open('File1', 'r') as ranges, open('File2', 'r') as timeStamp:
    range_rows = list(csv.reader(ranges, quotechar='"'))
    timeStamp_rows = list(csv.reader(timeStamp, quotechar='"'))

# Convert the text columns to datetime objects once, up front
range_list = [[row[0], parse_time(row[1]), parse_time(row[2])] for row in range_rows]
timeStamp_list = [[row[0], parse_time(row[1])] for row in timeStamp_rows]

for i in range_list:
    for e in timeStamp_list:
        if i[1] <= e[1] and i[2] >= e[1]:
            print(" %s falls within %s "% (e[0], i[0]))
Output:
TimeStamp1 falls within Range1
TimeStamp1 falls within Range2
TimeStamp2 falls within Range2
TimeStamp1 falls within Range3
TimeStamp2 falls within Range3
TimeStamp3 falls within Range3
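Converting to datetime objects matters most when the offsets are not all +0000. As a hedged sketch (not part of the answer above), the %z directive of strptime can keep the offset instead of slicing it off, so that values are compared as instants in time. The +0200 timestamp below is a hypothetical value chosen to show the difference from string comparison.

```python
from datetime import datetime

FORMAT = "%Y-%m-%d %H:%M:%S%z"  # %z parses UTC offsets such as +0000 or +0200

def parse(text):
    """Parse a timestamp string into a timezone-aware datetime."""
    return datetime.strptime(text, FORMAT)

start_text = "2018-05-17 01:50:17+0000"
end_text = "2018-05-17 02:00:17+0000"
# Hypothetical value: the same instant as 01:59:17+0000, written with +0200.
stamp_text = "2018-05-17 03:59:17+0200"

# Aware datetimes are compared by the instant they represent.
in_range_as_datetime = parse(start_text) <= parse(stamp_text) <= parse(end_text)
# Raw strings are compared character by character, ignoring the offset.
in_range_as_string = start_text <= stamp_text <= end_text

print(in_range_as_datetime)
print(in_range_as_string)
```

For the data in the question every offset is +0000, so both comparisons agree; the datetime version is simply the safer choice if the files ever mix offsets.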