
Python csv: find the latest record with a condition

I have a csv with the following sample data:

id bb_id cc_id datetime
-------------------------
1  11    44    2019-06-09
2  33    55    2020-06-09
3  22    66    2020-06-09
4  11    44    2019-06-09
5  11    44    2020-02-22

Suppose the condition is bb_id == 11 and cc_id == 44. I want the latest matching record, i.e.:

11    44    2020-02-22

How can I get this from the csv?

What I have tried:

 import csv

 with open('sample.csv') as csv_file:
     for indx, data in enumerate(csv.DictReader(csv_file)):
         # check if the conditional data is in the file?
         if data['bb_id'] == '11' and data['cc_id'] == '44':
             # sort the data by date? Or should I store all the relevant rows
             # in a list first and then sort that? Could I avoid that, as I
             # need to perform this interactively multiple times?
             pass

Put all the selected records in a list, then use the max() function with the date as the key.

import csv

selected_rows = []
with open('sample.csv') as csv_file:
    for data in csv.DictReader(csv_file):
        # csv.DictReader yields every value as a string, so compare to '11' and '44'
        if data['bb_id'] == '11' and data['cc_id'] == '44':
            selected_rows.append(data)
# ISO dates (YYYY-MM-DD) compare correctly as plain strings
latest = max(selected_rows, key=lambda x: x['datetime'])
print(latest)
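
The question also asks whether the intermediate list can be avoided. A minimal sketch, assuming a comma-separated file with the header shown in the question: feed a generator straight into max(), so only one candidate row is held at a time. The function name latest_match and the SAMPLE_CSV string are made up for illustration; the in-memory file stands in for sample.csv.

```python
import csv
import io

# Hypothetical in-memory stand-in for sample.csv (comma-separated is an assumption).
SAMPLE_CSV = (
    "id,bb_id,cc_id,datetime\n"
    "1,11,44,2019-06-09\n"
    "2,33,55,2020-06-09\n"
    "3,22,66,2020-06-09\n"
    "4,11,44,2019-06-09\n"
    "5,11,44,2020-02-22\n"
)


def latest_match(csv_file, bb_id, cc_id):
    """Return the matching row with the greatest 'datetime', or None if no row matches."""
    rows = csv.DictReader(csv_file)
    # Generator expression: rows are filtered lazily, no list is built.
    matches = (r for r in rows if r["bb_id"] == bb_id and r["cc_id"] == cc_id)
    # ISO dates (YYYY-MM-DD) sort correctly as strings; default avoids ValueError on no match.
    return max(matches, key=lambda r: r["datetime"], default=None)


# csv.DictReader yields strings, so the ids are passed as strings too.
result = latest_match(io.StringIO(SAMPLE_CSV), "11", "44")
print(result)
```

With a real file, the same function can be called as latest_match(open('sample.csv', newline=''), '11', '44'), ideally inside a with block.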

If you really want to do this in plain Python, something like the following is simple enough:

import csv

with open('sample.csv') as csv_file:
    list_of_dates = []
    for data in csv.DictReader(csv_file):
        # values read from the csv are strings, so compare to '11' and '44'
        if data['bb_id'] == '11' and data['cc_id'] == '44':
            list_of_dates.append(data['datetime'])

# list.sort() sorts in place and returns None, so do not assign its result
list_of_dates.sort()
print(list_of_dates[-1])  # you already know the values for bb and cc

Also try:

import csv

def sort_func(e):
    return e['datetime']

with open('sample.csv') as csv_file:
    list_of_rows = []
    for data in csv.DictReader(csv_file):
        # values read from the csv are strings, so compare to '11' and '44'
        if data['bb_id'] == '11' and data['cc_id'] == '44':
            list_of_rows.append(data)

# sort in place by date; list.sort() returns None
list_of_rows.sort(key=sort_func)
print(list_of_rows[-1])
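
Since the question mentions running the lookup interactively multiple times, one option is to read the file once and keep the latest row per (bb_id, cc_id) pair in a dict, so later lookups need no re-read or sort. This is only a sketch under the same assumptions as the question (comma-separated file, ISO dates); build_latest_index and SAMPLE_CSV are hypothetical names introduced here.

```python
import csv
import io

# Hypothetical in-memory stand-in for sample.csv (comma-separated is an assumption).
SAMPLE_CSV = (
    "id,bb_id,cc_id,datetime\n"
    "1,11,44,2019-06-09\n"
    "2,33,55,2020-06-09\n"
    "3,22,66,2020-06-09\n"
    "4,11,44,2019-06-09\n"
    "5,11,44,2020-02-22\n"
)


def build_latest_index(csv_file):
    """Map each (bb_id, cc_id) pair to its row with the greatest 'datetime'."""
    latest = {}
    for row in csv.DictReader(csv_file):
        key = (row["bb_id"], row["cc_id"])
        # Strictly greater: on equal dates the first row seen is kept.
        if key not in latest or row["datetime"] > latest[key]["datetime"]:
            latest[key] = row
    return latest


index = build_latest_index(io.StringIO(SAMPLE_CSV))
print(index.get(("11", "44")))   # latest row for bb_id 11, cc_id 44
print(index.get(("99", "99")))   # no such pair, so None
```

The trade-off is memory: the index holds one row per distinct pair, which is usually far smaller than the file but does need rebuilding if the csv changes.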

The simplest way I know:

import pandas as pd
import pandasql as ps

sample_df = pd.read_csv('sample.csv')

# filter, order newest first, and keep only the top row
latest = ps.sqldf("""select *
                     from sample_df
                     where bb_id = 11
                       and cc_id = 44
                     order by datetime desc
                     limit 1""", locals())
print(latest)

