
How to identify data rows for the last 10 days in a CSV file with pandas?

I'm new to Python and currently seeking help with the following:

How can I identify data rows for the last 10 days in a CSV file with pandas? The first column (report_date) in my CSV file holds date values (yyyy-mm-dd). I have hundreds of records for each day, but I need only the last 10 days from this file, based on the date in the report_date column, and ideally I would save the output to a new CSV file.

My code so far:

import pandas as pd

data = pd.read_csv("path/to/my/file/myfile.csv")    

df = pd.DataFrame(report_date) 

days=10    
cutoff_date = df["report_date"].dt.date.iloc[-1] - pd.Timedelta(days=days)

Would someone be able to help? Thanks in advance!

First create a DatetimeIndex with the index_col and parse_dates parameters of read_csv:

df = pd.read_csv("path/to/my/file/myfile.csv", 
                 index_col=['report_date'], 
                 parse_dates=['report_date'])   

Then it is possible to use DataFrame.last:

df1 = df.last('10d')

Finally, save to a file with DataFrame.to_csv:

df1.to_csv('new.csv')
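
Note that DataFrame.last has been deprecated in recent pandas releases (since 2.1), so on a newer version a boolean mask on the DatetimeIndex may be a safer route. The following is a minimal sketch of that idea, not the original answer's code; the inline CSV text and the `value` column are made-up stand-ins for myfile.csv.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for myfile.csv (assumption, not from the question).
csv_text = """report_date,value
2020-01-01,1
2020-01-05,2
2020-01-10,3
2020-01-12,4
2020-01-15,5
"""

df = pd.read_csv(io.StringIO(csv_text),
                 index_col=["report_date"],
                 parse_dates=["report_date"])

# Sort so the index is ordered by date, as DataFrame.last would expect.
df = df.sort_index()

# Keep rows strictly later than (latest date - 10 days).
cutoff = df.index.max() - pd.Timedelta(days=10)
last_10_days = df[df.index > cutoff]

print(last_10_days)
```

Here the latest date is 2020-01-15, so the cutoff is 2020-01-05 and only the rows dated after it are kept.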

Your solution should be changed to convert the column to datetimes in read_csv:

df = pd.read_csv("path/to/my/file/myfile.csv", parse_dates=['report_date'])    

days=10    
cutoff_date = df["report_date"].dt.date.iloc[-1] - pd.Timedelta(days=days)

Then compare dates via Series.dt.date in boolean indexing:

df1 = df[df["report_date"].dt.date > cutoff_date]

Finally, save to a file without the default index using DataFrame.to_csv:

df1.to_csv('new.csv', index=False)
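
The column-based steps above can be put together roughly as follows. This is only a sketch under a few assumptions: the helper name `keep_last_days`, the inline sample data and the `value` column are hypothetical, and it takes the latest date with max() instead of iloc[-1] so that it would also work if the file is not sorted by date.

```python
import io
import pandas as pd


def keep_last_days(df, date_col="report_date", days=10):
    """Return rows whose date is within `days` days of the latest date in `date_col`."""
    # Drop any time-of-day part so the comparison is made on whole days.
    dates = pd.to_datetime(df[date_col]).dt.normalize()
    cutoff = dates.max() - pd.Timedelta(days=days)
    return df[dates > cutoff]


# Hypothetical, unsorted sample with several rows per day (assumption).
csv_text = """report_date,value
2020-02-01,a
2020-02-20,b
2020-02-20,c
2020-02-09,d
2020-02-10,e
2020-02-11,f
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["report_date"])
result = keep_last_days(df, days=10)

# Write to an in-memory buffer here; pass a file path such as 'new.csv' instead.
buffer = io.StringIO()
result.to_csv(buffer, index=False)
print(buffer.getvalue())
```

With this sample the latest date is 2020-02-20, so the cutoff is 2020-02-10 and that day itself is excluded, matching the strict `>` comparison used in the answer.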

EDIT: I believe you need:

df = pd.DataFrame({'data': range(30)}, index= pd.date_range('2020-01-25', periods=30))  
print (df)
            data
2020-01-25     0
2020-01-26     1
2020-01-27     2
2020-01-28     3
2020-01-29     4
2020-01-30     5
2020-01-31     6
2020-02-01     7
2020-02-02     8
2020-02-03     9
2020-02-04    10
2020-02-05    11
2020-02-06    12
2020-02-07    13
2020-02-08    14
2020-02-09    15
2020-02-10    16
2020-02-11    17
2020-02-12    18
2020-02-13    19
2020-02-14    20
2020-02-15    21
2020-02-16    22
2020-02-17    23
2020-02-18    24
2020-02-19    25
2020-02-20    26
2020-02-21    27
2020-02-22    28
2020-02-23    29

today = pd.Timestamp('today').floor('d')
df1 = df[df.index > today].first('10d')
print (df1)
            data
2020-02-11    17
2020-02-12    18
2020-02-13    19
2020-02-14    20
2020-02-15    21
2020-02-16    22
2020-02-17    23
2020-02-18    24
2020-02-19    25
2020-02-20    26
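
The output above depends on the day the code is run (it was evidently 2020-02-10), and DataFrame.first is deprecated in recent pandas releases just like DataFrame.last. A reproducible sketch of the same selection, assuming a fixed "today" and an explicit date window in place of first('10d'):

```python
import pandas as pd

df = pd.DataFrame({"data": range(30)},
                  index=pd.date_range("2020-01-25", periods=30))

# Fixed reference date so the result is reproducible (assumption);
# use pd.Timestamp("today").floor("d") for the real current day.
today = pd.Timestamp("2020-02-10")

# Rows strictly after today, up to and including today + 10 days.
window = df[(df.index > today) & (df.index <= today + pd.Timedelta(days=10))]

print(window)
```

One difference to keep in mind: first('10d') measures the window from the first row left after filtering, while this mask measures it from `today`. The two agree here because the data has a row for every day.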

