
Checking start and end date of different columns in python dataframe

I have a dataframe with a date column and 4 other columns that contain numerical values. Each of these 4 columns starts and ends at a different time. Is there a way in Python to check the start and end date for each column? Here's an example of my dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Date': [1930, 1931, 1932, 1933, 1934],
    'File1': [np.nan, 72, 58, 280, 958],
    'File2': [np.nan, np.nan, np.nan, 13, 89],
    'File3': [np.nan, 55, 68, 18, np.nan],
    'File4': [45, 552, 177, np.nan, np.nan]
})

So, for example, I want to extract the start and end date for File3 (in this case it should return 1931 and 1933).

If there's a way to get the start and end date for all files, that would be even better.

Thank you in advance.

You can try something like this:

column_search = 'File2'
# Keep only the rows where the chosen column has a value
df_search = df[df[column_search].notnull()]
print(f"start date: {df_search['Date'].min()}")
print(f"end date: {df_search['Date'].max()}")

Following your comment, to iterate through the columns:

for column in df.columns.drop('Date'):  # skip the Date column itself
    df_search = df[df[column].notnull()]
    print(f"{column} start date: {df_search['Date'].min()}")
    print(f"{column} end date: {df_search['Date'].max()}")

If the Date column is the index of the df:

for column in df.columns:
    # Index labels (dates) of the rows where this column has a value
    idx_list = df.index[df[column].notnull()].tolist()
    print(f"{column} start date: {min(idx_list)}")
    print(f"{column} end date: {max(idx_list)}")
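
The loops above print their results. As a rough sketch, the same start and end dates could also be collected into a single table by reshaping to long format with melt and grouping by column name. The names file, value, start, end and summary are picked here for illustration and are not from the original post; the sketch assumes Date is still a regular column rather than the index.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Date': [1930, 1931, 1932, 1933, 1934],
    'File1': [np.nan, 72, 58, 280, 958],
    'File2': [np.nan, np.nan, np.nan, 13, 89],
    'File3': [np.nan, 55, 68, 18, np.nan],
    'File4': [45, 552, 177, np.nan, np.nan],
})

summary = (
    # One row per (Date, file) pair; 'file' and 'value' are illustrative names
    df.melt(id_vars='Date', var_name='file', value_name='value')
      # Drop the pairs where the file has no value on that date
      .dropna(subset=['value'])
      # Smallest and largest remaining Date for each file
      .groupby('file')['Date']
      .agg(start='min', end='max')
)
print(summary)
```

Because this takes the minimum and maximum of the Date values, it does not depend on the rows being sorted by date.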

No need for explicit loops over columns; you can just use "apply".

This gives you a dictionary where the keys are the file names and the values are the start and end dates as a list:

df = df.set_index('Date')
result_dict = {}

def check_date(column):
    # Keep only the non-null entries of this column
    valid = column[column.notnull()]
    # First and last index labels (dates) that hold a value
    result_dict[column.name] = [valid.index[0], valid.index[-1]]

df.apply(check_date)
print(result_dict)

I get this result:

{'File1': [1931, 1934], 'File2': [1933, 1934], 'File3': [1931, 1933], 'File4': [1930, 1932]}

Hope this helps.
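
As a possible alternative, pandas also provides first_valid_index() and last_valid_index(), which return the index label of the first and last non-null entry of a Series. A minimal sketch, assuming the dates are sorted in ascending order (these methods go by row position, not by date value); the names indexed and ranges are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Date': [1930, 1931, 1932, 1933, 1934],
    'File1': [np.nan, 72, 58, 280, 958],
    'File2': [np.nan, np.nan, np.nan, 13, 89],
    'File3': [np.nan, 55, 68, 18, np.nan],
    'File4': [45, 552, 177, np.nan, np.nan],
})

# Use the dates as index labels so the lookups return dates directly
indexed = df.set_index('Date')

# Map each file column to a (start, end) tuple of index labels
ranges = {
    col: (indexed[col].first_valid_index(), indexed[col].last_valid_index())
    for col in indexed.columns
}
print(ranges)
```

For the example data, File3 comes out as 1931 and 1933, matching the question. Both methods return None for a column that contains no values at all, so such columns may need separate handling.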

