
Combine date present in two different columns to generate mean for a column

I have a dataset in the following format. The Starting column ranges from 2021-01-01 to 2022-03-13, and the Ending column covers the same range, 2021-01-01 to 2022-03-13.

Rainfall data is collected daily, so the entries look as follows:

This is the format of the data in my pandas DataFrame.

Apologies for the formatting. I would like the new DataFrame to look like this:

I am trying to combine the daily rows into monthly average values for the dataset. I cannot find a way to compute the monthly averages and store them in a separate pandas DataFrame so that it appears as follows:

The Monthly Rainfall is computed as total rainfall / total days in the month.

Any help would be appreciated!

I have tried using groupby and mean together from the pandas library, but the output does not come out in the format I want.

df = df.groupby(['Starting', 'Ending', 'Location_id'])['rainfall'].mean().reset_index()

To solve the problem, you can write a function like this:

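import math
from datetime import datetime

def to_date(x, y):
  # Parse both columns of 'YYYY-MM-DD' strings into date objects.
  firsts = [datetime.strptime(dt, '%Y-%m-%d').date() for dt in x]
  seconds = [datetime.strptime(dt, '%Y-%m-%d').date() for dt in y]
  # Row-by-row difference in days; .days is always a finite int, so the isinf guard never fires.
  return [0 if math.isinf((a - b).days) else (a - b).days for a, b in zip(firsts, seconds)]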

This function takes two lists (x, y), converts every item into a date() object, and returns a new list holding the difference x - y in days for each pair. Note that subtracting two identical dates does not produce infinity: the result is a timedelta of 0 days, so the function returns 0 for such a pair. An infinite value only shows up later, when pandas divides a non-zero rainfall value by that 0. The difference also excludes the end date itself: 2021-01-31 minus 2021-01-01 is 30 days, so add 1 if the divisor should be the 31 calendar days of the month.
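A quick way to check what date subtraction returns is to run it on a couple of known pairs. This is a minimal sketch; the helper name `days_between` and its `inclusive` flag are hypothetical and not part of the answer's code.

```python
from datetime import date

def days_between(start: date, end: date, inclusive: bool = False) -> int:
    """Days from start to end; add 1 when both endpoints should be counted."""
    delta = (end - start).days  # timedelta.days is a plain int
    return delta + 1 if inclusive else delta

# Same date on both sides: the difference is zero days, not infinity.
same_day = days_between(date(2021, 1, 1), date(2021, 1, 1))

# January 2021: 30 days between the endpoints, 31 calendar days in the month.
january_exclusive = days_between(date(2021, 1, 1), date(2021, 1, 31))
january_inclusive = days_between(date(2021, 1, 1), date(2021, 1, 31), inclusive=True)

print(same_day, january_exclusive, january_inclusive)  # prints: 0 30 31
```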

Here is the code snippet I wrote. Since you didn't provide a dataset, I built one from the images you provided:

import pandas as pd

d = {
    'New_Starting': ['2021-01-01','2021-01-01','2021-01-01'],
    'New_Ending': ['2021-01-31','2021-01-31','2021-01-31'],
    'Location_id': [45, 52, 30],
    'Rainfall': [4.07, 6.53, 3.71]
}

d = pd.DataFrame(d)
d['Monthly_Rainfall'] = d['Rainfall'] / to_date(d['New_Ending'], d['New_Starting'])

Output:

    New_Starting    New_Ending  Location_id Rainfall    Monthly_Rainfall
0   2021-01-01      2021-01-31       45     4.07        0.135667
1   2021-01-01      2021-01-31       52     6.53        0.217667
2   2021-01-01      2021-01-31       30     3.71        0.123667
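If the starting point is the raw daily rows rather than already-aggregated monthly totals, one possible approach is to group by location and calendar month and then divide by the number of days in that month. The sketch below is only an illustration under assumptions: the column names `Starting`, `Location_id`, and `rainfall` are taken from the groupby line in the question, the sample values are made up, and `Month`, `days_in_month`, and the other new column names are chosen here for the example.

```python
import pandas as pd

# Hypothetical daily rows (values invented for illustration only).
daily = pd.DataFrame({
    "Starting": ["2021-01-01", "2021-01-02", "2021-02-01", "2021-01-01", "2021-01-02"],
    "Location_id": [45, 45, 45, 52, 52],
    "rainfall": [1.0, 3.0, 2.0, 0.5, 1.5],
})

# Parse the date strings and derive a monthly period for each row.
daily["Starting"] = pd.to_datetime(daily["Starting"])
daily["Month"] = daily["Starting"].dt.to_period("M")

# Total rainfall per location and month.
monthly = daily.groupby(["Location_id", "Month"], as_index=False)["rainfall"].sum()

# Divide by the calendar length of the month (total rainfall / total days in the month).
monthly["days_in_month"] = monthly["Month"].dt.days_in_month
monthly["Monthly_Rainfall"] = monthly["rainfall"] / monthly["days_in_month"]

# First and last day of each month, mirroring the New_Starting / New_Ending columns.
monthly["New_Starting"] = monthly["Month"].dt.start_time
monthly["New_Ending"] = monthly["Month"].dt.end_time.dt.normalize()

print(monthly)
```

Dividing by `days_in_month` follows the formula stated in the question; if only the days that actually have a record should count, a per-group `mean()` of `rainfall` would be the alternative.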
