Modifying Wide Pandas Data frame based on Condition

I am attempting to edit values in wide-format time series data based on a condition, using the pandas library in Python. The data are satellite observation values on given dates (see photo below). The first column is a unique id and all subsequent columns are dates. This means that each row is a time series for that specific id.

The idea is this:

If n1 is the current observation, n2 is the next observation, and n3 is the observation after that, then:

if ((n2 - n1) > 0.3) and (n3 >= (0.9 * n1)):
    n2 = (n1 + n3) / 2

Just to be clear, n1, n2 and n3 are the first three values of this data frame, not attributes. For the attached example, n1 would be 0.25916876, n2 would be 0.25916876 and n3 would be 0.23824187.

How can I modify my data frame with this rule? Could this be done with a list comprehension?
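As a rough illustration of the list-comprehension idea, the rule can be written for a single row of values in plain Python. The function name `smooth_spikes` is made up for this sketch, and it assumes every n2 is checked against its original neighbours, so an earlier replacement does not feed into a later check.

```python
def smooth_spikes(values):
    """Return a copy of `values` with the rule applied to each interior value.

    Assumption: n1 and n3 are always taken from the original values.
    """
    values = list(values)
    if len(values) < 3:
        # No value has both a predecessor and a successor.
        return values
    middle = [
        (n1 + n3) / 2 if (n2 - n1) > 0.3 and n3 >= 0.9 * n1 else n2
        for n1, n2, n3 in zip(values, values[1:], values[2:])
    ]
    # The first and last values have no full neighbourhood, so keep them.
    return values[:1] + middle + values[-1:]


row = [0.25, 0.75, 0.25, 0.5]
print(smooth_spikes(row))  # [0.25, 0.25, 0.25, 0.5]
```

This works on one time series at a time; for a whole frame it would have to be applied to each row, which is usually slower than a vectorized pandas expression.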

This is what df looks like:

If your dataframe is named df, then you can try:

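# Flag rows where n2 jumps more than 0.3 above n1 while n3 stays near n1
mask = (df.n2 - df.n1 > 0.3) & (df.n3 >= (0.9 * df.n1))
# where() returns a new Series, so assign the result back
df["n2"] = df.n2.where(~mask, (df.n1 + df.n3) / 2)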
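To see how `Series.where` behaves here, a small self-contained check may help. The toy values are invented for illustration, the column names n1, n2 and n3 are only an assumption about the layout, and the comparison follows the rule as stated in the question (n2 - n1).

```python
import pandas as pd

# Toy data (made up); columns named n1, n2, n3 are an assumption.
df = pd.DataFrame({
    "n1": [0.25, 0.25],
    "n2": [0.75, 0.5],
    "n3": [0.25, 0.25],
})

# True where n2 jumps more than 0.3 above n1 and n3 is at least 0.9 * n1.
mask = (df["n2"] - df["n1"] > 0.3) & (df["n3"] >= 0.9 * df["n1"])

# where() keeps values where its condition is True, hence the ~mask.
# It returns a new Series, so the result is assigned back to the column.
df["n2"] = df["n2"].where(~mask, (df["n1"] + df["n3"]) / 2)
print(df)
```

Only the first row meets both conditions, so its n2 becomes the mean of n1 and n3, while the second row is left unchanged.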

I assume you want to do this process for each column of the dataframe. This works with a fake dataframe I created to replicate the process:

import numpy as np

# Iterate over each column, treating consecutive rows as consecutive observations
for c in list(df):
    prev = df[c].shift(1)   # n1; NaN in the first row, so the test is False there
    nxt = df[c].shift(-1)   # n3; NaN in the last row
    spike = (df[c] - prev > 0.3) & (nxt >= 0.9 * prev)
    df[c] = np.where(spike, (prev + nxt) / 2, df[c])
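Since the question describes a wide layout in which each row is one time series, the same idea can also be sketched along the columns by shifting with `axis=1`. The frame below is invented for illustration, and the column name `id` and the date labels are assumptions. As in the column-wise version, each value is compared with its original neighbours.

```python
import pandas as pd

# Hypothetical wide frame: an "id" column followed by one column per date.
df = pd.DataFrame({
    "id": ["a", "b"],
    "2020-01-01": [0.25, 0.25],
    "2020-01-09": [0.75, 0.5],
    "2020-01-17": [0.25, 0.25],
    "2020-01-25": [0.5, 0.25],
})

obs = df.drop(columns="id")     # only the date columns
prev = obs.shift(1, axis=1)     # n1: value one date earlier (NaN in the first date)
nxt = obs.shift(-1, axis=1)     # n3: value one date later (NaN in the last date)

# Comparisons against NaN are False, so the first and last dates never change.
spike = ((obs - prev) > 0.3) & (nxt >= 0.9 * prev)

# mask() replaces values where the condition is True.
df[obs.columns] = obs.mask(spike, (prev + nxt) / 2)
print(df)
```

Leaving the shifted edges as NaN, rather than filling them with 0, keeps the first and last observations from being compared against an artificial zero.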
