Generate Column Value in Pandas based on previous rows

Let us assume I am taking a temperature measurement at a regular interval and recording the values in a Pandas DataFrame:

day   temperature [F]
0       89          
1       91         
2       93         
3       88            
4       90

Now I want to create another column which is set to 1 if and only if two consecutive values are above a certain level. In my scenario I want a value of 1 if the current value and the one before it are both above 90, thus yielding:

day   temperature        Above limit?
0       89               0
1       91               0
2       93               1
3       88               0
4       91               0
5       91               1
6       93               1

Despite some SO and Google digging, it's not clear to me whether I can use iloc[x], loc[x] or something else in a for loop.

Try this:

import pandas as pd

df = pd.DataFrame({'temperature': [89, 91, 93, 88, 90, 91, 91, 93]})

limit = 90
df['Above'] = ((df['temperature']>limit) & (df['temperature'].shift(1)>limit)).astype(int)
df
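
To see why this works, here is a minimal sketch of what shift(1) does: it moves every value down one row, so each row can be compared with its predecessor. The first row has no predecessor and gets NaN, and NaN > limit evaluates to False, which is why row 0 is never flagged. The helper column name `previous` is only for illustration and is not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({'temperature': [89, 91, 93, 88, 90, 91, 91, 93]})
limit = 90

# shift(1) moves each value down one row; row 0 has no predecessor and gets NaN
df['previous'] = df['temperature'].shift(1)

# NaN > limit is False, so the first row can never be flagged
df['Above'] = ((df['temperature'] > limit) & (df['previous'] > limit)).astype(int)

print(df)
```

Keeping the shifted values in their own column is optional; it just makes the row-by-row comparison easier to inspect.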

In the future, please include the code needed for testing (in this case the df construction line).

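# Assumes the columns are ordered: day (0), temperature (1), limit (2)
df['limit'] = 0

for i in range(1, len(df)):
    # Flag the row when both this reading and the previous one exceed 90
    if df.iloc[i, 1] > 90 and df.iloc[i - 1, 1] > 90:
        df.iloc[i, 2] = 1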

Here iloc[i,2] refers to row index i and column index 2 (the limit column). Hope this helps.

You are looking for the shift function in pandas.


import io
import pandas as pd

data = """
day   temperature        Expected
0       89               0
1       91               0
2       93               1
3       88               0
4       91               0
5       91               1
6       93               1
"""

data = io.StringIO(data)
df = pd.read_csv(data, sep=r'\s+')

df['Result'] = ((df['temperature'].shift(1) > 90) & (df['temperature'] > 90)).astype(int)

# Validation
(df['Result'] == df['Expected']).all()

Solution using shift():

>> threshold = 90
>> df['Above limit?'] = 0
>> df.loc[((df['temperature [F]'] > threshold) & (df['temperature [F]'].shift(1) > threshold)), 'Above limit?'] = 1
>> df
    day temperature [F] Above limit?
0   0   89              0
1   1   91              0
2   2   93              1
3   3   88              0
4   4   90              0

Try using rolling(window=2) and then apply() as follows:

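# raw=True passes each window as a NumPy array, so x[0] and x[-1] are positional
df["limit"] = df['temperature'].rolling(2).apply(lambda x: int(x[0] > 90) & int(x[-1] > 90), raw=True).fillna(0).astype(int)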
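
If more than two consecutive readings ever need to be checked, one possible variation is to apply the rolling window to the boolean comparison rather than to the raw values. This is only a sketch: the parameter `n` is an assumption introduced here for illustration, and the data is the seven-row example from the question.

```python
import pandas as pd

df = pd.DataFrame({'temperature': [89, 91, 93, 88, 91, 91, 93]})
limit = 90
n = 2  # assumed parameter: how many consecutive readings must exceed the limit

# 1 where a reading exceeds the limit, 0 otherwise
above = (df['temperature'] > limit).astype(int)

# A row is flagged when all of the last n readings exceeded the limit;
# incomplete windows give NaN, and NaN == n is False
df['limit'] = (above.rolling(n).sum() == n).astype(int)

print(df)
```

With n = 2 this gives the same flags as the expected column in the question, and no Python-level lambda is called per window.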
