Generate Column Value in Pandas based on previous rows
Let us assume I am taking a temperature measurement at a regular interval and recording the values in a Pandas DataFrame:
day temperature [F]
0 89
1 91
2 93
3 88
4 90
Now I want to create another column which is set to 1 if and only if two consecutive values (the current one and the one before it) are above a certain level. In my scenario, I want a column value of 1 if the two consecutive values are above 90, thus yielding
day temperature Above limit?
0 89 0
1 91 0
2 93 1
3 88 0
4 91 0
5 91 1
6 93 1
Despite some SO and Google digging, it's not clear whether I can use iloc[x], loc[x], or something else in a for loop.
Try this:
import pandas as pd
df = pd.DataFrame({'temperature': [89, 91, 93, 88, 90, 91, 91, 93]})
limit = 90
# 1 where both the current and the previous reading exceed the limit
df['Above'] = ((df['temperature'] > limit) & (df['temperature'].shift(1) > limit)).astype(int)
df
In the future, please include the code needed for testing (in this case, the line that constructs df).
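To see why the shift comparison above works, it can help to write out the intermediate steps. The following is only an illustrative sketch: the column names `previous`, `current_above` and `previous_above` are made up for this example and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'temperature': [89, 91, 93, 88, 90, 91, 91, 93]})
limit = 90

# Hypothetical helper columns, added only to make each step visible
df['previous'] = df['temperature'].shift(1)     # value from the row above; NaN for row 0
df['current_above'] = df['temperature'] > limit
df['previous_above'] = df['previous'] > limit   # NaN > limit evaluates to False

# 1 only where both the current and the previous reading exceed the limit
df['Above'] = (df['current_above'] & df['previous_above']).astype(int)
print(df)
```

Because the comparison against NaN is False, the first row ends up as 0 without any special handling.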
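# Assumes df has the columns day (position 0) and temperature (position 1)
df['limit'] = 0  # new column at position 2, initialised with integers
df.iloc[0, 2] = 0  # the first row has no previous value
for i in range(1, len(df)):
    # flag the row when both it and the previous row exceed 90
    if df.iloc[i, 1] > 90 and df.iloc[i-1, 1] > 90:
        df.iloc[i, 2] = 1
    else:
        df.iloc[i, 2] = 0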
Here iloc[i, 2] refers to row position i and column position 2 (the limit column). Hope this helps.
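The loop above depends on the temperature and limit columns sitting at positions 1 and 2. A possible variant, sketched here under the assumption that the frame has `day` and `temperature` columns as in the question, looks the positions up by name with `df.columns.get_loc` so the loop keeps working if the column order changes. The variable names `temp_col` and `limit_col` are illustrative.

```python
import pandas as pd

# Assumed sample data, matching the expected output in the question
df = pd.DataFrame({'day': range(7),
                   'temperature': [89, 91, 93, 88, 91, 91, 93]})

df['limit'] = 0  # default: not above the limit
temp_col = df.columns.get_loc('temperature')   # integer position of the column
limit_col = df.columns.get_loc('limit')

for i in range(1, len(df)):
    # flag the row when both it and the previous row exceed 90
    if df.iloc[i, temp_col] > 90 and df.iloc[i - 1, temp_col] > 90:
        df.iloc[i, limit_col] = 1

print(df)
```

For large frames an explicit loop like this is usually much slower than a vectorised comparison, so it is mainly useful for understanding the logic.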
You are looking for the shift function in pandas.
import io
import pandas as pd
data = """
day temperature Expected
0 89 0
1 91 0
2 93 1
3 88 0
4 91 0
5 91 1
6 93 1
"""
data = io.StringIO(data)
df = pd.read_csv(data, sep=r'\s+')
df['Result'] = ((df['temperature'].shift(1) > 90) & (df['temperature'] > 90)).astype(int)
# Validation
(df['Result'] == df['Expected']).all()
Try using rolling(window=2) and then apply(), as follows:
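# raw=True passes each window as a NumPy array, so x[0] and x[-1] are positional
df["limit"] = df['temperature'].rolling(2).apply(lambda x: int(x[0] > 90) & int(x[-1] > 90), raw=True).fillna(0).astype(int)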
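A related approach, offered only as a sketch and not as part of the answer above, avoids apply() altogether: every value in a window exceeds the limit exactly when the window's minimum does. The `window` variable is an assumption introduced here so the same code can cover more than two consecutive readings.

```python
import pandas as pd

# Assumed sample data, matching the expected output in the question
df = pd.DataFrame({'temperature': [89, 91, 93, 88, 91, 91, 93]})
limit = 90
window = 2  # number of consecutive readings that must exceed the limit

# Minimum over the current row and the (window - 1) rows before it;
# rows without a full window get NaN
rolling_min = df['temperature'].rolling(window).min()

# NaN > limit is False, so incomplete windows become 0
df['limit'] = (rolling_min > limit).astype(int)
print(df)
```

Setting `window = 3` would require three consecutive readings above the limit instead of two.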