How to avoid writing a gross for-loop on a pandas DataFrame
I have a DataFrame like:
import pandas as pd
begin_month = pd.Series([1, 19, 45, 32, 54])
end_month = pd.Series([19,45,32,54,99])
inventory = pd.DataFrame({"begin_month":begin_month, "end_month": end_month})
I want to add a third column, a boolean, that answers: "for each month, does the begin_month inventory equal the previous month's end_month inventory level?"
I can write a foul for-loop that does this, but I am wondering how to write a vectorized operation that achieves the same thing. The edge case is index location 0, which has no previous row to compare its begin_month value to.
import pandas as pd
begin_month = pd.Series([1, 19, 145, 32, 54])
end_month = pd.Series([19, 45, 32, 54, 99])
df = pd.DataFrame({"begin_month": begin_month, "end_month": end_month})
# shift() moves end_month down one row, so each row lines up with the previous month's value
df['parity'] = df['begin_month'] == df['end_month'].shift()
# row 0 has no predecessor (its shifted value is NaN), so set it explicitly
df.loc[0, 'parity'] = True
print(df)
The key is to use .shift() so that you can compare the current row with an adjacent row. I set df.loc[0, 'parity'] = True because the first row has no predecessor to compare against; without that assignment, the comparison with NaN evaluates to False.
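As a possible variant (not part of the original answer), the first-row edge case can be folded into the shift itself instead of being patched afterwards. The sketch below assumes pandas 0.24 or later, where shift() accepts a fill_value argument, and fills the empty first slot with the first begin_month value so that row 0 compares equal to itself. The loop and the name expected exist only to check the vectorized result against the row-by-row version the question wanted to avoid.

```python
import pandas as pd

df = pd.DataFrame({
    "begin_month": [1, 19, 145, 32, 54],
    "end_month": [19, 45, 32, 54, 99],
})

# Shift end_month down one row. The vacated first slot is filled with the
# first begin_month value, so row 0 is True by construction (an assumption
# about how the edge case should be treated, matching the answer's choice).
prev_end = df["end_month"].shift(fill_value=df["begin_month"].iloc[0])
df["parity"] = df["begin_month"] == prev_end

# Loop baseline, used here only to cross-check the vectorized column.
expected = [True]
for i in range(1, len(df)):
    expected.append(bool(df["begin_month"].iloc[i] == df["end_month"].iloc[i - 1]))

print(df)
```

Because fill_value is an integer here, the shifted column stays integer-typed rather than being upcast to float to hold NaN, and no label-based assignment to row 0 is needed, which also helps when the index does not start at 0.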