How to pull the first instance when a column in pandas meets a condition?

Question

I'm trying to pull the first instance where an account balance equals or drops below 0. In the example below, I would like to create a column that is populated only on the row where X and Y move from a positive number to 0 or below, i.e. X would be 2017-1-4 in row 4 and Y would be 2018-2-3 in row 8.

import numpy as np
import pandas as pd

df = pd.DataFrame()
df['Account'] = ['X','X','X','X','X','Y','Y','Y']
df['Balance'] = [100,90,80,0,0,900,90,-1]
df['Date'] = pd.to_datetime(['2017-1-1','2017-1-2','2017-1-3','2017-1-4','2017-1-5','2018-2-1','2018-2-2','2018-2-3'])
print(df)

Thanks!

Edit: I think the answer I was probably looking for is something like this:

x = df.groupby('Account')['Balance']\
       .apply(lambda x: (x<=0) & (0<x.shift()))

This returns True for the row where the balance went to 0 or less, by comparing it with the previous value. However, when I try to get the date information, it gives me a number that I don't understand:

y = np.where(x,df['Date'],pd.NaT)

array([NaT, NaT, NaT, 1483488000000000000, NaT, NaT, NaT, 1517616000000000000], dtype=object)

How do I resolve this? I'm still quite new to Python and pandas, so it might be something quite obvious!

Answer 1

A possible solution is to use df.values, which returns the dataframe as a NumPy array. You can then use a for loop to iterate through each row of the dataframe, check whether the account is X or Y and Balance <= 0, and return the date if so:

def zero_bal(a, df=df):
    # Each row is [Account, Balance, Date]; return the first matching date.
    for each in df.values:
        if each[0] == a and each[1] <= 0:
            return each[2]

X, Y = zero_bal('X'), zero_bal('Y')

In the code above, the "each" in "for each in df.values:" would be something like:

['X', 80, Timestamp('2017-01-03 00:00:00')]

You can then use the indices each[0], each[1] and each[2] to select the Account, Balance and Date respectively, and check whether they are what you are looking for.
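For comparison, the same lookup can probably be written without an explicit loop. The following is only a sketch that rebuilds the example dataframe from the question; the name first_nonpositive is made up for illustration. Like the loop above, it returns the first date on which the balance is 0 or less, without checking that the previous balance was positive.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "Account": ["X", "X", "X", "X", "X", "Y", "Y", "Y"],
    "Balance": [100, 90, 80, 0, 0, 900, 90, -1],
    "Date": pd.to_datetime([
        "2017-1-1", "2017-1-2", "2017-1-3", "2017-1-4", "2017-1-5",
        "2018-2-1", "2018-2-2", "2018-2-3",
    ]),
})

# Keep only rows with a balance of 0 or less, then take the first date
# per account (rows are assumed to be in chronological order already).
first_nonpositive = df.loc[df["Balance"] <= 0].groupby("Account")["Date"].first()
print(first_nonpositive)
```

An account whose balance never reaches 0 simply does not appear in the result, whereas the loop version returns None for it.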

Answer 2

You could apply the boolean mask directly to your dataframe, as follows:

x = df.groupby('Account', group_keys=False)['Balance'].apply(lambda s: (s<=0) & (0<s.shift()))

df[x] or df[x]['column_name_that_you_need']

(Passing group_keys=False keeps the original row index on the mask; on recent pandas versions the default may add the group key to the index, in which case df[x] would not align.)
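As a sketch of how the requested column could be built, the mask can be computed with a grouped shift and passed to Series.where, which keeps the datetime dtype and fills the other rows with NaT. The integers in the question most likely come from np.where mixing a datetime64 array with pd.NaT into an object array. The column name ZeroDate and the variable names below are assumptions made for illustration.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "Account": ["X", "X", "X", "X", "X", "Y", "Y", "Y"],
    "Balance": [100, 90, 80, 0, 0, 900, 90, -1],
    "Date": pd.to_datetime([
        "2017-1-1", "2017-1-2", "2017-1-3", "2017-1-4", "2017-1-5",
        "2018-2-1", "2018-2-2", "2018-2-3",
    ]),
})

# Previous balance within each account; the first row of a group is NaN.
prev_balance = df.groupby("Account")["Balance"].shift()

# True only where the balance moves from positive to 0 or below.
mask = (df["Balance"] <= 0) & (prev_balance > 0)

# Series.where keeps the date where the mask is True and NaT elsewhere.
df["ZeroDate"] = df["Date"].where(mask)

# Only the crossing rows, if a compact result is preferred.
crossings = df.loc[mask, ["Account", "Date"]]
print(df)
print(crossings)
```

Because the grouped shift returns a Series on the original index, the mask aligns with df without any extra index handling.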
