Need to assign values to a DataFrame if any value > 0 exists in another column

Question

I'm working with a transaction database query set, and I wasn't able to pull specific dates for payments, so I'm trying to write some code in Python to assign the dates for me. My first thought was to do it in Excel, but the dataset is 800,000+ rows by 100+ columns, so doing it in Excel isn't practical. The dataset has values in some of the rows in the payments column, so I need to add a date column with dates only in the rows that contain a payment amount.

I have created all of the columns to store the dates, and my idea was to loop through the rows and assign a date if that row contains a value greater than zero (there are 0s in the columns, as well as NULL values).

df['Payment Date'] = ''

for value in df:
    if value > 0 :
        df['Payment Date'] = '06/01/2019'

I expect the output to have dates in the payment date column only for the rows that have actual payment values.

Answer 1

If I understand correctly, you are trying to (1) identify rows in your DataFrame with values that are greater than zero, and (2) assign a specific date to a new column for all of those rows.

First, for reproducibility and clarity, let's generate some random data that is representative of your dataset:

import numpy as np
import pandas as pd

# Generate a random 5x4 DataFrame
df = pd.DataFrame(np.random.randn(5, 4), columns=list('ABCD'))

# Set the negative values to zero, leaving only zeros and positive values
df[df < 0] = 0

Now, we want to create a new column to store the desired dates:

df['Payment Date'] = ''

And finally, set that column to the desired date for every row that contains a value greater than zero. Note that the condition tested below is that each row's sum, skipping N/As, is greater than zero, which matches "any value greater than zero" only when the row contains no negative values:

# numeric_only=True keeps the string 'Payment Date' column out of the sum
row_inds = df.sum(axis=1, skipna=True, numeric_only=True) > 0
df.loc[row_inds, 'Payment Date'] = '06/01/2019'

This gives you the desired result.
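If the payment columns can also contain negative values, a row sum can hide a positive payment. As a minimal sketch of an alternative, the mask below tests each value directly with `any(axis=1)`. The column names `Payment A` and `Payment B` and the sample values are assumptions for illustration only, not taken from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: column names and values are assumptions
df = pd.DataFrame({
    "Payment A": [0.0, 25.0, np.nan, -5.0],
    "Payment B": [0.0, np.nan, np.nan, 3.0],
})
payment_cols = ["Payment A", "Payment B"]

# True for rows where at least one payment column is greater than zero.
# Comparisons against NaN evaluate to False, so NULL values are skipped.
has_payment = (df[payment_cols] > 0).any(axis=1)

# Start with an empty date column, then fill only the matching rows
df["Payment Date"] = ""
df.loc[has_payment, "Payment Date"] = "06/01/2019"
```

In this sample, the last row sums to a negative number but still gets a date, because one of its values is positive. Restricting the comparison to `payment_cols` also keeps non-numeric columns out of the test.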

Answer 1
0 2019-06-29 19:59:44
