Iterating over pandas rows and setting a column's values based on values in another column

Question

I have a dataframe in which one column (col1) contains the values Y or N. I would like to assign random, non-repeating numbers to the next column (col2) based on col1: if the value in col1 is N, col2 should get a new number; if the value in col1 is Y, col2 should repeat the value from the previous row. I tried a for loop over df.iterrows(), but the numbers in col2 came out equal for all N rows.

Example of the dataframe I want to get:

import pandas as pd

df = pd.DataFrame({'col1': ['N', 'Y', 'Y', 'N', 'N', 'Y'],
                   'col2': [1, 1, 1, 2, 3, 3]})

where each N gets a new number in the other column, while each Y repeats the number from the previous row.

Answer 1

Assuming a DataFrame df:

import numpy as np
import pandas as pd

df = pd.DataFrame(['N', 'Y', 'Y', 'N', 'N', 'Y'], columns=['YN'])
df
    YN
0   N
1   Y
2   Y
3   N
4   N
5   Y

Using itertuples (no repetition):

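np.random.seed(42)
# One distinct number per 'N' row, shuffled so the order is random
arr = np.arange(1, (df.YN == 'N').sum() + 1)
np.random.shuffle(arr)

cnt = 0
for row in df.itertuples():
    if row.YN == 'N':
        # Each 'N' row takes the next unused number
        df.loc[row.Index, 'new'] = arr[cnt]
        cnt += 1
    else:
        # Placeholder, filled from the previous row by ffill() below
        df.loc[row.Index, 'new'] = np.nan
df['new'] = df['new'].ffill().astype(int)
df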
    YN  new
0   N   1
1   Y   1
2   Y   1
3   N   2
4   N   3
5   Y   3
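
The same grouping can also be expressed without an explicit loop. The following is only a sketch, not part of the original answer: it assumes the first row is 'N' (otherwise the leading 'Y' rows have no previous value to repeat) and uses NumPy's Generator API instead of np.random.seed. The names group and labels are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'YN': ['N', 'Y', 'Y', 'N', 'N', 'Y']})

# Each 'N' starts a new group; 'Y' rows stay in the group of the last 'N'.
# Assumption: the first row is 'N', so group ids start at 1.
group = (df['YN'] == 'N').cumsum()

# One distinct number per group, in random order (range 1..number of groups).
rng = np.random.default_rng(42)
labels = rng.permutation(np.arange(1, group.max() + 1))

# Group ids are 1-based, so subtract 1 to index into the labels array.
df['new'] = labels[group.to_numpy() - 1]
```

Because every row in a group looks up the same label, no ffill() step is needed, and the numbers cannot repeat across groups.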

Using apply (repetition may arise with a small number range):

np.random.seed(42)
# Draw a random integer in [0, 10) for each 'N'; 'Y' rows get NaN and are forward-filled
df['new'] = df.YN.apply(lambda x: np.random.randint(10) if x == 'N' else np.nan).ffill().astype(int)
df
    YN  new
0   N   6
1   Y   6
2   Y   6
3   N   3
4   N   7
5   Y   7
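
If the numbers should come from a wider range but still never repeat, one option is to sample without replacement. This is a sketch under assumptions rather than the answer's own method: the range 0..99 is arbitrary, the first row is assumed to be 'N', and is_n and draws are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'YN': ['N', 'Y', 'Y', 'N', 'N', 'Y']})
is_n = df['YN'] == 'N'

rng = np.random.default_rng(42)
# Sample one number per 'N' row from 0..99 without replacement,
# so no two 'N' rows can receive the same value.
draws = rng.choice(100, size=int(is_n.sum()), replace=False)

df['new'] = np.nan                 # start with an empty float column
df.loc[is_n, 'new'] = draws        # fill only the 'N' rows
df['new'] = df['new'].ffill().astype(int)  # 'Y' rows repeat the previous value
```

The range must contain at least as many values as there are 'N' rows, otherwise rng.choice raises a ValueError.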

Answer 1: accepted, 2019-03-12 05:59:55