Pandas: generate a DataFrame column whose values depend on a column of another DataFrame

Question

I am trying to generate a pandas DataFrame in which a column has numerical values based on the values of a column in another DataFrame. Below is an example: I want to generate another DataFrame based on a column of the DataFrame df_.

ipdb> df_ = pd.DataFrame({'c1':[False, True, False, True]})
ipdb> df_
      c1
0  False
1   True
2  False
3   True

Using df_, another DataFrame df1 is generated with the columns below.

ipdb> df1
   col1  col2
0     0   NaN
1     1   0
2     2   NaN
3     3   1

Here, 'col1' holds the normal index values, and 'col2' has NaN in the rows where 'c1' in df_ is False and sequentially incrementing values in the rows where 'c1' is True.

To generate this DataFrame, below is what I have tried.

ipdb> df_[df_['c1']==True].reset_index().reset_index()
   level_0  index    c1
0        0      1  True
1        1      3  True

However, I feel there should be a better way to generate a DataFrame with the two columns as in df1.

Answer 1

I think you need cumsum, then subtract 1 so that counting starts from 0:

import pandas as pd

df_ = pd.DataFrame({'c1':[False, True, False, True]})

# cumsum over the True rows only gives 1, 2, ...; sub(1) shifts it to 0, 1, ...
df_['col2'] = df_.loc[df_['c1'], 'c1'].cumsum().sub(1)
print(df_)
      c1  col2
0  False   NaN
1   True   0.0
2  False   NaN
3   True   1.0
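As a variation on the same idea, here is a minimal sketch that computes the running count on the whole column and then blanks out the False rows with `Series.where`. The variable names `running`, `col2`, and `col2_int` are only illustrative, and the last step assumes you would rather have nullable integers than the floats that NaN forces.

```python
import pandas as pd

df_ = pd.DataFrame({'c1': [False, True, False, True]})

# Running count of True values seen so far, shifted so the first True gets 0.
running = df_['c1'].cumsum().sub(1)

# Keep the count only where c1 is True; every other row becomes NaN.
col2 = running.where(df_['c1'])

# Optional: the nullable 'Int64' dtype keeps whole numbers next to missing values.
col2_int = col2.astype('Int64')

print(col2)
print(col2_int)
```

Because NaN is a float, the plain version is stored as float64, which is why the output above shows 0.0 and 1.0.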

Another solution is to count the True values with sum, build the sequence 0..n-1 with numpy.arange, and assign it back to the filtered rows of the DataFrame:

import numpy as np

# one value per True row; rows where c1 is False keep NaN
df_.loc[df_['c1'], 'col2'] = np.arange(df_['c1'].sum())
print(df_)
      c1  col2
0  False   NaN
1   True   0.0
2  False   NaN
3   True   1.0

Details:

print(df_['c1'].sum())
2

print(np.arange(df_['c1'].sum()))
[0 1]
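One point worth checking with the numpy.arange approach is that the array is assigned by position to the selected rows, so its length has to equal the number of True values. The sketch below uses a hypothetical string index (`'a'` to `'d'`, not part of the question) to show that the index labels themselves play no role.

```python
import numpy as np
import pandas as pd

# Hypothetical non-default index, only to illustrate positional assignment.
df_ = pd.DataFrame({'c1': [False, True, False, True]}, index=list('abcd'))

mask = df_['c1']
n_true = int(mask.sum())  # number of rows where c1 is True

# One value per True row, in row order; the other rows are left as NaN.
df_.loc[mask, 'col2'] = np.arange(n_true)

print(df_)
```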

Answer 2

Another way to solve this:

df_.loc[df_['c1'], 'col2'] = range(len(df_[df_['c1']]))

Output:

      c1  col2
0  False   NaN
1   True   0.0
2  False   NaN
3   True   1.0
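The answers above add col2 to df_ itself, while the question asks for a separate df1 with both col1 and col2. A minimal sketch of one way to assemble it follows, assuming col1 should simply be the row position 0..n-1; this is an illustration, not something either answer proposed.

```python
import numpy as np
import pandas as pd

df_ = pd.DataFrame({'c1': [False, True, False, True]})

df1 = pd.DataFrame(
    {
        # Assumption: col1 is the positional row number.
        'col1': np.arange(len(df_)),
        # Running count of True values, kept only on the True rows.
        'col2': df_['c1'].cumsum().sub(1).where(df_['c1']),
    },
    index=df_.index,
)

print(df1)
```

Building df1 directly avoids the double reset_index() from the question and leaves df_ unchanged.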

Answer 1 (2 votes, accepted): 2018-11-20 06:26:21

Answer 2 (2 votes): 2018-11-20 06:38:07
