How to create new columns by iterating over existing columns in pandas?

Question

I have a DataFrame:

     foo   column1 column2 ..... column9999
0     5      0.8      0.01
1     10     0.9      0.01
2     15     0.2      1.2
3     8      0.12     0.5
4     74     0.78     0.7
.      ...     ...

Based on these existing columns, I want to create new columns.
If I go one by one, it would look like this:

df["A1"] = df.foo[df["column1"] > 0.1].rank(ascending=False)
df["A1"] = df["A1"].fillna(value=0)
df['new_A1'] = (1+df['A1'])
df['log_A1'] = np.log(df['new_A1'])

But I don't want to write this out for every column (more than 900 columns).
How can I iterate over them and create the new columns?
Thanks in advance!

Answer 1

Here's a cleaned-up version of what I think you are trying to do:

# Include only variables with the "column" stub
cols = [c for c in df.columns if 'column' in c]

for i, c in enumerate(cols):
    a = f"A{i+1}"
    df[a] = 1 + df.loc[df[c] > 0.1, 'foo'].rank(ascending=False)
    df[f'log_{a}'] = np.log(df[a]).fillna(value=0)

I'm assuming that you didn't need the new_A# column and were just using it as an intermediate step for the log calculation.
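With more than 900 source columns, assigning columns one at a time in a loop may trigger a pandas PerformanceWarning about a fragmented frame. A possible variant, sketched here under assumptions, collects the new columns in a dict and attaches them with a single pd.concat. The names new_cols and rank are hypothetical, the sample data is copied from the question, and failed-filter rows get rank 0 as in the question's code (unlike the loop above, which leaves them as NaN in the A# column).

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (only two "column" columns are shown there).
df = pd.DataFrame({
    "foo": [5, 10, 15, 8, 74],
    "column1": [0.8, 0.9, 0.2, 0.12, 0.78],
    "column2": [0.01, 0.01, 1.2, 0.5, 0.7],
})

cols = [c for c in df.columns if "column" in c]

new_cols = {}  # hypothetical name: holds the new columns until they are attached
for i, c in enumerate(cols, start=1):
    # Rank foo (descending) among rows where the column exceeds 0.1.
    # Rows that fail the filter are absent from the result, so reindex
    # back to the full index and give them rank 0, as in the question.
    rank = (
        df.loc[df[c] > 0.1, "foo"]
        .rank(ascending=False)
        .reindex(df.index, fill_value=0)
    )
    new_cols[f"A{i}"] = rank
    new_cols[f"log_A{i}"] = np.log1p(rank)  # log(1 + rank)

# Attach all new columns in one step instead of one insert per column.
df = pd.concat([df, pd.DataFrame(new_cols)], axis=1)
print(df)
```

Using np.log1p folds the "+1" and the log into one call, so no intermediate new_A# column is needed.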

Answer 2

You can iterate through the column names and perform the +1 and log operations. df.columns gives you the column headers, so you can do something like this, for example:

for index, column in enumerate(df.columns):
  df['new_A' + str(index)] = (1+df[column])
  df['log_A' + str(index)] = np.log(df['new_A' + str(index)])

You can add the rest of the operations inside the same loop too.
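As a rough sketch of what "the rest of the operations" might look like inside that loop, the version below adds the filter, rank, and fillna steps from the question. It assumes that foo should be skipped as a source column and that numbering should start at 1; source_cols and name are hypothetical names, and the sample data is copied from the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "foo": [5, 10, 15, 8, 74],
    "column1": [0.8, 0.9, 0.2, 0.12, 0.78],
    "column2": [0.01, 0.01, 1.2, 0.5, 0.7],
})

# Assumption: "foo" is the value being ranked, not a filter column, so skip it.
# Taking the list up front also keeps newly created columns out of the loop.
source_cols = [c for c in df.columns if c != "foo"]

for index, column in enumerate(source_cols, start=1):
    name = "A" + str(index)
    # Rank foo (descending) among rows where this column exceeds 0.1;
    # assignment aligns on the index, so other rows become NaN.
    df[name] = df.loc[df[column] > 0.1, "foo"].rank(ascending=False)
    df[name] = df[name].fillna(0)
    df["new_" + name] = 1 + df[name]
    df["log_" + name] = np.log(df["new_" + name])

print(df)
```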

Hope it helps.

Answer 3

You can just do:

import pandas as pd
import numpy as np


df = pd.read_csv('something.csv')


a = ['A'+str(i) for i in range(1, len(df.columns.values))]
b = [x for x in df.columns.values if x != 'foo']
to_create = list(zip(b, a))
for create in to_create:
    df[create[1]] = df.foo[df[create[0]] > 0.1].rank(ascending=False)
    df['new_'+create[1]] = (1+df[create[1]])
    df['log_'+create[1]] = np.log(df['new_'+create[1]])

print(df.fillna(value=0))

which outputs:

   foo  column1  column2   A1  new_A1    log_A1   A2  new_A2    log_A2
0    5     0.80     0.01  5.0     6.0  1.791759  0.0     0.0  0.000000
1   10     0.90     0.01  3.0     4.0  1.386294  0.0     0.0  0.000000
2   15     0.20     1.20  2.0     3.0  1.098612  2.0     3.0  1.098612
3    8     0.12     0.50  4.0     5.0  1.609438  3.0     4.0  1.386294
4   74     0.78     0.70  1.0     2.0  0.693147  1.0     2.0  0.693147
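One detail in the output above: new_A2 is 0.0 for the rows that fail the filter, whereas the question's code (which fills NaN before adding 1) would give 1.0 there. The small sketch below illustrates the difference using the A2 ranks from the output; fill_last and fill_first are hypothetical names. The log columns come out the same either way, since log(1) is 0.

```python
import numpy as np
import pandas as pd

# A2 ranks from the output above; NaN marks rows that failed the > 0.1 filter.
a2 = pd.Series([np.nan, np.nan, 2.0, 3.0, 1.0], name="A2")

# Fill at the very end, as in this answer: 1 + NaN stays NaN, then becomes 0.
fill_last = (1 + a2).fillna(0)

# Fill before adding 1, as in the question: filtered rows get rank 0, so 1 + 0 = 1.
fill_first = 1 + a2.fillna(0)

print(pd.DataFrame({"fill_last": fill_last, "fill_first": fill_first}))
```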

Answer 1
1 (accepted) 2019-08-17 17:28:50

Answer 2
0 2019-08-17 17:21:48

Answer 3
0 2019-08-17 17:39:37
