Iterating over consecutive columns of a pandas DataFrame

Question

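I am trying to write a loop in pandas that computes the difference between consecutive columns and stores each result in a new column:

Original df: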

**201601** **201602** **201603**  
100           200         500

Desired output:

**201601** **201602** **201603**  **201602_201601** **201603_02**
100           200         500         100          300

My code, which I adapted from a Stack Overflow post (Adding columns to a dataframe calculated by a for loop in python):

for i in df.iloc[:,2:5]:
  for j in df.iloc[:,2:5]:
    if i == j:
        break
    else:
        bina = df[i]-df[j]
        df['MOM_' + str(j) + '_' + str(i)] = bina
df.head()

However, the output I get is:

**201601** **201602** **201603**  **201602_201601** **201603_201601** **201603_201602**
100           200         500         100          400   300

I have already used diff to get the result I need, but I cannot figure out how to write the for loop. Any help would be greatly appreciated.

Thanks

Answer 1

Use diff, plus a simple list comprehension over zip to build the new column names.

cols = [f'{b}_{a}' for (a,b) in zip(df.columns, df.columns[1:])]
df[cols] = df.diff(axis=1).dropna(axis=1)

    201601  201602  201603  201602_201601   201603_201602
0   100     200     500     100             300

Whenever possible, avoid for loops when working with pandas.
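For anyone who wants to run the diff approach end to end, here is a minimal self-contained sketch. It assumes the column labels are strings and rebuilds the one-row example from the question. As a small variation (not part of the answer above), it skips the first difference column by position with iloc instead of dropna, so a column that happens to contain NaN in the real data is not dropped by accident.

```python
import pandas as pd

# Example data from the question; string column labels are an assumption.
df = pd.DataFrame({"201601": [100], "201602": [200], "201603": [500]})

# Pair each column with its successor to build names such as "201602_201601".
cols = [f"{b}_{a}" for a, b in zip(df.columns, df.columns[1:])]

# Column-wise differences. The first column has no predecessor and is all NaN,
# so it is skipped by position rather than with dropna.
diffs = df.diff(axis=1).iloc[:, 1:]
diffs.columns = cols

# Append the difference columns to the original frame.
result = pd.concat([df, diffs], axis=1)
print(result)
```

Note that diff introduces NaN in the first column, so the difference columns come back as floats even when the inputs are integers.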

Answer 2

This just fixes your code:

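col = df.columns  # snapshot of the original column labels
for x, i in enumerate(col):
    for y, j in enumerate(col):
        # keep only pairs where j is the column immediately after i
        if y - x == 1:
            # later column minus earlier column, e.g. 201602 - 201601
            bina = df[j] - df[i]
            df['MOM_' + str(j) + '_' + str(i)] = bina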
df.columns
Out[1210]: 
Index(['**201601**', '**201602**', '**201603**', 'MOM_**201602**_**201601**',
       'MOM_**201603**_**201602**'],
      dtype='object')
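The loop in the question produces three columns because it pairs each column with every earlier column, not only with its immediate neighbour. If a loop is still preferred, a single pass over neighbouring pairs with zip is enough. The sketch below is one way to write it; the helper name add_consecutive_diffs and the prefix parameter are hypothetical, and string column labels are assumed.

```python
import pandas as pd

def add_consecutive_diffs(df, prefix="MOM_"):
    """Return a copy of df with one difference column per pair of neighbouring columns.

    Hypothetical helper for illustration; not part of pandas.
    """
    out = df.copy()
    cols = list(df.columns)  # snapshot, so newly added columns are never visited
    for prev, curr in zip(cols, cols[1:]):
        # later column minus earlier column, e.g. 201602 - 201601
        out[f"{prefix}{curr}_{prev}"] = df[curr] - df[prev]
    return out

# Example data from the question.
df = pd.DataFrame({"201601": [100], "201602": [200], "201603": [500]})
result = add_consecutive_diffs(df)
print(result)
```

Taking the snapshot of the column labels before the loop matters: the new columns are written to the copy, so they never feed back into the iteration.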
