
pandas convert grouped rows into columns

I have a dataframe such as:

label    column1
  a         1   
  a         2
  b         6
  b         4

I would like to make a dataframe with a new column that holds the other value of column1 from the row with the same label. Such as:

label    column1    column2
  a         1          2
  a         2          1
  b         6          4
  b         4          6

I know this is probably very simple to do with a groupby command, but I've been searching and can't find anything.

You can try the code block below:

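import pandas as pd

# Create the DataFrame
df = pd.DataFrame({'label': ['a', 'a', 'b', 'b'],
                   'column1': [1, 2, 6, 4]})

# First and last row of each label group
a = df.groupby('label').first().reset_index()
b = df.groupby('label').last().reset_index()

# Stack last-then-first so each label's values come out swapped;
# a stable sort keeps that order within each label
df2 = (pd.concat([b, a])
       .sort_values(by='label', kind='mergesort')
       .rename(columns={'column1': 'column2'})
       .reset_index(drop=True))

# Attach the swapped values by row position. Passing on= together with
# left_index/right_index to merge raises an error, so assign the column directly.
# This relies on df having the default 0..n-1 index with rows grouped by label.
df['column2'] = df2['column2']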

Hope this helps.

Assuming there are only pairs of labels, and the two values in each pair differ, you could also use the following:

import pandas as pd

# Create dataframe
df = pd.DataFrame(data={'label': ['a', 'a', 'b', 'b'],
                        'column1': [1, 2, 6, 4]})

# Iterate over the dataframe; for each row, find the row with the
# same label but a different column1 value
newvalues = []
for index, row in df.iterrows():
    match = df[(df['label'] == row['label']) & (df['column1'] != row['column1'])]
    newvalues.append(int(match['column1'].values[0]))

# Set the values as a new column (DataFrame.set_value was removed in pandas 1.0)
df['column2'] = newvalues

df.head()
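
Under the same pairs-only assumption, and only when column1 is numeric, the loop can be avoided: the partner's value is the pair's sum minus the row's own value. This is a sketch of an alternative, not part of the answer above, and it also works when both values in a pair are equal.

```python
import pandas as pd

df = pd.DataFrame({'label': ['a', 'a', 'b', 'b'],
                   'column1': [1, 2, 6, 4]})

# Assumption: every label occurs exactly twice and column1 is numeric.
# The partner's value is then the pair's sum minus the row's own value.
pair_sum = df.groupby('label')['column1'].transform('sum')
df['column2'] = pair_sum - df['column1']

print(df)
```

If a label can occur more than twice, this no longer gives "the other value", so one of the reversal-based approaches is a better fit.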

The following uses groupby and apply and seems to work okay:

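import numpy as np
import pandas as pd

x = pd.DataFrame({'label': ['a', 'a', 'b', 'b'],
                  'column1': [1, 2, 6, 4]})

# Within each label group, add column2 as column1 in reverse order.
# Selecting the columns explicitly keeps 'label' in each group on newer
# pandas versions, where apply stops passing the grouping column by default.
y = x.groupby('label')[['label', 'column1']].apply(
    lambda g: g.assign(column2=np.asarray(g.column1[::-1])))
y = y.reset_index(drop=True)  # optional: drop the group-level index

print(y)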

You can use groupby with apply, creating a new Series in reversed order for each group:

# df is the dataframe from the question
df['column2'] = (df.groupby('label')['column1']
                   .apply(lambda x: pd.Series(x[::-1].values))
                   .reset_index(drop=True))

print(df)
   column1 label  column2
0        1     a        2
1        2     a        1
2        6     b        4
3        4     b        6
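
A closely related variant, sketched here as an assumption rather than as part of the answer, uses transform instead of apply. Because transform returns a result aligned with the original index, no reset_index is needed, and it does not depend on rows with the same label being adjacent.

```python
import pandas as pd

df = pd.DataFrame({'label': ['a', 'a', 'b', 'b'],
                   'column1': [1, 2, 6, 4]})

# Reverse column1 within each label group; transform puts the reversed
# values back on the original row positions of that group
df['column2'] = (df.groupby('label')['column1']
                   .transform(lambda s: s.to_numpy()[::-1]))

print(df)
```

For groups with more than two rows, this reverses the order within the group, which matches the "opposite value" idea only for pairs.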
