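Create new columns in a data frame based on an existing numeric column, a list of strings as column names, and a list of tuples as values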

Question

I have a data frame that contains a numeric column, along with a list of tuples and a list of strings. The list of tuples holds the values to be added: each index in that list corresponds to a value of the numeric column in the data frame. The list of strings holds the names of the columns to be added.

Example:

import pandas as pd

df = pd.DataFrame({'number':[0,0,1,1,2,2,3,3]})

# a list of keys and a list of tuples
keys = ['foo','bar']
combinations = [('99%',0.9),('99%',0.8),('1%',0.9),('1%',0.8)]

Expected output:

   number  foo  bar
0       0  99%  0.9
1       0  99%  0.9
2       1  99%  0.8
3       1  99%  0.8
4       2   1%  0.9
5       2   1%  0.9
6       3   1%  0.8
7       3   1%  0.8

Answer 1

Original post

To get that output, you can try:

df2 = pd.DataFrame(combinations, columns = keys)
pd.concat([df, df2], axis=1)

which returns (when `number` holds one row per tuple, e.g. [0, 1, 2, 3]):

   number  foo  bar
0       0  99%  0.9
1       1  99%  0.8
2       2   1%  0.9
3       3   1%  0.8

Edit

Based on your new requirements, you can use the following:

df.set_index('number', inplace=True)
df = df.merge(df2, left_index = True, right_index=True)
df = df.reset_index().rename(columns={'index':'number'})

This also works when the number of duplicates varies, e.g.

df = pd.DataFrame({'number':[0,0,1,1,1,2,2,3,3,3]})

returns

   number  foo  bar
0       0  99%  0.9
1       0  99%  0.9
2       1  99%  0.8
3       1  99%  0.8
4       1  99%  0.8
5       2   1%  0.9
6       2   1%  0.9
7       3   1%  0.8
8       3   1%  0.8
9       3   1%  0.8
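
The index-based merge above can also be sketched in one step with `DataFrame.join` and its `on` argument, which matches a column of the left frame against the index of the right frame. This is only an alternative sketch, not the answerer's code; the name `lookup` is introduced here for illustration, and it assumes every value in `number` is a valid position in `combinations`.

```python
import pandas as pd

df = pd.DataFrame({'number': [0, 0, 1, 1, 2, 2, 3, 3]})
keys = ['foo', 'bar']
combinations = [('99%', 0.9), ('99%', 0.8), ('1%', 0.9), ('1%', 0.8)]

# Lookup table (hypothetical name): row i holds the values for number == i
lookup = pd.DataFrame(combinations, columns=keys)

# Left-join each value of 'number' against the lookup table's index;
# the original row order and index of df are kept
result = df.join(lookup, on='number')
print(result)
```

Because the join is driven by the `number` column, `df` keeps its own index and no `set_index` / `reset_index` round trip is needed.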

Answer 2

You can use a list comprehension inside a for loop; I think it's a fairly fast and straightforward approach:

for i in range(len(keys)):
    df[keys[i]] = [x[i] for x in combinations]

Output (when `df` has one row per tuple):

   number  foo  bar
0       0  99%  0.9
1       1  99%  0.8
2       2   1%  0.9
3       3   1%  0.8
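
The loop above assigns one tuple per row, so it needs `len(combinations) == len(df)`. A possible variant for the case with repeated numbers, sketched here under the assumption that each value in `number` is a valid position in `combinations`, looks up the tuple by the row's number instead of by row position:

```python
import pandas as pd

df = pd.DataFrame({'number': [0, 0, 1, 1, 2, 2, 3, 3]})
keys = ['foo', 'bar']
combinations = [('99%', 0.9), ('99%', 0.8), ('1%', 0.9), ('1%', 0.8)]

for i, key in enumerate(keys):
    # For every row, take the tuple selected by its number and keep element i
    df[key] = [combinations[n][i] for n in df['number']]

print(df)
```

This keeps the plain-Python style of the answer while tolerating any number of duplicates per value.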

Answer 3

I found one solution using:

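subsets = []

for model_number, df_subset in df.groupby('number'):
    # work on a copy so the assignment does not touch a view of df
    df_subset = df_subset.copy()

    for key_idx, key in enumerate(keys):
        # the group's number selects the tuple, key_idx selects the value
        df_subset[key] = combinations[model_number][key_idx]

    subsets.append(df_subset)

# DataFrame.append was removed in pandas 2.0; concatenate once instead
df_new = pd.concat(subsets)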

But this seems pretty 'dirty' to me; are there better and more efficient solutions?

Answer 1: score 2, accepted, 2020-02-28 15:58:03
Answer 2: score 1, 2020-02-28 16:00:09
Answer 3: score 1, 2020-02-28 17:11:50
