Create new columns in a data frame based on an existing numeric column, a list of strings as column names and a list of tuples as values

I have a data frame that contains a numeric column, plus a list of tuples and a list of strings. The list of tuples holds the values to add: the tuple at index i belongs to the rows whose numeric column equals i. The list of strings holds the names of the columns to be added.

Example:

import pandas as pd

df = pd.DataFrame({'number':[0,0,1,1,2,2,3,3]})

# a list of keys and a list of tuples
keys = ['foo','bar']
combinations = [('99%',0.9),('99%',0.8),('1%',0.9),('1%',0.8)]

Expected output:

   number  foo  bar
0       0  99%  0.9
1       0  99%  0.9
2       1  99%  0.8
3       1  99%  0.8
4       2   1%  0.9
5       2   1%  0.9
6       3   1%  0.8
7       3   1%  0.8

Original post

To get that output, you can try:

df2 = pd.DataFrame(combinations, columns = keys)
pd.concat([df, df2], axis=1)

which returns

   number  foo  bar
0       0  99%  0.9
1       1  99%  0.8
2       2   1%  0.9
3       3   1%  0.8

Edit

Based on your new requirements, you can use the following:

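# use 'number' as the index so it lines up with the row index of df2
df.set_index('number', inplace=True)
df = df.merge(df2, left_index=True, right_index=True)
# turn the index back into a regular 'number' column
df = df.reset_index().rename(columns={'index': 'number'})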

This also works when the numbers are repeated a different number of times, e.g.

df = pd.DataFrame({'number':[0,0,1,1,1,2,2,3,3,3]})

returns

   number  foo  bar
0       0  99%  0.9
1       0  99%  0.9
2       1  99%  0.8
3       1  99%  0.8
4       1  99%  0.8
5       2   1%  0.9
6       2   1%  0.9
7       3   1%  0.8
8       3   1%  0.8
9       3   1%  0.8
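The same lookup can probably be written as a single merge, without moving 'number' into the index and back. The following is a minimal sketch, not the answer's own code: it assumes the value in 'number' is the position of the matching tuple in combinations, and the name result is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'number': [0, 0, 1, 1, 1, 2, 2, 3, 3, 3]})
keys = ['foo', 'bar']
combinations = [('99%', 0.9), ('99%', 0.8), ('1%', 0.9), ('1%', 0.8)]

# row i of df2 holds the values that belong to number == i
df2 = pd.DataFrame(combinations, columns=keys)

# match the 'number' column of df against the row index of df2;
# how='left' keeps every row of df in its original order
result = df.merge(df2, left_on='number', right_index=True, how='left')
print(result)
```

With how='left', a number that has no matching row in df2 would get NaN in the new columns instead of being dropped.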

You can use a list comprehension inside a for loop. I think it's a pretty fast and straightforward approach, as long as the data frame has exactly one row per tuple:

for i in range(len(keys)):
    df[keys[i]] = [x[i] for x in combinations]

Output:

   number  foo  bar
0       0  99%  0.9
1       1  99%  0.8
2       2   1%  0.9
3       3   1%  0.8
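The loop above assigns one tuple per row, so it needs len(df) == len(combinations). For the data frame from the question, where numbers repeat, a possible variant is to index combinations by each row's number. This is a sketch under the assumption that every value in 'number' is a valid position in combinations:

```python
import pandas as pd

df = pd.DataFrame({'number': [0, 0, 1, 1, 2, 2, 3, 3]})
keys = ['foo', 'bar']
combinations = [('99%', 0.9), ('99%', 0.8), ('1%', 0.9), ('1%', 0.8)]

for i, key in enumerate(keys):
    # pick the i-th element of the tuple that belongs to each row's number
    df[key] = [combinations[n][i] for n in df['number']]

print(df)
```

A number outside the range of combinations would raise an IndexError here rather than produce a missing value.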

I found one solution using:

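subsets = []

for model_number, df_subset in df.groupby('number'):
    # work on a copy so the assignments do not target a slice of df
    df_subset = df_subset.copy()

    for key_idx, key in enumerate(keys):
        df_subset[key] = combinations[model_number][key_idx]

    subsets.append(df_subset)

# DataFrame.append was removed in pandas 2.0, so concatenate the pieces instead
df_new = pd.concat(subsets)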

But this seems pretty 'dirty' to me. Is there a better and more efficient solution?
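One possible way to avoid the groupby loop is to map 'number' to each new column's values with Series.map. The sketch below is only an illustration under the same assumption as the question (the number is the position of the tuple in combinations); the name column is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'number': [0, 0, 1, 1, 2, 2, 3, 3]})
keys = ['foo', 'bar']
combinations = [('99%', 0.9), ('99%', 0.8), ('1%', 0.9), ('1%', 0.8)]

# zip(*combinations) regroups the tuples column-wise:
# ('99%', '99%', '1%', '1%') for 'foo' and (0.9, 0.8, 0.9, 0.8) for 'bar'
for key, column in zip(keys, zip(*combinations)):
    # dict(enumerate(column)) maps each number to its value for this key
    df[key] = df['number'].map(dict(enumerate(column)))

print(df)
```

This does one vectorized lookup per new column instead of one assignment per group, and a number without a matching tuple becomes NaN.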
