How to create a dataframe based on list containing column names?

Situation

I've got a list of column names stored in a variable named data:

  • values_c1_114
  • values_c1_84
  • values_c1_37
  • values_c1_126 ...

In total there are 552 elements in the list data.

Now I want to create a dataframe based on this list of column names.

I thought I could access the values behind the column names by using:

for element in data:
    print(element)

But this code only returns the column names, not the values stored under each name.

[Image: Result using the for element approach]

It is possible to access the values behind the column names.

[Image: Accessing single column values]
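A minimal sketch of why the loop prints only names: each element of data is a plain string, not the variable that happens to share its spelling. The two short lists below are hypothetical stand-ins for the real variables, and the lookup through globals() assumes those variables live at module level.

```python
# Hypothetical stand-ins for two of the 552 variables (real values unknown).
values_c1_114 = [1, 2, 3]
values_c1_84 = [4, 5, 6]

data = ["values_c1_114", "values_c1_84"]

for element in data:
    # element is just a string, so this prints the name, not the values
    print(element)

# Looking each name up in the module namespace returns the object bound to it.
looked_up = [globals()[name] for name in data]
print(looked_up)
```

The lookup raises KeyError if a name in data has no matching module-level variable.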

SOLUTION: The following approach solved my problem. The list columns contains the names of the columns that should be added to the dataframe.

data = {k: eval(k) for k in columns}
df = pd.DataFrame(data)
print(df)
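Since eval executes whatever string it is given, a plain namespace lookup may be a safer way to get the same result. The sketch below is one possible variant, not the asker's method: frame_from_names is a hypothetical helper, col1 and col2 are made-up sample data, and it assumes the variables are defined at module level so that globals() can see them.

```python
import pandas as pd


def frame_from_names(names, namespace):
    """Build a DataFrame whose columns are the objects bound to `names`.

    Hypothetical helper: `namespace` is any mapping from name to values,
    for example the dict returned by globals().
    """
    missing = [name for name in names if name not in namespace]
    if missing:
        raise KeyError(f"no variable bound to: {missing}")
    return pd.DataFrame({name: namespace[name] for name in names})


# Made-up sample data standing in for the real columns.
col1 = [1, 2, 3]
col2 = [3, 4, 5]
columns = ["col1", "col2"]

df = frame_from_names(columns, globals())
print(df)
```

Checking for missing names up front reports every unknown column at once instead of failing on the first one.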

You need get. Try, for example:

variable <- 125
variable.name <- "variable"
get(variable.name)
# 125

So to build a data frame from your list of variable names stored in data, you'd do something like:

data.values <- lapply(data, get)  # get a list of data values
names(data.values) <- data  # sets the names of each variable in the list
df <- data.frame(data.values)

# importing library
import pandas as pd

# list of column names
data = ['values_c1_114','values_c1_84','values_c1_37','values_c1_126']

# data inside each column
values_c1_114_list = list(range(1, 11))
values_c1_84_list = list(range(11, 21))
values_c1_37_list = list(range(21, 31))
values_c1_126_list = list(range(31, 41))


# creating dict (named column_data so the built-in dict is not shadowed)
column_data = {
    'values_c1_114': values_c1_114_list,
    'values_c1_84': values_c1_84_list,
    'values_c1_37': values_c1_37_list,
    'values_c1_126': values_c1_126_list
}

# creating dataframe
df = pd.DataFrame(column_data)

# printing dataframe
print(df)
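With 552 columns, typing the dict out by hand is impractical. If the value lists are already collected in a second list in the same order as the names (an assumption, the question does not say how the values are stored), the dict could be built with zip. The name column_values and the sample ranges are illustrative only.

```python
# list of column names (first four from the question)
data = ["values_c1_114", "values_c1_84", "values_c1_37", "values_c1_126"]

# Assumed: one list of values per name, in the same order as `data`.
column_values = [
    list(range(1, 11)),
    list(range(11, 21)),
    list(range(21, 31)),
    list(range(31, 41)),
]

# zip silently drops unmatched items, so check the lengths explicitly.
if len(data) != len(column_values):
    raise ValueError("need exactly one list of values per column name")

# Pair every name with its values; dicts keep insertion order.
table = dict(zip(data, column_values))
print(list(table))
```

The resulting table can be passed to pd.DataFrame in the same way as the hand-written dict.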

See if this is what you need. If I understand the question correctly, the OP's key problem is how to get a variable's name as a string and then use that set of strings as the dataframe columns.

import pandas as pd

def namestr(obj, namespace):
    # return the first name in namespace that is bound to this exact object
    return [name for name in namespace if namespace[name] is obj][0]

# to simulate the data you have
col1 = [1, 2, 3]
col2 = [4, 5, 6]
data = [col1, col2]

df = pd.DataFrame(data).T
df.columns = [namestr(i, globals()) for i in data]
print(df)

Output:

    col1  col2
0     1     4
1     2     5
2     3     6
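One caveat with this identity-based lookup: it returns the first name bound to the object, so it becomes ambiguous as soon as two names refer to the same list. A small sketch, using a hand-built namespace dict and a made-up alias variable instead of globals() so the result does not depend on what else is defined:

```python
def namestr(obj, namespace):
    # Same identity-based lookup as in the answer above.
    return [name for name in namespace if namespace[name] is obj][0]


col1 = [1, 2, 3]
alias = col1  # a second name bound to the very same list object

# Hand-built namespace for illustration; globals() behaves the same way.
namespace = {"col1": col1, "alias": alias}

matches = [name for name in namespace if namespace[name] is alias]
print(matches)                    # ['col1', 'alias']
print(namestr(alias, namespace))  # col1
```

Asking for the name of alias yields col1 here, because dicts preserve insertion order and col1 was added first.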

Or the other way around: if you have the column names as strings in a list, you could do something like this:

import pandas as pd

columns = ['col1', 'col2']
col1 = [1, 2, 3]
col2 = [3, 4, 5]

# eval(k) looks up the variable whose name is the string k
data = {k: eval(k) for k in columns}

df = pd.DataFrame(data)
print(df)

Output:

    col1  col2
0     1     3
1     2     4
2     3     5
