How to create a dataframe based on list containing column names?

Situation

I've got a list of column names stored in a variable named data:

  • values_c1_114
  • values_c1_84
  • values_c1_37
  • values_c1_126 ...

In total there are 552 elements in the list data.

Now I want to create a dataframe based on this list of column names.

I thought I could access the values behind the column names by using:

for element in data:
    print(element)

But this code only prints the column names themselves, not the values stored in the variables with those names.

[Image: result using the for element approach]

It is possible, however, to access the values behind an individual column name directly.

[Image: accessing single column values]

SOLUTION: The following approach solved my problem. Here, columns is the list of column names that should be added to the dataframe.

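import pandas as pd

# look up the variable behind each column name (only safe if the names are trusted)
data = {k: eval(k) for k in columns}
df = pd.DataFrame(data)
print(df)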
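The same lookup can probably be done without eval by reading the names out of the module namespace. This is only a minimal sketch: it assumes the variables are defined at module level (so globals() can see them), and the two short lists are hypothetical stand-ins for the real 552 columns.

```python
import pandas as pd

# Hypothetical stand-ins for the real variables named in the list
values_c1_114 = [1, 2, 3]
values_c1_84 = [4, 5, 6]
columns = ["values_c1_114", "values_c1_84"]

# Assumption: the variables live at module level, so globals() contains them
namespace = globals()

# Fail early with a clear message if a name has no matching variable
missing = [name for name in columns if name not in namespace]
if missing:
    raise KeyError(f"no variable found for: {missing}")

# Plain dictionary lookup instead of eval: no string is ever executed as code
df = pd.DataFrame({name: namespace[name] for name in columns})
print(df)
```

Unlike eval, a dictionary lookup never executes the string, so a malformed or unexpected name raises an error instead of running as code.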

You need get(). Try, for example:

variable <- 125
variable.name <- "variable"
get(variable.name)
# 125

So to build a data frame from your list of variable names stored in data, you'd do something like:

data.values <- lapply(data, get)  # get a list of data values
names(data.values) <- data  # name each list element after its variable
df <- data.frame(data.values)

# import the library
import pandas as pd

# list of column names
data = ['values_c1_114', 'values_c1_84', 'values_c1_37', 'values_c1_126']

# data inside each column
values_c1_114_list = list(range(1, 11))
values_c1_84_list = list(range(11, 21))
values_c1_37_list = list(range(21, 31))
values_c1_126_list = list(range(31, 41))

# create a dict mapping each column name to its values
# (named columns_dict to avoid shadowing the built-in dict)
columns_dict = {
    'values_c1_114': values_c1_114_list,
    'values_c1_84': values_c1_84_list,
    'values_c1_37': values_c1_37_list,
    'values_c1_126': values_c1_126_list
}

# create the dataframe
df = pd.DataFrame(columns_dict)

# print the dataframe
print(df)

See if this is what you need. If I understand the OP's question correctly, the key problem is how to get each variable's name as a string, and then use those strings as the dataframe's column names.

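import pandas as pd

def namestr(obj, namespace):
    # return the first name in the namespace that is bound to this exact object
    return [name for name in namespace if namespace[name] is obj][0]

# to simulate the data you have
col1 = [1, 2, 3]
col2 = [4, 5, 6]
data = [col1, col2]

# each list is a row at first, so transpose to turn the lists into columns
df = pd.DataFrame(data).T
df.columns = [namestr(i, globals()) for i in data]
print(df)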

Output:

    col1  col2
0     1     4
1     2     5
2     3     6
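One caveat with looking names up by object identity: if two names are bound to the same object, taking the first match may not give the name you expect. The sketch below illustrates this with a hypothetical helper, all_names, and a small hand-built namespace dict (both are assumptions for illustration, not part of the answer above).

```python
def all_names(obj, namespace):
    """Return every name in `namespace` bound to `obj` (identity match, not equality)."""
    return [name for name, value in namespace.items() if value is obj]

col1 = [1, 2, 3]
col2 = [4, 5, 6]

# Hypothetical namespace in which "alias" refers to the same list object as "col1"
namespace = {"col1": col1, "alias": col1, "col2": col2}

print(all_names(col1, namespace))  # ['col1', 'alias']
print(all_names(col2, namespace))  # ['col2']
```

If more than one name comes back, the column label is ambiguous, so it may be worth checking for that case before assigning df.columns.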

Or, the other way around: if you have the column names as strings in a list, you could do something like this:

columns = ['col1','col2']
col1 = [1, 2, 3]
col2 = [3, 4, 5]

data = { k: eval(k) for k in columns }

df = pd.DataFrame(data)
print(df)

Output:

    col1  col2
0     1     3
1     2     4
2     3     5
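If you control how the data is created, it may be simpler to avoid separate variables altogether and keep every column in one dict keyed by name. This is a minimal sketch under that assumption; the store dict and its contents are hypothetical.

```python
import pandas as pd

# Hypothetical store: every column lives in one dict, keyed by its name
store = {
    "col1": [1, 2, 3],
    "col2": [3, 4, 5],
    "col3": [7, 8, 9],
}

# The names of the columns that should end up in the dataframe
columns = ["col1", "col2"]

# Select only the requested columns, in the order given by the list
df = pd.DataFrame({name: store[name] for name in columns})
print(df)
```

With this layout, the list of column names only selects keys from the dict, so neither eval nor a namespace lookup is needed.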
