
Using a Pandas data frame, assign the values from one column to a variable, using another variable for the column name

From C#, I'm passing in the following value, which arrives in Python as sys.argv[1]:

string depVar = "Cover_Type";

In Python, I'm trying to accomplish the following using a Pandas data frame. The example code below fails. Is there a way to do this?

import csv
import pandas as pd    
import sys

dependent_var = sys.argv[1]
df = pd.read_csv('train.csv')
y = df[dependent_var]

EDIT: In my attempt to keep the details simple, it sounds like I left out essential information (newbie mistake), so thank you for your patience.

(1) Here's a sample of the data: [image: enter image description here]

Goal: The most important piece of info I left out (again sorry) was that I'm passing in the variable from another program, so my goal is definitely to use the variable value and not just print out the value.

I believe one of the answers provided is very close and actually answers my original question, but it doesn't solve my problem: the variable being passed in is a string, and I now think it needs to be converted to a list, hence the need for the square brackets.

Error: KeyError: "['Flower_Type']"

Printing out columns:

Index(['Id', 'Elevation', 'Aspect', 'Slope',
       'Horizontal_Distance_To_Hydrology',
       'Flower_Type'],
      dtype='object')

Final Answer:

import csv
import pandas as pd    
import sys

depVar= sys.argv[1] # had to assign the incoming variable to a new variable
a = []
a.append(depVar)

df = pd.read_csv('train.csv')
y = df[a]
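
For reference, here is a minimal sketch of what changes when the key is a list instead of a plain string. The small data frame is a hypothetical stand-in for train.csv (only the column names come from the question), and dep_var stands in for the value that would arrive via sys.argv[1].

```python
import pandas as pd

# Hypothetical stand-in for train.csv; the values are made up for illustration.
df = pd.DataFrame({
    "Id": [1, 2, 3],
    "Elevation": [2596, 2590, 2804],
    "Flower_Type": [5, 5, 2],
})

dep_var = "Flower_Type"  # in the real script this would be sys.argv[1]

as_series = df[dep_var]    # string key: a one-dimensional Series
as_frame = df[[dep_var]]   # list key: a DataFrame with a single column

print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)
```

Both forms look the column up by the same name, so which one to use mostly depends on whether the code that consumes y expects a Series or a DataFrame.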

I believe you need the following.

Put the quoted column name inside [] so that it becomes a list.

dependent_var = ['Flower_Type']

then

y = df[dependent_var]

Debugging process:

You can try double brackets, as in df[['Flower_Type']]. If the lookup still fails, the CSV header may contain stray spaces; in that case, trim any surrounding whitespace from the column names:

df.columns = df.columns.to_series().apply(lambda x: x.strip())

OR:

df = pd.read_csv('train.csv', encoding="utf-8")

OR, if the file starts with a byte order mark (BOM), read it with the utf-8-sig encoding, which strips the BOM:

df = pd.read_csv('train.csv', encoding="utf-8-sig")
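
To see why a BOM can matter, here is a rough sketch that uses made-up file contents held in memory instead of the real train.csv. It only shows what Python's two decoders do with the first header line, then reads the same bytes with utf-8-sig; whether your own file actually has a BOM is something you would need to check.

```python
import io
import pandas as pd

# Hypothetical file contents: a UTF-8 CSV that starts with a byte order mark.
raw = b"\xef\xbb\xbfFlower_Type,Elevation\n5,2596\n2,2804\n"

# Plain utf-8 decoding keeps the BOM as an invisible leading character.
header_utf8 = raw.decode("utf-8").splitlines()[0]
# utf-8-sig decoding drops the BOM.
header_sig = raw.decode("utf-8-sig").splitlines()[0]

# repr() makes the otherwise invisible character visible.
print(repr(header_utf8))
print(repr(header_sig))

# Reading with utf-8-sig gives a clean first column name.
df = pd.read_csv(io.BytesIO(raw), encoding="utf-8-sig")
y = df["Flower_Type"]
```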

In this case it might make sense to use a list comprehension to strip all of the extra spaces.

df.columns = [col.strip() for col in df.columns]
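
Putting the stripping idea together with a name that comes from outside the script, something like the following might work. select_column is a hypothetical helper, not part of pandas, and it assumes every column name is a string. It trims both the header names and the incoming name, and fails with a message that lists the available columns.

```python
import pandas as pd

def select_column(df, name):
    """Hypothetical helper: look up `name` after trimming whitespace on both sides."""
    # Strip whitespace from every column name without modifying the original frame.
    cleaned = df.rename(columns=lambda col: col.strip())
    key = name.strip()
    if key not in cleaned.columns:
        raise KeyError(f"{key!r} not found; available columns: {list(cleaned.columns)!r}")
    return cleaned[key]

# Made-up data with padded header names, standing in for a messy train.csv.
df = pd.DataFrame({" Flower_Type ": [5, 2], "Elevation": [2596, 2804]})

# The trailing newline mimics an argument that arrived with stray whitespace.
y = select_column(df, "Flower_Type\n")
print(y.tolist())  # [5, 2]
```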

Just go straight with

y = df['Flower_Type']

Why does it have to be stored in a variable?
