Pandas: getting specific column from csv file

I have the following sample .csv file:

str_header  int_header
string_a       1
string_b       2
string_c       3

According to solutions on the internet, this code:

import pandas as pd
data = pd.read_csv("z.csv", names=['int_header'])
print(data['int_header'])

should read only the int_header column into data. But data, when printed as above, actually contains all of the file's columns. I'm using the Anaconda distribution of Python. What's wrong?

Try this instead:

data = pd.read_csv("z.csv", usecols=['int_header'])

This assumes that your CSV file uses , as the delimiter.
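As a minimal sketch of the suggestion above, the snippet below reads the sample data from an in-memory string instead of z.csv so that it can be run as-is. The comma-separated text and the whitespace-separated variant are both assumptions about how the file is actually laid out; the variable names csv_text and ws_text are made up for illustration.

```python
import io

import pandas as pd

# Assumed comma-delimited version of the sample file from the question.
csv_text = "str_header,int_header\nstring_a,1\nstring_b,2\nstring_c,3\n"

# usecols keeps only the listed columns; the header row is still used for names.
data = pd.read_csv(io.StringIO(csv_text), usecols=["int_header"])
print(data.columns.tolist())
print(data["int_header"].tolist())

# Assumed whitespace-delimited version, closer to how the sample is displayed.
ws_text = "str_header  int_header\nstring_a       1\nstring_b       2\nstring_c       3\n"

# sep=r"\s+" splits on runs of whitespace instead of commas.
ws_data = pd.read_csv(io.StringIO(ws_text), sep=r"\s+", usecols=["int_header"])
print(ws_data["int_header"].tolist())
```

If the real file is whitespace-separated, reading it with the default comma delimiter would produce a single wide column, so checking the separator first is probably worthwhile.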

Explanation:

Docs:

names : array-like, default None

List of column names to use. If file contains no header row, then you should explicitly pass header=None

usecols : array-like, default None

Return a subset of the columns. Results in much faster parsing time and lower memory usage.

The documentation is a bit confusing.

names - used for naming columns (giving them names), especially if the file has no header line or you want to ignore/skip it. It does not select which columns are read.

usecols - used for choosing only "interesting" columns
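The difference between the two parameters can be illustrated with a short sketch. It assumes a comma-delimited copy of the sample data held in a string, and the replacement column names label and value are hypothetical, chosen only for this example.

```python
import io

import pandas as pd

# Assumed comma-delimited version of the sample file from the question.
csv_text = "str_header,int_header\nstring_a,1\nstring_b,2\nstring_c,3\n"

# names alone only assigns column names; with no header argument,
# the file's own header line is then read as an ordinary data row.
named = pd.read_csv(io.StringIO(csv_text), names=["label", "value"])
print(len(named))               # one more row than the file has data lines
print(named.iloc[0].tolist())   # the original header text appears as data

# header=0 tells pandas to discard the existing header line and use names instead.
renamed = pd.read_csv(io.StringIO(csv_text), header=0, names=["label", "value"])
print(renamed.columns.tolist())

# usecols then selects among those names, so only one column is returned.
subset = pd.read_csv(
    io.StringIO(csv_text),
    header=0,
    names=["label", "value"],
    usecols=["value"],
)
print(subset["value"].tolist())
```

In short, names changes what the columns are called, while usecols decides which of them end up in the DataFrame.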
