Pandas: getting specific column from csv file
I have the following sample .csv file:
str_header int_header
string_a 1
string_b 2
string_c 3
According to solutions on the internet, this code:
import pandas as pd
data = pd.read_csv("z.csv", names=['int_header'])
print(data['int_header'])
should read only the int_header column into data. But data, when printed as above, actually contains all of the file's columns. I'm using the Anaconda distribution of Python. What's wrong?
Try this instead:
data = pd.read_csv("z.csv", usecols=['int_header'])
assuming that your CSV file uses , as the delimiter.
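If the file is actually separated by whitespace, as the sample in the question appears to be when displayed, a separator has to be passed as well. The following is a minimal sketch under that assumption; it reads the sample from an in-memory string (the SAMPLE variable is only for illustration) instead of z.csv so that it runs on its own:

```python
import io

import pandas as pd

# Assumption: the sample is whitespace-separated, as it is displayed in the question.
SAMPLE = "str_header int_header\nstring_a 1\nstring_b 2\nstring_c 3\n"

# sep=r"\s+" splits fields on runs of whitespace; usecols keeps only the named column.
data = pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+", usecols=["int_header"])

print(data["int_header"].tolist())  # [1, 2, 3]
```

With a real file, the io.StringIO(SAMPLE) argument would be replaced by the path, for example "z.csv".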
Explanation:
names : array-like, default None
List of column names to use. If file contains no header row, then you should explicitly pass header=None
usecols : array-like, default None
Return a subset of the columns. Results in much faster parsing time and lower memory usage.
The documentation is a bit confusing.
names - used for naming (giving the columns names), especially if you don't have a header line or want to ignore/skip it.
usecols - used for choosing only the "interesting" columns.
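To make the difference concrete, here is a small sketch that applies both parameters to the same data. It assumes a comma-delimited version of the sample, held in an in-memory string so the snippet is self-contained; the replacement column names label and value are made up for illustration:

```python
import io

import pandas as pd

# Assumption: a comma-delimited version of the sample from the question.
CSV_TEXT = "str_header,int_header\nstring_a,1\nstring_b,2\nstring_c,3\n"

# usecols selects columns: only int_header is parsed and returned.
subset = pd.read_csv(io.StringIO(CSV_TEXT), usecols=["int_header"])
print(subset.columns.tolist())  # ['int_header']

# names assigns column names: header=0 tells pandas to discard the
# existing header row and use the supplied names instead.
renamed = pd.read_csv(io.StringIO(CSV_TEXT), names=["label", "value"], header=0)
print(renamed.columns.tolist())  # ['label', 'value']
```

In other words, usecols changes which columns come back, while names only changes what the columns are called.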