Pandas: get a specific column from a CSV file

Question

I have the following sample .csv file:

str_header  int_header
string_a       1
string_b       2
string_c       3

According to solutions on the internet, this code:

import pandas as pd
data = pd.read_csv("z.csv", names=['int_header'])
print(data['int_header'])

should read only the int_header column into data. But data, when printed as above, actually contains all of the file's columns. I'm using the Anaconda distribution of Python. What's wrong?

Answer 1

Try this instead:

data = pd.read_csv("z.csv", usecols=['int_header'])

This assumes that your CSV file uses a comma (,) as its delimiter.
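The sample shown in the question looks whitespace-separated rather than comma-separated, so the delimiter assumption matters. Here is a minimal sketch of both cases. It reads from in-memory strings instead of z.csv, and the variable names (comma_text, space_text, and so on) are hypothetical stand-ins, not part of the original question:

```python
import io

import pandas as pd

# Hypothetical stand-ins for z.csv: one comma-delimited, one whitespace-delimited.
comma_text = "str_header,int_header\nstring_a,1\nstring_b,2\nstring_c,3\n"
space_text = (
    "str_header  int_header\n"
    "string_a       1\n"
    "string_b       2\n"
    "string_c       3\n"
)

# The default delimiter is a comma, so usecols alone is enough here.
comma_data = pd.read_csv(io.StringIO(comma_text), usecols=["int_header"])

# For whitespace-separated columns, pass a regex separator as well.
space_data = pd.read_csv(
    io.StringIO(space_text), sep=r"\s+", usecols=["int_header"]
)

print(comma_data["int_header"].tolist())
print(space_data["int_header"].tolist())
```

If the real file is whitespace-separated and sep is left at its default, each line is parsed as a single field, so there is no separate int_header column for usecols to select.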

Explanation:

Docs:

names : array-like, default None

List of column names to use. If file contains no header row, then you should explicitly pass header=None

usecols : array-like, default None

Return a subset of the columns. Results in much faster parsing time and lower memory usage.

The documentation is a bit confusing.

names - used for naming columns (assigning the column names), especially if you don't have a header line or want to ignore/skip it.

usecols - used for selecting only the "interesting" columns.
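To make the difference concrete, here is a small sketch that assumes a comma-delimited version of the sample file, held in an in-memory string. The replacement labels "label" and "value" are made up for illustration:

```python
import io

import pandas as pd

# Hypothetical comma-delimited version of the sample file.
csv_text = "str_header,int_header\nstring_a,1\nstring_b,2\nstring_c,3\n"

# names with header=0: skip the file's header row and use new labels instead.
renamed = pd.read_csv(
    io.StringIO(csv_text), names=["label", "value"], header=0
)

# names without header=0: pandas assumes there is no header row,
# so the original header line is read as the first row of data.
no_header = pd.read_csv(io.StringIO(csv_text), names=["label", "value"])

# usecols: keep the file's own header and load only the chosen column.
selected = pd.read_csv(io.StringIO(csv_text), usecols=["int_header"])

print(renamed)
print(no_header)
print(selected)
```

In short, names never filters anything; it only relabels. Selecting columns is what usecols is for.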


Solution 1
5, accepted, 2016-04-27 21:36:03
