Pandas: getting a specific column from a CSV file

I have the following sample .csv file:

str_header  int_header
string_a       1
string_b       2
string_c       3

According to solutions on the internet, this code:

import pandas as pd
data = pd.read_csv("z.csv", names=['int_header'])
print(data['int_header'])

should read only the int_header column into data. But data, when printed as above, actually contains all of the file's columns. I'm using the Anaconda distribution of Python. What's wrong?

Try this instead:

data = pd.read_csv("z.csv", usecols=['int_header'])

This assumes that your CSV file uses a comma (,) as the delimiter.
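If the file is actually separated by runs of spaces, as the sample in the question appears to be, the default comma delimiter would put each whole line into a single column. A minimal sketch of handling that case follows; the in-memory TEXT string is a hypothetical stand-in for z.csv, and the assumption that the real file is whitespace-separated is mine, not something the question confirms.

```python
import io
import pandas as pd

# Hypothetical stand-in for z.csv, assuming whitespace-separated columns.
TEXT = (
    "str_header  int_header\n"
    "string_a       1\n"
    "string_b       2\n"
    "string_c       3\n"
)

# sep=r"\s+" splits on any run of whitespace instead of on commas;
# usecols then keeps only the column named in the header row.
data = pd.read_csv(io.StringIO(TEXT), sep=r"\s+", usecols=["int_header"])
print(data["int_header"].tolist())
```

With a real file, the io.StringIO(TEXT) argument would simply be replaced by the path, for example "z.csv".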

Explanation:

Docs:

names : array-like, default None

List of column names to use. If file contains no header row, then you should explicitly pass header=None

usecols : array-like, default None

Return a subset of the columns. Results in much faster parsing time and lower memory usage.

The documentation is a bit confusing.

names - used for naming columns (assigning column names), especially if the file has no header line or you want to ignore/skip it.

usecols - used for selecting only the "interesting" columns.
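To make the distinction between the two parameters concrete, here is a small sketch. CSV_TEXT is a hypothetical comma-separated stand-in for z.csv, and the replacement labels "label" and "value" are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical stand-in for z.csv, assuming comma-separated values.
CSV_TEXT = "str_header,int_header\nstring_a,1\nstring_b,2\nstring_c,3\n"

# usecols selects columns by the names already present in the header row.
selected = pd.read_csv(io.StringIO(CSV_TEXT), usecols=["int_header"])

# names assigns new labels; header=0 tells pandas that the first line is
# a header row which should be replaced rather than read as data.
renamed = pd.read_csv(
    io.StringIO(CSV_TEXT), names=["label", "value"], header=0
)

print(selected.columns.tolist())
print(renamed.columns.tolist())
```

In short, usecols filters which columns are read, while names only controls what the columns are called.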
