
Python Pandas: How to skip columns when reading a file?

I have a table formatted as follows:

foo - bar - 10 2e-5 0.0 some information
quz - baz - 4 1e-2 1 some other description in here

When I open it with pandas like this:

a = pd.read_table("file", header=None, sep=" ")

It tells me:

CParserError: Error tokenizing data. C error: Expected 9 fields in line 2, saw 12

What I'd basically like is something similar to the skiprows option, but for columns, which would allow me to write something like:

a = pd.read_table("file", header=None, sep=" ", skipcolumns=[8:])

I'm aware that I could re-format this table with awk, but I'd like to know whether a pandas solution exists.

Thanks.

The usecols parameter lets you select which columns to read:

a = pd.read_table("file", header=None, sep=" ", usecols=range(8))

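However, to accept rows with irregular column counts you also need to pass engine='python'. Note that range(8) keeps columns 0 to 7, which in the sample above includes the first word of the description; use range(7) to keep only the seven leading fields.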
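A minimal sketch of this approach on the two sample rows from the question, read from an in-memory string rather than a file. The `io.StringIO` wrapper and the name `sample` are for illustration only, and `range(7)` assumes the seven leading fields are the ones to keep. Behaviour with ragged rows may differ between pandas versions.

```python
import io

import pandas as pd

# The two sample rows from the question; they have 9 and 12 fields respectively.
sample = (
    "foo - bar - 10 2e-5 0.0 some information\n"
    "quz - baz - 4 1e-2 1 some other description in here\n"
)

# Keep only the first seven space-separated fields of every row.
# engine="python" is passed, as suggested above, to tolerate the ragged rows.
a = pd.read_table(
    io.StringIO(sample),
    header=None,
    sep=" ",
    usecols=range(7),
    engine="python",
)

print(a)
```

With header=None the resulting columns are labelled by their integer positions, so `a[4]` refers to the fifth field of each row.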

If you are using Linux, OS X, or Cygwin on Windows, you can prepare the file beforehand as follows:

cut -d' ' -f1-7 your_file > out.file

Then in Python:

a = pd.read_table("out.file", header=None, sep=" ")

Example:

Input:

foo - bar - 10 2e-5 0.0 some information
quz - baz - 4 1e-2 1 some other description in here

Output:

foo - bar - 10 2e-5 0.0
quz - baz - 4 1e-2 1

You can run this command manually on the command line, or call it from within Python using the subprocess module.
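If cut is not available, the same trimming step can be sketched in pure Python. This is an illustrative equivalent, not part of the answer above: the helper name `keep_leading_fields` and its parameters are hypothetical, and it assumes single-space separators as in the sample.

```python
def keep_leading_fields(lines, n_fields=7, sep=" "):
    """Yield each line cut down to its first n_fields sep-delimited fields."""
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        yield sep.join(fields[:n_fields])


# The two sample rows from the question.
rows = [
    "foo - bar - 10 2e-5 0.0 some information\n",
    "quz - baz - 4 1e-2 1 some other description in here\n",
]

trimmed = list(keep_leading_fields(rows))
for row in trimmed:
    print(row)
```

The trimmed lines can then be written to out.file, or joined with newlines and passed to pd.read_table through io.StringIO, so that no intermediate file is needed.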

