Blank column in a Pandas DataFrame

Question

How do I ignore the trailing whitespace in a line when converting to a Pandas DataFrame?

I have a CSV file in the following format:

Column #1   : Type
Column #2   : Total Length
Column #3   : Found
Column #4   : Grand Total

1;2;1;7.00;
2;32;2;0.76;
3;4;6;6.00;
4;1;5;4.00;

I first loop through the 'Column #' lines to build my column names (so 4 columns), then parse the remaining lines into a DataFrame using ';' as the separator. However, some of my files contain a trailing ';' at the end of each line, as shown above, so Pandas thinks there is a 5th column containing whitespace and consequently throws an error saying that not enough column names were specified.

Is there a mechanism in Pandas to remove or ignore the trailing ';' (or whitespace) when creating a DataFrame? I am using read_csv to create the DataFrame.

Thanks.

Answer 1

Just pass the usecols parameter:

In [160]:
t="""1;2;1;7.00;
2;32;2;0.76;
3;4;6;6.00;
4;1;5;4.00;"""
import pandas as pd
import io
df = pd.read_csv(io.StringIO(t), sep=';', header=None, usecols=range(4))
df

Out[160]:
   0   1  2     3
0  1   2  1  7.00
1  2  32  2  0.76
2  3   4  6  6.00
3  4   1  5  4.00

Here range(4) supplies the column indices [0, 1, 2, 3] to indicate which columns I'm interested in, so the empty 5th column produced by the trailing ';' is dropped.
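To tie this back to the file layout in the question, here is a minimal sketch that builds the column names from the 'Column #' lines and then reads only that many columns. The asker's own parsing code is not shown, so the variable names (`raw`, `names`, `data`) and the way the header lines are split on ':' are assumptions made for illustration.

```python
import io

import pandas as pd

# Sample content in the layout described in the question.
raw = """Column #1   : Type
Column #2   : Total Length
Column #3   : Found
Column #4   : Grand Total

1;2;1;7.00;
2;32;2;0.76;
3;4;6;6.00;
4;1;5;4.00;
"""

lines = raw.splitlines()

# Assumption: each header line looks like "Column #N   : Name".
names = [
    line.split(":", 1)[1].strip()
    for line in lines
    if line.startswith("Column #")
]

# Everything that is neither a header line nor blank is treated as data.
data = "\n".join(
    line for line in lines if line and not line.startswith("Column #")
)

# Read only as many columns as there are names; the extra empty field
# created by the trailing ';' is dropped.
df = pd.read_csv(
    io.StringIO(data),
    sep=";",
    header=None,
    usecols=range(len(names)),
)
df.columns = names
print(df)
```

Because usecols is derived from len(names), files without the trailing ';' are read the same way, so one code path should cover both variants.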

Answer 1: accepted, 2015-07-07 11:20:38
