
How to skip blank lines with read_fwf in pandas?

I am using the pandas.read_fwf() function in Python pandas 0.19.2 to read a file fwf.txt that has the following content:

# Column1 Column2
      123     abc

      456     def

#
#

My code is the following:

import pandas as pd
file_path = "fwf.txt"
widths = [len("# Column1"), len(" Column2")]
names = ["Column1", "Column2"]
data = pd.read_fwf(filepath_or_buffer=file_path, widths=widths, 
                   names=names, skip_blank_lines=True, comment="#")

The printed data frame is as follows:

    Column1 Column2
0   123.0   abc
1   NaN     NaN
2   456.0   def
3   NaN     NaN

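It looks like the skip_blank_lines=True argument is ignored, as the data frame contains rows of NaN where the blank lines were.

What would be a valid combination of pandas.read_fwf() arguments that ensures blank lines are skipped?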

import io
import pandas as pd
file_path = "fwf.txt"
widths = [len("# Column1 "), len("Column2")]
names = ["Column1", "Column2"]

class FileLike(io.TextIOBase):
    def __init__(self, iterable):
        self.iterable = iterable
    def readline(self):
        return next(self.iterable)

with open(file_path, 'r') as f:
    lines = (line for line in f if line.strip())
    data = pd.read_fwf(FileLike(lines), widths=widths, names=names, 
                       comment='#')
    print(data)

prints

   Column1 Column2
0      123     abc
1      456     def

with open(file_path, 'r') as f:
    lines = (line for line in f if line.strip())

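defines a generator expression (i.e. an iterable) that yields the lines of the file with the blank lines removed.

The pd.read_fwf function can accept TextIOBase objects. You can subclass TextIOBase so that its readline method returns lines from an iterable: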

class FileLike(io.TextIOBase):
    def __init__(self, iterable):
        self.iterable = iterable
    def readline(self):
        return next(self.iterable)
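To see what a reader gets from such an object, here is a small standard-library-only sketch. It is a variation on the class above, not the exact code: readline returns an empty string once the iterable is exhausted, which is the usual end-of-file signal for file objects. The raw list stands in for the lines of fwf.txt and is made up for the demo.

```python
import io

class FileLike(io.TextIOBase):
    """Variation of the class above: readline returns "" at end of input."""
    def __init__(self, iterable):
        self.iterable = iter(iterable)

    def readline(self, size=-1):
        # An empty string is the conventional end-of-file marker for readline.
        return next(self.iterable, "")

# Stand-in for the lines of fwf.txt (assumed sample data).
raw = [
    "# Column1 Column2\n",
    "      123     abc\n",
    "\n",
    "      456     def\n",
    "\n",
    "#\n",
]

# Same filter as in the answer: keep only lines with non-whitespace content.
non_blank = (line for line in raw if line.strip())
wrapped = FileLike(non_blank)

# Iterating a TextIOBase object calls readline() until it returns "".
lines_seen = list(wrapped)
print(lines_seen)
```

The blank lines never reach the consumer, while the comment lines starting with # are still passed through, so they are left for comment='#' to handle.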

Putting these two together gives you a way to manipulate or modify the lines of a file before passing them to pd.read_fwf.
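If the file comfortably fits in memory, a simpler variation (not the approach used above) is to filter the lines first and hand the result to pd.read_fwf through io.StringIO. This is only a sketch under the same column widths as above; the helper name read_fwf_skip_blank and the inline SAMPLE text are made up for illustration.

```python
import io
import pandas as pd

# Assumed sample content mirroring fwf.txt from the question.
SAMPLE = (
    "# Column1 Column2\n"
    "      123     abc\n"
    "\n"
    "      456     def\n"
    "\n"
    "#\n"
    "#\n"
)

def read_fwf_skip_blank(text, widths, names):
    """Hypothetical helper: drop blank lines, then parse with read_fwf."""
    kept = [line for line in text.splitlines(keepends=True) if line.strip()]
    buffer = io.StringIO("".join(kept))
    return pd.read_fwf(buffer, widths=widths, names=names, comment="#")

widths = [len("# Column1 "), len("Column2")]
names = ["Column1", "Column2"]
data = read_fwf_skip_blank(SAMPLE, widths, names)
print(data)
```

For a real file, the text argument could come from open(file_path).read(). The trade-off is that the whole file is held in memory, whereas the generator-based version processes it line by line.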

