Reading a tab-delimited txt file in pandas (Python)

Question

When I try to read a .txt file into pandas, the result is wrong: the imported data has only one row but far too many columns.

Here is one line of the data:

    1   182154.6-025557   18:21:54.63   -02:55:57.2  0.0   8.25e-03  1.5e-02       0.20   1.02e-01   -1.95e-01  1.5e-02      55        37      189   0.0   1.53e-01  3.3e-02       0.16   6.32e-01    7.24e-01  6.5e-02      46        29   59   6.2   2.91e-01  5.8e-02       0.17   4.62e-01    6.83e-01  7.0e-02      37        20   54   6.3   3.27e-01  6.2e-02       0.19   3.92e-01    5.51e-01  6.6e-02      37        26   47   0.0   2.28e-01  9.8e-02       0.12  2.50e-01  9.8e-02  46    36       43      7.6        1.1    0.24         0.5     4.6         40    22   36  2     0      starless

I am importing the data with the following code:

data = pd.read_csv("data.txt", header=None, sep='\t', lineterminator='\r')

This outputs:

   0                              1      ... 26254                   26255
0      1  182154.6-025557   18:21:54.63  ...   NaN         CO high-V_LSR\n

[1 rows x 26256 columns]

Any suggestions on how to import this data correctly would be very helpful.

Answer 1

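Maybe your .txt file is not entirely tab-delimited. The code below should work for reading multiple lines from the file. It splits each line only where there are spaces between items, so runs of several spaces count as a single separator.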

import pandas as pd

with open('data.txt', 'r') as f:
    data = []
    for line in f:
        # Split on single spaces, then drop the empty strings left by repeated spaces
        data.append([item for item in line.strip().split(' ') if item != ''])
pd.DataFrame(data)

I get the following output (a DataFrame with 63 columns):

                0            1            2    3         4        5     6   \
0  182154.6-025557  18:21:54.63  -02:55:57.2  0.0  8.25e-03  1.5e-02  0.20   

         7          8        9   ...   53    54   55   56  57  58  59 60 61  \
0  1.02e-01  -1.95e-01  1.5e-02  ...  1.1  0.24  0.5  4.6  40  22  36  2  0   

         62  
0  starless  

[1 rows x 63 columns]
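
The same whitespace splitting can probably be left to pandas itself. The following is a minimal sketch, assuming the fields never contain spaces; the two-row sample is made up for illustration and is not taken from the real file. Passing `sep=r"\s+"` makes `read_csv` treat any run of spaces or tabs as one delimiter.

```python
import io

import pandas as pd

# Hypothetical sample (not the real data): fields separated by irregular runs of spaces
sample = (
    "1   182154.6-025557   18:21:54.63   -02:55:57.2  0.0   starless\n"
    "2   000000.0-000000   00:00:00.00   -00:00:00.0  6.2   example\n"
)

# sep=r"\s+" treats any run of whitespace (spaces or tabs) as a single delimiter
df = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None)
print(df.shape)
```

With a real file, the `io.StringIO(sample)` argument would be replaced by the file path, for example "data.txt".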

Either that, or you may want to try:


data = pd.read_csv("data.txt", header=None, sep='\t', lineterminator='\n')
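
To illustrate why the line terminator matters, here is a small sketch under the assumption that the file's lines end in "\n" rather than "\r"; the sample string is hypothetical. When `lineterminator='\r'` is given but no "\r" occurs in the text, the parser never finds a line break, so everything ends up in a single row, which matches the symptom in the question.

```python
import io

import pandas as pd

# Hypothetical tab-separated sample whose lines end in "\n"
sample = "1\ta\n2\tb\n"

# No "\r" occurs in the text, so the parser never sees the end of a line
wrong = pd.read_csv(io.StringIO(sample), header=None, sep="\t", lineterminator="\r")

# With the matching terminator, each line becomes its own row
right = pd.read_csv(io.StringIO(sample), header=None, sep="\t", lineterminator="\n")

print(wrong.shape[0], right.shape[0])
```

Leaving `lineterminator` out altogether is usually the simplest option, since the default parser recognises "\n", "\r\n" and "\r" line endings on its own.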

Solution 1
0 2022-02-15 12:14:07
