簡體   English   中英

pd.read_csv 列錯位

[英]pd.read_csv column misalignment

我有一個包含兩列的文本文件,但是當我想使用 Pandas 讀取數據時,列和數據未對齊。

這是我的代碼供參考:

import pandas as pd
filename = r'data.txt'
#read text file into pandas DataFrame
df = pd.read_csv(
    filename,
    sep="\t",
    skiprows=3,
    names=['Frequency / MHz', 'S2,1 [SPara1]/abs,dB'],
)

df

結果:

Frequency / MHz S2,1 [SPara1]/abs,dB
0   0.1423125 -...  NaN
1   0.146625 -...   NaN
2   0.1509375 -...  NaN
3   0.15525 -...    NaN
4   0.1595625 -...  NaN
... ... ...
4059    17.64675 ...    NaN
4060    17.651063 ...   NaN
4061    17.655375 ...   NaN
4062    17.659688 ...   NaN
4063    17.664 ...  NaN

我的樣本數據.txt:

  Frequency / MHz                S2,1 [SPara1]/abs,dB
----------------------------------------------------------------------
                   0.138                     -0.92293553
               0.1423125                     -0.93264485
                0.146625                     -0.94201416
               0.1509375                     -0.95106676
                 0.15525                     -0.95982484
               0.1595625                     -0.96830956
                0.163875                     -0.97654091
               0.1681875                     -0.98453781
                  0.1725                     -0.99231804
               0.1768125                     -0.99989837
                0.181125                      -1.0072945
               0.1854375                      -1.0145212
                 0.18975                      -1.0215921
               0.1940625                      -1.0285203
                0.198375                      -1.0353177
               0.2026875                      -1.0419956
                   0.207                      -1.0485644
               0.2113125                      -1.0550339

我希望數據“NaN”更改為其原始數據。 我應該在我的代碼中更改什么?

您應該將您使用的分隔符更改為"\s\s+"

df = pd.read_csv(
    filename,
    sep="\s+",
    skiprows=3,
    names=['Frequency / MHz', 'S2,1 [SPara1]/abs,dB'],
    engine="python"
)

這輸出:

    Frequency / MHz  S2,1 [SPara1]/abs,dB
0          0.142313             -0.932645
1          0.146625             -0.942014
2          0.150938             -0.951067
3          0.155250             -0.959825
4          0.159562             -0.968310
5          0.163875             -0.976541
6          0.168187             -0.984538
7          0.172500             -0.992318
8          0.176813             -0.999898
9          0.181125             -1.007294
10         0.185438             -1.014521
11         0.189750             -1.021592
12         0.194062             -1.028520
13         0.198375             -1.035318
14         0.202687             -1.041996
15         0.207000             -1.048564
16         0.211312             -1.055034

我認為最好使用delim_whitespace=True

df = pd.read_csv(
    filename,
    delim_whitespace=True,
    skiprows=3,
    names=['Frequency / MHz', 'S2,1 [SPara1]/abs,dB'],
)

這輸出:

    Frequency / MHz  S2,1 [SPara1]/abs,dB
0          0.142313             -0.932645
1          0.146625             -0.942014
2          0.150938             -0.951067
3          0.155250             -0.959825
4          0.159562             -0.968310
5          0.163875             -0.976541
6          0.168187             -0.984538
7          0.172500             -0.992318
8          0.176813             -0.999898
9          0.181125             -1.007294
10         0.185438             -1.014521
11         0.189750             -1.021592
12         0.194062             -1.028520
13         0.198375             -1.035318
14         0.202687             -1.041996
15         0.207000             -1.048564
16         0.211312             -1.055034

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM