
Pandas, read in file without a separator between columns

I want to read in a file that looks like this:

 1.49998061E-01 2.49996769E-01 3.99994830E-01 5.99992245E-01 9.99987075E-01
 1.49998061E+00 2.49996769E+00 5.99992245E+00 9.99987075E+00 1.99997415E+01
 4.99993537E+01 9.99987075E+01  .00000000E+00-2.70636350E+03-6.37027451E+03
-1.97521328E+04-4.64928272E+04-1.09435407E+05-3.39323088E+05-7.98702345E+05
-1.87999269E+06-5.82921376E+06-1.37207895E+07-2.26385807E+07-4.25429547E+07
-7.60167523E+07-1.25422049E+08-2.35690283E+08-3.88862033E+08-7.30701955E+08
-1.30546599E+09-2.15348023E+09-4.04455001E+09-4.54896210E+09-5.32533888E+09

So each column is a fixed 15-character field, but there is no actual separator between fields. Does pandas have a way of reading this?

Yes! It's called pd.read_fwf (read fixed-width formatted lines):

from io import StringIO
import pandas as pd

txt = """ 1.49998061E-01 2.49996769E-01 3.99994830E-01 5.99992245E-01 9.99987075E-01
 1.49998061E+00 2.49996769E+00 5.99992245E+00 9.99987075E+00 1.99997415E+01
 4.99993537E+01 9.99987075E+01  .00000000E+00-2.70636350E+03-6.37027451E+03
-1.97521328E+04-4.64928272E+04-1.09435407E+05-3.39323088E+05-7.98702345E+05
-1.87999269E+06-5.82921376E+06-1.37207895E+07-2.26385807E+07-4.25429547E+07
-7.60167523E+07-1.25422049E+08-2.35690283E+08-3.88862033E+08-7.30701955E+08
-1.30546599E+09-2.15348023E+09-4.04455001E+09-4.54896210E+09-5.32533888E+09"""

pd.read_fwf(StringIO(txt), widths=[15] * 5, header=None)

              0             1             2             3             4
0  1.499981e-01  2.499968e-01  3.999948e-01  5.999922e-01  9.999871e-01
1  1.499981e+00  2.499968e+00  5.999922e+00  9.999871e+00  1.999974e+01
2  4.999935e+01  9.999871e+01  0.000000e+00 -2.706363e+03 -6.370275e+03
3 -1.975213e+04 -4.649283e+04 -1.094354e+05 -3.393231e+05 -7.987023e+05
4 -1.879993e+06 -5.829214e+06 -1.372079e+07 -2.263858e+07 -4.254295e+07
5 -7.601675e+07 -1.254220e+08 -2.356903e+08 -3.888620e+08 -7.307020e+08
6 -1.305466e+09 -2.153480e+09 -4.044550e+09 -4.548962e+09 -5.325339e+09
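
For intuition, widths=[15] * 5 tells pandas to cut each line into five consecutive 15-character slices. The sketch below does that slicing by hand on two rows from the question and parses the same rows with pd.read_fwf for comparison. The helper name split_fixed is made up for this illustration and is not part of pandas.

```python
from io import StringIO

import pandas as pd

# Two rows copied from the question; every field is exactly 15 characters wide.
rows = [
    " 4.99993537E+01 9.99987075E+01  .00000000E+00-2.70636350E+03-6.37027451E+03",
    "-1.97521328E+04-4.64928272E+04-1.09435407E+05-3.39323088E+05-7.98702345E+05",
]


def split_fixed(line, width=15, count=5):
    """Hypothetical helper: cut a line into `count` fields of `width` characters."""
    return [float(line[i * width:(i + 1) * width]) for i in range(count)]


# Manual slicing: float() ignores the padding spaces inside each field.
manual = [split_fixed(row) for row in rows]

# The same rows parsed by pandas with explicit fixed widths.
df = pd.read_fwf(StringIO("\n".join(rows)), widths=[15] * 5, header=None)

print(manual[0])
print(df)
```

Slicing by position is what makes this work where splitting on whitespace would not: the negative values touch the preceding field, so there is nothing to split on.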

Let's try pd.read_fwf:

# csv_file is the path to the fixed-width file
df = pd.read_fwf(csv_file, widths=[15] * 5, header=None)
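
As a rough end-to-end sketch of reading from a path rather than from a string, the snippet below writes one row from the question to a temporary file and reads it back. The temporary file only stands in for a real data file, and colspecs is used here as an alternative to widths: it lists explicit (start, end) character offsets, which for five adjacent 15-character fields should describe the same layout.

```python
import os
import tempfile

import pandas as pd

# One row from the question, written to a temporary file for illustration.
text = "-1.97521328E+04-4.64928272E+04-1.09435407E+05-3.39323088E+05-7.98702345E+05\n"

with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as handle:
    handle.write(text)
    csv_file = handle.name  # stands in for the real file path

try:
    # Half-open (start, end) offsets: (0, 15), (15, 30), ..., (60, 75).
    colspecs = [(start, start + 15) for start in range(0, 75, 15)]
    df = pd.read_fwf(csv_file, colspecs=colspecs, header=None)
finally:
    os.remove(csv_file)  # clean up the temporary file

print(df.shape)
```

colspecs is mainly useful when fields have different widths or when some character ranges should be skipped; for uniform fields, widths is shorter.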

Another option, when the columns are separated by whitespace, is read_csv with a whitespace separator. For example, with housing.data:


dataset = pd.read_csv('c:/1/housing.data', engine='python', sep=r'\s+', header=None)
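
A small sketch of when the whitespace approach applies, and when it does not. The sample text is made up for illustration (it is not the real contents of housing.data). The second half uses plain str.split() as a stand-in for whitespace-based splitting on a row from the question.

```python
from io import StringIO

import pandas as pd

# Made-up whitespace-separated sample (not the real housing.data contents).
sample = " 0.1  18.0  2.31\n 0.2   0.0  7.07\n"
dataset = pd.read_csv(StringIO(sample), sep=r"\s+", header=None)
print(dataset.shape)

# A row from the question: the negative values run into the previous field,
# so splitting on whitespace finds fewer tokens than there are columns.
row = " 4.99993537E+01 9.99987075E+01  .00000000E+00-2.70636350E+03-6.37027451E+03"
tokens = row.split()
print(len(tokens))
```

So a whitespace separator suits files where every field is padded with at least one space, while data like the file in the question needs position-based parsing such as pd.read_fwf.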



 