[Pandas, Python] Retain Empty Columns in Space Separated Data Frame

I have a data frame with values like the ones below:

A B C D
1 2 3 4
5   6 7
8     9

When I read the above data into pandas using the code below:

pd.read_csv(io.StringIO(raw_2), sep=r'\s+')

it is read as:

A B C   D
1 2 3   4
5 6 7   NaN
8 9 NaN NaN

Is there a way I can retain the blank columns and have the 9 under column D instead of B?

You need a reader that reads fixed-width columns:

pd.read_fwf(io.StringIO(raw_2))
#   A    B    C  D
#0  1  2.0  3.0  4
#1  5  NaN  6.0  7
#2  8  NaN  NaN  9

This is not guaranteed to work in general: by default, read_fwf infers the column boundaries from the first rows of the data, and that guess can be wrong. You may have to specify the column widths by hand.
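If the inferred boundaries come out wrong, one option is to pass them explicitly through `colspecs`. The following is a minimal sketch, not a general recipe: `raw_2` is reconstructed here from the sample in the question, and the character offsets are assumptions that hold only for that layout (one-character values starting at positions 0, 2, 4 and 6).

```python
import io

import pandas as pd

# Sample data reconstructed from the question: one-character values,
# each column starting two characters after the previous one.
raw_2 = (
    "A B C D\n"
    "1 2 3 4\n"
    "5   6 7\n"
    "8     9\n"
)

# Half-open (start, end) character ranges, one per column.
# These offsets are assumptions that match the sample layout only.
colspecs = [(0, 1), (2, 3), (4, 5), (6, 7)]

df = pd.read_fwf(io.StringIO(raw_2), colspecs=colspecs)
print(df)
```

When the fields are contiguous, `read_fwf` also accepts a `widths` list in place of `colspecs`.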

You can use:

pd.read_csv(io.StringIO(raw_2), sep=r'\s{1,2}')

    A   B   C   D
0   1   2.0 3.0 4
1   5   NaN 6.0 7
2   8   NaN NaN 9

This uses the regex pattern \s{1,2} as the separator. The regex matches one or two whitespace characters.

\s matches any whitespace character (equivalent to [\r\n\t\f\v ]).

{1,2} is a quantifier: it matches between 1 and 2 times, as many times as possible, giving back as needed (greedy).
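To see why this separator keeps the empty columns, it can help to apply the same pattern with `re.split` outside pandas. This is only an illustrative sketch; the `lines` list is copied from the sample in the question, and `rows` is a name introduced here.

```python
import re

# The four lines of the sample from the question.
lines = ["A B C D", "1 2 3 4", "5   6 7", "8     9"]

# Each match consumes at most two whitespace characters, so a run of
# three spaces is split twice (one empty field) and a run of five
# spaces is split three times (two empty fields).
rows = [re.split(r"\s{1,2}", line) for line in lines]

for row in rows:
    print(row)
# ['A', 'B', 'C', 'D']
# ['1', '2', '3', '4']
# ['5', '', '6', '7']
# ['8', '', '', '9']
```

The empty strings are what pandas turns into NaN. Note that this depends on the exact spacing of the sample (one-character values separated by a single space); with wider columns the number of empty fields may no longer match the number of missing values. pandas also falls back to its Python parsing engine for regex separators like this one, so passing engine='python' makes that choice explicit.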
