[Pandas, Python] Retain Empty Columns in Space-Separated Data Frame
I have a data frame with values like the ones below:
A B C D
1 2 3 4
5   6 7
8     9
When I read the above frame into Pandas using the call below:
pd.read_csv(io.StringIO(raw_2), sep=r'\s+')
It is read as:
A B C D
1 2 3 4
5 6 7 NaN
8 9 NaN NaN
Is there a way I can retain the blank columns and have the 9 under column D instead of B?
You need a reader that reads fixed-width columns:
pd.read_fwf(io.StringIO(raw_2))
# A B C D
#0 1 2.0 3.0 4
#1 5 NaN 6.0 7
#2 8 NaN NaN 9
This procedure is not guaranteed to work in general, because read_fwf has to infer the column boundaries from the data. You may have to specify the column widths by hand.
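If the inferred boundaries come out wrong, the positions can be passed explicitly through the colspecs parameter of read_fwf. The sketch below assumes the raw text uses one-character columns separated by single spaces, with blank cells left as spaces; the exact contents of raw_2 and the column positions are assumptions, since the spacing in the post did not survive.

```python
import io

import pandas as pd

# Assumed layout: one-character columns separated by single spaces,
# with blank cells left as spaces (the original spacing is a guess).
raw_2 = (
    "A B C D\n"
    "1 2 3 4\n"
    "5   6 7\n"
    "8     9\n"
)

# Half-open [start, end) character ranges for columns A, B, C and D.
colspecs = [(0, 1), (2, 3), (4, 5), (6, 7)]

df = pd.read_fwf(io.StringIO(raw_2), colspecs=colspecs)
print(df)
```

For real data the ranges have to match the actual character positions in the file; the widths parameter is an alternative when the fields are contiguous.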
You can use:
pd.read_csv(io.StringIO(raw_2), sep=r'\s{1,2}', engine='python')
A B C D
0 1 2.0 3.0 4
1 5 NaN 6.0 7
2 8 NaN NaN 9
This uses the regex pattern \s{1,2} as the separator, which matches one or two whitespace characters.
\s matches any whitespace character (equal to [\r\n\t\f\v ])
{1,2} Quantifier: matches between 1 and 2 times, as many times as possible, giving back as needed
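To see why this separator leaves empty fields in the right places, it can help to split the rows directly with Python's re module. This is only an illustration of the regex, not of pandas internals: split_line is a hypothetical helper, and the rows assume one-character values with single-space gaps and blank cells left as spaces.

```python
import re

# Separator from the answer above: one or two whitespace characters, greedy.
SEP = re.compile(r"\s{1,2}")


def split_line(line):
    """Split one row on the regex separator (illustrative helper)."""
    return SEP.split(line.strip())


# Assumed rows: one-character values, single-space gaps, blanks left as spaces.
rows = ["1 2 3 4", "5   6 7", "8     9"]
for row in rows:
    print(repr(row), "->", split_line(row))
```

Under these assumptions a blank cell adds exactly two extra spaces, so a run of 1 + 2k spaces is consumed as k two-space matches plus one single-space match, leaving k empty fields. The row "5   6 7" therefore splits into ['5', '', '6', '7'], and "8     9" into ['8', '', '', '9']. With wider or unevenly padded columns this arithmetic no longer holds, and a fixed-width reader is the safer choice.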