I have a data frame with values like below
A B C D
1 2 3 4
5   6 7
8     9
When I read the above frame into pandas using the call below
pd.read_csv(io.StringIO(raw_2), sep=r'\s+')
It is read as
A B C D
1 2 3 4
5 6 7 NaN
8 9 NaN NaN
Is there a way I can retain the blank columns and have the 9 under column D instead of B?
You need a reader that reads fixed-width columns:
pd.read_fwf(io.StringIO(raw_2))
#   A    B    C  D
#0  1  2.0  3.0  4
#1  5  NaN  6.0  7
#2  8  NaN  NaN  9
This procedure is not guaranteed to work in general, because read_fwf infers the column boundaries from the data. You may have to specify the column widths by hand.
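If the inferred boundaries come out wrong, the widths can be passed explicitly through the `widths` parameter of `read_fwf`. A minimal sketch, assuming `raw_2` holds the table from the question with single-character columns separated by one space (the exact spacing is an assumption reconstructed from the outputs shown):

```python
import io

import pandas as pd

# Assumed contents of raw_2: single-character columns separated by one
# space, with blank cells left as spaces.
raw_2 = (
    "A B C D\n"
    "1 2 3 4\n"
    "5   6 7\n"
    "8     9\n"
)

# The first three fields are one character plus the following space;
# the last field is a single character.
df = pd.read_fwf(io.StringIO(raw_2), widths=[2, 2, 2, 1])
print(df)
```

For columns that are not contiguous, `colspecs` (a list of half-open `(start, end)` offsets) can be used in place of `widths`.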
You can use:
pd.read_csv(io.StringIO(raw_2), sep=r'\s{1,2}', engine='python')  # regex separators need the python engine
   A    B    C  D
0  1  2.0  3.0  4
1  5  NaN  6.0  7
2  8  NaN  NaN  9
This uses the regex pattern \s{1,2} as the separator, which matches one or two whitespace characters.
\s matches any whitespace character (equal to [\r\n\t\f\v ])
{1,2} Quantifier: matches between 1 and 2 times, as many times as possible, giving back as needed
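To see why this separator keeps the blank cells, it may help to split a row by hand. The sketch below assumes the same spacing as in the question (single-character columns, one space between them; `raw_2` is reconstructed from the outputs shown, so treat its contents as an assumption). A blank cell leaves three consecutive spaces, which the greedy pattern consumes as two separators with an empty field in between:

```python
import io
import re

import pandas as pd

pattern = r"\s{1,2}"

# Three spaces split as a 2-space match followed by a 1-space match,
# leaving one empty field between them.
row_2 = re.split(pattern, "5   6 7")
print(row_2)  # ['5', '', '6', '7']

# Five spaces split as 2 + 2 + 1, leaving two empty fields.
row_3 = re.split(pattern, "8     9")
print(row_3)  # ['8', '', '', '9']

# Assumed contents of raw_2, matching the rows split above.
raw_2 = "A B C D\n1 2 3 4\n5   6 7\n8     9\n"

# engine="python" is passed explicitly because regex separators are
# not supported by the default C parser.
df = pd.read_csv(io.StringIO(raw_2), sep=pattern, engine="python")
print(df)
```

This relies on every column being exactly one character wide. With wider or unevenly padded values the count of separators no longer lines up with the columns, and a fixed-width reader is likely the safer choice.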