Convert a space-separated file to Pandas when the values contain spaces

Question

I have a space-separated text file. The values in the first 3 columns contain spaces, but those columns have a fixed width (7 characters).

Example:

A123456 B123456 C123456 12 158 325 0 14
D123456 E123456 F123456 1 147 23 711 0
G1 3456 H123456 F 23456 158 11 7 574 12589
J1234 6 K   456 L123456 1458 2 0.45 1 78

Desired output:

         0        1        2     3    4     5    6      7
0  A123456  B123456  C123456    12  158   325    0     14
1  D123456  E123456  F123456     1  147    23  711      0
2  G1 3456  H123456  F 23456   158   11     7  574  12589
3  J1234 6  K   456  L123456  1458    2  0.45    1     78

Can I read this file with pandas?

Answer 1

We can use pd.read_fwf to "Read a table of fixed-width formatted lines into DataFrame":

import pandas as pd

df = pd.read_fwf('data.txt', colspecs='infer', header=None)

df:

         0        1        2                   3
0  A123456  B123456  C123456     12 158 325 0 14
1  D123456  E123456  F123456      1 147 23 711 0
2  G1 3456  H123456  F 23456  158 11 7 574 12589
3  J1234 6  K   456  L123456    1458 2 0.45 1 78
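Since the question states that the first three fields are exactly 7 characters wide, the column boundaries can also be spelled out instead of inferred. The sketch below assumes the layout seen in the sample (three 7-character fields separated by single spaces, then the rest of the line) and reads the sample from an in-memory string instead of data.txt. The names raw, colspecs and df_fixed are illustrative.

```python
import io

import pandas as pd

# Sample data copied from the question.
raw = (
    "A123456 B123456 C123456 12 158 325 0 14\n"
    "D123456 E123456 F123456 1 147 23 711 0\n"
    "G1 3456 H123456 F 23456 158 11 7 574 12589\n"
    "J1234 6 K   456 L123456 1458 2 0.45 1 78\n"
)

# Half-open [start, end) character ranges. An end of None means
# "up to the end of the line" (assumed layout, taken from the sample).
colspecs = [(0, 7), (8, 15), (16, 23), (24, None)]

df_fixed = pd.read_fwf(io.StringIO(raw), colspecs=colspecs, header=None)
print(df_fixed)
```

Explicit boundaries do not depend on what the sampled rows happen to contain, which may matter if some real file had a space at the same position of a fixed-width field in every row.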

If the rest of the frame is space separated, column 3 can then be split on whitespace with str.split:

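# Read the fixed-width fields; the trailing values land together in column 3
df = pd.read_fwf('data.txt', colspecs='infer', header=None)
# Replace column 3 with one column per space-separated value
# (rsuffix avoids a name clash with the existing columns 0, 1 and 2)
df = df.drop(3, axis=1).join(df[3].str.split(expand=True), rsuffix='_x')
# Renumber the columns 0..7
df.columns = range(len(df.columns))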

df:

         0        1        2     3    4     5    6      7
0  A123456  B123456  C123456    12  158   325    0     14
1  D123456  E123456  F123456     1  147    23  711      0
2  G1 3456  H123456  F 23456   158   11     7  574  12589
3  J1234 6  K   456  L123456  1458    2  0.45    1     78
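One thing to keep in mind is that the columns produced by str.split hold strings, not numbers. A possible follow-up, sketched here with the sample data read from an in-memory string, is to convert them with pd.to_numeric. The name numeric_cols and the assumption that columns 3 to 7 are all numeric come from the sample, not from the original answer.

```python
import io

import pandas as pd

# Sample data copied from the question.
raw = (
    "A123456 B123456 C123456 12 158 325 0 14\n"
    "D123456 E123456 F123456 1 147 23 711 0\n"
    "G1 3456 H123456 F 23456 158 11 7 574 12589\n"
    "J1234 6 K   456 L123456 1458 2 0.45 1 78\n"
)

# Same steps as in the answer above.
df = pd.read_fwf(io.StringIO(raw), colspecs="infer", header=None)
df = df.drop(3, axis=1).join(df[3].str.split(expand=True), rsuffix="_x")
df.columns = range(len(df.columns))

# Columns 3..7 came out of str.split, so convert them one by one.
numeric_cols = range(3, 8)
for col in numeric_cols:
    df[col] = pd.to_numeric(df[col])

print(df.dtypes)
```

Column 5 becomes a float column because of the value 0.45; the other converted columns hold integers.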

data.txt:

A123456 B123456 C123456 12 158 325 0 14
D123456 E123456 F123456 1 147 23 711 0
G1 3456 H123456 F 23456 158 11 7 574 12589
J1234 6 K   456 L123456 1458 2 0.45 1 78

Answer 2

You can use either of these:

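# Note: this pattern splits on ';', ':' or ',' only. The sample data contains
# none of these, so every line ends up in a single column.
data = pd.read_csv('data.txt',
                   sep=";|:|,",
                   header=None,
                   engine='python')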

Or use read_fwf:

df = pd.read_fwf('data.txt', colspecs='infer', header=None)

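With the sample data, read_fwf puts each of the three fixed-width fields in its own column and leaves the remaining space-separated values together in a fourth column, as shown in Answer 1. Hope this helps.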
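For comparison, the same table can be built without read_fwf by slicing each line by hand. This is only a sketch under the layout assumed from the sample (three 7-character fields separated by single spaces, followed by values that contain no spaces). The names raw, rows and df_manual are illustrative.

```python
import io

import pandas as pd

# Sample data copied from the question.
raw = (
    "A123456 B123456 C123456 12 158 325 0 14\n"
    "D123456 E123456 F123456 1 147 23 711 0\n"
    "G1 3456 H123456 F 23456 158 11 7 574 12589\n"
    "J1234 6 K   456 L123456 1458 2 0.45 1 78\n"
)

rows = []
for line in io.StringIO(raw):
    line = line.rstrip("\n")
    # First three fields: 7 characters each, separated by one space.
    fixed = [line[0:7], line[8:15], line[16:23]]
    # Remaining values have no embedded spaces, so a plain split is enough.
    rest = line[24:].split()
    rows.append(fixed + rest)

df_manual = pd.DataFrame(rows)
print(df_manual)
```

All values in df_manual are strings, so a numeric conversion would still be needed for the last five columns.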

Answer 1
1 2021-07-26 16:05:24

Answer 2
0 2021-07-26 16:09:06
