Pandas read_csv: ignore the column index in front of each value

Question

Is there a way to read in a file like the one below and skip the column index (1-5) in front of each value? I'm using read_csv.

24.0 1:0.00632 2:18.00 3:2.310 4:0 5:0.5380 
21.6 1:0.02731 2:0.00 3:7.070 4:0 5:0.4690

Expected table after reading:

24.0 0.00632 18.00 2.310 0 0.5380

Answer 1

read_csv won't handle this the way you want because the file is not a CSV.

You can parse it yourself instead, for example (where f is the open file object):

pd.DataFrame([[chunk.split(':')[-1] for chunk in line.split()] for line in f])
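For reference, here is a minimal self-contained sketch of the same idea. The helper name read_indexed_values and the in-memory sample are assumptions made for illustration; in practice you would pass a handle from open() on your own file. Unlike the one-liner, it also converts each token to float and skips blank lines.

```python
import io

import pandas as pd


def read_indexed_values(f):
    """Build a DataFrame from lines of whitespace-separated tokens,
    dropping any 'k:' prefix in front of a value (hypothetical helper)."""
    rows = [
        # split(':')[-1] keeps the text after the colon, or the whole
        # token when there is no colon (as in the first column)
        [float(chunk.split(":")[-1]) for chunk in line.split()]
        for line in f
        if line.strip()  # ignore empty lines
    ]
    return pd.DataFrame(rows)


# Sample data copied from the question; replace with open('data.txt')
sample = io.StringIO(
    "24.0 1:0.00632 2:18.00 3:2.310 4:0 5:0.5380 \n"
    "21.6 1:0.02731 2:0.00 3:7.070 4:0 5:0.4690\n"
)
df = read_indexed_values(sample)
print(df)
```

Because line.split() with no arguments splits on any run of whitespace, trailing spaces at the end of a line do not produce an extra empty token.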

Answer 2

Your data is oddly structured. Given the colon index separator, you can first read the file as text via the usual read_csv. Then, for every column in the DataFrame except the first, split each string on ':', take the second element (which is the value you want), and convert it to a float. The applymap call with a lambda does this element by element.

import pandas as pd
df = pd.read_csv('data.txt', sep=' ', header=None)

>>> df
      0          1        2        3    4         5
0  24.0  1:0.00632  2:18.00  3:2.310  4:0  5:0.5380
1  21.6  1:0.02731   2:0.00  3:7.070  4:0  5:0.4690

df.iloc[:, 1:] = df.iloc[:, 1:].applymap(lambda s: float(s.split(':')[1]))

>>> df
      0        1   2     3  4      5
0  24.0  0.00632  18  2.31  0  0.538
1  21.6  0.02731   0  7.07  0  0.469
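As a hedged alternative for newer pandas versions, where DataFrame.applymap is deprecated in favor of DataFrame.map, the same clean-up can be done column by column with the .str accessor. This is only a sketch: the helper name strip_index, the in-memory sample, and the use of sep=r"\s+" instead of a single space are assumptions, not part of the original answer.

```python
import io

import pandas as pd

# Sample data copied from the question; replace with the path to your file
sample = io.StringIO(
    "24.0 1:0.00632 2:18.00 3:2.310 4:0 5:0.5380\n"
    "21.6 1:0.02731 2:0.00 3:7.070 4:0 5:0.4690\n"
)

# sep=r"\s+" splits on any run of whitespace
raw = pd.read_csv(sample, sep=r"\s+", header=None)


def strip_index(col):
    """Keep the text after the last ':' in each string and cast to float
    (hypothetical helper)."""
    return col.str.rsplit(":", n=1).str[-1].astype(float)


# Clean every column except the first, then put the first column back
values = raw.iloc[:, 1:].apply(strip_index)
result = pd.concat([raw.iloc[:, [0]], values], axis=1)
print(result)
```

Building a new frame with concat, rather than assigning back through iloc, leaves every column with a float dtype.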

2 answers

Answer 1
2 2016-02-17 03:16:43

Answer 2
1 Accepted 2016-02-17 03:16:25
