pandas read_csv to ignore the column index in front of each value
Is there a way to read in a file like the one below and skip the column index (1-5) in front of each value? I'm using read_csv.
24.0 1:0.00632 2:18.00 3:2.310 4:0 5:0.5380
21.6 1:0.02731 2:0.00 3:7.070 4:0 5:0.4690
Expected table after reading:
24.0 0.00632 18.00 2.310 0 0.5380
read_csv won't handle this the way you want because it's not a CSV.
You can do, e.g.:
pd.DataFrame([[chunk.split(':')[-1] for chunk in line.split()] for line in f])
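The one-liner above assumes f is an already opened file object and leaves every value as a string. A fuller sketch of the same idea follows. It uses io.StringIO as a stand-in for the real file, the helper name parse_indexed_lines is made up for illustration, and the conversion to float is an addition that the one-liner does not do:

```python
import io

import pandas as pd

# Stand-in for the real file. With a file on disk you would pass an open
# file object instead, e.g. open('data.txt') (the file name is an assumption).
raw = io.StringIO(
    "24.0 1:0.00632 2:18.00 3:2.310 4:0 5:0.5380\n"
    "21.6 1:0.02731 2:0.00 3:7.070 4:0 5:0.4690\n"
)


def parse_indexed_lines(f):
    """Strip the 'n:' prefix from each whitespace-separated field.

    Hypothetical helper: split(':')[-1] keeps the text after the colon,
    or the whole field when there is no colon (the first column).
    """
    rows = [
        [float(chunk.split(':')[-1]) for chunk in line.split()]
        for line in f
        if line.strip()  # skip blank lines
    ]
    return pd.DataFrame(rows)


df = parse_indexed_lines(raw)
print(df)
```

Because every field is converted while the rows are built, all columns come out as float64 rather than strings.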
Your data is oddly structured. Given the colon index separator, you can read the file mostly as text via the usual read_csv. Then, for every column in the dataframe except the first one, split each string on ':', take the second element (the value you want), and convert it to a float. All of this is done with applymap and a lambda.
df = pd.read_csv('data.txt', sep=' ', header=None)
>>> df
0 1 2 3 4 5
0 24.0 1:0.00632 2:18.00 3:2.310 4:0 5:0.5380
1 21.6 1:0.02731 2:0.00 3:7.070 4:0 5:0.4690
df.iloc[:, 1:] = df.iloc[:, 1:].applymap(lambda s: float(s.split(':')[1]))
>>> df
0 1 2 3 4 5
0 24.0 0.00632 18 2.31 0 0.538
1 21.6 0.02731 0 7.07 0 0.469
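Note that applymap is deprecated in newer pandas releases (2.1 and later) in favor of DataFrame.map, and that assigning the converted values back through iloc may leave the columns with object dtype. A variant that avoids both issues might look like the sketch below. It is only one possible way to do it: io.StringIO stands in for data.txt, and the names values and result are chosen for illustration.

```python
import io

import pandas as pd

# Stand-in for data.txt so the example is self-contained.
raw = io.StringIO(
    "24.0 1:0.00632 2:18.00 3:2.310 4:0 5:0.5380\n"
    "21.6 1:0.02731 2:0.00 3:7.070 4:0 5:0.4690\n"
)

df = pd.read_csv(raw, sep=' ', header=None)

# Column 0 is already numeric; columns 1-5 hold strings such as '1:0.00632'.
# Series.map applies the conversion element by element, column by column.
values = df.iloc[:, 1:].apply(
    lambda col: col.map(lambda s: float(s.split(':')[-1]))
)

# Build a new frame instead of assigning in place, so each column keeps
# the float dtype produced by the conversion.
result = pd.concat([df.iloc[:, [0]], values], axis=1)
print(result)
print(result.dtypes)
```

Building a new frame with pd.concat rather than writing back into df is a design choice here: it keeps the original strings available for inspection and gives float64 columns directly.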