Space-separated CSV with spaces in column names and values

I have to load a CSV file into a DataFrame, but the columns are separated by single spaces, and the column names and values also contain spaces. The file looks like this:

'Mod Ports Card Type                              Model              Serial No.',
'  3   20  7600 ES+                               7600-ES+20G3C      SAL1550Y9DL',
'  5    2  Route Switch Processor 720 (Active)    RSP720-3C-GE       SAL16095Q9W',
etc.

My best idea so far was to take the length of each word in the header and check whether the corresponding values below it have more or fewer characters, but in some cases, such as 'Card Type' and '7600 ES+', a single column could be recognized as two separate columns.

Importantly, the solution has to be universal: it should work not only for this example but for other files too. My goal is to read this file into a DataFrame or any other data structure.

I tried the pd.read_fwf() function, but it gives incorrect results. The output DataFrame for my file looks like this:

(image: incorrect output)

So not only did it fail to catch Card Type correctly, it also merged it with Ports and created some Unnamed columns.

You can use read_fwf():

import pandas as pd
df = pd.read_fwf('my_file.csv')

It works best if you pass the widths parameter, giving the width of each column.
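As a minimal sketch of the widths approach, the snippet below rebuilds the three sample lines from the question in memory and reads them with explicit widths. The widths list is an assumption derived from this sample only (each entry covers a column plus the gap that follows it), and the `make_line` helper is hypothetical, used just to reproduce the layout without hand-counting spaces.

```python
import io
import pandas as pd

def make_line(prefix, card, model, serial):
    # Hypothetical helper: pads each field so the layout matches the question's sample.
    return prefix + card.ljust(39) + model.ljust(19) + serial

sample = "\n".join([
    make_line("Mod Ports ", "Card Type", "Model", "Serial No."),
    make_line("  3   20  ", "7600 ES+", "7600-ES+20G3C", "SAL1550Y9DL"),
    make_line("  5    2  ", "Route Switch Processor 720 (Active)", "RSP720-3C-GE", "SAL16095Q9W"),
])

# Assumed widths for this sample: Mod, Ports, Card Type, Model, Serial No.
widths = [4, 6, 39, 19, 11]

# io.StringIO stands in for the file; in practice a path can be passed instead.
df = pd.read_fwf(io.StringIO(sample), widths=widths)
print(df)
```

Other files will need their own widths; nothing here detects them automatically.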

EDIT

Using the data you provided, you can get the expected result with the colspecs parameter:

# `a` is assumed to be the file path or a file-like object holding the text above
df = pd.read_fwf(a, colspecs=[(0, 4), (4, 10), (10, 49), (49, 68), (68, 1000)])
df

   Mod  Ports                            Card Type          Model   Serial No.
0    3     20                             7600 ES+  7600-ES+20G3C  SAL1550Y9DL
1    5      2  Route Switch Processor 720 (Active)   RSP720-3C-GE  SAL16095Q9W
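The start offsets in the colspecs above coincide with the positions where each column name begins in the header line. If the column names are known in advance, the offsets can be computed rather than counted by hand. The following is only a sketch under that assumption: `colspecs_from_header` and `make_line` are hypothetical helpers, not part of pandas, and the approach also assumes no value starts to the left of its header name.

```python
import io
import pandas as pd

def colspecs_from_header(header, names, line_width):
    """Build half-open (start, end) spans from the offsets of known column names."""
    starts, pos = [], 0
    for name in names:
        pos = header.index(name, pos)  # raises ValueError if a name is not found
        starts.append(pos)
        pos += len(name)
    # Each column ends where the next one starts; the last runs to the line width.
    ends = starts[1:] + [line_width]
    return list(zip(starts, ends))

def make_line(prefix, card, model, serial):
    # Hypothetical helper: pads each field so the layout matches the question's sample.
    return prefix + card.ljust(39) + model.ljust(19) + serial

lines = [
    make_line("Mod Ports ", "Card Type", "Model", "Serial No."),
    make_line("  3   20  ", "7600 ES+", "7600-ES+20G3C", "SAL1550Y9DL"),
    make_line("  5    2  ", "Route Switch Processor 720 (Active)", "RSP720-3C-GE", "SAL16095Q9W"),
]
names = ["Mod", "Ports", "Card Type", "Model", "Serial No."]  # assumed to be known

width = max(len(line) for line in lines)
colspecs = colspecs_from_header(lines[0], names, width)
df = pd.read_fwf(io.StringIO("\n".join(lines)), colspecs=colspecs)
print(df)
```

This does not make the parsing fully universal: with single-space separators and spaces inside names, the header alone is ambiguous, so the list of names (or the widths) still has to come from somewhere.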

