
How to import into pandas a file that uses a comma as the delimiter but has commas inside one of its columns?

I have a comma-separated text file, but several columns contain commas themselves, so parsing creates extra columns where none are needed. I tried removing all the commas and then using a regex to find only the numbers and add a comma after each one, following the solution in "Put comma after a pattern in python regex", but it did not work.

Excel has the same problem, as do other text editors.

0111,Cultivo de cereales y otros cultivos n.c.p.,011,Cultivos en general; cultivo de productos de mercado; hortic,01,AGRICULTURA, GANADERIA, CAZA Y ACTIVIDADES DE SERVICIOS CONE,01,**AGRICULTURA, GANADERIA, CAZA Y SILVICULTURA**  

As you can see in the text marked with **, Python creates three columns there instead of one.

Another solution would be to put quotation marks (" ") around those values, but I have not found a way to add them.

Your data source is buggy. It should put quotes (" ") around such values; pandas would then be able to parse the file. Without them, there is no reliable, logical way to tell the fields apart, because the meaning of a comma has become ambiguous.
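To illustrate the point about quoting, here is a minimal sketch of how the sample row parses once the two comma-containing values are wrapped in double quotes. The quoted row is hand-edited from the question's example, and the `header=None` and `dtype=str` arguments are assumptions about the file (no header row, codes such as 0111 kept as text).

```python
import io

import pandas as pd

# The sample row from the question, with the two comma-containing
# values wrapped in double quotes by hand (an assumption about how
# a well-behaved export would look).
quoted_row = (
    '0111,Cultivo de cereales y otros cultivos n.c.p.,011,'
    'Cultivos en general; cultivo de productos de mercado; hortic,01,'
    '"AGRICULTURA, GANADERIA, CAZA Y ACTIVIDADES DE SERVICIOS CONE",01,'
    '"AGRICULTURA, GANADERIA, CAZA Y SILVICULTURA"\n'
)

# header=None: the sample has no header row.
# dtype=str: keep leading zeros in codes such as "0111".
df = pd.read_csv(io.StringIO(quoted_row), header=None, dtype=str)

print(df.shape)       # number of rows and columns parsed
print(df.iloc[0, 7])  # the last value, commas intact
```

With the quotes in place the row parses into eight columns, and the commas inside the quoted values are kept as part of the text.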

A heuristic solution is to assume that any comma followed by a space is part of the text and should be removed, while all other commas are real delimiters and should be kept. You could try that, but there can still be cases in which it fails.

data = data.replace(", ", " ")  # assumes data holds the raw file text as a str; replace() returns a new string
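A variation on the same heuristic keeps the commas inside the text instead of dropping them: split only on commas that are not followed by a space, then write the fields back out with proper quoting. This is a sketch under that same assumption, so it fails in the same cases (for example, a delimiter that happens to be followed by a space). The names `split_line` and `requote` are hypothetical helpers, not part of any library.

```python
import csv
import io
import re

SAMPLE = (
    "0111,Cultivo de cereales y otros cultivos n.c.p.,011,"
    "Cultivos en general; cultivo de productos de mercado; hortic,01,"
    "AGRICULTURA, GANADERIA, CAZA Y ACTIVIDADES DE SERVICIOS CONE,01,"
    "AGRICULTURA, GANADERIA, CAZA Y SILVICULTURA"
)

# Assumption: a comma followed by a space belongs to the text;
# any other comma is a real delimiter.
DELIMITER = re.compile(r",(?! )")


def split_line(line):
    """Split one raw line on delimiter commas only."""
    return DELIMITER.split(line.rstrip("\n"))


def requote(text):
    """Rewrite raw text as CSV, quoting fields that contain commas."""
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for line in text.splitlines():
        if line.strip():  # skip blank lines
            writer.writerow(split_line(line))
    return out.getvalue()


fields = split_line(SAMPLE)
fixed = requote(SAMPLE)

print(len(fields))
print(fixed)
```

The re-quoted text can then be handed to pandas, for example with `pd.read_csv(io.StringIO(fixed), header=None, dtype=str)`.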
