How can I eliminate the comma at the end of lines in a CSV for a Pandas DataFrame when only some lines have them?

I am trying to convert a CSV file to a pandas DataFrame. The data is of the following type (SROIE dataset); this is just a small part of the whole file:

76,50,323,50,323,84,76,84,TAN WOON YANN
110,165,315,165,315,188,110,188,INDAH GIFT & HOME DECO
126,191,297,191,297,214,126,214,27,JALAN DEDAP 13,
129,218,287,218,287,236,129,236,TAMAN JOHOR JAYA,
100,243,324,243,324,261,100,261,81100 JOHOR BAHRU,JOHOR.
70,268,201,268,201,285,70,285,TEL:07-3507405

The issue lies only in the last column, which doesn't display the entire text information I need. Based on an answer I found on "pandas dataframe read csv with rows that have/not have comma at the end", I used the following code:

pd.read_csv(r'D:\E_Drive\everything else\C2\SROIE2019\0325updated.task1train(626p)\X00016469619.txt',usecols=np.arange(0,9), header=None)

This gave the following output: [Image: the pandas DataFrame result I got]

The problem is that, for example, in line 3 (the row labelled 2 in the DataFrame), i.e.

126,191,297,191,297,214,126,214,27,JALAN DEDAP 13,

I need

27,JALAN DEDAP 13,

but I am getting

27

only. The same issue occurs in line 5 (the row labelled 4 in the DataFrame):

100,243,324,243,324,261,100,261,81100 JOHOR BAHRU,JOHOR.

I need

81100 JOHOR BAHRU,JOHOR.

but I am getting

81100 JOHOR BAHRU
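
To see why the ninth column is cut short, it can help to count the comma-separated fields on each line. The following is a minimal sketch, assuming the sample lines above are inlined as a string (the names `SAMPLE` and `field_counts` are illustrative and not part of the original code). Lines whose text itself contains commas yield more than 9 fields, so keeping only the first 9 columns drops the rest of the text.

```python
import csv
import io

# Sample lines from the question, inlined so the sketch is self-contained.
SAMPLE = """\
76,50,323,50,323,84,76,84,TAN WOON YANN
110,165,315,165,315,188,110,188,INDAH GIFT & HOME DECO
126,191,297,191,297,214,126,214,27,JALAN DEDAP 13,
129,218,287,218,287,236,129,236,TAMAN JOHOR JAYA,
100,243,324,243,324,261,100,261,81100 JOHOR BAHRU,JOHOR.
70,268,201,268,201,285,70,285,TEL:07-3507405
"""

# Count how many fields a standard CSV reader sees on each line.
field_counts = [len(row) for row in csv.reader(io.StringIO(SAMPLE))]
print(field_counts)  # [9, 9, 11, 10, 10, 9]
```

A trailing comma also counts as a separator, which is why line 3 produces 11 fields: the last one is an empty string.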

The following approach might be sufficient. It first reads the rows using a standard CSV reader and rejoins the trailing columns into a single text column before loading the data into pandas.

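import pandas as pd
import csv

with open('X00016469619.txt', newline='') as f_input:
    csv_input = csv.reader(f_input)
    # Keep the first 8 coordinate fields and rejoin everything after them
    # into a single text field.
    data = [row[:8] + [', '.join(row[8:])] for row in csv_input]

# Each row now has exactly 9 columns.
df = pd.DataFrame(data)
print(df)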

Giving you:

     0    1    2    3    4    5    6    7                          8
0   76   50  323   50  323   84   76   84              TAN WOON YANN
1  110  165  315  165  315  188  110  188     INDAH GIFT & HOME DECO
2  126  191  297  191  297  214  126  214       27, JALAN DEDAP 13, 
3  129  218  287  218  287  236  129  236         TAMAN JOHOR JAYA, 
4  100  243  324  243  324  261  100  261  81100 JOHOR BAHRU, JOHOR.
5   70  268  201  268  201  285   70  285             TEL:07-3507405
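
Note that rejoining with `', '` inserts a space after each comma, so the text differs slightly from the file (for example `27, JALAN DEDAP 13, ` instead of `27,JALAN DEDAP 13,`). If the text must be preserved exactly, one possible alternative is to split each line on only the first 8 commas. This is a sketch under the assumption that every line has 8 integer fields followed by the text; the helper name `split_annotation` and the inlined `SAMPLE` string are hypothetical and used only for illustration.

```python
import io

# Sample lines from the question, inlined so the sketch is self-contained.
SAMPLE = """\
76,50,323,50,323,84,76,84,TAN WOON YANN
110,165,315,165,315,188,110,188,INDAH GIFT & HOME DECO
126,191,297,191,297,214,126,214,27,JALAN DEDAP 13,
129,218,287,218,287,236,129,236,TAMAN JOHOR JAYA,
100,243,324,243,324,261,100,261,81100 JOHOR BAHRU,JOHOR.
70,268,201,268,201,285,70,285,TEL:07-3507405
"""


def split_annotation(line, n_coords=8):
    """Split a line into n_coords integers plus the remaining text, unchanged.

    Assumes the line has at least n_coords + 1 comma-separated fields.
    """
    # maxsplit=n_coords leaves any commas inside the text untouched.
    parts = line.rstrip("\r\n").split(",", n_coords)
    return [int(p) for p in parts[:n_coords]] + [parts[n_coords]]


# Skip blank lines; in practice, iterate over an opened file instead.
rows = [split_annotation(line) for line in io.StringIO(SAMPLE) if line.strip()]
for row in rows:
    print(row)
```

The resulting `rows` list can be passed to `pd.DataFrame(rows)` in the same way as `data` above, with the coordinates already converted to integers.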
