How can I eliminate the comma at the end of lines in a CSV for a pandas DataFrame when only some lines have one?
I am trying to convert a CSV file to a pandas DataFrame. The data is of the following type (SROIE dataset); this is just a small part of the whole file:
76,50,323,50,323,84,76,84,TAN WOON YANN
110,165,315,165,315,188,110,188,INDAH GIFT & HOME DECO
126,191,297,191,297,214,126,214,27,JALAN DEDAP 13,
129,218,287,218,287,236,129,236,TAMAN JOHOR JAYA,
100,243,324,243,324,261,100,261,81100 JOHOR BAHRU,JOHOR.
70,268,201,268,201,285,70,285,TEL:07-3507405
The issue lies only in the last column, which doesn't display all of the text I need. Based on an answer I found under "pandas dataframe read csv with rows that have/not have comma at the end", I used the following code:
pd.read_csv(r'D:\E_Drive\everything else\C2\SROIE2019\0325updated.task1train(626p)\X00016469619.txt',usecols=np.arange(0,9), header=None)
This gave the following output:
The problem is that, for example, in line 3 (the row labelled 2 in the pandas DataFrame), i.e.
126,191,297,191,297,214,126,214,27,JALAN DEDAP 13,
I need
27,JALAN DEDAP 13,
but I am getting
27
only. The same issue occurs in line 5 (the row labelled 4 in the pandas DataFrame):
100,243,324,243,324,261,100,261,81100 JOHOR BAHRU,JOHOR.
I need
81100 JOHOR BAHRU,JOHOR.
but I am getting
81100 JOHOR BAHRU
The following approach might be sufficient. It first reads the rows using the standard CSV reader and rejoins the trailing columns into a single text column before loading the data into pandas.
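import pandas as pd
import csv

# Parse with the standard csv module so rows may have any number of fields
with open('X00016469619.txt', newline='') as f_input:
    csv_input = csv.reader(f_input)
    # Keep the first 8 fields and rejoin everything after them into one text column
    data = [row[:8] + [', '.join(row[8:])] for row in csv_input]

df = pd.DataFrame(data)
print(df)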
Giving you:
     0    1    2    3    4    5    6    7                          8
0   76   50  323   50  323   84   76   84              TAN WOON YANN
1  110  165  315  165  315  188  110  188     INDAH GIFT & HOME DECO
2  126  191  297  191  297  214  126  214        27, JALAN DEDAP 13,
3  129  218  287  218  287  236  129  236          TAMAN JOHOR JAYA,
4  100  243  324  243  324  261  100  261  81100 JOHOR BAHRU, JOHOR.
5   70  268  201  268  201  285   70  285             TEL:07-3507405
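Note that rejoining with ', ' inserts a space after each comma, so the text is not byte-for-byte what is in the file. If the text needs to stay exactly as written, one alternative might be to split each line on the first eight commas only and leave the remainder untouched. The following is a minimal sketch, assuming every line has exactly eight coordinate values before the text; `parse_line` and the inline sample data are illustrative and not part of the original answer.

```python
import io
import pandas as pd

# Sample rows copied from the question; in practice, iterate over the real file instead.
sample = io.StringIO(
    "76,50,323,50,323,84,76,84,TAN WOON YANN\n"
    "126,191,297,191,297,214,126,214,27,JALAN DEDAP 13,\n"
    "100,243,324,243,324,261,100,261,81100 JOHOR BAHRU,JOHOR.\n"
)

def parse_line(line, n_coords=8):
    """Split on the first n_coords commas only; whatever follows is the text field."""
    return line.rstrip("\r\n").split(",", n_coords)

# Skip blank lines, then build the frame from the 9-element rows
rows = [parse_line(line) for line in sample if line.strip()]
df = pd.DataFrame(rows)

# The first eight columns are coordinates, so convert them to integers
df = df.astype({i: int for i in range(8)})

print(df[8].tolist())
```

If the trailing comma itself is unwanted, `df[8].str.rstrip(',')` would remove it from the rows that have one while leaving the others unchanged.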