How to read data from csv file if all the values are in the same column?
How to remove duplicate values from a CSV file when 4 of the 5 column values are the same and one column value is different
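I have a CSV file in the format shown below, and I want to use a Python script to make the changes described in the tables below. Could you suggest a suitable way to do this?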
Sheet1 (input file):
Columns: 1 2 3 4 5
row1 : abc fff v1 hhh jjj
row2 : abc fff v2 hhh jjj
row3 : efg ooo h1 ppp www
row4 : efg ooo h2 ppp www
Sheet2 (output file):
Columns: 1 2 3 4 5
row1 : abc fff v1|v2 hhh jjj
row2 : efg ooo h1|h2 ppp www
Could anyone please help me do this?
To read the CSV and get it into the desired shape, you can use pandas:
import pandas as pd

# sep is the delimiter, so change it (e.g. to ',') to match your file
# header=None because the file has no column names, so pandas labels the columns 0..4
df = pd.read_csv('input_file_name.csv', header=None, sep=r'\s+')
# group on every column except the third (label 2) and join its values with '|'
df = df.groupby([0, 1, 3, 4])[2].agg('|'.join).reset_index()
print(df)
#      0    1    3    4      2
# 0  abc  fff  hhh  jjj  v1|v2
# 1  efg  ooo  ppp  www  h1|h2
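To try the grouping without a file on disk, here is a minimal sketch that runs the same idea on the question's sample rows read from an in-memory string, then moves the merged column back to its original position. The `sample` string and the `merge_column` helper are assumptions made for illustration; they are not part of the original answer.

```python
import io

import pandas as pd

# Sample rows from the question, assumed to be whitespace-delimited.
sample = """abc fff v1 hhh jjj
abc fff v2 hhh jjj
efg ooo h1 ppp www
efg ooo h2 ppp www
"""


def merge_column(df, value_col, sep="|"):
    """Join value_col within groups defined by all other columns (hypothetical helper)."""
    keys = [c for c in df.columns if c != value_col]
    merged = (
        df.groupby(keys, sort=False)[value_col]
        .agg(lambda s: sep.join(s.astype(str)))
        .reset_index()
    )
    # reset_index leaves the merged column last; restore the input column order.
    return merged[list(df.columns)]


# header=None makes pandas label the columns 0..4, so the third column is label 2.
df = pd.read_csv(io.StringIO(sample), header=None, sep=r"\s+")
out = merge_column(df, value_col=2)
print(out)
```

From here, `out.to_csv('output.csv', header=False, index=False)` would write the result as a comma-delimited file; the output file name is likewise only a placeholder.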
Alternatively, you can use the csv module, but you will find that pandas makes this much simpler:
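import csv

value_place = 2  # index of the column whose values get merged
result = {}

# This version assumes a comma-delimited file.
with open('myfile.csv', newline='') as infile:
    for row in csv.reader(infile):
        if not row:
            continue  # skip blank lines
        # the key is every column except the one being merged
        key = tuple(x for i, x in enumerate(row) if i != value_place)
        result.setdefault(key, []).append(row[value_place])

# In Python 3, open the output in text mode with newline='' (not 'wb').
with open('output.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    for key, values in result.items():
        # put the merged value back in its original column position
        merged = list(key)
        merged.insert(value_place, '|'.join(values))
        writer.writerow(merged)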
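The csv approach can also be wrapped in a small function so it can be checked in memory before touching real files. This is only a sketch: the `merge_rows` name, its parameters, and the comma-delimited `sample` string are assumptions made for illustration.

```python
import csv
import io


def merge_rows(rows, value_place=2, joiner="|"):
    """Group rows on every column except value_place and join that column's values."""
    grouped = {}  # dicts keep insertion order, so groups come out in first-seen order
    for row in rows:
        if not row:
            continue  # skip blank lines
        key = tuple(x for i, x in enumerate(row) if i != value_place)
        grouped.setdefault(key, []).append(row[value_place])
    for key, values in grouped.items():
        merged = list(key)
        # re-insert the joined values at the original column position
        merged.insert(value_place, joiner.join(values))
        yield merged


# Comma-delimited version of the question's sample rows (assumed layout).
sample = (
    "abc,fff,v1,hhh,jjj\n"
    "abc,fff,v2,hhh,jjj\n"
    "efg,ooo,h1,ppp,www\n"
    "efg,ooo,h2,ppp,www\n"
)

out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerows(merge_rows(csv.reader(io.StringIO(sample))))
print(out.getvalue(), end="")
```

For whitespace-delimited input like the layout shown in the question, passing `(line.split() for line in infile)` instead of a `csv.reader` should work with the same function.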
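Disclaimer: the technical posts on this site follow the CC BY-SA 4.0 license. If you need to repost, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.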