
Parsing a csv file with column data in Python

I want to read the first 3 columns of a CSV file and modify them before storing them. Data in the CSV file:

{::[name]str1_str2_str3[0]},1,U0.00 - Sensor1 Not Ready\\nTry Again,1,0,12

{::[name]str1_str2_str3[1]},2,U0.00 - Sensor2 Not Ready\\nTry Again,1,0,12

From column 1, I just want to parse the value 0 or 1 inside the [ ]. From column 2, I want the value as it is. From column 3, I want to parse the substring "Sensor1 Not Ready", convert it to upper case, and replace the spaces with underscores (e.g. SENSOR1_NOT_READY). Then I want to print the resulting string in a new column.

Format of the new string:

**<value from column 1>.<value from column 2>.<string from column 3>**
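
The three steps and the target format above can be sketched as one small helper. This is only a minimal sketch under stated assumptions: the name `format_row` is hypothetical, and the line break in column 3 is assumed to be stored either as the two literal characters `\n` or as a real newline, so both are handled.

```python
import re

def format_row(col1, col2, col3):
    """Build '<index>.<col2>.<MESSAGE>' from the first three columns (hypothetical helper)."""
    # Take the digits inside the last [...] of column 1, e.g. '[0]' -> '0'.
    index = re.findall(r'\[(\d+)\]', col1)[-1]
    # Keep the text after ' - ' ...
    message = col3.split(' - ', 1)[1]
    # ... and before the line break (literal backslash + n, or a real newline).
    message = re.split(r'\\n|\n', message)[0]
    # Upper-case the text and join its words with underscores.
    message = '_'.join(message.split()).upper()
    return '.'.join([index, col2, message])

print(format_row('{::[name]str1_str2_str3[0]}', '1',
                 r'U0.00 - Sensor1 Not Ready\nTry Again'))
# 0.1.SENSOR1_NOT_READY
```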

I am new to coding in Python. Can someone help me with this? What is the best and most efficient way to do this? TIA

What I have tried so far:

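import csv
from collections import defaultdict

# Map each column index to the list of values found in that column.
columns = defaultdict(list)

# In Python 3, open the file in text mode with newline='' (not 'rb').
with open('filename.csv', newline='') as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        for i, value in enumerate(row):
            columns[i].append(value)

columns = dict(columns)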

Is this a good approach for column 3?

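x = "U0.00 - Sensor1 Not Ready\nTry Again"  # example value parsed from column 3
a, b = x.split("\n")  # 'a' is the substring before \n
c, d = a.split("-")  # 'd' is the substring after '-'
e = d.strip().upper()  # strip the leading space left over from the split
new_str = e.replace(" ", "_")  # call replace on the string 'e', not on the str type
print(new_str)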

My suggestion is to read each whole line as a string and then extract the desired data with the re module, like this:

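import re

# Capture the digit in [ ], the second field, and the text between '- ' and the newline.
term = r'\[(\d)\].*,(\d+),.*-\s([\w\s]+)\n'

line = '{::[name]str1_str2_str3[0]},1,U0.00 - Sensor1 Not Ready\nTry Again,1,0,12'
capture = list(re.search(term, line).groups())
# Join the words with underscores and convert to upper case.
capture[-1] = '_'.join(capture[-1].split()).upper()
result = ','.join(capture)
# result == '0,1,SENSOR1_NOT_READY'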
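
An alternative to matching the whole line is to let the csv module split the fields first and then handle each of the first three columns separately. The following is a rough sketch, not a definitive implementation: `SAMPLE`, `PATTERN` and `add_label_column` are hypothetical names, the sample text stands in for the real file, and the `\n` in column 3 is assumed to be stored as the two literal characters backslash and n. The new column uses the dot-separated format from the question.

```python
import csv
import io
import re

# Sample data standing in for the real file (assumption: the '\n' inside
# column 3 is the two literal characters backslash + n).
SAMPLE = (
    '{::[name]str1_str2_str3[0]},1,U0.00 - Sensor1 Not Ready\\nTry Again,1,0,12\n'
    '{::[name]str1_str2_str3[1]},2,U0.00 - Sensor2 Not Ready\\nTry Again,1,0,12\n'
)

# Text after the '-' made of word characters and spaces; the match stops at
# the backslash (or at a real newline), so both forms are handled.
PATTERN = re.compile(r'-\s*([\w ]+)')

def add_label_column(rows):
    """Yield each row with '<index>.<col2>.<MESSAGE>' appended (hypothetical helper)."""
    for row in rows:
        col1, col2, col3 = row[:3]  # only the first three columns are needed
        # Value between the last '[' and the last ']' of column 1.
        index = col1[col1.rindex('[') + 1:col1.rindex(']')]
        message = PATTERN.search(col3).group(1)
        label = '.'.join([index, col2, '_'.join(message.split()).upper()])
        yield row + [label]

rows = list(add_label_column(csv.reader(io.StringIO(SAMPLE))))
labels = [row[-1] for row in rows]
print(labels)  # ['0.1.SENSOR1_NOT_READY', '1.2.SENSOR2_NOT_READY']
```

With a real file, `io.StringIO(SAMPLE)` could be replaced by `open('filename.csv', newline='')`, and the extended rows could be written back out with `csv.writer(...).writerows(rows)`.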
