
Reading comma-separated values from a string object in Python

I have output from an HTTP request that is of string type, but the data looks like CSV. The Accept value in my request header is CSV ('Accept': "application/csv"), since that is the format the source supports, but the response content comes back as a string: with res = request.content, type(res) gives me a string.

Here is sample output from the object (res):

QueryTime
start,end
144488,144490

Data

Data - AData
id,G_id,name,type,time,sid,channel
23,-1,"B1",type1,144488,11,CH23
23,-1,"B1",type1,144488,11,CH23
Data - BData
id,G_id,time,se
23,-1,144488,undefined
23,-1,144488,undefined

As you can see, the data is in CSV form and contains multiple tables, such as "AData" and "BData". I can't work out which approach to take to read it. I tried the csv module, but it did not help. I also tried csv.DictReader to convert it, with the same result: I am not getting the desired output. Maybe I am doing something wrong, as I am new to Python. What I need is to read each table from the output object.

import csv

with open('file.csv', 'wb') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',', quoting=csv.QUOTE_NONE)
    spamwriter.writerow(rec)

with open('file.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print row

Experts, please guide :-)

You could pre-parse the output using a regular expression to extract the various sections, and then use StringIO to pass each section to a csv.reader as follows:

import csv
import re
import StringIO
from collections import OrderedDict

output = """
QueryTime
start,end
144488,144490

Data

Data - AData
id,G_id,name,type,time,sid,channel
23,-1,"B1",type1,144488,11,CH23
23,-1,"B1",type1,144488,11,CH23
Data - BData
id,G_id,time,se
23,-1,144488,undefined
23,-1,144488,undefined"""

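# List the longer section names before the bare 'Data' so the alternation prefers them.
sections = ['QueryTime', 'Data - AData', 'Data - BData', 'Data']
re_sections = '|'.join([re.escape(s) for s in sections])
# The capturing group makes re.split() keep the section names in the result.
tables = re.split(r'(' + re_sections + ')', output)
# Drop the text before the first section name and trim whitespace.
tables = [t.strip() for t in tables[1:]]

d_tables = OrderedDict()

# zip(*[iter(tables)]*2) walks the list in (section name, table text) pairs.
for section, table in zip(*[iter(tables)]*2):
    if len(table):
        csv_input = csv.reader(StringIO.StringIO(table))
        d_tables[section] = list(csv_input)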

for section, entries in d_tables.items():
    print section
    print entries
    print

Giving you the following output:

QueryTime
[['start', 'end'], ['144488', '144490']]

Data - AData
[['id', 'G_id', 'name', 'type', 'time', 'sid', 'channel'], ['23', '-1', 'B1', 'type1', '144488', '11', 'CH23'], ['23', '-1', 'B1', 'type1', '144488', '11', 'CH23']]

Data - BData
[['id', 'G_id', 'time', 'se'], ['23', '-1', '144488', 'undefined'], ['23', '-1', '144488', 'undefined']]
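
On Python 3 the StringIO module is replaced by io.StringIO, so the same idea might look like the sketch below. It is only a sketch: the helper name split_tables is hypothetical, and the use of csv.DictReader (so that each row is keyed by its column header) is an addition that is not part of the answer above. It assumes the section names are known in advance, as in the original code.

```python
import csv
import io
import re
from collections import OrderedDict

output = """
QueryTime
start,end
144488,144490

Data

Data - AData
id,G_id,name,type,time,sid,channel
23,-1,"B1",type1,144488,11,CH23
23,-1,"B1",type1,144488,11,CH23
Data - BData
id,G_id,time,se
23,-1,144488,undefined
23,-1,144488,undefined"""

# Longer names come before the bare 'Data' so the alternation prefers them.
SECTIONS = ['QueryTime', 'Data - AData', 'Data - BData', 'Data']


def split_tables(text, sections=SECTIONS):
    """Hypothetical helper: map each non-empty section to a list of row dicts."""
    pattern = '(' + '|'.join(re.escape(s) for s in sections) + ')'
    # The capturing group keeps the section names; [1:] drops the leading text.
    parts = [p.strip() for p in re.split(pattern, text)[1:]]
    tables = OrderedDict()
    # Even positions hold section names, odd positions hold the table text.
    for name, body in zip(parts[0::2], parts[1::2]):
        if body:
            tables[name] = list(csv.DictReader(io.StringIO(body)))
    return tables


tables = split_tables(output)
for name, rows in tables.items():
    print(name, len(rows))
```

Each value is then a list of dictionaries, so a field can be read by column name, for example tables['Data - AData'][0]['name'].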

I came up with this function to parse the data:

def parse_data(data):
    parsed = {}
    current_section = None

    for line in data.split('\n'):
        line = line.strip()
        if line:
            if ',' in line:
                # A line with a comma is a row of the current section.
                current_section.append(line.split(','))
            else:
                # A line without a comma starts a new section.
                parsed[line] = []
                current_section = parsed[line]
    return parsed

It returns a dictionary where each key refers to a section of the input. Its value is a list where each member represents a row of input. Each row is itself a list of the individual values as strings. The first row in a section is not treated specially.

Running it on your input produces this (reformatted for readability):

{
 'Data - AData': [
  ['id', 'G_id', 'name', 'type', 'time', 'sid', 'channel'],
  ['23', '-1', '"B1"', 'type1', '144488', '11', 'CH23'],
  ['23', '-1', '"B1"', 'type1', '144488', '11', 'CH23']
 ],
 'Data - BData': [
  ['id', 'G_id', 'time', 'se'],
  ['23', '-1', '144488', 'undefined'],
  ['23', '-1', '144488', 'undefined']
 ],
 'Data': [
 ],
 'QueryTime': [
  ['start', 'end'],
  ['144488', '144490']
 ]
}
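
Note that a plain split on commas keeps the quotes around "B1", and a row that appears before any section name would fail because current_section is still None. A minimal variant, sketched here for Python 3, passes each row through csv.reader to handle quoting and uses the first row of each section as the keys for the rest. The name parse_tables is hypothetical, and the sketch assumes, as the function above does, that a section name never contains a comma.

```python
import csv

sample = """QueryTime
start,end
144488,144490

Data

Data - AData
id,G_id,name,type,time,sid,channel
23,-1,"B1",type1,144488,11,CH23
23,-1,"B1",type1,144488,11,CH23
Data - BData
id,G_id,time,se
23,-1,144488,undefined
23,-1,144488,undefined"""


def parse_tables(data):
    """Hypothetical variant of parse_data: returns {section: [row_dict, ...]}."""
    parsed = {}
    header = None
    rows = None
    for line in data.splitlines():
        line = line.strip()
        if not line:
            continue
        if ',' not in line:
            # A line without a comma starts a new section.
            rows = parsed[line] = []
            header = None
        elif rows is None:
            raise ValueError('row found before any section name: %r' % line)
        else:
            # csv.reader strips the quotes around fields such as "B1".
            fields = next(csv.reader([line]))
            if header is None:
                header = fields  # first row of a section names the columns
            else:
                rows.append(dict(zip(header, fields)))
    return parsed


result = parse_tables(sample)
print(result['Data - AData'][0]['name'])
```

As with the original function, a section with no rows (here 'Data') maps to an empty list.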
