Using Python regex to create a JSON array of temperature data

I have a log file that stores temperatures in the format:

2013/09/30 11:23:01 Temperature 41.34F 5.19C
2013/09/30 11:23:01 Temperature 99.84F 37.69C
2013/09/30 11:23:01 Temperature 65.86F 18.81C
2013/09/30 11:25:02 Temperature 41.67F 5.38C
2013/09/30 11:25:02 Temperature 65.64F 18.69C
2013/09/30 11:25:02 Temperature 98.83F 37.12C

There are a variable number of values corresponding to a given minute, from 1 to 3. How would I use Python regular expressions to convert the data to JSON format, such that each time is followed by its series of Fahrenheit values?

{"c":[{"v":"Date(2013, 8, 30, 11, 23)"},{"v":41.34},{"v":99.84},{"v":65.86}]},

So the script would open "temperatures.log", read through the file, take the time value and put it in the format:

{"c":[{"v":"Date(2013, 8, 30, 11, 23)"}, 

(with the month offset by -1)

and then loop through all the temperature values at that time and include each like:

{"v":41.34},

until it finds a date/time that differs from the one on the previous line, and then close the expression with

]}, 

write the output file, and start the next series, until the end of the log file.

You don't need regular expressions for this, since your data is pretty straightforward. First, note that you can group the data without even parsing the date, because a simple string comparison tells you whether two lines share a timestamp:

def proc_lines(lines):
    cur_date = None
    cur_temps = []

    results = []

    for line in lines:
        parts = line.split()
        date = "%s %s" % (parts[0], parts[1])
        if date != cur_date:
            if cur_temps:
                #save current data
                results.append((cur_date, cur_temps))
            #reset state
            cur_date = date
            cur_temps = []
        #add the line's temperature in fahrenheit, stripping out the 'F'
        cur_temps.append(float(parts[3][:-1]))

    #save the final group
    if cur_temps:
        results.append((cur_date, cur_temps))

    return results

Now results will be a list of (date, temperatures) tuples, where each date is still an unparsed string and temperatures is a list of floats:

>>> lines = """2013/09/30 11:23:01 Temperature 41.34F 5.19C
2013/09/30 11:23:01 Temperature 99.84F 37.69C
2013/09/30 11:23:01 Temperature 65.86F 18.81C
2013/09/30 11:25:02 Temperature 41.67F 5.38C
2013/09/30 11:25:02 Temperature 65.64F 18.69C
2013/09/30 11:25:02 Temperature 98.83F 37.12C""".split("\n")
>>> results = proc_lines(lines)
>>> results
[('2013/09/30 11:23:01', [41.340000000000003, 99.840000000000003, 
                          65.859999999999999]), 
 ('2013/09/30 11:25:02', [41.670000000000002, 65.640000000000001, 
                          98.829999999999998])]
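One caveat: proc_lines assumes every line has at least four whitespace-separated fields, so a blank or truncated line would raise an IndexError. If the log might contain such lines, this is one place where a regular expression can earn its keep, as a validator. The following is only a sketch: the pattern is inferred from the sample lines in the question, and LINE_RE and parse_line are hypothetical names, not part of the code above.

```python
import re

# Assumed line layout, inferred from the sample data:
# "YYYY/MM/DD HH:MM:SS Temperature <float>F <float>C"
LINE_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) Temperature "
    r"(-?\d+(?:\.\d+)?)F (-?\d+(?:\.\d+)?)C$"
)


def parse_line(line):
    """Return (timestamp, fahrenheit), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return match.group(1), float(match.group(2))
```

A caller could then skip any line for which parse_line returns None instead of crashing partway through the file.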

You can use datetime.datetime.strptime to parse the date and then format it (subtracting 1 from the month, as you asked):

>>> import datetime
>>> def proc_datestr(date):
    dt = datetime.datetime.strptime(date, "%Y/%m/%d %H:%M:%S")
    return "Date(%d, %d, %d, %d, %d, %d)" % (
        dt.year, dt.month - 1, dt.day, dt.hour, dt.minute, dt.second)

>>> proc_datestr(results[0][0])
'Date(2013, 8, 30, 11, 23, 1)'

Note the format string "%Y/%m/%d %H:%M:%S", which parses dates according to the strptime format codes described in the Python datetime documentation. This lovely built-in function obviates the need for you to write your own regexp to deal with the date.
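The output requested in the question stops at the minute, Date(2013, 8, 30, 11, 23), whereas proc_datestr also emits the seconds. If you want the shorter form, a variant along these lines should do it. minute_datestr is a hypothetical name, and the zero-based month simply follows the "offset by -1" requirement stated in the question.

```python
import datetime


def minute_datestr(stamp):
    """Format 'YYYY/MM/DD HH:MM:SS' as 'Date(Y, M-1, D, h, m)', dropping seconds."""
    dt = datetime.datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S")
    # Month is shifted by -1, as requested in the question.
    return "Date(%d, %d, %d, %d, %d)" % (
        dt.year, dt.month - 1, dt.day, dt.hour, dt.minute)
```

For the first sample line, minute_datestr("2013/09/30 11:23:01") gives 'Date(2013, 8, 30, 11, 23)'.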

Then you just process the results and dump them to JSON as follows:

>>> import json
>>> def proc_result(result):
    date, temps = result
    res = {'c': [{'v': proc_datestr(date)}]}
    for temp in temps:
        res['c'].append({'v': temp})
    return json.dumps(res)

>>> proc_result(results[0])
'{"c": [{"v": "Date(2013, 8, 30, 11, 23, 1)"}, {"v": 41.340000000000003}, {"v": 99.840000000000003}, {"v": 65.859999999999999}]}'
>>> proc_result(results[1])
'{"c": [{"v": "Date(2013, 8, 30, 11, 25, 2)"}, {"v": 41.670000000000002}, {"v": 65.640000000000001}, {"v": 98.829999999999998}]}'
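To tie the pieces together, here is a rough end-to-end sketch that reads the log and writes one JSON row per minute. It makes a few assumptions that go beyond the answer above: readings are grouped by minute rather than by full timestamp (to match the output shown in the question), blank lines are skipped, rows are joined with a comma and newline with no trailing comma after the last row, and the output file name temperatures.json is made up for illustration. It uses itertools.groupby in place of the manual loop in proc_lines; groupby only merges consecutive lines, which fits a log written in time order.

```python
import datetime
import json
from itertools import groupby


def _minute(line):
    # Grouping key: the date plus "HH:MM", so seconds are ignored.
    date, time = line.split()[:2]
    return "%s %s" % (date, time[:5])


def rows_from_lines(lines):
    """Return one JSON string per run of consecutive lines sharing a minute."""
    rows = []
    non_blank = (line for line in lines if line.strip())  # skip blank lines
    for minute, group in groupby(non_blank, key=_minute):
        dt = datetime.datetime.strptime(minute, "%Y/%m/%d %H:%M")
        # Month is shifted by -1, as requested in the question.
        cells = [{"v": "Date(%d, %d, %d, %d, %d)" % (
            dt.year, dt.month - 1, dt.day, dt.hour, dt.minute)}]
        # The fourth field is the Fahrenheit reading, e.g. "41.34F".
        cells.extend({"v": float(line.split()[3][:-1])} for line in group)
        rows.append(json.dumps({"c": cells}))
    return rows


def convert_file(src="temperatures.log", dst="temperatures.json"):
    # dst is an assumed output name; adjust both paths to your setup.
    with open(src) as infile:
        rows = rows_from_lines(infile)
    with open(dst, "w") as outfile:
        outfile.write(",\n".join(rows) + "\n")
```

Calling convert_file() with no arguments would read temperatures.log from the current directory. If the consumer really needs a trailing comma after every row, as in the question's example, change the join accordingly.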
