Parse response times minute-wise in Python

I have an input file which looks like this and I want to calculate the response times by minutes.

datapoint,time,transaction,PT,Responsetime,errorcode
a06i0000003uNQOAA2,2013-09-26T19:15:55.873+0000,EditMode,57,109.877193,0
a06i0000003uNQOAA2,2013-09-26T19:15:55.875+0000,Update,58,733.741379,0
a06i0000003uNQOAA2,2013-09-26T19:15:55.875+0000,ViewObject,94,386.893617,0
a06i0000003uNQOAA2,2013-09-26T19:16:25.889+0000,EditMode,110,109.209091,0
a06i0000003uNQOAA2,2013-09-26T19:16:25.889+0000,Update,109,743.660550,0
a06i0000003uNQOAA2,2013-09-26T19:16:25.890+0000,ViewObject,181,376.198895,0
a06i0000003uNQOAA2,2013-09-26T19:16:55.904+0000,EditMode,162,109.080247,0
a06i0000003uNQOAA2,2013-09-26T19:16:55.904+0000,Update,161,738.683230,0
a06i0000003uNQOAA2,2013-09-26T19:16:55.904+0000,ViewObject,266,372.627820,0
a06i0000003uNQOAA2,2013-09-26T19:17:25.918+0000,EditMode,212,108.580189,0
a06i0000003uNQOAA2,2013-09-26T19:17:25.919+0000,Update,213,735.244131,0
a06i0000003uNQOAA2,2013-09-26T19:17:25.919+0000,ViewObject,350,362.394286,0
a06i0000003uNQOAA2,2013-09-26T19:17:55.933+0000,EditMode,263,107.954373,0
a06i0000003uNQOAA2,2013-09-26T19:17:55.933+0000,Update,264,732.598485,0
a06i0000003uNQOAA2,2013-09-26T19:17:55.934+0000,ViewObject,431,359.965197,0
a06i0000003uNQOAA2,2013-09-26T19:18:25.947+0000,EditMode,314,107.815287,0
a06i0000003uNQOAA2,2013-09-26T19:18:25.948+0000,Update,315,733.292063,0
a06i0000003uNQOAA2,2013-09-26T19:18:25.948+0000,ViewObject,516,360.098837,0
a06i0000003uNQOAA2,2013-09-26T19:18:55.961+0000,EditMode,368,107.559783,0
a06i0000003uNQOAA2,2013-09-26T19:18:55.961+0000,Update,366,731.808743,0
a06i0000003uNQOAA2,2013-09-26T19:18:55.962+0000,ViewObject,600,359.780000,0
a06i0000003uNQOAA2,2013-09-26T19:19:25.975+0000,EditMode,418,107.406699,0
a06i0000003uNQOAA2,2013-09-26T19:19:25.976+0000,Update,419,731.613365,0
a06i0000003uNQOAA2,2013-09-26T19:19:25.976+0000,ViewObject,686,358.169096,0
a06i0000003uNQOAA2,2013-09-26T19:19:55.989+0000,EditMode,470,107.265957,0
a06i0000003uNQOAA2,2013-09-26T19:19:55.990+0000,Update,467,732.107066,0
a06i0000003uNQOAA2,2013-09-26T19:19:55.990+0000,ViewObject,768,360.317708,0
a06i0000003uNQOAA2,2013-09-26T19:20:26.003+0000,EditMode,521,107.149712,0
a06i0000003uNQOAA2,2013-09-26T19:20:26.004+0000,Update,521,733.990403,0
a06i0000003uNQOAA2,2013-09-26T19:20:26.004+0000,ViewObject,853,361.735053,0
a06i0000003uNQOAA2,2013-09-26T19:20:56.018+0000,EditMode,572,107.117133,0
a06i0000003uNQOAA2,2013-09-26T19:20:56.018+0000,Update,572,733.139860,0
a06i0000003uNQOAA2,2013-09-26T19:20:56.018+0000,ViewObject,937,361.497332,0
a06i0000003uNQOAA2,2013-09-26T19:21:26.032+0000,EditMode,623,106.855538,0
a06i0000003uNQOAA2,2013-09-26T19:21:26.032+0000,Update,623,732.057785,0
a06i0000003uNQOAA2,2013-09-26T19:21:26.032+0000,ViewObject,1020,361.191176,0
a06i0000003uNQOAA2,2013-09-26T19:21:56.046+0000,EditMode,674,107.112760,0
a06i0000003uNQOAA2,2013-09-26T19:21:56.046+0000,Update,674,731.721068,0
a06i0000003uNQOAA2,2013-09-26T19:21:56.046+0000,ViewObject,1106,360.622966,0
a06i0000003uNQOAA2,2013-09-26T19:22:26.059+0000,EditMode,724,107.041436,0

This is the program I came up with. However, it gives me the response times for the entire file rather than for each individual minute. I'm not sure where I'm going wrong. Any pointers would be greatly appreciated.

import numpy as np
from scipy import stats

rtlist = []
reqpslist = []

newFile = open('100ulog.csv','r')
FILE = newFile.readlines()
newFile.close()


for line in FILE:
    newline1 = line.split(":")
    newline2 = line.split(",")
    min = newline1[1]
    if newline1[1] == min:
        rtlist.append(newline2[4])
        reqpslist.append(newline2[3])
        print rtlist

    else:
        rtlist[:] = []
        min = min+1

I'm going to guess at what you want; if you edit your question, I'll edit my answer. You want the response times grouped by minute. Let's first parse the entire file and pull out the interesting parts: (a) the minute, (b) the PT, and (c) the response time.

We'll use re:

>>> import re
>>> data = open('100ulog.csv','r').read()
>>> lst = re.findall(r'.+?,.+?T\d+:(\d+):.+?,.+?,(\d+),(\d+\.\d+),', data)
>>> # This will return a list of interesting tuples like: [('16', '181', '376.198895'),...]
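To make it easier to see which part of the pattern grabs which field, here is a rough sketch of the same idea with named groups, tried on a single row from the question. The name ROW and the group names minute, pt and rt are made up for illustration, and the sketch assumes Python 3 and that every row has exactly the six columns shown in the sample.

```python
import re

# Illustrative, stricter variant of the pattern above (ROW and the group
# names are assumptions, not part of the original answer).
ROW = re.compile(r"""
    [^,]+,                             # datapoint id
    [^,T]+T\d+:(?P<minute>\d+):[^,]+,  # timestamp; capture the minute only
    [^,]+,                             # transaction name
    (?P<pt>\d+),                       # PT column
    (?P<rt>\d+\.\d+),                  # Responsetime column
""", re.VERBOSE)

line = "a06i0000003uNQOAA2,2013-09-26T19:16:25.890+0000,ViewObject,181,376.198895,0"
match = ROW.match(line)
fields = match.groups()
print(fields)  # ('16', '181', '376.198895')
```

The header line of the file has no timestamp, so neither this pattern nor the one above matches it, and it is skipped without special handling.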

Now we can do whatever we want with it. Let's say we want to build a dictionary whose keys are the minutes and whose values are lists of (PT, response time) tuples. We'll use collections.defaultdict for that:

>>> from collections import defaultdict
>>> dic = defaultdict(list)
>>> for item in lst:
...     dic[int(item[0])].append(item[1:3])
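Once the rows are grouped, a per-minute figure is one more step. The following is only a sketch, assuming Python 3 and assuming that a plain unweighted mean of the Responsetime column is the statistic you are after (whether that is right depends on what the column actually represents). The helper mean_response_time is a made-up name, and the sample rows are copied from the question.

```python
import re
from collections import defaultdict

# A few rows copied from the question's sample input, header included.
data = """datapoint,time,transaction,PT,Responsetime,errorcode
a06i0000003uNQOAA2,2013-09-26T19:15:55.873+0000,EditMode,57,109.877193,0
a06i0000003uNQOAA2,2013-09-26T19:15:55.875+0000,Update,58,733.741379,0
a06i0000003uNQOAA2,2013-09-26T19:16:25.889+0000,EditMode,110,109.209091,0
a06i0000003uNQOAA2,2013-09-26T19:16:25.889+0000,Update,109,743.660550,0
"""

lst = re.findall(r'.+?,.+?T\d+:(\d+):.+?,.+?,(\d+),(\d+\.\d+),', data)

# Group by minute, converting the captured strings to numbers on the way in.
dic = defaultdict(list)
for minute, pt, rt in lst:
    dic[int(minute)].append((int(pt), float(rt)))

def mean_response_time(rows):
    """Unweighted mean of the response times in one minute's rows."""
    times = [rt for _pt, rt in rows]
    return sum(times) / len(times)

averages = {minute: mean_response_time(rows) for minute, rows in dic.items()}
for minute in sorted(averages):
    print(minute, round(averages[minute], 3))
```

Converting to int and float while grouping avoids having to cast the string tuples later, when the numbers are actually needed.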

EDIT:

Example:

>>> data
'a06i0000003uNQOAA2,2013-09-26T19:15:55.873+0000,EditMode,57,109.877193,0\na06i0
000003uNQOAA2,2013-09-26T19:15:55.875+0000,Update,58,733.741379,0\na06i0000003uN
QOAA2,2013-09-26T19:15:55.875+0000,ViewObject,94,386.893617,0\na06i0000003uNQOAA
2,2013-09-26T19:16:25.889+0000,EditMode,110,109.209091,0\na06i0000003uNQOAA2,201
3-09-26T19:16:25.889+0000,Update,109,743.660550,0\na06i0000003uNQOAA2,2013-09-26
T19:16:25.890+0000,ViewObject,181,376.198895,0\na06i0000003uNQOAA2,2013-09-26T19
:16:55.904+0000,EditMode,162,109.080247,0\na06i0000003uNQOAA2,2013-09-26T19:16:5
5.904+0000,Update,161,738.683230,0\na06i0000003uNQOAA2,2013-09-26T19:16:55.904+0
000,ViewObject,266,372.627820,0\na06i0000003uNQOAA2,2013-09-26T19:17:25.918+0000
,EditMode,212,108.580189,0\na06i0000003uNQOAA2,2013-09-26T19:17:25.919+0000,Upda
te,213,735.244131,0\na06i0000003uNQOAA2,2013-09-26T19:17:25.919+0000,ViewObject,
350,362.394286,0\na06i0000003uNQOAA2,2013-09-26T19:17:55.933+0000,EditMode,263,1
07.954373,0\na06i0000003uNQOAA2,2013-09-26T19:17:55.933+0000,Update,264,732.5984
85,0\na06i0000003uNQOAA2,2013-09-26T19:17:55.934+0000,ViewObject,431,359.965197,
0\na06i0000003uNQOAA2,2013-09-26T19:18:25.947+0000,EditMode,314,107.815287,0\na0
6i0000003uNQOAA2,2013-09-26T19:18:25.948+0000,Update,315,733.292063,0\na06i00000
03uNQOAA2,2013-09-26T19:18:25.948+0000,ViewObject,516,360.098837,0\na06i0000003u
NQOAA2,2013-09-26T19:18:55.961+0000,EditMode,368,107.559783,0\na06i0000003uNQOAA
2,2013-09-26T19:18:55.961+0000,Update,366,731.808743,0\na06i0000003uNQOAA2,2013-
09-26T19:18:55.962+0000,ViewObject,600,359.780000,0'
>>> import re
>>> from collections import defaultdict
>>> lst = re.findall(r'.+?,.+?T\d+:(\d+):.+?,.+?,(\d+),(\d+\.\d+),', data)
>>> dic = defaultdict(list)
>>> for item in lst:
...     dic[int(item[0])].append(item[1:3])
...
>>> dic
defaultdict(<type 'list'>, {16: [('110', '109.209091'), ('109', '743.660550'), (
'181', '376.198895'), ('162', '109.080247'), ('161', '738.683230'), ('266', '372
.627820')], 17: [('212', '108.580189'), ('213', '735.244131'), ('350', '362.3942
86'), ('263', '107.954373'), ('264', '732.598485'), ('431', '359.965197')], 18:
[('314', '107.815287'), ('315', '733.292063'), ('516', '360.098837'), ('368', '1
07.559783'), ('366', '731.808743'), ('600', '359.780000')], 15: [('57', '109.877
193'), ('58', '733.741379'), ('94', '386.893617')]})
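One caveat with keying on the minute number alone: if the log spans more than an hour, 19:15 and 20:15 end up in the same bucket. A possible alternative, sketched here for Python 3 with the standard csv and datetime modules instead of a regex, keys on the full date, hour and minute. The names sample, by_minute and summary are made up, and the timestamp format string is an assumption based on the sample rows in the question.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Stand-in for the real file; with a file on disk you would pass the open
# file object to csv.DictReader instead.
sample = io.StringIO(
    "datapoint,time,transaction,PT,Responsetime,errorcode\n"
    "a06i0000003uNQOAA2,2013-09-26T19:15:55.873+0000,EditMode,57,109.877193,0\n"
    "a06i0000003uNQOAA2,2013-09-26T19:15:55.875+0000,Update,58,733.741379,0\n"
    "a06i0000003uNQOAA2,2013-09-26T19:16:25.889+0000,EditMode,110,109.209091,0\n"
)

by_minute = defaultdict(list)
for row in csv.DictReader(sample):
    # Format assumed from the sample data: ISO-like timestamp with a +0000 offset.
    stamp = datetime.strptime(row["time"], "%Y-%m-%dT%H:%M:%S.%f%z")
    key = stamp.strftime("%Y-%m-%d %H:%M")  # truncate to the minute
    by_minute[key].append(float(row["Responsetime"]))

# For each minute: (number of rows, unweighted mean response time).
summary = {key: (len(v), sum(v) / len(v)) for key, v in by_minute.items()}
for key in sorted(summary):
    print(key, summary[key])
```

The column names come from the header row, so this version also keeps working if the column order changes.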
