
Converting CSV from two rows into columns with a timestamp

I have a CSV file made up of two rows that hold 24 hours of data. (It is comma separated; I failed to specify this in my original question, sorry!)

Site ID,Meter Reference,Date,Units,00:30,A,01:00,A,01:30,A,02:00,A,02:30,A,03:00,A,03:30,A,04:00,A,04:30,A,05:00,A,05:30,A,06:00,A,06:30,A,07:00,A,07:30,A,08:00,A,08:30,A,09:00,A,09:30,A,10:00,A,10:30,A,11:00,A,11:30,A,12:00,A,12:30,A,13:00,A,13:30,A,14:00,A,14:30,A,15:00,A,15:30,A,16:00,A,16:30,A,17:00,A,17:30,A,18:00,A,18:30,A,19:00,A,19:30,A,20:00,A,20:30,A,21:00,A,21:30,A,22:00,A,22:30,A,23:00,A,23:30,A,00:00,A
Building,A,12/06/15,kWh,1,A,2,A,2,A,1,A,2,A,2,A,1,A,2,A,2,A,1,A,2,A,2,A,1,A,2,A,2,A,1,A,2,A,1,A,2,A,2,A,2,A,1,A,2,A,2,A,0,A,1,A,0,A,1,A,3,A,2,A,2,A,1,A,0,A,0,A,0,A,1,A,0,A,0,A,0,A,1,A,0,A,0,A,0,A,1,A,0,A,0,A,0,A,1,A

I need to get it into a format that I can use in another process, containing a full timestamp, the value and the building reference. I am going to push this into a MySQL table.

Building,   Kwh,    Timestamp
A,  2,  12/06/15 00:30
A,  3,  12/06/15 01:00
A,  4,  12/06/15 01:30
A,  4,  12/06/15 02:00
A,  2,  12/06/15 02:30
A,  3,  12/06/15 03:00

I have tried to use this to pivot the data:

import csv

from itertools import izip
a = izip(*csv.reader(open("Logger.csv", "rb")))
csv.writer(open("Long.csv", "wb")).writerows(a)

But this gives me headers over four rows and data over two. I only started with Python today; can I modify the csv import to make the conversion a bit cleaner?

I have tried running a second Python file over each row to add a timestamp and then remove the rows with 'AA' in them. I know this is not the right approach and am looking for guidance.

These are my current results:

Site ID Building
Meter Reference A
Date    11/06/15
Units   kWh
00:30   2
A   A
01:00   2
A   A
01:30   2
A   A
02:00   2
A   A
02:30   2
A   A
03:00   3

Any help is appreciated

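This works for the comma-separated sample in the question:

## Reading input
input_file = open("input_file_name.csv", 'r')

input_data = []
for line in input_file:
  # Strip the trailing newline, then split on the comma delimiter
  input_data.append(line.rstrip("\n").split(","))

date = input_data[1][2]       # "Date" column of the data row
reference = input_data[1][1]  # "Meter Reference" column of the data row
input_file.close()

## Writing output
output_file = open("output_file_name.csv", 'w')

output_file.write("Building,Kwh,Timestamp\n")

shift = 4  # skip Site ID, Meter Reference, Date and Units
size_data = len(input_data[0]) - shift
for i in range(size_data // 2):
  # Times and readings sit in every second column; the columns in between hold a flag
  hour = input_data[0][shift + 2*i]
  kwh = input_data[1][shift + 2*i]
  output_file.write(reference + "," + kwh + "," + date + " " + hour + "\n")

output_file.close()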
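
For comparison, here is a minimal sketch of the same reshaping done with the standard csv module, which the question already imports. It assumes the layout shown in the question: four leading columns (Site ID, Meter Reference, Date, Units) followed by alternating time and flag columns. The function name wide_to_long and the shortened sample data are only illustrative.

```python
import csv
import io


def wide_to_long(lines):
    """Turn a header row plus data rows into (building, kwh, timestamp) tuples."""
    reader = csv.reader(lines)
    header = next(reader)
    # Assumption: the first four columns are Site ID, Meter Reference, Date, Units,
    # and the remaining columns alternate between a time label and a flag column.
    times = header[4::2]
    out = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        reference, date = row[1], row[2]
        readings = row[4::2]
        for time, kwh in zip(times, readings):
            out.append((reference, kwh, date + " " + time))
    return out


# Shortened sample in the same shape as the file in the question
sample = io.StringIO(
    "Site ID,Meter Reference,Date,Units,00:30,A,01:00,A,01:30,A\n"
    "Building,A,12/06/15,kWh,1,A,2,A,2,A\n"
)
long_rows = wide_to_long(sample)

# Write the result as CSV (to an in-memory buffer here; an open file works the same way)
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Building", "Kwh", "Timestamp"])
writer.writerows(long_rows)
print(buffer.getvalue())
```

Passing an open file object instead of the io.StringIO sample should work the same way, since csv.reader accepts any iterable of lines.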

I had some fun with this one.

import re

input_file = 'in.txt'
output_file = 'output.txt'

data = {}

with open(input_file) as fh:
    for line in fh:
        line = line.rstrip()
        if not line:
            continue  # skip blank lines

        if 'Date' in line:
            # Header row: collect the HH:MM column labels
            times = []
            for i in line.split(','):
                if re.match(r'\d{2}:\d{2}', i):
                    times.append(i)

        else:
            # Data row: the second field is the meter reference
            building = line.split(',')[1]
            data[building] = {}
            # Copy the list so that pop() below does not affect other rows
            data[building]['times'] = list(times)
            data[building]['results'] = []
            data[building]['count'] = 0

            for i in line.split(','):
                if re.match(r'^\d+$', i):
                    data[building]['results'].append(i)
                    data[building]['count'] += 1
            data[building]['date'] = \
                re.search(r'(\d{2}/\d{2}/\d{2})', line).group()

with open(output_file, 'w') as wfh:
    wfh.write("Building,Kwh,Timestamp\n")

    for building in data:
        for i in range(0, data[building]['count']):
            # Combine the stored date with each time label to get a full timestamp
            wfh.write("{},{},{} {}\n".format(
                building,
                data[building]['results'].pop(0),
                data[building]['date'],
                data[building]['times'].pop(0))
            )
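
Since the question mentions loading the result into MySQL, the timestamp may need converting to the 'YYYY-MM-DD hh:mm:ss' form that a DATETIME column accepts. The sketch below is one way to do that with the standard datetime module. It rests on two assumptions that the question does not confirm: that 12/06/15 is day/month/year, and that the final 00:00 column belongs to the following calendar day. The helper name to_mysql_datetime is made up for this example.

```python
from datetime import datetime, timedelta


def to_mysql_datetime(date_text, time_text, day_first=True):
    """Combine a date and an HH:MM label into a 'YYYY-MM-DD hh:mm:ss' string."""
    # Assumption: dates are dd/mm/yy unless day_first is set to False
    fmt = "%d/%m/%y %H:%M" if day_first else "%m/%d/%y %H:%M"
    stamp = datetime.strptime(date_text + " " + time_text, fmt)
    # Assumption: the "00:00" label closes the day, so it falls on the next date
    if time_text == "00:00":
        stamp += timedelta(days=1)
    return stamp.strftime("%Y-%m-%d %H:%M:%S")


first = to_mysql_datetime("12/06/15", "00:30")
last = to_mysql_datetime("12/06/15", "00:00")
print(first)
print(last)
```

If the 00:00 reading should stay on the same date, drop the timedelta step.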
