Consolidate .csv File to One Row per Unique Key (Example, Person) in Python

What I Have

I have a .csv file listing employees and their shifts for a given day. It looks like this:

Initials,Last,First,ShiftStart,ShiftEnd
BAB,Smith,Bob,10:00a,1:00p
JCJ,Jones,Jill,11:00a,3:00p
JIH,Hernandez,Jose,1:00p,4:00p
BAB,Smith,Bob,1:00p,3:00p
JIH,Hernandez,Jose,5:00p,9:00p
JCJ,Jones,Jill,3:00p,3:30p
JCJ,Jones,Jill,3:30p,5:00p
DJM,Martin,Dominique,8:00a,11:00a

Note that one person can have more than one shift, that the start time of one shift may or may not equal the end time of another, and that each employee's initials serve as a unique identifier (suitable for use as a key).

What I Want

I want to consolidate this .csv file so that there is only one row per employee. If a person has more than one shift, check whether the end time of one shift equals the start time of another and, if so, combine those shifts. If not, add two new columns, 2ndShiftStart and 2ndShiftEnd, and put the second shift there.

The result should look like this:

Initials,Last,First,ShiftStart,ShiftEnd,2ndShiftStart,2ndShiftEnd
BAB,Smith,Bob,10:00a,3:00p,,
JCJ,Jones,Jill,11:00a,5:00p,,
JIH,Hernandez,Jose,1:00p,4:00p,5:00p,9:00p
DJM,Martin,Dominique,8:00a,11:00a,,

BAB, for example, works 10 am - 1 pm and then 1 pm - 3 pm, so the resulting .csv lists him as working 10 am - 3 pm.
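
The merge rule described above can be sketched for a single employee as follows. This is only an illustration, not part of the original question: the helper names `to_minutes` and `merge_shifts` are made up here, and the code assumes every time is written as `h:mm` followed by `a` or `p`, as in the sample data. Times are converted to minutes so that shifts can be sorted before adjacent ones are compared.

```python
def to_minutes(label):
    """Convert a time such as '10:00a' or '3:30p' to minutes after midnight.

    Assumes the 'h:mm' + 'a'/'p' format used in the sample data.
    """
    clock, suffix = label[:-1], label[-1].lower()
    hours, minutes = (int(part) for part in clock.split(":"))
    hours %= 12  # '12:xx' wraps to 0, so 12:00a is midnight
    if suffix == "p":
        hours += 12
    return hours * 60 + minutes


def merge_shifts(shifts):
    """Join shifts where one ends exactly when the next one starts."""
    merged = []
    for start, end in sorted(shifts, key=lambda shift: to_minutes(shift[0])):
        if merged and to_minutes(merged[-1][1]) == to_minutes(start):
            # Back-to-back: extend the previous shift instead of adding one.
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


# BAB's two shifts touch at 1:00p, so they collapse into one.
print(merge_shifts([("10:00a", "1:00p"), ("1:00p", "3:00p")]))
# [('10:00a', '3:00p')]

# JIH has a gap between 4:00p and 5:00p, so both shifts are kept.
print(merge_shifts([("1:00p", "4:00p"), ("5:00p", "9:00p")]))
# [('1:00p', '4:00p'), ('5:00p', '9:00p')]
```

Only exact end-to-start matches are joined; overlapping shifts are left as separate entries in this sketch.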

#!/usr/bin/env python3
import sys

# Columns: Initials,Last,First,ShiftStart,ShiftEnd
s = '''BAB,Smith,Bob,10:00a,1:00p
JCJ,Jones,Jill,11:00a,3:00p
JIH,Hernandez,Jose,1:00p,4:00p
BAB,Smith,Bob,1:00p,3:00p
JIH,Hernandez,Jose,5:00p,9:00p
JCJ,Jones,Jill,3:00p,3:30p
JCJ,Jones,Jill,3:30p,5:00p
DJM,Martin,Dominique,8:00a,11:00a'''

# Group the shifts by employee; Initials is the unique key.
db = {}
for line in s.split('\n'):
    Initials, Last, First, ShiftStart, ShiftEnd = line.split(',')
    if Initials in db:
        db[Initials][2].append((ShiftStart, ShiftEnd))
    else:
        db[Initials] = (Last, First, [(ShiftStart, ShiftEnd)])

# Write one row per employee, listing every shift in input order.
# Back-to-back shifts are not merged here.
for Initials, v in db.items():
    Last, First, shifts = v
    sys.stdout.write(Initials + ',')
    sys.stdout.write(Last + ',' + First)
    for shift in shifts:
        ShiftStart, ShiftEnd = shift
        sys.stdout.write(',' + ShiftStart + ',' + ShiftEnd)
    sys.stdout.write('\n')
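
The script above groups shifts per employee but leaves back-to-back shifts unmerged and writes no header row. A possible extension, sketched with the standard `csv` module, is shown below. It is not the original answer's code: the function name `consolidate` is made up, the sketch assumes each employee's rows appear in chronological order (as they do in the sample), and the column names beyond `2ndShiftStart`/`2ndShiftEnd` are a guess at how the naming would continue.

```python
import csv
import io

SAMPLE = """Initials,Last,First,ShiftStart,ShiftEnd
BAB,Smith,Bob,10:00a,1:00p
JCJ,Jones,Jill,11:00a,3:00p
JIH,Hernandez,Jose,1:00p,4:00p
BAB,Smith,Bob,1:00p,3:00p
JIH,Hernandez,Jose,5:00p,9:00p
JCJ,Jones,Jill,3:00p,3:30p
JCJ,Jones,Jill,3:30p,5:00p
DJM,Martin,Dominique,8:00a,11:00a
"""


def consolidate(infile, outfile):
    """Read shift rows from infile and write one row per employee to outfile."""
    # Initials -> [Last, First, [(start, end), ...]]; dicts keep insertion order.
    people = {}
    for row in csv.DictReader(infile):
        entry = people.setdefault(row["Initials"], [row["Last"], row["First"], []])
        shifts = entry[2]
        if shifts and shifts[-1][1] == row["ShiftStart"]:
            # Starts exactly when the previous shift ended: extend it.
            shifts[-1] = (shifts[-1][0], row["ShiftEnd"])
        else:
            shifts.append((row["ShiftStart"], row["ShiftEnd"]))

    # Widest employee decides how many start/end column pairs are needed.
    width = max((len(entry[2]) for entry in people.values()), default=1)
    header = ["Initials", "Last", "First"]
    for n in range(1, width + 1):
        # Assumed naming: no prefix for the first shift, then 2nd, 3rd, 4th, ...
        prefix = "" if n == 1 else {2: "2nd", 3: "3rd"}.get(n, f"{n}th")
        header += [prefix + "ShiftStart", prefix + "ShiftEnd"]

    writer = csv.writer(outfile, lineterminator="\n")
    writer.writerow(header)
    for initials, (last, first, shifts) in people.items():
        flat = [value for shift in shifts for value in shift]
        flat += [""] * (2 * width - len(flat))  # pad employees with fewer shifts
        writer.writerow([initials, last, first] + flat)


out = io.StringIO()
consolidate(io.StringIO(SAMPLE), out)
print(out.getvalue(), end="")
# Initials,Last,First,ShiftStart,ShiftEnd,2ndShiftStart,2ndShiftEnd
# BAB,Smith,Bob,10:00a,3:00p,,
# JCJ,Jones,Jill,11:00a,5:00p,,
# JIH,Hernandez,Jose,1:00p,4:00p,5:00p,9:00p
# DJM,Martin,Dominique,8:00a,11:00a,,
```

With real files, the same function could be called on handles opened with `newline=""`, which is what the `csv` module documentation recommends. Times are compared as plain strings here, so `1:00p` and `01:00p` would not be treated as equal.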

Alternatively, you could write a more object-oriented program:

# Columns: Initials,Last,First,ShiftStart,ShiftEnd
s = '''BAB,Smith,Bob,10:00a,1:00p
JCJ,Jones,Jill,11:00a,3:00p
JIH,Hernandez,Jose,1:00p,4:00p
BAB,Smith,Bob,1:00p,3:00p
JIH,Hernandez,Jose,5:00p,9:00p
JCJ,Jones,Jill,3:00p,3:30p
JCJ,Jones,Jill,3:30p,5:00p
DJM,Martin,Dominique,8:00a,11:00a'''

class Shift(object):
    def __init__(self, ShiftStart, ShiftEnd):
        self.ShiftStart, self.ShiftEnd = ShiftStart, ShiftEnd
    def __str__(self):
        # Use the instance attributes, not the module-level loop variables.
        return '%s,%s' % (self.ShiftStart, self.ShiftEnd)

class Person(object):
    def __eq__(self, p):
        # Two people match on Initials; Last and First are checked when given.
        if self.Initials != p.Initials:
            return False
        if p.Last is not None and self.Last != p.Last:
            return False
        if p.First is not None and self.First != p.First:
            return False
        return True
    def __init__(self, Initials, Last, First):
        self.Initials, self.Last, self.First = Initials, Last, First
        self.Shifts = []
    def __str__(self):
        return '%s,%s,%s' % (self.Initials, self.Last, self.First)

def AddShift(people, person, shift):
    # Reuse the stored Person if one compares equal; otherwise add the new one.
    try:
        person = people[people.index(person)]
    except ValueError:
        people.append(person)
    person.Shifts.append(shift)

people = []
for line in s.split('\n'):
    Initials, Last, First, ShiftStart, ShiftEnd = line.split(',')
    AddShift(people, Person(Initials, Last, First), Shift(ShiftStart, ShiftEnd))

# One output row per person, with all of that person's shifts appended.
for person in people:
    print('%s,%s' % (person, ','.join(map(str, person.Shifts))))
