
python merge list of dictionaries based on key

I am looking for a python alternative to a join.

I am trying to get a list with every second of the day, and join data to that, based on the timestamp. What I have so far, is this:

keys=('DRIP_ID','DESCR','OBJECT','TIMESTAMP','DRIP_R1','DRIP_R2','RT_DISP1','RT_DISP2','DAY','TIME')

The keys are the column names

rawdata=[['242418',"242418 Rechts.BD242418: tot Oudkarspel -  - ${pijlop} N242  10 min - N508  14 min${pijlr}",'BD242418','20150701063825','N242','N508','10','14','20150701','063825'],
['242418',"242418 Rechts.BD242418: tot Oudkarspel -  - ${pijlop} N242  10 min - N508  14 min${pijlr}",'BD242418','20150701064327','N242','N508','10','14','20150701','064327'],
['242418',"242418 Rechts.BD242418: tot Oudkarspel -  - ${pijlop} N242  10 min - N508  14 min${pijlr}",'BD242418','20150701085717','N242','N508','10','14','20150701','085717'],
['242418',"242418 Rechts.BD242418: tot Oudkarspel -  - ${pijlop} N242  10 min - N508  14 min${pijlr}",'BD242418','20150701100116','N242','N508','10','14','20150701','100116'],
['242418',"242418 Rechts.BD242418: tot Oudkarspel -  - ${pijlop} N242  10 min - N508  14 min${pijlr}",'BD242418','20150701191611','N242','N508','10','14','20150701','191611'],
['242418',"242418 Rechts.BD242418: tot Oudkarspel -  - ${pijlop} N242  10 min - N508  14 min${pijlr}",'BD242418','20150701213616','N242','N508','10','14','20150701','213616']]

The rawdata is what comes out of the software

sec = ['00','01','02','03','04','05','06','07','08','09','10','11','12','13','14','15','16','17','18','19','20','21','22','23','24','25','26','27','28','29','30','31','32','33','34','35','36','37','38','39','40','41','42','43','44','45','46','47','48','49','50','51','52','53','54','55','56','57','58','59']
mm = sec
hh = ['00','01','02','03','04','05','06','07','08','09','10','11','12','13','14','15','16','17','18','19','20','21','22','23']
timestamp=()
time = []
dictData = []

# Dictionary with all seconds (HHMMSS) in 1 day
for ih, uur in enumerate(hh):
    if ih < 24:
          for im, minutes in enumerate(mm):
              if im < 60:
                  for isec, secs in enumerate(sec):
                      if isec < 60:
                        timestamp = str(uur)+str(minutes)+str(secs)
                        timeDict = dict()
                        timeDict['DRIP_ID']=""
                        timeDict['DESCR']=""
                        timeDict['OBJECT']=""
                        timeDict['TIMESTAMP']=""
                        timeDict['DRIP_R1']=""
                        timeDict['DRIP_R2']=""
                        timeDict['RT_DISP1']=""
                        timeDict['RT_DISP2']=""
                        timeDict['DAY']=""
                        timeDict['TIME']=timestamp
                        time.append(timeDict)

Here I made all the seconds in a day and gave them the same keys, for easier matching
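For comparison, the same list can be built without the hand-written string lists and the index checks. This is only a minimal sketch, assuming the same keys tuple as above; the names all_seconds and row are illustrative and do not appear in the original code.

```python
keys = ('DRIP_ID', 'DESCR', 'OBJECT', 'TIMESTAMP', 'DRIP_R1', 'DRIP_R2',
        'RT_DISP1', 'RT_DISP2', 'DAY', 'TIME')

# One dict per second of the day, every column empty except TIME (HHMMSS).
all_seconds = []
for h in range(24):
    for m in range(60):
        for s in range(60):
            row = dict.fromkeys(keys, "")             # all columns start as ""
            row['TIME'] = '%02d%02d%02d' % (h, m, s)  # zero-padded HHMMSS
            all_seconds.append(row)
```

The ranges already stop at 24 and 60, so the extra if checks are not needed.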

# Turn raw data into dictionary                        
for row in rawdata:
    dictionary = dict(zip(keys, row))
    dictData.append(dictionary)

Then I take the rawdata and turn that into a dict as well

# Join, sort of
compleet=()
for t in time:
    t.update(dictData)
    compleet.append(t)

print len(compleet)
print compleet[1]

However, when I run this, I get the error:

ValueError: dictionary update sequence element #0 has length 10; 2 is required

This led me to believe that I can only update one key:value pair at a time, but I am not sure this is correct.

Furthermore, it's a 1:1 join: one timestamp can have only one measurement, and not every second of the day has a measurement. The join would be on "TIME".

The documentation says:

dict.update = update(...)

D.update([E, ]**F) -> None. Update D from dict/iterable E and F.

If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]

If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v

In either case, this is followed by: for k in F: D[k] = F[k]

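Because dictData is a list and doesn't have a keys() method, for k, v in dictData: t[k] = v is run inside the update method. Each element of dictData is a dictionary with 10 keys, which cannot be unpacked into a (k, v) pair, and that causes the exception.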
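The behaviour can be reproduced on a small scale. The following is a minimal sketch with made-up sample values (target and rows are illustrative names), using a three-key dictionary so that the reported length is 3 rather than 10:

```python
target = {'TIME': '063825', 'DAY': '', 'DRIP_R1': ''}
rows = [{'TIME': '063825', 'DAY': '20150701', 'DRIP_R1': 'N242'}]

# Passing the list: update() tries to unpack each element as a (key, value)
# pair, but the element is a dict with 3 keys, so unpacking fails.
try:
    target.update(rows)
    message = None
except ValueError as exc:
    message = str(exc)

# Passing a single dict: update() copies its key/value pairs into target.
target.update(rows[0])
```

Here message reports that element #0 has length 3 where 2 is required, while the second call succeeds because a dict has a keys() method.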

I don't fully understand what your code is meant to do, so I can't offer a concrete fix.

I can help further if you explain the intended result (e.g. what the t variable should contain after the loop has run).

#Same result as a join, by iterating.
for iTime, t in enumerate(time):
    for iData, d in enumerate(dictData):
        if t['TIME'] == d['TIME']:
            t.update(d)

After realizing what went wrong, and seeing that there is no built-in join, this was the next best thing.
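The nested loop compares every second of the day with every measurement. A possible alternative, sketched here under the 1:1 assumption stated in the question, is to index the measurements by TIME first and then look each second up once. The function name join_on_time and the sample rows are hypothetical, not part of the original code.

```python
def join_on_time(time_rows, data_rows):
    """Update each row in time_rows with the data row sharing its TIME."""
    # Index the measurements by TIME; with duplicate times the last one wins.
    by_time = dict((d['TIME'], d) for d in data_rows)
    for t in time_rows:
        match = by_time.get(t['TIME'])
        if match is not None:   # seconds without a measurement stay empty
            t.update(match)
    return time_rows

# Small made-up sample: two seconds, one measurement.
time_rows = [{'TIME': '063824', 'DAY': ''}, {'TIME': '063825', 'DAY': ''}]
data_rows = [{'TIME': '063825', 'DAY': '20150701', 'DRIP_R1': 'N242'}]
joined = join_on_time(time_rows, data_rows)
```

With the names from the question this would be called as join_on_time(time, dictData); like the loop above, it modifies the dictionaries in place.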


 