
Counting and storing values in a dictionary using Python

def prodInfo():
    from collections import Counter
    prodHolder = {}
    tempdict = {}
    try:
        os.chdir(copyProd)
        for root, dirs, files in os.walk('.'):
            for data in files:

                fullpath = os.path.join(root, data)
                with open(fullpath, 'rt') as fp:
                    for info in fp:
                        info = info.strip()
                        if info.startswith('prodType'):
                            info0 = info.split('=')[1]
                            info0 = info0.replace(';','')
                            info0 = info0.replace('"','')
                        if info.startswith('acq'):
                            info1 = info.split('=')[1]  
                            info1 = info1.replace(';','')
                            info1 = info1.replace('"','')
                        if info.startswith('ID_num'):
                            info2 = info.split('=')[1]
                            info2 = info2.replace(';','')
                            info2 = info2.replace('"','')

                    print info0 + info1 + info2

This produces the following result:

SD Acq645467 356788
SD Acq645467 356788
SD Acq645467 356788
SD Acq645467 356788
SD Acq645467 356788
SD Acq645467 356788
SD Acq645467 356788
SD Acq645467 356788
SD Acq645467 356788
Image Acq645467 356788
Image Acq645467 356788
Image Acq645467 356788
Image Acq645467 356788

SD Acq644869 356849
SD Acq644869 356849
Image Acq644869 356849

SD Acq644247 356851
SD Acq644247 356851
Image Acq644247 356851

I would like to store the results and be able to count how many times 'SD' and how many times 'Image' occur for each specific ID number (356788, 356849, 356851).

The results would be as follows:

9 - SD / 4 - Image for 356788

2 - SD / 1 - Image for 356849

2 - SD / 1 - Image for 356851

I thought it would be best to store the items in a dictionary, but I have not been able to count the values successfully. This is the code I used to store the info in a dictionary:

prodHolder[info2] = {'SD/Image': info0, 'Acq' : info1}
total_Acq = prodHolder
print prodHolder

Results are:

{'356788': {'SD/Image': 'SD', 'Acq': 'Acq645467'}} ...

Every time the function runs, a different set of values is read in, so the result will differ.

So there are two questions here.

1) How to write the results into a file:

I'd use CSV (comma-separated values). Python has a great module for that (csv).

You can modify your code so that, while it reads from each file (as it already does), it also writes info0, info1 and info2 to a .csv file:

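import csv
import os

def prodInfo():
    from collections import Counter
    prodHolder = {}
    tempdict = {}
    try:
        os.chdir(copyProd)
        # Open the CSV once, before the walk: opening it with 'w' inside
        # the loop would truncate it again for every input file.
        # ('wb' is what the Python 2 csv module expects.)
        with open('./stack59.write.csv', 'wb') as fw:
            writer = csv.writer(fw)
            for root, dirs, files in os.walk('.'):
                for data in files:
                    if data == 'stack59.write.csv':
                        continue  # don't parse our own output file
                    fullpath = os.path.join(root, data)
                    with open(fullpath, 'r') as fp:
                        for info in fp:
                            # [ . . . ]
                            # Yadda yadda yadda
                        print info0 + info1 + info2
                        writer.writerow([info0, info1, info2])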

This will create a file stack59.write.csv looking like:

SD,Acq645467,356788
SD,Acq645467,356788
SD,Acq645467,356788
[ . . . ]
SD,Acq644247,356851
SD,Acq644247,356851
Image,Acq644247,356851
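
As a self-contained illustration of the csv module itself, here is a minimal sketch that writes a few rows and reads them back. It is written for Python 3 and uses an in-memory buffer instead of a real file; the sample rows and the helper names rows_to_csv and csv_to_rows are made up for this example and are not part of the code above.

```python
import csv
import io

# Hypothetical sample rows in the same (type, acquisition, id) layout.
rows = [
    ["SD", "Acq645467", "356788"],
    ["Image", "Acq645467", "356788"],
    ["SD", "Acq644869", "356849"],
]


def rows_to_csv(rows):
    """Serialize a list of rows to CSV text using csv.writer."""
    buf = io.StringIO()
    # Use a plain "\n" terminator so the text is easy to compare.
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


def csv_to_rows(text):
    """Parse CSV text back into a list of lists using csv.reader."""
    return list(csv.reader(io.StringIO(text)))


csv_text = rows_to_csv(rows)
print(csv_text)
```

Reading the text back with csv_to_rows(csv_text) gives the original list of rows, which is a quick way to check that nothing was lost on the way out.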

2) How to count common results:

For that, itertools.groupby would probably suit your needs. It only groups consecutive items that share a key, so the data has to be ordered (or sorted) by that key already. It is also worth reading up on how iterators work in Python.
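
A minimal sketch of how groupby behaves, using made-up rows rather than the real data: it yields one (key, group) pair per run of consecutive equal keys, so unsorted input produces split groups while sorted input produces one group per key. The helper name group_sizes is hypothetical, and the code is written for Python 3.

```python
import itertools
from operator import itemgetter

# Made-up (type, id) rows. The ids are interleaved on purpose.
rows = [("SD", "1"), ("Image", "2"), ("SD", "1"), ("SD", "2")]

by_id = itemgetter(1)


def group_sizes(rows, key):
    """Return [(key_value, group_length), ...] in the order groupby yields them."""
    return [(k, len(list(g))) for k, g in itertools.groupby(rows, key)]


# Without sorting, each run of equal ids becomes its own group.
unsorted_groups = group_sizes(rows, by_id)

# Sorting by the same key first gives one group per id.
sorted_groups = group_sizes(sorted(rows, key=by_id), by_id)

print(unsorted_groups)
print(sorted_groups)
```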

First, I'd store the data in a matrix (a list of lists):

def prodInfo():
    from collections import Counter
    prodHolder = {}
    tempdict = {}
    data_matrix = []   # NEW !
    try:
        os.chdir(copyProd)
        for root, dirs, files in os.walk('.'):
            for data in files:
                # [ . . . ]
                # Yadda, yadda, yadda...
                print info0 + info1 + info2
                data_matrix.append([info0, info1, info2])  # NEW!

And then you can group your data_matrix as you please. For instance:

import itertools

# First, group by picture id (356788, 356849...), which is
# the third column of the data
for pic_id, id_rows in itertools.groupby(data_matrix, lambda x: x[2]):
    # Now, within those groups, group by type, the first column
    # of the data (SD, Image...)
    for prod_type, type_rows in itertools.groupby(id_rows, lambda y: y[0]):
        print "%s: %s %s" % (pic_id, len(list(type_rows)), prod_type)
    print ''

Which outputs:

356788: 9 SD
356788: 4 Image

356849: 2 SD
356849: 1 Image

356851: 2 SD
356851: 1 Image
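
Since the question asks about a dictionary and already imports Counter, an alternative is to keep one Counter per ID inside a dictionary. This is only a sketch under assumptions: it is written for Python 3, the sample rows are a shortened stand-in for data_matrix, and the helper name count_types_by_id is made up. Unlike groupby, this approach does not care about the order of the rows.

```python
from collections import Counter, defaultdict

# Shortened stand-in for data_matrix: [type, acquisition, id] per row.
# The rows for 356851 are interleaved on purpose.
data_matrix = [
    ["SD", "Acq644869", "356849"],
    ["SD", "Acq644869", "356849"],
    ["Image", "Acq644869", "356849"],
    ["SD", "Acq644247", "356851"],
    ["Image", "Acq644247", "356851"],
    ["SD", "Acq644247", "356851"],
]


def count_types_by_id(rows):
    """Map each id to a Counter of the product types seen for that id."""
    counts = defaultdict(Counter)
    for prod_type, _acq, id_num in rows:
        counts[id_num][prod_type] += 1
    return counts


counts = count_types_by_id(data_matrix)
for id_num in sorted(counts):
    per_type = counts[id_num]
    # A Counter returns 0 for types that never occurred for this id.
    print("%d - SD / %d - Image for %s" % (per_type["SD"], per_type["Image"], id_num))
```

The resulting structure is a plain mapping such as counts['356849']['SD'], so it can be stored, printed or extended with further types without changing the loop.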
