
Python: Sum each line with its values from a dict

dict = {'A': 71.07884,
    'B': 110,
    'C': 103.14484,
    'D': 115.08864,
    'E': 129.11552,
    'F': 147.1766,
    'G': 57.05196,
    'H': 137.1412
    }


def search_replace(search, replacement, searchstring):
    p = re.compile(search)
    searchstring = p.sub(replacement, searchstring)
    return (searchstring)


def main():
    with open(sys.argv[1]) as filetoread:
        lines = filetoread.readlines()
    file = ""

    for i in range(len(lines)):
        file += lines[i]

    file = search_replace('(?<=[BC])', ' ', file)

    letterlist = re.split(r'\s+', file)

    for j in range(len(letterlist)):
        print(letterlist[j])

if __name__ == '__main__':
    import sys
    import re
    main()

My program opens a file and splits its text of letters after every B or C.

The file looks like:

ABHHFBFEACEGDGDACBGHFEDDCAFEBHGFEBCFHHHGBAHGBCAFEEAABCHHGFEEEAEAGHHCF

Now I want to sum the letters of each resulting line, using their values from the dict.

For example:

AB = 181.07884
HHFB = 531.4590000000001

And so on.

I don't know how to start. Thanks a lot for all your answers.

Try to simplify things...

Given that you already have a string s and a dictionary d:

ctr = 0
temp = ''
for letter in s:
    ctr += d[letter]
    temp += letter
    if letter in 'BC':
        print(temp, ctr)
        ctr = 0
        temp = ''

In the case you supplied where:

s = "ABHHFBFEACEGDGDACBGHFEDDCAFEBHGFEBCFHHHGBAHGBCAFEEAABCHHGFEEEAEAGHHCF"
d = {'A': 71.07884,
'B': 110,
'C': 103.14484,
'D': 115.08864,
'E': 129.11552,
'F': 147.1766,
'G': 57.05196,
'H': 137.1412
}

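You get the results below, printed to the terminal. The tuple form is what Python 2 prints; Python 3 prints AB 181.07884 and so on. The trailing F is not printed, because no B or C follows it:

('AB', 181.07884)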
('HHFB', 531.4590000000001)
('FEAC', 450.5158)
('EGDGDAC', 647.6204)
('B', 110)
('GHFEDDC', 803.8074)
('AFEB', 457.37096)
('HGFEB', 580.4852800000001)
('C', 103.14484)
('FHHHGB', 725.6521600000001)
('AHGB', 375.272)
('C', 103.14484)
('AFEEAAB', 728.64416)
('C', 103.14484)
('HHGFEEEAEAGHHC', 1571.6099199999999)
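
The loop above can also be wrapped in a function that returns the pairs instead of printing them, and that keeps any letters left over after the last B or C. This is only a minimal sketch: the name segment_sums and its terminators parameter are made up for illustration and are not part of the original answer.

```python
d = {'A': 71.07884, 'B': 110, 'C': 103.14484, 'D': 115.08864,
     'E': 129.11552, 'F': 147.1766, 'G': 57.05196, 'H': 137.1412}


def segment_sums(s, d, terminators="BC"):
    """Return (segment, total) pairs; a segment ends right after a terminator letter.

    Hypothetical helper, not from the original answer.
    """
    results = []
    segment, total = "", 0
    for letter in s:
        segment += letter
        total += d[letter]
        if letter in terminators:
            results.append((segment, total))
            segment, total = "", 0
    if segment:
        # Letters after the last terminator (such as a final 'F') are kept too.
        results.append((segment, total))
    return results


for segment, total in segment_sums("ABHHFBF", d):
    print(segment, "=", total)
```

Returning a list rather than printing keeps duplicates such as the repeated C in their original order, which a dict keyed by segment would collapse.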

You already did most of the work! All that is missing is the sum for each substring.

Since a substring can occur more than once, I do the summation only once per distinct substring and store the result in a dict. I also renamed your letter-to-value dict to mydict, so that it no longer shadows the built-in dict:

snippets = {}
for snippet in letterlist:
    if snippet not in snippets:
        value = 0
        for s in snippet:
            value += mydict.get(s)
        snippets[snippet] = value
print(snippets)

That gives me an output of

{
'AB': 181.07884, 
'HHFB': 531.4590000000001, 
'FEAC': 450.5158, 
'EGDGDAC': 647.6204, 
'B': 110, 
'GHFEDDC': 803.8074, 
'AFEB': 457.37096, 
'HGFEB': 580.4852800000001, 
'C': 103.14484, 
'FHHHGB': 725.6521600000001, 
'AHGB': 375.272, 
'AFEEAAB': 728.64416, 
'HHGFEEEAEAGHHC': 1571.6099199999999, 
'F': 147.1766}
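
For reference, here is a self-contained sketch of the same idea that can be run without the rest of the script. It assumes the text has already been read into a string; the function name snippet_sums is hypothetical. It uses the question's split (a space inserted after every B or C) but calls str.split() with no arguments, which drops the empty string that a trailing newline would otherwise leave in letterlist.

```python
import re

mydict = {'A': 71.07884, 'B': 110, 'C': 103.14484, 'D': 115.08864,
          'E': 129.11552, 'F': 147.1766, 'G': 57.05196, 'H': 137.1412}


def snippet_sums(text, values):
    """Map each distinct snippet to the sum of its letter values (hypothetical helper)."""
    # Insert a space after every B or C, then split on whitespace.
    # str.split() with no arguments never yields empty strings.
    letterlist = re.sub(r"(?<=[BC])", " ", text).split()
    snippets = {}
    for snippet in letterlist:
        if snippet not in snippets:
            # Each distinct snippet is summed only once.
            snippets[snippet] = sum(values[letter] for letter in snippet)
    return snippets


print(snippet_sums("ABHHFBCF\n", mydict))
```

Indexing with values[letter] raises a KeyError on an unexpected character, which is usually easier to diagnose than the TypeError that adding the None returned by .get() would produce.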

Open your file and read it character by character; look each character up in the dictionary and add its value to your total.

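sum_ = 0
letters = "letters_file"
with open(letters, "r") as opened:
    for row in opened:
        # strip the trailing newline, which is not a key in the dictionary
        for char in row.strip():
            # no int() here: it would truncate the float values
            sum_ += your_dictionary[char]

print(sum_)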
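
If the file holds one group of letters per line, a total per line may be more useful than one grand total. The following is a rough sketch under that assumption; line_totals and the temporary demo file are invented here for illustration and are not part of the answer above.

```python
import os
import tempfile

values = {'A': 71.07884, 'B': 110, 'C': 103.14484, 'D': 115.08864,
          'E': 129.11552, 'F': 147.1766, 'G': 57.05196, 'H': 137.1412}


def line_totals(path, values):
    """Return one total per non-empty line of the file at `path` (hypothetical helper)."""
    totals = []
    with open(path) as handle:
        for row in handle:
            row = row.strip()  # drop the newline and surrounding whitespace
            if row:
                totals.append(sum(values[char] for char in row))
    return totals


# Demo on a throwaway file so that the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("AB\nHHFB\n")
    demo_path = tmp.name

demo_totals = line_totals(demo_path, values)
os.remove(demo_path)
print(demo_totals)
```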

You can use re.split with itertools.zip_longest in a dict comprehension:

import re
from itertools import zip_longest
i = iter(re.split('([BC])', s))
{w: sum(d[c] for c in w) for p in zip_longest(i, i, fillvalue='') for w in (''.join(p),)}

This returns:

{'AB': 181.07884, 'HHFB': 531.4590000000001, 'FEAC': 450.5158, 'EGDGDAC': 647.6204, 'B': 110, 'GHFEDDC': 803.8074, 'AFEB': 457.37096, 'HGFEB': 580.4852800000001, 'C': 103.14484, 'FHHHGB': 725.6521600000001, 'AHGB': 375.272, 'AFEEAAB': 728.64416, 'HHGFEEEAEAGHHC': 1571.6099199999999, 'F': 147.1766}
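
A possible alternative, shown only as a sketch, is to let re.findall cut out the segments directly. This swaps the split-and-pair technique above for a different pattern: "any run of other letters followed by one B or C, or a final run with no B or C". The function name sums_by_findall is hypothetical, and s is assumed to contain no whitespace.

```python
import re

d = {'A': 71.07884, 'B': 110, 'C': 103.14484, 'D': 115.08864,
     'E': 129.11552, 'F': 147.1766, 'G': 57.05196, 'H': 137.1412}


def sums_by_findall(s, d):
    """Map each segment of s to the sum of its letter values (hypothetical helper)."""
    # First alternative: a segment that ends in B or C.
    # Second alternative: leftover letters after the last B or C.
    segments = re.findall(r"[^BC]*[BC]|[^BC]+", s)
    return {w: sum(d[c] for c in w) for w in segments}


print(sums_by_findall("ABHHFBCF", d))
```

Because both alternatives consume at least one character, the result never contains an empty-string key, even when s ends in B or C.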

