
How to iterate over a dictionary and operate with its elements?

I have this dictionary, where the keys represent atom types and the values represent the atomic masses:

mass = {'H': 1.007825, 'C': 12.01, 'O': 15.9994, 'N': 14.0067, 'S': 31.972071,
        'P': 30.973762}

What I want to do is create a function that, given a molecule such as ('H2-N-C6-H4-CO-2H'), iterates over the mass dictionary and calculates the total atomic mass of the molecule. The mass of each atom must be multiplied by the number that comes right after the atom type: H2 = H.value * 2

I know that I must first isolate the components of the given molecule; for this I could use string.split('-'). Then I think I could use an if block to check whether each key of the given molecule is in the dictionary. After that, I'm lost about how to find the mass for each key in the dictionary.

The expected result should be something like:

mass_counter('H2-N15-P3')

out[0] 39351.14

How could I do this?

EDIT:

This is what I've tried so far

# Atomic masses
mass = {'H': 1.007825, 'C': 12.01, 'O': 15.9994, 'N': 14.0067, 'S': 31.972071, 
        'P': 30.973762}

def calculate_atomic_mass(molecule):
    """
    Calculate the atomic mass of a given molecule
    """
    mass = 0.0
    mol = molecule.split('-')

    for key in mass:
        if key in mol:
            atom = key

    return mass

print calculate_atomic_mass('H2-O')
print calculate_atomic_mass('H2-S-O4')
print calculate_atomic_mass('C2-H5-O-H')
print calculate_atomic_mass('H2-N-C6-H4-C-O-2H')

Given that all components have the shape Aa123, it might be easier to identify the parts with a regex, for example:

import re
srch = re.compile(r'([A-Za-z]+)(\d*)')
mass = {'H': 1.007825, 'C': 12.01, 'O': 15.9994, 'N': 14.0067, 'S': 31.972071, 'P': 30.973762}

def calculate_atomic_mass(molecule):
    return sum(mass[a[1]]*int(a[2] or '1') for a in srch.finditer(molecule))

Here the regular expression captures a sequence of letters ([A-Za-z]+) and a (possibly empty) sequence of digits (\d*). These are the first and second capture groups respectively, so for a match a they can be obtained with a[1] and a[2].

This then yields:

>>> print(calculate_atomic_mass('H2-O'))
18.01505
>>> print(calculate_atomic_mass('H2-S-O4'))
97.985321
>>> print(calculate_atomic_mass('C2-H5-O-H'))
46.06635
>>> print(calculate_atomic_mass('H2-N-C6-H4-C-O-2H'))
121.130875
>>> print(calculate_atomic_mass('H2-N15-P3'))
305.037436

We thus take the sum of the mass[..] of the first capture group (the name of the atom) times the number at the end, and we use '1' in case no such number can be found.
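
To make the two capture groups concrete, here is a small sketch that lists what the pattern extracts before any masses are looked up. The helper name parse_components is only illustrative and is not part of the answer above.

```python
import re

# Same pattern as in the answer above: letters, then optional digits
srch = re.compile(r'([A-Za-z]+)(\d*)')

def parse_components(molecule):
    """Return a list of (element, count) pairs found in the string."""
    # A missing count (empty second group) is treated as 1
    return [(m.group(1), int(m.group(2) or '1')) for m in srch.finditer(molecule)]

print(parse_components('H2-N15-P3'))
# [('H', 2), ('N', 15), ('P', 3)]
print(parse_components('H2-N-C6-H4-C-O-2H'))
# [('H', 2), ('N', 1), ('C', 6), ('H', 4), ('C', 1), ('O', 1), ('H', 1)]
```

Note that in the last component, 2H, the leading 2 is not part of any match, so it counts as a single H.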

Or we can first split the data, and then look for an atom part and a number part:

import re
srch = re.compile(r'^([A-Za-z]+)(\d*)$')

def calculate_atomic_mass(molecule):
    """
    Calculate the atomic mass of a given molecule
    """
    result = 0.0
    mol = molecule.split('-')
    for atm in mol:
        # match the whole component: element name, then optional count
        c = srch.match(atm)
        # mass is the dictionary of atomic masses defined above
        result += mass[c[1]] * int(c[2] or '1')
    return result
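
With the anchored pattern (^...$), a component that does not have the letters-then-digits shape, such as the 2H at the end of the question's last example, produces no match at all. A minimal sketch of one way to handle that is to raise an error naming the offending component. The function name calculate_atomic_mass_strict and the error message are assumptions made for illustration.

```python
import re

mass = {'H': 1.007825, 'C': 12.01, 'O': 15.9994, 'N': 14.0067,
        'S': 31.972071, 'P': 30.973762}
srch = re.compile(r'^([A-Za-z]+)(\d*)$')

def calculate_atomic_mass_strict(molecule):
    """Sum the masses, rejecting components that cannot be parsed."""
    result = 0.0
    for atm in molecule.split('-'):
        c = srch.match(atm)
        # No match (e.g. '2H') or an element missing from the dictionary
        if c is None or c.group(1) not in mass:
            raise ValueError('unrecognised component: %r' % atm)
        result += mass[c.group(1)] * int(c.group(2) or '1')
    return result

print(calculate_atomic_mass_strict('H2-O'))
try:
    calculate_atomic_mass_strict('H2-N-C6-H4-C-O-2H')
except ValueError as err:
    print(err)  # unrecognised component: '2H'
```

Whether a malformed component should raise or be skipped depends on the data; raising makes typos in the input visible instead of silently changing the total.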

Here is an answer without regex:

import string
# Atomic masses
masses = {'H': 1.007825, 'C': 12.01, 'O': 15.9994, 'N': 14.0067, 'S': 31.972071, 
        'P': 30.973762}

def calculate_atomic_mass(molecule):
    """
    Calculate the atomic mass of a given molecule
    """
    mass = 0.0
    for key in molecule.split('-'):
        # check if any number is available
        if key[-1] not in string.digits:
            el, n = key, 1
        # check length of element label (1 or 2)
        elif key[1] in string.digits:
            el, n = key[:1], int(key[1:])
        else:
            el, n = key[:2], int(key[2:])
        mass += masses[el]*n
    return mass

print(calculate_atomic_mass('H2-O'))
print(calculate_atomic_mass('H2-S-O4'))
print(calculate_atomic_mass('C2-H5-O-H'))
print(calculate_atomic_mass('H2-N-C6-H4-C-O-H2'))

Here's how I would do it. You don't really need to iterate over the dictionary. Instead, you need to iterate over the atom(s) in the molecule and look each one up directly in the dictionary.

Here's an example of doing that, which assumes that there will never be more than 9 atoms of any kind in a single component (the count is a single digit) and that each element's name is only one letter long.

# Atomic masses.
MASSES = {'H': 1.007825, 'C': 12.01, 'O': 15.9994, 'N': 14.0067, 'S': 31.972071,
          'P': 30.973762}

def calculate_atomic_mass(molecule):
    """ Calculate the atomic mass of a given molecule. """
    mass = 0.0
    for atom in molecule.split('-'):
        if len(atom) == 1:
            mass += MASSES[atom]
        else:
            atom, count = atom[0], atom[1]
            mass += MASSES[atom] * int(count)

    return mass

print(calculate_atomic_mass('H2-O'))               # -> 18.01505
print(calculate_atomic_mass('H2-S-O4'))            # -> 97.985321
print(calculate_atomic_mass('C2-H5-O-H'))          # -> 46.06635
print(calculate_atomic_mass('H2-N-C6-H4-C-O-H2'))  # -> 122.1387
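
If the single-digit and one-letter assumptions are too tight (the question's own example uses N15), one possible way to relax them without a regex is to strip the trailing digits from each component. This is only a sketch: split_component and calculate_atomic_mass_general are hypothetical names, and a two-letter symbol would still need its own entry in MASSES.

```python
import string

# Atomic masses.
MASSES = {'H': 1.007825, 'C': 12.01, 'O': 15.9994, 'N': 14.0067,
          'S': 31.972071, 'P': 30.973762}

def split_component(component):
    """Split 'N15' into ('N', 15); a missing count means 1."""
    # Everything left after removing trailing digits is the element name
    element = component.rstrip(string.digits)
    digits = component[len(element):]
    return element, int(digits) if digits else 1

def calculate_atomic_mass_general(molecule):
    """Sum the masses for components with counts of any length."""
    total = 0.0
    for component in molecule.split('-'):
        element, count = split_component(component)
        total += MASSES[element] * count
    return total

print(split_component('N15'))  # ('N', 15)
print(calculate_atomic_mass_general('H2-N15-P3'))
```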
