Python function about chemical formulas

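I have a CSV file containing the names of chemical compounds and some information about them. I need to add new columns holding each compound's formula, its molecular weight, and the number of H, C, N, O and S atoms in that formula. I am stuck on the atom-counting part. I have a function that does the counting, but I don't know how to combine it with the rest of the code and make it work.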

import pandas as pd
import copy
import re

df = pd.read_csv('AminoAcids.csv')

def countAtoms(string, counts=None):
    # Work on a copy of any counts passed in, so the caller's dict is not mutated.
    # (A mutable default such as dict={} is shared between calls and shadows the builtin.)
    curDict = copy.copy(counts) if counts is not None else {}

    # Each match is an element symbol plus an optional count, e.g. "C3", "N", "Cl2"
    for atom, number in re.findall(r"([A-Z][a-z]*)([0-9]*)", string):
        # A symbol with no number after it stands for a single atom
        curDict[atom] = curDict.get(atom, 0) + (int(number) if number else 1)
    return curDict

df["Formula"] = ['C3H7NO2', 'C6H14N4O2', 'C4H8N2O3', 'C4H7NO4',
'C3H7NO2S', 'C5H9NO4', 'C5H10N2O3', 'C2H5NO2', 'C6H9N3O2',
'C6H13NO2', 'C6H13NO2', 'C6H14N2O2', 'C5H11NO2S', 'C9H11NO2',
'C5H9NO2', 'C3H7NO3', 'C4H9NO3', 'C11H12N2O2', 'C9H11NO3', 'C5H11NO2']
df["Molecular Weight"] = ['89.09','174.2','132.12',
'133.1','121.16','147.13','146.14','75.07','155.15',
'131.17','131.17','146.19','149.21','165.19','115.13',
'105.09','119.12','204.22','181.19','117.15']
df["H"] = 0
df["C"] = 0
df["N"] = 0
df["O"] = 0
df["S"] = 0
df.to_csv("AminoAcids.csv", index=False)
print(df.to_string()) 

If I understand correctly, you should be able to use str.extract here:

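for element in ["H", "C", "N", "O", "S"]:
    # \d* (rather than \d+) also matches a symbol with no number after it,
    # e.g. the N in C3H7NO2, which stands for exactly one atom
    counts = df["Formula"].str.extract(rf"{element}(\d*)", expand=False)
    # "" means one atom; NaN means the element is absent from the formula
    df[element] = counts.replace("", "1").fillna("0").astype(int)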
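
The regex detail that matters here is that a formula omits the count when an element occurs only once (the N in C3H7NO2), so a pattern that demands at least one digit finds nothing for that element. Below is a minimal sketch of that behaviour using only the standard `re` module. The helper name `count_element` is made up for illustration, and it assumes flat formulas without parentheses or hydrates.

```python
import re

def count_element(formula, element):
    """Count atoms of `element` in a flat formula such as 'C3H7NO2S'."""
    # (?![a-z]) stops 'C' from matching the start of a two-letter symbol like 'Cl'.
    # (\d*) lets the count be missing, which is read as exactly one atom.
    pattern = re.escape(element) + r"(?![a-z])(\d*)"
    total = 0
    for digits in re.findall(pattern, formula):
        total += int(digits) if digits else 1
    return total

# A pattern that requires digits finds nothing for the single N:
print(re.search(r"N(\d+)", "C3H7NO2"))  # None

formula = "C3H7NO2S"
print({el: count_element(formula, el) for el in "HCNOS"})
# {'H': 7, 'C': 3, 'N': 1, 'O': 2, 'S': 1}
```

With a DataFrame, something like `df["Formula"].map(lambda f: count_element(f, "N"))` should fill one column at a time. An element that does not occur in the formula comes back as 0 rather than NaN.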

Here is another approach that gives a similar result:

df.join(df['Formula'].str.findall(r'([A-Z])(\d*)').map(dict).apply(pd.Series).replace('', 1))

Output:
        Formula Molecular Weight   C   H  N  O    S
0       C3H7NO2            89.09   3   7  1  2  NaN
1    C6H14N4O2             174.2   6  14  4  2  NaN
2      C4H8N2O3           132.12   4   8  2  3  NaN
3      C4H7NO4             133.1   4   7  1  4  NaN
4     C3H7NO2S            121.16   3   7  1  2  1.0
5       C5H9NO4           147.13   5   9  1  4  NaN
6     C5H10N2O3           146.14   5  10  2  3  NaN
7      C2H5NO2             75.07   2   5  1  2  NaN
8      C6H9N3O2           155.15   6   9  3  2  NaN
9      C6H13NO2           131.17   6  13  1  2  NaN
10     C6H13NO2           131.17   6  13  1  2  NaN
11   C6H14N2O2            146.19   6  14  2  2  NaN
12   C5H11NO2S            149.21   5  11  1  2  1.0
13     C9H11NO2           165.19   9  11  1  2  NaN
14     C5H9NO2            115.13   5   9  1  2  NaN
15      C3H7NO3           105.09   3   7  1  3  NaN
16     C4H9NO3            119.12   4   9  1  3  NaN
17  C11H12N2O2            204.22  11  12  2  2  NaN
18    C9H11NO3            181.19   9  11  1  3  NaN
19     C5H11NO2           117.15   5  11  1  2  NaN
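
Once the atoms are counted, the molecular weight column could also be derived from the formula instead of being typed in by hand. The following is only a sketch under stated assumptions: `ATOMIC_WEIGHT`, `count_atoms` and `molecular_weight` are illustrative names that do not appear in the question, the atomic weights are rounded standard values, and only the five elements from this thread are covered.

```python
import re
from collections import Counter

# Rounded standard atomic weights (assumed values; extend for other elements)
ATOMIC_WEIGHT = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}

def count_atoms(formula):
    """Return a Counter of element symbol -> atom count for a flat formula."""
    counts = Counter()
    # strip() tolerates stray whitespace around the formula
    for symbol, digits in re.findall(r"([A-Z][a-z]*)(\d*)", formula.strip()):
        counts[symbol] += int(digits) if digits else 1  # no digits means one atom
    return counts

def molecular_weight(formula):
    """Sum atomic weight times count over all elements, rounded to 2 decimals."""
    total = sum(ATOMIC_WEIGHT[s] * n for s, n in count_atoms(formula).items())
    return round(total, 2)

print(count_atoms("C3H7NO2S "))      # C: 3, H: 7, N: 1, O: 2, S: 1
print(molecular_weight("C3H7NO2"))   # 89.09
```

Because the weights are rounded, some results can differ from the hard-coded values in the question in the last digit. An element missing from `ATOMIC_WEIGHT` raises a `KeyError`, which is probably preferable to a silently wrong total. With pandas, `df["Formula"].map(molecular_weight)` should apply it to every row.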
