Python function about chemical formulas
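I have a CSV file containing the names of chemical substances and some information about them. What I need to do is add new columns and, for each substance, write its formula, its molecular weight, and the number of H, C, N, O, and S atoms. I am stuck on the atom-counting part. I have a function for it, but I don't know how to combine it with the rest and make the code work.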
import pandas as pd
import urllib.request
import copy
import re
df = pd.read_csv('AminoAcids.csv')
def countAtoms(string, dict={}):
    curDict = copy.copy(dict)
    atoms = re.findall("[A-Z]{1}[a-z]*[0-9]*", string)
    for j in atoms:
        atomGroups = re.match('([A-Z]{1}[a-z]*)([0-9]*)', j)
        atom = atomGroups.group(1)
        number = atomGroups.group(2)
        try:
            curDict[atom] = curDict[atom] + int(number)
        except KeyError:
            try:
                curDict[atom] = int(number)
            except ValueError:
                curDict[atom] = 1
        except ValueError:
            curDict[atom] = curDict[atom] + 1
    return curDict
df["Formula"] = ['C3H7NO2', 'C6H14N4O2 ','C4H8N2O3','C4H7NO4 ',
'C3H7NO2S ','C5H9NO4','C5H10N2O3','C2H5NO2 ','C6H9N3O2',
'C6H13NO2','C6H13NO2','C6H14N2O2 ','C5H11NO2S ','C9H11NO2',
'C5H9NO2 ','C3H7NO3','C4H9NO3 ','C11H12N2O2 ','C9H11NO3 ','C5H11NO2']
df["Molecular Weight"] = ['89.09','174.2','132.12',
'133.1','121.16','147.13','146.14','75.07','155.15',
'131.17','131.17','146.19','149.21','165.19','115.13',
'105.09','119.12','204.22','181.19','117.15']
df["H"] = 0
df["C"] = 0
df["N"] = 0
df["O"] = 0
df["S"] = 0
df.to_csv("AminoAcids.csv", index=False)
print(df.to_string())
If I understand correctly, you should be able to use str.extract here:
df["H"] = df["Formula"].str.extract(r'H(\d+)')
df["C"] = df["Formula"].str.extract(r'C(\d+)')
df["N"] = df["Formula"].str.extract(r'N(\d+)')
df["O"] = df["Formula"].str.extract(r'O(\d+)')
df["S"] = df["Formula"].str.extract(r'S(\d+)')
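Note that `(\d+)` needs at least one digit after the symbol, so an element written without a number (the single N in C3H7NO2, or a trailing S) comes back as NaN rather than 1. Here is a minimal sketch of one way around that. It assumes each element appears at most once per formula and that an absent element should count as 0; the three-row sample frame is made up for illustration.

```python
import pandas as pd

# Illustrative sample only; in the question this column comes from AminoAcids.csv.
df = pd.DataFrame({"Formula": ["C3H7NO2", "C3H7NO2S ", "C6H14N4O2 "]})

for element in ["H", "C", "N", "O", "S"]:
    # Match the symbol (not followed by a lowercase letter, so "C" does not
    # match "Cl") and capture the digits after it, which may be empty.
    pattern = rf"{element}(?![a-z])(\d*)"
    counts = df["Formula"].str.extract(pattern, expand=False)
    # NaN  -> element not present -> 0
    # ""   -> element present without a number -> 1
    df[element] = [0 if pd.isna(v) else int(v or 1) for v in counts]

print(df.to_string())
```

Only the first occurrence of each symbol is read, so a formula that repeats an element (CH3COOH, say) would need a different approach.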
Here is another approach that gives a similar result:
df.join(df['Formula'].str.findall(r'([A-Z])(\d*)').map(dict).apply(pd.Series).replace('', 1))
>>>
'''
       Formula Molecular Weight   C   H  N  O    S
0      C3H7NO2            89.09   3   7  1  2  NaN
1    C6H14N4O2            174.2   6  14  4  2  NaN
2     C4H8N2O3           132.12   4   8  2  3  NaN
3      C4H7NO4            133.1   4   7  1  4  NaN
4     C3H7NO2S           121.16   3   7  1  2  1.0
5      C5H9NO4           147.13   5   9  1  4  NaN
6    C5H10N2O3           146.14   5  10  2  3  NaN
7      C2H5NO2            75.07   2   5  1  2  NaN
8     C6H9N3O2           155.15   6   9  3  2  NaN
9     C6H13NO2           131.17   6  13  1  2  NaN
10    C6H13NO2           131.17   6  13  1  2  NaN
11   C6H14N2O2           146.19   6  14  2  2  NaN
12   C5H11NO2S           149.21   5  11  1  2  1.0
13    C9H11NO2           165.19   9  11  1  2  NaN
14     C5H9NO2           115.13   5   9  1  2  NaN
15     C3H7NO3           105.09   3   7  1  3  NaN
16     C4H9NO3           119.12   4   9  1  3  NaN
17  C11H12N2O2           204.22  11  12  2  2  NaN
18    C9H11NO3           181.19   9  11  1  3  NaN
19    C5H11NO2           117.15   5  11  1  2  NaN
'''
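If you would rather keep a counting function like the one in the question and fill the columns from it, something along these lines might work. It is only a sketch: `count_atoms` and `ELEMENTS` are names invented here, the sample frame is illustrative, and formulas with parentheses or hydrates are not handled.

```python
import re
from collections import Counter

import pandas as pd

ELEMENTS = ["H", "C", "N", "O", "S"]  # columns asked for in the question


def count_atoms(formula):
    """Count atoms per element in a flat formula such as 'C3H7NO2'."""
    counts = Counter()
    # Each match is (symbol, digits); empty digits mean a single atom.
    for symbol, number in re.findall(r"([A-Z][a-z]*)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


# Illustrative sample only; in the question this column comes from AminoAcids.csv.
df = pd.DataFrame({"Formula": ["C3H7NO2", "C3H7NO2S ", "C11H12N2O2 "]})

per_row = [count_atoms(f) for f in df["Formula"]]
for element in ELEMENTS:
    # A Counter returns 0 for a missing key, so absent elements become 0.
    df[element] = [c[element] for c in per_row]

print(df.to_string())
```

Because the counts are summed in a `Counter`, a symbol that occurs more than once in a formula is added up rather than overwritten, and the columns come out as integers instead of a mix of strings and NaN.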