Clean way to select from many options in Python
I work in data science, and a typical problem I encounter while cleaning up Pandas dataframes is converting columns from one string format to another. In particular, the strings are chemical identifiers, each of which represents a molecule in an obscure way, so they are not easily understandable just by looking at them. I have many small functions (inherited from a chemical library called RDKit) to convert between formats, and there is roughly one function per conversion pair (i.e. input format and output format). This is too many function names to remember. I want to write a wrapper function that aggregates all of them into a single, larger one with a clean design and user interface.
The question is: given an input and output format, what would be a clean way to select from the many possible small conversion functions? Should I use a dictionary that stores the small conversion function names?
For example, let's say I want to convert from the format "smiles" to the format "inchi keys", which I currently do as follows:
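from rdkit import Chem

def smile2inchikey(smile):
    # Parse the SMILES string into an RDKit molecule object
    mol = Chem.MolFromSmiles(smile)
    # Render the molecule as an InChIKey string
    inchikey = Chem.inchi.MolToInchiKey(mol)
    return inchikey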
Instead of manually calling smile2inchikey (or Chem.MolFromSmiles and Chem.inchi.MolToInchiKey), I would like to write the following function:
def fancy_multiconverter(input_string, input_format, output_format):
    pass
which converts input_string (given in the format input_format) to the format output_format.
Maybe this is also what @Quinten Cabo meant, but you could pick one reference unit and always convert to it first.
You could then use a dictionary with functions for converting into and from this unit:
# function1 and function2 are placeholders for the real conversion functions
convert_to_reference = {
    "format1": function1,
    "format2": function2,
}
convert_from_reference = {
    ...
}
reference = convert_to_reference[input("input format: ")](input("input value: "))
output = convert_from_reference[input("output format: ")](reference)
print(output)
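The idea above could be wrapped into the fancy_multiconverter signature from the question roughly as follows. This is only a minimal sketch under stated assumptions: to keep it runnable without RDKit, it uses number bases ("decimal", "hex", "binary") as stand-in formats and a plain int as the reference unit. The dictionary names TO_REFERENCE and FROM_REFERENCE and the error handling are illustrative choices, not part of the original answer. With RDKit, the reference would presumably be a molecule object and the dictionary values would be functions such as Chem.MolFromSmiles and Chem.inchi.MolToInchiKey.

```python
from typing import Callable, Dict

# Stand-in formats (assumption): the reference unit here is a plain int.
# With RDKit the entries might instead look like
#   TO_REFERENCE = {"smiles": Chem.MolFromSmiles}
#   FROM_REFERENCE = {"inchikey": Chem.inchi.MolToInchiKey}
TO_REFERENCE: Dict[str, Callable[[str], int]] = {
    "decimal": int,
    "hex": lambda s: int(s, 16),
    "binary": lambda s: int(s, 2),
}

FROM_REFERENCE: Dict[str, Callable[[int], str]] = {
    "decimal": str,
    "hex": lambda n: format(n, "x"),
    "binary": lambda n: format(n, "b"),
}


def fancy_multiconverter(input_string: str, input_format: str, output_format: str) -> str:
    """Convert input_string from input_format to output_format via the reference unit."""
    # Look up both converters first so an unknown format fails with a clear message
    if input_format not in TO_REFERENCE:
        raise ValueError(
            f"unknown input format {input_format!r}; expected one of {sorted(TO_REFERENCE)}"
        )
    if output_format not in FROM_REFERENCE:
        raise ValueError(
            f"unknown output format {output_format!r}; expected one of {sorted(FROM_REFERENCE)}"
        )
    reference = TO_REFERENCE[input_format](input_string)
    return FROM_REFERENCE[output_format](reference)


if __name__ == "__main__":
    print(fancy_multiconverter("255", "decimal", "hex"))
```

With N formats this needs at most 2N small functions (one into and one out of the reference unit) rather than one per conversion pair, and adding a format means adding dictionary entries rather than editing the wrapper.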