Fast default alias lookup in Python

Question

Is there a faster way to do the following for much larger dicts?

aliases = {
    'United States': 'USA',
    'United Kingdom': 'UK',
    'Russia': 'RUS',
}
if countryname in aliases:
    countryname = aliases[countryname]

Answer 1

Your solution is fine, as "in" is O(1) on average for dictionaries.

You could do something like this to save some typing:

countryname = aliases.get(countryname, countryname)

(But I find your code a lot easier to read than that.)

When it comes to speed, which solution is best depends on whether most lookups are "hits" or "misses". Either way, the difference would probably be in the nanosecond range.
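
If you want to check the hit/miss difference on your own machine, a rough sketch with the standard timeit module could look like this. The helper names (with_in, with_get, time_call) and the sample keys are made up for illustration, and the absolute numbers will vary between machines and Python versions.

```python
import timeit

aliases = {"United States": "USA", "United Kingdom": "UK", "Russia": "RUS"}


def with_in(name):
    # Membership test first, then a second lookup on a hit.
    if name in aliases:
        name = aliases[name]
    return name


def with_get(name):
    # Single lookup; falls back to the original name on a miss.
    return aliases.get(name, name)


def time_call(func, name, number=200_000):
    # Total seconds for `number` calls of func(name).
    return timeit.timeit(lambda: func(name), number=number)


if __name__ == "__main__":
    for label, name in (("hit", "Russia"), ("miss", "France")):
        print(label, "in:", time_call(with_in, name), "get:", time_call(with_get, name))
```

Both functions return the same result for every input; only the timing differs.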

Answer 2

If your list fits in memory, dicts are the fastest way to go. As S.Mark points out, you are doing two lookups where one will do, either with:

countryname = aliases.get(countryname, countryname)

(which will leave countryname unchanged if it isn't in the dictionary), or:

try:
    countryname = aliases[countryname]
except KeyError:
    pass
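
For comparison, here are the two single-lookup forms wrapped in small helper functions. The function names are hypothetical and only serve to show that both variants behave the same on a hit and on a miss.

```python
aliases = {"United States": "USA", "United Kingdom": "UK", "Russia": "RUS"}


def resolve_get(name):
    # One lookup; the second argument is returned when the key is missing.
    return aliases.get(name, name)


def resolve_try(name):
    # One lookup; a missing key raises KeyError and the name is kept as-is.
    try:
        return aliases[name]
    except KeyError:
        return name


for name in ("United Kingdom", "France"):
    assert resolve_get(name) == resolve_try(name)
```

The try/except form tends to suit data where most names are in the dict, since raising and catching an exception on every miss has its own cost.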

Answer 3

Accessing the value with .get could be faster than checking for the key and then assigning to the variable:

aliases.get(countryname)

If countryname does not exist in aliases, this returns None rather than the original name.
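
A short illustration of the difference between calling .get with and without a default, using a sample name ("France") that is not in the dict:

```python
aliases = {"United States": "USA", "United Kingdom": "UK", "Russia": "RUS"}

countryname = "France"

# Without a default, a missing key gives None.
print(aliases.get(countryname))               # None

# With the name itself as the default, a missing key leaves it unchanged.
print(aliases.get(countryname, countryname))  # France
```

So to reproduce the behaviour in the question, the name has to be passed as the second argument.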

Answer 4

If your dictionary is very large and you expect many of your checks not to find a match, then you might want to consider a Bloom filter or one of its derivatives, and accept false positives.
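
To make the Bloom filter idea concrete, here is a minimal pure-Python sketch. The BloomFilter class, its default sizes, and the resolve helper are all assumptions for illustration, not a tuned implementation. For an in-memory dict like the one in the question, this will almost certainly be slower than the lookup it guards; the idea only pays off when the real lookup is expensive (for example, on disk or over a network).

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a SHA-256 hash.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


aliases = {"United States": "USA", "United Kingdom": "UK", "Russia": "RUS"}
bloom = BloomFilter()
for key in aliases:
    bloom.add(key)


def resolve(name):
    # A negative answer from the filter is definite, so skip the real lookup.
    if name not in bloom:
        return name
    # A positive answer may be false, so the dict still has the final say.
    return aliases.get(name, name)
```

Because the dict is consulted on every positive, a false positive costs only a wasted lookup and never a wrong answer.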

Alternatively, because your keys can be sorted (and/or have derived values), you could implement a bisection (binary) search or a similar search algorithm.
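
A bisection lookup over sorted keys can be sketched with the standard bisect module. The parallel keys/values lists and the resolve_bisect name are assumptions for illustration. Note that this is O(log n) per lookup, whereas a dict is O(1) on average, so it is mainly of interest when the data cannot live in a dict.

```python
import bisect

aliases = {"United States": "USA", "United Kingdom": "UK", "Russia": "RUS"}

# Two parallel lists, sorted by key, so positions line up.
pairs = sorted(aliases.items())
keys = [k for k, _ in pairs]
values = [v for _, v in pairs]


def resolve_bisect(name):
    # bisect_left gives the index where name would be inserted;
    # it is a hit only if that slot already holds exactly name.
    i = bisect.bisect_left(keys, name)
    if i < len(keys) and keys[i] == name:
        return values[i]
    return name
```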

First, I'd figure out exactly how Python implements dictionary look-ups, so you are not just re-inventing the wheel.

Also, a pure-Python implementation of these could be quite slow if it involves a lot of iteration. Consider Cython, NumPy, or F2PY to get truly optimized.

If you are dealing with just country names, then I don't think you are dealing with mappings large enough to warrant any of my suggestions. But if you are looking at doing some kind of spell-check implementation, then...

Good luck.

4 answers

Answer 1: 6, accepted, 2010-01-24 13:39:24
Answer 2: 2, 2010-01-24 13:39:19
Answer 3: 1, 2010-01-24 13:33:42
Answer 4: 0, 2010-01-24 16:14:31
