Case insensitive dictionary

I'd like my dictionary to be case insensitive.

I have this example code:

text = "practice changing the color"

words = {'color': 'colour',
        'practice': 'practise'}

def replace(words, text):
    keys = words.keys()

    for i in keys:
        text = text.replace(i, words[i])
    return text

text = replace(words, text)

print(text)

Output = practise changing the colour

I'd like another string, "practice changing the Color" (where Color starts with a capital), to also give the same output.

I believe there is a general way to convert to lowercase using mydictionary[key.lower()], but I'm not sure how best to integrate this into my existing code (if this would be a reasonable, simple approach anyway).

The currently approved answer doesn't work for a lot of cases, so it cannot be used as a drop-in dict replacement. Some tricky points in getting a proper dict replacement:

  • overloading all of the methods that involve keys
  • properly handling non-string keys
  • properly handling the constructor of the class

The following should work much better:

class CaseInsensitiveDict(dict):
    @classmethod
    def _k(cls, key):
        return key.lower() if isinstance(key, basestring) else key

    def __init__(self, *args, **kwargs):
        super(CaseInsensitiveDict, self).__init__(*args, **kwargs)
        self._convert_keys()
    def __getitem__(self, key):
        return super(CaseInsensitiveDict, self).__getitem__(self.__class__._k(key))
    def __setitem__(self, key, value):
        super(CaseInsensitiveDict, self).__setitem__(self.__class__._k(key), value)
    def __delitem__(self, key):
        return super(CaseInsensitiveDict, self).__delitem__(self.__class__._k(key))
    def __contains__(self, key):
        return super(CaseInsensitiveDict, self).__contains__(self.__class__._k(key))
    def has_key(self, key):
        return super(CaseInsensitiveDict, self).has_key(self.__class__._k(key))
    def pop(self, key, *args, **kwargs):
        return super(CaseInsensitiveDict, self).pop(self.__class__._k(key), *args, **kwargs)
    def get(self, key, *args, **kwargs):
        return super(CaseInsensitiveDict, self).get(self.__class__._k(key), *args, **kwargs)
    def setdefault(self, key, *args, **kwargs):
        return super(CaseInsensitiveDict, self).setdefault(self.__class__._k(key), *args, **kwargs)
    def update(self, E={}, **F):
        super(CaseInsensitiveDict, self).update(self.__class__(E))
        super(CaseInsensitiveDict, self).update(self.__class__(**F))
    def _convert_keys(self):
        for k in list(self.keys()):
            v = super(CaseInsensitiveDict, self).pop(k)
            self.__setitem__(k, v)
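
The class above relies on Python 2 names (basestring, has_key). As a rough sketch of the same idea under Python 3, assuming only str keys need folding, something like the following might do; the name CaseInsensitiveDict3 is hypothetical and chosen only to avoid clashing with the class above:

```python
class CaseInsensitiveDict3(dict):
    """Python 3 sketch: string keys are lower-cased on every access."""

    @staticmethod
    def _k(key):
        # Non-string keys (ints, tuples, ...) pass through unchanged.
        return key.lower() if isinstance(key, str) else key

    def __init__(self, *args, **kwargs):
        super().__init__()
        self.update(*args, **kwargs)

    def __getitem__(self, key):
        return super().__getitem__(self._k(key))

    def __setitem__(self, key, value):
        super().__setitem__(self._k(key), value)

    def __delitem__(self, key):
        super().__delitem__(self._k(key))

    def __contains__(self, key):
        return super().__contains__(self._k(key))

    def get(self, key, default=None):
        return super().get(self._k(key), default)

    def pop(self, key, *args):
        return super().pop(self._k(key), *args)

    def setdefault(self, key, default=None):
        return super().setdefault(self._k(key), default)

    def update(self, *args, **kwargs):
        # Route every pair through __setitem__ so keys are normalised.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


d = CaseInsensitiveDict3({'Color': 'colour'}, Practice='practise')
d[42] = 'non-string key'
print(d['COLOR'], d.get('practice'), 'PRACTICE' in d, d[42])
# colour practise True non-string key
```

Note that this variant stores the lower-cased key, so the original spelling is lost.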

If I understand you correctly and you want a way to key dictionaries in a non case-sensitive fashion, one way would be to subclass dict and overload the setter / getter:

class CaseInsensitiveDict(dict):
    def __setitem__(self, key, value):
        super(CaseInsensitiveDict, self).__setitem__(key.lower(), value)

    def __getitem__(self, key):
        return super(CaseInsensitiveDict, self).__getitem__(key.lower())

Would you consider using string.lower() on your inputs and using a fully lowercase dictionary? It's a bit of a hacky solution, but it works.
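
A minimal sketch of that suggestion applied to the question's data; the helper name replace_lowercased is made up for illustration. The trade-off is that the whole output ends up in lower case:

```python
def replace_lowercased(words, text):
    # Fold both the lookup table and the input to lower case first.
    lowered = {key.lower(): value for key, value in words.items()}
    text = text.lower()
    for key, value in lowered.items():
        text = text.replace(key, value)
    return text


words = {'color': 'colour', 'practice': 'practise'}
result = replace_lowercased(words, "practice changing the Color")
print(result)  # practise changing the colour
```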

In my particular instance, I needed a case insensitive lookup; however, I did not want to modify the original case of the key. For example:

>>> d = {}
>>> d['MyConfig'] = 'value'
>>> d['myconfig'] = 'new_value'
>>> d
{'MyConfig': 'new_value'}

You can see that the dictionary still has the original key; however, it is accessible case-insensitively. Here's a simple solution:

class CaseInsensitiveKey(object):
    def __init__(self, key):
        self.key = key
    def __hash__(self):
        return hash(self.key.lower())
    def __eq__(self, other):
        return self.key.lower() == other.key.lower()
    def __str__(self):
        return self.key

The __hash__ and __eq__ overrides are required for both getting and setting entries in the dictionary. They make keys hash to the same position in the dictionary if they are case-insensitively equal.

Now either create a custom dictionary that initializes a CaseInsensitiveKey using the provided key:

class CaseInsensitiveDict(dict):
    def __setitem__(self, key, value):
        key = CaseInsensitiveKey(key)
        super(CaseInsensitiveDict, self).__setitem__(key, value)
    def __getitem__(self, key):
        key = CaseInsensitiveKey(key)
        return super(CaseInsensitiveDict, self).__getitem__(key)

or simply make sure to always pass an instance of CaseInsensitiveKey as the key when using the dictionary.
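
To illustrate the second option, here is a small sketch that uses a plain dict and wraps every key by hand. The wrapper class is repeated only so the snippet runs on its own:

```python
class CaseInsensitiveKey(object):
    # Same wrapper as above, repeated so this snippet is self-contained.
    def __init__(self, key):
        self.key = key
    def __hash__(self):
        return hash(self.key.lower())
    def __eq__(self, other):
        return self.key.lower() == other.key.lower()
    def __str__(self):
        return self.key


d = {}
d[CaseInsensitiveKey('MyConfig')] = 'value'
d[CaseInsensitiveKey('myconfig')] = 'new_value'   # replaces the value only

original_keys = [str(k) for k in d]
print(original_keys)                        # ['MyConfig']
print(d[CaseInsensitiveKey('MYCONFIG')])    # new_value
```

Because the second assignment finds an equal key already present, the dict keeps the first key object and only swaps the value, which is what preserves the original spelling.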

While a case insensitive dictionary is a solution, and there are answers on how to achieve that, there is a possibly easier way in this case. A case insensitive search is sufficient:

import re

text = "Practice changing the Color"
words = {'color': 'colour', 'practice': 'practise'}

def replace(words, text):
    keys = words.keys()
    for i in keys:
        # re.I makes the match ignore case
        exp = re.compile(i, re.I)
        text = exp.sub(words[i], text)
    return text

text = replace(words, text)
print(text)
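
A variation on the same idea, offered as a sketch rather than a definitive version: it escapes the keys with re.escape so that regex metacharacters in a key are treated literally, and it does all replacements in one pass. The function name replace_ignorecase is hypothetical, and the sketch assumes words is non-empty:

```python
import re


def replace_ignorecase(words, text):
    # Lower-case the table so a match can be looked up regardless of its case.
    lowered = {key.lower(): value for key, value in words.items()}
    # Longer keys first, so a key that is a prefix of another cannot shadow it.
    ordered = sorted(lowered, key=len, reverse=True)
    # re.escape guards against keys containing regex metacharacters.
    pattern = re.compile('|'.join(re.escape(k) for k in ordered), re.IGNORECASE)
    return pattern.sub(lambda m: lowered[m.group(0).lower()], text)


words = {'color': 'colour', 'practice': 'practise'}
print(replace_ignorecase(words, "Practice changing the Color"))
# practise changing the colour
```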

I've modified the simple yet good solution by pleasemorebacon (thanks!), making it slightly more compact and self-contained, with minor updates to allow construction from {'a':1, 'B':2} and to support the __contains__ protocol. Finally, since CaseInsensitiveDict.Key is expected to be a string (what else can be case-sensitive or not?), it is a good idea to derive the Key class from str; it is then possible, for instance, to dump a CaseInsensitiveDict with json.dumps out of the box.

# caseinsensitivedict.py
class CaseInsensitiveDict(dict):

    class Key(str):
        def __init__(self, key):
            str.__init__(key)
        def __hash__(self):
            return hash(self.lower())
        def __eq__(self, other):
            return self.lower() == other.lower()

    def __init__(self, data=None):
        super(CaseInsensitiveDict, self).__init__()
        if data is None:
            data = {}
        for key, val in data.items():
            self[key] = val
    def __contains__(self, key):
        key = self.Key(key)
        return super(CaseInsensitiveDict, self).__contains__(key)
    def __setitem__(self, key, value):
        key = self.Key(key)
        super(CaseInsensitiveDict, self).__setitem__(key, value)
    def __getitem__(self, key):
        key = self.Key(key)
        return super(CaseInsensitiveDict, self).__getitem__(key)

Here is a basic test script for those who like to check things in action:

# test_CaseInsensitiveDict.py
import json
import unittest
from caseinsensitivedict import *

class Key(unittest.TestCase):
    def setUp(self):
        self.Key = CaseInsensitiveDict.Key
        self.lower = self.Key('a')
        self.upper = self.Key('A')

    def test_eq(self):
        self.assertEqual(self.lower, self.upper)

    def test_hash(self):
        self.assertEqual(hash(self.lower), hash(self.upper))

    def test_str(self):
        self.assertEqual(str(self.lower), 'a')
        self.assertEqual(str(self.upper), 'A')

class Dict(unittest.TestCase):
    def setUp(self):
        self.Dict = CaseInsensitiveDict
        self.d1 = self.Dict()
        self.d2 = self.Dict()
        self.d1['a'] = 1
        self.d1['B'] = 2
        self.d2['A'] = 1
        self.d2['b'] = 2

    def test_contains(self):
        self.assertIn('B', self.d1)
        d = self.Dict({'a':1, 'B':2})
        self.assertIn('b', d)

    def test_init(self):
        d = self.Dict()
        self.assertFalse(d)
        d = self.Dict({'a':1, 'B':2})
        self.assertTrue(d)

    def test_items(self):
        self.assertDictEqual(self.d1, self.d2)
        self.assertEqual(
            [v for v in self.d1.items()],
            [v for v in self.d2.items()])

    def test_json_dumps(self):
        s = json.dumps(self.d1)
        self.assertIn('a', s)
        self.assertIn('B', s)

    def test_keys(self):
        self.assertEqual(self.d1.keys(), self.d2.keys())

    def test_values(self):
        self.assertEqual(
            [v for v in self.d1.values()],
            [v for v in self.d2.values()])

You can do a case insensitive dict key search with a one-liner:

>>> input_dict = {'aBc':1, 'xyZ':2}
>>> search_string = 'ABC'
>>> next((value for key, value in input_dict.items() if key.lower()==search_string.lower()), None)
1
>>> search_string = 'EFG'
>>> next((value for key, value in input_dict.items() if key.lower()==search_string.lower()), None)
>>>

You can place that into a function:


def get_case_insensitive_key_value(input_dict, key):
    return next((value for dict_key, value in input_dict.items() if dict_key.lower() == key.lower()), None)


Note that only the first match is returned.
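
A short sketch of what "first match" means when two keys differ only by case. The dict named ambiguous is made up for illustration, and the result relies on dicts preserving insertion order (Python 3.7+):

```python
def get_case_insensitive_key_value(input_dict, key):
    # Same helper as above, repeated so this snippet runs on its own.
    return next((value for dict_key, value in input_dict.items()
                 if dict_key.lower() == key.lower()), None)


# Two keys that differ only by case; the one inserted first wins.
ambiguous = {'abc': 1, 'ABC': 2}
print(get_case_insensitive_key_value(ambiguous, 'Abc'))   # 1
print(get_case_insensitive_key_value(ambiguous, 'xyz'))   # None
```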

If you only need to do this once in your code (hence, no point to a function), the most straightforward way to deal with the problem is this:

lowercase_dict = {key.lower(): value for (key, value) in original_dict.items()}

I'm assuming here that the dict in question isn't all that large. It might be inelegant to duplicate it, but if it's not large, it isn't going to hurt anything.

The advantage of this over @Fred's answer (though that also works) is that it produces the same result as a dict when the key isn't present: a KeyError.
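
A brief sketch of that behaviour, with made-up sample data: a lookup in the lower-cased copy succeeds for any spelling once the search string is lower-cased too, and a missing key raises KeyError just as a plain dict would:

```python
original_dict = {'aBc': 1, 'xyZ': 2}
lowercase_dict = {key.lower(): value for key, value in original_dict.items()}

print(lowercase_dict['ABC'.lower()])   # 1

try:
    lowercase_dict['EFG'.lower()]
    missing_raises = False
except KeyError:
    # A missing key fails loudly instead of silently yielding a default.
    missing_raises = True
print(missing_raises)   # True
```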

There are multiple approaches to this problem, and each has its own set of pros and cons. Just to add to the list (it looks like this option wasn't mentioned), it's possible to extend the str class and use it as a key:

class CaseInsensitiveStr(str):
    def __hash__(self) -> 'int':
        return hash(self.lower())
    def __eq__(self, other:'str') -> 'bool':
        return self.lower() == other.lower()

It can work well if the dictionary in question is private and some kind of interface is used to access it.

class MyThing:
    def __init__(self):
        self._d: 'dict[CaseInsensitiveStr, int]' = dict()
    def set(self, key:'str', value:'int'):
        self._d[CaseInsensitiveStr(key)] = value
    def get(self, key:'str') -> 'int':
        return self._d[CaseInsensitiveStr(key)]
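
A possible usage sketch for the wrapper above. Both classes are repeated in shortened form (type annotations dropped) so the snippet runs on its own, and the sample keys are made up:

```python
class CaseInsensitiveStr(str):
    # Same key type as above, repeated so this snippet is self-contained.
    def __hash__(self):
        return hash(self.lower())
    def __eq__(self, other):
        return self.lower() == other.lower()


class MyThing:
    def __init__(self):
        self._d = {}
    def set(self, key, value):
        self._d[CaseInsensitiveStr(key)] = value
    def get(self, key):
        return self._d[CaseInsensitiveStr(key)]


thing = MyThing()
thing.set('Color', 1)
thing.set('COLOR', 2)          # same entry, value replaced
print(thing.get('color'))      # 2
print(len(thing._d))           # 1
print(str(next(iter(thing._d))))   # Color
```

The first spelling of the key is the one that stays in the underlying dict, so callers only ever see case insensitive behaviour through set and get.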

I just set up a function to handle this:

def setLCdict(d, k, v):
    k = k.lower()
    d[k] = v
    return d

myDict = {}

So instead of

myDict['A'] = 1
myDict['B'] = 2

You can:

myDict = setLCdict(myDict, 'A', 1)
myDict = setLCdict(myDict, 'B', 2)

You can then either lowercase the key before looking it up or write a function to do so.

    def lookupLCdict(d, k):
        k = k.lower()
        return d[k]

    myVal = lookupLCdict(myDict, 'a')

Probably not ideal if you want to do this globally, but it works well if it's just a subset you wish to use it for.
