Case-sensitive search in Django, but ignored in MySQL

I have a field in a Django model for storing a unique (hash) value. It turns out that the database (MySQL/InnoDB) doesn't do a case-sensitive search on this type (VARCHAR), not even if I explicitly tell Django to do a case-sensitive search with Document.objects.get(hash__exact="abcd123"). So "abcd123" and "ABcd123" both match, which I don't want.

from django.db import models

class Document(models.Model):
    filename = models.CharField(max_length=120)
    hash = models.CharField(max_length=33)

I can change the hash field to a BinaryField, so that in the DB it becomes a LONGBLOB, and then the search is case-sensitive (and works). However, this doesn't seem very efficient to me. Is there a better way (in Django) to do this, like adding 'utf8 COLLATE', or what would be the correct field type in this situation? (Yes, I know I could use PostgreSQL instead.)

The default collation for MySQL's default character set (latin1, in versions before 8.0) is latin1_swedish_ci, which is case-insensitive. Not sure why that is. But you should create your database like so:

CREATE DATABASE database_name CHARACTER SET utf8 COLLATE utf8_bin;

As @dan-klasson mentioned, non-binary string comparison is case-insensitive by default; notice the _ci at the end of latin1_swedish_ci, which stands for case-insensitive. You can, as Dan mentioned, create the database with a case-sensitive collation and character set.
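To make the suffix convention concrete, here is a minimal Python sketch that reads the case behaviour off a collation name. The function name collation_case_sensitivity is made up for this illustration, and it only looks at the name; it does not ask the server anything.

```python
def collation_case_sensitivity(collation: str) -> str:
    """Classify a MySQL collation by its naming suffix (hypothetical helper)."""
    name = collation.lower()
    # Binary collations compare code/byte values, so case always matters.
    if name == "binary" or name.endswith("_bin"):
        return "binary"
    if name.endswith("_cs"):
        return "case-sensitive"
    if name.endswith("_ci"):
        return "case-insensitive"
    return "unknown"


if __name__ == "__main__":
    for name in ("latin1_swedish_ci", "latin1_general_cs", "utf8_bin"):
        print(name, "->", collation_case_sensitivity(name))
```

For a hash column, a _bin collation is usually the simplest choice, since the values only need to be compared for exact equality.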

You may also be interested to know that you can always create a single table, or even set only a single column, to use a different collation (for the same result). You may also change these collations after creation, for instance per table:

ALTER TABLE documents__document CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin;
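For the single-column case, the statement needs to restate the full column definition. Below is a rough Python sketch that assembles such a statement; the helper name column_collation_sql is invented for this example, and the VARCHAR(33) NOT NULL definition is an assumption based on the CharField(max_length=33) in the question. Check it against your actual schema before running the output.

```python
def column_collation_sql(table, column, column_type, charset, collation, null=False):
    """Build an ALTER TABLE ... MODIFY statement that changes one column's collation.

    Hypothetical helper: MODIFY replaces the whole column definition, so the
    type and nullability must be repeated exactly as they are today.
    """
    def quote(identifier):
        # Backtick-quote identifiers, doubling any embedded backticks.
        return "`" + identifier.replace("`", "``") + "`"

    nullability = "NULL" if null else "NOT NULL"
    return (
        f"ALTER TABLE {quote(table)} MODIFY {quote(column)} {column_type} "
        f"CHARACTER SET {charset} COLLATE {collation} {nullability};"
    )


if __name__ == "__main__":
    # Table and column names are taken from the examples in this thread.
    print(column_collation_sql("documents__document", "hash", "VARCHAR(33)", "utf8", "utf8_bin"))
```

Changing only the hash column leaves the collation of every other column in the table as it was.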

Additionally, if you would rather not change the database/table charset/collation, Django allows you to run a custom query using the raw method. So you may be able to work around the change by using something like the following, though I have not tested this myself:

# latin1_bin assumes a latin1 column; use the _bin collation of your column's character set
Document.objects.raw("SELECT * FROM documents__document WHERE hash COLLATE latin1_bin = %s", ['abcd123'])
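The idea of overriding the collation inside a single query can be tried out without a MySQL server. The sketch below uses Python's built-in sqlite3 module as a stand-in, which is an assumption of this example and not part of the answer above: SQLite's NOCASE collation plays the role of a _ci collation and BINARY plays the role of a _bin one. The table layout mirrors the model from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The hash column is declared case-insensitive, like a *_ci column in MySQL.
conn.execute("CREATE TABLE document (filename TEXT, hash TEXT COLLATE NOCASE)")
conn.executemany(
    "INSERT INTO document (filename, hash) VALUES (?, ?)",
    [("a.txt", "abcd123"), ("b.txt", "ABcd123")],
)


def find_hashes(sql, value):
    """Run a one-parameter query and return the matched hash values."""
    return [row[0] for row in conn.execute(sql, (value,))]


# Plain equality falls back to the column's collation, so both rows match.
insensitive = find_hashes(
    "SELECT hash FROM document WHERE hash = ? ORDER BY filename", "abcd123"
)

# An explicit COLLATE in the query overrides the column's collation.
sensitive = find_hashes(
    "SELECT hash FROM document WHERE hash = ? COLLATE BINARY ORDER BY filename",
    "abcd123",
)

print(insensitive)
print(sensitive)
```

The same pattern is what the raw query relies on in MySQL, only with MySQL's own collation names.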

You got the case-sensitive filter query wrong. It should be __exact, like this:

Document.objects.get(hash__exact="abcd123")

And for a case-insensitive filter query, you need to use __iexact.
