简体   繁体   中英

Unique model field in Django and case sensitivity (postgres)

Consider the following situation: -

Suppose my app allows users to create the states / provinces in their country. Just for clarity, we are considering only ASCII characters here.

In the US, a user could create the state called "Texas". If this app is being used internally, let's say the user doesn't care if it is spelled "texas" or "Texas" or "teXas"

But importantly, the system should prevent creation of "texas" if "Texas" is already in the database.

If the model is like the following:

class State(models.Model):
    name = models.CharField(max_length=50, unique=True)

The uniqueness would be case-sensitive in postgres; that is, postgres would allow the user to create both "texas" and "Texas" as they are considered unique.

What can be done in this situation to prevent such behavior. How does one go about providing case- insenstitive uniqueness with Django and Postgres

Right now I'm doing the following to prevent creation of case- insensitive duplicates.

class CreateStateForm(forms.ModelForm):
    def clean_name(self):
        name = self.cleaned_data['name']
        try:
            State.objects.get(name__iexact=name)
        except ObjectDoesNotExist:
            return name
        raise forms.ValidationError('State already exists.')

    class Meta:
        model = State

There are a number of cases where I will have to do this check and I'm not keen on having to write similar iexact checks everywhere.

Just wondering if there is a built-in or better way? Perhaps db_type would help? Maybe some other solution exists?

You could define a custom model field derived from models.CharField . This field could check for duplicate values, ignoring the case.

Custom fields documentation is here http://docs.djangoproject.com/en/dev/howto/custom-model-fields/

Look at http://code.djangoproject.com/browser/django/trunk/django/db/models/fields/files.py for an example of how to create a custom field by subclassing an existing field.

You could use the citext module of PostgreSQL https://www.postgresql.org/docs/current/static/citext.html

If you use this module, the the custom field could define "db_type" as CITEXT for PostgreSQL databases.

This would lead to case insensitive comparison for unique values in the custom field.

Alternatively you can change the default Query Set Manager to do case insensitive look-ups on the field. In trying to solve a similar problem I came across:

http://djangosnippets.org/snippets/305/

Code pasted here for convenience:

from django.db.models import Manager
from django.db.models.query import QuerySet

class CaseInsensitiveQuerySet(QuerySet):
    def _filter_or_exclude(self, mapper, *args, **kwargs):
        # 'name' is a field in your Model whose lookups you want case-insensitive by default
        if 'name' in kwargs:
            kwargs['name__iexact'] = kwargs['name']
            del kwargs['name']
        return super(CaseInsensitiveQuerySet, self)._filter_or_exclude(mapper, *args, **kwargs)

# custom manager that overrides the initial query set
class TagManager(Manager):
    def get_query_set(self):
        return CaseInsensitiveQuerySet(self.model)

# and the model itself
class Tag(models.Model):
    name = models.CharField(maxlength=50, unique=True, db_index=True)

    objects = TagManager()

    def __str__(self):
        return self.name

On the Postgres side of things, a functional unique index will let you enforce unique values without case. citext is also noted, but this will work with older versions of PostgreSQL and is a useful technique in general.

Example:

# create table foo(bar text);
CREATE TABLE
# create unique index foo_bar on foo(lower(bar));
CREATE INDEX
# insert into foo values ('Texas');
INSERT 0 1
# insert into foo values ('texas');
ERROR:  duplicate key value violates unique constraint "foo_bar"

Explicit steps for Mayuresh's answer:

  1. in postgres do: CREATE EXTENSION citext;

  2. in your models.py add:

     from django.db.models import fields class CaseInsensitiveTextField(fields.TextField): def db_type(self, connection): return "citext" 

    reference: https://github.com/zacharyvoase/django-postgres/blob/master/django_postgres/citext.py

  3. in your model use: name = CaseInsensitiveTextField(unique=True)

Besides already mentioned option to override save, you can simply store all text in lower case in database and capitalize them on displaying.

class State(models.Model):
    name = models.CharField(max_length=50, unique=True)

    def save(self, force_insert=False, force_update=False):
        self.name = self.name.lower()
        super(State, self).save(force_insert, force_update)

a very simple solution:

class State(models.Model):
    name = models.CharField(max_length=50, unique=True)

    def clean(self):
        self.name = self.name.capitalize()

You can do this by overwriting the Model's save method - see the docs . You'd basically do something like:

class State(models.Model):
    name = models.CharField(max_length=50, unique=True)

    def save(self, force_insert=False, force_update=False):
        if State.objects.get(name__iexact = self.name):
            return
        else:
            super(State, self).save(force_insert, force_update)

Also, I may be wrong about this, but the upcoming model-validation SoC branch will allow us to do this more easily.

You can use lookup='iexact' in UniqueValidator on serializer, like this:

class StateSerializer(serializers.ModelSerializer): 
    name = serializers.CharField(validators=[
    UniqueValidator(
        queryset=models.State.objects.all(),lookup='iexact'
    )]

django version: 1.11.6

Solution from suhail worked for me without the need to enable citext, pretty easy solution only a clean function and instead of capitalize I used upper() . Mayuresh's solution also works but changed the field from CharField to TextField .

class State(models.Model):

    name = models.CharField(max_length=50, unique=True)

    def clean(self):
        self.name = self.name.upper()

If you don't want to use a postgres-specific solution, you can create a unique index on the field with upper() to enforce uniqueness at the database level, then create a custom Field mixin that overrides get_lookup() to convert case-sensitive lookups to their case-insensitive versions. The mixin looks like this:

class CaseInsensitiveFieldMixin:
    """
    Field mixin that uses case-insensitive lookup alternatives if they exist.
    """

    LOOKUP_CONVERSIONS = {
        'exact': 'iexact',
        'contains': 'icontains',
        'startswith': 'istartswith',
        'endswith': 'iendswith',
        'regex': 'iregex',
    }

    def get_lookup(self, lookup_name):
        converted = self.LOOKUP_CONVERSIONS.get(lookup_name, lookup_name)
        return super().get_lookup(converted)

And you use it like this:

from django.db import models


class CICharField(CaseInsensitiveFieldMixin, models.CharField):
    pass


class CIEmailField(CaseInsensitiveFieldMixin, models.EmailField):
    pass


class TestModel(models.Model):
    name = CICharField(unique=True, max_length=20)
    email = CIEmailField(unique=True)

You can read more about this approach here .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM