How to overwrite `id` as primary key with `timestamp` as primary key in django models?

I'm building a Django ETL engine that extracts data from GitHub using the enterprise API to gather metrics on internal company collaboration. I've designed a schema that I now realize won't scale because of the PK (primary key) that the ORM sets automatically. One of the main features of the extraction is getting the id of the person who created a repository, commented on a post, etc.

My initial thought was to let the ORM automatically set the id as the PK, but this won't work: the GET request is going to run once a week, and each run will raise errors because overwriting the existing id primary key fails.

I've done some research, and one potential solution is to create a meta class, as referenced here: Django model primary key as a pair

However, I am unsure whether creating a few meta classes would defeat the entire point of a meta class to begin with.

Here is the schema I have set up in models.py:

from django.db import models
from datetime import datetime

""" Construction of tables in MySQL instance """


class Repository(models.Model):
    id = models.PositiveIntegerField(null=False, primary_key=True)
    repo_name = models.CharField(max_length=50)
    creation_date = models.CharField(max_length=21, null=True)
    last_updated = models.CharField(max_length=30, null=True)
    qty_watchers = models.PositiveIntegerField(null=True)
    qty_forks = models.PositiveIntegerField(null=True)
    qty_issues = models.PositiveIntegerField(null=True)
    main_language = models.CharField(max_length=30, null=True)
    repo_size = models.PositiveIntegerField(null=True)
    timestamp = models.DateTimeField(auto_now=True)

class Contributor(models.Model):
    id = models.IntegerField(null=False, primary_key=True)
    contributor_cec = models.CharField(max_length=30, null=True)
    contribution_qty = models.PositiveIntegerField(null=True)
    get_request = models.CharField(max_length=100, null=True)
    timestamp = models.DateTimeField(auto_now=True)


class Teams(models.Model):
    id = models.IntegerField(primary_key=True, null=False)
    team_name = models.CharField(max_length=100, null=True)
    timestamp = models.DateTimeField(auto_now=True)


class TeamMembers(models.Model):
    id = models.IntegerField(null=False, primary_key=True)
    team_member_cec = models.CharField(max_length=30, null=True)
    get_request = models.CharField(max_length=100, null=True)
    timestamp = models.DateTimeField(auto_now=True)


class Discussions(models.Model):
    id = models.IntegerField(null=False, primary_key=True)
    login = models.CharField(max_length=30, null=True)
    title = models.CharField(max_length=30, null=True)
    body = models.CharField(max_length=1000, null=True)
    comments = models.IntegerField(null=True)
    updated_at = models.CharField(max_length=21, null=True)
    get_request = models.CharField(max_length=100, null=True)
    timestamp = models.DateTimeField(auto_now=True)

Is there a way to overwrite the id field and make the PK the timestamp field since each time the GET request is run that field will be populated with static data that will not change over the lifetime of the app?

Alternatively, is there a way to ditch the multi-table inheritance architecture and go for something different?

The core metrics that I will be extracting are things like top contributor to a repository, repository with the most commits, and most replied-to comments. I'd like to be able to run filters on the data to extract these metrics, but I know this relies heavily on the schema setup.

Thank you!

The way to set a field as the primary key is:

field_name = models.FieldType(primary_key=True)
