
What is the recommended schema design for dynamic dates in Django for PostgreSQL?

We have a Django app focused on visualizing timeline evolution. Conceptually, it has the following relationships:

1 Item with 1 or more Lifecycles (more for versioning purposes)

1 Lifecycle has 0..n Milestones

1 Milestone is a date stored as a string in the form YYYY-MM-DD, or the special tag "today", which denotes a date that changes daily (dynamic: the actual date was not stated, but some state is valid up to today, provided today is earlier than the next milestone).

The data is characterized by very diverse interpretations of milestones and of the phases between them. The number of milestones also varies, though at most 7 milestones seem to be used. Lifecycle records can be grouped by their characteristics (the same number of milestones with the same meanings).

We are using Django on top of PostgreSQL with model schema like this:

class Item(models.Model):
    ... other attributes
    lifecycle_actual     = models.IntegerField(null=True, default=-1, help_text="Selectable actual roadmap. Can be used to override the imported data. Use the ID of particular roadmap or -1 for the latest import.")

class Lifecycle(models.Model):
    ... other attributes
    lifecycle_group = models.ForeignKey(LifecycleGroup, help_text="Vizualization group.")
    date0 = models.CharField(max_length=10, blank=True)
    date1 = models.CharField(max_length=10, blank=True)
    date2 = models.CharField(max_length=10, blank=True)
    date3 = models.CharField(max_length=10, blank=True)
    date4 = models.CharField(max_length=10, blank=True)
    date5 = models.CharField(max_length=10, blank=True)
    date6 = models.CharField(max_length=10, blank=True)
    item = models.ForeignKey(Item, null=True, blank=True)

    def __unicode__(self):
        return self.item.fullname

class LifecycleGroup(models.Model):
    name = models.CharField(max_length=220, help_text="Name of the group") 
    era0_name = models.CharField(max_length=100, blank=True)
    era1_name = models.CharField(max_length=100, blank=True)
    era2_name = models.CharField(max_length=100, blank=True)
    era3_name = models.CharField(max_length=100, blank=True)
    era4_name = models.CharField(max_length=100, blank=True)
    era5_name = models.CharField(max_length=100, blank=True)
    era6_name = models.CharField(max_length=100, blank=True)

    era0_start_name = models.CharField(max_length=100, blank=True)
    era1_start_name = models.CharField(max_length=100, blank=True)
    era2_start_name = models.CharField(max_length=100, blank=True)
    era3_start_name = models.CharField(max_length=100, blank=True)
    era4_start_name = models.CharField(max_length=100, blank=True)
    era5_start_name = models.CharField(max_length=100, blank=True)
    era6_start_name = models.CharField(max_length=100, blank=True)

    era0_css_classes = models.CharField(max_length=150, blank=True)
    era1_css_classes = models.CharField(max_length=151, blank=True)
    era2_css_classes = models.CharField(max_length=152, blank=True)
    era3_css_classes = models.CharField(max_length=153, blank=True)
    era4_css_classes = models.CharField(max_length=154, blank=True)
    era5_css_classes = models.CharField(max_length=155, blank=True)
    era6_css_classes = models.CharField(max_length=156, blank=True)

    def __unicode__(self):
        return self.name

Overall it works fine, however we have issues with reporting questions such as:

Which items will hit milestones of certain characteristics in December 2015?

Even if we changed the model code to this:

class Item(models.Model):
    ... other attributes
    lifecycle_actual     = models.IntegerField(null=True, default=-1, help_text="Selectable actual roadmap. Can be used to override the imported data. Use the ID of particular roadmap or -1 for the latest import.")

class Lifecycle(models.Model):
    ... other attributes
    # lifecycle group - not used anymore - have to duplicate info somehow in milestones
    # lifecycle_group = models.ForeignKey(LifecycleGroup, help_text="Vizualization group.")
    item = models.ForeignKey(Item, null=True, blank=True)

    def __unicode__(self):
        return self.item.fullname

class Milestone(models.Model):

    lifecycle = models.ForeignKey(Lifecycle, null=True, blank=True) 
    date = models.CharField(max_length=10, blank=True)
    name = models.CharField(max_length=100, blank=True)
    next_era = models.ForeignKey(Era, null=True, blank=True)

    impact = ... cca 4 choices
    order = models.PositiveIntegerField()

class Era(models.Model):
    name = models.CharField(max_length=100, blank=True)
    css_classes = models.CharField(max_length=150, blank=True) 

We still have several problems:

  1. We would always have to join milestones to lifecycles for every visualization query we run (which seems to work against this normalization)

What is the recommended schema design for such needs?

  2. The dynamic date "today" in the Milestone date field

How can we store a dynamic (changing) date in the DB so that it is valid in SELECTs and comparable with stored static dates?

So that we can run:

SELECT * FROM item, lifecycle, milestone 
WHERE item.id = lifecycle.item AND milestone.lifecycle = lifecycle.id 
AND milestone.impact = 'huge'
AND milestone.date BETWEEN '2015-12-01' AND '2015-12-31'

  3. We would like to extend the "today" control string

So that we can store milestone definitions like these:

"today +365d" or "today -20d", or "YYYY-MM-DD<today<YYYY-MM-DD".

Thanks in advance for any comments, suggestions!

EDIT

Imagine data like this:

(item lifecycle => milestone name: date, ...)

item1 => born: 2011-12-02, 
         decline: 2015-06-01, 
         end of life:2017-06-01 

item2 => lifecycle check: 2015-08-01, 
         some significant milestone: 2017-09-01,
         depreciation ends: 2019-04-15, 
         to be decommissioned: 2022-04-01

item3 => initiated: 2012-05-08, 
         life until at least: *today*, 
         end of life: not declared 

item4 => initiated: 2012-05-08, 
         productive life until at least: *today +2 years*, 
         end of life: 2032-08-01 

item5 => born: unknown but latest *today*, 
         end of life:2017-06-01 

Here today is the current date, i.e. whatever the date is at the moment the user queries the data.

Let's assume we want to select all items that have any milestone between 2015-10-01 and 2015-12-01. If we run the SELECT today (2015-10-29), item3 and item5 should be in the output. If we run the same SELECT on 2015-12-15, item3 and item5 must not be in the output.
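
To make the intended behavior concrete, here is a minimal Python sketch of how the "today" control strings could be resolved on the application side. The function names resolve_milestone and in_window are hypothetical, and the grammar is only an assumption based on the "today +365d" / "today -20d" examples above; the "YYYY-MM-DD<today<YYYY-MM-DD" form and values such as "not declared" are deliberately not handled.

```python
import re
from datetime import date, datetime, timedelta

# Assumed grammar: "today", "today +Nd" or "today -Nd" (N = whole days).
_TODAY_SPEC = re.compile(r"^today(?:\s*([+-])\s*(\d+)d)?$")


def resolve_milestone(spec, today):
    """Turn a stored milestone string into a concrete date for a given `today`."""
    spec = spec.strip()
    match = _TODAY_SPEC.match(spec)
    if match:
        sign, days = match.groups()
        if sign is None:  # plain "today"
            return today
        offset = timedelta(days=int(days))
        return today + offset if sign == "+" else today - offset
    # Anything else must be a static YYYY-MM-DD date; raises ValueError otherwise.
    return datetime.strptime(spec, "%Y-%m-%d").date()


def in_window(spec, start, end, today):
    """True if the resolved milestone falls within [start, end], both inclusive."""
    return start <= resolve_milestone(spec, today) <= end


start, end = date(2015, 10, 1), date(2015, 12, 1)

print(resolve_milestone("today +365d", date(2015, 10, 29)))  # 2016-10-28
print(resolve_milestone("today -20d", date(2015, 10, 29)))   # 2015-10-09
print(in_window("today", start, end, date(2015, 10, 29)))    # True
print(in_window("today", start, end, date(2015, 12, 15)))    # False
```

This only shows the semantics; it does not by itself make the values comparable inside a SELECT, which is the actual question.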

Store the dates in a models.DateTimeField(default=timezone.now) and add a models.BooleanField that marks a milestone as having the "today" behavior.

I think this is better:

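class Milestone(models.Model):
    lifecycle = models.ForeignKey(Lifecycle, null=True, blank=True)
    # A real timestamp column; NULL when the date is unknown or driven by "today".
    # (max_length has no effect on DateTimeField, so it is dropped.)
    date = models.DateTimeField(null=True, blank=True)
    # True when this milestone tracks the current date instead of a fixed one.
    today = models.BooleanField(default=False)
    name = models.CharField(max_length=100, blank=True)
    next_era = models.ForeignKey(Era, null=True, blank=True)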
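
One way such a flag could be used in a query is sketched below. This is an illustration only: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL, a flat milestone table with an item column instead of the real joins, ISO date strings instead of a timestamp column, and a hypothetical helper items_with_milestone_between. The current date is passed in as a parameter so the two run dates from the question can be reproduced; in PostgreSQL the same CASE expression could use CURRENT_DATE instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE milestone ("
    "item TEXT, name TEXT, date TEXT, today INTEGER NOT NULL DEFAULT 0)"
)
# A subset of the sample data from the question; flagged rows carry no fixed date.
conn.executemany(
    "INSERT INTO milestone (item, name, date, today) VALUES (?, ?, ?, ?)",
    [
        ("item1", "born", "2011-12-02", 0),
        ("item1", "decline", "2015-06-01", 0),
        ("item1", "end of life", "2017-06-01", 0),
        ("item3", "initiated", "2012-05-08", 0),
        ("item3", "life until at least", None, 1),
        ("item5", "born", None, 1),
        ("item5", "end of life", "2017-06-01", 0),
    ],
)


def items_with_milestone_between(conn, start, end, today):
    """Items with any milestone in [start, end]; flagged rows resolve to `today`."""
    rows = conn.execute(
        """
        SELECT DISTINCT item
        FROM milestone
        WHERE CASE WHEN today = 1 THEN :today ELSE date END
              BETWEEN :start AND :end
        ORDER BY item
        """,
        {"today": today, "start": start, "end": end},
    )
    return [row[0] for row in rows]


print(items_with_milestone_between(conn, "2015-10-01", "2015-12-01", "2015-10-29"))
# ['item3', 'item5']
print(items_with_milestone_between(conn, "2015-10-01", "2015-12-01", "2015-12-15"))
# []
```

The comparison works here because ISO-formatted date strings sort in date order; with a proper date or timestamp column no such trick is needed.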

I second arkadyzalko's recommendation of DateTimeField, but would note a few additional things.

First, I would recommend reading the PostgreSQL documentation, focusing on range types. If each era covers a range (that is, you know in advance when the era will end), then it becomes easy to add indexes for determining what falls in an era: the question is whether the date falls within a range, and you can join on that as well.

So from a database design perspective, I would look at

  1. using a range type for era boundaries
  2. using an exclusion constraint to ensure they do not overlap
  3. Joining on the overlap between the date of the event and the era.

Django should support all of these (though the exclusion constraint you might have to do yourself).

As an example of some daterange queries:

test=# select '[2011-01-01,2011-02-01)'::daterange @> '2011-01-15'::date;
 ?column?
----------
 t
(1 row)

test=# select '[2011-01-01,2011-02-01)'::daterange @> '2011-01-1'::date;
 ?column?
----------
 t
(1 row)

test=# select '[2011-01-01,2011-02-01)'::daterange @> '2011-02-1'::date;
 ?column?
----------
 f
(1 row)

This means you can join on a value being in a range, e.g. FROM dates JOIN epoch ON epoch.range @> dates.date

GiST indexes also allow you to do this with index lookups.
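
For readers who want to see the semantics outside the database, here is a small Python sketch that mimics half-open range containment, the overlap test that an exclusion constraint would enforce, and the containment join. The DateRange class, the era names and the helper join_dates_to_eras are all hypothetical; the era boundaries are borrowed from the item1 sample data in the question. It models the logic only and says nothing about how PostgreSQL executes it.

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations


@dataclass(frozen=True)
class DateRange:
    """Half-open range [lower, upper), like the daterange literals above."""
    lower: date
    upper: date

    def contains(self, day):
        # Mirrors `range @> date`: lower bound inclusive, upper bound exclusive.
        return self.lower <= day < self.upper

    def overlaps(self, other):
        # Mirrors `range && range` for half-open ranges.
        return self.lower < other.upper and other.lower < self.upper


january = DateRange(date(2011, 1, 1), date(2011, 2, 1))
print(january.contains(date(2011, 1, 15)))  # True
print(january.contains(date(2011, 1, 1)))   # True
print(january.contains(date(2011, 2, 1)))   # False

# Hypothetical eras built from item1's milestones (born, decline, end of life).
eras = {
    "productive": DateRange(date(2011, 12, 2), date(2015, 6, 1)),
    "declining": DateRange(date(2015, 6, 1), date(2017, 6, 1)),
}

# What an exclusion constraint would guarantee: no two eras overlap.
has_overlap = any(a.overlaps(b) for a, b in combinations(eras.values(), 2))
print(has_overlap)  # False


def join_dates_to_eras(dates, eras):
    """Pair each date with the era containing it (an inner join on containment)."""
    return [(d, name) for d in dates for name, r in eras.items() if r.contains(d)]


dates = [date(2012, 5, 8), date(2015, 6, 1), date(2017, 6, 1)]
for day, era in join_dates_to_eras(dates, eras):
    print(day, era)
# 2012-05-08 productive
# 2015-06-01 declining
```

Note that 2017-06-01 matches no era because the upper bound is exclusive, which is also why adjacent eras can share a boundary date without overlapping.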
