
Fastest way of retrieving records from Entity Framework DbSet on varchar Unix Timestamp

I have to retrieve all the records from a database that have been added since the last execution; this should happen daily.

The only thing that can identify those records from the rest is a Unix Timestamp (in milliseconds) or a Time (hhmmss) and a Date (yyyyMMdd). My problem is that all these columns are of type varchar.

The database is very big, and only getting bigger. Is there any way of getting only the rows with a Unix Timestamp higher than X without having to load the entire table and parse every timestamp?

What I do now is:

var sales = context.SALES.Select(s =>
    new Sale {
        Product = s.SC_PRDCT,
        Terminal = s.SC_TERM,
        Operator = s.MC_OP,
        UnixString = s.SC_TIMESTAMP
    })
    .ToList()
    .Where(m => terminals.ContainsKey(m.Terminal) && m.UnixTime > lastExecution);

// In the Sale class, the setter parses the string into UnixTime:
public string UnixString
{
    get { return unixString; }
    set { unixString = value; UnixTime = long.Parse(value); }
}

A couple of options come to mind. If you can alter the schema while preserving the current fields, I would consider adding a computed column to the database that holds a DateTime equivalent of the timestamp. Failing that, you could source the data for this type of search/report from a view that provides the translated timestamp.

If you can't adjust the schema, things get a bit trickier. When you say the timestamp can be milliseconds or datetime in a string, does that mean the value can be either something like "1435234353353" (ms since Date X) or "20190827151530" for 2019-08-27 3:15:30 PM? If so, then as long as the two formats always produce strings of different lengths, you can still query against the field; it just won't be ideal:

Assuming the date option formatting is "YYYYMMDDHHMMSS":

string lastExecutionDate = targetDate.ToString("yyyyMMddHHmmss");
// TotalMilliseconds is a double; cast to long so the string has no fractional part.
string lastExecutionMs = ((long)targetDate.ToUniversalTime().Subtract(
    new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc)
    ).TotalMilliseconds).ToString();

var terminalKeys = terminals.Keys.ToList(); // You probably cannot use ContainsKey in Linq2EF...
var sales = context.SALES
    .Where(s => terminalKeys.Contains(s.SC_TERM)
       && ((s.SC_TIMESTAMP.Length == 14 && s.SC_TIMESTAMP.CompareTo(lastExecutionDate) > 0)
          || (s.SC_TIMESTAMP.Length != 14 && s.SC_TIMESTAMP.CompareTo(lastExecutionMs) > 0 )))
    .Select(s =>
       new Sale 
       {
           Product = s.SC_PRDCT,
           Terminal = s.SC_TERM,
           Operator = s.MC_OP,
           UnixString = s.SC_TIMESTAMP
       }).ToList();

If the SC_TIMESTAMP column only stores timestamps in ms, and the time and date are in separate columns, then you don't need the conditional; just format your target datetime as a timestamp string (ms since 1970-01-01) and use that.

// TotalMilliseconds is a double; cast to long so the string has no fractional part.
string lastExecutionMs = ((long)targetDate.ToUniversalTime().Subtract(
    new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc)
    ).TotalMilliseconds).ToString();

var terminalKeys = terminals.Keys.ToList(); // You probably cannot use ContainsKey in Linq2EF...
var sales = context.SALES
    .Where(s => terminalKeys.Contains(s.SC_TERM)
       && s.SC_TIMESTAMP.CompareTo(lastExecutionMs) > 0)
    .Select(s =>
       new Sale 
       {
           Product = s.SC_PRDCT,
           Terminal = s.SC_TERM,
           Operator = s.MC_OP,
           UnixString = s.SC_TIMESTAMP
       }).ToList();
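As an alternative way of building the timestamp string used above, here is a minimal sketch based on DateTimeOffset.ToUnixTimeMilliseconds (available from .NET Framework 4.6). The class and method names (UnixMs, ToUnixMsString) are hypothetical and not from the original code; the sketch also assumes the stored values are plain digit strings with no padding.

```csharp
using System;
using System.Globalization;

public static class UnixMs
{
    // Hypothetical helper: converts a DateTime to a string holding the
    // number of whole milliseconds since 1970-01-01T00:00:00Z.
    public static string ToUnixMsString(DateTime value)
    {
        // ToUniversalTime() treats an Unspecified kind as local time.
        long ms = new DateTimeOffset(value.ToUniversalTime()).ToUnixTimeMilliseconds();
        // Invariant culture keeps the output to plain ASCII digits.
        return ms.ToString(CultureInfo.InvariantCulture);
    }
}
```

Because the result is a long rather than a double, the string never carries a decimal part, so it can be passed to the CompareTo predicate as-is.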

The caveat, if milliseconds and datetime values share the same field, is that the datetime string must be in a sortable ISO-style format (year, month, day, 24-hour time); otherwise it cannot be used in a comparison.
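To illustrate why the field order matters, the following sketch checks whether string order agrees with chronological order for a given format. SortableCheck and StringOrderMatchesTimeOrder are hypothetical names used only for this illustration, and the comparison is ordinal, which is an assumption about how the database collation orders digits.

```csharp
using System;
using System.Globalization;

public static class SortableCheck
{
    // Hypothetical helper: formats both dates with the given format and
    // reports whether the earlier date also sorts first as a string.
    public static bool StringOrderMatchesTimeOrder(DateTime earlier, DateTime later, string format)
    {
        string a = earlier.ToString(format, CultureInfo.InvariantCulture);
        string b = later.ToString(format, CultureInfo.InvariantCulture);
        return string.CompareOrdinal(a, b) < 0;
    }
}
```

For 2019-08-27 15:15:30 and 2019-09-01 00:00:00, the format "yyyyMMddHHmmss" gives "20190827151530" and "20190901000000", which sort correctly. With "ddMMyyyyHHmmss" the strings are "27082019151530" and "01092019000000", and the later date sorts first.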

Luckily, the Unix timestamps are all equal in length and therefore sortable as strings, so you can query them with the > operator in SQL.

Of course, as you may have tried, m.SC_TIMESTAMP > lastExecution doesn't compile in C#, but fortunately EF6 (and EF3, more or less) translates the following predicate into the desired SQL predicate:

Where(m => m.SC_TIMESTAMP.CompareTo(lastExecution) > 0)

where lastExecution is a string.

Remember to add a comment to your code that this works until 2286-11-20 17:46:39, when the Unix timestamp is 9999999999999. After that, your successors should use a more generic method that takes length into account.
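A length-aware comparison could look roughly like the sketch below. TimestampOrder and IsAfter are hypothetical names, and the helper assumes both strings contain only ASCII digits with no leading zeros, signs, or whitespace. Whether the equivalent predicate shown in the comment is translated to SQL by a given EF provider is an assumption that would need to be checked.

```csharp
using System;

public static class TimestampOrder
{
    // Hypothetical helper: true when digit string a is numerically greater than b.
    // A longer digit string (without leading zeros) is always the larger number;
    // equal-length strings can be compared character by character.
    public static bool IsAfter(string a, string b)
    {
        if (a.Length != b.Length)
            return a.Length > b.Length;
        return string.CompareOrdinal(a, b) > 0;
    }

    // The same idea written as a query predicate (untested against a database):
    //   s => s.SC_TIMESTAMP.Length > lastExecution.Length
    //     || (s.SC_TIMESTAMP.Length == lastExecution.Length
    //         && s.SC_TIMESTAMP.CompareTo(lastExecution) > 0)
}
```

With this, IsAfter("10000000000000", "9999999999999") is true, whereas a plain string comparison would put the 14-digit value first because it starts with '1'.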
