Getting a Primary Key error in Rails using Sidekiq and Sidekiq-Cron

I have a Rails project that uses Sidekiq for worker tasks and Sidekiq-Cron to handle scheduling. I am running into a problem, though. I built a controller (below) that handled all of my API querying, data validation, and insertion of data into the database. All of the logic functioned properly.

I then tore out the section of code that actually inserts API data into the database, and moved it into a Job class. This way the Controller method could simply pass all of the heavy lifting off to a job. When I tested it, all of the logic functioned properly.

Finally, I created a Job that would call the Controller method every minute, do the validation checks, and then kick off the other Job to save the API data (if necessary). When I do this, the first part of the logic seems to work: it inserts new event data. But the logic that checks whether this is the first time we've seen an event for a specific object seems to be failing. The result is a Primary Key violation in PG.

Code below:

Controller

require 'date'

class MonnitOpenClosedSensorsController < ApplicationController

    def holderTester()
        #MonnitschedulerJob.perform_later(nil)
    end

    # Create Sidekiq queue to process new sensor readings
    def queueNewSensorEvents(auth_token, network_id)

        m = Monnit.new("iMonnit", 1)

        # Construct the query to select the most recent communication date for each sensor in the network
        lastEventForEachSensor = MonnitOpenClosedSensor.select('"SensorID", MAX("LastCommunicationDate") as "lastCommDate"')
        lastEventForEachSensor = lastEventForEachSensor.group("SensorID")
        lastEventForEachSensor = lastEventForEachSensor.where('"CSNetID" = ?', network_id)

        todaysDate = Date.today
        sevenDaysAgo = (todaysDate - 7)   # NOTE: not referenced anywhere below

        lastEventForEachSensor.each do |event|
            # puts event["lastCommDate"]
            recentEvent = MonnitOpenClosedSensor.select('id, "SensorID", "LastCommunicationDate"')
            recentEvent = recentEvent.where('"CSNetID" = ? AND "SensorID" = ? AND "LastCommunicationDate" = ?', network_id, event["SensorID"], event["lastCommDate"])

            recentEvent.each do |recent|
                message = m.get_extended_sensor(auth_token, recent["SensorID"])
                if message["LastDataMessageMessageGUID"] != recent["id"]
                    MonnitopenclosedsensorJob.perform_later(auth_token, network_id, message["SensorID"])
                    # puts "hi inner"
                    # puts message["LastDataMessageMessageGUID"]
                    # puts recent['id']
                    # puts recent["SensorID"]
                    # puts message["SensorID"]
                    # raise message
                end
            end
        end

        # Queue up any Sensor Events for new sensors
        # This would be sensors we've never seen before, from a Postgres standpoint
        sensors = m.get_sensor_ids(auth_token)
        sensors.each do |sensor|
            sensorCheck = MonnitOpenClosedSensor.select(:SensorID)
            # sensorCheck = MonnitOpenClosedSensor.select(:SensorID)
            sensorCheck = sensorCheck.group(:SensorID)
            sensorCheck = sensorCheck.where('"CSNetID" = ? AND "SensorID" = ?', network_id, sensor)
            # sensorCheck = sensorCheck.where('id = "?"', sensor["LastDataMessageMessageGUID"])

            unless sensorCheck.exists?
                MonnitopenclosedsensorJob.perform_later(auth_token, network_id, sensor)
            end
        end

    end

end

The above code breaks Sensor Events for new sensors. The first issue is that it doesn't recognize that a sensor already exists; the second is that it doesn't recognize that the event it is trying to create has already been persisted to the database (a GUID is used for the comparison).

Job to persist data

class MonnitopenclosedsensorJob < ApplicationJob
  queue_as :default

  def perform(auth_token, network_id, sensor)
    m = Monnit.new("iMonnit", 1)
    newSensor = m.get_extended_sensor(auth_token, sensor)

    sensorRecord = MonnitOpenClosedSensor.new
    sensorRecord.SensorID = newSensor['SensorID']
    sensorRecord.MonnitApplicationID = newSensor['MonnitApplicationID']
    sensorRecord.CSNetID = newSensor['CSNetID']

    # "LastCommunicationDate" and "NextCommunicationDate" arrive as ASP.NET-style
    # "/Date(1500999632000)/" strings; extract the epoch milliseconds and convert to seconds.
    lastCommunicationDatePretty = newSensor['LastCommunicationDate'].scan(/[0-9]+/)[0].to_i / 1000.0
    nextCommunicationDatePretty = newSensor['NextCommunicationDate'].scan(/[0-9]+/)[0].to_i / 1000.0
    sensorRecord.LastCommunicationDate = Time.at(lastCommunicationDatePretty)
    sensorRecord.NextCommunicationDate = Time.at(nextCommunicationDatePretty)

    # The API's message GUID is used directly as the primary key (this is where the collision surfaces)
    sensorRecord.id = newSensor['LastDataMessageMessageGUID']
    sensorRecord.PowerSourceID = newSensor['PowerSourceID']
    sensorRecord.Status = newSensor['Status']
    sensorRecord.CanUpdate = newSensor['CanUpdate'] == "true" ? 1 : 0
    sensorRecord.ReportInterval = newSensor['ReportInterval']
    sensorRecord.MinimumThreshold = newSensor['MinimumThreshold']
    sensorRecord.MaximumThreshold = newSensor['MaximumThreshold']
    sensorRecord.Hysteresis = newSensor['Hysteresis']
    sensorRecord.Tag = newSensor['Tag']
    sensorRecord.ActiveStateInterval = newSensor['ActiveStateInterval']
    sensorRecord.CurrentReading = newSensor['CurrentReading']
    sensorRecord.BatteryLevel = newSensor['BatteryLevel']
    sensorRecord.SignalStrength = newSensor['SignalStrength']
    sensorRecord.AlertsActive = newSensor['AlertsActive']
    sensorRecord.AccountID = newSensor['AccountID']
    sensorRecord.CreatedOn = Time.now.getutc
    sensorRecord.CreatedBy = "Monnit Open Closed Sensor Job"
    sensorRecord.LastModifiedOn = Time.now.getutc
    sensorRecord.LastModifiedBy = "Monnit Open Closed Sensor Job"

    sensorRecord.save

    sensorRecord = nil
  end
end

Job to call controller every minute

class MonnitschedulerJob < ApplicationJob
  queue_as :default

  def perform(*args)
    m = Monnit.new("iMonnit", 1)
    getImonnitUsers = ImonnitCredential.select('"auth_token", "username", "password"')
    getImonnitUsers.each do |user|
      # puts user["auth_token"]
      # puts user["username"]
      # puts user["password"]

      if user["auth_token"] != nil
        # Reuse the stored token; without this assignment, auth_token would be nil below.
        auth_token = user["auth_token"]
        m.logon(auth_token)
      else
        auth_token = m.get_auth_token(user["username"], user["password"])
        auth_token = auth_token["Result"]
      end

      network_list = m.get_network_list(auth_token)
      network_list.each do |network|
        # puts network["NetworkID"]
        MonnitOpenClosedSensorsController.new.queueNewSensorEvents(auth_token, network["NetworkID"])
      end
    end
  end
end
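
The one-minute schedule itself isn't shown above. With sidekiq-cron it would be registered along these lines (a sketch using sidekiq-cron's documented Ruby API, typically run from an initializer; the job name here is arbitrary, and the same entry could equally live in a schedule YAML file):

Sidekiq::Cron::Job.create(
  name:  'Monnit scheduler - every minute',   # must be unique across cron jobs
  cron:  '* * * * *',                         # standard cron syntax: every minute
  class: 'MonnitschedulerJob'
)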

Sorry about the length of the post. I tried to include as much information as I could about the code involved.

EDIT

Here is the code for the extended sensor, along with the JSON response:

def get_extended_sensor(auth_token, sensor_id)
  response = self.class.get("/json/SensorGetExtended/#{auth_token}?SensorID=#{sensor_id}")

  # NOTE: both branches return the same thing, so the token check has no effect here;
  # either way the caller receives response['Result'].
  if response['Result'] != "Invalid Authorization Token"
    response['Result']
  else
    response['Result']
  end
end


{
    "Method": "SensorGetExtended",
    "Result": {
        "ReportInterval": 180,
        "ActiveStateInterval": 180,
        "InactivityAlert": 365,
        "MeasurementsPerTransmission": 1,
        "MinimumThreshold": 4294967295,
        "MaximumThreshold": 4294967295,
        "Hysteresis": 0,
        "Tag": "",
        "SensorID": 189092,
        "MonnitApplicationID": 9,
        "CSNetID": 24391,
        "SensorName": "Open / Closed - 189092",
        "LastCommunicationDate": "/Date(1500999632000)/",
        "NextCommunicationDate": "/Date(1501010432000)/",
        "LastDataMessageMessageGUID": "d474b3db-d843-40ba-8e0e-8c4726b61ec2",
        "PowerSourceID": 1,
        "Status": 0,
        "CanUpdate": true,
        "CurrentReading": "Open",
        "BatteryLevel": 100,
        "SignalStrength": 84,
        "AlertsActive": true,
        "CheckDigit": "QOLP",
        "AccountID": 14728
    }
}

Some thoughts:

recentEvent = MonnitOpenClosedSensor.select('id, "SensorID", "LastCommunicationDate"')

This is not doing any ordering; you are presuming that the records you retrieve here are the latest records.

m = Monnit.new("iMonnit", 1)
newSensor = m.get_extended_sensor(auth_token, sensor)

Without the implementation details of get_extended_sensor, it's impossible to tell how

sensorRecord.id = newSensor['LastDataMessageMessageGUID']

is resolving.

It's highly likely that you are getting duplicate messages. It's almost never a good idea to use input data as a primary key; instead, autogenerate a GUID in your job, use that as the primary key, and then use the LastDataMessageMessageGUID as a correlation ID.
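
A minimal sketch of that suggestion, assuming the job generates its own surrogate key and that a hypothetical LastDataMessageGUID column (with a unique index) is added to hold the correlation ID:

require 'securerandom'

sensorRecord = MonnitOpenClosedSensor.new
sensorRecord.id = SecureRandom.uuid                                        # autogenerated primary key
sensorRecord.LastDataMessageGUID = newSensor['LastDataMessageMessageGUID'] # correlation ID (hypothetical column)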

So the issue that I was running into, as it turns out, is as follows:

  1. A sensor event was pulled from the API and queued up as a worker job in Sidekiq.
  2. If the queue was running a bit slow (slow API responses, or simply a lot of jobs to process), the one-minute poll might fire again, pull the same sensor event down, and queue it up a second time.
  3. As the queue processed, the sensor event got inserted into the database with its GUID as the primary key.
  4. As the queue continued to catch up with itself, it hit the same event that had been queued a second time. That job then failed with the primary key violation.

My solution was to move the "does this SensorID and GUID already exist in the database?" check into the job itself (sketched below). So the first thing the job does when it runs is check AGAIN for the record. This means I am checking twice, but the second check is quick and has low overhead.
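
A minimal sketch of that guard at the top of the job, reusing the model and column names from the code above (the exact existence query is my reconstruction, not the original code):

  def perform(auth_token, network_id, sensor)
    m = Monnit.new("iMonnit", 1)
    newSensor = m.get_extended_sensor(auth_token, sensor)

    # Bail out if another job has already persisted this SensorID/GUID combination.
    return if MonnitOpenClosedSensor.where(SensorID: newSensor['SensorID'],
                                           id: newSensor['LastDataMessageMessageGUID']).exists?

    # ... build and save sensorRecord exactly as before ...
  end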

There is still the risk that the check could pass while another job is inserting the same record, before that insert commits to the database, and the save would then fail. But the retry would catch it, and the job would clear out as a successful run when the check catches the record on the second round. Having said that, the check occurs AFTER the API data has been pulled. Since persisting a single record should be far faster than the API call itself, this really does lower the chances of any job hitting a retry; you'd have a better chance of winning the lottery than having the second check fail and trigger a retry.
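
An alternative that closes even that window (my suggestion, not part of the original solution) is to let Postgres arbitrate: attempt the save and treat a duplicate-key violation as "already processed" instead of letting the job fail and retry:

    begin
      sensorRecord.save
    rescue ActiveRecord::RecordNotUnique
      # Another job won the race and inserted this GUID first; treat the event as done.
      Rails.logger.info("Skipping duplicate sensor event #{sensorRecord.id}")
    end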

If anyone else has a better or cleaner solution, please feel free to include it as a secondary answer!
