
Aerospike Lua scripting issue

We are running a small POC with Aerospike to find out whether we can use Lua scripts for aggregation.

In this case, we used the flights example: https://github.com/aerospike/flights-analytics

I created a new index on the flight time in order to search by it.

The script runs over all the records and finds the latest arrival time of a flight. We inserted only flights to Buffalo for simplicity's sake.

    local function aggregatCityToMax(result, record)
        -- Declared as locals so they do not leak into the global environment.
        local city = string.upper(record['DEST_CITY_NAME'])
        local flightTime = record['ARR_TIME']

        if result[city] == nil then
            info("CITY: |%s|      |        DATE: %d        |        MAX: null", city, flightTime)
            result[city] = flightTime
        else
            info("CITY: |%s|      |        DATE: %d        |        MAX: %d", city, flightTime, result[city])
            if result[city] < flightTime then
                info("new MAX %s", flightTime)
                result[city] = flightTime
            end
        end

        return result
    end

    -- Defined before reduce_values so the local is in scope where it is used.
    local function mergeFunction(a, b)
        info("merging:  %s VS %s ", a, b)
        if a < b then
            return b
        end
        return a
    end

    local function reduce_values(a, b)
        return map.merge(a, b, mergeFunction)
    end

    function mapMax(stream)
        return stream : aggregate(map(), aggregatCityToMax) : reduce(reduce_values)
    end

The log shows odd results: (1) I don't get the maximum; (2) it looks like the maximum value is reset to null every 10 records.

LOG:

    CITY: |BUFFALO| | DATE: 1253 | MAX: null
    CITY: |BUFFALO| | DATE: 1221 | MAX: 1253
    CITY: |BUFFALO| | DATE: 1600 | MAX: 1253
    CITY: |BUFFALO| | DATE: 1203 | MAX: 1600
    CITY: |BUFFALO| | DATE: 1424 | MAX: 1600
    CITY: |BUFFALO| | DATE: 2141 | MAX: 1600
    CITY: |BUFFALO| | DATE: 1821 | MAX: 2141
    CITY: |BUFFALO| | DATE: 1221 | MAX: 2141
    CITY: |BUFFALO| | DATE: 1424 | MAX: 2141
    CITY: |BUFFALO| | DATE: 1550 | MAX: 2141
    CITY: |BUFFALO| | DATE: 1703 | MAX: null
    CITY: |BUFFALO| | DATE: 2312 | MAX: 1703
    CITY: |BUFFALO| | DATE: 2251 | MAX: 2312
    CITY: |BUFFALO| | DATE: 19 | MAX: 2312
    CITY: |BUFFALO| | DATE: 1030 | MAX: 2312
    CITY: |BUFFALO| | DATE: 1257 | MAX: 2312
    CITY: |BUFFALO| | DATE: 803 | MAX: 2312
    CITY: |BUFFALO| | DATE: 19 | MAX: 2312
    CITY: |BUFFALO| | DATE: 1502 | MAX: 2312
    CITY: |BUFFALO| | DATE: 2319 | MAX: 2312
    CITY: |BUFFALO| | DATE: 1735 | MAX: null
    CITY: |BUFFALO| | DATE: 1221 | MAX: 1735
    CITY: |BUFFALO| | DATE: 1258 | MAX: 1735
    CITY: |BUFFALO| | DATE: 2125 | MAX: 1735
    CITY: |BUFFALO| | DATE: 2251 | MAX: 2125
    CITY: |BUFFALO| | DATE: 1104 | MAX: 2251
    CITY: |BUFFALO| | DATE: 2053 | MAX: 2251
    CITY: |BUFFALO| | DATE: 1340 | MAX: 2251
    CITY: |BUFFALO| | DATE: 2312 | MAX: 2251
    CITY: |BUFFALO| | DATE: 2226 | MAX: 2312
    CITY: |BUFFALO| | DATE: 2053 | MAX: null
    CITY: |BUFFALO| | DATE: 1637 | MAX: 2053
    CITY: |BUFFALO| | DATE: 1030 | MAX: 2053
    CITY: |BUFFALO| | DATE: 1618 | MAX: 2053
    CITY: |BUFFALO| | DATE: 1510 | MAX: 2053
    CITY: |BUFFALO| | DATE: 1510 | MAX: 2053
    CITY: |BUFFALO| | DATE: 2346 | MAX: 2053
    CITY: |BUFFALO| | DATE: 2343 | MAX: 2346
    CITY: |BUFFALO| | DATE: 1600 | MAX: 2346
    CITY: |BUFFALO| | DATE: 1550 | MAX: 2346
    CITY: |BUFFALO| | DATE: 1949 | MAX: null
    CITY: |BUFFALO| | DATE: 1104 | MAX: 1949
    CITY: |BUFFALO| | DATE: 2045 | MAX: 1949
    CITY: |BUFFALO| | DATE: 2213 | MAX: 2045

Did I do something wrong? Did I miss anything?

Thanks,

Idob

Aerospike's aggregation is streaming in nature, i.e. it keeps pushing partial results out so that nothing stalls. The reduce step, which runs on the client, does the final job of merging all the partial results. This differs from Hadoop MapReduce, where the final reduce waits for all the local reduces to finish completely before it starts. The streaming model has its merits.

You have a print statement in the aggregate function. Once a partial result is pushed out, the seed map starts out empty again for the next batch, which is why the log shows MAX: null at intervals. There is nothing wrong with your logic, and the end result should be fine. Are you seeing any issue with the end result?
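
To make the batching behaviour concrete, here is a rough sketch in plain Lua that does not use any Aerospike API. It aggregates records in batches from a fresh seed table, then merges the partial maps by keeping the larger value per key. The batch size of 10 is only an assumption taken from the reset interval in your log, and `simulate` and `merge_max` are hypothetical helpers standing in for the server-side batching and `map.merge`; the input is the first 22 arrival times from the log.

```lua
-- Plain-Lua simulation of batched aggregation followed by a final reduce.
local BATCH_SIZE = 10  -- assumption: matches the reset interval seen in the log

-- Same idea as the aggregate step: keep the largest arrival time per city.
local function aggregate_city_to_max(result, record)
    local city = string.upper(record.DEST_CITY_NAME)
    local t = record.ARR_TIME
    if result[city] == nil or result[city] < t then
        result[city] = t
    end
    return result
end

-- Hypothetical stand-in for map.merge(a, b, fn): combine two partial maps,
-- keeping the larger value when both have the same key.
local function merge_max(a, b)
    local out = {}
    for k, v in pairs(a) do out[k] = v end
    for k, v in pairs(b) do
        if out[k] == nil or out[k] < v then out[k] = v end
    end
    return out
end

-- Aggregate each batch starting from an empty seed, then reduce the partials.
local function simulate(records)
    local partials = {}
    local seed = {}
    for i, rec in ipairs(records) do
        seed = aggregate_city_to_max(seed, rec)
        if i % BATCH_SIZE == 0 or i == #records then
            partials[#partials + 1] = seed  -- partial result is pushed out
            seed = {}                       -- next batch starts empty
        end
    end
    local final = {}
    for _, p in ipairs(partials) do final = merge_max(final, p) end
    return final, partials
end

-- First 22 arrival times from the log above.
local times = {1253, 1221, 1600, 1203, 1424, 2141, 1821, 1221, 1424, 1550,
               1703, 2312, 2251, 19, 1030, 1257, 803, 19, 1502, 2319,
               1735, 1221}
local records = {}
for i, t in ipairs(times) do
    records[i] = { DEST_CITY_NAME = "Buffalo", ARR_TIME = t }
end

local final, partials = simulate(records)
for i, p in ipairs(partials) do print("partial " .. i, p.BUFFALO) end
print("final", final.BUFFALO)
```

Each partial map only knows the maximum of its own batch (2141, 2319 and 1735 here), so per-record log lines look as if the maximum is reset, while the merged result is still the overall maximum, 2319.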
