简体   繁体   中英

Neo4j Cypher NULL values and descending sorting

I have an auditing filed for all of my entities:

createDate
updateDate

I always initialize createDate during the entity creation but updateDate can contain NULL until the first update.

I have to implement sorting feature over these fields.

With createDate everything works fine but with updateDate I have issues.

In case of a mixed set of NULLs and Dates in updateDate during the descending sort, I have the NULLs first and this is not the something I'm expecting here.

I understand that according to the Neo4j documentation, this is an expecting behavior - When sorting the result set, null will always come at the end of the result set for ascending sorting, and first when doing descending sort. but I don't know right now how to implement the proper sorting from the user perspective where the user will see the latest updated documents at the top of the list. Some time ago I have even created GitHub issue for this feature https://github.com/opencypher/openCypher/issues/238

One workaround I can see here - is to populate also updateDate together with createDate during the entity creation but I really hate this solution.

Are there any other solutions in order to properly implement this task?

You can try using the coalesce() function . It will return the first non-null value in the list of expressions passed to it.

MATCH (n:Node)
RETURN n
ORDER BY coalesce(n.updateDate, 0) DESC

EDIT:

From comments:

on the database level it is something like this: "updateDate": "2017-09-07T22:27:11.012Z". On the SDN4 level it is a Java - java.util.Date type

In this case you can change the 0 by a date representing an Start-Of-Time constant (like "1970-01-01T00:00:00.000Z").

MATCH (n:Node)
RETURN n
ORDER BY coalesce(n.updateDate, "1970-01-01T00:00:00.000Z") DESC

I'd just use the createDate as the updateDate when updateDate IS NULL:

MATCH (n:Node)
RETURN n
ORDER BY coalesce(n.updateDate, n.createDate) DESC

You may want to consider storing your ISO 8601 timestamp strings as (millisecond) integers instead. That could make most queries that involve datetime manipulations more efficient (or even possible), and would also use up less DB space compared to the equivalent string.

One way to do that conversion is to use the APOC function apoc.date.parse . For example, this converts 2017-09-07T22:27:11.012Z to an integer (in millisecond units):

apoc.date.parse('2017-09-07T22:27:11.012Z', 'ms', "yyyy-MM-dd'T'HH:mm:ss.SSSX")

With this change to your data model, you could also initialize updateDate to 0 at node creation time. This would allow you to avoid having to use COALESCE(n.updateDate, 0) for sorting purposes (as suggested by @Bruno Peres), and the 0 value would serve as an indication that the node was never updated. (But the drawback would be that all nodes would have an updateDate property, even the ones that were never updated.)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM