Neo4J Cypher - Union & With Clause usage

I'm trying to create an invoice-product subgraph using UNION in my query, and would then like to pass only the invoice nodes to a subquery using WITH. The query looks like this:

MATCH (year:Year)-[]->(month:Month)-[]->(day:Day)-[]->(inv:Invoice)-[]->(prd:Product)
WHERE year.value='2014' and day.value IN ['27','28','29','30','31'] and month.value='January'
RETURN (inv:Invoice)-[:PRODUCT]->(prd:Product)
UNION
MATCH (year:Year)-[]->(month:Month)-[]->(day:Day)-[]->(inv:Invoice)-[]->(prd:Product)
WHERE year.value='2014' and day.value IN ['01','02','03','04'] and month.value='February'
WITH inv
MATCH (inv)-[p:PROCESSED_AT]-(time:Time) WHERE p.time > 700 and p.time < 900
RETURN inv.invid

But I'm getting this error: "All sub queries in an UNION must have the same column names". Any input on where I'm going wrong here?

I think the error is fairly clear: the two parts of your UNION do not return the same column names, so they cannot be combined. If I understand correctly, you are trying to get all the invoices for a particular product that have an invalid timestamp. If that is the case, I don't think you need a UNION; a WITH clause may be enough.

Anyway, to fix the above query you may just want to make the two parts of the UNION return the same columns, as below:

MATCH (year:Year)-[]->(month:Month)-[]->(day:Day)-[]->(inv:Invoice)-[]->(prd:Product)
WHERE year.value='2014' and day.value IN ['27','28','29','30','31'] and month.value='January'
MATCH (inv:Invoice)-[:PRODUCT]->(prd:Product)
RETURN inv, prd
UNION
MATCH (year:Year)-[]->(month:Month)-[]->(day:Day)-[]->(inv:Invoice)-[]->(prd:Product)
WHERE year.value='2014' and day.value IN ['01','02','03','04'] and month.value='February'
WITH inv
MATCH (prd:Product)<-[:PRODUCT]-(inv)-[p:PROCESSED_AT]-(time:Time) WHERE p.time > 700 and p.time < 900
RETURN inv, prd

EDIT

I think you have UNION, RETURN and WITH slightly mixed up. To be honest, I don't know how to use UNION to join two results and then use WITH to stitch them to another query. However, I think you can solve this particular problem by simply rearranging the conditions, as below:

MATCH (year:Year)-[]->(month:Month)-[]->(day:Day)-[]->(inv:Invoice)-[]->(prd:Product)
WHERE year.value='2014' and ((day.value IN ['27','28','29','30','31'] and month.value='January') or (day.value IN ['01','02','03','04'] and month.value='February'))
WITH inv
MATCH (inv)-[p:PROCESSED_AT]-(time:Time) WHERE p.time > 700 and p.time < 900
RETURN inv.invid
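
For what it's worth, newer Neo4j releases (4.0 and later, as far as I know) support CALL { ... } subqueries, and post-processing the result of a UNION is one of the things they are meant for. A rough sketch, reusing the labels and properties from the question and not tested against your data, could look like this:

```cypher
// Sketch only: assumes Neo4j 4.0+ where CALL { ... } subqueries are available.
// Both branches return a single column named inv, so the UNION is valid.
CALL {
  MATCH (year:Year)-[]->(month:Month)-[]->(day:Day)-[]->(inv:Invoice)
  WHERE year.value='2014' and month.value='January'
    and day.value IN ['27','28','29','30','31']
  RETURN inv
  UNION
  MATCH (year:Year)-[]->(month:Month)-[]->(day:Day)-[]->(inv:Invoice)
  WHERE year.value='2014' and month.value='February'
    and day.value IN ['01','02','03','04']
  RETURN inv
}
// The combined (de-duplicated) invoices are now available to the outer query.
WITH inv
MATCH (inv)-[p:PROCESSED_AT]-(time:Time)
WHERE p.time > 700 and p.time < 900
RETURN inv.invid
```

On older versions that lack subqueries, the single-MATCH version with the combined WHERE above is probably still the simplest option.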

You can pass multiple patterns, separated by commas, to a single MATCH, provided that together they make up one connected graph, and that will simplify your problem.

MATCH (year:Year)-[]->(month:Month)-[]->(day:Day)-[]->(inv:Invoice)-[]->(prd:Product)
WHERE year.value='2014' and day.value IN ['01','02','03','04'] and month.value='February'
WITH inv
MATCH (prd:Product)<-[:PRODUCT]-(inv)-[p:PROCESSED_AT]-(time:Time) WHERE p.time > 700 and p.time < 900
RETURN inv, prd

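can become (keeping the date filter, and not repeating the Invoice-to-Product hop, because a single MATCH will not traverse the same relationship twice; this assumes that relationship is of type PRODUCT, as in the question):

MATCH (year:Year)-[]->(month:Month)-[]->(day:Day)-[]->(inv:Invoice)-[:PRODUCT]->(prd:Product),
(inv)-[p:PROCESSED_AT]-(time:Time)
WHERE year.value='2014' and day.value IN ['01','02','03','04'] and month.value='February'
and p.time > 700 and p.time < 900
RETURN inv, prd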

and you can similarly simplify the first half of your query.

Also, when using UNION, the column names returned must match in every RETURN clause.
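
To illustrate the column-name rule, here is a small sketch using the labels and relationship types from the question (the alias invid is just a name picked for the example). Aliasing with AS is usually the easiest way to make both parts line up:

```cypher
// Rejected: the two parts return differently named columns (inv vs. inv.invid)
// MATCH (inv:Invoice) RETURN inv
// UNION
// MATCH (inv:Invoice) RETURN inv.invid

// Accepted: both parts return exactly one column, named invid
MATCH (inv:Invoice)-[:PRODUCT]->(:Product)
RETURN inv.invid AS invid
UNION
MATCH (inv:Invoice)-[:PROCESSED_AT]-(:Time)
RETURN inv.invid AS invid
```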

I finally solved this by changing the overall design so that UNION is not needed at all. One way is to model each timestamp as a single entity instead of storing Year, Month and Day separately.

i.e. add timestamp nodes like below:

// Add TimeStamp nodes
CREATE (datets1:DateTS {value: 201401270730})
CREATE (datets2:DateTS {value: 201401270800})
CREATE (datets3:DateTS {value: 201401280745})
CREATE (datets4:DateTS {value: 201402020815})
CREATE (datets5:DateTS {value: 201402020830})
CREATE (datets6:DateTS {value: 201402030702})

Note: in my case I don't need seconds or milliseconds, so the value only has hour-and-minute resolution.

Now I associate the events in my graph with these nodes, and the query becomes very simple, something like:

MATCH (inv)-[:DATETIME]->(dts)
WHERE dts.value >= 201401280700 and dts.value <= 201401280900
RETURN DISTINCT inv.invid;
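
Linking an invoice to its timestamp node could be done roughly as follows. This is only a sketch: the invid value 'INV-001' is made up for illustration, and MERGE is used so that the DateTS node and the relationship are created only if they do not already exist.

```cypher
// Hypothetical invoice id; replace with a real one from your graph.
MATCH (inv:Invoice {invid: 'INV-001'})
// Reuse the timestamp node if present, otherwise create it.
MERGE (dts:DateTS {value: 201401280745})
// Attach the invoice to its timestamp.
MERGE (inv)-[:DATETIME]->(dts)
```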

To improve the performance of this range query, we can create an index on the timestamp, i.e. the value property of the DateTS nodes.
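
The exact statement depends on the Neo4j version. As far as I know, the two forms are the following (run whichever one matches your version):

```cypher
// Neo4j 3.x style
CREATE INDEX ON :DateTS(value)

// Neo4j 4.x and later style (use this one instead on newer versions)
// CREATE INDEX FOR (d:DateTS) ON (d.value)
```

You can prefix the range query with EXPLAIN or PROFILE to check whether the planner actually picks up the index.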
