Currently, I have a graph stored through the DSE Graph Engine with 100K nodes. These nodes have label "customer" and a property called "age" which allows integer values. I have indexed this property with the following command:
schema.vertexLabel("customer").index("custByAge").secondary().by("age").add()
I would like to use this index to answer queries that look for customers within a certain age range (e.g., "age" between 10 and 20). However, the index I created does not seem to be used when I query customers by an age interval.
When I submit the following query, a list of vertices is returned in about 40ms, which leads me to believe that the index is being used:
g.V().has('customer','age',15)
But when I submit the following query, the query times out after 30 sec (as I have specified in my configuration):
g.V().has('customer','age',inside(10,20))
Interruption of result iteration
Display stack trace? [yN]
This leads me to believe that the index is not being used for this query. Does that seem right? And if the index is not being used, does anyone have some advice for how I can speed up this query?
EDIT: As suggested by an answer below, I have run .profile() on each of the above queries, with the following results (only showing relevant info):
gremlin> g.V().has('customer','age',21).profile()
==>Traversal Metrics
...
index-query 14.333ms
gremlin> g.V().has('customer','age',inside(21,23)).profile()
==>Traversal Metrics
...
index-query 115.055ms
index-query 132.144ms
index-query 132.842ms
>TOTAL 53042.171ms
This leaves me with a few questions:
Does the presence of index-query in the output mean that indexes are being used for my second query?
Are you using DataStax Studio? If so, you can use the .profile() feature to understand how the index is being engaged.
Example .profile() use: g.V().in().has('name','Julia Child').count().profile()
For a range query like this you want a search index; it will be much faster.
For example, in KillRVideo:
schema.vertexLabel("movie").index("search").search().by("year").add()
g.V().hasLabel('movie').has('year', gt(2000)).has('year', lte(2017)).profile()
Then from Studio profile() we can see:
SELECT "community_id", "member_id" FROM "killrvideo"."movie_p" WHERE "solr_query" = '{"q":"*:*", "fq":["year:{2000 TO *}","year:{* TO 2017]"]}' LIMIT ?; with params (java.lang.Integer) 50000
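Applied to the graph in the question, the same approach might look like the sketch below. It reuses the "customer" label and "age" property from the question and the index name "search" from the movie example. It assumes DSE Search is enabled on the nodes that host the graph, and it has not been run against the asker's data. The range is written with gt and lt, in the same style as the movie query; together they select the same ages as inside(10,20), which excludes both bounds.

```groovy
// Assumption: DSE Search is enabled for this graph.
// The index name "search" is carried over from the movie example above.
schema.vertexLabel("customer").index("search").search().by("age").add()

// Range query in the same style as the movie example:
// gt(10) combined with lt(20) matches the same ages as inside(10,20).
g.V().hasLabel('customer').has('age', gt(10)).has('age', lt(20)).profile()
```

If the search index is being used, the profile() output should show a solr_query clause with a range filter on age, similar to the one shown for year.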
By default, the profiler doesn't show the trace of all operations, so the index-query list you see may be truncated. Modify "max_profile_events" according to this documentation: https://docs.datastax.com/en/dse/5.1/dse-dev/datastax_enterprise/graph/reference/schema/refSchemaConfig.html