Making direct gremlin queries from gremlin-python

Posted 2018-10-05 18:29:43. Tags: gremlin, gremlin-server, gremlinpython

I've encountered several issues with gremlin-python that do not occur in plain Gremlin:

I can't directly select a given vertex type (g.V('customer')) without iterating over all vertices (g.V().hasLabel('customer')).
I get "Maximum Recursion reached" errors from Python. The same query in Gremlin runs smoothly and quickly.
The .next() call is really slow in gremlin-python, while in Gremlin it takes 1 second.

So, from Python/gremlin-python, I would like to be able to send a plain Gremlin query to the server and store its result directly in a Python variable. Is that possible?

(I'm using gremlin-python on Apache Zeppelin, if that matters.)

1 Answer

I can't directly select a given vertex type (g.V('customer')) without iterating over all vertices (g.V().hasLabel('customer'))

g.V('customer') in Gremlin means "find a vertex with the id 'customer'", not "find vertices with the label 'customer'". For the latter you need what you wrote: g.V().hasLabel('customer'). Those rules are the same in every variant of Gremlin, including Python. And you are correct that a query like g.V().hasLabel('customer') will be expensive, as there aren't many graphs that optimize this type of operation. On large graphs this would typically be considered an OLAP query that you would run with Gremlin Spark.
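To make the id-versus-label distinction concrete, here is a toy in-memory model in plain Python. It is only a sketch: ToyGraph and its methods are hypothetical names, it is not the gremlin-python API, and it does not describe how any particular graph database stores vertices. It assumes vertices are kept in a dictionary keyed by id, with no index on labels.

```python
class ToyGraph:
    """Hypothetical toy vertex store: a dict mapping vertex id -> label."""

    def __init__(self):
        self.vertices = {}

    def add_vertex(self, vertex_id, label):
        self.vertices[vertex_id] = label

    def V(self, vertex_id):
        # Analogous to g.V(id): a single lookup by id, independent of graph size.
        return [vertex_id] if vertex_id in self.vertices else []

    def has_label(self, label):
        # Analogous to g.V().hasLabel(label) with no label index:
        # every vertex is inspected, so the cost grows with the graph.
        return [vid for vid, lbl in self.vertices.items() if lbl == label]


graph = ToyGraph()
graph.add_vertex(1, "customer")
graph.add_vertex(2, "order")
graph.add_vertex(3, "customer")

by_id = graph.V("customer")             # empty: no vertex has the id 'customer'
by_label = graph.has_label("customer")  # the ids of the two customer vertices
print(by_id, by_label)
```

In this model, passing 'customer' to V() finds nothing because it is treated as an id, while the label filter has to walk the whole dictionary.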

I get "Maximum Recursion reached" errors from Python. The same query in Gremlin runs smoothly and quickly.

That was a bug. It is resolved now, but the fix has not yet been released to PyPI. A release is currently being prepared, so you will see the fix in 3.2.10 and 3.3.4. If you need an immediate patch, you can see that the fix was fairly trivial.

The .next() call is really slow in gremlin-python, while in Gremlin it takes 1 second.

I'm not sure what you're seeing exactly. I think you should describe your environment in more detail, with specifics on how to reproduce the difference. Perhaps you should bring that question to the gremlin-users mailing list.

So, from Python/gremlin-python, I would like to be able to send a plain Gremlin query to the server and store its result directly in a Python variable. Is that possible?

That's perfectly possible and is exactly what gremlin-python is meant to do. It lets you write Gremlin in Python and get results back from the server to use as you need on the client side.
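If the goal is specifically to send a Gremlin string and capture the result, one option is the driver's script submission. The following is a minimal sketch, not a definitive recipe: the helper name submit_script, the endpoint ws://localhost:8182/gremlin, and the traversal source name g are assumptions for illustration. It requires the gremlinpython package and a reachable Gremlin Server.

```python
def submit_script(script, bindings=None,
                  url="ws://localhost:8182/gremlin", traversal_source="g"):
    """Send a Gremlin script string to the server and return the results as a list.

    The URL and traversal source name are assumed defaults; adjust them
    to match your own Gremlin Server configuration.
    """
    # Imported here so this module can be loaded without gremlinpython installed.
    from gremlin_python.driver import client

    gremlin_client = client.Client(url, traversal_source)
    try:
        # submit() returns a result set; all() yields a future whose
        # result() blocks until every result has arrived.
        return gremlin_client.submit(script, bindings).all().result()
    finally:
        gremlin_client.close()
```

A call such as `customers = submit_script("g.V().hasLabel(lbl).limit(10)", {"lbl": "customer"})` would then leave the server's results in an ordinary Python list. Passing the label through bindings rather than string concatenation keeps the script text constant between calls.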