Fulltext Search in ArangoDB using AQL and Python

I have stored the data in ArangoDB in the following format:

{"data": [
{
  "content": "maindb",
  "type": "string",
  "name": "db_name",
  "key": "1745085839"
},
{
  "type": "id",
  "name": "rel",
  "content": "1745085840",
  "key": "1745085839"
},
{
  "content": "user",
  "type": "string",
  "name": "rel_name",
  "key": "1745085840"
},
{
  "type": "id",
  "name": "tuple",
  "content": "174508584001",
  "key": "1745085840"
},
{
  "type": "id",
  "name": "tuple",
  "content": "174508584002",
  "key": "1745085840"
},
{
  "type": "id",
  "name": "tuple",
  "content": "174508584003",
  "key": "1745085840"
},
{
  "type": "id",
  "name": "tuple",
  "content": "174508584004",
  "key": "1745085840"
},
{
  "type": "id",
  "name": "tuple",
  "content": "174508584005",
  "key": "1745085840"
},
{
  "type": "id",
  "name": "tuple",
  "content": "174508584006",
  "key": "1745085840"
},
{
  "type": "id",
  "name": "tuple",
  "content": "174508584007",
  "key": "1745085840"
},
{
  "content": "dspclient",
  "type": "varchar",
  "name": "username",
  "key": "174508584001"
},
{
  "content": "12345",
  "type": "varchar",
  "name": "password",
  "key": "174508584001"
},
{
  "content": "12345",
  "type": "varchar",
  "name": "cpassword",
  "key": "174508584001"
},
{
  "content": "n",
  "type": "varchar",
  "name": "PostgreSQL",
  "key": "174508584001"
},
{
  "content": "n",
  "name": "IBMDB2",
  "type": "varchar",
  "key": "174508584001"
},
{
  "content": "n",
  "name": "MySQL",
  "type": "varchar",
  "key": "174508584001"
},
{
  "content": "n",
  "type": "varchar",
  "name": "SQLServer",
  "key": "174508584001"
},
{
  "content": "n",
  "name": "Hadoop",
  "type": "varchar",
  "key": "174508584001"
},
{
  "content": "None",
  "name": "dir1",
  "type": "varchar",
  "key": "174508584001"
},
{
  "content": "None",
  "name": "dir2",
  "type": "varchar",
  "key": "174508584001"
},
{
  "content": "None",
  "name": "dir3",
  "type": "varchar",
  "key": "174508584001"
},
{
  "content": "None",
  "name": "dir4",
  "type": "varchar",
  "key": "174508584001"
},
{
  "type": "inet",
  "name": "ipaddr",
  "content": "1921680103",
  "key": "174508584001"
},
{
  "content": "y",
  "name": "status",
  "type": "varchar",
  "key": "174508584001"
},
{
  "content": "None",
  "type": "varchar",
  "name": "logintime",
  "key": "174508584001"
},
{
  "content": "None",
  "type": "varchar",
  "name": "logindate",
  "key": "174508584001"
},
{
  "content": "None",
  "type": "varchar",
  "name": "logouttime",
  "key": "174508584001"
},
{
  "content": "client",
  "type": "varchar",
  "name": "user_type",
  "key": "174508584001"
},
{
  "content": "royal",
  "type": "varchar",
  "name": "username",
  "key": "174508584002"
},
{
  "content": "12345",
  "type": "varchar",
  "name": "password",
  "key": "174508584002"
},
{
  "content": "12345",
  "type": "varchar",
  "name": "cpassword",
  "key": "174508584002"
},
{
  "content": "n",
  "type": "varchar",
  "name": "PostgreSQL",
  "key": "174508584002"
},
{
  "content": "n",
  "name": "IBMDB2",
  "type": "varchar",
  "key": "174508584002"
},
{
  "content": "n",
  "name": "MySQL",
  "type": "varchar",
  "key": "174508584002"
},
{
  "content": "n",
  "type": "varchar",
  "name": "SQLServer",
  "key": "174508584002"
},
{
  "content": "n",
  "name": "Hadoop",
  "type": "varchar",
  "key": "174508584002"
},
{
  "content": "None",
  "name": "dir1",
  "type": "varchar",
  "key": "174508584002"
},
{
  "content": "None",
  "name": "dir2",
  "type": "varchar",
  "key": "174508584002"
},
{
  "content": "None",
  "name": "dir3",
  "type": "varchar",
  "key": "174508584002"
},
{
  "content": "None",
  "name": "dir4",
  "type": "varchar",
  "key": "174508584002"
},
{
  "type": "inet",
  "name": "ipaddr",
  "content": "1921680105",
  "key": "174508584002"
},
{
  "content": "y",
  "name": "status",
  "type": "varchar",
  "key": "174508584002"
},
{
  "content": "190835899000",
  "type": "varchar",
  "name": "logintime",
  "key": "174508584002"
},
{
  "content": "20151002",
  "type": "varchar",
  "name": "logindate",
  "key": "174508584002"
},
{
  "content": "None",
  "type": "varchar",
  "name": "logouttime",
  "key": "174508584002"
},
{
  "content": "client",
  "type": "varchar",
  "name": "user_type",
  "key": "174508584002"
},
{
  "content": "abc",
  "type": "varchar",
  "name": "username",
  "key": "174508584003"
},
{
  "content": "12345",
  "type": "varchar",
  "name": "password",
  "key": "174508584003"
},
{
  "content": "12345",
  "type": "varchar",
  "name": "cpassword",
  "key": "174508584003"
},
{
  "content": "n",
  "type": "varchar",
  "name": "PostgreSQL",
  "key": "174508584003"
},
{
  "content": "n",
  "name": "IBMDB2",
  "type": "varchar",
  "key": "174508584003"
}]}

To perform a fulltext search, I created an index on the content attribute with the following call from a Python script:

c.DSP.ensureFulltextIndex("content");

Here, c is the database and DSP is the collection name. Now I am trying to search the above data set with the following query:

FOR doc IN FULLTEXT(DSP, "content", "username") RETURN doc

Then an error occurs:

[1571] in function 'FULLTEXT()': no suitable fulltext index found for fulltext query on 'DSP' (while executing)

Please tell me what the problem is, and what the syntax would be to run this query from a Python script.

Thanks...

Working from the 10-minute tutorial and the driver documentation, I got it working like this:

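from pyArango.connection import Connection

# Connect to the ArangoDB server using the driver's default settings
c = Connection()
db = c.createDatabase(name="testdb")
DSP = db.createCollection(name="DSP")

# The fulltext index must exist before FULLTEXT() can be used on "content"
DSP.ensureFulltextIndex(fields=["content"])

doc = DSP.createDocument({"content": "test bla"})
doc.save()

# Search the "content" attribute for the word "bla"
print(db.AQLQuery('''FOR doc IN FULLTEXT(DSP, "content", "bla") RETURN doc''', 10))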

Resulting in:

[{u'_key': u'1241175138503', u'content': u'test bla', u'_rev': u'1241175138503', u'_id': u'DSP/1241175138503'}]
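To run the same query from a Python script with a variable search term, the term can be passed as an AQL bind parameter instead of being written into the query string. The following is a minimal sketch, not a definitive implementation. It assumes `db` is a pyArango database handle like the one created above. `fulltext_search` and `FULLTEXT_AQL` are hypothetical names introduced for illustration.

```python
# AQL text with a bind parameter (@term) for the search phrase, so the
# phrase is never spliced into the query string by hand.
FULLTEXT_AQL = 'FOR doc IN FULLTEXT(DSP, "content", @term) RETURN doc'


def fulltext_search(db, term, batch_size=10):
    """Run the fulltext query on a pyArango database handle.

    `db` is assumed to be a pyArango Database object such as the one
    returned by c.createDatabase(...) in the snippet above.
    """
    return db.AQLQuery(
        FULLTEXT_AQL,
        rawResults=True,        # ask for plain dicts rather than Document objects
        batchSize=batch_size,   # number of results fetched per round trip
        bindVars={"term": term},
    )
```

With a live connection, `fulltext_search(db, "bla")` should correspond to the query shown above. The keyword arguments are used on purpose, since the positional order of `AQLQuery` parameters may differ between driver versions.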

I used arangosh to verify the result of the steps run from the Python prompt:

arangosh> db._useDatabase("testdb")
arangosh [testdb]> db.DSP.getIndexes()
[ 
  { 
    "id" : "DSP/0", 
    "type" : "primary", 
    "fields" : [ 
      "_key" 
    ], 
    "selectivityEstimate" : 1, 
    "unique" : true, 
    "sparse" : false 
  }, 
  { 
    "id" : "DSP/1241140928711", 
    "type" : "hash", 
    "fields" : [ 
      "content" 
    ], 
    "selectivityEstimate" : 1, 
    "unique" : false, 
    "sparse" : true 
  }, 
  { 
    "id" : "DSP/1241142960327", 
    "type" : "fulltext", 
    "fields" : [ 
      "content" 
    ], 
    "unique" : false, 
    "sparse" : true, 
    "minLength" : 2 
  } 
]
arangosh [testdb]> db.DSP.toArray()
[ 
  { 
    "content" : "test bla", 
    "_id" : "DSP/1241175138503", 
    "_rev" : "1241175138503", 
    "_key" : "1241175138503" 
  } 
]
arangosh [testdb]> db._query('FOR doc IN FULLTEXT(DSP, "content", "bla") RETURN doc')
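
Error 1571 from the question means the server found no fulltext index on the queried attribute, so it can help to inspect the index list before running the query. The following is a minimal sketch. It assumes the index descriptions are available as a list of dicts shaped like the getIndexes() output above. `has_fulltext_index` is a hypothetical helper, not part of pyArango or arangosh.

```python
def has_fulltext_index(indexes, field):
    """Return True if `indexes` contains a fulltext index covering `field`."""
    return any(
        idx.get("type") == "fulltext" and field in idx.get("fields", [])
        for idx in indexes
    )


# Index descriptions copied (abridged) from the getIndexes() output above.
indexes = [
    {"id": "DSP/0", "type": "primary", "fields": ["_key"]},
    {"id": "DSP/1241140928711", "type": "hash", "fields": ["content"]},
    {"id": "DSP/1241142960327", "type": "fulltext", "fields": ["content"],
     "minLength": 2},
]

print(has_fulltext_index(indexes, "content"))  # True
print(has_fulltext_index(indexes, "name"))     # False
```

A hash index on the same attribute does not count: only an entry with type "fulltext" satisfies FULLTEXT().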
