Elasticsearch file mapping using fscrawler and searching documents with NEST in C#

I indexed the documents in the folder "/tmp/es" using fscrawler 2.3-SNAPSHOT. It mapped them as:

{
  "properties" : {
    "attachment" : {
      "type" : "binary",
      "doc_values": false
    },
    "attributes" : {
      "properties" : {
        "group" : {
          "type" : "keyword"
        },
        "owner" : {
          "type" : "keyword"
        }
      }
    },
    "content" : {
      "type" : "text"
    },
    "file" : {
      "properties" : {
        "content_type" : {
          "type" : "keyword"
        },
        "filename" : {
          "type" : "keyword"
        },
        "extension" : {
          "type" : "keyword"
        },
        "filesize" : {
          "type" : "long"
        },
        "indexed_chars" : {
          "type" : "long"
        },
        "indexing_date" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        },
        "last_modified" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        },
        "checksum": {
          "type": "keyword"
        },
        "url" : {
          "type" : "keyword",
          "index" : true
        }
      }
    },
    "object" : {
      "type" : "object"
    },
    "meta" : {
      "properties" : {
        "author" : {
          "type" : "text"
        },
        "date" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        },
        "keywords" : {
          "type" : "text"
        },
        "title" : {
          "type" : "text"
        },
        "language" : {
          "type" : "keyword"
        }
      }
    },
    "path" : {
      "properties" : {
        "encoded" : {
          "type" : "keyword"
        },
        "real" : {
          "type" : "keyword",
          "fields": {
            "tree": {
              "type" : "text",
              "analyzer": "fscrawler_path",
              "fielddata": true
            }
          }
        },
        "root" : {
          "type" : "keyword"
        },
        "virtual" : {
          "type" : "keyword",
          "fields": {
            "tree": {
              "type" : "text",
              "analyzer": "fscrawler_path",
              "fielddata": true
            }
          }
        }
      }
    }
  }
}

Now I want to search them using NEST in my C# application. I can get the content through hit.Source.content, but I cannot get the file name through hit.Source.filename.

Code:

// Run a query_string query against the "tanks" index, type "doc".
var response = elasticClient.Search<documents>(s => s
    .Index("tanks")
    .Type("doc")
    .Query(q => q.QueryString(qs => qs.Query(query))));

if (rtxSearchResult.Text != " ")
{
    rtxSearchResult.Text = " ";

    foreach (var hit in response.Hits)
    {
        // filename and url are null on hit.Source here, so ToString() throws.
        rtxSearchResult.Text = rtxSearchResult.Text + ("Name: " + hit.Source.filename.ToString()
            + Environment.NewLine
            + "Content: " + hit.Source.content.ToString()
            + Environment.NewLine
            + "URL: " + hit.Source.url.ToString()
            + Environment.NewLine
            + Environment.NewLine);
    }
}

The code above throws a NullReferenceException, but it runs when I comment out the lines that use hit.Source.url and hit.Source.filename.

Kibana shows the filename field as file.filename, the url field as file.url, and the content field as content.

Because filename sits under file, I am unable to retrieve it. Please help; I have been stuck on this for a couple of days now.

Found the error:

My documents class was:

// Flat shape: filename and url are looked up at the top level of the document.
public class documents
{
    public string filename { get; set; }

    public string content { get; set; }

    public string url { get; set; }
}

Because filename and url are stored as file.filename and file.url, the documents class needs a file property whose type is a second class, File, that holds filename and url.

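public class documents
{
    // Matches the "file" object in the mapping.
    public File file { get; set; }

    public string content { get; set; }
}

// Holds the fields stored under "file" in the mapping.
public class File
{
    public string filename { get; set; }

    public string url { get; set; }
}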

With that change I was able to access them through hit.Source.file.filename and hit.Source.file.url.
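
The difference between the two class shapes can be reproduced without a cluster. The following is a minimal sketch, not the code from the question: it uses System.Text.Json as a stand-in for NEST's own serializer (on the assumption that both match JSON properties to class properties by name), and the class names FlatDocument, NestedDocument and FilePart, as well as the sample _source, are hypothetical.

```csharp
using System;
using System.Text.Json;

// Flat shape, as in the original documents class (hypothetical name).
public class FlatDocument
{
    public string filename { get; set; }
    public string content { get; set; }
    public string url { get; set; }
}

// Holds the fields stored under "file" (hypothetical name).
public class FilePart
{
    public string filename { get; set; }
    public string url { get; set; }
}

// Nested shape, as in the corrected documents class (hypothetical name).
public class NestedDocument
{
    public FilePart file { get; set; }
    public string content { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Made-up _source, shaped like the mapping: filename and url sit under "file".
        const string source = @"{
            ""content"": ""hello world"",
            ""file"": { ""filename"": ""report.pdf"", ""url"": ""file:///tmp/es/report.pdf"" }
        }";

        var flat = JsonSerializer.Deserialize<FlatDocument>(source);
        var nested = JsonSerializer.Deserialize<NestedDocument>(source);

        // The flat class has no top-level "filename" to bind to, so it stays null.
        Console.WriteLine(flat.filename == null);          // prints "True"

        // The nested class mirrors the JSON, so the value is populated.
        Console.WriteLine(nested.file.filename);           // prints "report.pdf"

        // Null-conditional access avoids an exception if "file" is ever missing.
        Console.WriteLine(nested.file?.url ?? "(no url)"); // prints "file:///tmp/es/report.pdf"
    }
}
```

In the search loop itself, the same null-conditional pattern (hit.Source.file?.filename) would guard against documents that were indexed without a file object.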
