
Building a Solr index using a large text file

Original post: 2015-03-19 11:33:53 8 2 python/ solr

I have a large text file in the following format:

00001,234234|234|235|7345
00005,788|298|234|735

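You can treat the first value as the key. What I want is a quick and dirty way to query these keys and find the result set for each key. After some reading, I found that Solr provides a good framework for doing this.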

What would be the starting point?
Can I use Python to read the file and build the index (search engine)?
Are there different mechanisms for doing this?

2 answers

You can definitely do this with pysolr (a Python library). If the data is in key-value form, you can read it in Python and index it as shown here: https://pypi.python.org/pypi/pysolr/3.1.0

For more control over search, you need to modify the schema.xml file so that it defines fields matching the keys in your text file.

Once the data has been ingested into Solr, you can search it as described at the link above.
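As a rough illustration of this approach, here is a minimal sketch that parses the file format from the question and sends the documents to Solr in batches. The field name values_ss, the batch size, and the helper names parse_line, iter_batches and index_file are assumptions made for this example, not something from the question or from pysolr; values_ss only works if your schema defines a matching multi-valued string field (or dynamic field).

```python
def parse_line(line):
    """Turn '00001,234234|234|235|7345' into a document dict for Solr."""
    key, _, raw_values = line.strip().partition(",")
    # "values_ss" is an assumed field name; it must exist in your schema.
    return {"id": key, "values_ss": raw_values.split("|") if raw_values else []}


def iter_batches(lines, batch_size=1000):
    """Yield lists of parsed documents so the whole file never sits in memory."""
    batch = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        batch.append(parse_line(line))
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def index_file(path, solr_url, batch_size=1000):
    """Read the text file and add its rows to the Solr core at solr_url."""
    import pysolr  # imported here so the parsing helpers work without pysolr

    solr = pysolr.Solr(solr_url, timeout=10)
    with open(path, encoding="utf-8") as handle:
        for batch in iter_batches(handle, batch_size):
            solr.add(batch, commit=False)  # defer the commit until the end
    solr.commit()
```

With a running Solr instance, a call might look like index_file("data.txt", "http://localhost:8983/solr/mycore"), where both the file name and the core URL are placeholders. A key can then be looked up with pysolr.Solr(...).search("id:00001").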

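You can index the data directly in Solr using the UpdateCSV handler: you just need to specify the target field names in the fieldnames parameter of the curl call (or add them as the first line of the file). No custom code is needed.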

Remember to check that the target field for the |-separated values uses that character to split them into tokens.

For details, see https://wiki.apache.org/solr/UpdateCSV.
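The answer uses curl. For readers who would rather stay in Python, the same request can be sketched with the standard library. This is only a sketch under several assumptions: the core URL, the field name values, and the helper names build_csv_update_url and post_csv are all made up for illustration. The handler is assumed to be reachable at /update/csv. Instead of relying on the field's analyzer, the sketch uses the handler's f.<field>.split and f.<field>.separator parameters to break the |-separated values apart at index time, which requires the target field to be multi-valued.

```python
import os
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_csv_update_url(core_url, key_field="id", values_field="values"):
    """Build an UpdateCSV URL for a header-less 'key,v1|v2|v3' file."""
    params = [
        ("commit", "true"),
        ("header", "false"),  # the file has no header row
        ("fieldnames", f"{key_field},{values_field}"),
        (f"f.{values_field}.split", "true"),  # split the second column...
        (f"f.{values_field}.separator", "|"),  # ...on the pipe character
    ]
    return f"{core_url.rstrip('/')}/update/csv?{urlencode(params)}"


def post_csv(path, core_url):
    """Stream the file to Solr and return the HTTP status code."""
    headers = {
        "Content-Type": "text/csv; charset=utf-8",
        "Content-Length": str(os.path.getsize(path)),
    }
    with open(path, "rb") as handle:
        # Passing the open file as data avoids loading it all into memory.
        request = Request(
            build_csv_update_url(core_url),
            data=handle,
            headers=headers,
            method="POST",
        )
        with urlopen(request) as response:
            return response.status
```

A call such as post_csv("data.txt", "http://localhost:8983/solr/mycore") would then upload the file in one request; both arguments are placeholders for your own file and core.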

