
Massive data query with SOLR

I have a provider that gives me information as XML through SOLR queries. To extract the data I use different SOLR query URLs. The point is that I want to store more than 30,000,000 rows in our database.

To get all the data I am using this SOLR query:

http://provider.com/query?q=colour%3A%20(red)&group.field=manufacturer&format=xml&group.limit=1

This query returns 1,311,707 results.

<products>
<grouped>
<matches>1311707</matches>
<groups>
<item>
<doclist>
<start>0</start>
<numFound>36242</numFound>
<docs>

If I want to extract all the data (matches):

http://provider.com/query?q=colour%3A%20(red)&group.field=manufacturer&format=xml&group.limit=1311707

I receive this error:

<status_code>500</status_code>
<message/>
<error>Internal Server Error</error>

I am sure this is because group.limit is too large.

How can I receive all the matches? How can I query SOLR for a large number of results?

Thank you very much.


MIkpa

Using this request:

query?q=colour%3A%20(red)&fq=manufacturer:[%27%27%20TO%20*]&group=true&rows=1&start=1&group.field=manufacturer&group.offset=800&group.limit=8&format=xml

Increasing &start moves to the next groupValue, &rows is how many groupValue entries are shown, and &group.limit is the number of items that appear in each query. If I try &group.limit=160142 I receive an Internal Server Error. So the point is: how can I get all 160,142?

If I increase group.offset to 800, 900, etc., the result is always the same. How can I move the offset inside the group?

<products>
<grouped>
<matches>1311707</matches>
<groups>
<item>
<doclist>
<start>0</start>
<numFound>160142</numFound>
<docs>
<item>...</item>
<item>...</item>
<item>...</item>
<item>...</item>
<item>...</item>
<item>...</item>
<item>...</item>
<item>...</item>
</docs>
</doclist>
<groupValue>Fiat</groupValue>
</item>
</groups>

Thank you.

Try using the start and rows parameters to paginate result rows (with grouping enabled, they page through the groups).

Try using group.offset and group.limit to paginate the results within each group, requesting a small page at a time instead of one huge group.limit.
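As a rough illustration of paging inside one group, here is a minimal Python sketch that only builds the request URLs. It reuses the endpoint and parameters from the question; the function name group_page_urls and the page size of 1000 are assumptions made for the example, and the largest group.limit the provider accepts is not known. It also assumes the provider actually honours group.offset, which the question suggests may not be the case.

```python
from urllib.parse import urlencode

BASE_URL = "http://provider.com/query"  # endpoint taken from the question


def group_page_urls(num_found, page_size, group_start=0):
    """Yield one URL per page of documents inside a single group.

    num_found   -- the <numFound> value reported for that group
    page_size   -- documents per request (group.limit); assumed value
    group_start -- which group to show (start), combined with rows=1
    """
    for offset in range(0, num_found, page_size):
        params = {
            "q": "colour: (red)",
            "group": "true",
            "group.field": "manufacturer",
            "start": group_start,       # selects the group
            "rows": 1,                  # one group per response
            "group.offset": offset,     # first document within the group
            "group.limit": page_size,   # documents returned per request
            "format": "xml",
        }
        yield BASE_URL + "?" + urlencode(params)


urls = list(group_page_urls(num_found=160142, page_size=1000))
print(len(urls))  # 161 requests of at most 1000 documents each
print(urls[-1])
```

Each URL would then be fetched in turn and its docs stored, and the same loop repeated for every group by stepping group_start. If the provider ignores group.offset, an alternative to try is dropping grouping and paging the flat result with start and rows plus a filter on manufacturer.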
