Indexing JSON in Solr: fields are indexed as a list instead of as a single item

I'm trying to index a JSON file in Solr and it works, but I don't understand why Solr is indexing some elements as arrays instead of as single values.

When I index the example JSON file "books.json" it works fine, but when I index another file, "items.json", the output looks different.

Both files and their output are shown below:

Books.json

 [{
    "id" : "978-0641723445",
    "cat" : ["book","hardcover"],
    "name" : "The Lightning Thief",
    "author" : "Rick Riordan",
    "series_t" : "Percy Jackson and the Olympians",
    "sequence_i" : 1,
    "genre_s" : "fantasy",
    "inStock" : true,
    "price" : 12.50,
    "pages_i" : 384
  }]

 OUTPUT

{
    "id": "978-0641723445",
    "cat": [
      "book",
      "hardcover"
     ],
    "name": "The Lightning Thief",
    "author": "Rick Riordan",
    "author_s": "Rick Riordan",
    "series_t": "Percy Jackson and the Olympians",
    "sequence_i": 1,
    "genre_s": "fantasy",
    "inStock": true,
    "price": 12.5,
    "price_c": "12.5,USD",
    "pages_i": 384,
    "_version_": 1457847842153431000
},

Items.json

[{ 
    "title" : "Pruebas Carlos",
    "id" : 14,
     "desc" : "Probando como funciona el campo de descripciones"
}]

OUTPUT

{
    "title": [
       "Pruebas Carlos"
    ],
    "id": "10",
    "desc": [
      "Probando como funciona el campo de descripciones"
    ],
    "_version_": 1457849881416695800
},

My schema, where I only added the new fields that I need.

Can someone explain what I have to do to index the elements without the []?

Thanks

You have set both fields (title, desc) as multivalued; that is why they come back as arrays. If they only ever hold a single value, declare them like this:

<field name="desc" type="text_general" indexed="true" stored="true" multiValued="false"/>
<field name="title" type="text_general" indexed="true" stored="true" multiValued="false"/>

In short, your schema configures these fields as arrays, which is why they are written as JSON arrays in the response, even if they have only one member in your samples.

You need to configure them as multiValued="false" if they are only single-valued.
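To find which fields are affected, one option is to scan the schema for explicit multiValued="true" declarations. The following is only a minimal sketch: the inline schema excerpt is a hypothetical stand-in for your real schema.xml, and the helper name multivalued_fields is made up for this example. It looks only at the attribute on each <field> element, so a default inherited from a field type would not be detected.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema excerpt for illustration; in practice, load the
# contents of your core's schema.xml instead of this inline string.
SCHEMA_XML = """
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>
    <field name="title" type="text_general" indexed="true" stored="true" multiValued="true"/>
    <field name="desc" type="text_general" indexed="true" stored="true" multiValued="true"/>
  </fields>
</schema>
"""


def multivalued_fields(schema_xml):
    """Return the names of <field> elements that explicitly set multiValued="true"."""
    root = ET.fromstring(schema_xml)
    return [
        field.get("name")
        for field in root.iter("field")
        # Tolerate stray whitespace or different casing in the attribute value.
        if field.get("multiValued", "").strip().lower() == "true"
    ]


print(multivalued_fields(SCHEMA_XML))
```

Any field this reports will be returned as a JSON array. After changing such a field to multiValued="false", the existing documents have to be reindexed before the response changes.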


The fields you are worried about, title and desc, are configured as multiValued="true", as you can see in this excerpt from your schema:

<field name="title" type="text_general" indexed="true" stored="true" multiValued="true"/>
<field name="desc" type="text_general" indexed="true" stored="true" multiValued="true"/>

If you scroll up a little (to line 82) in your schema, you can read what this stands for:

multiValued: true if this field may contain multiple values per document

You can read what this is good for and what the consequences are in several sources

It looks like you have a problem related to nested JSON. You can use either of these:

(i) /solr/update/json?commit=true&split=/&f=txt:/**

(ii) Using Index handlers - https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers
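A query string takes a single "?" followed by parameters joined with "&", so it can be safer to build the URL in code than to type it by hand. This sketch only assembles the URL for option (i) and sends nothing. The host and port are assumptions, and the path and parameters are copied from the answer as-is, so check them against your Solr version.

```python
from urllib.parse import urlencode

# Assumed base URL for illustration only: host and port are hypothetical,
# and the path is the one given in option (i) above.
BASE_URL = "http://localhost:8983/solr/update/json"

# Parameters exactly as named in option (i).
params = {"commit": "true", "split": "/", "f": "txt:/**"}

# Keep "/", ":" and "*" unescaped so the result is readable; urlencode
# joins the parameters with "&", and the single "?" is added below.
query = urlencode(params, safe="/:*")
url = f"{BASE_URL}?{query}"

print(url)
```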
