How to access nested JSON in Elasticsearch?

I have the following JSON:

    metadata: {
        authors: [],
        links: [
            {
                href: "http://www.latimes.com/opinion/readersreact/la-le-1028-wednesday-meat-cancer-20151028-story.html#navtype=outfit",
                value: "Why hot dogs and bacon aren't as dangerous as cigarettes"
            },
            {
                href: "http://www.latimes.com/opinion/readersreact/la-le-1028-wednesday-porter-ranch-lausd-20151028-story.html#navtype=outfit",
                value: "LAUSD school in Porter Ranch shows the importance of parent involvement"
            },
            {
                href: "http://www.latimes.com/opinion/readersreact/la-le-1028-wednesday-billboards-20151028-story.html#navtype=outfit",
                value: "Maine and Vermont show L.A. what life is like without billboards"
            },
            {
                href: "http://www.latimes.com/opinion/readersreact/la-le-1028-wednesday-broad-beach-20151028-story.html#navtype=outfit",
                value: "Malibu beach-front homeowners, meet King Canute"
            }
        ]
    },

I would like to search only the metadata.links.value field in Elasticsearch:

requestBuilder.setQuery(QueryBuilders.matchQuery("metadata.links.value", "Malibu"));

Unfortunately, this doesn't work: I get 0 hits whatever value I enter.

What am I doing wrong?


Update:

Here is my complete code:

 public List<ArticleExtraction> search(String searchQuery, SearchProvider searchProvider) {
    TransportClient client = searchProvider.getClient();
    Map<String, String> query = new HashMap<>();
    ArrayList<String> singleQuery = new ArrayList<>();
    if (searchQuery.length() > 0 && searchQuery.contains(":")) {
        String[] queries = searchQuery.split(",");
        for (String q : queries) {
            String[] jsonQuery = q.split(":");
            query.put(jsonQuery[0], jsonQuery[1]);
        }
    } else {
        String[] queries = searchQuery.split(",");
        for (String q : queries) {
            singleQuery.add(q);
        }
    }
    SearchRequestBuilder requestBuilder = client.prepareSearch("crawlbot")
            .setTypes("item")
            .setSize(100);

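    // Collect every clause in one bool query: each call to setQuery() replaces
    // the previous query, so setting it inside the loops keeps only the last clause.
    // BoolQueryBuilder lives in org.elasticsearch.index.query.
    BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
    for (Map.Entry<String, String> e : query.entrySet()) {
        boolQuery.must(QueryBuilders.matchQuery(e.getKey(), e.getValue()));
    }
    for (String q : singleQuery) {
        boolQuery.must(QueryBuilders.queryStringQuery(q));
    }
    requestBuilder.setQuery(boolQuery);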
    SearchResponse response = requestBuilder.execute().actionGet();

    List<ArticleExtraction> articles = new ArrayList<>();

    SearchHit[] hits = response.getHits().getHits();
    for (SearchHit hit : hits) {
        String sourceAsString = hit.getSourceAsString();
        if (sourceAsString != null) {
            JsonObject json = new JsonParser().parse(sourceAsString).getAsJsonObject();
            if (json.has("article")) {
                Gson gson = new Gson();
                articles.add(gson.fromJson(json.get("article"), ArticleExtraction.class));
            }
        }
    }

    return articles;
 }

Explanation:

The input of the searchQuery could be something like:

metadata.links.value:malibu

Or, for a single free-text query: malibu

I wrote the code so that both forms are accepted.
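The two input forms can also be told apart term by term instead of for the whole string, which lets field-scoped and free-text terms be mixed in one input. The following is only a minimal sketch of that idea with no Elasticsearch dependency; the class name SearchInputParser and its fields are hypothetical, and it assumes a term counts as field-scoped only when text appears on both sides of its first colon.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: splits a comma-separated search string into
// field-scoped terms ("field:value") and free-text terms ("value").
public class SearchInputParser {

    // Terms of the form field:value, keyed by field name.
    public final Map<String, String> fieldTerms = new LinkedHashMap<>();
    // Terms that carry no field name.
    public final List<String> freeTerms = new ArrayList<>();

    public SearchInputParser(String searchQuery) {
        for (String raw : searchQuery.split(",")) {
            String term = raw.trim();
            if (term.isEmpty()) {
                continue; // skip empty pieces such as a trailing comma
            }
            // Split on the first colon only, so a value may itself contain colons.
            int colon = term.indexOf(':');
            if (colon > 0 && colon < term.length() - 1) {
                fieldTerms.put(term.substring(0, colon), term.substring(colon + 1));
            } else {
                freeTerms.add(term);
            }
        }
    }

    public static void main(String[] args) {
        SearchInputParser parsed =
                new SearchInputParser("metadata.links.value:malibu,billboards");
        System.out.println(parsed.fieldTerms); // {metadata.links.value=malibu}
        System.out.println(parsed.freeTerms);  // [billboards]
    }
}
```

Each entry of fieldTerms would then feed a match query and each entry of freeTerms a query-string query.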

Mappings (sorry if this gets long):

mappings: {
    item: {
        properties: {
            article: {
                properties: {
                    description: {
                        type: "string"
                    },
                    description_html: {
                        type: "string"
                    },
                    entities: {
                        properties: {
                            count: {
                                type: "long"
                            },
                            meta: {
                                type: "object"
                            },
                            name: {
                                type: "string"
                            },
                            type: {
                                type: "string"
                            }
                        }
                    },
                    favicon_url: {
                        type: "string"
                    },
                    images: {
                        properties: {
                            colors: {
                                properties: {
                                    color: {
                                        type: "long"
                                    }
                                }
                            },
                            entropy: {
                                type: "double"
                            },
                            height: {
                                type: "long"
                            },
                            url: {
                                type: "string"
                            },
                            width: {
                                type: "long"
                            }
                        }
                    },
                    keywords: {
                        properties: {
                            label: {
                                type: "string"
                            },
                            score: {
                                type: "double"
                            }
                        }
                    },
                    language: {
                        type: "string"
                    },
                    metadata: {
                        properties: {
                            authors: {
                                properties: {
                                    name: {
                                        type: "string"
                                    }
                                }
                            },
                            links: {
                                properties: {
                                    href: {
                                        type: "string"
                                    },
                                    value: {
                                        type: "string"
                                    }
                                }
                            },
                            twitter: {
                                type: "string"
                            }
                        }
                    },
                    provider_display: {
                        type: "string"
                    },
                    provider_name: {
                        type: "string"
                    },
                    provider_url: {
                        type: "string"
                    },
                    published: {
                        type: "string"
                    },
                    published_long: {
                        type: "long"
                    },
                    summary: {
                        type: "string"
                    },
                    title: {
                        type: "string"
                    },
                    url: {
                        type: "string"
                    }
                }
            },
            id: {
                properties: {
                    _inc: {
                        type: "long"
                    },
                    _machine: {
                        type: "long"
                    },
                    _new: {
                        type: "boolean"
                    },
                    _time: {
                        type: "long"
                    }
                }
            },
            job: {
                properties: {
                    api: {
                        type: "long"
                    },
                    crawl_depth: {
                        type: "long"
                    },
                    max_pages: {
                        type: "long"
                    },
                    name: {
                        type: "string"
                    },
                    status: {
                        type: "long"
                    },
                    url: {
                        type: "string"
                    },
                    userid: {
                        type: "long"
                    }
                }
            },
            query: {
                properties: {
                    match: {
                        properties: {
                            name: {
                                type: "string"
                            }
                        }
                    }
                }
            }
        }
    }
},

According to the mapping, metadata is not a top-level field: it is contained within the article root object, so the field path has to start with article.

Therefore your query should be constructed as:

QueryBuilders.matchQuery("article.metadata.links.value", "Malibu");
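To see why the shorter path finds nothing, it can help to list the dotted full paths that a document of this shape produces. The sketch below is a toy model in plain Java, not Elasticsearch code: the class FieldPathDemo and its methods are hypothetical, and the sample document is a cut-down stand-in for one indexed item, using a single link value from the question.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration: computes the dotted full path of every leaf field.
public class FieldPathDemo {

    // Descends into nested objects (Map) and arrays (List) and collects leaf paths.
    public static Set<String> leafPaths(Object node, String prefix) {
        Set<String> paths = new TreeSet<>();
        if (node instanceof Map) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                String path = prefix.isEmpty() ? e.getKey().toString()
                                               : prefix + "." + e.getKey();
                paths.addAll(leafPaths(e.getValue(), path));
            }
        } else if (node instanceof List) {
            for (Object item : (List<?>) node) {
                // Array elements do not add a segment to the path.
                paths.addAll(leafPaths(item, prefix));
            }
        } else {
            paths.add(prefix);
        }
        return paths;
    }

    // Cut-down sample shaped like the mapping: article -> metadata -> links -> value.
    public static Map<String, Object> sampleDocument() {
        Map<String, Object> link = new LinkedHashMap<>();
        link.put("value", "Malibu beach-front homeowners, meet King Canute");
        Map<String, Object> metadata = new LinkedHashMap<>();
        metadata.put("links", Collections.singletonList(link));
        Map<String, Object> article = new LinkedHashMap<>();
        article.put("metadata", metadata);
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("article", article);
        return doc;
    }

    public static void main(String[] args) {
        Set<String> paths = leafPaths(sampleDocument(), "");
        System.out.println(paths.contains("metadata.links.value"));         // false
        System.out.println(paths.contains("article.metadata.links.value")); // true
    }
}
```

Only the path that starts at the document root exists, which is why the query needs the article prefix.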
