简体   繁体   中英

Query JSONB column for any value where =?

I have a jsonb column which has the unfortunate case of being very unpredictable, in some cases its value may be an array with nested values:

["UserMailer", "applicant_setup_3", ["5cbffeb7-8d5e-4b52-a475-3cf320b2cee9"]]

Sometimes it will be something with key/values like this:

[{"reference_id": "5cbffeb7-8d5e-4b52-a475-3cf320b2cee9", "job_dictionary": ["StatusUpdater", "FollowTwitterUsersJob"]}]

Is there a way to write a query which just treats the whole column like text and does a like to see if I can find the uuid in the big text blob? I want to find all the records where a particular uuid string is present in the jsonb column.

The query doesn't need to be fast or efficient.

Postgres has search operator ? for jsonb, but that would require you to search the json content recursively.

A possible, although not very efficient method, would to stringify the object and use LIKE to search it:

myjsonb::text LIKE '%"5cbffeb7-8d5e-4b52-a475-3cf320b2cee9"%'
myjsonb::text LIKE '%"' || myuuid || '"%'

Demo on DB Fiddle :

You can split the array elements first by using jsonb_array_elements(json) , and then filter the casted string from those elements by like operator

select q.elm
  from
  (
    select jsonb_array_elements(js) as elm
      from tab
  ) q
 where elm::varchar like '%User%'

elm
----------------------------------------------------------------------------------------------------------------------
"UserMailer"
{"reference_id": "5cbffeb7-8d5e-4b52-a475-3cf320b2cee9", "job_dictionary": ["StatusUpdater", "FollowTwitterUsersJob"]}

Demo

The problem with the jsonb operator ? is that it only considers top-level keys (including array elements), not values , and no nested objects.

You seem to be looking for values and array elements (not keys) on any level. You can get that with a full text search on top of your json ( b ) column:

SELECT * FROM tbl
WHERE  to_tsvector('simple', jsonb_column)
    @@ tsquery '5cbffeb7-8d5e-4b52-a475-3cf320b2cee9';

db<>fiddle here

to_tsvector() extracts values and array elements on all levels - just what you need.

Requires Postgres 10 or later. json(b)_to_tsvector() in Postgres 11 offers more flexibility.

That's attractive for tables of non-trivial size as it can be supported with a full text index very efficiently:

CREATE INDEX tbl_jsonb_column_fts_gin_idx ON tbl USING GIN (to_tsvector('simple', jsonb_column));

I use the 'simple' text search configuration in the example. You might want a language-specific one, like 'english' . Doesn't matter much while you only look for UUID strings, but stemming for a particular language might make the index a bit smaller...

Related:

While you are only looking for UUIDs, you might optimize further with a custom ( IMMUTABLE ) function to extract UUIDs from the JSON document as array ( uuid[] ) and build a functional GIN index on top of it. (Considerably smaller index, yet.) Then:

SELECT * FROM tbl
WHERE  my_uuid_extractor(jsonb_column) @> '{5cbffeb7-8d5e-4b52-a475-3cf320b2cee9}';

Such a function can be expensive, but does not matter much with a functional index that stores and operates on pre-computed values.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM