
How to return specific dictionary keys from within a nested list from a jsonb column in SQLAlchemy

I am attempting to return some named columns from a jsonb data set that is stored in PostgreSQL.

I am able to run a raw query that meets my needs directly; however, I am trying to run the query using SQLAlchemy, to keep my code 'pythonic' and easy to read.

The raw query that returns the correct result (two columns) is:

SELECT  
    tmp.item->>'id',
    tmp.item->>'name'
FROM (SELECT jsonb_array_elements(t.data -> 'users') AS item FROM tpeople t) as tmp

Example JSON (each user has 20+ keys):

{ "results":247, "users": [
{"id":"202","regdate":"2015-12-01","name":"Bob Testing"},
{"id":"87","regdate":"2014-12-12","name":"Sally Testing"},
{"id":"811", etc etc}
...
]}

The table is simple enough, with a primary key, the datetime of the JSON extraction, and the jsonb column holding the extract:


CREATE TABLE tpeople
(
    record_id bigint NOT NULL DEFAULT nextval('"tpeople_record_id_seq"'::regclass),  -- sequence options: INCREMENT 1 START 1 MINVALUE 1 MAXVALUE 9223372036854775807 CACHE 1
    scrape_time timestamp without time zone NOT NULL,
    data jsonb NOT NULL,
    CONSTRAINT "tpeople_pkey" PRIMARY KEY (record_id)
);

Additionally, I have a people class that looks as follows:

class people(Base):
    __tablename__ = 'tpeople'

    record_id = Column(BigInteger, primary_key=True, server_default=text("nextval('\"tpeople_record_id_seq\"'::regclass)"))
    scrape_time = Column(DateTime, nullable=False)
    data = Column(JSONB(astext_type=Text()), nullable=False)

Presently my code to return the two columns looks like this:

from db.db_conn import get_session  # Generic connector for my db
from model.models import people
from sqlalchemy import func

sess = get_session()

sub = sess.query(func.jsonb_array_elements(people.data["users"]).label("item")).subquery()
test = sess.query(sub.c.item).select_entity_from(sub).all()

SQLAlchemy generates the following SQL:

SELECT anon_1.item AS anon_1_item 
FROM (SELECT jsonb_array_elements(tpeople.data -> %(data_1)s) AS item 
FROM tpeople) AS anon_1
{'data_1': 'users'}

But nothing I try lets me select only certain keys from within the item itself, the way the raw SQL does. Some of the approaches I have tried are below (they all raise errors):

test = sess.query("sub.item.id").select_entity_from(sub).all()

test = sess.query(sub.item.["id"]).select_entity_from(sub).all()

aas = func.jsonb_to_recordset(people.data["users"])
res = sess.query("id").select_from(aas).all()

sub = select(func.jsonb_array_elements(people.data["users"]).label("item"))

Presently I can extract the keys I need in a simple for loop, but this seems like a hacky way to do it, and I'm sure there is something dead obvious I'm missing.

for row in test:
    print(row.item['id'])
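
To make the target result concrete, here is a minimal standalone sketch of what that client-side loop yields for the example JSON above. It uses only the two complete user entries from the example; the names doc and pairs are illustrative and do not come from the original code.

```python
import json

# Sample document copied from the example JSON above, trimmed to the
# two complete user entries (the elided third entry is left out).
doc = json.loads("""
{ "results": 247, "users": [
  {"id": "202", "regdate": "2015-12-01", "name": "Bob Testing"},
  {"id": "87", "regdate": "2014-12-12", "name": "Sally Testing"}
]}
""")

# Client-side equivalent of tmp.item->>'id', tmp.item->>'name':
# one (id, name) pair per element of the "users" array.
pairs = [(user["id"], user["name"]) for user in doc["users"]]

for user_id, name in pairs:
    print(user_id, name)
```

This is the shape the SQLAlchemy query should return straight from the database, without fetching the other 20+ keys for every user.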

After searching for a few hours, I eventually found someone who had done this by accident while trying to get a different result. The trick is to apply the ->> operator to the subquery column via .op('->>'):

sub = sess.query(func.jsonb_array_elements(people.data["users"]).label("item")).subquery()
tmp = sub.c.item.op('->>')('id')
tmp2 = sub.c.item.op('->>')('name')
test = sess.query(tmp, tmp2).all()
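
A possible variation on the above, offered as a sketch rather than a definitive approach: if the function result is explicitly typed as JSONB, the subquery column should accept SQLAlchemy's JSON indexing, so item["id"].astext renders the same ->> operator without a string-based .op() call. The example assumes SQLAlchemy 1.4 or later, re-declares a trimmed copy of the model (named People here) so that it is self-contained, and uses the alias "tmp" and the labels "id" and "name" purely for illustration. It only compiles the statement against the PostgreSQL dialect and does not connect to a database.

```python
from sqlalchemy import BigInteger, Column, DateTime, func, select
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class People(Base):
    """Trimmed stand-in for the question's model (assumption: same table and columns)."""

    __tablename__ = "tpeople"

    record_id = Column(BigInteger, primary_key=True)
    scrape_time = Column(DateTime, nullable=False)
    data = Column(JSONB, nullable=False)


# Passing type_=JSONB tells SQLAlchemy that each unnested element is jsonb,
# so the resulting subquery column supports ["key"] and .astext.
items = select(
    func.jsonb_array_elements(People.data["users"], type_=JSONB).label("item")
).subquery("tmp")

# ["id"] alone renders ->, adding .astext renders ->> (text result).
stmt = select(
    items.c.item["id"].astext.label("id"),
    items.c.item["name"].astext.label("name"),
)

# Compile for PostgreSQL to inspect the generated SQL without a connection.
compiled_sql = str(stmt.compile(dialect=postgresql.dialect()))
print(compiled_sql)
```

With a live session, sess.execute(stmt).all() should then return one (id, name) row per user, in the same way as the .op('->>') version.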
