
Merge multiple one to many relationships into json - oracle - python

(I know this can be done using SQLAlchemy, but this is also a learning exercise, so I want to understand how it is done without tools like SQLAlchemy.) My company is using an Oracle 10.1 DB.

DB structure:

Primary table (primary key)
Second table (foreign key referencing the primary key)
Third table (foreign key referencing the primary key)

The second table can have 0 to 4 linked entries. The third table can have 0 to unlimited entries (though it is not likely to have more than 20).

My SQL query runs fine and returns the data I expect. If there are multiple entries in the other tables, it just returns multiple rows with the same primary key. That seems standard (I am new to SQL, so I assume this is how it should be).

The problem comes when I try to process the data into a single JSON entry by looping over everything. Here is my current script:

    cursor = conn.cursor()

    sql = """
        SELECT *
        FROM RC_CAL 
               join CALENDAR_EVENT 
                 ON RC_CAL.rc_id = CALENDAR_EVENT.rc_id
               left outer join CALENDAR_EVENT CALENDAR_EVENT_1 
                            ON RC_CAL.rc_id = CALENDAR_EVENT_1.rc_id 
               left outer join RC_HISTORY RC_HISTORY_1 
                            ON RC_CAL.rc_id = RC_HISTORY_1.rc_id
    """

    has_where = False
    params = {}
    query_conditions = []
    if is_active:
        has_where = True
        query_conditions.append('WHERE CALENDAR_EVENT.ACTIVE_YN = :ACTIVE')
        params['ACTIVE'] = is_active
    if env:
        env_string = 'AND CALENDAR_EVENT.env = :ENV' if has_where else 'WHERE CALENDAR_EVENT.env = :ENV'
        has_where = True
        query_conditions.append(env_string)
        params['ENV'] = env
    if from_date:
        from_string = 'AND CALENDAR_EVENT.end_dt >= :FROM_DATE' if has_where else 'WHERE CALENDAR_EVENT.end_dt >= :FROM_DATE'
        has_where = True
        query_conditions.append(from_string)
        params['FROM_DATE'] = from_date
    if to_date:
        to_string = 'AND CALENDAR_EVENT.end_dt <= :TO_DATE' if has_where else 'WHERE CALENDAR_EVENT.end_dt <= :TO_DATE'
        query_conditions.append(to_string)
        params['TO_DATE'] = to_date

    if len(query_conditions) > 0:
        sql += " ".join(query_conditions)

    cursor.execute(sql, **params)

    rows = cursor.fetchall()

    events = []
    for rcId, data in groupby(rows, key=lambda t: t[0]):
        rc_item = None
        generic_list = []
        history_list = []
        for item in data:
            if not rc_item:
                rc_item = EventModalV2(*item[0:11])
            generic_id = item[11]
            history_id = item[17]
            generic_data = item[11:17]
            history_data = item[17:]
            if not any(generic_id in generic_item for generic_item in generic_list) and generic_id:
                rc_item.update_with_generic(generic_data)
                generic_list.append(generic_data)
            if not any(history_id in history_item for history_item in history_list) and history_id:
                rc_item.update_with_history(history_data)
                history_list.append(history_data)
                
        events.append(rc_item)

    cursor.close()
    conn.close()

    return events

Here is an example dataset with multiple table 3 entries, though there could be more rows if table 2 returns more entries (where it says PROD, that is part of table 2, which is where we store dates depending on the environment):

(170079, 3, 1, None, None, None, 'Test Demo Event - DS', 'N', 'N', None, datetime.datetime(2020, 11, 5, 19, 16, 22), 237247, datetime.datetime(2020, 11, 1, 4, 0), 'PROD', datetime.datetime(2020, 11, 1, 4, 0), 91, 'Y', 5051, <cx_Oracle.LOB object at 0x00000237AE8954B8>, 'rc_create', datetime.datetime(2020, 11, 5, 19, 16, 22))

(170079, 3, 1, None, None, None, 'Test Demo Event - DS', 'N', 'N', None, datetime.datetime(2020, 11, 5, 19, 16, 22), 237247, datetime.datetime(2020, 11, 1, 4, 0), 'PROD', datetime.datetime(2020, 11, 1, 4, 0), 91, 'Y', 5052, <cx_Oracle.LOB object at 0x00000237AE895580>, 'rc_patch', datetime.datetime(2020, 11, 5, 19, 17, 7))

(170079, 3, 1, None, None, None, 'Test Demo Event - DS', 'N', 'N', None, datetime.datetime(2020, 11, 5, 19, 16, 22), 237247, datetime.datetime(2020, 11, 1, 4, 0), 'PROD', datetime.datetime(2020, 11, 1, 4, 0), 91, 'Y', 5053, <cx_Oracle.LOB object at 0x00000237AE895648>, 'rc_patch', datetime.datetime(2020, 11, 5, 19, 17, 59))

(170079, 3, 1, None, None, None, 'Test Demo Event - DS', 'N', 'N', None, datetime.datetime(2020, 11, 5, 19, 16, 22), 237247, datetime.datetime(2020, 11, 1, 4, 0), 'PROD', datetime.datetime(2020, 11, 1, 4, 0), 91, 'Y', 5054, <cx_Oracle.LOB object at 0x00000237AE895710>, 'rc_patch', datetime.datetime(2020, 11, 5, 19, 19, 9))

This runs extremely slowly on larger queries. I am not sure how I can make it any better.

First, as Chris suggested, find out where the slowdown is occurring. Depending on what you find, I have a couple of suggestions:

(1) Avoid fetching LOB values, as these require an additional round trip to fetch the data, which slows down the process. You can see an example here.

(2) If the previous suggestion is insufficient, you can also consider simply running the three (four?) queries independently and putting the results together in Python (by using dicts). This avoids fetching the top-level rows multiple times and then removing the duplicates.
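
For suggestion (1), one way to avoid LOB locators is an output type handler that asks the driver to return CLOB and BLOB columns as plain strings and bytes. The sketch below is only an illustration: it assumes a cx_Oracle version that exposes the DB_TYPE_* constants (8.0 or later, as far as I know), and the factory name make_lob_as_value_handler is made up for this example. The driver module is passed in as an argument so the function itself has no import dependency.

```python
def make_lob_as_value_handler(db):
    """Build an output type handler for a driver module such as cx_Oracle.

    `db` is assumed to expose DB_TYPE_CLOB, DB_TYPE_BLOB, DB_TYPE_LONG and
    DB_TYPE_LONG_RAW (true for recent cx_Oracle releases).
    """
    def handler(cursor, name, default_type, size, precision, scale):
        # Fetch CLOB columns directly as str instead of LOB locators.
        if default_type == db.DB_TYPE_CLOB:
            return cursor.var(db.DB_TYPE_LONG, arraysize=cursor.arraysize)
        # Fetch BLOB columns directly as bytes.
        if default_type == db.DB_TYPE_BLOB:
            return cursor.var(db.DB_TYPE_LONG_RAW, arraysize=cursor.arraysize)
        # Returning None keeps the driver's default for every other column.
        return None
    return handler

# Usage, assuming `conn` is an open cx_Oracle connection:
#   import cx_Oracle
#   conn.outputtypehandler = make_lob_as_value_handler(cx_Oracle)
```

With the handler installed before the query runs, the LOB column in each row should arrive as an ordinary value, so no extra round trip per row is needed to read it. Whether this is the main cost in your case is something only the timing check will tell you.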

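For suggestion (2), the merge step could look roughly like the following. This is a sketch under assumptions, not a drop-in replacement: it supposes you have already run one query per table (RC_CAL, CALENDAR_EVENT, RC_HISTORY) and fetched the rows, that rc_id is the first column of the RC_CAL rows, and that the position of rc_id in the child rows is given by the hypothetical event_fk and history_fk parameters. The function names and the output dict keys are also invented for illustration.

```python
from collections import defaultdict
import json


def group_by_key(rows, key_index):
    """Index rows by the value found at key_index (one pass, no duplicate scans)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key_index]].append(row)
    return grouped


def merge_events(primary_rows, event_rows, history_rows, event_fk=0, history_fk=0):
    """Attach child rows to each primary row using dict lookups.

    event_fk / history_fk are the (assumed) column positions of rc_id
    in the CALENDAR_EVENT and RC_HISTORY result rows.
    """
    events_by_rc = group_by_key(event_rows, event_fk)
    history_by_rc = group_by_key(history_rows, history_fk)

    merged = []
    for row in primary_rows:
        rc_id = row[0]  # assumption: rc_id is the first selected column
        merged.append({
            "rc_id": rc_id,
            "fields": list(row[1:]),
            "calendar_events": [list(r) for r in events_by_rc.get(rc_id, [])],
            "history": [list(r) for r in history_by_rc.get(rc_id, [])],
        })
    return merged


def to_json(merged):
    # default=str is a simple way to serialize datetime values.
    return json.dumps(merged, default=str)
```

Each primary row is fetched once, and each child row is touched once while building the dicts, so there is nothing to de-duplicate afterwards. The same WHERE conditions would need to be applied to each of the separate queries so that the child rows match the primary rows you selected.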