

Why does FastAPI take upwards of 10 minutes to insert 100,000 rows into a SQL database

I've tried using SQLAlchemy, as well as raw mysql.connector here, but committing an insert into a SQL database from FastAPI takes forever.

I wanted to make sure it wasn't just my DB, so I tried it in a local script and it ran in a couple of seconds.

How can I work with FastAPI to make this query possible?

Thanks!

'''
# typing / FastAPI imports needed by this endpoint; router, pydanticModels,
# models and get_raw_db are assumed to be defined elsewhere in the project.
from typing import List
from fastapi import Depends

@router.post('/')
def postStockData(data: List[pydanticModels.StockPrices], raw_db=Depends(get_raw_db)):
    cursor = raw_db[0]
    cnxn = raw_db[1]

    # First attempt: SQLAlchemy ORM, adding one object at a time (left commented out)
    # i = 0
    # for row in data:
    #     if i % 10 == 0:
    #         print(i)
    #         db.flush()
    #     i += 1
    #     db_pricing = models.StockPricing(**row.dict())
    #     db.add(db_pricing)
    # db.commit()

    # Second attempt: raw mysql.connector with a single executemany()
    SQL = "INSERT INTO " + models.StockPricing.__tablename__ + " VALUES (%s, %s, %s)"
    print(SQL)

    valsToInsert = []
    for row in data:
        rowD = row.dict()
        valsToInsert.append((rowD['date'], rowD['symbol'], rowD['value']))

    cursor.executemany(SQL, valsToInsert)
    cnxn.commit()

    return {'message': 'Pricing Updated'}
'''

You are killing performance because you are using an "RBAR" (row by agonizing row) approach, which is not suitable for an RDBMS... You use a loop and execute an SQL INSERT of only one row at a time. When the RDBMS receives a query, the sequence of execution is the following:

  • authenticating the user that issued the query
  • parsing the string to verify the syntax
  • looking up metadata (tables, columns, datatypes...)
  • checking which operations on these tables and columns the user is granted
  • creating an execution plan to sequence all the operations needed for the query
  • setting up locks for concurrency
  • executing the query (inserting only 1 row)
  • returning an error or an OK message

Every step consumes time... and you pay for all these steps 100,000 times because of your loop.
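
To make the contrast concrete, here is a minimal sketch (not the original poster's code: mysql-connector-python with placeholder credentials and an assumed StockPricing(date, symbol, value) table) of paying that cost once per row versus once per batch:

'''
# Minimal sketch with placeholder data and connection details.
import mysql.connector

cnxn = mysql.connector.connect(user="user", password="pw", database="stocks")
cursor = cnxn.cursor()

rows = [("2021-01-04", "AAPL", 129.41)] * 100_000   # placeholder data

# Row-by-agonizing-row: every iteration pays authentication, parsing,
# planning and locking again, 100,000 times.
# for r in rows:
#     cursor.execute("INSERT INTO StockPricing VALUES (%s, %s, %s)", r)
#     cnxn.commit()

# Batched: Connector/Python can rewrite a plain INSERT ... VALUES executemany()
# into one multi-row INSERT per call, so the per-statement overhead is paid
# once per batch instead of once per row, with a single commit at the end.
BATCH = 10_000
for i in range(0, len(rows), BATCH):
    cursor.executemany("INSERT INTO StockPricing VALUES (%s, %s, %s)",
                       rows[i:i + BATCH])
cnxn.commit()
'''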

Usually, when inserting many rows into a table, there is just one query to run, even if the INSERT concerns 10,000,000,000 rows coming from a file!
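
For example, a minimal sketch assuming the rows already sit in a CSV file at /tmp/prices.csv, that the target table is StockPricing(date, symbol, value), and that the MySQL server allows LOCAL INFILE (local_infile=1):

'''
# Minimal sketch: one statement to bulk-load a whole file.
import mysql.connector

cnxn = mysql.connector.connect(user="user", password="pw", database="stocks",
                               allow_local_infile=True)   # client-side opt-in
cursor = cnxn.cursor()
cursor.execute(
    "LOAD DATA LOCAL INFILE '/tmp/prices.csv' "
    "INTO TABLE StockPricing "
    "FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\n' "
    "(date, symbol, value)"
)
cnxn.commit()
'''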
