How to update and/or insert into a MySQL DB using SQLAlchemy from a Python list
I am building a web scraper that will run over and over and will insert new data or update existing data based on ID (something like if 'id' == 'id':). My goal is to avoid duplicates. The MySQL table is ready and built. What is the best Pythonic way to check your Python list before inserting/updating it in a MySQL DB using SQLAlchemy?
Below are my dependencies:
from sqlalchemy import create_engine  # needed for create_engine() below
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session
import requests
from bs4 import BeautifulSoup
from time import sleep
from datetime import datetime
import time

# Connection URL as given in the question
engine = create_engine("mysql+pymysql:///blah")
I use a function to assign each <td> from the scraped data:
def functionscrape(**kwargs):
    # Start from blank defaults, then overwrite them with the scraped values
    scrape = {
        'id': '',
        'owner': '',
        'street': '',
        'city': '',
        'state': '',
    }
    scrape.update(kwargs)
    return scrape
The list below is an example, but it will change constantly with each scrape.
myList = [
    {
        'id': '111',
        'owner': 'Bob',
        'street': '1212 North',
        'city': 'Anywhere',
        'state': 'TX',
    },
    {
        'id': '222',
        'owner': 'Mary',
        'street': '333 South',
        'city': 'Overthere',
        'state': 'AZ',
    },
]
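One way to check the list before it reaches the database is to collapse it to a single dict per id in plain Python. The sketch below is only an illustration: the helper name dedupe_by_id and the sample rows are hypothetical, and it assumes that the last scraped row for an id should win and that rows with an empty id should be skipped.

```python
def dedupe_by_id(rows):
    """Return one dict per 'id', keeping the last occurrence of each id.

    Assumption: rows with a missing or empty 'id' are not worth storing.
    """
    latest = {}
    for row in rows:
        row_id = row.get('id')
        if not row_id:
            continue  # skip rows that cannot be matched to a record
        latest[row_id] = row  # a later row replaces an earlier one with the same id
    return list(latest.values())


# Hypothetical scrape result containing a repeated id and a blank id
scraped = [
    {'id': '111', 'owner': 'Bob', 'street': '1212 North', 'city': 'Anywhere', 'state': 'TX'},
    {'id': '222', 'owner': 'Mary', 'street': '333 South', 'city': 'Overthere', 'state': 'AZ'},
    {'id': '111', 'owner': 'Bob', 'street': '99 West', 'city': 'Anywhere', 'state': 'TX'},
    {'id': '', 'owner': 'Nobody', 'street': '', 'city': '', 'state': ''},
]

unique_rows = dedupe_by_id(scraped)
```

This only removes duplicates within a single scrape. Avoiding duplicates across runs still depends on the database, for example a primary key or unique index on id.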
I am using a helper function to create the dynamic SQL update queries:
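def construct_update(table_name, where_vals, update_vals):
    # table_name is expected to be a SQLAlchemy Table object, not a string
    query = table_name.update()
    # Each key/value pair in where_vals adds one more condition to the WHERE clause
    for k, v in where_vals.items():
        query = query.where(getattr(table_name.c, k) == v)
    return query.values(**update_vals)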
Basically, you pass the function the table and two dictionaries. The first would just be {'id': id} in your case, and the second holds all the values you want to update, like
{
    'owner': 'Bob',
    'street': '1212 North',
    'city': 'Anywhere',
    etc...
}
The helper function then returns the query, which can be executed with
my_session = Session(engine)
my_session.execute(query)
my_session.commit()  # without a commit the update is not persisted
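To show how the helper might fit into an insert-or-update loop, here is a minimal sketch. It makes several assumptions that are not in the original post: SQLAlchemy 1.4 or later, an in-memory SQLite engine standing in for MySQL so the example runs without a server, a hypothetical table named properties, and a hypothetical function upsert_rows that checks for an existing id before deciding between UPDATE and INSERT.

```python
from sqlalchemy import Column, MetaData, String, Table, create_engine, select

# Assumption: in-memory SQLite stands in for the MySQL database in this sketch.
engine = create_engine("sqlite://")
metadata = MetaData()

# Hypothetical table mirroring the scraped fields; 'id' is the primary key.
properties = Table(
    "properties", metadata,
    Column("id", String(20), primary_key=True),
    Column("owner", String(100)),
    Column("street", String(100)),
    Column("city", String(100)),
    Column("state", String(2)),
)
metadata.create_all(engine)


def construct_update(table, where_vals, update_vals):
    """Build an UPDATE with one WHERE condition per key in where_vals."""
    query = table.update()
    for k, v in where_vals.items():
        query = query.where(getattr(table.c, k) == v)
    return query.values(**update_vals)


def upsert_rows(engine, table, rows):
    """Update rows whose id already exists, insert the rest (one statement per row)."""
    with engine.begin() as conn:  # commits on success, rolls back on error
        for row in rows:
            exists = conn.execute(
                select(table.c.id).where(table.c.id == row["id"])
            ).first() is not None
            if exists:
                update_vals = {k: v for k, v in row.items() if k != "id"}
                conn.execute(construct_update(table, {"id": row["id"]}, update_vals))
            else:
                conn.execute(table.insert().values(**row))


first_scrape = [
    {"id": "111", "owner": "Bob", "street": "1212 North", "city": "Anywhere", "state": "TX"},
]
second_scrape = [
    {"id": "111", "owner": "Robert", "street": "1212 North", "city": "Anywhere", "state": "TX"},
    {"id": "222", "owner": "Mary", "street": "333 South", "city": "Overthere", "state": "AZ"},
]
upsert_rows(engine, properties, first_scrape)
upsert_rows(engine, properties, second_scrape)  # updates 111, inserts 222
```

Each row costs one SELECT plus one write, which is usually acceptable for small scrapes but becomes slow as the list grows.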
Unfortunately, with this method you have to update every row individually (no bulk update), but if you can live with that, it works fine.
Otherwise, here's a similar post about bulk updates: Bulk update in SQLAlchemy Core using WHERE
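As a different technique from the helper above, SQLAlchemy's MySQL dialect also offers an insert construct with on_duplicate_key_update(), which lets MySQL decide per row whether to insert or update. The sketch below only builds and compiles the statement, so it needs no database. The table definition and the function name build_upsert are hypothetical, and the approach assumes that id is a primary key or has a unique index in the real table.

```python
from sqlalchemy import Column, MetaData, String, Table
from sqlalchemy.dialects import mysql
from sqlalchemy.dialects.mysql import insert

metadata = MetaData()

# Hypothetical table definition; in practice this could come from reflection or automap.
properties = Table(
    "properties", metadata,
    Column("id", String(20), primary_key=True),
    Column("owner", String(100)),
    Column("street", String(100)),
    Column("city", String(100)),
    Column("state", String(2)),
)


def build_upsert(table, rows):
    """Build one multi-row INSERT ... ON DUPLICATE KEY UPDATE statement."""
    stmt = insert(table).values(rows)
    # On a key collision, overwrite every non-key column with the incoming value.
    update_cols = {
        col.name: stmt.inserted[col.name]
        for col in table.columns
        if not col.primary_key
    }
    return stmt.on_duplicate_key_update(**update_cols)


myList = [
    {"id": "111", "owner": "Bob", "street": "1212 North", "city": "Anywhere", "state": "TX"},
    {"id": "222", "owner": "Mary", "street": "333 South", "city": "Overthere", "state": "AZ"},
]

stmt = build_upsert(properties, myList)
# Compile against the MySQL dialect to inspect the SQL without connecting.
sql = str(stmt.compile(dialect=mysql.dialect()))
```

Against a real engine the statement would be run inside a transaction, for example with engine.begin() and conn.execute(stmt). The whole list then goes to the server in a single statement instead of one query per row.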
You can try using the https://marshmallow.readthedocs.io/en/stable/ library for validation. Build a Schema and define fields with the types you need. You can also use the @pre_load and @post_load decorators to manipulate your data.