Best Update Method for MySQL DB

I have read through the solutions to similar problems, but they all seem to involve scripts and extra tools. I'm hoping my problem is simple enough to avoid that.

So the user uploads a CSV of next week's data. It gets inserted into the DB, no problem.

BUT

an hour later he gets feedback from everyone, and must make updates accordingly. He updates the CSV and goes to upload it to the DB.

Right now, the system I'm using checks to see if the data for that week is already there. If it is, it pulls all of that data from the DB, a script finds the differences and sends them out, and after all of this, the old data is deleted and replaced with the new data.

Obviously, it is a lot easier to just wipe it clean and reenter the data, but that is not the best method, especially if there are lots of changes or tons of data. I also have to know WHAT changes have been made in order to send out alerts. But I don't want a transaction log, as the alerts only need to be sent out the one time, and after that the old data is useless.

So!

Is there a smart way to compare the new data to the already existing data, get only the rows that are changed/deleted/added, and make those changes? Right now it seems like I could do an update, but then I won't get any response on what has changed...

Thanks!

Quick Edit:

No foreign keys are currently in use. This will soon change, but it shouldn't make a difference, because the foreign keys will only point to who the data affects and thus won't need to be changed. As far as primary keys go, that does present a bit of a dilemma:

The data in question is everyone's work schedule, so it would be nice for each shift to have a key (for specific applications of this schedule beyond simple output). But here is the problem: say user1 was late on Monday. The tardiness is recorded in a separate table and is tied to the shift by the shift key. If on Tuesday someone needs to change the week already in progress, my fear is that it will become too difficult to ensure that entries for shifts that have already happened (and thus may have associations that shouldn't be broken) do not get re-keyed in the process. Unfortunately, it is not as simple as updating only the events that occur AFTER the current time, as this would add work for the people who do the uploading (and thus make the product less marketable). Basically, they make the schedule in one program, export it to a CSV, and then upload it on a web page for all of the webapps that need that data. So it is simply much easier for them (and less stressful for everyone involved) to follow the same routine every time: export the entire week and upload it.

So my biggest concern is to make the upload script as smart as possible on both ends. It shouldn't get bloated trying to find the changes, it should find the changes no matter the input, AND none of the unchanged data should risk getting re-keyed.

Here's a related question:

Suppose Joe User was scheduled to wash dishes from 7:00 PM to 8:00 PM, but the new
data has him working 6:45 PM to 8:30 PM.  Has the shift been changed? Or has the old
one been deleted and a new one added?

And another one:

Say Jane was scheduled to work 1:00 PM to 3:00 PM, but now everyone has a mandatory
staff meeting from 2:00 PM to 3:00 PM. Has she lost one shift and gained two? Or has
one shift changed and she gained one?

I'm really interested in knowing how this kind of data is typically handled/approached, more than specific answers to the above.

Again, thank you.
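One possible way to frame the Joe and Jane questions is as a matching problem: pair each old shift with the new shift it overlaps most, call the pair "changed" if the times differ, and treat whatever is left over as deleted or added. The sketch below is only a heuristic under stated assumptions. The function names are hypothetical, shifts are plain `(start, end)` tuples for one employee on one day, and Jane's new data is assumed to be a 1:00-2:00 shift plus the 2:00-3:00 meeting.

```python
from datetime import time

def overlap_minutes(a, b):
    """Minutes shared by two (start, end) shifts given as datetime.time pairs."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    if end <= start:
        return 0
    return (end.hour * 60 + end.minute) - (start.hour * 60 + start.minute)

def match_shifts(old_shifts, new_shifts):
    """Greedy pairing: each old shift claims the unclaimed new shift it overlaps most.

    Returns (unchanged, changed, deleted, added). Ties go to the earlier new shift.
    """
    unclaimed = list(new_shifts)
    unchanged, changed, deleted = [], [], []
    for old in old_shifts:
        best = max(unclaimed, key=lambda n: overlap_minutes(old, n), default=None)
        if best is None or overlap_minutes(old, best) == 0:
            deleted.append(old)  # nothing in the new data overlaps this shift
            continue
        unclaimed.remove(best)
        (unchanged if best == old else changed).append((old, best))
    return unchanged, changed, deleted, unclaimed  # leftovers are additions

# Joe: 7:00-8:00 PM becomes 6:45-8:30 PM (from the question).
joe = match_shifts([(time(19, 0), time(20, 0))],
                   [(time(18, 45), time(20, 30))])

# Jane: 1:00-3:00 PM becomes 1:00-2:00 PM plus a 2:00-3:00 PM meeting (assumed split).
jane = match_shifts([(time(13, 0), time(15, 0))],
                    [(time(13, 0), time(14, 0)), (time(14, 0), time(15, 0))])
```

Under this rule Joe's shift counts as one change, and Jane has one changed shift plus one added shift. A shift that is matched keeps its existing key, so a tardiness record tied to it would stay attached. A different matching rule would give different answers.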

If you have a unique key on one of the fields, you could use:

LOAD DATA LOCAL INFILE '/path/to/data.csv' REPLACE INTO TABLE table_name FIELDS TERMINATED BY ',';

Right now, the system I'm using checks to see if the data for that week is already there, and if it is, pulls all of that data from the DB, a script finds the differences and sends them out, and after all of this, the old data is deleted and replaced with the new data.

So your script knows the differences, right? And you don't want to use any extra tools apart from your script and MySQL, right?

I'm quite convinced that MySQL doesn't offer any 'diff' tool by itself, so the best you can achieve is making a new CSV file for updates only. I mean, it should contain only the changed rows. Updating would be quicker, and all changed data would be easily available.
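As a rough illustration of the "changed rows only" idea, here is a minimal Python sketch that diffs two sets of rows by a natural key. Everything in it is an assumption for illustration: the column names (`employee`, `day`, `start`, `end`), the helper names, and the choice of `(employee, day)` as the key, which only works if each person has at most one shift per day. In practice the old rows would come from a SELECT on the existing table rather than from a second CSV.

```python
import csv
import io

def load_rows(csv_text, key_fields):
    """Parse CSV text into a dict keyed by the tuple of key_fields values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {tuple(row[f] for f in key_fields): row for row in reader}

def diff_rows(old, new):
    """Return (added, removed, changed) between two keyed row dicts."""
    added = [new[k] for k in new if k not in old]
    removed = [old[k] for k in old if k not in new]
    # A row is "changed" when the key exists on both sides but a value differs.
    changed = [(old[k], new[k]) for k in new if k in old and old[k] != new[k]]
    return added, removed, changed

old_csv = """employee,day,start,end
joe,Mon,19:00,20:00
jane,Mon,13:00,15:00
sam,Tue,09:00,17:00
"""
new_csv = """employee,day,start,end
joe,Mon,18:45,20:30
jane,Mon,13:00,15:00
ann,Wed,08:00,12:00
"""

key = ("employee", "day")  # hypothetical natural key
added, removed, changed = diff_rows(load_rows(old_csv, key), load_rows(new_csv, key))
```

The three lists can drive both the alerts and the database writes (INSERT for `added`, DELETE for `removed`, UPDATE for `changed`). Rows that did not change, such as Jane's here, are never touched and so keep their existing keys.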
