
Multi-row, instead of single-row, data transformation with a trigger in MySQL

I have this trigger:

CREATE TRIGGER move_form_data
AFTER INSERT ON schema.original_table
FOR EACH ROW
INSERT INTO schema.new_table (name, street_address, 
            street_address_line_2, city, state, zip, country, dob)
SELECT name, street_address, street_address_line_2, city, state, zip, country, dob 
from view_data_submits

which calls this view:

CREATE VIEW view_data_submits AS 

SELECT  
        MAX(CASE WHEN element_label = 0 THEN element_value end) AS name,
        MAX(CASE WHEN element_label = 1 THEN element_value end) AS street_address,
        MAX(CASE WHEN element_label = 2 THEN element_value end) AS street_address_line_2,
        MAX(CASE WHEN element_label = 3 THEN element_value end) AS city,
        MAX(CASE WHEN element_label = 4 THEN element_value end) AS state,
        MAX(CASE WHEN element_label = 5 THEN element_value end) AS zip,
        MAX(CASE WHEN element_label = 6 THEN element_value end) AS country,
        MAX(CASE WHEN element_label = 7 THEN element_value end) AS dob
FROM schema.original_table
WHERE group_id = (select MAX(group_id) from schema.original_table)
group by group_id

I want one row back. Without the trigger wrapper, this statement on its own works as intended:

INSERT INTO schema.new_table (name, street_address, 
                street_address_line_2, city, state, zip, country, dob)
    SELECT name, street_address, street_address_line_2, city, state, zip, country, dob 
    from view_data_submits

Currently, the trigger does insert the row when the user submits a form, but it transforms the data from the original table into the new table like this:

# id, name, street_address, street_address_line_2, city, state, zip, country, dob
2, fsa asdadFQ, , , , , , , 
3, fsa asdadFQ, BOOGYBOOGYBOOGY, , , , , , 
4, fsa asdadFQ, BOOGYBOOGYBOOGY, YOUdooWORK, , , , , 
5, fsa asdadFQ, BOOGYBOOGYBOOGY, YOUdooWORK, A, , , , 
6, fsa asdadFQ, BOOGYBOOGYBOOGY, YOUdooWORK, A, DD, , , 
7, fsa asdadFQ, BOOGYBOOGYBOOGY, YOUdooWORK, A, DD, 09876, , 
8, fsa asdadFQ, BOOGYBOOGYBOOGY, YOUdooWORK, A, DD, 09876, Belize, 
9, fsa asdadFQ, BOOGYBOOGYBOOGY, YOUdooWORK, A, DD, 09876, Belize, 2014-02-05  <--only row that I want (=the total form submission)

instead of just:

# id, name, street_address, street_address_line_2, city, state, zip, country, dob

9, fsa asdadFQ, BOOGYBOOGYBOOGY, YOUdooWORK, A, DD, 09876, Belize, 2014-02-05

I have a feeling it has to do either with the FOR EACH ROW syntax, or with the application somehow saving in a compounding fashion. I am leaning towards the first.

Does anyone have suggestions for a remedy? I almost feel as though it's some noob mistake that I just forgot about... haha.

EDIT per request:

Here is the SELECT * from the original table, from which the max group_id is being pulled:

# id, form_id, element_label, element_value, group_id
----+--------+--------------+--------------+---------
 207,       2,             0,          name,       25
 208,       2,             1,     address 1,       25
 209,       2,             2,     address 2,       25
 210,       2,             3,          city,       25
 211,       2,             4,         state,       25
 212,       2,             5,           zip,       25
 213,       2,             6,       country,       25
 214,       2,             7,           dob,       25

Since the values are stored as BLOBs, I replaced them with what they represent. I just pulled the most recently inserted data.

I have narrowed this down to the application inserting each field separately, so the trigger, with its FOR EACH ROW clause, fires once per inserted row. This clause is required in MySQL, which only allows row-level triggers and not statement-level ("query"-based) triggers as in Oracle and some other databases.

I have asked a separate question about a workaround for this here: Workaround for FOR EACH ROW in MySQL

This looks like an EAV schema (oh! the joys!).

It looks like the root problem is that the application isn't inserting a "row" the way you want to see it; it's inserting multiple rows into the same table, with each row representing a single attribute value.

The application is using an Entity-Attribute-Value (EAV) model, and what you want is a row that looks like a traditional relational model.

What that rather ugly "MAX(),MAX(),MAX() ... GROUP BY" query is doing is converting all those EAV rows into columns of a single row.


It looks like you want to do that conversion "on the fly" and maintain the contents of target_table whenever rows are inserted into original_table.

If I were solving that problem, I would include the group_id in my target_table, since that's the value that relates all the individual EAV rows to each other (as demonstrated in your view query).

And I definitely would NOT use a SELECT MAX(group_id) query to reference the value on the row that was just inserted into original_table. In the context of an AFTER INSERT trigger, I already have the group_id value of the row that was just inserted; it's available to me as "NEW.group_id".

(The real reason I would avoid using a MAX(group_id) query to get that value is that I have no guarantee that some other process won't insert a larger value for group_id while my process is running. I'm not guaranteed that MAX(group_id) will return the group_id that was just inserted. Granted, I won't ever see that problem in single-user testing; I'd have to introduce some deliberate delays in my processing and have two processes running at the same time to make it happen. This is one of those problems that pops up in production rather than in testing, basically because we don't bother to set up the test case that would uncover it.)

If I only want a single row in my target_table for each group_id value, I would create a unique constraint on the group_id column in my target_table. Then I would use an "upsert"-type operation to update the row if it already exists, or insert a row if it doesn't.
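As a rough sketch of that constraint, assuming target_table already has a group_id column and currently holds no duplicate group_id values (the statement would fail otherwise). The index name uq_target_table_group_id is made up for illustration:

```sql
-- Hypothetical index name; assumes target_table.group_id already exists
-- and contains no duplicate values.
ALTER TABLE target_table
  ADD UNIQUE KEY uq_target_table_group_id (group_id);
```

With the unique key in place, a second insert for the same group_id raises a duplicate-key condition instead of silently adding another row, which is what an upsert relies on.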

I can easily do that with MySQL's INSERT ... ON DUPLICATE KEY UPDATE statement. This requires a unique constraint, but we already have that covered. One downside of this statement is that if my target_table has an AUTO_INCREMENT column, it will "burn" through auto_increment values even when a row already exists.

Based on what you have in your trigger/view, I could do something like this:

INSERT INTO target_table (group_id, name, street_address, ... )
SELECT o.group_id,
       MAX(CASE WHEN o.element_label = 0 THEN o.element_value end) AS name,
       MAX(CASE WHEN o.element_label = 1 THEN o.element_value end) AS street_address,
       MAX(CASE WHEN o.element_label = 2 THEN o.element_value end) AS street_address_line_2,
       MAX(CASE WHEN o.element_label = 3 THEN o.element_value end) AS city,
       MAX(CASE WHEN o.element_label = 4 THEN o.element_value end) AS state,
       MAX(CASE WHEN o.element_label = 5 THEN o.element_value end) AS zip,
       MAX(CASE WHEN o.element_label = 6 THEN o.element_value end) AS country,
       MAX(CASE WHEN o.element_label = 7 THEN o.element_value end) AS dob
  FROM schema.original_table o
 WHERE o.group_id = NEW.group_id
 GROUP BY o.group_id
    ON DUPLICATE KEY
UPDATE name                  = VALUES(name)
     , street_address        = VALUES(street_address)
     , street_address_line_2 = VALUES(street_address_line_2)
     , city                  = VALUES(city)
     , state                 = VALUES(state)
     , zip                   = VALUES(zip)
     , country               = VALUES(country)
     , dob                   = VALUES(dob)

Note that I'm counting on the UNIQUE constraint on target_table(group_id) to throw a "duplicate key" exception when the statement attempts to insert a row with a group_id value that already exists in target_table. When that happens, the statement turns into an UPDATE, with an implied WHERE group_id = VALUES(group_id) (or whatever columns were involved in the unique key constraint violation).
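To see that behavior in isolation, here is a small standalone sketch. The table demo_upsert and its values are invented purely for illustration and are not part of the original schema:

```sql
-- Throwaway demo table (hypothetical name and column types).
CREATE TABLE demo_upsert (
  group_id INT          NOT NULL,
  `name`   VARCHAR(100) NULL,
  UNIQUE KEY (group_id)
);

-- First statement: no row for group 25 yet, so this is a plain insert.
INSERT INTO demo_upsert (group_id, `name`) VALUES (25, 'first')
    ON DUPLICATE KEY UPDATE `name` = VALUES(`name`);

-- Second statement: group 25 exists, so the unique key turns this into an update.
INSERT INTO demo_upsert (group_id, `name`) VALUES (25, 'second')
    ON DUPLICATE KEY UPDATE `name` = VALUES(`name`);

-- Still a single row for group 25, now holding 'second'.
SELECT group_id, `name` FROM demo_upsert;
```

One caveat: on recent MySQL 8.0 releases the VALUES() function in this position is deprecated in favor of a row alias, though it is still accepted.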

This is the simplest approach, as long as burning through AUTO_INCREMENT values isn't a concern.

I'm not limited to the INSERT ... ON DUPLICATE KEY statement; I can "roll my own" UPSERT logic. BUT... I want to be cognizant of possible race conditions: if I perform a SELECT and then a subsequent INSERT, I leave a small window where another process can sneak in.

I could instead use a NOT EXISTS predicate to test for the existence of the row:

INSERT INTO target_table ( ...
SELECT ...
  FROM original_table o
 WHERE o.group_id = NEW.group_id
   AND NOT EXISTS (SELECT 1 FROM target_table d WHERE d.group_id = NEW.group_id)

Then I'd test whether a row was inserted (by checking the number of affected rows), and if no row was inserted, I could attempt an update. (I'm banking on the SELECT statement returning a single row.)

For better performance, I might use an anti-join pattern to do the same check (for the existence of a row), but for one row the NOT EXISTS (subquery) is fine, and I think it's easier to understand.

INSERT INTO target_table ( ...
SELECT ...
  FROM original_table o
  LEFT
  JOIN target_table t
    ON t.group_id = NEW.group_id
 WHERE o.group_id = NEW.group_id
   AND t.group_id IS NULL

(That SELECT from original_table might need to be wrapped as an inline view, since it references the same table the trigger is defined on. Turning that query into a derived table should fix that, if it's a problem.)


I said I "could" use that query from the view in my trigger. But that's not the approach I'd choose. It's not necessary. I don't really need to run a MAX(), MAX(), MAX() query to get every column.

I have all the values of the row being inserted into original_table, so I already know which element_label is being inserted, and there's really only one column that has to be changed in target_table. (Do I want the MAX(element_value), or do I really just want the value that was just inserted?)

Here's the approach I would use in the trigger. I'd avoid running a query against original_table at all, and just do the upsert on the one column in target_table:

IF NEW.element_label = 0 THEN
   -- name
   INSERT INTO target_table (group_id,       `name`) 
   VALUES (NEW.group_id, NEW.element_value)
   ON DUPLICATE KEY UPDATE                   `name` = VALUES(`name`);
ELSEIF NEW.element_label = 1 THEN
   -- street_address
   INSERT INTO target_table (group_id,       `street_address`) 
   VALUES (NEW.group_id, NEW.element_value)
   ON DUPLICATE KEY UPDATE                   `street_address` = VALUES(`street_address`);
ELSEIF NEW.element_label = 2 THEN
   -- street_address_line_2
   INSERT INTO target_table (group_id,       `street_address_line_2`) 
   VALUES (NEW.group_id, NEW.element_value)
   ON DUPLICATE KEY UPDATE                   `street_address_line_2` = VALUES(`street_address_line_2`);
ELSEIF NEW.element_label = 3 THEN
   -- city
   INSERT INTO target_table (group_id,       `city`) 
   VALUES (NEW.group_id, NEW.element_value)
   ON DUPLICATE KEY UPDATE                   `city` = VALUES(`city`);
ELSEIF NEW.element_label = 4 THEN
   ...
END IF;

I know that's not very pretty, but I think it's the best approach if target_table has to be maintained at the time rows are inserted into original_table. (The problem isn't really the database here. The problem is the EAV model, or really the "impedance mismatch" between the EAV model (one row for each attribute value) and the relational model (one column in each row for each attribute value).)

This isn't any uglier than the MAX(),MAX(),MAX() query.

I would also ditch the AUTO_INCREMENT id in the target table, and just use group_id (the value from original_table) as the primary key in my target_table, since I only want one row for each group_id.
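A possible layout for such a table, as a sketch only: the column names follow the question, but every column type here is a guess (the source values are stored as BLOBs, so plain VARCHAR columns are assumed, including for dob):

```sql
-- Hypothetical definition: types and lengths are assumptions, not the original schema.
CREATE TABLE target_table (
  group_id                INT          NOT NULL,
  `name`                  VARCHAR(255) NULL,
  `street_address`        VARCHAR(255) NULL,
  `street_address_line_2` VARCHAR(255) NULL,
  `city`                  VARCHAR(255) NULL,
  `state`                 VARCHAR(255) NULL,
  `zip`                   VARCHAR(255) NULL,
  `country`               VARCHAR(255) NULL,
  `dob`                   VARCHAR(255) NULL,
  PRIMARY KEY (group_id)  -- one row per form submission, no AUTO_INCREMENT to burn
);
```

The attribute columns are nullable on purpose: the first per-field upsert for a group creates the row with only one column filled in, and the later ones fill in the rest.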


UPDATE

You have to change the delimiter from semicolon to something else when the trigger body contains semicolons. Documentation here: http://dev.mysql.com/doc/refman/5.5/en/trigger-syntax.html

e.g.

DELIMITER $$

CREATE TRIGGER trg_original_table_ai
AFTER INSERT ON original_table
FOR EACH ROW
BEGIN
   IF NEW.element_label = 0 THEN
      -- name
      INSERT INTO target_table (group_id,       `name`) 
      VALUES (NEW.group_id, NEW.element_value)
      ON DUPLICATE KEY UPDATE                   `name` = VALUES(`name`);
   ELSEIF NEW.element_label = 1 THEN
      -- street_address
      INSERT INTO target_table (group_id,       `street_address`) 
      VALUES (NEW.group_id, NEW.element_value)
      ON DUPLICATE KEY UPDATE                   `street_address` = VALUES(`street_address`);
   END IF;
END$$

DELIMITER ;
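
One way to check the trigger above is to replay a form submission by hand. This is only a sketch: it assumes original_table.id is AUTO_INCREMENT, that target_table has a unique (or primary) key on group_id, and that its other columns are nullable. The group_id value 26 and the sample values are made up for illustration.

```sql
-- Simulate the application writing one field per INSERT (illustrative values).
INSERT INTO original_table (form_id, element_label, element_value, group_id)
VALUES (2, 0, 'fsa asdadFQ', 26);

INSERT INTO original_table (form_id, element_label, element_value, group_id)
VALUES (2, 1, 'BOOGYBOOGYBOOGY', 26);

-- The trigger fired once per INSERT, but both upserts targeted the same group_id.
SELECT group_id, `name`, `street_address`
  FROM target_table
 WHERE group_id = 26;
```

Under those assumptions, the SELECT should return a single row for group 26 carrying both values, rather than one partially filled row per field.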
