PostgreSQL updating one table from another

Edit: Sorry, I should have explained a bit better. The data comes out of Salesforce, dumped from the backend. The ID fields are alphanumeric (e.g. 00190000010PBdSAAX), generally all 18 characters long, and always unique. I'll make some changes to the data types, get rid of the quoted identifiers, and make some changes to the indexes, then see how I go!

I am using PostgreSQL 9.5. I'm updating one table from another. Both tables are identical in structure; one has 2 million records (the target) and the other around 70k (the source). The job is basically an update of existing records plus an insert of any new records, using a unique ID to check against.

It's taking a lot longer than I thought. Even when there is nothing to update and it just scans over the records, it still takes 5 minutes, and longer when there is something to update. I have tried with and without indexes, and joining the two fields in slightly different ways (both with just a WHERE and with an actual JOIN). I just want to know whether there is a better way of doing it or whether I'm doing it plain wrong; I have only been using Postgres for a few days.

I know 5 minutes is no big deal (longer if it performs any updates), but it's a similar process for about 9 other tables, and this is a mid-size example.

Both tables look like the one below (only the table names differ):

CREATE TABLE public."Cases"
(
  "Past_Due__c" character varying(255),
  "Case_Age__c" character varying(255),
  "Next_Step_Due_Date__c" character varying(255),
  "Id" character varying(255),
  "AccountId" character varying(255),
  "Account_Number__c" character varying(255),
  "Account_Type__c" character varying(255),
  "CaseNumber" character varying(255),
  "CaseSubTypeDetail__c" character varying(255),
  "Case_Sub_Type__c" character varying(255),
  "Case_Type__c" character varying(255),
  "ClosedDate" character varying(255),
  "Collections_Step__c" character varying(255),
  "Customer_Number__c" character varying(255),
  "Next_Collections_Step__c" character varying(255),
  "Origin" character varying(255),
  "Priority" character varying(255),
  "Related_Complaint_Case__c" character varying(255),
  "Status__c" character varying(255),
  "Subject" text,
  "Type" character varying(255),
  "CreatedDate" character varying(255),
  "OwnerId" character varying(255),
  "ContactId" character varying(255),
  "Status" character varying(255),
  "Case_Comments__c" text,
  "Subscription__c" character varying(255),
  "Description" text,
  "Case_Outcome__c" text,
  "Case_Outcome_Reason__c" text,
  "Adjustment_Amount__c" character varying(255),
  "Product_Adjustment_Amount__c" character varying(255),
  "Product_Adjustment_Reason__c" character varying(255),
  "Service__c" character varying(255),
  "ParentId" character varying(255)
)
WITH (
  OIDS=FALSE
);

The update script is below:

update public."cases" t2
set past_due__c = t1.past_due__c, case_age__c = t1.case_age__c, next_step_due_date__c = t1.next_step_due_date__c, accountid = t1.accountid, account_number__c = t1.account_number__c, account_type__c = t1.account_type__c, casesubtypedetail__c = t1.casesubtypedetail__c, case_sub_type__c = t1.case_sub_type__c, case_type__c = t1.case_type__c, closeddate = t1.closeddate, collections_step__c = t1.collections_step__c, customer_number__c = t1.customer_number__c, next_collections_step__c = t1.next_collections_step__c, origin = t1.origin, priority = t1.priority, related_complaint_case__c = t1.related_complaint_case__c, status__c = t1.status__c, subject = t1.subject, type = t1.type, ownerid = t1.ownerid, contactid = t1.contactid, status = t1.status, case_comments__c = t1.case_comments__c, subscription__c = t1.subscription__c, description = t1.description, case_outcome__c = t1.case_outcome__c, case_outcome_reason__c = t1.case_outcome_reason__c, adjustment_amount__c = t1.adjustment_amount__c, product_adjustment_amount__c = t1.product_adjustment_amount__c, product_adjustment_reason__c = t1.product_adjustment_reason__c, service__c = t1.service__c, parentid = t1.parentid, billing_account__c = t1.billing_account__c, billing_account_credit_balance__c = t1.billing_account_credit_balance__c, billing_address__c = t1.billing_address__c, lastmodifiedbyid = t1.lastmodifiedbyid, lastmodifieddate = t1.lastmodifieddate
from   public."temp_update_cases" t1
where  t1.id = t2.id

Everything else I needed to do I figured out, but this one is killing me.

Your query is basically this:

update public."cases" t2
set  . . .
from   public."temp_update_cases" t1
where  t1.id = t2.id;

I would suggest indexes:

create index idx_cases_id on public."cases"(id);
create index idx_temp_update_cases_id on public."temp_update_cases"(id);
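
To see whether the planner actually benefits from these indexes, it can help to look at the query plan before running the real update. The following is a minimal sketch, assuming the unquoted lowercase names used in the update script; the SET list is cut down to a single column purely for illustration. With only about 70k source rows against 2 million target rows, the planner may well pick a hash join over an index scan, and that is not necessarily a problem.

```sql
-- Refresh planner statistics after bulk-loading the source table,
-- so row estimates reflect the ~70k rows that were just loaded.
ANALYZE public.temp_update_cases;

-- Plain EXPLAIN only shows the chosen plan; it does not execute the update.
-- The single-column SET is an illustrative stand-in for the full list.
EXPLAIN
UPDATE public.cases t2
SET    status = t1.status
FROM   public.temp_update_cases t1
WHERE  t1.id = t2.id;
```

EXPLAIN ANALYZE would report real timings, but it does execute the statement, so it is safest to wrap it in a transaction that is rolled back.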

Notes:

  • I agree with the comments that the quoted identifiers are not a good idea.
  • Your tables should have some sort of primary key. A column called id is a good candidate.
  • Often, serial is a better option for the primary key than a character string.
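
Once id is a primary key, the "update existing, insert new" job described in the question can be expressed as a single statement, since PostgreSQL 9.5 supports INSERT ... ON CONFLICT. The sketch below is one possible shape, not a drop-in replacement: it assumes id is non-null and unique in both tables (as the question states), and it lists only three columns where the real statement would carry the full column list. The WHERE clause skips rows whose values have not changed, which matters because PostgreSQL writes a new row version for every updated row, even when the new values are identical to the old ones.

```sql
-- Assumption: id has no NULLs or duplicates, otherwise this statement fails.
-- The primary key creates its own unique index, so a separate plain index
-- on cases(id) would be redundant.
ALTER TABLE public.cases ADD PRIMARY KEY (id);

-- Upsert: rows with a new id are inserted; rows with an existing id are
-- updated only if at least one of the listed columns actually differs.
-- The three-column list is illustrative; extend it to every column needed.
INSERT INTO public.cases AS t2 (id, status, priority)
SELECT t1.id, t1.status, t1.priority
FROM   public.temp_update_cases t1
ON CONFLICT (id) DO UPDATE
SET    status   = EXCLUDED.status,
       priority = EXCLUDED.priority
WHERE  (t2.status, t2.priority)
       IS DISTINCT FROM (EXCLUDED.status, EXCLUDED.priority);
```

IS DISTINCT FROM is used instead of <> so that NULL values compare sensibly. The source table must not contain the same id twice, because ON CONFLICT DO UPDATE refuses to modify the same target row more than once in a single statement.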
