Getting all IDs from one table in a database that aren't in another table in another database, using Postgres and PHP

I'm using PostgreSQL 8.1, and I have two databases, each with a table named clients. The two tables have identical structure and contain roughly 50k rows each.

I need to obtain all IDs in one table that aren't in the other. I have the following solution:

$sql = "SELECT id FROM clients WHERE id NOT IN(".pg_query($conn1,'SELECT id FROM clients').")";
$result = pg_query($conn2,$sql);

Before I run this: is this a good way to do it via PHP, or are there better, faster ways?

Unfortunately I can't use dblink, since I do not have the privileges to install it.

dblink

You can use dblink to refer to tables from another database in your query.

Your query could then look like this (one of several possible approaches):

SELECT id
FROM   clients c
LEFT   JOIN (
    SELECT *
    FROM   dblink('dbname=mydb', 'SELECT id FROM clients')
    AS     c(id int)
    )  x USING (id)
WHERE  x.id IS NULL;

Since PostgreSQL 9.1, installing an additional module has become even simpler:

CREATE EXTENSION dblink;

Run once per database. As for the necessary privileges:

Loading an extension requires the same privileges that would be required to create its component objects. For most extensions this means superuser or database owner privileges are needed.

COPY

If dblink (or a similar cross-database tool) is not available to you, try COPY TO / FROM to transfer the contents of the foreign table:

In the foreign db:

COPY clients TO '/path/to/file';

In the home db:

CREATE TEMP TABLE c_tmp (LIKE clients);
COPY c_tmp FROM '/path/to/file';

SELECT id
FROM   clients c
LEFT   JOIN c_tmp x USING (id)
WHERE  x.id IS NULL;

The temp table is dropped automatically at the end of the session.
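COPY to or from a server-side file normally requires superuser rights, which may not be available in this situation. A rough sketch of the same transfer done through the client instead, using PHP's pg_copy_to and pg_copy_from: the function name homeIdsMissingInForeign is hypothetical, and $foreign and $home are assumed to be open connections to the two databases.

```php
<?php
// Hypothetical helper: copies the clients table from $foreign into a temp
// table on $home, then returns the IDs present in home but absent in foreign.
// Assumes both connections are already open and both tables share a layout.
function homeIdsMissingInForeign($foreign, $home): array
{
    // Pull all rows of the foreign table to the client as COPY-formatted lines.
    $rows = pg_copy_to($foreign, 'clients');
    if ($rows === false) {
        throw new RuntimeException('pg_copy_to failed');
    }

    // The temp table exists only for this session on the home connection.
    pg_query($home, 'CREATE TEMP TABLE c_tmp (LIKE clients)');
    if (!pg_copy_from($home, 'c_tmp', $rows)) {
        throw new RuntimeException('pg_copy_from failed');
    }

    // Same anti-join as in the SQL above.
    $result = pg_query(
        $home,
        'SELECT id FROM clients c LEFT JOIN c_tmp x USING (id) WHERE x.id IS NULL'
    );
    $ids = array();
    while ($row = pg_fetch_assoc($result)) {
        $ids[] = $row['id'];
    }
    return $ids;
}
```

This holds the whole foreign table in PHP memory for a moment, which should be acceptable at around 50k rows but may not scale to much larger tables.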

If you can't use a database link, I think you will have to do it in the middleware, and this middleware should have a lot of RAM.

I would also recommend not using PHP for this purpose, but if you have no other option, I would do it like this:

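$ids = array();
$idsNotExist = array();

// Collect all IDs from the first database ($conn1 is an open connection to it).
$resultDb1 = pg_query($conn1, 'SELECT id FROM clients');
while ( $row = pg_fetch_assoc($resultDb1) ) {
  array_push($ids, $row['id']);
}

// Keep the IDs from the second database that are missing in the first.
$resultDb2 = pg_query($conn2, 'SELECT id FROM clients');
while ( $row = pg_fetch_assoc($resultDb2) ) {
  // in_array() takes the needle first, then the haystack.
  if ( !in_array($row['id'], $ids) ) {
    array_push($idsNotExist, $row['id']);
  }
}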

Because id is the primary key, the best way would of course be to write your own in_array function that removes an ID from the haystack once it has been found. That ID will never be searched for again, so each subsequent search gets a bit faster.
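Scanning the list with in_array for every row costs time proportional to the product of the two table sizes. A minimal sketch of a different approach (a keyed lookup, not the removal-based in_array described above): flip the first ID list into array keys so that each membership check is a hash lookup. The function name idsMissingFrom is hypothetical, and the literal arrays stand in for the IDs collected from the two query results.

```php
<?php
// Hypothetical helper: returns the values of $candidates that do not occur
// in $known. Assumes IDs are integers or numeric strings, which is what
// pg_fetch_assoc yields for an integer column.
function idsMissingFrom(array $known, array $candidates): array
{
    // array_flip turns the values into keys, so isset() is a hash lookup
    // instead of a linear scan over the whole list.
    $lookup = array_flip($known);
    $missing = array();
    foreach ($candidates as $id) {
        if (!isset($lookup[$id])) {
            $missing[] = $id;
        }
    }
    return $missing;
}

// Literal data standing in for the IDs fetched from the two databases.
$idsDb1 = array('1', '2', '3', '5');
$idsDb2 = array('2', '3', '4', '5', '6');
$idsNotExist = idsMissingFrom($idsDb1, $idsDb2);
print_r($idsNotExist);
```

PHP's built-in array_diff($idsDb2, $idsDb1) should give the same values, although it keeps the original array keys rather than renumbering them.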

SELECT id
FROM db.clients
WHERE id NOT IN (
   SELECT id
   FROM otherdb.clients
)

One single query. There's no need for multiple db connections, unless the user you used for the first connection doesn't have access rights on one of those tables. SQL allows fully qualified db.table.field references to allow cross-db queries.
