SQL query for big files

I have two text files with about 100 thousand rows each. Every row is a ten-digit number. The two files may contain some of the same rows, and I want to filter them. For example:

file1:

1256745889
1515487882 <---same
4841453222

...

file2:

7748523695
1515487882 <---same
8745529699

...

This is my actual SQL query:

SELECT
    table1.cjsz
FROM
    table1
    INNER JOIN table2 ON 
        table1.cjsz != table2.cjsz
WHERE
    LENGTH(table1.26_code)=0;

It's not giving the expected result. Can you give me a hand with this?

Thanks

To get all ids in table1 that are not in table2:

select cjsz from table1
MINUS
select cjsz from table2

To get all ids in table2 that are not in table1:

select cjsz from table2
MINUS
select cjsz from table1

To get all ids that are in both table1 and table2:

select cjsz from table1
INTERSECT
select cjsz from table2
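
As a quick way to try these three set operations, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the two files have already been loaded into `table1` and `table2` with a single `cjsz` column; the in-memory database and the sample rows (taken from the question) are only for illustration. Note that MINUS is Oracle syntax; SQLite, like most other databases, spells the same operator EXCEPT.

```python
import sqlite3

# Sample rows copied from the example in the question (illustrative only).
file1_rows = ["1256745889", "1515487882", "4841453222"]
file2_rows = ["7748523695", "1515487882", "8745529699"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (cjsz TEXT)")
conn.execute("CREATE TABLE table2 (cjsz TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?)", [(r,) for r in file1_rows])
conn.executemany("INSERT INTO table2 VALUES (?)", [(r,) for r in file2_rows])


def column(sql):
    """Run a query and return its single column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


# SQLite has no MINUS keyword; EXCEPT is the equivalent operator.
only_in_1 = column("SELECT cjsz FROM table1 EXCEPT SELECT cjsz FROM table2")
only_in_2 = column("SELECT cjsz FROM table2 EXCEPT SELECT cjsz FROM table1")
in_both = column("SELECT cjsz FROM table1 INTERSECT SELECT cjsz FROM table2")

print(only_in_1)
print(only_in_2)
print(in_both)
```

All three operators also remove duplicates, so a number repeated inside one file is reported only once.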

Not exactly sure what you're trying to accomplish, but have you tried something like:

SELECT table1.cjsz FROM table1 WHERE LENGTH(table1.[26_code]) = 0
EXCEPT
SELECT table2.cjsz FROM table2

You can use the EXCEPT operator if you are using SQL Server; otherwise you can try this query:

SELECT
     table1.cjsz
FROM
     table1 LEFT OUTER JOIN table2 ON table1.cjsz = table2.cjsz
WHERE
     table2.cjsz IS NULL AND LENGTH(table1.26_code) = 0
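
To see what this anti-join keeps and drops, here is a small sketch run through Python's sqlite3 module. The sample rows and the contents of the `26_code` column are made up for illustration, and the column name is double-quoted because SQLite does not accept an unquoted identifier that starts with a digit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "26_code" starts with a digit, so it is quoted for SQLite.
conn.execute('CREATE TABLE table1 (cjsz TEXT, "26_code" TEXT)')
conn.execute("CREATE TABLE table2 (cjsz TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [
        ("1256745889", ""),   # empty code, not in table2: kept
        ("1515487882", ""),   # empty code, but also in table2: dropped
        ("4841453222", "X"),  # not in table2, but code is not empty: dropped
    ],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?)",
    [("7748523695",), ("1515487882",), ("8745529699",)],
)

# The LEFT OUTER JOIN keeps every table1 row; rows without a partner in
# table2 get NULL in table2.cjsz, and the WHERE clause keeps only those.
anti_join = """
    SELECT table1.cjsz
    FROM table1
    LEFT OUTER JOIN table2 ON table1.cjsz = table2.cjsz
    WHERE table2.cjsz IS NULL
      AND LENGTH(table1."26_code") = 0
"""
unmatched = [row[0] for row in conn.execute(anti_join)]
print(unmatched)
```

The join condition uses `=` rather than the `!=` from the question: joining on inequality pairs each row with nearly every row of the other table, which is why the original query returns far more rows than expected.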

If I understand you correctly, you're looking for the symmetric difference between Table1 and Table2. After searching the Web, I found a nice blog entry giving a SQL sample for this. I prepared my own sample; please give it a try and tell me if it's what you need.

CREATE TABLE Table1 (id int, value char(1));

INSERT INTO Table1 values (1, 'H');
INSERT INTO Table1 values (2, 'e');
INSERT INTO Table1 values (3, 'l');
INSERT INTO Table1 values (4, 'l');
INSERT INTO Table1 values (5, 'o');
INSERT INTO Table1 values (6, ' ');
INSERT INTO Table1 values (7, ' ');

CREATE TABLE Table2 (id int, value char(1));

INSERT INTO Table2 values (6, ' ');
INSERT INTO Table2 values (7, ' ');
INSERT INTO Table2 values (8, ' ');
INSERT INTO Table2 values (9, 'w');
INSERT INTO Table2 values (10, 'o');
INSERT INTO Table2 values (11, 'r');
INSERT INTO Table2 values (12, 'l');
INSERT INTO Table2 values (13, 'd');


SELECT *
  FROM (
    SELECT a.id, a.value FROM Table1 a
    UNION ALL
    SELECT b.id, b.value FROM Table2 b
  ) AS t
  GROUP BY t.id, t.value
  HAVING COUNT(id) = 1
ORDER BY id;
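
The same idea applied to the single-column tables from the question might look like the sketch below, run through Python's sqlite3 module. The table and column names come from the question; the in-memory database and the deliberately repeated row are assumptions for illustration. One caveat of the GROUP BY / HAVING approach is that a value repeated inside one file would be counted twice and wrongly excluded, so this variant applies SELECT DISTINCT to each table first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (cjsz TEXT)")
conn.execute("CREATE TABLE table2 (cjsz TEXT)")
# "1256745889" is deliberately repeated inside table1 to show the caveat.
conn.executemany(
    "INSERT INTO table1 VALUES (?)",
    [("1256745889",), ("1256745889",), ("1515487882",), ("4841453222",)],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?)",
    [("7748523695",), ("1515487882",), ("8745529699",)],
)

# After de-duplicating each side, a value that occurs exactly once in the
# combined set belongs to only one of the two tables.
symmetric_difference_sql = """
    SELECT t.cjsz
    FROM (
        SELECT DISTINCT cjsz FROM table1
        UNION ALL
        SELECT DISTINCT cjsz FROM table2
    ) AS t
    GROUP BY t.cjsz
    HAVING COUNT(*) = 1
    ORDER BY t.cjsz
"""
sym_diff = [row[0] for row in conn.execute(symmetric_difference_sql)]
print(sym_diff)
```

Only 1515487882, the row shared by both tables, is left out of the result.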
