SQL query on large files

Question

I have two text files with about 100 thousand rows. Every row is a ten-digit number. The two files may contain identical rows, and I want to filter them. For example:

file1:

1256745889
1515487882 <---same
4841453222

... ...

file2:

7748523695
1515487882 <---same
8745529699

... ...

This is my actual SQL query:

SELECT
    table1.cjsz
FROM
    table1
    INNER JOIN table2 ON 
        table1.cjsz != table2.cjsz
WHERE
    LENGTH(table1.26_code)=0;

It's not giving the expected result. Can you give me a hand with this?

Thanks

Answer 1

To get all IDs in table1 that are not in table2:

select cjsz from table1
MINUS
select cjsz from table2

To get all IDs in table2 that are not in table1:

select cjsz from table2
MINUS
select cjsz from table1

To get all IDs that are in both table1 and table2:

select cjsz from table1
INTERSECT
select cjsz from table2
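
MINUS is the Oracle spelling; most other engines call the same operator EXCEPT. As a minimal sketch of the three queries above, assuming each file has been loaded into a single-column table, here they are run against an in-memory SQLite database from Python. The choice of SQLite, the `column` helper, and the three sample rows per table are assumptions for illustration; the table and column names come from the question.

```python
import sqlite3

# Sample rows from the question; the in-memory SQLite database is an assumption.
file1_rows = ["1256745889", "1515487882", "4841453222"]
file2_rows = ["7748523695", "1515487882", "8745529699"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (cjsz TEXT)")
conn.execute("CREATE TABLE table2 (cjsz TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?)", [(r,) for r in file1_rows])
conn.executemany("INSERT INTO table2 VALUES (?)", [(r,) for r in file2_rows])


def column(sql):
    """Hypothetical helper: run a query and return its first column, sorted."""
    return sorted(row[0] for row in conn.execute(sql))


# SQLite has no MINUS keyword, so EXCEPT stands in for it here.
only_in_1 = column("SELECT cjsz FROM table1 EXCEPT SELECT cjsz FROM table2")
only_in_2 = column("SELECT cjsz FROM table2 EXCEPT SELECT cjsz FROM table1")
in_both = column("SELECT cjsz FROM table1 INTERSECT SELECT cjsz FROM table2")

print(only_in_1, only_in_2, in_both)
```

Both set operators also remove duplicate rows from their result, which may or may not be what you want if a file repeats a number.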

Answer 2

I'm not exactly sure what you're trying to accomplish, but have you tried something like:

SELECT table1.cjsz FROM table1 WHERE LENGTH(table1.[26_code]) = 0
EXCEPT
SELECT table2.cjsz FROM table2

Answer 3

You can use the EXCEPT operator if you are using SQL Server; otherwise you can try this query:

SELECT
     table1.cjsz
FROM
     table1 LEFT OUTER JOIN table2 ON table1.cjsz = table2.cjsz
WHERE
     table2.cjsz IS NULL AND LENGTH(table1.26_code) = 0
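
To see why this anti-join behaves differently from the `!=` join in the question, here is a small sketch that runs both against an in-memory SQLite database from Python. SQLite, the sample rows, and the empty `26_code` values are assumptions for illustration; the column is double-quoted because its name starts with a digit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "26_code" is quoted because the identifier begins with a digit.
conn.execute('CREATE TABLE table1 (cjsz TEXT, "26_code" TEXT)')
conn.execute("CREATE TABLE table2 (cjsz TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, '')",
    [("1256745889",), ("1515487882",), ("4841453222",)],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?)",
    [("7748523695",), ("1515487882",), ("8745529699",)],
)

# Anti-join: keep table1 rows for which the LEFT JOIN found no partner.
anti_join = sorted(
    row[0]
    for row in conn.execute(
        """
        SELECT table1.cjsz
        FROM table1
        LEFT OUTER JOIN table2 ON table1.cjsz = table2.cjsz
        WHERE table2.cjsz IS NULL AND LENGTH(table1."26_code") = 0
        """
    )
)

# The question's join: every table1 row pairs with every *different* table2 row.
neq_join_count = conn.execute(
    """
    SELECT COUNT(*)
    FROM table1
    INNER JOIN table2 ON table1.cjsz != table2.cjsz
    WHERE LENGTH(table1."26_code") = 0
    """
).fetchone()[0]

print(anti_join, neq_join_count)
```

With three rows per table, the `!=` join returns eight rows (nine pairs minus the one matching pair), so on 100 thousand rows per file it would approach a full cross product rather than filter anything.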

Answer 4

If I understand you correctly, you're looking for the symmetric difference between Table1 and Table2. After searching the web, I found a nice blog entry with a SQL sample for this. I prepared my own sample; please give it a try and tell me if it's what you need.

CREATE TABLE Table1 (id int, value char(1));

INSERT INTO Table1 values (1, 'H');
INSERT INTO Table1 values (2, 'e');
INSERT INTO Table1 values (3, 'l');
INSERT INTO Table1 values (4, 'l');
INSERT INTO Table1 values (5, 'o');
INSERT INTO Table1 values (6, ' ');
INSERT INTO Table1 values (7, ' ');

CREATE TABLE Table2 (id int, value char(1));

INSERT INTO Table2 values (6, ' ');
INSERT INTO Table2 values (7, ' ');
INSERT INTO Table2 values (8, ' ');
INSERT INTO Table2 values (9, 'w');
INSERT INTO Table2 values (10, 'o');
INSERT INTO Table2 values (11, 'r');
INSERT INTO Table2 values (12, 'l');
INSERT INTO Table2 values (13, 'd');


SELECT *
  FROM (
    SELECT a.id, a.value FROM Table1 a
    UNION ALL
    SELECT b.id, b.value FROM Table2 b
  ) AS t
  GROUP BY t.id, t.value
  HAVING COUNT(id) = 1
ORDER BY id;
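
As a quick check of the query above, the following sketch loads the same sample rows into an in-memory SQLite database from Python and runs it unchanged. Using SQLite here is an assumption; the answer does not say which database it was written for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (id int, value char(1))")
conn.execute("CREATE TABLE Table2 (id int, value char(1))")

# Same sample data as in the answer above.
table1_rows = [(1, "H"), (2, "e"), (3, "l"), (4, "l"), (5, "o"), (6, " "), (7, " ")]
table2_rows = [(6, " "), (7, " "), (8, " "), (9, "w"),
               (10, "o"), (11, "r"), (12, "l"), (13, "d")]
conn.executemany("INSERT INTO Table1 VALUES (?, ?)", table1_rows)
conn.executemany("INSERT INTO Table2 VALUES (?, ?)", table2_rows)

# Rows that appear exactly once in the combined set belong to only one table.
symmetric_difference = conn.execute(
    """
    SELECT *
      FROM (
        SELECT a.id, a.value FROM Table1 a
        UNION ALL
        SELECT b.id, b.value FROM Table2 b
      ) AS t
      GROUP BY t.id, t.value
      HAVING COUNT(id) = 1
    ORDER BY id
    """
).fetchall()

ids = [row[0] for row in symmetric_difference]
print(ids)
```

Rows 6 and 7 exist in both tables and drop out. One caveat of counting this way: a row duplicated within a single table also has a count above 1, so it would be dropped even if the other table lacks it.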

4 answers

Answer 1
0 2012-02-25 21:17:23

Answer 2
0 2012-02-25 21:17:45

Answer 3
0 2012-02-25 21:20:24

Answer 4
0 2012-02-25 21:40:33
