Comparing CSV data with an Oracle database table using Java

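I need to compare the data in my CSV file with an Oracle database table. The data contains nearly 9,000 rows. Are there any links or sources that show how to do this? I am following the thread below, but it calls the equals method on a list of strings, which does not compare the CSV data with the database table row by row.

Comparing a CSV file with a MySQL database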

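Java? I don't speak Java. But since this is an Oracle database, I'd suggest a different approach: an external table. Below is an example based on Scott's sample schema and its DEPT table. The CSV file contains data that "fits" that table, but I want to see the differences.

The test_dept.csv file: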

10,ACCOUNTING,NEW YORK
20,SALES,CHICAGO
30,RESEARCH,DALLAS
40,OPERATIONS,BOSTON
50,CIA,LANGLEY

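The external table: to use it, you need a directory (line 8), which is an Oracle object that points to a filesystem directory, usually on the database server. That directory contains the CSV file (line 18). The user who will query the external table must have at least the read privilege on the directory: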

SQL> create table dept_ext
  2    (deptno   char(2),
  3     dname    char(20),
  4     loc      char(20)
  5    )
  6  organization external (
  7    type oracle_loader
  8    default directory ext_dir
  9    access parameters (
 10      records delimited by newline
 11      fields terminated by ','
 12      missing field values are null
 13      ( deptno  char(2),
 14        dname   char(20),
 15        loc     char(20)
 16      )
 17    )
 18    location ('test_dept.csv')
 19  )
 20  reject limit unlimited;

Table created.

Does it return any data?

SQL> select * from dept_ext;

DE DNAME                LOC
-- -------------------- --------------------
10 ACCOUNTING           NEW YORK
20 SALES                CHICAGO
30 RESEARCH             DALLAS
40 OPERATIONS           BOSTON
50 CIA                  LANGLEY

Yes, it does. What is in the "original" dept table?

SQL> select * from dept;

    DEPTNO DNAME          LOC
---------- -------------- -------------
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON

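OK, what now? Because it is a "table", you can write any select you want against it, join it to other tables, and so on. For example: which departments in the CSV file do not exist in the database table?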

SQL> select * from dept_ext
  2  where deptno not in (select deptno from dept);

DE DNAME                LOC
-- -------------------- --------------------
50 CIA                  LANGLEY

If I join the two tables on deptno, do any department names differ?

SQL> select e.deptno, e.dname, e.loc, d.dname, d.loc
  2  from dept_ext e join dept d on d.deptno = e.deptno
  3                             and trim(d.dname) <> trim(e.dname);

DE DNAME                LOC                  DNAME          LOC
-- -------------------- -------------------- -------------- -------------
20 SALES                CHICAGO              RESEARCH       DALLAS
30 RESEARCH             DALLAS               SALES          CHICAGO

SQL>

And so on. It looks like this might do what you want.
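If the comparison has to happen in Java rather than in the database, the two checks above (keys missing from the table, and keys whose department name differs) can be sketched in memory. This is only a rough illustration under stated assumptions: the class and method names are hypothetical, the table rows are assumed to have already been fetched (for example over JDBC) and rendered as comma-separated lines, and fields are assumed to contain no quotes or embedded commas.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; none of these names come from the answer above.
class DeptCompare {

    // Parses "deptno,dname,loc" lines into a map keyed by deptno.
    // Assumption: no quoted fields and no commas inside values.
    static Map<String, String[]> parse(List<String> lines) {
        Map<String, String[]> rows = new LinkedHashMap<>();
        for (String line : lines) {
            String[] fields = line.split(",", -1);
            for (int i = 0; i < fields.length; i++) {
                fields[i] = fields[i].trim();
            }
            rows.put(fields[0], fields);
        }
        return rows;
    }

    // Keys present in the CSV but absent from the table (the NOT IN query).
    static List<String> missingFromTable(Map<String, String[]> csv,
                                         Map<String, String[]> table) {
        List<String> result = new ArrayList<>();
        for (String key : csv.keySet()) {
            if (!table.containsKey(key)) {
                result.add(key);
            }
        }
        return result;
    }

    // Keys present on both sides whose dname (second field) differs (the join query).
    static List<String> differentName(Map<String, String[]> csv,
                                      Map<String, String[]> table) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String[]> entry : csv.entrySet()) {
            String[] tableRow = table.get(entry.getKey());
            if (tableRow != null && !tableRow[1].equals(entry.getValue()[1])) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String[]> csv = parse(List.of(
                "10,ACCOUNTING,NEW YORK", "20,SALES,CHICAGO", "30,RESEARCH,DALLAS",
                "40,OPERATIONS,BOSTON", "50,CIA,LANGLEY"));
        // Stand-in for the rows of the dept table shown above.
        Map<String, String[]> table = parse(List.of(
                "10,ACCOUNTING,NEW YORK", "20,RESEARCH,DALLAS", "30,SALES,CHICAGO",
                "40,OPERATIONS,BOSTON"));

        System.out.println(missingFromTable(csv, table)); // [50]
        System.out.println(differentName(csv, table));    // [20, 30]
    }
}
```

Keying both sides by deptno is what makes this a row-by-row comparison: each CSV row is matched to its counterpart in the table instead of the two lists being compared as a whole.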

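If you try to do this in Java, the code will be long. Comparing a CSV file with a table in an Oracle database is convenient, however, with SPL, an open-source Java package.

Suppose we have an employee table in the Oracle database: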

CREATE TABLE EMPLOYEE
  (EID NUMBER(8),
  NAME VARCHAR2(255),
  SURNAME VARCHAR2(255),
  GENDER VARCHAR2(255),
  STATE VARCHAR2(255),
  BIRTHDAY DATE,
  HIREDATE DATE,
  DEPT VARCHAR2(255),
  SALARY NUMBER(8)
);

INSERT INTO EMPLOYEE VALUES (1,'Rebecca','Moore','F','California',TIMESTAMP'1974-11-20 00:00:00.0',TIMESTAMP'2005-03-11 00:00:00.0','R&D',7000);
INSERT INTO EMPLOYEE VALUES (2,'Ashley','Wilson','F','New York',TIMESTAMP'1980-07-19 00:00:00.0',TIMESTAMP'2008-03-16 00:00:00.0','Finance',11000);
INSERT INTO EMPLOYEE VALUES (3,'Rachel','Johnson','F','New Mexico',TIMESTAMP'1970-12-17 00:00:00.0',TIMESTAMP'2010-12-01 00:00:00.0','Sales',9000);
INSERT INTO EMPLOYEE VALUES (4,'Emily','Smith','F','Texas',TIMESTAMP'1985-03-07 00:00:00.0',TIMESTAMP'2006-08-15 00:00:00.0','HR',7000);
INSERT INTO EMPLOYEE VALUES (5,'Ashley','Smith','F','Texas',TIMESTAMP'1975-05-13 00:00:00.0',TIMESTAMP'2004-07-30 00:00:00.0','R&D',16000);
INSERT INTO EMPLOYEE VALUES (6,'Matthew','Johnson','M','California',TIMESTAMP'1984-07-07 00:00:00.0',TIMESTAMP'2005-07-07 00:00:00.0','Sales',11000);
INSERT INTO EMPLOYEE VALUES (7,'Alexis','Smith','F','Illinois',TIMESTAMP'1972-08-16 00:00:00.0',TIMESTAMP'2002-08-16 00:00:00.0','Sales',9000);
INSERT INTO EMPLOYEE VALUES (8,'Megan','Wilson','F','California',TIMESTAMP'1979-04-19 00:00:00.0',TIMESTAMP'1984-04-19 00:00:00.0','Marketing',11000);
INSERT INTO EMPLOYEE VALUES (9,'Victoria','Davis','F','Texas',TIMESTAMP'1983-12-07 00:00:00.0',TIMESTAMP'2009-12-07 00:00:00.0','HR',3000);
INSERT INTO EMPLOYEE VALUES (10,'Ryan','Johnson','M','Pennsylvania',TIMESTAMP'1976-03-12 00:00:00.0',TIMESTAMP'2006-03-12 00:00:00.0','R&D',13000);

and a CSV file, employee.csv:

EID,NAME,SURNAME,GENDER,STATE,BIRTHDAY,HIREDATE,DEPT,SALARY
1,Rebecca,Moore,F,California,1974-11-20 00:00:00,2005-03-11 00:00:00,R&D,7000
3,Rachel,Johnson,F,New Mexico,1970-12-17 00:00:00,2010-12-01 00:00:00,Sales,9000
5,Ashley,Smith,F,Texas,1975-05-13 00:00:00,2004-07-30 00:00:00,R&D,16000
7,Alexis,Smith,F,Illinois,1972-08-16 00:00:00,2002-08-16 00:00:00,Sales,9000
9,Victoria,Davis,F,Texas,1983-12-07 00:00:00,2009-12-07 00:00:00,HR,3000

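The goal is to get the difference between the Oracle employee table and the CSV file, that is, the rows that are in the table but not in the file (expected result below):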

EID,NAME,SURNAME,GENDER,STATE,BIRTHDAY,HIREDATE,DEPT,SALARY
2,Ashley,Wilson,F,New York,1980-07-19 00:00:00,2008-03-16 00:00:00,Finance,11000
4,Emily,Smith,F,Texas,1985-03-07 00:00:00,2006-08-15 00:00:00,HR,7000
6,Matthew,Johnson,M,California,1984-07-07 00:00:00,2005-07-07 00:00:00,Sales,11000
8,Megan,Wilson,F,California,1979-04-19 00:00:00,1984-04-19 00:00:00,Marketing,11000
10,Ryan,Johnson,M,Pennsylvania,1976-03-12 00:00:00,2006-03-12 00:00:00,R&D,13000

and to compute the intersection of the Oracle employee table and the CSV file:

EID,NAME,SURNAME,GENDER,STATE,BIRTHDAY,HIREDATE,DEPT,SALARY
1,Rebecca,Moore,F,California,1974-11-20 00:00:00,2005-03-11 00:00:00,R&D,7000
3,Rachel,Johnson,F,New Mexico,1970-12-17 00:00:00,2010-12-01 00:00:00,Sales,9000
5,Ashley,Smith,F,Texas,1975-05-13 00:00:00,2004-07-30 00:00:00,R&D,16000
7,Alexis,Smith,F,Illinois,1972-08-16 00:00:00,2002-08-16 00:00:00,Sales,9000
9,Victoria,Davis,F,Texas,1983-12-07 00:00:00,2009-12-07 00:00:00,HR,3000
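For orientation, the difference and intersection described above can be sketched with plain Java set operations. This is not the SPL solution and not a definitive implementation: the class and method names are hypothetical, and it assumes the table rows have already been fetched and formatted exactly like the CSV lines (same column order, same date format), so that equal rows are equal strings.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; assumes both sides are already normalized to identical text.
class RowSetOps {

    // Rows that are in the table but not in the CSV file (difference).
    static Set<String> minus(List<String> tableRows, List<String> csvRows) {
        Set<String> result = new LinkedHashSet<>(tableRows); // keeps table order
        result.removeAll(new HashSet<>(csvRows));
        return result;
    }

    // Rows that appear in both the table and the CSV file (intersection).
    static Set<String> intersect(List<String> tableRows, List<String> csvRows) {
        Set<String> result = new LinkedHashSet<>(tableRows);
        result.retainAll(new HashSet<>(csvRows));
        return result;
    }

    public static void main(String[] args) {
        // First three rows of the sample table, rendered as CSV lines.
        List<String> tableRows = List.of(
                "1,Rebecca,Moore,F,California,1974-11-20 00:00:00,2005-03-11 00:00:00,R&D,7000",
                "2,Ashley,Wilson,F,New York,1980-07-19 00:00:00,2008-03-16 00:00:00,Finance,11000",
                "3,Rachel,Johnson,F,New Mexico,1970-12-17 00:00:00,2010-12-01 00:00:00,Sales,9000");
        // First two data rows of employee.csv.
        List<String> csvRows = List.of(
                "1,Rebecca,Moore,F,California,1974-11-20 00:00:00,2005-03-11 00:00:00,R&D,7000",
                "3,Rachel,Johnson,F,New Mexico,1970-12-17 00:00:00,2010-12-01 00:00:00,Sales,9000");

        minus(tableRows, csvRows).forEach(System.out::println);     // the EID 2 row
        intersect(tableRows, csvRows).forEach(System.out::println); // the EID 1 and EID 3 rows
    }
}
```

Comparing whole lines as strings is the simplest choice but also the most brittle one: any formatting difference (trailing spaces, a different timestamp format) makes otherwise equal rows look different, so the normalization step matters more than the set operation itself.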

We need only a few lines of SPL code:

   A
1  =ORACLE.query@x("SELECT * FROM EMPLOYEE")
2  =file("employee.csv").import@ct(EID:decimal,NAME,SURNAME,GENDER,STATE,BIRTHDAY,HIREDATE,DEPT,SALARY:decimal)
3  =INTERSECT=[A1,A2].merge@oi(EID,NAME,SURNAME,GENDER,STATE,BIRTHDAY,HIREDATE,DEPT,SALARY)
4  =MINUS=[A1,A2].merge@od(EID,NAME,SURNAME,GENDER,STATE,BIRTHDAY,HIREDATE,DEPT,SALARY)

SPL provides a JDBC driver that Java can call. Save the SPL script above as cmp.splx and invoke it from Java the same way you would call a stored procedure:

…
Class.forName("com.esproc.jdbc.InternalDriver");
con= DriverManager.getConnection("jdbc:esproc:local://");
st=con.prepareCall("call cmp()");
st.execute();
…

