Pentaho Kettle program in Java to merge multiple CSV files by columns
I have two CSV files, employee.csv and loan.csv.
In employee.csv I have four columns: empid (Integer), name (String), age (Integer), and education (String).
In loan.csv I have three columns: loan (Double), balance (Double), and empid (Integer).
Now I want to merge these two CSV files into a single CSV file on the empid column. So in the result.csv file the columns should be,
I also have to achieve this using only the Kettle API from a Java program. Can anyone please help me?
First of all, you need to create a Kettle transformation as below:
I have placed the .ktr code here.
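To make the intended result concrete, here is a minimal plain-Java sketch of the kind of join on empid that such a transformation would be expected to perform. It does not use the Kettle API and is not the transformation itself. It assumes an inner join, an output column order of empid, name, age, education, loan, balance (the question does not spell this out), and simple CSV without quoted fields. The class name CsvJoinSketch and the sample rows are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; this does not use the Kettle API.
class CsvJoinSketch {

    /**
     * Inner-joins employee rows (empid,name,age,education) with loan rows
     * (loan,balance,empid) on empid. Both inputs must start with a header line.
     * Assumes simple CSV: no quoted fields and no embedded commas.
     */
    static List<String> join(List<String> employeeLines, List<String> loanLines) {
        // Index loans by empid (third column of loan.csv).
        // One employee may have several loans, so keep a list per key.
        Map<String, List<String[]>> loansByEmpId = new HashMap<>();
        for (String line : loanLines.subList(1, loanLines.size())) {
            String[] f = line.split(",", -1);
            loansByEmpId.computeIfAbsent(f[2].trim(), k -> new ArrayList<>()).add(f);
        }

        List<String> result = new ArrayList<>();
        // Assumed output layout; the question does not state it.
        result.add("empid,name,age,education,loan,balance");
        for (String line : employeeLines.subList(1, employeeLines.size())) {
            String[] e = line.split(",", -1);
            List<String[]> loans = loansByEmpId.get(e[0].trim());
            if (loans == null) {
                continue; // inner join: employees without a loan are skipped
            }
            for (String[] l : loans) {
                result.add(String.join(",",
                        e[0].trim(), e[1].trim(), e[2].trim(), e[3].trim(),
                        l[0].trim(), l[1].trim()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample rows, only to show the shape of the data.
        List<String> employees = Arrays.asList(
                "empid,name,age,education",
                "1,Ann,30,BSc",
                "2,Bob,41,MSc");
        List<String> loans = Arrays.asList(
                "loan,balance,empid",
                "1000.0,250.0,1");
        for (String row : join(employees, loans)) {
            System.out.println(row);
        }
    }
}
```

If employees without a loan should also appear in result.csv, that would correspond to a left outer join instead, and the skipped branch would emit the employee columns with empty loan and balance fields.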
Secondly, if you want to execute this transformation using Java, I suggest you read this blog, where I have explained how to execute a .ktr/.kjb file using Java.
Extra points:
If the names of the CSV files need to be passed as parameters from the Java code, you can do that by adding the code below:
trans.setParameterValue(parameterName, parameterValue);
where parameterName is the name of the parameter and parameterValue is the name or location of the file.
I have already defined the file names as parameters in the Kettle code I have shared.
Hope it helps :)