How to split values of a column in Pentaho Spoon?
I want to create a Spoon transformation that works on multiple values of a column. The input to my transformation is a CSV file. In that CSV file there is one column named 'Technology' which contains 0 or more values separated by semicolons, as follows.
+------------------------------------------------------+
row_id | Technology
+------------------------------------------------------+
1 | Cobol ; Db2 ; Jcl ; Vsam ; Cics ; Changeman ;
2 | Oracle ; Sql ; Db2 ; Oracle 9i ;
3 | Windows 2000 ; SQL ;
+------------------------------------------------------+
I have one table in the database named 'Technologies', and its schema is as follows:
+----------------------+
Technologies
+----------------------+
id | technology_name
+----------------------+
where the id column is set to auto increment.
I want to insert a value from the technology column only if that value is not already present in the Technologies table.
Can anyone please tell me:
1) Which type of step should be used to split the values of the technology column?
2) How can I insert a value only once? For example, Db2 is repeated in row 1 and row 2, but I want to insert Db2 only once.
Thanks in advance!
Use the "Split Fields" step (under "Transform") to split the contents.
CSV file input --> Split Fields --> rest of transformation
Set the "Field to Split" to "Technology" and set the "Delimiter" to a semicolon.
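To make the intended result of the split concrete, here is a minimal sketch in plain Python (not Pentaho code) applied to the sample rows from the question. The function name `split_technologies` is hypothetical, and trimming whitespace and dropping the empty value left by the trailing semicolon are assumptions of this sketch; in Spoon that cleanup would have to be configured in the step or handled by a later step.

```python
def split_technologies(cell):
    """Split a semicolon-delimited cell into trimmed, non-empty values."""
    return [part.strip() for part in cell.split(";") if part.strip()]


# Sample rows copied from the question: (row_id, Technology)
rows = [
    (1, "Cobol ; Db2 ; Jcl ; Vsam ; Cics ; Changeman ;"),
    (2, "Oracle ; Sql ; Db2 ; Oracle 9i ;"),
    (3, "Windows 2000 ; SQL ;"),
]

# Map each row_id to its list of individual technology names.
split_rows = {row_id: split_technologies(cell) for row_id, cell in rows}

for row_id, names in split_rows.items():
    print(row_id, names)
```

Note that the number of values differs from row to row, so whatever follows the split in the transformation has to cope with a variable number of values.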
Regarding the non-repeating field, my suggestion would be to make the name itself the key to the table. Shift it to lower case and replace any spaces or special characters with database-safe equivalents, then make that the primary key. You should end up with a table containing only unique entries.
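The normalized-key idea can be sketched outside Spoon as follows. This is only an illustration under stated assumptions: it uses an in-memory SQLite database, a hypothetical `technology_key` column in place of the auto-increment `id`, a hypothetical `normalize_key` helper that replaces anything other than letters and digits with an underscore, and SQLite's `INSERT OR IGNORE`, which other databases spell differently.

```python
import re
import sqlite3


def normalize_key(name):
    """Lower-case the name and replace runs of non-alphanumerics with '_'."""
    return re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")


conn = sqlite3.connect(":memory:")
# Assumed schema: the normalized name is the primary key.
conn.execute(
    "CREATE TABLE Technologies ("
    "technology_key TEXT PRIMARY KEY, "
    "technology_name TEXT NOT NULL)"
)


def insert_once(conn, name):
    """Insert the technology unless its normalized key already exists."""
    key = normalize_key(name)
    if key:
        # SQLite-specific: a primary-key conflict is silently skipped.
        conn.execute(
            "INSERT OR IGNORE INTO Technologies "
            "(technology_key, technology_name) VALUES (?, ?)",
            (key, name.strip()),
        )


for name in ["Db2", "Oracle 9i", "db2 ", "SQL", "Sql"]:
    insert_once(conn, name)

stored = conn.execute(
    "SELECT technology_key, technology_name "
    "FROM Technologies ORDER BY technology_key"
).fetchall()
print(stored)
```

Because the key is case-insensitive, "SQL" and "Sql" collapse into one row, and the first spelling seen is the one that is kept.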
hth