
How to split values of a column in Pentaho Spoon?

I want to create a Spoon transformation that works on multiple values of a column. The input to my transformation is a CSV file. In that CSV file there is one column named 'Technology' which contains zero or more values separated by semicolons, as follows.

+--------+-----------------------------------------------+
| row_id | Technology                                    |
+--------+-----------------------------------------------+
| 1      | Cobol ; Db2 ; Jcl ; Vsam ; Cics ; Changeman ; |
| 2      | Oracle ; Sql ; Db2 ; Oracle 9i ;              |
| 3      | Windows 2000 ; SQL ;                          |
+--------+-----------------------------------------------+

I have one table in the database named 'Technologies', and its schema is as follows:

Technologies
+----+-----------------+
| id | technology_name |
+----+-----------------+

where the id column is set to auto increment.

I want to insert a value from the Technology column only if that value is not already present in the Technologies table.

Can anyone please tell me:

1) Which type of step should be used to split the values of the Technology column?
2) How can I insert each value only once? For example, Db2 appears in both row 1 and row 2, but I want to insert Db2 only once.

Thanks in advance!

Use the "Split Fields" step (under "Transform") to split the contents.

CSV file input --> Split Fields --> rest of transformation

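Set the "Field to Split" to "Technology" and set the "Delimiter" to a semicolon. Note that "Split Fields" writes the pieces into a fixed set of new columns; because the number of values varies from row to row here, the "Split field to rows" step (also under "Transform") may be a better fit, since it emits one row per value.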
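To make the effect of splitting on the semicolon concrete, here is a minimal Python sketch of the same logic outside Spoon. It is only an illustration, not what Spoon runs internally: the helper name `split_technologies` is hypothetical, and the trimming of surrounding spaces and the dropping of the empty piece after the trailing semicolon are assumptions based on the sample data in the question.

```python
def split_technologies(cell: str, delimiter: str = ";") -> list[str]:
    """Split one 'Technology' cell into trimmed, non-empty values."""
    parts = (part.strip() for part in cell.split(delimiter))
    # The sample data ends each cell with a delimiter, which leaves an
    # empty trailing piece; drop it along with any other blanks.
    return [part for part in parts if part]


# Sample rows copied from the question.
rows = [
    (1, "Cobol ; Db2 ; Jcl ; Vsam ; Cics ; Changeman ;"),
    (2, "Oracle ; Sql ; Db2 ; Oracle 9i ;"),
    (3, "Windows 2000 ; SQL ;"),
]

# One output row per (row_id, technology) pair.
exploded = [
    (row_id, tech)
    for row_id, cell in rows
    for tech in split_technologies(cell)
]

for row_id, tech in exploded:
    print(row_id, tech)
```

Inside a transformation, the same clean-up would have to be done with whatever trimming and filtering steps you prefer; the sketch only shows which values should come out.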

Regarding the non-repeating field, my suggestion would be to make the name itself the key of the table. Convert it to lower case, replace any spaces and special characters with database-safe equivalents, and then make that the primary key. You should end up with a table that contains only unique entries.
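The key idea above can be sketched in Python as follows. This is a rough illustration under several assumptions: the question does not say which database is used, so an in-memory SQLite database stands in for it; the `technology_key` column, the `normalize` and `insert_once` helpers, and the choice of underscore as the "database-safe" replacement are all hypothetical; and `INSERT OR IGNORE` is SQLite syntax, so other databases would need their own equivalent.

```python
import re
import sqlite3


def normalize(name: str) -> str:
    """Lower-case the name and replace runs of other characters with '_'."""
    lowered = name.strip().lower()
    return re.sub(r"[^a-z0-9]+", "_", lowered).strip("_")


conn = sqlite3.connect(":memory:")
# Hypothetical schema: the normalized name is the primary key.
conn.execute(
    "CREATE TABLE Technologies ("
    "technology_key TEXT PRIMARY KEY, "
    "technology_name TEXT NOT NULL)"
)


def insert_once(connection: sqlite3.Connection, name: str) -> None:
    """Insert a technology unless its normalized key already exists."""
    connection.execute(
        "INSERT OR IGNORE INTO Technologies (technology_key, technology_name) "
        "VALUES (?, ?)",
        (normalize(name), name.strip()),
    )


# Db2 is repeated, and Sql / SQL differ only by case.
for name in ["Cobol", "Db2", "Oracle", "Sql", "Db2", "SQL"]:
    insert_once(conn, name)
conn.commit()

stored = [
    row[0]
    for row in conn.execute(
        "SELECT technology_key FROM Technologies ORDER BY technology_key"
    )
]
print(stored)
```

Because the key is lower-cased, values such as "Sql" and "SQL" collapse into a single row, which follows the lower-casing suggestion; if the original spelling matters, only the first one seen is kept in `technology_name`.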

HTH

