
Talend Open Studio: semicolon-delimited file with a header in double quotes

I have a file that is delimited by semicolons. The first row in this file is the header, and the header tokens are in double quotes. An example is below:

"name", "telephone", "age", "address", "y"

When I use tFileDelimited and tMap and pull the fields in, they appear with underscores around the names: _name_, _telephone_, _age_, _address_, Column05

So it seems that in the field names the double quotes are changed to underscore characters. For some reason the last field is a single character without the quotes, but Talend seems to ignore this field name and gives it its own default.

I am wondering whether anyone has encountered this kind of behaviour, and whether I should preprocess the file first with a regex that removes the double quotes. Any help appreciated.

Be sure to remove the extra blank spaces between the header tokens in the first row. If you use Metadata to import your file, the right names should appear (just check the options 'heading rows as column names' and "\"" as the text enclosure).
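If editing the file by hand is not practical, the header clean-up could be scripted before the file reaches Talend. The following is only a minimal sketch in Python, not something Talend provides: `normalize_header` is a hypothetical helper, and it assumes that no header token contains the delimiter inside its quotes. It trims the blanks around each token and, optionally, drops the enclosing double quotes, which is the preprocessing the question asks about.

```python
def normalize_header(line, delimiter=";", strip_quotes=False):
    """Trim blanks around each header token; optionally drop enclosing quotes.

    Hypothetical helper for illustration. Assumes the delimiter never
    appears inside a quoted token.
    """
    # Drop the line ending, split on the delimiter, trim each token.
    tokens = [token.strip() for token in line.rstrip("\r\n").split(delimiter)]
    if strip_quotes:
        # Remove one pair of enclosing double quotes where present.
        tokens = [
            token[1:-1]
            if len(token) >= 2 and token[0] == token[-1] == '"'
            else token
            for token in tokens
        ]
    return delimiter.join(tokens)


# The header from the question, which is shown with commas between tokens.
header = '"name", "telephone", "age", "address", "y"'

print(normalize_header(header, delimiter=","))
# "name","telephone","age","address","y"

print(normalize_header(header, delimiter=",", strip_quotes=True))
# name,telephone,age,address,y
```

Keeping the quotes and setting the text enclosure in Talend is probably the less invasive route, since only the blanks are changed; stripping the quotes is shown for the case where a plain header is preferred.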
