
How to have a Multi Character Delimiter in Informatica Cloud?

I have a problem I need help solving. The business I am working for uses Informatica Cloud to do a lot of its ETL into AWS and other services.

We have been given a flat file by the business in which the field delimiter is "~|". Currently, to the best of my knowledge, Informatica only accepts a single-character delimiter.

Only a single character is accepted. How can I make it a multi-character delimiter?

Does anyone know how to overcome this?

Informatica cannot read composite delimiters.

First, you could read each line as one single long string and feed it into an Expression transformation. To do that, set the delimiter character to \037; I haven't seen this character (the ASCII Unit Separator) in use since at least 1982, so it is unlikely to appear in the data. Then use repeated calls to InStr() within the EXP to find the positions of the delimiter sequence ("~|" in your file) and split each line into fields using SubStr().
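To illustrate the find-and-slice logic of this first approach, here is a minimal sketch in Python rather than in Informatica expression syntax. The function name `split_on_delimiter` and the sample record are made up for the example; `str.find` plays the role of InStr() and slicing plays the role of SubStr().

```python
from typing import List


def split_on_delimiter(line: str, delimiter: str = "~|") -> List[str]:
    """Split a line on a multi-character delimiter using only find and slice."""
    fields = []
    start = 0
    while True:
        # Locate the next occurrence of the delimiter (the InStr() step).
        pos = line.find(delimiter, start)
        if pos == -1:
            # No more delimiters: the rest of the line is the last field.
            fields.append(line[start:])
            break
        # Cut out the field that ends at this delimiter (the SubStr() step).
        fields.append(line[start:pos])
        # Continue searching just past the delimiter.
        start = pos + len(delimiter)
    return fields


if __name__ == "__main__":
    # Hypothetical sample record with three fields.
    print(split_on_delimiter("1001~|Jane Doe~|2020-01-31"))
```

In an Expression transformation the loop would have to be unrolled, since the number of fields is fixed by the ports you define: one position lookup and one substring per output field.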

Second (easier in the mapping, more work in the session), you could feed the file through a utility that replaces each "~|" sequence with the character ASCII 31 (the Unit Separator mentioned above). The session has to be set up so that it reads the output of this utility (input file type = Command instead of File). The source definition should then use \037 as the field delimiter instead of a pipe or any other printable character.
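The utility for this second approach can be very small. Below is a minimal sketch in Python, assuming the file is UTF-8 encoded and that the delimiter sequence never occurs inside a field value; the script name `replace_delimiter.py` and the function names are hypothetical. It writes the converted lines to standard output, which is what a Command-type input reads.

```python
import sys

DELIMITER = "~|"          # two-character delimiter used in the source file
UNIT_SEPARATOR = "\x1f"   # ASCII 31 (octal 037), the Unit Separator


def replace_delimiter(text, delimiter=DELIMITER, replacement=UNIT_SEPARATOR):
    """Return text with every delimiter sequence replaced by one character."""
    return text.replace(delimiter, replacement)


def main(path):
    # Assumption: the file is UTF-8; newline="" keeps line endings untouched.
    with open(path, encoding="utf-8", newline="") as source:
        for line in source:
            sys.stdout.write(replace_delimiter(line))


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
    else:
        print("usage: python replace_delimiter.py <input-file>", file=sys.stderr)
```

The session's command would then be something like `python replace_delimiter.py <input-file>`, with the actual path depending on your environment. A stream editor such as sed could do the same job; Python is used here only for illustration.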
