Pentaho Spoon: search and replace special characters in rows
I have a CSV file with MIME type US-ASCII, and one column in the dataset looks like this:
| id | V_name |
|---|---|
| 210001 | cha?ne des Puys |
| 210030 | M?los |
| 213004 | G?ll? |
| 213021 | S?phan |
| 221110 | Afd?ra |
And so on.

I would like to change those characters to:
| id | V_name |
|---|---|
| 210001 | chaine des Puys |
| 210030 | Milos |
| 213004 | Gollu |
| 213021 | Suphan |
| 221110 | Afdera |
The thing is that there are 95 rows of this kind. How can I search and replace the values in those rows? I am using PDI Spoon.

Thanks in advance.
As @Iłya Bursov has stated, the source file you are reading doesn't contain the correct characters; it contains a literal ? in their place, so if you want to correct it, you'll have to do it manually.
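One way to confirm that diagnosis before correcting anything by hand is to look at the raw bytes. The following is only a minimal Python sketch, not something PDI provides; the function name `diagnose` and the sample bytes are made up for illustration. It distinguishes a file that still holds non-ASCII bytes (which might be recoverable by reading it with the right encoding) from one where the accents were already replaced by a literal ? (byte 0x3F).

```python
def diagnose(raw: bytes) -> str:
    """Classify raw file bytes (hypothetical helper, for illustration only)."""
    if any(b > 0x7F for b in raw):
        # Non-ASCII bytes exist, so the original characters may still be
        # recoverable by reading the file with the correct encoding.
        return "non-ascii bytes present"
    if b"?" in raw:
        # Pure ASCII with literal 0x3F bytes: the accents were lost upstream.
        return "literal question marks"
    return "clean ascii"


# Sample line shaped like the data in the question (assumed comma-separated).
sample = b"210030,M?los\n"
print(diagnose(sample))  # prints: literal question marks
```

For a real file, the bytes could be obtained with `open(path, "rb").read()`. A file reported as US-ASCII, as in the question, would fall into the second case.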
I don't think it is worth it, unless you know you will always get the same set of V_name values over time and across different files. In that case you could create a file that correlates each V_name containing ? characters in your source with a V_name_corrected that displays the characters correctly.

This seems to be an exercise, so I would leave the names as they are. In real life, I would contact the provider of the file with the incorrect character set and tell them that they need to correct how the file is generated so that it contains the correct characters.
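The correlation file mentioned above could work along these lines. This is a minimal Python sketch of the idea only, not a PDI transformation; inside Spoon the same lookup would more likely be built with a lookup step such as Stream lookup. The mapping contents are taken from the examples in the question, while the helper names (`load_mapping`, `correct_rows`) and the extra row `999999, Etna` are hypothetical and only there to show that unmapped names pass through unchanged.

```python
import csv
import io

# Hypothetical mapping file contents: V_name as it appears in the source,
# and the hand-corrected value. The rows come from the question's examples.
MAPPING_CSV = """V_name,V_name_corrected
cha?ne des Puys,chaine des Puys
M?los,Milos
G?ll?,Gollu
S?phan,Suphan
Afd?ra,Afdera
"""


def load_mapping(text):
    """Read the correlation file into a dict: V_name -> V_name_corrected."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["V_name"]: row["V_name_corrected"] for row in reader}


def correct_rows(rows, mapping):
    """Replace known damaged names; leave names without a mapping untouched."""
    return [(row_id, mapping.get(name, name)) for row_id, name in rows]


mapping = load_mapping(MAPPING_CSV)
rows = [("210030", "M?los"), ("213004", "G?ll?"), ("999999", "Etna")]
print(correct_rows(rows, mapping))
# prints: [('210030', 'Milos'), ('213004', 'Gollu'), ('999999', 'Etna')]
```

Matching on the whole damaged name rather than replacing ? character by character is deliberate: the same ? stands for different letters in different names, so only a per-name mapping gives the right result.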