
Pentaho Spoon: search and replace special characters in rows

I have a CSV file whose MIME charset is reported as US-ASCII, and one column in the dataset looks like this:

id      V_name
210001  cha?ne des Puys
210030  M?los
213004  G?ll?
213021  S?phan
221110  Afd?ra

And so on.

I would like to change those characters to:

id      V_name
210001  chaine des Puys
210030  Milos
213004  Gollu
213021  Suphan
221110  Afdera

The thing is that there are 95 rows of this kind. How can I search and replace the characters in those rows? I am using PDI Spoon. Thanks in advance.

As @Iłya Bursov has stated, the source file you are reading doesn't contain the correct characters; it contains a literal ? in their place. So if you want to correct it, you'll have to do it manually.
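One way to confirm that the question marks really are in the file, rather than being produced by the encoding chosen when reading it, is to look at the raw bytes. The following is a minimal Python sketch run outside PDI; the function name `summarize_bytes` and the sample content are illustrative assumptions, not part of the original question.

```python
def summarize_bytes(data: bytes) -> dict:
    """Count literal question marks and non-ASCII bytes in raw file content."""
    return {
        "question_marks": data.count(b"?"),
        "non_ascii": sum(1 for b in data if b > 0x7F),
    }


# Hypothetical sample standing in for the real CSV content.
sample = b"id,V_name\n210030,M?los\n213004,G?ll?\n"
summary = summarize_bytes(sample)
print(summary)
```

If the count of non-ASCII bytes is zero while question marks are present inside the names, the original accented characters are already lost and no choice of encoding in the input step will bring them back.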

I don't think it is worth it, unless you know you will always get the same set of V_name values over time and across different files. In that case you could create a file that maps each V_name in your source (with the ? characters) to a V_name_corrected holding the correctly displayed name. This seems to be an exercise, so I would leave the names as they are. In real life, I would contact the provider of the file and tell them they need to fix how the file is generated so that it contains the correct characters.
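The mapping idea can be sketched as a plain lookup from the damaged name to a corrected one. This is a Python illustration under stated assumptions, not a PDI transformation: the `CORRECTIONS` table is built by hand from the names in the question, `correct_rows` is a hypothetical helper, and the `999999,Etna` row is invented only to show the fallback for names that have no entry.

```python
import csv
import io

# Hand-maintained correction table: V_name as found in the source -> corrected name.
# Entries come from the question's desired output; a real table would list all 95 rows.
CORRECTIONS = {
    "cha?ne des Puys": "chaine des Puys",
    "M?los": "Milos",
    "G?ll?": "Gollu",
    "S?phan": "Suphan",
    "Afd?ra": "Afdera",
}


def correct_rows(rows, corrections):
    """Add a V_name_corrected field, keeping the original name when no mapping exists."""
    out = []
    for row in rows:
        fixed = dict(row)
        fixed["V_name_corrected"] = corrections.get(row["V_name"], row["V_name"])
        out.append(fixed)
    return out


# Hypothetical input standing in for the real CSV file.
source = io.StringIO("id,V_name\n210030,M?los\n213004,G?ll?\n999999,Etna\n")
rows = correct_rows(csv.DictReader(source), CORRECTIONS)
for r in rows:
    print(r["id"], r["V_name"], "->", r["V_name_corrected"])
```

A whole-value lookup is used instead of a character-level replace because a single ? stands for different letters in different names, so it cannot be replaced by a fixed rule. Inside PDI, a lookup step such as Stream lookup against the correction file could play the same role.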


