
Replacing all non-alphanumeric + punctuation characters with empty strings

I'm working on a regular expression in Talend, inside a tReplace component.

I'm moving data from Oracle to Redshift, and I'm having issues with DDL length, I guess because some characters are not supported.

I have product names like:

175/65 R14 Efficiency +

XXX N° 5 H7DC

They have to stay like this. But sometimes I have NBSP inside labels, or sometimes even worse.

I saw this list of punctuation online: [,"#$%&'()*+.-:/;?<=>?@[\]^_{|}~°]

and I need to add it to my existing regex "[^A-Za-z0-9]".

TL;DR: Can someone help me write a regex that replaces everything in a column except [A-Za-z0-9] and the punctuation list above? It must be usable in the following code (I'm using Talend, and the expression is interpreted as Java):

StringUtils.replaceAll(row1.label, "[^A-Za-z0-9]", "");
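
One way to extend the existing expression is to put the punctuation list straight into the negated character class. The sketch below is only an illustration under a few assumptions: it uses the JDK's own String.replaceAll rather than the StringUtils call above, the class name LabelCleaner and the method clean are made up for the example, and a plain space is added to the allowed set (it is not in the punctuation list, but the product names contain spaces). Inside a Java character class, [, ], \, ^ and - are escaped so that they are treated as literals.

```java
final class LabelCleaner {
    // Negated class: anything NOT listed here is removed.
    // Allowed: ASCII letters, digits, a plain space (assumption), and the
    // punctuation list from the question. \u00B0 is the degree sign.
    private static final String DISALLOWED =
        "[^A-Za-z0-9 ,\"#$%&'()*+.:/;?<=>@\\[\\]\\\\^_{|}~\u00B0\\-]";

    // Hypothetical helper: returns the label with every disallowed character dropped.
    static String clean(String label) {
        return label == null ? null : label.replaceAll(DISALLOWED, "");
    }

    public static void main(String[] args) {
        // Ordinary product name: left untouched.
        System.out.println(clean("175/65 R14 Efficiency +"));
        // \u00A0 is a no-break space (NBSP): it is removed, the degree sign stays.
        System.out.println(clean("XXX N\u00B0 5\u00A0H7DC"));
    }
}
```

Because the NBSP is replaced with an empty string, the two words around it are joined ("5H7DC"). If a separator should be kept, replacing NBSP with a normal space before this step would be an alternative.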

I ended up finding the solution thanks to the help of the answers above.由于上述答案的帮助,我最终找到了解决方案。

I used:我用了:

[^\p{Alnum}\p{Punct}\s]
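
A point worth checking with this expression: in java.util.regex, \p{Alnum} and \p{Punct} are ASCII-only by default, and \s covers only ASCII whitespace. So NBSP (U+00A0) is removed, as wanted, but the degree sign in "N°" is removed as well. The sketch below shows both behaviours, plus a variant that adds ° to the allowed set. The class name PosixClassDemo and the helper strip are made up for this example, and it calls the JDK's Pattern directly rather than Talend's StringUtils.

```java
import java.util.regex.Pattern;

final class PosixClassDemo {
    // The expression from the answer, written as a Java string literal.
    static final Pattern ANSWER =
        Pattern.compile("[^\\p{Alnum}\\p{Punct}\\s]");

    // Variant (assumption: the degree sign should survive). \u00B0 is the degree sign.
    static final Pattern WITH_DEGREE =
        Pattern.compile("[^\\p{Alnum}\\p{Punct}\\s\u00B0]");

    // Hypothetical helper: removes every character matched by the pattern.
    static String strip(Pattern pattern, String input) {
        return pattern.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String label = "XXX N\u00B0 5\u00A0H7DC"; // contains a degree sign and an NBSP
        System.out.println(strip(ANSWER, label));      // degree sign and NBSP both removed
        System.out.println(strip(WITH_DEGREE, label)); // only the NBSP removed
    }
}
```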
