Regex - find a specific sequence of characters, some of which are not letters, digits or underscores

I am new to regex and need to search a string field in Impala for multiple matches of this exact sequence of characters: ~FC*

Since ~ and * are not letters or digits, I am unsure how to search for these characters in this specific order, rather than for any one of them occurring on its own.

This is my code so far. I have tried both of these: [~FC*] and ^~FC*$

This is a test string; it has 2 occurrences:

N4*CITY*STATE*2155446*2120~FC*C*IND*30*MC*blah blah fjdgfeufh*27*0*****Y~FC*Z*IND*39*MC*jhlkfhfudfgsdkufgkusgfn*23*0*****Y~
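.*(~FC\*).* or .*(\~FC\*).*

.* - zero or more of any character
(~FC\*) - the literal sequence ~FC*; the backslash makes the * a literal asterisk instead of a quantifier

If the first one does not work, please try the second one. The tilde is not a metacharacter in common regex syntaxes, so escaping it should not normally be needed, but the escaped form might help if your engine treats the tilde as reserved.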
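To see what the wrapped pattern does, here is a minimal sketch in Python. It assumes Python's re module behaves like Impala's regex engine for this pattern, which has not been verified here. It suggests that the .* wrapper is fine for checking whether ~FC* is present, but that it yields a single match spanning the whole string, so it does not count the occurrences on its own.

```python
import re

# Test string from the question.
s = ("N4*CITY*STATE*2155446*2120~FC*C*IND*30*MC*blah blah fjdgfeufh*27*0*****Y"
     "~FC*Z*IND*39*MC*jhlkfhfudfgsdkufgkusgfn*23*0*****Y~")

wrapped = re.compile(r".*(~FC\*).*")

# Presence check: the pattern matches if ~FC* appears anywhere in the string.
m = wrapped.search(s)
contains = m is not None

# The leading .* is greedy, so the one match covers the entire string
# and the capture group lands on the last ~FC*.
whole_string_matched = m.group(0) == s
group_is_last = m.start(1) == s.rfind("~FC*")

# findall returns one captured group per match; there is only one match.
matches = wrapped.findall(s)
print(contains, whole_string_matched, group_is_last, len(matches))
```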

You can use a simple SQL expression like the one below. It works only for a hardcoded search string, not for a regex pattern.

select (length(mycol)- length (replace(mycol,'~FC*','')))/length('~FC*') as occurance_str

Here is the SQL I tested successfully:

select 
(length('N4*CITY*STATE*2155446*2120~FC*C*IND*30*MC*blah blah fjdgfeufh*27*0*****Y~FC*Z*IND*39*MC*jhlkfhfudfgsdkufgkusgfn*23*0*****Y~')
- length(replace('N4*CITY*STATE*2155446*2120~FC*C*IND*30*MC*blah blah fjdgfeufh*27*0*****Y~FC*Z*IND*39*MC*jhlkfhfudfgsdkufgkusgfn*23*0*****Y~','~FC*',''))
)/length('~FC*') as occurance_str
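The arithmetic behind this query can be checked outside the database. The sketch below redoes it in Python on the question's test string; the helper name count_by_replace is made up for illustration and is not an Impala function. Removing every ~FC* shortens the string by 4 characters per occurrence, so dividing the length difference by 4 gives the count.

```python
# Test string from the question.
s = ("N4*CITY*STATE*2155446*2120~FC*C*IND*30*MC*blah blah fjdgfeufh*27*0*****Y"
     "~FC*Z*IND*39*MC*jhlkfhfudfgsdkufgkusgfn*23*0*****Y~")


def count_by_replace(text: str, needle: str) -> int:
    """Count non-overlapping occurrences of needle via the length difference.

    Hypothetical helper mirroring the SQL expression
    (length(col) - length(replace(col, needle, ''))) / length(needle).
    """
    removed_chars = len(text) - len(text.replace(needle, ""))
    return removed_chars // len(needle)


occurrences = count_by_replace(s, "~FC*")
print(occurrences)

# Cross-check against Python's built-in substring counter.
same_as_builtin = occurrences == s.count("~FC*")
```

The trailing ~ at the end of the test string is not counted, because it is not followed by FC*.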

About the patterns that you tried:

  • The pattern [~FC*] is a character class: it matches a single character that is one of ~ F C *

  • The pattern ^~FC*$ has the anchors ^ and $ to assert the start and the end of the string, and in between it matches ~F followed by optional repetitions of the character C

If you want to find the 2 occurrences, you can use this pattern, escaping the asterisk:

~FC\*

See a regex demo.
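The three patterns can be compared side by side on the test string. This is a sketch using Python's re module as a stand-in; Impala's regex engine is assumed, not verified, to treat these simple patterns the same way.

```python
import re

# Test string from the question.
s = ("N4*CITY*STATE*2155446*2120~FC*C*IND*30*MC*blah blah fjdgfeufh*27*0*****Y"
     "~FC*Z*IND*39*MC*jhlkfhfudfgsdkufgkusgfn*23*0*****Y~")

# [~FC*] is a character class: every hit is ONE character from the set.
class_hits = re.findall(r"[~FC*]", s)

# ^~FC*$ must span the whole string: "~F" followed by zero or more "C".
anchored_on_test_string = re.search(r"^~FC*$", s)
anchored_on_tilde_f = re.search(r"^~FC*$", "~F")
anchored_on_repeated_c = re.search(r"^~FC*$", "~FCCC")

# ~FC\* treats the asterisk as a literal, so it finds the exact sequence.
literal_hits = re.findall(r"~FC\*", s)

print(len(class_hits), anchored_on_test_string, literal_hits)
```

Only the last pattern returns the two ~FC* occurrences. The character class reports many single-character hits, and the anchored pattern does not match the test string at all.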
