Regular Expression to Replace Unwanted Letters
I wrote a small program in C# to capture in-game text. My issue is that the text also contains color codes, which I want to get rid of. I read about the Regex.Replace function, which I think would be suitable for this.
I have the following string (line) that I want to clean up. I played around a bit with regular expressions in the small tool espresso, but I never really figured it out.
This is the string I am working with:
|c001177ffSave Code =|r |cff00AA00A|cff00AA00G|cff00AA00Q|cffff69b4g|r |cff00AA00R|cff40e0d09|cffffff00$|cffffff00#|r |cff40e0d04|cffff69b4f|cff00AA00R
I tried to use
^|( [a-zA-Z0-9]{9})
which gave me these matches:
c001177ff cff00AA00 cff00AA00 cff00AA00 cffff69b4 cff00AA00 cff40e0d0 cffffff00 cffffff00 cff40e0d0 cffff69b4 cff00AA00
I am not good at regex; in fact, I have only just started with it. I don't need anybody to present me with a complete solution (although you are more than welcome to), but I would appreciate at least a little help with how I can solve this issue.
I want to filter the text.
Input:
|c001177ffSave Code =|r |cff00AA00A|cff00AA00G|cff00AA00Q|cffff69b4g|r |cff00AA00R|cff40e0d09|cffffff00$|cffffff00#|r |cff40e0d04|cffff69b4f|cff00AA00R
Should be filtered to this:
Save Code = AGQg R9$# 4fR
I think these are hexadecimal color codes: the |c marks the beginning and the |r the end of a colored string. I think the "|r |" sequence just indicates that the first colored string ends, then we get a space, and the next | indicates the start of the next one.
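How about some simple LINQ?
// Requires: using System.Linq;
// Keeps the last character of every 10-character piece ("c" + 8 color characters + 1 text character).
// Every other piece becomes a space, so the "Save Code =" label is dropped as well.
var output = String.Join("", input.Split('|')
                                  .Select(s => s.Length != 10 ? ' ' : s.Last()))
                   .Trim();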
So I think the problem you were having was not escaping your | character. The following regex works for me:
var replaced = Regex.Replace(input, @"\|c[0-9a-zA-Z]{8}|\|r", "");
\|c[0-9a-zA-Z]{8} - matches "|c" followed by any 8 letters or digits
| - or
\|r - matches "|r"
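To see that pattern applied to the full string from the question, here is a minimal, self-contained sketch. The class name ColorCodeStripper and the helper Strip are made up for illustration; only Regex.Replace and the pattern come from the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ColorCodeStripper
{
    // Matches "|c" followed by 8 letters or digits, or the literal "|r".
    private const string Pattern = @"\|c[0-9a-zA-Z]{8}|\|r";

    // Hypothetical helper: a thin wrapper around Regex.Replace.
    public static string Strip(string input)
    {
        return Regex.Replace(input, Pattern, "");
    }

    public static void Main()
    {
        string input = "|c001177ffSave Code =|r |cff00AA00A|cff00AA00G|cff00AA00Q|cffff69b4g|r " +
                       "|cff00AA00R|cff40e0d09|cffffff00$|cffffff00#|r |cff40e0d04|cffff69b4f|cff00AA00R";
        Console.WriteLine(Strip(input)); // Save Code = AGQg R9$# 4fR
    }
}
```

Because the verbatim string literal (@"...") leaves backslashes untouched, the pattern can be written exactly as it would appear in a regex tester.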
You're on the right track. Your regex
^|( [a-zA-Z0-9]{9})
has two problems: the ^ start-of-line anchor forces the match to be only at the start of your input string, and the | needs to be escaped, because unescaped it is a special "or" operator, which completely changes the meaning of your regex.
In addition, the space after the | is undesired, and the capture group is unnecessary, as you only want to eliminate this portion.
If you replace all instances of this
\|[a-zA-Z0-9]{9}
with nothing (the empty string), you will achieve most of your goal. Try it here: http://regex101.com/r/rF6yB6/1
But it seems you really want to eliminate not just exactly nine characters after the pipe, but up to nine characters. So use the {1,9} range quantifier instead:
\|[a-zA-Z0-9]{1,9}
Try it: http://regex101.com/r/rF6yB6/2
This seems to achieve your goal exactly.
Please consider bookmarking the Stack Overflow Regular Expressions FAQ for future reference.
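One caveat with the range quantifier: it works on the sample because every |r there is followed by a space. The sketch below uses a made-up input in which text follows |r directly, and compares the {1,9} pattern with the stricter pattern from another answer in this thread. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // The range-quantifier pattern: a pipe plus 1 to 9 letters or digits.
    public static string StripLoose(string input)
    {
        return Regex.Replace(input, @"\|[a-zA-Z0-9]{1,9}", "");
    }

    // A stricter alternative that spells out the two code shapes.
    public static string StripStrict(string input)
    {
        return Regex.Replace(input, @"\|c[0-9a-zA-Z]{8}|\|r", "");
    }

    public static void Main()
    {
        // Hypothetical input: the reset code "|r" is directly followed by text.
        string tricky = "|cff00AA00Hi|rthere";
        Console.WriteLine(StripLoose(tricky));  // Hi
        Console.WriteLine(StripStrict(tricky)); // Hithere
    }
}
```

If the game never emits text immediately after |r, both patterns give the same result, so the shorter one may be good enough.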
string input = "[The example input from your question]";
// Drop every "|r" reset marker first.
string output = input.Replace("|r", "");
// Then repeatedly cut out "|c" plus the 8 color characters (10 characters in total).
while (output.Contains("|c"))
    output = output.Remove(output.IndexOf("|c"), 10);
// output = "Save Code = AGQg R9$# 4fR"
I like this much more than using regexes, simply because it is so much clearer to me.
var str1 = "|c001177ffSave Code =|r |cff00AA00A|cff00AA00G|cff00AA00Q|cffff69b4g|r |cff00AA00R|cff40e0d09|cffffff00$|cffffff00#|r |cff40e0d04|cffff69b4f|cff00AA00R";
// Removes "|r" as well as "|" followed by any 9 letters or digits (the "c" plus the 8-character color value).
var str2 = Regex.Replace(str1, @"\|(r|[a-zA-Z0-9]{9})", ""); // "Save Code = AGQg R9$# 4fR"
In addition to the answer about escaping the "pipe" character, you're starting your regex with the caret (^) character. This matches the beginning of a line.
A correct regex would be:
\|c[0-9a-zA-Z]{8}
This regex should match all of the characters you want to remove:
([|]c([0-9]|[a-f]|[A-F]){8})|[|]r
Here's the breakdown:
The vertical pipe is an OR marker, so to search for a literal pipe, place it in square brackets: [|].
The parentheses make a group. So you're searching for ([|]c([0-9]|[a-f]|[A-F]){8}) OR [|]r, which is any of your color codes OR |r.
A color code begins with |c and is followed by exactly 8 characters, each of which can be 0 through 9, a through f, or A through F.
I tested it at RegexPal.com.
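As a rough sketch, the same hex-only idea can be written more compactly with a single character class. The class name HexColorCodes and the second test input are made up for illustration; the pattern is meant to match the same strings as the one above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HexColorCodes
{
    // "|c" plus exactly 8 hexadecimal digits, or the literal "|r".
    private static readonly Regex HexCode = new Regex(@"\|c[0-9a-fA-F]{8}|\|r");

    public static string Strip(string input)
    {
        return HexCode.Replace(input, "");
    }

    public static void Main()
    {
        Console.WriteLine(Strip("|cff00AA00R|cff40e0d09|r")); // R9
        // "|commands1" is left alone because 'o', 'm', 'n' and 's' are not hex digits.
        Console.WriteLine(Strip("Type |commands1 here"));      // Type |commands1 here
    }
}
```

Restricting the class to hex digits means ordinary text that happens to follow "|c" is less likely to be removed by accident than with a pattern that accepts any letter or digit.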