Suppose I have the following text in a text file:
First Text
"Some Text"
"124arandom txt that should not be parsed!@
"124 Some Text"
"어떤 글"
this text a"s well should not be parsed
I would like to retrieve Some Text, 124 Some Text and 어떤 글 as matched strings. The text is read line by line. The catch is that it also has to match text in other languages, as long as that text is inside quotes.
Update: I found out something weird. I was trying some random stuff and found out that:
string s = "어떤 글";
Regex regex = new Regex("[^\"]*");
MatchCollection matches = regex.Matches(s);
matches has a Count of 10 and contains some empty items (the parsed text is at index 2). This might be why I kept getting an empty string when I was just using Regex.Replace. Why is this happening?
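On the empty items: the quantifier * also accepts zero characters, so [^"]* reports a zero-length match at every position where the next character is a quote, and once more at the end of the input. The exact count of 10 depends on the string that was actually matched, which is not shown here. The following is a minimal sketch, assuming the line still carries its quotes when it reaches the regex; EmptyMatchDemo and MatchValues are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class EmptyMatchDemo
{
    // Hypothetical helper: collects the value of every match of pattern in input.
    public static List<string> MatchValues(string pattern, string input)
    {
        var values = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            values.Add(m.Value);
        }
        return values;
    }

    public static void Main()
    {
        // Assumption: the line as read from the file still has its quotes.
        string s = "\"어떤 글\"";

        // '*' allows zero characters, so empty matches show up wherever
        // a quote (or the end of the input) comes next.
        foreach (string v in MatchValues("[^\"]*", s))
            Console.WriteLine("star: [" + v + "]");

        // '+' requires at least one character, so only real text is returned.
        foreach (string v in MatchValues("[^\"]+", s))
            Console.WriteLine("plus: [" + v + "]");
    }
}
```

Switching to + removes the empty items, but it still does not check that the text sits between two quotes.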
If you read the text line by line, then the regex
"[^"]*"
will find all quoted strings, unless those may contain escaped quotes, as in "a 2\" by 4\" board".
To match those correctly, you need
"(?:\\.|[^"\\])*"
If you don't want the quotes to become part of the match, use lookaround assertions:
(?<=")[^"]*(?=")
(?<=")(?:\\.|[^"\\])*(?=")
These regexes, as C# regexes, can be created like this:
Regex regex1 = new Regex(@"(?<="")[^""]*(?="")");
Regex regex2 = new Regex(@"(?<="")(?:\\.|[^""\\])*(?="")");
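As a rough sketch of how these two regexes could be applied line by line, the example below runs the first one over the sample lines from the question and the second one over a line containing escaped quotes. QuotedTextDemo and ExtractQuoted are hypothetical names, and the lines are hard-coded here instead of being read from a file.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuotedTextDemo
{
    // Same patterns as regex1 and regex2 above (quotes excluded from the match).
    public static readonly Regex Simple = new Regex(@"(?<="")[^""]*(?="")");
    public static readonly Regex EscapeAware = new Regex(@"(?<="")(?:\\.|[^""\\])*(?="")");

    // Hypothetical helper: returns every match of regex found in the given lines.
    public static List<string> ExtractQuoted(Regex regex, IEnumerable<string> lines)
    {
        var results = new List<string>();
        foreach (string line in lines)
        {
            foreach (Match m in regex.Matches(line))
            {
                results.Add(m.Value);
            }
        }
        return results;
    }

    public static void Main()
    {
        // Sample lines from the question, hard-coded for illustration.
        string[] lines =
        {
            "First Text",
            "\"Some Text\"",
            "\"124arandom txt that should not be parsed!@",
            "\"124 Some Text\"",
            "\"어떤 글\"",
            "this text a\"s well should not be parsed"
        };
        foreach (string text in ExtractQuoted(Simple, lines))
            Console.WriteLine(text);

        // A line with backslash-escaped quotes needs the second pattern.
        string[] escaped = { @"""a 2\"" by 4\"" board""" };
        foreach (string text in ExtractQuoted(EscapeAware, escaped))
            Console.WriteLine(text);
    }
}
```

One caveat with the lookaround versions: if a single line holds several quoted strings, the text between the closing quote of one and the opening quote of the next is also surrounded by quotes and will be reported as a match. Matching the quotes themselves and reading a capture group avoids that.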
You can use a regular expression and then try to match it against any text you want. This can be done in a loop or wherever else you need it.
string str = "\"your text\"";
// check for at least one char inside the quotes
Regex r = new Regex("\".+\"");
bool ismatch = r.IsMatch(str);