
(C#, RegEx) RegEx.Split with \r\n, ;

I have a CSV file with translation pairs. It has the following scheme:

text language 1;text language 2
text language 1;text language 2
text language 1;text language 2

and so on. The problem is that sometimes the text is very long, or contains \n or even multiple quotation marks, like this:

"Very long long long long long long long long long long long long long long long long long long long text";"Very long long long long long long long long long long text2"
text;text2

My problem is that I can't figure out the right regex pattern to split the word or sentence pairs correctly, especially when a long block of text contains \n or even \r\n. In those cases, however, each half of the pair is enclosed in quotation marks, if that's any help. Something like this:

"Long text with lines\r\nmore lines\nand another line\nAnd yet another";"Long text with lines\r\nmorelines\nand another line\nAnd yet another"\r\n
word1;word2

So I assume I need to split the word pairs wherever there is either a "\r\n, a \r\n", or a ; ? Sadly, I'm not experienced with regular expressions.

I uploaded the csv here: http://s000.tinyupload.com/?file_id=11646241007071639575

OK, I finally solved my problem using TextFieldParser (Microsoft.VisualBasic.FileIO namespace, .NET Framework 2.0 and higher):

    using (TextFieldParser fParser = new TextFieldParser(file, enc))
    {
        fParser.SetDelimiters(new string[] { ";" });
        ...
    }

– Moonpaw
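For anyone landing here with the same problem, a slightly fuller sketch of the TextFieldParser approach might look like the following. It is not the original poster's exact code: the class name TranslationCsv and the method ReadPairs are made up for illustration, and it assumes the file uses ; as the separator and double quotes around any field that contains line breaks, as described in the question. On .NET Framework this needs a reference to Microsoft.VisualBasic.dll.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical helper, named here only for illustration.
static class TranslationCsv
{
    // Reads semicolon-separated records from the reader.
    // Quoted fields may contain ';' or line breaks; the parser keeps
    // such a field together instead of splitting it.
    public static List<string[]> ReadPairs(TextReader reader)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(";");
            parser.HasFieldsEnclosedInQuotes = true; // the default, stated for clarity
            parser.TrimWhiteSpace = false;           // keep the text exactly as written

            while (!parser.EndOfData)
            {
                // Throws MalformedLineException if a record cannot be parsed.
                string[] fields = parser.ReadFields();
                if (fields != null)
                {
                    rows.Add(fields);
                }
            }
        }
        return rows;
    }
}
```

To try it without a file, pass a StringReader, for example TranslationCsv.ReadPairs(new StringReader("\"line one\r\nline two\";\"text2\"\r\nword1;word2\r\n")). For a real file, a StreamReader opened with the file's encoding should work the same way. Note that TextFieldParser skips blank lines between records.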
