I have a CSV file with translation pairs. It has the following format:
text language 1;text language 2
text language 1;text language 2
text language 1;text language 2
and so on. The problem is that sometimes the text is very long, or contains \n or even multiple quotation marks, like this:
"Very long long long long long long long long long long long long long long long long long long long text";"Very long long long long long long long long long long text2"
text;text2
My problem is that I can't figure out the right regex pattern to split the word or sentence pairs correctly, especially when a field is a long block of text containing \n or even \r\n. In those cases, however, each sentence of the pair is enclosed in quotation marks, if that's any help. Similar to this:
"Long text with lines\r\nmore lines\nand another line\nAnd yet another";"Long text with lines\r\nmorelines\nand another line\nAnd yet another"\r\n
word1;word2
So I assume I need to split the pairs wherever there is either a "\r\n, a \r\n" or a ; ? Sadly, I'm not experienced with regular expressions.
I uploaded the CSV here: http://s000.tinyupload.com/?file_id=11646241007071639575
OK, I finally solved my problem using the TextFieldParser class (.NET Framework 2.0 and higher, Microsoft.VisualBasic.FileIO namespace):
using (TextFieldParser fParser = new TextFieldParser(file, enc)) { fParser.SetDelimiters(new string[] { ";" }); ... }
– Moonpaw
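For anyone landing here with the same problem, a fuller version of that approach might look roughly like the sketch below. It is not the code from the comment above, just a minimal illustration: the class name `TranslationCsv`, the helper `ReadPairs` and the sample string are made up for the example, and it assumes the quoted fields may span several lines, as in the question. TextFieldParser treats a quoted field as one value even when it contains line breaks or the delimiter, so no regex is needed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
// On .NET Framework, add a reference to the Microsoft.VisualBasic assembly.
using Microsoft.VisualBasic.FileIO;

public static class TranslationCsv
{
    // Hypothetical helper (not from the original post): returns one string[]
    // per record, where each array holds the fields of that record.
    public static List<string[]> ReadPairs(TextReader reader)
    {
        var pairs = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(";");
            // Quoted fields may contain ';' and line breaks.
            parser.HasFieldsEnclosedInQuotes = true;
            // Keep leading and trailing spaces of each field as they are.
            parser.TrimWhiteSpace = false;

            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                if (fields != null)
                {
                    pairs.Add(fields);
                }
            }
        }
        return pairs;
    }

    public static void Main()
    {
        // Made-up sample modelled on the lines in the question.
        string sample =
            "\"Long text with lines\r\nmore lines\";\"Long text2 with lines\r\nmore lines2\"\r\n" +
            "word1;word2\r\n";

        foreach (string[] pair in ReadPairs(new StringReader(sample)))
        {
            Console.WriteLine("{0} fields, first field: [{1}]", pair.Length, pair[0]);
        }
    }
}
```

To read the real file instead of the sample string, pass something like `new StreamReader(file, enc)` to `ReadPairs`, where `file` and `enc` stand for the path and encoding used in the snippet above.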