Searching for a complex Regular Expression to use with Notepad++

Question

I need a (from my view) very complex regular expression. I have coordinates in the following format:

[xx.x,yy.y]

and have to convert them to

[xxx0yyy000]

But they can also appear as

[xx,yy] --> [xx00yy0000]

[xx.x,yy.y] --> [xxx0yyy000]

[xx.x,yy] --> [xxx0yy0000]

[xx,yy.y] --> [xx00yyy000]

for example:

[20,14] --> [2000140000]

[17.3,15.1] --> [1730151000]

[23.4,19] --> [2340190000]

[53,11.7] --> [5300117000]

It would be nice if some "regex pro" could give me a solution for this and explain how I can use it with Notepad++. A short description of how the regex is built (for the learning effect) would also be very nice :)

Answer 1

This is a line that I took right out of the Notepad++ reference manual:

"Because Notepad++ makes use of the Scintilla regex engine, it is the same as with SciTE, so a full list of regex options can be found here (with the difference that POSIX mode is always on, this is not an option): http://www.scintilla.org/SciTERegEx.html "

The Scintilla regex engine is too limited for me to write a single complex regex that fits all of your patterns. However, if you're willing to use 4 regex patterns, it becomes quite possible. Make sure you select the search mode "Regular expression".

[xx,yy] --> [xx00yy0000]
Find what: \[(\d\d),(\d\d)\]
Replace with: [\100\20000]

[xx.x,yy.y] --> [xxx0yyy000]
Find what: \[(\d\d)[.](\d),(\d\d)[.](\d)\]
Replace with: [\1\20\3\4000]

[xx.x,yy] --> [xxx0yy0000]
Find what: \[(\d\d)[.](\d),(\d\d)\]
Replace with: [\1\20\30000]

[xx,yy.y] --> [xx00yyy000]
Find what: \[(\d\d),(\d\d)[.](\d)\]
Replace with: [\100\2\3000]

If you run those 4 find/replace patterns, they should take care of your situation.
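For anyone who wants to check the four patterns outside the editor, here is a minimal Python sketch of the same find/replace pairs. It is only an illustration, not part of the Notepad++ workflow: the names `RULES` and `convert` are made up for this example, and the replacement strings use Python's `\g<1>` form because a bare `\1` followed by a digit would be read differently by Python's `re` module.

```python
import re

# The four find/replace pairs from above, rewritten for Python's re module.
# \g<1> is used instead of \1 so that the digits that follow are treated
# as literal padding and not as part of the group number.
RULES = [
    (r"\[(\d\d),(\d\d)\]", r"[\g<1>00\g<2>0000]"),                    # [xx,yy]
    (r"\[(\d\d)[.](\d),(\d\d)[.](\d)\]", r"[\g<1>\g<2>0\g<3>\g<4>000]"),  # [xx.x,yy.y]
    (r"\[(\d\d)[.](\d),(\d\d)\]", r"[\g<1>\g<2>0\g<3>0000]"),         # [xx.x,yy]
    (r"\[(\d\d),(\d\d)[.](\d)\]", r"[\g<1>00\g<2>\g<3>000]"),         # [xx,yy.y]
]


def convert(text: str) -> str:
    """Apply the four replacements one after another, like four Replace All runs."""
    for pattern, replacement in RULES:
        text = re.sub(pattern, replacement, text)
    return text


if __name__ == "__main__":
    for sample in ["[20,14]", "[17.3,15.1]", "[23.4,19]", "[53,11.7]"]:
        print(sample, "-->", convert(sample))
```

Each pattern escapes the square brackets, captures the two-digit integer parts (and the single decimal digit where present) in groups, and the replacement writes the groups back with the zero padding in between. The converted output contains no comma, so an earlier replacement cannot be matched again by a later one.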

Answer 2

I don't use Notepad++, but if you do these 3 find/replace operations in this order, it should get you most of the way there.

1. find every "." and replace it with "" (empty string)
2. find every "," and replace it with "0"
3. find every "]" and replace it with "000]"
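As a quick way to see what these three steps produce, here is a small Python sketch that treats them as plain literal replacements (that reading of the steps is an assumption, and `three_step` is a name made up for this example).

```python
def three_step(text: str) -> str:
    """Apply the three literal replacements in the order listed above."""
    text = text.replace(".", "")      # step 1: drop every period
    text = text.replace(",", "0")     # step 2: turn every comma into 0
    text = text.replace("]", "000]")  # step 3: pad before the closing bracket
    return text


if __name__ == "__main__":
    print(three_step("[17.3,15.1]"))  # [1730151000], the wanted result
    print(three_step("[20,14]"))      # [20014000], not the wanted [2000140000]
```

The steps give the requested result when both values carry a decimal digit. Values without one come out too short, because nothing adds the missing padding zero.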

My question, though, would be: in the resulting data, how will you distinguish 20.0 as a starting value from 2.0 as a starting value? Both would parse to 2000.

The GNU/Linux sed command has the s///g substitution, which can use regular expressions, but I posit that it would take several independent expressions to perform the operation you describe. This would not be optimal if the ordering of the lines matters.

A better approach would be to use a scripting language to run a loop over the input, attempt to match each data line to the style of line it is, and then convert that line according to what expression it matches.

The algorithm would look something like this:

1. read in a line
2. test against the 4 different patterns
3. depending on pattern, convert individual line
4. write out individual line
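The loop above could be sketched in Python roughly as follows. This is one possible take, not the only way to do it: instead of testing four separate patterns, it folds the four cases into a single pattern with optional decimal groups, and the names `COORD`, `pad` and `convert_lines` are invented for this example.

```python
import re

# One pattern for all four cases: each decimal part is an optional group.
COORD = re.compile(r"\[(\d\d)(?:\.(\d))?,(\d\d)(?:\.(\d))?\]")


def pad(match: re.Match) -> str:
    """Build the padded form; a missing decimal digit is written as 0."""
    x_int, x_frac, y_int, y_frac = match.groups()
    return f"[{x_int}{x_frac or '0'}0{y_int}{y_frac or '0'}000]"


def convert_lines(lines):
    """Read each line, convert any coordinates in it, and hand the line back."""
    for line in lines:
        yield COORD.sub(pad, line)


if __name__ == "__main__":
    samples = ["[20,14]", "[17.3,15.1]", "[23.4,19]", "[53,11.7]"]
    for before, after in zip(samples, convert_lines(samples)):
        print(before, "-->", after)
```

Because every line is converted where it stands, the order of the lines is preserved. Passing an open file object to `convert_lines` would process a file line by line in the same way.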

solution1
1 2010-12-11 00:04:15

solution2
0 2010-12-11 00:00:49
