I am currently building a small text editor for a custom file format. I have a GUI, but I also implemented a small output console. I want to add a very basic input field to execute commands and pass parameters. A command would look like this:
compile test.json output.bin -location "Paris, France" -author "Charles \"Demurgos\""
My problem is getting an array of the space-separated arguments while preserving the double-quoted parts, which might be strings generated by JSON.stringify and can therefore contain escaped double quotes.
To be clear, the expected array for the previous command is:
[
    'compile',
    'test.json',
    'output.bin',
    '-location',
    '"Paris, France"',
    '-author',
    '"Charles \\"Demurgos\\""'
]
Then I can iterate over this array and apply JSON.parse to each element for which indexOf('"') == 0, to get the final result:
[
    'compile',
    'test.json',
    'output.bin',
    '-location',
    'Paris, France',
    '-author',
    'Charles "Demurgos"'
]
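The post-processing step described above could look like the following. This is only a minimal sketch: `unquoteArgs` is a hypothetical helper name, and it assumes every token that starts with a double quote is valid JSON (JSON.parse throws otherwise).

```javascript
// Hypothetical helper: decode every token that starts with a double quote.
function unquoteArgs(tokens) {
  return tokens.map(function (token) {
    // Quoted tokens are assumed to be JSON string literals.
    return token.indexOf('"') === 0 ? JSON.parse(token) : token;
  });
}

// The expected intermediate array from above.
var tokens = [
  'compile',
  'test.json',
  'output.bin',
  '-location',
  '"Paris, France"',
  '-author',
  '"Charles \\"Demurgos\\""'
];

var args = unquoteArgs(tokens);
console.log(args);
```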
Thanks to this question: "Split a string by commas but ignore commas within double-quotes using Javascript", I was able to get what I need as long as the arguments do NOT contain any double quotes. Here is the regex I got:
/(".*?"|[^"\s]+)(?=\s*|\s*$)/g
But it exits the current parameter as soon as it encounters a double quote, even an escaped one. How can I adapt this regex to distinguish escaped from unescaped double quotes? And what about edge cases such as the input action "windowsDirectory\\" otherArg? Here the backslash is itself escaped, so even though it is followed by a double quote, that quote should end the argument. This is a problem I avoided for as long as possible in previous projects, but I feel it is time for me to learn how to properly take escape characters into account.
Here is a JS-Fiddle: http://jsfiddle.net/GwY8Y/1/ You can see that the beginning is parsed correctly, but the last argument gets split into several pieces.
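For reference, the failure can also be reproduced in a few lines outside the fiddle. This is a minimal sketch; the variable names `naive`, `command` and `tokens` are only illustrative.

```javascript
// The regex from the question: a quoted part ends at the first double quote
// that follows the opening one, whether or not that quote is escaped.
var naive = /(".*?"|[^"\s]+)(?=\s*|\s*$)/g;

// In this JS string literal, \\" stands for a backslash followed by a double quote.
var command = 'compile test.json output.bin -location "Paris, France" -author "Charles \\"Demurgos\\""';

var tokens = command.match(naive);

// The first six tokens are fine, but the value of -author is cut at the
// first escaped quote, so more than seven tokens come back.
console.log(tokens);
console.log(tokens.length);
```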
Thank you for any help.
This regex will give you the strings you need (see demo):
"(?:\\"|\\\\|[^"])*"|\S+
Use it like this:
your_array = subject.match(/"(?:\\"|\\\\|[^"])*"|\S+/g);
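As a quick check, here is a sketch that applies this regex to both inputs from the question. The names `ARG_PATTERN` and `splitArgs` are hypothetical, and the `|| []` fallback is an addition to cover input with no tokens at all, since `match` returns null in that case.

```javascript
// The regex from this answer, kept in a named constant.
var ARG_PATTERN = /"(?:\\"|\\\\|[^"])*"|\S+/g;

// Hypothetical wrapper: returns an empty array instead of null for blank input.
function splitArgs(line) {
  return line.match(ARG_PATTERN) || [];
}

// In these JS string literals, \\ stands for a single backslash in the input.
var command = 'compile test.json output.bin -location "Paris, France" -author "Charles \\"Demurgos\\""';
var edgeCase = 'action "windowsDirectory\\\\" otherArg';

var commandTokens = splitArgs(command);
var edgeTokens = splitArgs(edgeCase);

console.log(commandTokens); // 7 tokens, the last one keeps its escaped quotes
console.log(edgeTokens);    // 3 tokens: the escaped backslash does not escape the closing quote
```

Each quoted token can then be passed to JSON.parse as planned in the question.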
Regex explanation:
"            # a double quote
(?:          # non-capturing group, repeated 0 or more times
             # (matching as many times as possible):
  \\"        #   a backslash followed by a double quote
  |          #   OR
  \\\\       #   two backslashes
  |          #   OR
  [^"]       #   any character except a double quote
)*           # end of the group
"            # a double quote
|            # OR
\S+          # 1 or more non-whitespace characters (anything but
             # \n, \r, \t, \f and " "), matching as many as possible
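The same breakdown can be mirrored in code by assembling the pattern from commented pieces. This is only an illustrative sketch (the names `parts` and `pattern` are hypothetical); note that every backslash has to be doubled once more inside a JavaScript string literal.

```javascript
// Each entry is one piece of the regex, written as a JS string literal.
var parts = [
  '"',          // opening double quote
  '(?:',        // start of a non-capturing group
  '\\\\"',      //   a backslash followed by a double quote
  '|\\\\\\\\',  //   or two backslashes
  '|[^"]',      //   or any character except a double quote
  ')*',         // end of the group, repeated 0 or more times
  '"',          // closing double quote
  '|\\S+'       // or a run of 1 or more non-whitespace characters
];

var pattern = new RegExp(parts.join(''), 'g');

// Input here is: a "b \" c" d
var result = 'a "b \\" c" d'.match(pattern);
console.log(result);
```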