
Regex to match a JSON String

I am building a JSON validator from scratch, but I am stuck on the string part. My hope was to build a regex that matches the following sequence from JSON.org:

(Image: JSON.org string sequence diagram)

My regex so far is:

/^\"((?=\\)\\(\"|\/|\\|b|f|n|r|t|u[0-9a-f]{4}))*\"$/

It matches an empty string and a backslash followed by one of the allowed escape characters, but I'm not sure how to handle the Unicode part.

Is there a regex to match any Unicode character except " or \ or a control character? And will it match a newline or a horizontal tab?

The last question is because the regex matches the string "\t", but not "    " (shown here as four spaces, but meant to be a literal tab). Otherwise I will need to extend the regex to cover it, which is not a problem, but my guess is that the horizontal tab is a Unicode character.

Thanks to Jaeger Kor, I now have the following regex:

/^\"((?=\\)\\(\"|\/|\\|b|f|n|r|t|u[0-9a-f]{4})|[^\\"]*)*\"$/

It appears to be correct, but is there any way to check for control characters, or is this unnecessary because they are listed among the non-printable characters on regular-expressions.info? The input to validate is always text from a textarea.
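As a rough sketch of the control-character question, the snippet below runs the intermediate pattern through Python's `re` module. Python is only an assumed stand-in here, since the question does not name a regex engine, and the sample strings are made up for illustration. The negated class `[^\\"]` excludes only the backslash and the double quote, so a raw tab or newline inside the quotes is still accepted.

```python
import re

# Intermediate pattern from the question, written as a Python raw string.
# Assumption: Python's `re` is only a stand-in for the validator's real engine.
INTERMEDIATE = re.compile(
    r'^\"((?=\\)\\(\"|\/|\\|b|f|n|r|t|u[0-9a-f]{4})|[^\\"]*)*\"$'
)


def accepts(text: str) -> bool:
    """Return True when the text matches the intermediate pattern."""
    return INTERMEDIATE.match(text) is not None


# Hypothetical sample inputs, kept short on purpose.
samples = {
    "escaped tab": '"a\\tb"',  # a backslash followed by the letter t
    "raw tab": '"a\tb"',       # an actual U+0009 character
    "raw newline": '"a\nb"',   # an actual U+000A character
}

for label, text in samples.items():
    print(f"{label}: {accepts(text)}")
```

All three samples are accepted, which suggests that control characters have to be excluded explicitly if the validator should reject them.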

Update: the regex is as follows, in case anyone needs it:

/^("(((?=\\)\\(["\\\/bfnrt]|u[0-9a-fA-F]{4}))|[^"\\\0-\x1F\x7F]+)*")$/
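A minimal sketch of how the updated pattern behaves, again assuming Python's `re` as the engine (the thread does not say which one is used). The helper name `is_json_string` and the sample inputs are hypothetical. The class `[^"\\\0-\x1F\x7F]` now excludes the range U+0000 to U+001F and U+007F, so raw control characters are rejected while their escaped forms still pass.

```python
import re

# Updated pattern from the question, as a Python raw string.
# Assumption: Python's `re` stands in for the validator's actual engine.
STRING_RE = re.compile(
    r'^("(((?=\\)\\(["\\\/bfnrt]|u[0-9a-fA-F]{4}))|[^"\\\0-\x1F\x7F]+)*")$'
)


def is_json_string(text: str) -> bool:
    """Hypothetical helper: True when the text matches the updated pattern."""
    return STRING_RE.match(text) is not None


# Escaped forms are accepted.
print(is_json_string('"tab\\there"'))  # backslash + t
print(is_json_string('"\\u00e9"'))     # \u followed by four hex digits

# A raw tab (U+0009) inside the quotes is rejected.
print(is_json_string('"tab\there"'))

# An unknown escape such as \x is rejected.
print(is_json_string('"\\x41"'))
```

The class also excludes `\x7F` (DEL), which is stricter than the string grammar in RFC 8259: that grammar only requires U+0000 through U+001F to be escaped.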

For your exact question, create a character class:

# Matches any character that isn't a \ or "
/[^\\"]/

Then add * at the end to match zero or more of them, or + to match one or more:

/[^\\"]*/

or

/[^\\"]+/
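The difference between the two quantifiers can be sketched as follows, assuming Python's `re` module as the engine. The helper names are hypothetical.

```python
import re

# The two variants of the character class, as Python raw strings.
ZERO_OR_MORE = re.compile(r'[^\\"]*')
ONE_OR_MORE = re.compile(r'[^\\"]+')


def matches_star(text: str) -> bool:
    """True when the whole text consists of zero or more allowed characters."""
    return ZERO_OR_MORE.fullmatch(text) is not None


def matches_plus(text: str) -> bool:
    """True when the whole text consists of one or more allowed characters."""
    return ONE_OR_MORE.fullmatch(text) is not None


print(matches_star(""), matches_plus(""))            # only * accepts the empty string
print(matches_star("abc"), matches_plus("abc"))      # both accept plain text
print(matches_star('a"b'), matches_plus('a"b'))      # neither accepts a double quote
print(matches_star("a\\b"), matches_plus("a\\b"))    # neither accepts a backslash
```

Inside a larger pattern that already repeats the group, the + form avoids an alternative that can match the empty string.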

There is also the pattern below, found at https://regex101.com/ under the library tab when searching for json:

/(?(DEFINE)
# Note that everything is atomic, JSON does not need backtracking if it's valid
# and this prevents catastrophic backtracking
(?<json>(?>\s*(?&object)\s*|\s*(?&array)\s*))
(?<object>(?>\{\s*(?>(?&pair)(?>\s*,\s*(?&pair))*)?\s*\}))
(?<pair>(?>(?&STRING)\s*:\s*(?&value)))
(?<array>(?>\[\s*(?>(?&value)(?>\s*,\s*(?&value))*)?\s*\]))
(?<value>(?>true|false|null|(?&STRING)|(?&NUMBER)|(?&object)|(?&array)))
(?<STRING>(?>"(?>\\(?>["\\\/bfnrt]|u[a-fA-F0-9]{4})|[^"\\\0-\x1F\x7F]+)*"))
(?<NUMBER>(?>-?(?>0|[1-9][0-9]*)(?>\.[0-9]+)?(?>[eE][+-]?[0-9]+)?))
)
\A(?&json)\z/x

This should match any valid JSON; you can also test it on the website above.

EDIT:

Link to the regex

Use this; it also works with JSON arrays such as [{...},{...}]:

((\[[^\}]{3,})?\{s*[^\}\{]{3,}?:.*\}([^\{]+\])?)

Demo: https://regex101.com/r/aHAnJL/1
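To get a feel for what this shorter pattern accepts, the sketch below compares it with Python's built-in `json.loads`. The parser is brought in only as a reference point and is not part of the answer. Python's `re` is again an assumed engine, and the helper names and sample inputs are hypothetical.

```python
import json
import re

# The shorter pattern from the answer, copied unchanged into a Python raw string.
LOOSE = re.compile(r'((\[[^\}]{3,})?\{s*[^\}\{]{3,}?:.*\}([^\{]+\])?)')


def loose_match(text: str) -> bool:
    """True when the pattern finds a match anywhere in the text."""
    return LOOSE.search(text) is not None


def parses(text: str) -> bool:
    """True when Python's json module accepts the text (reference check only)."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


for sample in ['{"a": 1}', '{abc:def}', '{}']:
    print(sample, loose_match(sample), parses(sample))
```

On these samples the pattern and the parser agree on `{"a": 1}`, but the pattern also matches `{abc:def}`, which the parser rejects, and it misses the empty object `{}`, which the parser accepts. It therefore seems better suited to spotting JSON-like text than to strict validation.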
