jQuery / JavaScript: parsing a string correctly

Question

Recently, I've been attempting to emulate a small language in jQuery and JavaScript, but I've run into what I believe is a problem: I think I may be parsing everything completely wrong.
In the code:

@name Testing
@inputs
@outputs
@persist 
@trigger 
print("Test")

The way I currently separate and parse the string is by splitting all of the code into lines and then reading through this lines array using searches and splits. For example, I would find the name using something like:

if(typeof lines[line] === 'undefined')
{
}
else
{
    if(lines[line].search('@name') == 0)
    {
        name = lines[line].split(' ')[1];
    }
}

But I think I may be going about the parsing largely the wrong way.
While reading through examples of how other people parse code blocks like this, it appeared that they parsed the entire block instead of splitting it into lines as I do. So my question is: what is the proper, conventional way of parsing things like this, and how do you suggest I use it to parse something such as this?

Answer 1

In simple cases like this, regular expressions are your tool of choice:

var matches = code.match(/@name\s+(\w+)/);
var name = matches ? matches[1] : null; // match() returns null when nothing matches

To parse "real" programming languages, regexps are not powerful enough; you'll need a parser, either hand-written or automatically generated with a tool like PEG.
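
As a rough sketch of how the regular-expression approach might extend to every directive in the question's sample, the snippet below collects each "@key value" line into an object. The helper name parseDirectives and the assumption that each directive sits on its own line are illustrative and not part of the original code.

```javascript
// Hypothetical helper: collects "@key value" directives, one per line.
function parseDirectives(code) {
    var directives = {};
    // m flag: ^ and $ match at line boundaries; g flag: find every match.
    var re = /^@(\w+)[ \t]*(.*)$/gm;
    var match;

    while ((match = re.exec(code)) !== null) {
        // match[1] is the directive name, match[2] the rest of the line.
        directives[match[1]] = match[2].trim();
    }
    return directives;
}

var code = [
    "@name Testing",
    "@inputs",
    "@outputs",
    "@persist ",
    "@trigger ",
    'print("Test")'
].join("\n");

var directives = parseDirectives(code);
console.log(directives.name);   // logs: Testing
console.log(directives.inputs); // logs an empty string
```

Lines that do not start with "@", such as the print call, are simply skipped here; handling them would be the job of a real parser.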

Answer 2

A general approach to parsing that I often like to take is the following:

Loop through the complete block of text, character by character.
If you find a character that signals the start of a unit, call a specialized subfunction to parse the next characters.
Within each subfunction, call additional subfunctions if you find certain characters.
Return from each subfunction when you find a character that signals that the unit has ended.

Here is a small example:

var text = "@func(arg1,arg2)"

function parse(text) {
    var i, max_i, ch, funcRes;

    for (i = 0, max_i = text.length; i < max_i; i++) {
        ch = text.charAt(i);

        if (ch === "@") {
            funcRes = parseFunction(text, i + 1);
            i = funcRes.index;
        }
    }
    console.log(funcRes);
}

function parseFunction(text, i) {
    var max_i, ch, name, argsRes;

    name = [];    
    for (max_i = text.length; i < max_i; i++) {
        ch = text.charAt(i);

        if (ch === "(") {
            argsRes = parseArguments(text, i + 1);
            return {
                name: name.join(""),
                args: argsRes.arr,
                index: argsRes.index
            };
        } 
        name.push(ch);
    }
}

function parseArguments(text, i) {
    var max_i, ch, args, arg;

    arg = [];
    args = [];
    for (max_i = text.length; i < max_i; i++) {
        ch = text.charAt(i);

        if (ch === ",") {
            args.push(arg.join(""));
            arg = [];
            continue;
        } else if (ch === ")") {
            args.push(arg.join(""));
            return {
                arr: args,
                index: i
            };
        }
        arg.push(ch);
    }
}


This example only parses function expressions that follow the syntax "@functionName(argumentName1, argumentName2, ...)". The general idea is to visit every character exactly once, without having to keep state flags like "hasSeenAtCharacter" or "hasSeenOpeningParentheses", which can get pretty messy when you parse large structures.

Please note that this is a very simplified example that omits all error handling and the like, but I hope the general idea comes across. Note also that I'm not saying you should use this approach all the time. It's a very general approach that can be used in many scenarios. That doesn't mean it can't be combined with regular expressions, for instance, if at some point in your text a regular expression makes more sense than parsing each individual character.
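
One possible way to mix the two techniques, sketched under my own assumptions: while scanning character by character, hand a whole identifier to a regular expression anchored at the current index. The helper name readIdentifier and the identifier pattern are hypothetical and not taken from the example above.

```javascript
// Hypothetical helper: reads an identifier that starts exactly at index i.
function readIdentifier(text, i) {
    var re = /[A-Za-z_]\w*/g; // the g flag makes exec() honor lastIndex
    re.lastIndex = i;
    var match = re.exec(text);

    // exec() may find a later match, so accept it only at position i.
    if (match === null || match.index !== i) {
        return null;
    }
    // index points at the first character after the identifier.
    return { value: match[0], index: i + match[0].length };
}

var text = "@func(arg1,arg2)";
var res = readIdentifier(text, 1); // start right after "@"
console.log(res.value); // logs: func
console.log(res.index); // logs: 5, the position of "("
```

The character loop can then continue from the returned index instead of pushing each character of the name into an array.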

And one last remark: you can save yourself some trouble by putting the specialized parsing functions inside the main parsing function, so that all functions have access to the same variable i.
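
A minimal sketch of that last remark, assuming the same "@name(arg1,arg2)" syntax: the sub-parsers are nested inside parse so they all advance one shared i and no longer need to return an index. The thrown errors and the results array are additions of this sketch, not part of the example above.

```javascript
// Sketch: the sub-parsers are nested so they share the cursor variable i.
function parse(text) {
    var i = 0;
    var results = [];

    // Reads argument names up to the closing ")".
    function parseArguments() {
        var args = [], arg = "";
        for (; i < text.length; i++) {
            var ch = text.charAt(i);
            if (ch === ",") {
                args.push(arg);
                arg = "";
            } else if (ch === ")") {
                args.push(arg);
                return args; // i stays on ")"
            } else {
                arg += ch;
            }
        }
        throw new Error("Missing closing parenthesis");
    }

    // Reads the name up to "(", then delegates to parseArguments.
    function parseFunction() {
        var name = "";
        for (; i < text.length; i++) {
            var ch = text.charAt(i);
            if (ch === "(") {
                i++; // step past "("
                return { name: name, args: parseArguments() };
            }
            name += ch;
        }
        throw new Error("Missing opening parenthesis");
    }

    for (; i < text.length; i++) {
        if (text.charAt(i) === "@") {
            i++; // step past "@"
            results.push(parseFunction());
        }
    }
    return results;
}

console.log(parse("@func(arg1,arg2)"));
```

Because every function moves the same cursor, the outer loop resumes right after the closing parenthesis without any index bookkeeping in the return values.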
