
Ragel Java outputs [nulla,b] instead of [a,b,c]

I would like to write a CSVReader state machine in Ragel, since I have already written one in Java with enums. The returned list should be [a,b,c], but I get [nulla,b]. I'm using Ragel 6.8 on Fedora 22. I really hope someone can help me.

This is the source:

%%{

machine csv_reader_java;
seperator = (';'|',');
letter = [A-Za-z0-9]*;

main := |*
seperator => { putToList(tokens, string); };
letter => { emit(data, tokens, ts, te); };
space;
*|;

}%%

import java.util.*;

public class CSVReader {

private String string;

public void emit(char[] data, List<String> tokens, int ts, int te) {
     char output = data[ts];
     string += output;
}
public void putToList(List<String> tokens, String data){
tokens.add(data);
string = "";
}

%% write data;

public List<String> split(char[] data) {
    int cs; /* state number */
    int p = 0, /* start of input */
    pe = data.length, /* end of input */
    eof = pe,
    ts, /* token start */
    te, /* token end */
    act /* used for scanner backtracking */;

    List<String> tokens = new ArrayList<String>();

    %% write init;
    %% write exec;

    return tokens;
}

public static void main(String[] args) {
    System.out.println(new CSVReader().split("a,b,c".toCharArray()));
}
}

And this is what it returns: [nulla, b]

Looking over this, I see two problems. The first is the null at the start of your output, which I think is caused by not initializing string when parsing starts, which leaves it null. When the emit call reaches string += output;, string is null, so it appends the current token ("a") to the string representation of null, resulting in "nulla". Initializing string with "" would fix this.
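To see the first problem in isolation, here is a minimal sketch outside of Ragel. The class NullConcatDemo and the helper append are hypothetical names used only for illustration; append mirrors the string += output; line in emit.

```java
class NullConcatDemo {
    // Hypothetical helper mirroring the `string += output;` line in emit().
    static String append(String current, char next) {
        // String concatenation renders a null reference as the text "null".
        current += next;
        return current;
    }

    public static void main(String[] args) {
        System.out.println(append(null, 'a')); // field left uninitialized: prints "nulla"
        System.out.println(append("", 'a'));   // field initialized to "": prints "a"
    }
}
```

In the class from the question, the corresponding change would be to declare the field as private String string = ""; (or to assign "" at the start of split).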

The second problem, the missing "c" in the list, is even simpler. Tokens only get added to the list when a separator is found, and since there is no separator after "c", that token never gets added. You could solve this by running an action at end-of-file that emits the current token string if it isn't empty.
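The flush-at-end idea can be sketched in plain Java, without the generated scanner. This is an illustration under assumptions, not the Ragel solution itself: the class FlushAtEndDemo is hypothetical, and the loop only imitates the three scanner rules (separator, letter, space) in simplified form.

```java
import java.util.ArrayList;
import java.util.List;

class FlushAtEndDemo {
    // Hand-written stand-in for the scanner: same accumulate/flush pattern.
    static List<String> split(char[] data) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder(); // starts empty, never null

        for (char c : data) {
            if (c == ',' || c == ';') {
                // Separator: store the token collected so far and reset.
                tokens.add(current.toString());
                current.setLength(0);
            } else if (!Character.isWhitespace(c)) {
                // Simplification: any non-space character joins the token.
                current.append(c);
            }
        }

        // End of input: flush the last token, which has no trailing separator.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("a,b,c".toCharArray())); // prints [a, b, c]
    }
}
```

In the class from the question, one way to get the same effect, assuming string has been initialized to "", would be to add if (!string.isEmpty()) putToList(tokens, string); after %% write exec; and before return tokens;.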
