Ragel Java outputs [nulla,b] instead of [a,b,c]
I would like to write a CSVReader state machine in Ragel, since I've already done mine in Java with enums. The returned list should be [a,b,c] but I get [nulla,b]. I'm using Ragel 6.8 on Fedora 22, and I really hope someone can help me.
This is the source:
%%{
machine csv_reader_java;
seperator = (';'|',');
letter = [A-Za-z0-9]*;
main := |*
seperator => { putToList(tokens, string); };
letter => { emit(data, tokens, ts, te); };
space;
*|;
}%%
import java.util.*;
public class CSVReader {
private String string;
public void emit(char[] data, List<String> tokens, int ts, int te) {
char output = data[ts];
string += output;
}
public void putToList(List<String> tokens, String data){
tokens.add(data);
string = "";
}
%% write data;
public List<String> split(char[] data) {
int cs; /* state number */
int p = 0, /* start of input */
pe = data.length, /* end of input */
eof = pe,
ts, /* token start */
te, /* token end */
act /* used for scanner backtracking */;
List<String> tokens = new ArrayList<String>();
%% write init;
%% write exec;
return tokens;
}
public static void main(String[] args) {
System.out.println(new CSVReader().split("a,b,c".toCharArray()));
}
}
And this is what it returns: [nulla, b]
Looking over this, I see two problems. The first is the null at the start of your output, which I think is caused by not initializing string when parsing starts, which leaves it null. When the emit call reaches string += output; the field string is still null, so Java appends the current character ('a') to the string representation of null, resulting in "nulla". Initializing string with "" would fix this.
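The null behaviour can be reproduced without Ragel at all. The following is a minimal sketch in plain Java; the class name NullConcatDemo and the helper appendTo are made up for illustration and are not part of the original code.

```java
public class NullConcatDemo {
    // Hypothetical helper: mirrors what "string += output;" does in emit().
    static String appendTo(String prefix, char c) {
        prefix += c; // a null reference is converted to the text "null" here
        return prefix;
    }

    public static void main(String[] args) {
        System.out.println(appendTo(null, 'a')); // prints "nulla"
        System.out.println(appendTo("", 'a'));   // prints "a"
    }
}
```

In the CSVReader class, the equivalent change would be declaring the field as private String string = ""; instead of leaving it uninitialized.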
The second problem, "c" not being added to the list, is even simpler. Tokens only get added to the list when a separator is found, and since there is no separator after "c", that token never gets added. You could solve this by running an action on end-of-file that emits the current token string (adds it to the list) if it isn't empty.
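As a rough illustration of the end-of-input flush, here is a hand-written Java loop that stands in for the Ragel-generated scanner. It is only a sketch under that assumption: the class name CsvFlushDemo and the field name current are hypothetical, and the loop imitates the three scanner patterns rather than using real Ragel output.

```java
import java.util.ArrayList;
import java.util.List;

public class CsvFlushDemo {
    // Initialized to "" so the first token is not prefixed with "null".
    private String current = "";

    // Counterpart of emit(): append one character to the pending token.
    private void emit(char c) {
        current += c;
    }

    // Counterpart of putToList(): store the pending token and reset it.
    private void putToList(List<String> tokens) {
        tokens.add(current);
        current = "";
    }

    public List<String> split(char[] data) {
        List<String> tokens = new ArrayList<>();
        for (char c : data) {
            if (c == ',' || c == ';') {
                putToList(tokens); // separator closes the current token
            } else if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                    || (c >= '0' && c <= '9')) {
                emit(c);           // same character class as [A-Za-z0-9]
            }
            // whitespace and any other character is skipped
        }
        // End of input: no separator follows the last token, so flush it here.
        if (!current.isEmpty()) {
            putToList(tokens);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(new CsvFlushDemo().split("a,b,c".toCharArray())); // prints [a, b, c]
    }
}
```

In the original class, the same effect could likely be had by adding the isEmpty check and a putToList(tokens, string) call right after %% write exec; and before return tokens;.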