How do I extract tokens from a string?
I'm trying to write a 'compiler' and I need to extract the comparison operators, such as <= and >=, from an input string.
I'm trying to tokenize my input string to get these operators, but instead I get two separate tokens, e.g. '<' and '=' or '>' and '='.
static List<String> divideSymbols(string token){
    /** remove semicolons and tokenize, splitting on operators */
    List<String> myTokens = new List<string>();
    // split on operators
    char [] tokens = token.ToCharArray();
    String accum="";
    String accum1 = "";
    for(int i=0;i<tokens.Length; i++){
        try{
            if((tokens[i]!='>' && tokens[i]!='<' && tokens[i]!='=' && tokens[i]!='+' && tokens[i]!='-' && tokens[i]!='(' && tokens[i]!=')' && tokens[i]!='{' && tokens[i]!='}' && tokens[i]!=(char)34 && tokens[i]!=(char)39 && tokens[i]!='/' && tokens[i]!='*' && tokens[i]!='%' && tokens[i]!='&' && tokens[i]!='|' && tokens[i]!='!' && tokens[i]!=',' && tokens[i]!='[' && tokens[i]!=']' /*&& tokens[i+1] !='='*/) ){
                /*if((tokens[i] == '>' || tokens[i] == '<' || tokens[i]== '!') && tokens[i+1]== '=' ){
                    Console.WriteLine("TEST");
                }*/
                if(tokens[i] == '<' && tokens[i+1] == '='){
                }
                if(tokens[i]!=';' ){ // remove ; (semicolon)
                    removeDuplicates(accum);
                    accum+=tokens[i];
                }
            }else{
                if(accum!=""){
                    myTokens.Add(accum);
                    myTokens.Add(tokens[i].ToString());
                }else{
                    removeDuplicates(accum);
                    myTokens.Add(tokens[i].ToString());
                }
                accum="";
            }
            if((tokens[i]== '>' ||tokens[i]== '<' || tokens[i]== '!' || tokens[i]== '=') && tokens[i+1] == '='){
                accum1 = tokens[i].ToString()+tokens[i+1].ToString();
                myTokens.Add(accum1);
                i++;
                //myTokens.Remove('<');
            }
        }catch(IndexOutOfRangeException){
        }
    }
    myTokens.Add(accum);
    myTokens.Add(accum1);
    return myTokens;
}
When I run this, I get both tokens in the output, and I need to delete the first one if the second one is an = sign.
The expected output is:
1,<if_stmt>,if
1,<open_parents>,(
1,<number>,4
1,<morethan_op>,>
1,<eqmorethan_op>,>=
1,<number>,5
1,<close_parents>,)
1,<open_braces>,{
1,<eqmorethan_op>,>=
2,<print>,print
2,<open_parents>,(
2,<string_op>,"
2,<variable>,yes
2,<string_op>,"
2,<close_parents>,)
3,<close_braces>,}
4,<class>,class
4,<variable>,Foo
4,<open_braces>,{
5,<type>,int
5,<variable>,key
6,<close_braces>,}
8,<variable>,Foo
8,<variable>,a
but without repeating the > and >=.
If I understand your problem correctly, what we need to do is extract the comparison operators from an input string.
So the idea here is to traverse the string and concentrate only on the comparison operators.
As you are trying to simulate a compiler, we know that each line must end with a semicolon (;).
With that in mind, we pass in the string, remove all whitespace, scan it, and return the comparison operators that were found.
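// Returns a copy of the input with every whitespace character removed.
static string RemoveWhitespace(string input)
{
    int j = 0, inputlen = input.Length;
    char[] newarr = new char[inputlen];
    for (int i = 0; i < inputlen; ++i)
    {
        char tmp = input[i];
        if (!char.IsWhiteSpace(tmp))
        {
            // Copy only the non-whitespace characters, packing them to the front.
            newarr[j] = tmp;
            ++j;
        }
    }
    return new String(newarr, 0, j);
}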
// Collects the two-character comparison operators !=, <= and >= found in the input.
static List<String> DivideSymbols(string tokenisedString)
{
    string token = RemoveWhitespace(tokenisedString);
    List<String> myTokens = new List<string>();
    // Single-character symbols that are not comparison operators (not used by the loop below).
    List<char> tokensToSkip = new List<char> { '+', '-', '(', ')', '{', '}', '/', '*', '%', '&', '|', '!', ',', '[', ']', '\'', '"' };
    char different = '!';
    char lessThan = '<';
    char greaterThan = '>';
    char equal = '=';
    char endOfLine = ';';
    // Stop one character early so that token[i + 1] is always in range.
    for (int i = 0; i < token.Length - 1; i++)
    {
        // Stop scanning at the first semicolon.
        if (token[i] == endOfLine)
        {
            break;
        }
        if (token[i] == different && token[i + 1] == equal)
        {
            myTokens.Add(token[i].ToString() + token[i + 1].ToString());
        }
        if (token[i] == lessThan && token[i + 1] == equal)
        {
            myTokens.Add(token[i].ToString() + token[i + 1].ToString());
        }
        if (token[i] == greaterThan && token[i + 1] == equal)
        {
            myTokens.Add(token[i].ToString() + token[i + 1].ToString());
        }
    }
    return myTokens;
}
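static void Main(string[] args)
{
    List<String> found = DivideSymbols("if(x == 1 && x >= 10 || v != 3 'perhaps something like' AND SQL <::= c<=d) { x++};");
    // Show the operators that were collected.
    Console.WriteLine(string.Join(", ", found));
}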
Now you have the logic in place and can do whatever you like with each operator inside the corresponding if block.
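If you also need the other tokens (identifiers, numbers, brackets) and want to avoid getting both > and >=, one common approach is to test for the two-character operator first and only fall back to the single character when that test fails. Below is a minimal sketch of that idea, not a full lexer. The names OperatorScanner, Tokenize, TwoCharOps and OneCharSymbols are made up for this example, and the operator set (including ==) is an assumption based on the symbols in your code. Like your version, it drops semicolons.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class OperatorScanner
{
    // Two-character operators to recognise (assumed set for this sketch).
    static readonly HashSet<string> TwoCharOps = new HashSet<string> { "<=", ">=", "==", "!=" };

    // Single-character symbols that become tokens on their own.
    const string OneCharSymbols = "<>=+-(){}\"'/*%&|!,[]";

    public static List<string> Tokenize(string input)
    {
        var tokens = new List<string>();
        var word = new StringBuilder();

        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];

            // Try the longer operator first so ">=" is never split into ">" and "=".
            if (i + 1 < input.Length && TwoCharOps.Contains(input.Substring(i, 2)))
            {
                Flush(word, tokens);
                tokens.Add(input.Substring(i, 2));
                i++; // skip the second character of the operator
            }
            else if (char.IsWhiteSpace(c) || c == ';')
            {
                // Whitespace and semicolons end the current word and are dropped.
                Flush(word, tokens);
            }
            else if (OneCharSymbols.IndexOf(c) >= 0)
            {
                Flush(word, tokens);
                tokens.Add(c.ToString());
            }
            else
            {
                // Part of an identifier, keyword or number.
                word.Append(c);
            }
        }

        Flush(word, tokens);
        return tokens;
    }

    // Emits the accumulated word, if any, and resets the buffer.
    static void Flush(StringBuilder word, List<string> tokens)
    {
        if (word.Length > 0)
        {
            tokens.Add(word.ToString());
            word.Clear();
        }
    }
}
```

For example, OperatorScanner.Tokenize("if(4>=5){") gives the tokens if, (, 4, >=, 5, ), { with no separate > or =. Because the bounds check happens before the lookahead, there is no need to catch IndexOutOfRangeException. Note that this sketch does not treat the text between quotes specially, so operators inside a string literal are still split out as tokens.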