
How to parse a JavaScript code/array with C#

I've got a web request to a JavaScript file. As a response I get a JavaScript snippet which I'm trying to parse in C#.

The snippet looks like this:

sDt[1647110]=['SVK U19 A','D43A71','Jupie Podlavice Badin(U19)','TJ Straza(U19)','','',' / '
,'','',114745,114746,1,'',0,0,0,1012,1,'','',''];sDt[1647108]=['SVK U19 A','D43A71','Kysucke Nove Mesto(U19)',
'MFK Lokomotiva Zvolen(U19)','','',' / ','','',114741,114742,1,'',0,0,0,1012,1,'','',''];
sDt[1647109]=['SVK U19 A', /* A lot of more of that kind followed by */ ;WLID[1623901]=1;
WLID[1623902]=1;WLID[1623903]=1;WLID[1637686]=1;
WLID[1637692]=1;WLID[1637687]=1;WLID[1637688]=1;WLID[1637685]= /* ending with */ 
var ORD = [1647110,1647108,1647109,1647133,1645669,1647122,1626152,1647251,1646643,
1647130,1646685,1 ... ];

Obviously this isn't a pure JSON array. Now I wonder how to parse this most efficiently. First I started to do this by hand, using String.Split and so on. But this is slow and unfortunately not really stable.

While the part behind each sDt[Identifier]= is an array which I could parse with Json.Net, I also need the Identifier. Everything else, like WLID or var ORD, I can ignore.

Does anyone have an idea how to do this efficiently?

Thanks in advance

You have to go through the whole response token by token if you don't have any other information. There is no way around it. Why don't you just send JSON?

But to parse it I would do the following: Go through the whole response. If you come across a '[', check that you're not inside a string (for example by setting a flag when you hit a '"' and unsetting it when you reach the next '"'). If you are not inside a string, the following tokens are either the identifier or the content, and you can easily check which. In case of a number, this is your identifier until you reach ']'. Otherwise it's the content, which you can now parse with Json.Net: just remember the indices of the first '[' and the matching ']', generate a substring from them, and pass that to Json.Net. If you come across a ';' and you are not in a string, make sure that you skip the WLID and ORD parts.
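The scan described above could be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the class name SdtScanner and the helper names are hypothetical, and it assumes the real response contains no JavaScript comments or regex literals and that identifiers fit in an int. Because the sample uses single-quoted strings, the sketch tracks both ' and " as string delimiters. Each returned ArrayText is the substring from '[' to the matching ']', which could then be handed to Json.Net.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: collects (identifier, raw array text) for every
// "sDt[<number>]=[...]" assignment and skips everything else (WLID, ORD).
public static class SdtScanner
{
    public static List<(int Id, string ArrayText)> Extract(string js)
    {
        var result = new List<(int Id, string ArrayText)>();
        const string marker = "sDt[";
        char quote = '\0'; // active string delimiter, '\0' when outside a string
        int i = 0;

        while (i < js.Length)
        {
            char c = js[i];
            if (quote != '\0')
            {
                if (c == '\\') i++;                 // skip the escaped character
                else if (c == quote) quote = '\0';  // string ends here
                i++;
                continue;
            }
            if (c == '\'' || c == '"') { quote = c; i++; continue; }

            if (c == 's' && string.CompareOrdinal(js, i, marker, 0, marker.Length) == 0)
            {
                int idStart = i + marker.Length;
                int idEnd = js.IndexOf(']', idStart);
                if (idEnd > 0 && int.TryParse(js.Substring(idStart, idEnd - idStart), out int id))
                {
                    int pos = SkipWhitespace(js, idEnd + 1);
                    if (pos < js.Length && js[pos] == '=')
                    {
                        int open = SkipWhitespace(js, pos + 1);
                        int close = open < js.Length && js[open] == '[' ? FindArrayEnd(js, open) : -1;
                        if (close >= 0)
                        {
                            result.Add((id, js.Substring(open, close - open + 1)));
                            i = close + 1; // resume right after the array
                            continue;
                        }
                    }
                }
            }
            i++;
        }
        return result;
    }

    private static int SkipWhitespace(string s, int i)
    {
        while (i < s.Length && char.IsWhiteSpace(s[i])) i++;
        return i;
    }

    // Returns the index of the ']' matching the '[' at 'open', ignoring
    // brackets inside strings; -1 if the array is never closed.
    private static int FindArrayEnd(string js, int open)
    {
        int depth = 0;
        char quote = '\0';
        for (int k = open; k < js.Length; k++)
        {
            char c = js[k];
            if (quote != '\0')
            {
                if (c == '\\') k++;
                else if (c == quote) quote = '\0';
            }
            else if (c == '\'' || c == '"') quote = c;
            else if (c == '[') depth++;
            else if (c == ']' && --depth == 0) return k;
        }
        return -1;
    }
}
```

Calling SdtScanner.Extract(responseText) on the snippet from the question would yield one entry per sDt assignment, with the WLID and ORD statements left out simply because they never match the sDt[ marker.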

The whole operation takes O(n * m), with n = number of tokens and m = length of the longest content string.

If you parse the content yourself (instead of letting Json.Net do it for you), you could of course narrow it down to O(n).
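Parsing the content by hand could look something like the sketch below. It is an illustration under assumptions, not the answerer's code: the class name FlatArrayParser is hypothetical, the array is assumed to be flat (no nested arrays or objects), and every element is returned as text, with numbers left unconverted. A backslash simply keeps the following character literally, which is a simplification of real JavaScript escape rules.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: splits a flat array literal such as
// "['a','',12]" into its elements in a single pass.
public static class FlatArrayParser
{
    public static List<string> Parse(string arrayText)
    {
        var items = new List<string>();
        var current = new StringBuilder();
        char quote = '\0';    // active string delimiter, '\0' when outside a string
        bool hasItem = false; // true once the current element has any content

        // Start after the opening '[' and stop before the closing ']'.
        for (int i = 1; i < arrayText.Length - 1; i++)
        {
            char c = arrayText[i];
            if (quote != '\0')
            {
                if (c == '\\' && i + 1 < arrayText.Length - 1)
                    current.Append(arrayText[++i]); // keep escaped char literally
                else if (c == quote)
                    quote = '\0';
                else
                    current.Append(c);
            }
            else if (c == '\'' || c == '"')
            {
                quote = c;
                hasItem = true; // an empty string '' is still an element
            }
            else if (c == ',')
            {
                items.Add(current.ToString());
                current.Clear();
                hasItem = false;
            }
            else if (!char.IsWhiteSpace(c))
            {
                current.Append(c); // part of an unquoted value such as a number
                hasItem = true;
            }
        }

        // Add the last element unless the array was completely empty.
        if (hasItem || items.Count > 0) items.Add(current.ToString());
        return items;
    }
}
```

Each character is visited once, so the cost grows linearly with the length of the array text. Whether this is worth it over Json.Net depends on how much of the element typing you need.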


 