Extract a JavaScript table with regex in Python

I have scraped a web page and want to extract one specific table from it using a regular expression. How should I go about it?

renderList([
["Platform_1","1","TWOB","6150","Paris","France", "2016/03/01", "01-03-2016", "", ""],
["Platform_2","3","NRCN","2000","Bruxelles","Belgium", "2009/04/21", "21-04-2009", "", ""],
]
);

I want to keep the list format, so that I retrieve only the following string:

["Platform_1","1","TWOB","6150","Paris","France", "2016/03/01", "01-03-2016", "", ""],
["Platform_2","3","NRCN","2000","Bruxelles","Belgium", "2009/04/21", "21-04-2009", "", ""],

If possible, also remove the trailing comma at the end of the table:

["Platform_1","1","TWOB","6150","Paris","France", "2016/03/01", "01-03-2016", "", ""],
["Platform_2","3","NRCN","2000","Bruxelles","Belgium", "2009/04/21", "21-04-2009", "", ""]

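You can use substring to cut the array literal out of the string. Once the line breaks and the trailing comma are removed, it can be parsed as JSON to produce the output you want: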

const str = `renderList([
["Platform_1","1","TWOB","6150","Paris","France", "2016/03/01", "01-03-2016", "", ""],
["Platform_2","3","NRCN","2000","Bruxelles","Belgium", "2009/04/21", "21-04-2009", "", ""],
]
);`;

const table = JSON.parse(
  str
    .substring(11, str.length - 2)
    .replaceAll('\n', '')
    .replaceAll(',]', ']')
);

console.log(table);
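The snippet above is JavaScript, while the question asks about Python. The same idea might be sketched in Python with the standard `re` and `json` modules. This is only a sketch under assumptions: the names `page` and `extract_rows` are made up for illustration, the page is assumed to contain exactly one `renderList([...]);` call, and no cell is assumed to contain the text `]);`.

```python
import json
import re

# Sample input copied from the question; in practice this would be the scraped page text.
page = '''renderList([
["Platform_1","1","TWOB","6150","Paris","France", "2016/03/01", "01-03-2016", "", ""],
["Platform_2","3","NRCN","2000","Bruxelles","Belgium", "2009/04/21", "21-04-2009", "", ""],
]
);'''


def extract_rows(text):
    """Return the rows passed to renderList(...) as a list of lists (hypothetical helper)."""
    # Capture everything from the first "[" up to the "]" that is followed by ");".
    match = re.search(r"renderList\(\s*(\[.*?\])\s*\)\s*;", text, re.DOTALL)
    if match is None:
        raise ValueError("no renderList([...]) call found")
    body = match.group(1)
    # JSON does not allow a trailing comma, so drop the one before the final "]".
    body = re.sub(r",\s*\]\s*$", "]", body)
    return json.loads(body)


rows = extract_rows(page)
for row in rows:
    print(row)
```

Parsing into a list of lists, rather than keeping the raw string, makes the trailing-comma question moot, since the rows can be re-serialized in whatever format is needed. If the cells could contain brackets or escaped quotes, a real JavaScript parser would likely be safer than a regular expression.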

