
Extract JavaScript table with regex in Python

I have scraped a webpage and would like to extract a particular table from its JavaScript using regex. How should I proceed?

renderList([
["Platform_1","1","TWOB","6150","Paris","France", "2016/03/01", "01-03-2016", "", ""],
["Platform_2","3","NRCN","2000","Bruxelles","Belgium", "2009/04/21", "21-04-2009", "", ""],
]
);

I want to keep the list format, such that I only retrieve the following string:

["Platform_1","1","TWOB","6150","Paris","France", "2016/03/01", "01-03-2016", "", ""],
["Platform_2","3","NRCN","2000","Bruxelles","Belgium", "2009/04/21", "21-04-2009", "", ""],

And if possible, remove the comma at the end of the table:

["Platform_1","1","TWOB","6150","Paris","France", "2016/03/01", "01-03-2016", "", ""],
["Platform_2","3","NRCN","2000","Bruxelles","Belgium", "2009/04/21", "21-04-2009", "", ""]

You can use substring to extract the array literal, then parse it to produce the output you want. In JavaScript:

const str = `renderList([
["Platform_1","1","TWOB","6150","Paris","France", "2016/03/01", "01-03-2016", "", ""],
["Platform_2","3","NRCN","2000","Bruxelles","Belgium", "2009/04/21", "21-04-2009", "", ""],
]
);`;

// Drop the leading "renderList(" (11 characters) and the trailing ");",
// remove line breaks, then delete the trailing comma so the text is valid JSON.
const table = JSON.parse(
  str.substring(11, str.length - 2)
    .replaceAll('\n', '')
    .replaceAll(',]', ']')
);
console.log(table);
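Since the question asks for Python, here is a minimal sketch of the same idea using the standard re and json modules. The names page, extract_rows_text and parse_rows are hypothetical, and the pattern assumes the page contains a single renderList([...]); call and that no cell value contains the sequence "])". It is one possible approach, not the only one.

```python
import json
import re

# Sample input copied from the question; in practice this would be the scraped page text.
page = '''renderList([
["Platform_1","1","TWOB","6150","Paris","France", "2016/03/01", "01-03-2016", "", ""],
["Platform_2","3","NRCN","2000","Bruxelles","Belgium", "2009/04/21", "21-04-2009", "", ""],
]
);'''

# Capture everything between "renderList([" and the closing "])" followed by ";".
# re.DOTALL lets "." match newlines so the capture can span several lines.
PATTERN = re.compile(r"renderList\(\s*\[(.*?)\]\s*\)\s*;", re.DOTALL)


def extract_rows_text(source):
    """Return the rows as a string, without the trailing comma."""
    match = PATTERN.search(source)
    if match is None:
        raise ValueError("no renderList([...]) call found")
    # Trim surrounding whitespace first, then the comma after the last row.
    return match.group(1).strip().rstrip(",")


def parse_rows(source):
    """Return the rows as a list of lists by parsing the cleaned text as JSON."""
    return json.loads("[" + extract_rows_text(source) + "]")


if __name__ == "__main__":
    print(extract_rows_text(page))
    print(parse_rows(page)[0][4])
```

extract_rows_text keeps the list format as a string, as requested in the question, while parse_rows goes one step further and yields Python lists. Parsing with json only works because every cell here is a double-quoted string; JavaScript literals with single quotes or unquoted keys would need a different parser.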

