
Javascript Regex format - works in PHP, not in JS

I'm trying to use regular expressions in JS, but I'm probably missing something because I can't get them to work. I have a couple of regular expressions that work fine in PHP (used with preg_match), but when I use exactly the same expressions in JS, I get no matches.

Here is an example. I'm trying to parse this page:

https://www.coolblue.be/fr/rechercher?query=GTX+1060&trier=prix-les-moins-chers

My code:

var pattern = '/<a class=\"product__title js-product-title\" href=\"(.*)\" data-trackclickevent=\"(.*)\">[\n\r\s]+(.*)[\n\r\s]+<\/a>/gi';
var found = content.match(pattern);

The variable content contains the full source code of the page. I dumped it to the console to make sure it was loaded correctly, and I see, for example (the markup is messy, but I took it from the page mentioned above without changing anything):

<div class="product__titles"><div class="js-product-feature-title"></div><a class="product__title js-product-title" href="/fr/produit/654109" data-trackclickevent="Internal Search, Product, Oehlbach BTX 1000 (654109) - Product title">
                Oehlbach BTX 1000
            </a></div><div class="product__review-rating"><div class="review-rating alt-compact"><div class="review-rating--rating">

When I test my regular expression on https://regex101.com/, it also works, but somehow in JS it doesn't.

Any idea what I'm missing?

Thanks

Laurent

In JavaScript, regular expressions are mostly used with two methods: test and replace. test just tells you whether its argument matches the regular expression, whereas replace takes a second parameter: the string that replaces the matched text. Like most string functions, replace returns a new string; it does not change the input. For example:

 document.write(/cats/i.test("Cats are fun. I like cats.")); 
And replace:

 document.write("Cats are fun. I like cats.".replace(/cats/gi,"dogs")); 

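Also, in JavaScript you have to escape a closing bracket "]" inside a character class, as below (the doubled backslashes suggest the pattern is meant to be written inside a string passed to new RegExp):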

 \\[([^\\]\\s]+).([^\\]]+)\\] 
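Note that the test and replace examples above write the pattern as a regex literal, with no quotes around it. The pattern in the question is wrapped in quotes, which makes it a plain string; match() then builds a regular expression from that string, so the leading "/" and the trailing "/gi" are treated as literal characters to search for. That is the likely reason nothing matches. A minimal sketch, assuming the markup quoted in the question; the tightened pattern with [^"]* and the exec loop are my own choices, not something the original post confirms:

```javascript
// Sample markup copied from the question.
var content =
  '<a class="product__title js-product-title" href="/fr/produit/654109" ' +
  'data-trackclickevent="Internal Search, Product, Oehlbach BTX 1000 (654109) - Product title">\n' +
  '                Oehlbach BTX 1000\n' +
  '            </a>';

// The question's pattern, quoted: this is a string, not a regex.
// match() turns it into new RegExp(string), so "/" and "/gi" must appear
// literally in the text. For this sample the result is null.
var asString = '/<a class=\"product__title js-product-title\" href=\"(.*)\" data-trackclickevent=\"(.*)\">[\n\r\s]+(.*)[\n\r\s]+<\/a>/gi';
var stringResult = content.match(asString);

// Regex literal: no quotes, so the delimiters and flags are parsed as such.
// [^"]* keeps each attribute capture from running past its closing quote.
var pattern = /<a class="product__title js-product-title" href="([^"]*)" data-trackclickevent="([^"]*)">\s*([^<]*?)\s*<\/a>/gi;

// With the g flag, match() returns only the full matches, without groups,
// so exec() is called in a loop to read the capture groups.
var results = [];
var m;
while ((m = pattern.exec(content)) !== null) {
  results.push({ href: m[1], track: m[2], title: m[3] });
}

console.log(stringResult);
console.log(results);
```

In PHP the delimiters and flags live inside the quoted string passed to preg_match, which is why the same text works there. In JavaScript the equivalent of a quoted pattern is new RegExp(source, "gi"), where source contains neither the slashes nor the flags.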

Ok, here is how I solved it.

var el = document.createElement( 'html' );
el.innerHTML = content;
var all_links = el.getElementsByClassName("product__title");

Here, content must contain your HTML.
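To get the same values the regular expression was capturing, each matched element can be read directly. A rough sketch, assuming a browser environment; extractLinks is a hypothetical helper name, not part of the original solution:

```javascript
// Hypothetical helper: collects the href and the trimmed text of every
// element with the class "product__title" under the given root element.
function extractLinks(root) {
  var links = root.getElementsByClassName('product__title');
  var out = [];
  for (var i = 0; i < links.length; i++) {
    out.push({
      href: links[i].getAttribute('href'),
      title: links[i].textContent.trim()
    });
  }
  return out;
}

// Browser-only usage; skipped where no DOM is available (for example in Node)
// or where no string variable named content exists.
if (typeof document !== 'undefined' && typeof content === 'string') {
  var el = document.createElement('html');
  el.innerHTML = content;
  console.log(extractLinks(el));
}
```

Letting the browser parse the markup avoids depending on attribute order and whitespace, which a regular expression over raw HTML is sensitive to.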
