Parse tag with regex in JavaScript

Question

I have this string:

   s='data-id="a1429883480588" class="privateMessage" @zaza
    data-id="a1429883480589" class="privateMessage" @zaza2
    data-id="a1429883480598" class="privateMessage" @zaza3'

My goal is to capture what's between data-id=" and " to get the result: [a1429883480588, a1429883480589, a1429883480598]

I tried:

var splitted = s.match(/data-id="(\w)+(?=")/g)

But the result also includes data-id=" and "

Any idea how to write this regex?

It must be done in JS, since it is a Node.js function!

Answer 1

If you're confident that the string will always be well formed and not mangled, here's one that will do it:

var s = '<span data-id="a1429883480588" class="privateMessage">@zaza</span>&nbsp;';
s += '<span data-id="a1429883480589" class="privateMessage">@zaza2</span>&nbsp;';
s += '<span data-id="a1429883480598" class="privateMessage">@zaza3</span>';

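// match() returns null when nothing matches, so fall back to an empty array.
var ids = (s.match(/data-id="\w+"/g) || []).map(function(attributeAndValue) {
    // Each match looks like data-id="a1429883480588"; the value sits between the quotes.
    return attributeAndValue.split('"')[1];
});

console.log(ids);
// [ 'a1429883480588', 'a1429883480589', 'a1429883480598' ]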

The concerns raised above about using RegEx to parse HTML are valid, but they apply more to HTML in the wild than to a string whose format you control.
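
As a possible alternative, a capture group can pull out the value directly. This also suggests why the attempt in the question returned more than the id: with the g flag, String.prototype.match returns whole matches and ignores capture groups, and in (\w)+ the quantifier sits outside the group, so the group would only hold the last character anyway. The sketch below is a minimal illustration under the same assumption of well-formed input. The helper names extractDataIds and extractDataIdsLookbehind are hypothetical, and the second variant assumes a Node.js version whose regex engine supports lookbehind assertions.

```javascript
// Hypothetical helper (not from the original answer): collect capture group 1
// from every match by calling exec() repeatedly on a global regex.
function extractDataIds(input) {
    var pattern = /data-id="(\w+)"/g; // quantifier inside the group
    var ids = [];
    var match;
    while ((match = pattern.exec(input)) !== null) {
        ids.push(match[1]); // match[0] is the full text, match[1] is the id
    }
    return ids;
}

// Hypothetical variant: lookbehind and lookahead keep the delimiters
// out of the matched text, so match() with /g returns only the ids.
// Assumes an engine with lookbehind support.
function extractDataIdsLookbehind(input) {
    return input.match(/(?<=data-id=")\w+(?=")/g) || [];
}

var sample = '<span data-id="a1429883480588" class="privateMessage">@zaza</span>&nbsp;' +
    '<span data-id="a1429883480589" class="privateMessage">@zaza2</span>';

console.log(extractDataIds(sample));
// [ 'a1429883480588', 'a1429883480589' ]
console.log(extractDataIdsLookbehind(sample));
// [ 'a1429883480588', 'a1429883480589' ]
```

The exec() loop works on older Node.js releases as well, which may matter if the runtime version is not under your control.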

Answer 2

Here's the cheerio equivalent, just for reference:

var cheerio = require('cheerio');

var markup = '<span data-id="a1429883480588" class="privateMessage">@zaza</span>&nbsp;<span data-id="a1429883480589" class="privateMessage">@zaza2</span>&nbsp;<span data-id="a1429883480598" class="privateMessage">@zaza3</span>';
var $ = cheerio.load('<div>'+markup+'</div>');
var ids = Array.prototype.map.call($('[data-id]'), function(e) {
    return $(e).attr('data-id');
});

console.log(ids);
// [ 'a1429883480588', 'a1429883480589', 'a1429883480598' ]

2 answers

solution1
1 ACCEPTED 2015-04-24 16:56:18

solution2
1 2015-04-24 17:12:05
