I have this string:
s='data-id="a1429883480588" class="privateMessage" @zaza
data-id="a1429883480589" class="privateMessage" @zaza2
data-id="a1429883480598" class="privateMessage" @zaza3'
My goal is to capture what's between data-id=" and " so that the result is: [a1429883480588, a1429883480589, a1429883480598]
I tried with
var splitted = s.match(/data-id="(\w)+(?=")/g)
But the matches also include data-id=" and ".
Any idea how to write this regex?
It must be done in JavaScript, since this is a Node.js function.
If you're happy that the string will always be well formed and not mangled, here's one that'll do it:
var s = '<span data-id="a1429883480588" class="privateMessage">@zaza</span> ';
s += '<span data-id="a1429883480589" class="privateMessage">@zaza2</span> ';
s += '<span data-id="a1429883480598" class="privateMessage">@zaza3</span>';
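// match() with the g flag returns the full matches, so take the part between the quotes.
// It returns null when nothing matches, hence the fallback to an empty array.
var ids = (s.match(/data-id="\w+"/g) || []).map(function(attributeAndValue) {
  return attributeAndValue.split('"')[1];
});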
The concerns raised above about using regex to parse HTML are valid, but they apply mostly to HTML in the wild.
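As an alternative to splitting each match on the quotes, the capture group can be read directly. The attempt in the question fails for two reasons: with the g flag, match() returns whole matches and ignores capture groups, and (\w)+ repeats a one-character group, so the group would only hold the last character anyway. Below is a minimal sketch using exec() in a loop with the group widened to (\w+). The helper name extractDataIds is made up for illustration, and the input is assumed to be the plain string from the question.

```javascript
// Sample input mirroring the string from the question.
var s = 'data-id="a1429883480588" class="privateMessage" @zaza\n' +
        'data-id="a1429883480589" class="privateMessage" @zaza2\n' +
        'data-id="a1429883480598" class="privateMessage" @zaza3';

// extractDataIds is a hypothetical helper name, not part of any library.
function extractDataIds(text) {
  // (\w+) captures the whole id; the g flag lets exec() advance through the string.
  var re = /data-id="(\w+)"/g;
  var ids = [];
  var m;
  // exec() returns null once there are no more matches.
  while ((m = re.exec(text)) !== null) {
    ids.push(m[1]); // m[0] is the full match, m[1] is the first capture group
  }
  return ids;
}

var ids = extractDataIds(s);
console.log(ids);
```

On Node.js 12 or later, String.prototype.matchAll with the same regex should give the same result without the explicit loop. Like the answer above, this assumes ids contain only word characters.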
Here's the cheerio equivalent, for reference:
var cheerio = require('cheerio');
var markup = '<span data-id="a1429883480588" class="privateMessage">@zaza</span> <span data-id="a1429883480589" class="privateMessage">@zaza2</span> <span data-id="a1429883480598" class="privateMessage">@zaza3</span>';
var $ = cheerio.load('<div>'+markup+'</div>');
var ids = Array.prototype.map.call($('[data-id]'), function(e) {
  return $(e).attr('data-id');
});
console.log(ids);
// [ 'a1429883480588', 'a1429883480589', 'a1429883480598' ]