
RegEx to match text to first occurrence of a delimiter

This is the data I want to match with RegEx:

<table>
  <tr>
    <td>
      <font size="4">Speciality</font>
    </td>
    <td>
      <font size="4">somespeciality</font>
    </td>
  </tr>
  <tr>
    <td>
      <font size="4">Date</font>
    </td>
    <td>
      <font size="4">somedate</font>
    </td>
  </tr>
</table>

I want to get somespeciality as the result, but with this RegEx:

/Speciality[\s\S]*size="4">(.*?)<\/font>/i

I'm getting somedate. What is the correct way to do this?

Thanks.

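You need to use a non-greedy (lazy) quantifier after the character class. The greedy [\s\S]* runs to the end of the input and backtracks only as far as the last size="4">, which is why somedate is captured. Use this instead: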

[\s\S]*?
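
To see the difference, here is a minimal sketch that runs both patterns against the markup from the question. The variable names (html, greedy, lazy) are chosen for illustration only and do not come from the original post.

```javascript
// The markup from the question, collapsed into one string.
var html = '<table><tr><td><font size="4">Speciality</font></td>' +
           '<td><font size="4">somespeciality</font></td></tr>' +
           '<tr><td><font size="4">Date</font></td><td><font size="4">' +
           'somedate</font></td></tr></table>';

// Greedy: [\s\S]* consumes as much as possible, so the match
// backtracks only to the LAST size="4"> in the input.
var greedy = /Speciality[\s\S]*size="4">(.*?)<\/font>/i;

// Lazy: [\s\S]*? consumes as little as possible, so the match
// stops at the FIRST size="4"> after "Speciality".
var lazy = /Speciality[\s\S]*?size="4">(.*?)<\/font>/i;

var greedyResult = html.match(greedy)[1];
var lazyResult = html.match(lazy)[1];

console.log(greedyResult); // somedate
console.log(lazyResult);   // somespeciality
```

The only change is the extra ? after [\s\S]*; the capture group (.*?) was already lazy in the original pattern.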

Just for the record, if you did want to do this with plain DOM methods, you'd do something like the following. It gets all the elements, finds the first one whose text content matches the given text, gets its tag name, then finds the next element with that tag name and returns its text content:

var data = '<table><tr><td><font size="4">Speciality</font></td>' +
           '<td><font size="4">somespeciality</font></td></tr>' +
           '<tr><td><font size="4">Date</font></td><td><font size="4">' +
           'somedate</font></td></tr></table>';

function getSpecial(text, data) {
  var div = document.createElement('div');
  div.innerHTML = data;
  var tagName;

  var nodes = div.getElementsByTagName('*');

  for (var i=0, iLen=nodes.length; i<iLen; i++) {
    if (tagName && nodes[i].tagName == tagName) {
      return nodes[i].textContent;
    }

    if (nodes[i].textContent.trim() == text) {
      tagName = nodes[i].tagName;
    }
  }
}

console.log(getSpecial('Speciality', data)); // somespeciality

The difficulty with any such approach (including using a regular expression) is that any change to the markup (and resulting DOM) will likely cause the process to fail.

Note that the above requires ES5 and support for textContent, which should be available in all modern browsers and IE 9+. Support for older browsers can be added by adding a polyfill for trim and using nodes[i].textContent || nodes[i].innerText. The rest will be fine.
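
A rough sketch of that fallback might look like the following. The helper name getText is hypothetical, and the trim replacement is one common approach rather than the only one.

```javascript
// Fallback for engines without String.prototype.trim (pre-ES5).
// Skipped entirely where a native trim already exists.
if (!String.prototype.trim) {
  String.prototype.trim = function () {
    return this.replace(/^[\s\uFEFF\xA0]+|[\s\uFEFF\xA0]+$/g, '');
  };
}

// Hypothetical helper: prefer textContent, fall back to innerText
// (old IE), and return an empty string if neither is available.
function getText(node) {
  return node.textContent || node.innerText || '';
}
```

Inside getSpecial, nodes[i].textContent would then become getText(nodes[i]) in both places where it is read.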
