
Regex to find <a> tags containing links to specific file types

I am trying to write a small jQuery / javascript function that searches through all the links on a page, identifies the type of file to which the tag links, and then adds an appropriate class. The purpose of this task is to style the links depending on the type of file at the other end of the link.

So far I have this:

$(document).ready(function(){
    $('#rt-mainbody a').each(function(){
        linkURL = $(this).attr('href');
        var match = linkURL.match("^.*\.(pdf|PDF)$");
        if(match != null){$(this).addClass('pdf');}
    });
});

Fiddle me this.

And then I would continue the concept to identify, for example, spreadsheet files, Word documents, text files, jpgs, etc.

It works, but to me this is super clunky because I have completely botched it together from odds and sods I've found around SO and the internet - I'm sure there must be a neater, more efficient, more readable way of doing this but I have no idea what it might be. Can someone give it a spit and polish for me, please?

Ideally the function should detect (a) that the extension is at the end of the href string, and (b) that the extension is preceded by a dot.

Thanks! :)

EDIT

Wow! Such a response! :) Thanks guys!

When I saw the method using simply the selector it was a bit of a facepalm moment - however the end user I am building this app for is linking to PDFs (and potentially other MIMEs) on a multitude of resource websites and has no control over the case usage of the filenames to which they'll be linking... using the selector is clearly not the way to go because the result would be so inconsistent.

EDIT

And the grand prize goes to @Dave Stein!! :D

The solution I will adopt is a "set it and leave it" script ( fiddle me that ) which will accommodate any extension, regardless of case, and all I need to do is tweak the CSS for each reasonable eventuality.

It's actually nice to learn that I was already fairly close to the best solution... more through good luck than good judgement though XD

Well, you don't want to use regex to search strings, so I like that you narrowed it to just links. I saved off $(this) so you don't have to call it twice. I also changed the regex so it's case insensitive. And lastly I made sure that the class being added is whatever the match was. Does this accomplish what you want?

$(document).ready(function(){
    $('#rt-mainbody a').each(function(){
        var $link = $(this),
            linkURL = $link.attr('href'),
            // I can't remember offhand but I think some extensions have numbers too
            match = linkURL.match( /^.*\.([a-z0-9]+)$/i );

        if( match != null ){
          $link.addClass( match[1].toLowerCase() );
        }
    });
});

Oh and I almost forgot, I made sure linkURL was no longer global. :)
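
One thing the loop above does not guard against is an anchor with no href at all: .attr('href') then returns undefined, and calling .match on it throws. A minimal sketch of the same matching logic pulled into a plain function, assuming a hypothetical helper name extensionClass that is not part of the original answer:

```javascript
// Hypothetical helper: returns the lowercased extension of an href, or null.
function extensionClass(href) {
  // .attr('href') yields undefined for <a> tags without an href,
  // and calling .match on undefined would throw a TypeError.
  if (typeof href !== 'string') {
    return null;
  }
  // Same idea as the regex above: a dot, then letters/digits, at the very end.
  var match = href.match(/\.([a-z0-9]+)$/i);
  return match ? match[1].toLowerCase() : null;
}

// Possible use inside the jQuery loop (needs jQuery and a DOM):
// $('#rt-mainbody a').each(function () {
//   var cls = extensionClass($(this).attr('href'));
//   if (cls) { $(this).addClass(cls); }
// });
```

Note that, like the original, this treats whatever follows the last dot as the extension, so a bare link such as http://example.com would get the class "com".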

Use the "attribute ends with" selector:

$('#rt-mainbody a[href$=".pdf"], #rt-mainbody a[href$=".PDF"]').addClass('pdf')

EDIT: Or more generally and flexibly:

var types = {
  doc: ['doc', 'docx'],
  pdf: ['pdf'],
  // ...
};

function addLinkClasses(ancestor, types) {
  var $ancestor = $(ancestor);
  $.each(types, function(type, extensions) {
    var selector = $.map(extensions, function(extension) {
        return 'a[href$=".' + extension + '"]';
      }).join(', ');
    $ancestor.find(selector).addClass(type);
  });
}

addLinkClasses('#rt-mainbody', types);

This is case sensitive, so I suggest you canonicalise all extensions to lowercase on your server.
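
If lowercasing on the server is not an option, one partial workaround is to generate both the all-lowercase and all-uppercase variant of each selector. The sketch below only builds the selector strings, so it runs without jQuery; buildSelectors is a hypothetical name, and mixed-case extensions such as ".Pdf" would still be missed:

```javascript
// Hypothetical helper: builds one selector string per class name, covering
// the all-lowercase and all-uppercase spelling of every extension.
function buildSelectors(types) {
  var selectors = {};
  Object.keys(types).forEach(function (type) {
    var parts = [];
    types[type].forEach(function (extension) {
      parts.push('a[href$=".' + extension.toLowerCase() + '"]');
      parts.push('a[href$=".' + extension.toUpperCase() + '"]');
    });
    selectors[type] = parts.join(', ');
  });
  return selectors;
}

var selectors = buildSelectors({ doc: ['doc', 'docx'], pdf: ['pdf'] });
// selectors.pdf is 'a[href$=".pdf"], a[href$=".PDF"]'
```

Each resulting string could then be passed to $ancestor.find(...) in place of the selector built inside addLinkClasses.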

The regex should be /^.*\.(pdf)$/i

Use this regex (without quotes):

/\.(pdf|doc)$/i

This regex matches (case-insensitively) anything that ends with .pdf or .doc; add further alternatives inside the parentheses for other file types.
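
As the list of file types grows, the same pattern could be generated from an array rather than typed by hand. A small sketch, assuming a hypothetical helper extensionRegex and assuming the extensions are plain letters and digits (nothing that needs escaping in a regex):

```javascript
// Hypothetical helper: builds /\.(pdf|doc|...)$/i from a list of extensions.
// Assumes the extensions contain no regex metacharacters.
function extensionRegex(extensions) {
  return new RegExp('\\.(' + extensions.join('|') + ')$', 'i');
}

var re = extensionRegex(['pdf', 'doc']);
var samples = ['manual.pdf', 'LETTER.DOC', 'pdf', 'notes.pdf.bak', 'mydoc'];
var results = samples.map(function (s) { return re.test(s); });
// results: [true, true, false, false, false]
```

The last three samples fail because the extension must be preceded by a dot and must sit at the very end of the string.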

For a dynamic class:

    var match = linkURL.match(/\.(pdf|doc)$/i);
    match = match ? match[1].toLowerCase() : null;
    if (match != null) {
        $(this).addClass(match);
    }

You can use this in a selector (it finds all links that point to PDF files):

a[href$=".pdf"]

Another answer, building off of @Amadan is:

var extensions = [
  'pdf',
  'jpg',
  'doc'
];

$.each( extensions, function( i, v ) {
  // v is the current extension; match both its lowercase and uppercase form
  $('#rt-mainbody').find( 'a[href$=".' + v + '"], a[href$=".' + v.toUpperCase() + '"]')
  .addClass( v );
});

The only suggestion I would make is that you can change your match to inspect what the file extension is, instead of having to do a different regex search for each possible file extension:

var linkURL = $(this).attr('href');  // <-- you had accidentally declared linkURL as a global, BTW.
var match = linkURL.match(/\.([^.\/]+)$/);  // capture only the text after the last dot
if(match != null){
   //we can extract the part between the parens in our regex
   var ext = match[1].toLowerCase();
   switch(ext){
      case 'pdf': $(this).addClass('pdf'); break;
      case 'jpg': $(this).addClass('jpg'); break;
      //...
   }
}

This switch statement is mostly useful if you want the option of using class names that are different from your file extensions. If the class name is always the same as the file extension, you can consider changing the regex to something that matches only the file extensions you want

/\.(pdf|jpg|txt)$/i  //i for "case insensitive"

and then just do

var ext = match[1].toLowerCase();
$(this).addClass(ext);
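
When class names do differ from the extensions, a lookup object may read more easily than a long switch. The following is a sketch only; the class names "image" and "word" and the helper classForHref are made up for illustration:

```javascript
// Hypothetical mapping from file extension to CSS class name.
var classForExtension = {
  pdf: 'pdf',
  jpg: 'image',
  jpeg: 'image',
  doc: 'word',
  docx: 'word'
};

// Returns the class for an href, or null if the extension is not in the map.
function classForHref(href) {
  var match = typeof href === 'string' ? href.match(/\.([a-z0-9]+)$/i) : null;
  if (!match) {
    return null;
  }
  var ext = match[1].toLowerCase();
  // hasOwnProperty guards against inherited keys such as "constructor".
  return Object.prototype.hasOwnProperty.call(classForExtension, ext)
    ? classForExtension[ext]
    : null;
}
```

Inside the .each loop this could be used as var cls = classForHref($(this).attr('href')); followed by if (cls) { $(this).addClass(cls); }.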

