Globbing using braces on Ruby 1.9.3

Recent versions of Ruby support braces in globbing if you pass the File::FNM_EXTGLOB option.

From the 2.2.0 documentation:

File.fnmatch('c{at,ub}s', 'cats', File::FNM_EXTGLOB) #=> true  # { } is supported on FNM_EXTGLOB

However, the 1.9.3 documentation says it isn't supported in 1.9.3:

File.fnmatch('c{at,ub}s', 'cats')       #=> false # { } isn't supported

(Also, trying to use File::FNM_EXTGLOB raised a NameError.)

Is there any way to glob using braces in Ruby 1.9.3, such as a third-party gem?

The strings I want to match against are from S3, not a local file system, so as far as I know I can't just ask the operating system to do the globbing.

I'm in the process of packaging up a Ruby Backport for brace globbing support. Here are the essential parts of that solution:

module File::Constants
  FNM_EXTGLOB = 0x10
end

class << File
  def fnmatch_with_braces_glob(pattern, path, flags =0)
    regex = glob_convert(pattern, flags)

    return regex && path.match(regex).to_s == path
  end

  def fnmatch_with_braces_glob?(pattern, path, flags =0)
    return fnmatch_with_braces_glob(pattern, path, flags)
  end

private
  def glob_convert(pattern, flags)
    brace_exp = (flags & File::FNM_EXTGLOB) != 0
    pathnames = (flags & File::FNM_PATHNAME) != 0
    dot_match = (flags & File::FNM_DOTMATCH) != 0
    no_escape = (flags & File::FNM_NOESCAPE) != 0
    casefold = (flags & File::FNM_CASEFOLD) != 0
    syscase = (flags & File::FNM_SYSCASE) != 0
    special_chars = ".*?\\[\\]{},.+()|$^\\\\" + (pathnames ? "/" : "")
    special_chars_regex = Regexp.new("[#{special_chars}]")

    if pattern.length == 0 || !pattern.index(special_chars_regex)
      return Regexp.new(pattern, casefold || syscase ? Regexp::IGNORECASE : 0)
    end

    # Convert glob to regexp and escape regexp characters
    length = pattern.length
    start = 0
    brace_depth = 0
    new_pattern = ""
    char = "/"

    loop do
      path_start = !dot_match && char[-1] == "/"

      index = pattern.index(special_chars_regex, start)

      if index
        new_pattern += pattern[start...index] if index > start
        char = pattern[index]

        snippet = case char
        when "?"  then path_start ? (pathnames ? "[^./]" : "[^.]") : ( pathnames ? "[^/]" : ".")
        when "."  then "\\."
        when "{"  then (brace_exp && (brace_depth += 1) >= 1) ? "(?:" : "{"
        when "}"  then (brace_exp && (brace_depth -= 1) >= 0) ? ")" : "}"
        when ","  then (brace_exp && brace_depth > 0) ? "|" : ","
        when "/"  then "/"
        when "\\"
          if !no_escape && index + 1 < length
            next_char = pattern[index += 1]
            special_chars.include?(next_char) ? "\\#{next_char}" : next_char
          else
            "\\\\"
          end
        when "*"
          if index+1 < length && pattern[index+1] == "*"
            char += "*"
            if pathnames && index+2 < length && pattern[index+2] == "/"
              char += "/"
              index += 2
              "(?:(?:#{path_start ? '[^.]' : ''}[^\/]*?\\#{File::SEPARATOR})(?:#{!dot_match ? '[^.]' : ''}[^\/]*?\\#{File::SEPARATOR})*?)?"
            else
              index += 1
              "(?:#{path_start ? '[^.]' : ''}(?:[^\\#{File::SEPARATOR}]*?\\#{File::SEPARATOR}?)*?)?"
            end
          else
            path_start ? (pathnames ? "(?:[^./][^/]*?)?" : "(?:[^.].*?)?") : (pathnames ? "[^/]*?" : ".*?")
          end
        when "["
          # Handle character set inclusion / exclusion
          start_index = index
          end_index = pattern.index(']', start_index+1)
          while end_index && pattern[end_index-1] == "\\"
            end_index = pattern.index(']', end_index+1)
          end
          if end_index
            index = end_index
            char_set = pattern[start_index..end_index]
            char_set.delete!('/') if pathnames
            char_set[1] = '^' if char_set[1] == '!'
            (char_set == "[]" || char_set == "[^]") ? "" : char_set
          else
            "\\["
          end
        else
          "\\#{char}"
        end

        new_pattern += snippet
      else
        if start < length
          snippet = pattern[start..-1]
          new_pattern += snippet
        end
      end

      break if !index
      start = index + 1
    end

    begin
      return Regexp.new("\\A#{new_pattern}\\z", casefold || syscase ? Regexp::IGNORECASE : 0)
    rescue
      return nil
    end
  end
end

This solution takes into account the various flags available for the File::fnmatch function, and translates the glob pattern into a Regexp that implements the same matching rules. With this solution, these tests can be run successfully:
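To make the glob-to-Regexp idea easier to follow, here is a much smaller sketch of the same technique. It is not the backport code: the helper name simple_glob_to_regexp is hypothetical, it handles only *, ? and {a,b}, it ignores every fnmatch flag, and it assumes the braces in the pattern are balanced.

```ruby
# Hypothetical helper for illustration only (not part of the backport).
# Translates a tiny glob subset (*, ?, {a,b}) into an anchored Regexp.
# Assumes braces are balanced; no fnmatch flags are honoured.
def simple_glob_to_regexp(glob)
  depth = 0 # how many brace groups are currently open
  body = glob.each_char.map do |ch|
    case ch
    when "*" then ".*"
    when "?" then "."
    when "{"
      depth += 1
      "(?:" # open a non-capturing alternation group
    when "}"
      if depth > 0
        depth -= 1
        ")"
      else
        Regexp.escape(ch) # stray closing brace stays literal
      end
    when ","
      depth > 0 ? "|" : "," # a comma separates alternatives only inside braces
    else
      Regexp.escape(ch)
    end
  end.join
  Regexp.new("\\A#{body}\\z")
end

regex = simple_glob_to_regexp('c{at,ub}s')
puts regex.match('cats') ? "match" : "no match"
puts regex.match('cots') ? "match" : "no match"
```

The full version above does the same kind of translation, but additionally tracks path separators, leading dots, escapes and character sets according to the flags.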

File.fnmatch('c{at,ub}s', 'cats', File::FNM_EXTGLOB)
#=> true
File.fnmatch('file{*.doc,*.pdf}', 'filename.doc')
#=> false
File.fnmatch('file{*.doc,*.pdf}', 'filename.doc', File::FNM_EXTGLOB)
#=> true
File.fnmatch('f*l?{[a-z].doc,[0-9].pdf}', 'filex.doc', File::FNM_EXTGLOB)
#=> true
File.fnmatch('**/.{pro,}f?l*', 'home/.profile', File::FNM_EXTGLOB | File::FNM_DOTMATCH)
#=> true

The fnmatch_with_braces_glob method (and its ? variant) will be patched in place of fnmatch, so that Ruby 2.0.0-compliant code will work with earlier Ruby versions as well. For clarity, the code shown above does not include some performance improvements, argument checking, or the Backports feature detection and patch-in code; these will be included in the actual submission to the project.
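As a rough idea of what the omitted feature detection and patch-in step might look like, here is a sketch. It is an assumption about the approach, not the actual Backports code: it checks the constant before anything else is defined, and the alias name fnmatch_without_braces_glob is made up for illustration. On a Ruby that already has File::FNM_EXTGLOB it does nothing.

```ruby
# Sketch only, not the actual Backports patch-in code.
# Run this check BEFORE defining File::Constants::FNM_EXTGLOB yourself,
# otherwise the constant will always appear to be present.
native_extglob = File.const_defined?(:FNM_EXTGLOB)

# ... on an old Ruby, the backport code shown above would be loaded here ...

if !native_extglob && File.respond_to?(:fnmatch_with_braces_glob)
  class << File
    # Keep the original around under a hypothetical name.
    alias_method :fnmatch_without_braces_glob, :fnmatch
    # Route fnmatch and fnmatch? through the brace-aware versions.
    alias_method :fnmatch, :fnmatch_with_braces_glob
    alias_method :fnmatch?, :fnmatch_with_braces_glob?
  end
end

# Either natively or through the patch, this call should now accept braces.
puts File.fnmatch('c{at,ub}s', 'cats', File::FNM_EXTGLOB)
```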

I'm still testing some edge cases and heavily optimizing performance; it should be ready to submit very soon. Once it's available in an official Backports release, I'll update the status here.

Note that Dir::glob support will be coming at the same time as well.

That was a fun Ruby exercise! No idea if this solution is robust enough for you, but here goes:

class File
  class << self
    def fnmatch_extglob(pattern, path, flags=0)
      explode_extglob(pattern).any?{|exploded_pattern|
        fnmatch(exploded_pattern,path,flags)
      }
    end

    def explode_extglob(pattern)
      if match=pattern.match(/\{([^{}]+)}/) then
        subpatterns = match[1].split(',',-1)
        subpatterns.map{|subpattern| explode_extglob(match.pre_match+subpattern+match.post_match)}.flatten
      else
        [pattern]
      end
    end
  end
end

Better testing is needed, but it seems to work fine for simple cases:

[2] pry(main)> File.explode_extglob('c{at,ub}s')
=> ["cats", "cubs"]
[3] pry(main)> File.explode_extglob('c{at,ub}{s,}')
=> ["cats", "cat", "cubs", "cub"]
[4] pry(main)> File.explode_extglob('{a,b,c}{d,e,f}{g,h,i}')
=> ["adg", "adh", "adi", "aeg", "aeh", "aei", "afg", "afh", "afi", "bdg", "bdh", "bdi", "beg", "beh", "bei", "bfg", "bfh", "bfi", "cdg", "cdh", "cdi", "ceg", "ceh", "cei", "cfg", "cfh", "cfi"]
[5] pry(main)> File.explode_extglob('{a,b}c*')
=> ["ac*", "bc*"]
[6] pry(main)> File.fnmatch('c{at,ub}s', 'cats')
=> false
[7] pry(main)> File.fnmatch_extglob('c{at,ub}s', 'cats')
=> true
[8] pry(main)> File.fnmatch_extglob('c{at,ub}s*', 'catsssss')
=> true
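Since the question is about matching S3 keys rather than local files, here is a sketch of how the same brace expansion could be used to filter a plain list of strings. The function names expand_braces and glob_match? and the sample keys are made up for illustration; the expansion logic is a standalone restatement of explode_extglob that does not reopen File.

```ruby
# Standalone restatement of the brace expansion above, under hypothetical names.
# Expands the innermost {a,b,...} group first, then recurses on each result.
def expand_braces(pattern)
  match = pattern.match(/\{([^{}]+)\}/)
  return [pattern] unless match

  # split with -1 keeps empty alternatives such as the second one in {s,}
  match[1].split(",", -1).flat_map do |alternative|
    expand_braces(match.pre_match + alternative + match.post_match)
  end
end

# True if any expanded pattern matches the string under plain fnmatch.
def glob_match?(pattern, string, flags = 0)
  expand_braces(pattern).any? { |expanded| File.fnmatch(expanded, string, flags) }
end

# Made-up key names standing in for strings fetched from S3.
keys = ["logs/2015/app.log", "logs/2015/app.log.gz", "images/logo.png"]

matching = keys.select do |key|
  glob_match?("logs/*/app.{log,log.gz}", key, File::FNM_PATHNAME)
end
puts matching
```

File.fnmatch only compares strings and never touches the file system, so this works on any list of key names.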

Tested with Ruby 1.9.3, 2.1.5, and 2.2.1.
