
My code is confusing an input file name for a regex expression

My regex does not explicitly include a dash in a character range, but my code fails when the input file name looks like this:

Rage Against The Machine - 1996 - Bulls On Parade [Maxi-Single]

Here is my code:

def find_cue_files(path):
  found_files = []
  for root, dirs, files in os.walk(path):
    if files:
      fcue = glob(os.path.join(root, '*.[Cc][Uu][Ee]')) # this is line 81 in my source file (mentioned in the traceback)
      # do a few other things...
  return found_files

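This part of the file name seems clearly to be the problem: [Maxi-Single]

How can I handle file names like this so that they are treated as fixed strings rather than as part of a regex?

(This is not my main question, but in case it is relevant, I am open to trying another approach to case-insensitive searching. I have looked at several Stack Overflow questions on the topic, but so far I have not found a solution that seems to fit this situation.)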

Here is my error:

Traceback (most recent call last):

  File "/usr/bin/xonsh", line 33, in <module>
    sys.exit(load_entry_point('xonsh==0.10.0', 'console_scripts', 'xonsh')())
  File "/usr/lib/python3.9/site-packages/xonsh/__amalgam__.py", line 21336, in main
    _failback_to_other_shells(args, err)
  File "/usr/lib/python3.9/site-packages/xonsh/__amalgam__.py", line 21283, in _failback_to_other_shells
    raise err
  File "/usr/lib/python3.9/site-packages/xonsh/__amalgam__.py", line 21334, in main
    sys.exit(main_xonsh(args))
  File "/usr/lib/python3.9/site-packages/xonsh/__amalgam__.py", line 21388, in main_xonsh
    run_script_with_cache(
  File "/usr/lib/python3.9/site-packages/xonsh/__amalgam__.py", line 3285, in run_script_with_cache
    run_compiled_code(ccode, glb, loc, mode)
  File "/usr/lib/python3.9/site-packages/xonsh/__amalgam__.py", line 3190, in run_compiled_code
    func(code, glb, loc)
  File "process_audio_files.xsh", line 160, in <module>
    cue_files = find_cue_files(dest_path)
  File "process_audio_files.xsh", line 81, in find_cue_files
    fcue = glob(os.path.join(root, '*.[Cc][Uu][Ee]'))
  File "/usr/lib/python3.9/glob.py", line 22, in glob
    return list(iglob(pathname, recursive=recursive))
  File "/usr/lib/python3.9/glob.py", line 74, in _iglob
    for dirname in dirs:
  File "/usr/lib/python3.9/glob.py", line 75, in _iglob
    for name in glob_in_dir(dirname, basename, dironly):
  File "/usr/lib/python3.9/glob.py", line 86, in _glob1
    return fnmatch.filter(names, pattern)
  File "/usr/lib/python3.9/fnmatch.py", line 58, in filter
    match = _compile_pattern(pat)
  File "/usr/lib/python3.9/fnmatch.py", line 52, in _compile_pattern
    return re.compile(res).match
  File "/usr/lib/python3.9/re.py", line 252, in compile
    return _compile(pattern, flags)
  File "/usr/lib/python3.9/re.py", line 304, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/usr/lib/python3.9/sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "/usr/lib/python3.9/sre_parse.py", line 948, in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
  File "/usr/lib/python3.9/sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "/usr/lib/python3.9/sre_parse.py", line 834, in _parse
    p = _parse_sub(source, state, sub_verbose, nested + 1)
  File "/usr/lib/python3.9/sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "/usr/lib/python3.9/sre_parse.py", line 598, in _parse
    raise source.error(msg, len(this) + 1 + len(that))
re.error: bad character range i-S at position 70

Edit: I tried using re.escape, as referenced here: https://docs.python.org/3/library/re.html

def find_cue_files(path):
  found_files = []
  for root, dirs, files in os.walk(path):
    if files:
      root2 = re.escape(root)
      fcue = glob(os.path.join(root2, '*.[Cc][Uu][Ee]')) 
      # do a few other things...
  return found_files

It handled the earlier file name, but now the input file name Aerosmith - Aerosmith (2014) [24-96 HD] produces the same error at the same point in my modified code.
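A note on the re.escape attempt: re.escape adds backslashes for regex syntax, but glob patterns do not treat a backslash as an escape character, so the square brackets in the directory name are likely still read as a character class. The standard library has glob.escape for this case; it escapes `*`, `?` and `[` so they match literally. Below is a minimal sketch under that assumption. The function name find_cue_files_escaped is hypothetical, and collecting the matches into found_files stands in for the elided "do a few other things" step.

```python
import glob
import os


def find_cue_files_escaped(path):
    """Hypothetical variant: collect .cue files, treating directory names literally."""
    found_files = []
    for root, dirs, files in os.walk(path):
        # glob.escape neutralises *, ? and [ in the directory part only;
        # the '*.[Cc][Uu][Ee]' suffix stays a real glob pattern.
        pattern = os.path.join(glob.escape(root), '*.[Cc][Uu][Ee]')
        found_files.extend(glob.glob(pattern))
    return found_files
```

Only root is escaped, so the case-insensitive extension match from the original code keeps working.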

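Rather than using glob with an awkward file pattern passed through root, just filter the names that os.walk already gives you and then prepend root. One possible one-liner: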

fcue=list(map(lambda x: os.path.join(root,x), (f for f in files if f.lower().endswith('.cue'))))
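For context, here is a sketch of how that filtering idea might look inside the whole function. It never builds a glob or regex pattern from the directory name, so brackets and dashes in the path cannot be misread. The function name find_cue_files_by_suffix is hypothetical, and appending the matches to found_files is an assumption about what the elided part of the original code does.

```python
import os


def find_cue_files_by_suffix(path):
    """Hypothetical variant: collect .cue files by comparing lowercased names."""
    found_files = []
    for root, dirs, files in os.walk(path):
        # os.walk already lists the file names, so no pattern matching is needed;
        # lower() makes the extension check case-insensitive.
        found_files.extend(
            os.path.join(root, name)
            for name in files
            if name.lower().endswith('.cue')
        )
    return found_files
```

Calling find_cue_files_by_suffix(dest_path) would return full paths for names such as album.cue or ALBUM.CUE, whatever characters the containing directory has in its name.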
