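JavaScript regex for extracting filename from Content-Disposition header
The Content-Disposition header contains a filename that can be extracted easily, but sometimes the value is wrapped in double quotes, sometimes it has no quotes, and there are probably other variants as well. Can someone write a regex that works in all of these cases?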
Content-Disposition: attachment; filename=content.txt
Here are some possible target strings:
attachment; filename=content.txt
attachment; filename*=UTF-8''filename.txt
attachment; filename="EURO rates"; filename*=utf-8''%e2%82%ac%20rates
attachment; filename="omáèka.jpg"
Some other combinations may also occur.
You could try something in this spirit:
filename[^;=\n]*=((['"]).*?\2|[^;\n]*)
filename # match filename, followed by
[^;=\n]* # anything but a ;, a = or a newline
=
( # first capturing group
(['"]) # either single or double quote, put it in capturing group 2
.*? # anything up until the first...
\2 # matching quote (single if we found single, double if we find double)
| # OR
[^;\n]* # anything but a ; or a newline
)
Your filename is in the first capturing group: http://regex101.com/r/hJ7tS6
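To make it concrete what "the first capturing group" contains, here is a minimal sketch that runs the pattern above over the sample headers from the question. The names `filenamePattern`, `sampleHeaders`, `rawFilename` and `rawResults` are made up for this illustration and are not part of the original answer.

```javascript
// Robin's pattern, unchanged. Group 1 is the raw value, group 2 the opening quote (if any).
const filenamePattern = /filename[^;=\n]*=((['"]).*?\2|[^;\n]*)/;

// Sample headers taken from the question.
const sampleHeaders = [
  'attachment; filename=content.txt',
  "attachment; filename*=UTF-8''filename.txt",
  'attachment; filename="EURO rates"; filename*=utf-8\'\'%e2%82%ac%20rates',
  'attachment; filename="omáèka.jpg"',
];

// Hypothetical helper: returns capturing group 1, or null when nothing matches.
function rawFilename(header) {
  const match = filenamePattern.exec(header);
  return match ? match[1] : null;
}

const rawResults = sampleHeaders.map(rawFilename);
console.log(rawResults);
```

Note that group 1 is the raw value: for the quoted headers it still includes the surrounding quotes, and for the `filename*` header it still includes the `UTF-8''` prefix, so some post-processing is usually needed.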
Slightly modified to match my use case (strips all quotes and the UTF marker):
filename\*?=['"]?(?:UTF-\d['"]*)?([^;\r\n"']*)['"]?;?
/filename[^;=\n]*=(?:(\\?['"])(.*?)\1|(?:[^\s]+'.*?')?([^;\n]*))/i
https://regex101.com/r/hJ7tS6/51
Edit: You can also use this parser: https://github.com/Rob--W/open-in-browser/blob/master/extension/content-disposition.js
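If a full parser is more than you need, a rough sketch of the usual precedence might look like the following. It is not taken from the linked parser; `filenameFromContentDisposition` is a hypothetical helper. It assumes that `filename*` should win over `filename` when both are present (which is what RFC 6266 recommends) and that only UTF-8 values need percent-decoding.

```javascript
// Hypothetical helper, not part of any library.
function filenameFromContentDisposition(header) {
  // Extended form: filename*=charset'language'percent-encoded-value
  const extended = /filename\*\s*=\s*([\w-]+)'[^']*'([^;\s]+)/i.exec(header);
  if (extended && extended[1].toLowerCase() === 'utf-8') {
    try {
      return decodeURIComponent(extended[2]);
    } catch (e) {
      // Malformed percent-encoding: fall through to the plain parameter.
    }
  }
  // Plain form: filename="quoted value" or filename=token
  const plain = /filename\s*=\s*(?:"([^"]*)"|([^;\s]+))/i.exec(header);
  if (plain) return plain[1] !== undefined ? plain[1] : plain[2];
  return null;
}

console.log(filenameFromContentDisposition(
  'attachment; filename="EURO rates"; filename*=utf-8\'\'%e2%82%ac%20rates'
));
```

This sketch deliberately ignores backslash-escaped quotes inside quoted values and charsets other than UTF-8; the parser linked above is the safer choice when those matter.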
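Disclaimer: the following answer only works with PCRE-style engines (e.g. Python / PHP). If you have to use JavaScript, use Robin's answer.
This modified version of Robin's regex strips the quotes: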
filename[^;\n=]*=(['\"])*(.*)(?(1)\1|)
filename # match filename, followed by
[^;=\n]* # anything but a ;, a = or a newline
=
(['"])* # either single or double quote, put it in capturing group 1
(?:utf-8\'\')? # removes the utf-8 part from the match
(.*) # second capturing group, will contain the filename
(?(1)\1|) # if clause: if first capturing group is not empty,
# match it again (the quotes), else match nothing
https://regex101.com/r/hJ7tS6/28
The filename is in the second capturing group.
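For readers who have to stay in JavaScript: its RegExp has no `(?(1)...)` conditional, but a similar quote-stripping effect can be approximated with an alternation and a backreference. This is only a sketch, and `quoteStrippingPattern` and `unquotedFilename` are names invented for the example.

```javascript
// Quoted value (captured without its quotes) OR an unquoted run up to ';' or a newline.
const quoteStrippingPattern = /filename[^;=\n]*=(?:(['"])(.*?)\1|([^;\n]*))/i;

// Hypothetical helper: returns the value without surrounding quotes, or null.
function unquotedFilename(header) {
  const m = quoteStrippingPattern.exec(header);
  if (!m) return null;
  // Group 2 is set by the quoted branch, group 3 by the unquoted one.
  return m[2] !== undefined ? m[2] : m[3];
}

console.log(unquotedFilename('attachment; filename="EURO rates"'));
console.log(unquotedFilename("attachment; filename='a b.txt'"));
console.log(unquotedFilename('attachment; filename=content.txt'));
```

Unlike the PCRE version, the result can come from either of two groups, so the caller has to pick whichever one is defined.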
Here is my regex. It works in JavaScript.
filename\*?=((['"])[\s\S]*?\2|[^;\n]*)
I used this one in my project.
filename[^;\n]*=(UTF-\d['"]*)?((['"]).*?[.]$\2|[^;\n]*)?
I have upgraded Robin's solution to do two more things:
This is an ECMAScript solution.
I made a regular expression that finds these names using a named group called filename:
/(?<=filename(?:=|\*=(?:[\w\-]+'')))["']?(?<filename>[^"';\n]+)["']?/g
// Same pattern as above; the named group "filename" holds the extracted value.
const regex = /(?<=filename(?:=|\*=(?:[\w\-]+'')))["']?(?<filename>[^"';\n]+)["']?/g

// Sample headers from the question, one per line.
const filenames = `
attachment; filename=content.txt
attachment; filename*=UTF-8''filename.txt
attachment; filename="EURO rates"; filename*=utf-8''%e2%82%ac%20rates
attachment; filename="omáèka.jpg"
`

function logMatches() {
  const array = []
  filenames.split("\n").forEach(line => {
    // Skip blank lines.
    if (!line.trim()) return
    const matches = line.matchAll(regex)
    const groups = Array.from(matches).map(match => match?.groups?.filename)
    // Keep a single name as a string, several names as an array.
    array.push(groups.length === 1 ? groups[0] : groups)
  })
  console.log(array)
}

logMatches()