
Determine if string is in base64 using JavaScript

I'm using the window.atob('string') function to decode a string from base64. Is there any way to check that 'string' is actually valid base64? I would like to be notified if the string is not base64 so I can perform a different action.

If you want to check whether it can be decoded, you can simply try decoding it and see whether that fails:

try {
    window.atob(str);
} catch(e) {
    // something failed

    // if you want to be specific and only catch the error which means
    // the base 64 was invalid, then check for 'e.code === 5'.
    // (because 'DOMException.INVALID_CHARACTER_ERR === 5')
}
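As a minimal sketch, the try/catch above can be wrapped into a boolean helper. The name canDecodeBase64 is made up for illustration, and the global atob is assumed to be available (browsers, or Node.js 16 and later).

```javascript
// Hypothetical helper: returns true when atob() can decode the input.
function canDecodeBase64(str) {
    try {
        atob(str);
        return true;
    } catch (e) {
        // atob signals bad input with a DOMException named
        // 'InvalidCharacterError' (legacy code 5).
        if (e && (e.name === 'InvalidCharacterError' || e.code === 5)) {
            return false;
        }
        throw e; // anything else is unexpected, so let it propagate
    }
}

console.log(canDecodeBase64('SGVsbG8='));
console.log(canDecodeBase64('***'));
```

Note that this only tells you whether atob accepts the input, which is a more lenient notion than "strictly well-formed base64".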

Building on @anders-marzi-tornblad's answer, using the regex to make a simple true/false test for base64 validity is as easy as follows:

var base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;

base64regex.test("SomeStringObviouslyNotBase64Encoded...");             // FALSE
base64regex.test("U29tZVN0cmluZ09idmlvdXNseU5vdEJhc2U2NEVuY29kZWQ=");   // TRUE

Update 2021

  • Following the comments below, it transpires that this regex-based solution provides a more accurate check than simply try-ing atob, because the latter doesn't check for = padding. According to RFC 4648, = padding may only be ignored for base16 encoding or if the data length is known implicitly.
  • The regex-based solution also seems to be the fastest, as hinted by kai. As jsperf seems flaky at the moment, I made a new test on jsbench which confirms this.
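The padding difference can be seen by running both checks on the same input. This is a rough sketch: atobAccepts is a hypothetical helper, and the global atob is assumed to follow the lenient decoding rules of the HTML specification, under which missing padding is tolerated.

```javascript
const base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;

// Hypothetical helper: true when atob() does not throw on the input.
function atobAccepts(str) {
    try {
        atob(str);
        return true;
    } catch (e) {
        return false;
    }
}

const padded = 'YQ==';  // "a", with padding
const unpadded = 'YQ';  // same data, padding stripped

// Print [regex result, atob result] for each input.
console.log(padded, base64regex.test(padded), atobAccepts(padded));
console.log(unpadded, base64regex.test(unpadded), atobAccepts(unpadded));
```

The regex rejects the unpadded form because it requires the input length to be a multiple of four.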

This should do the trick.

function isBase64(str) {
    if (str ==='' || str.trim() ===''){ return false; }
    try {
        return btoa(atob(str)) == str;
    } catch (err) {
        return false;
    }
}

If "valid" means "only has base64 chars in it", then check against /^[A-Za-z0-9+/=]*$/ (anchored, so that every character has to match).

If "valid" means a "legal" base64-encoded string, then you should also check for the = padding at the end.

If "valid" means it's something reasonable after decoding, then it requires domain knowledge.
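The first two meanings can be told apart in code. A rough sketch, where onlyBase64Chars and isWellFormedBase64 are hypothetical names chosen for illustration:

```javascript
// Meaning 1: every character belongs to the base64 alphabet (or is '=').
function onlyBase64Chars(str) {
    return /^[A-Za-z0-9+/=]*$/.test(str);
}

// Meaning 2: additionally, the length is a multiple of 4 and '=' appears
// only as up to two trailing padding characters.
function isWellFormedBase64(str) {
    return str.length % 4 === 0 && /^[A-Za-z0-9+/]*={0,2}$/.test(str);
}

console.log(onlyBase64Chars('A=B='), isWellFormedBase64('A=B='));
console.log(onlyBase64Chars('QUI='), isWellFormedBase64('QUI='));
```

Both helpers accept the empty string; whether that should count as valid depends on the use case.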

I would use a regular expression for that. Try this one:

/^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/

Explanation:

^                          # Start of input
([0-9a-zA-Z+/]{4})*        # Groups of 4 valid characters decode
                           # to 24 bits of data for each group
(                          # Either ending with:
    ([0-9a-zA-Z+/]{2}==)   # two valid characters followed by ==
    |                      # , or
    ([0-9a-zA-Z+/]{3}=)    # three valid characters followed by =
)?                         # , or nothing
$                          # End of input

This method attempts to decode, then re-encode, and compares the result to the original. It could also be combined with the other answers for environments that throw on parsing errors. It's also possible to have a string that looks like valid base64 from a regex point of view but is not actual base64.

if(btoa(atob(str))==str){
  //...
}
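One case where the two checks disagree is a string whose unused trailing bits are not zero. In the sketch below, roundTrips is a hypothetical helper and the global atob/btoa are assumed to be available. 'YR==' matches the regex and decodes to "a", but re-encoding "a" yields 'YQ==', so the round trip fails.

```javascript
const base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;

// Hypothetical helper: decode, re-encode, and compare with the input.
function roundTrips(str) {
    try {
        return btoa(atob(str)) === str;
    } catch (e) {
        return false; // atob rejected the input outright
    }
}

const canonical = 'YQ==';    // what btoa('a') produces
const nonCanonical = 'YR=='; // same decoded byte, stray bits in the last character

console.log(base64regex.test(canonical), roundTrips(canonical));
console.log(base64regex.test(nonCanonical), roundTrips(nonCanonical));
```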

This is how it's done in one of my favorite validation libs:

const notBase64 = /[^A-Z0-9+\/=]/i;

export default function isBase64(str) {
  assertString(str); // remove this line and make sure you pass in a string
  const len = str.length;
  if (!len || len % 4 !== 0 || notBase64.test(str)) {
    return false;
  }
  const firstPaddingChar = str.indexOf('=');
  return firstPaddingChar === -1 ||
    firstPaddingChar === len - 1 ||
    (firstPaddingChar === len - 2 && str[len - 1] === '=');
}

https://github.com/chriso/validator.js/blob/master/src/lib/isBase64.js

As there are mostly two possibilities posted here (regex vs. try/catch), I compared the performance of both: https://jsperf.com/base64-check/

The regex solution seems to be much faster and the clear winner. I'm not sure whether the regex catches all cases, but in my tests it worked perfectly.

Thanks to @Philzen for the regex!

P.S.

In case someone is interested in finding the fastest way to safely decode a base64 string (that's how I came here): https://jsperf.com/base64-decoding-check

For me, a string is likely base64-encoded if:

  1. its length is divisible by 4
  2. it uses only A-Z, a-z, 0-9, + and /, plus = for padding
  3. it only uses = at the end (0-2 chars)

so the code would be

function isBase64(str)
{
    return str.length % 4 == 0 && /^[A-Za-z0-9+/]+[=]{0,2}$/.test(str);
}

Implementation in Node.js (validates not just the allowed characters, but whether the whole string is valid base64):


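    const validateBase64 = function (encoded1) {
        // Decode to raw bytes. Going through a 'utf8' string here would
        // corrupt any bytes that are not valid UTF-8 and break the comparison.
        const decoded1 = Buffer.from(encoded1, 'base64');
        // Re-encode the same bytes and compare with the input.
        const encoded2 = decoded1.toString('base64');
        return encoded1 === encoded2;
    };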

I have tried the answers below, but there are some issues.

var base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;
base64regex.test(value)

With this regex, the test is also true for strings of capital letters such as "BBBB", and for "4444".

I added some code to make it work correctly for me.

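// The function name is illustrative; the original snippet was anonymous.
function decodeIfBase64(value) {
  var base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;
  // Skip purely numeric and purely alphabetic strings, which the regex alone accepts.
  if (base64regex.test(value) && isNaN(value) && !/^[a-zA-Z]+$/.test(value)) {
    return decodeURIComponent(escape(window.atob(value)));
  }
  // Falls through and returns undefined when the value is not treated as base64.
}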

Throwing my results into the fray here. In my case, there was a string that was not meant to be base64 but was syntactically valid base64, so it was getting decoded into gibberish. (For example, yyyyyyyy is valid base64 according to the usual regex.)

My testing resulted in first checking whether the string is a valid base64 string, using the regex others shared here, and then decoding it and testing whether the result is a valid ASCII string, since (in my case) I should only get ASCII characters back. (This can probably be extended to include other characters that may not fall into the ASCII range.)

This is a bit of a mix of multiple answers.

let base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;
function isBase64(str) {
    if (str ==='' || str.trim() ===''){ return false; }
    try {
        if (base64regex.test(str)) {
            return /^[\x00-\x7F]*$/.test(atob(str));
        } else {
            return false
        }
    } catch (err) {
        return false; // atob threw, so the string is not decodable
    }
}

As always with my JavaScript answers, I have no idea what I am doing, so there might be a better way to write this. But it works for my needs and covers the case where a string isn't supposed to be base64 but is syntactically valid and still decodes as base64.
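If the decoded data is expected to be UTF-8 text rather than plain ASCII, the same idea can be sketched with TextDecoder in fatal mode, which throws on byte sequences that are not valid UTF-8. The name isBase64Utf8Text is hypothetical, and the global atob and TextDecoder are assumed to be available (browsers, or a recent Node.js).

```javascript
const base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;

// Hypothetical helper: well-formed base64 whose bytes are valid UTF-8.
function isBase64Utf8Text(str) {
    if (!str || !base64regex.test(str)) {
        return false;
    }
    try {
        // atob returns one character per byte; turn that into real bytes.
        const bytes = Uint8Array.from(atob(str), (c) => c.charCodeAt(0));
        // With { fatal: true }, decode() throws on malformed UTF-8.
        new TextDecoder('utf-8', { fatal: true }).decode(bytes);
        return true;
    } catch (err) {
        return false;
    }
}

console.log(isBase64Utf8Text('SGVsbG8=')); // encodes "Hello"
console.log(isBase64Utf8Text('yyyyyyyy')); // well-formed base64, arbitrary bytes
```

This is still a heuristic: short strings can happen to decode to valid text, so some domain knowledge about the expected content remains necessary.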

I know it's late, but I tried to make it simple here:

function isBase64(encodedString) {
    var regexBase64 = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;
    return regexBase64.test(encodedString);   // return TRUE if its base64 string.
}
