
How do I make toLowerCase() and toUpperCase() consistent across browsers

Are there JavaScript polyfills for String.prototype.toLowerCase() and String.prototype.toUpperCase(), or other JavaScript methods that handle Unicode characters and behave consistently across browsers?

Background info

The following gives different results across browsers, and even between browser versions (e.g. Firefox 54 vs. 55):

document.write(String.fromCodePoint(223).normalize("NFKC").toLowerCase().toUpperCase().toLowerCase())

Firefox 55 gives you ss; Firefox 54 gives you ß.

Generally this is fine, and mechanisms such as locales handle a lot of the cases you'd want. However, consistent behavior across platforms matters in some situations, for example when talking to BaaS systems: it can greatly simplify interactions where you're essentially processing internal data on the client.

Note that this issue only seems to affect outdated versions of Firefox, so unless you explicitly need to support those old versions, you could choose not to bother at all. The behavior for your example is the same in all modern browsers (since the change in Firefox). This can be verified using jsvu + eshost:

$ jsvu # Update installed JavaScript engine binaries to the latest version.

$ eshost -e '"\xDF".normalize("NFKC").toLowerCase().toUpperCase().toLowerCase()'
#### Chakra
ss

#### V8 --harmony
ss

#### JavaScriptCore
ss

#### V8
ss

#### SpiderMonkey
ss

#### xs
ss

But you asked how to solve this problem, so let's continue.

Step 4 of https://tc39.github.io/ecma262/#sec-string.prototype.tolowercase states:

Let cuList be a List where the elements are the result of toLowercase(cpList), according to the Unicode Default Case Conversion algorithm.

This Unicode Default Case Conversion algorithm is specified in section 3.13 Default Case Algorithms of the Unicode standard.

The full case mappings for Unicode characters are obtained by using the mappings from SpecialCasing.txt plus the mappings from UnicodeData.txt, excluding any of the latter mappings that would conflict. Any character that does not have a mapping in these files is considered to map to itself.

[…]

The following rules specify the default case conversion operations for Unicode strings. These rules use the full case conversion operations, Uppercase_Mapping(C), Lowercase_Mapping(C), and Titlecase_Mapping(C), as well as the context-dependent mappings based on the casing context, as specified in Table 3-17.

For a string X:

  • R1 toUppercase(X): Map each character C in X to Uppercase_Mapping(C).
  • R2 toLowercase(X): Map each character C in X to Lowercase_Mapping(C).

Here's an example from SpecialCasing.txt, with my annotation added below:

00DF  ; 00DF   ; 0053 0073; 0053 0053;                      # LATIN SMALL LETTER SHARP S
<code>; <lower>; <title>  ; <upper>  ; (<condition_list>;)? # <comment>

This line says that U+00DF ('ß') lowercases to U+00DF ('ß') and uppercases to U+0053 U+0053 ('SS').
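A line in that format could be parsed roughly as follows. This is only a sketch: `parseSpecialCasingLine` is a hypothetical helper name, and it assumes you want to skip every entry that carries a condition list (such as Final_Sigma or locale-specific rules), since those cannot be applied character by character.

```javascript
// Hypothetical helper: parse one line of SpecialCasing.txt.
// Returns null for blank lines, comment lines, and conditional mappings.
function parseSpecialCasingLine(line) {
  const data = line.split('#')[0].trim(); // drop the trailing comment
  if (data === '') return null;
  // Field order: <code>; <lower>; <title>; <upper>; (<condition_list>;)?
  const fields = data.split(';').map((field) => field.trim());
  const [code, lower, title, upper, conditions] = fields;
  if (conditions) return null; // assumption: ignore context-dependent entries
  // Turn a space-separated list of hex code points into a string.
  const decode = (hexList) =>
    String.fromCodePoint(...hexList.split(/\s+/).map((hex) => parseInt(hex, 16)));
  return {
    character: decode(code),
    lower: decode(lower),
    title: decode(title),
    upper: decode(upper),
  };
}

const entry = parseSpecialCasingLine(
  '00DF; 00DF; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S'
);
console.log(entry.upper); // 'SS'
```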

Here's an example from UnicodeData.txt, with my annotation added below:

0041  ; LATIN CAPITAL LETTER A; Lu;0;L;;;;;N;;;        ; 0061   ;
<code>; <name>                ; <ignore>      ; <upper>; <lower>; <title>

This line says that U+0041 ('A') lowercases to U+0061 ('a'). It doesn't have an explicit uppercase mapping, meaning it uppercases to itself.

Here's another example from UnicodeData.txt:

0061  ; LATIN SMALL LETTER A; Ll;0;L;;;;;N;;; 0041   ;        ; 0041
<code>; <name>              ; <ignore>      ; <upper>; <lower>; <title>

This line says that U+0061 ('a') uppercases to U+0041 ('A'). It doesn't have an explicit lowercase mapping, meaning it lowercases to itself.
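Reading the simple mappings out of such a line might look like the sketch below. `parseUnicodeDataLine` is a hypothetical name, and the code assumes the field order documented for UnicodeData.txt, where the simple uppercase, lowercase, and titlecase mappings are the last three semicolon-separated fields (indices 12, 13, and 14 when counting from 0).

```javascript
// Hypothetical helper: extract the simple case mappings from one line of
// UnicodeData.txt. An empty mapping field means the character maps to itself.
function parseUnicodeDataLine(line) {
  const fields = line.split(';').map((field) => field.trim());
  const decode = (hex) => (hex ? String.fromCodePoint(parseInt(hex, 16)) : null);
  const character = decode(fields[0]);
  return {
    character,
    upper: decode(fields[12]) || character, // simple uppercase mapping
    lower: decode(fields[13]) || character, // simple lowercase mapping
  };
}

const small = parseUnicodeDataLine('0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041');
console.log(small.upper); // 'A'
console.log(small.lower); // 'a'
```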

You could write a script that parses these two files, reads each line following these examples, and builds lowercase/uppercase mappings. You could then turn those mappings into a small JavaScript library that provides spec-compliant toLowerCase / toUpperCase functionality.
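Once both sets of mappings exist, combining them could look something like this. It is a minimal sketch, not a complete implementation: `createCaseMapper` is a made-up name, the two Maps are assumed to come from parsing UnicodeData.txt and SpecialCasing.txt, and context-dependent rules are ignored. Entries from SpecialCasing.txt win over conflicting simple mappings, and unmapped characters map to themselves.

```javascript
// Hypothetical helper: build a case-mapping function from two Maps of
// character -> replacement string.
function createCaseMapper(simpleMappings, specialMappings) {
  // Later entries overwrite earlier ones, so special mappings take precedence.
  const full = new Map([...simpleMappings, ...specialMappings]);
  return (string) =>
    // Array.from iterates by code point, not by UTF-16 code unit.
    Array.from(string, (character) =>
      full.has(character) ? full.get(character) : character
    ).join('');
}

// Tiny hand-picked excerpt of the real data, for illustration only.
const toLowercase = createCaseMapper(
  new Map([['A', 'a'], ['\u0130', 'i']]), // simple mappings (UnicodeData.txt)
  new Map([['\u0130', 'i\u0307']])        // full mapping (SpecialCasing.txt)
);
console.log(toLowercase('A\u0130b') === 'ai\u0307b'); // true
```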

This seems like a lot of work. Depending on the old behavior in Firefox and what exactly changed (?) you could probably limit the work to just the special mappings in SpecialCasing.txt. (I'm assuming that only the special casings changed in Firefox 55, based on the example you provided.)

// Instead of…
function normalize(string) {
  const normalized = string.normalize('NFKC');
  const lowercased = normalized.toLowerCase();
  return lowercased;
}

// …one could do something like:
function lowerCaseSpecialCases(string) {
  // TODO: replace all SpecialCasing.txt characters with their lowercase
  // mapping.
  return string.replace(/TODO/g, fn);
}
function normalize(string) {
  const normalized = string.normalize('NFKC');
  const fixed = lowerCaseSpecialCases(normalized); // Workaround for old Firefox 54 behavior.
  const lowercased = fixed.toLowerCase();
  return lowercased;
}

I wrote a script that parses SpecialCasing.txt and generates a JS library that implements the lowerCaseSpecialCases functionality mentioned above (as toLower) as well as toUpper. Here it is: https://gist.github.com/mathiasbynens/a37e3f3138069729aa434ea90eea4a3c

Depending on your exact use case, you might not need toUpper and its corresponding regex and map at all. Here's the full generated library:

const reToLower = /[\u0130\u1F88-\u1F8F\u1F98-\u1F9F\u1FA8-\u1FAF\u1FBC\u1FCC\u1FFC]/g;
const toLowerMap = new Map([
  ['\u0130', 'i\u0307'],
  ['\u1F88', '\u1F80'],
  ['\u1F89', '\u1F81'],
  ['\u1F8A', '\u1F82'],
  ['\u1F8B', '\u1F83'],
  ['\u1F8C', '\u1F84'],
  ['\u1F8D', '\u1F85'],
  ['\u1F8E', '\u1F86'],
  ['\u1F8F', '\u1F87'],
  ['\u1F98', '\u1F90'],
  ['\u1F99', '\u1F91'],
  ['\u1F9A', '\u1F92'],
  ['\u1F9B', '\u1F93'],
  ['\u1F9C', '\u1F94'],
  ['\u1F9D', '\u1F95'],
  ['\u1F9E', '\u1F96'],
  ['\u1F9F', '\u1F97'],
  ['\u1FA8', '\u1FA0'],
  ['\u1FA9', '\u1FA1'],
  ['\u1FAA', '\u1FA2'],
  ['\u1FAB', '\u1FA3'],
  ['\u1FAC', '\u1FA4'],
  ['\u1FAD', '\u1FA5'],
  ['\u1FAE', '\u1FA6'],
  ['\u1FAF', '\u1FA7'],
  ['\u1FBC', '\u1FB3'],
  ['\u1FCC', '\u1FC3'],
  ['\u1FFC', '\u1FF3']
]);
const toLower = (string) => string.replace(reToLower, (match) => toLowerMap.get(match));

const reToUpper = /[\xDF\u0149\u01F0\u0390\u03B0\u0587\u1E96-\u1E9A\u1F50\u1F52\u1F54\u1F56\u1F80-\u1FAF\u1FB2-\u1FB4\u1FB6\u1FB7\u1FBC\u1FC2-\u1FC4\u1FC6\u1FC7\u1FCC\u1FD2\u1FD3\u1FD6\u1FD7\u1FE2-\u1FE4\u1FE6\u1FE7\u1FF2-\u1FF4\u1FF6\u1FF7\u1FFC\uFB00-\uFB06\uFB13-\uFB17]/g;
const toUpperMap = new Map([
  ['\xDF', 'SS'],
  ['\uFB00', 'FF'],
  ['\uFB01', 'FI'],
  ['\uFB02', 'FL'],
  ['\uFB03', 'FFI'],
  ['\uFB04', 'FFL'],
  ['\uFB05', 'ST'],
  ['\uFB06', 'ST'],
  ['\u0587', '\u0535\u0552'],
  ['\uFB13', '\u0544\u0546'],
  ['\uFB14', '\u0544\u0535'],
  ['\uFB15', '\u0544\u053B'],
  ['\uFB16', '\u054E\u0546'],
  ['\uFB17', '\u0544\u053D'],
  ['\u0149', '\u02BCN'],
  ['\u0390', '\u0399\u0308\u0301'],
  ['\u03B0', '\u03A5\u0308\u0301'],
  ['\u01F0', 'J\u030C'],
  ['\u1E96', 'H\u0331'],
  ['\u1E97', 'T\u0308'],
  ['\u1E98', 'W\u030A'],
  ['\u1E99', 'Y\u030A'],
  ['\u1E9A', 'A\u02BE'],
  ['\u1F50', '\u03A5\u0313'],
  ['\u1F52', '\u03A5\u0313\u0300'],
  ['\u1F54', '\u03A5\u0313\u0301'],
  ['\u1F56', '\u03A5\u0313\u0342'],
  ['\u1FB6', '\u0391\u0342'],
  ['\u1FC6', '\u0397\u0342'],
  ['\u1FD2', '\u0399\u0308\u0300'],
  ['\u1FD3', '\u0399\u0308\u0301'],
  ['\u1FD6', '\u0399\u0342'],
  ['\u1FD7', '\u0399\u0308\u0342'],
  ['\u1FE2', '\u03A5\u0308\u0300'],
  ['\u1FE3', '\u03A5\u0308\u0301'],
  ['\u1FE4', '\u03A1\u0313'],
  ['\u1FE6', '\u03A5\u0342'],
  ['\u1FE7', '\u03A5\u0308\u0342'],
  ['\u1FF6', '\u03A9\u0342'],
  ['\u1F80', '\u1F08\u0399'],
  ['\u1F81', '\u1F09\u0399'],
  ['\u1F82', '\u1F0A\u0399'],
  ['\u1F83', '\u1F0B\u0399'],
  ['\u1F84', '\u1F0C\u0399'],
  ['\u1F85', '\u1F0D\u0399'],
  ['\u1F86', '\u1F0E\u0399'],
  ['\u1F87', '\u1F0F\u0399'],
  ['\u1F88', '\u1F08\u0399'],
  ['\u1F89', '\u1F09\u0399'],
  ['\u1F8A', '\u1F0A\u0399'],
  ['\u1F8B', '\u1F0B\u0399'],
  ['\u1F8C', '\u1F0C\u0399'],
  ['\u1F8D', '\u1F0D\u0399'],
  ['\u1F8E', '\u1F0E\u0399'],
  ['\u1F8F', '\u1F0F\u0399'],
  ['\u1F90', '\u1F28\u0399'],
  ['\u1F91', '\u1F29\u0399'],
  ['\u1F92', '\u1F2A\u0399'],
  ['\u1F93', '\u1F2B\u0399'],
  ['\u1F94', '\u1F2C\u0399'],
  ['\u1F95', '\u1F2D\u0399'],
  ['\u1F96', '\u1F2E\u0399'],
  ['\u1F97', '\u1F2F\u0399'],
  ['\u1F98', '\u1F28\u0399'],
  ['\u1F99', '\u1F29\u0399'],
  ['\u1F9A', '\u1F2A\u0399'],
  ['\u1F9B', '\u1F2B\u0399'],
  ['\u1F9C', '\u1F2C\u0399'],
  ['\u1F9D', '\u1F2D\u0399'],
  ['\u1F9E', '\u1F2E\u0399'],
  ['\u1F9F', '\u1F2F\u0399'],
  ['\u1FA0', '\u1F68\u0399'],
  ['\u1FA1', '\u1F69\u0399'],
  ['\u1FA2', '\u1F6A\u0399'],
  ['\u1FA3', '\u1F6B\u0399'],
  ['\u1FA4', '\u1F6C\u0399'],
  ['\u1FA5', '\u1F6D\u0399'],
  ['\u1FA6', '\u1F6E\u0399'],
  ['\u1FA7', '\u1F6F\u0399'],
  ['\u1FA8', '\u1F68\u0399'],
  ['\u1FA9', '\u1F69\u0399'],
  ['\u1FAA', '\u1F6A\u0399'],
  ['\u1FAB', '\u1F6B\u0399'],
  ['\u1FAC', '\u1F6C\u0399'],
  ['\u1FAD', '\u1F6D\u0399'],
  ['\u1FAE', '\u1F6E\u0399'],
  ['\u1FAF', '\u1F6F\u0399'],
  ['\u1FB3', '\u0391\u0399'],
  ['\u1FBC', '\u0391\u0399'],
  ['\u1FC3', '\u0397\u0399'],
  ['\u1FCC', '\u0397\u0399'],
  ['\u1FF3', '\u03A9\u0399'],
  ['\u1FFC', '\u03A9\u0399'],
  ['\u1FB2', '\u1FBA\u0399'],
  ['\u1FB4', '\u0386\u0399'],
  ['\u1FC2', '\u1FCA\u0399'],
  ['\u1FC4', '\u0389\u0399'],
  ['\u1FF2', '\u1FFA\u0399'],
  ['\u1FF4', '\u038F\u0399'],
  ['\u1FB7', '\u0391\u0342\u0399'],
  ['\u1FC7', '\u0397\u0342\u0399'],
  ['\u1FF7', '\u03A9\u0342\u0399']
]);
const toUpper = (string) => string.replace(reToUpper, (match) => toUpperMap.get(match));
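If you only want to pay for the workaround in engines that need it, one option is to probe the native behavior first. The sketch below is an assumption-laden illustration: `hasSpecialCasingSupport` and `createUpperCaser` are hypothetical names, the single ß probe is assumed (not verified) to distinguish the old Firefox behavior, and `stubToUpper` merely stands in for the generated toUpper above so that the snippet runs on its own.

```javascript
// Hypothetical probe: does the engine apply the SpecialCasing.txt mapping for ß?
function hasSpecialCasingSupport() {
  return '\xDF'.toUpperCase() === 'SS';
}

// Returns an uppercasing function that pre-applies `specialCase` only when the
// native implementation appears to lack the special casings.
function createUpperCaser(specialCase) {
  if (hasSpecialCasingSupport()) {
    return (string) => string.toUpperCase();
  }
  return (string) => specialCase(string).toUpperCase();
}

// Stand-in for the generated toUpper, covering a single character.
const stubToUpper = (string) => string.replace(/\xDF/g, 'SS');

const upperCase = createUpperCaser(stubToUpper);
console.log(upperCase('stra\xDFe')); // 'STRASSE'
```

The same pattern would apply to toLower, with a probe based on one of the characters in reToLower.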
