How do I make toLowerCase() and toUpperCase() consistent across browsers
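Is there a JavaScript polyfill for String.prototype.toLowerCase() and String.prototype.toUpperCase(), or some other approach in JavaScript that works with Unicode characters and behaves consistently across browsers?
Background: the following produces different results across browsers, and even between versions of the same browser (for example, Firefox 54 vs. 55):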
    document.write(String.fromCodePoint(223).normalize("NFKC").toLowerCase().toUpperCase().toLowerCase())
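In Firefox 55, it gives you ss.
Normally this is fine, and mechanisms like locales handle many of the cases you want. However, when you need consistent behavior across platforms (for example, when talking to a BaaS system such as Google Cloud Firestore), consistent case conversion greatly simplifies interactions in which you are essentially processing internal data on the client.
Note that this issue appears to affect only old versions of Firefox, so unless you explicitly need to support those versions, you can choose not to bother at all. The behavior of your example is the same in all modern browsers (because Firefox changed). This can be verified using jsvu + eshost: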
    $ jsvu # Update installed JavaScript engine binaries to the latest version.

    $ eshost -e '"\xDF".normalize("NFKC").toLowerCase().toUpperCase().toLowerCase()'
    #### Chakra
    ss

    #### V8 --harmony
    ss

    #### JavaScriptCore
    ss

    #### V8
    ss

    #### SpiderMonkey
    ss

    #### xs
    ss
But you asked how to solve this, so let's continue.
Step 4 of https://tc39.github.io/ecma262/#sec-string.prototype.tolowercase states:
Let cuList be a List where the elements are the result of toLowercase(cpList), according to the Unicode Default Case Conversion algorithm.
This Unicode Default Case Conversion algorithm is specified in Section 3.13, Default Case Algorithms, of the Unicode Standard:
The full case mappings for Unicode characters are obtained by using the mappings from SpecialCasing.txt plus the mappings from UnicodeData.txt, excluding any of the latter mappings that would conflict. Any character that does not have a mapping in these files is considered to map to itself. […]
The following rules specify the default case conversion operations for Unicode strings. These rules use the full case conversion operations, Uppercase_Mapping(C), Lowercase_Mapping(C), and Titlecase_Mapping(C), as well as the context-dependent mappings based on the casing context, as specified in Table 3-17.
For a string X:
- R1 toUppercase(X): Map each character C in X to Uppercase_Mapping(C).
- R2 toLowercase(X): Map each character C in X to Lowercase_Mapping(C).
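To make rules R1 and R2 concrete, here is a minimal sketch of the "map each character, otherwise map to itself" idea. The two tiny mapping tables and the helper name `mapString` are made up for illustration; real tables would be generated from the Unicode data files, and the context-dependent mappings from Table 3-17 are deliberately left out.

```javascript
// Hypothetical, deliberately tiny mapping tables. Real tables would be
// generated from SpecialCasing.txt and UnicodeData.txt.
const uppercaseMapping = new Map([
  ['a', 'A'],
  ['\xDF', 'SS'], // U+00DF maps to two characters.
]);
const lowercaseMapping = new Map([
  ['A', 'a'],
  ['S', 's'],
]);

// Map each code point through the table. Characters without an entry
// map to themselves.
function mapString(string, mapping) {
  let result = '';
  for (const character of string) { // Iterates by code point, not code unit.
    result += mapping.has(character) ? mapping.get(character) : character;
  }
  return result;
}

const toUppercase = (string) => mapString(string, uppercaseMapping);
const toLowercase = (string) => mapString(string, lowercaseMapping);

console.log(toUppercase('a\xDF')); // 'ASS'
console.log(toLowercase(toUppercase('\xDF'))); // 'ss'
```

Note that the result can be longer than the input, which is why a simple one-to-one character table is not enough for full case conversion.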
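Here is an example from SpecialCasing.txt, with an annotation line added below it:

    00DF  ; 00DF   ; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S
    <code>; <lower>; <title>  ; <upper>  ; (<condition_list>;)? # <comment>

This line says that U+00DF (ß) lowercases to U+00DF (ß) and uppercases to U+0053 U+0053 (SS).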
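As a rough sketch of how such a line could be parsed, assuming the field layout shown in the annotation above: the function name `parseSpecialCasingLine` and the shape of the returned object are my own choices, not part of any library.

```javascript
// Convert a space-separated list of hex code points into a string.
// An empty field yields the empty string.
const hexListToString = (hexList) =>
  String.fromCodePoint(
    ...hexList.split(/\s+/).filter(Boolean).map((hex) => parseInt(hex, 16))
  );

// Parse one line of SpecialCasing.txt, laid out as:
// <code>; <lower>; <title>; <upper>; (<condition_list>;)? # <comment>
// Returns null for blank or comment-only lines.
function parseSpecialCasingLine(line) {
  const data = line.split('#')[0].trim(); // Drop the trailing comment.
  if (data === '') return null;
  const fields = data.split(';').map((field) => field.trim());
  return {
    character: hexListToString(fields[0]),
    lower: hexListToString(fields[1]),
    title: hexListToString(fields[2]),
    upper: hexListToString(fields[3]),
    // Non-empty for context- or language-sensitive mappings.
    condition: fields[4] || '',
  };
}

const entry = parseSpecialCasingLine(
  '00DF; 00DF; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S'
);
console.log(entry.upper); // 'SS'
```

Entries with a non-empty `condition` only apply in certain contexts or locales, so a locale-independent polyfill would probably want to skip them.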
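Here is an example from UnicodeData.txt, with an annotation line added below it:

    0041  ; LATIN CAPITAL LETTER A; Lu;0;L;;;;;N;;;        ; 0061   ;
    <code>; <name>                ; <ignore>      ; <upper>; <lower>; <title>

This line says that U+0041 (A) lowercases to U+0061 (a). It has no explicit uppercase mapping, which means it uppercases to itself.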
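Here is another example from UnicodeData.txt:

    0061  ; LATIN SMALL LETTER A; Ll;0;L;;;;;N;;; 0041   ;        ; 0041
    <code>; <name>              ; <ignore>      ; <upper>; <lower>; <title>

This line says that U+0061 (a) uppercases to U+0041 (A). It has no explicit lowercase mapping, which means it lowercases to itself.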
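A UnicodeData.txt line could be read along the same lines. This is only a sketch: `parseUnicodeDataLine` is a name I made up, and it relies on the simple uppercase and lowercase mappings being the fields at zero-based index 12 and 13 of each semicolon-separated line.

```javascript
// Parse one line of UnicodeData.txt and extract the simple case mappings.
// Fields (zero-based): 0 = code point, 1 = name, 12 = simple uppercase,
// 13 = simple lowercase, 14 = simple titlecase.
function parseUnicodeDataLine(line) {
  const fields = line.split(';');
  const character = String.fromCodePoint(parseInt(fields[0], 16));
  // An empty mapping field means the character maps to itself.
  const mapped = (hex) =>
    hex === '' ? character : String.fromCodePoint(parseInt(hex, 16));
  return {
    character,
    name: fields[1],
    upper: mapped(fields[12]),
    lower: mapped(fields[13]),
  };
}

const capitalA = parseUnicodeDataLine(
  '0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;'
);
console.log(capitalA.lower); // 'a'
console.log(capitalA.upper); // 'A'
```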
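You could write a script that parses these two files, reads each line following the examples above, and builds lowercase/uppercase mappings. You could then turn those mappings into a small JavaScript library that provides spec-compliant toLowerCase/toUpperCase functionality.
This seems like a lot of work. Depending on the old behavior in Firefox and what exactly changed (?), you could probably limit the work to just the special mappings in SpecialCasing.txt.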
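One possible way to turn a finished mapping into a converter function is to build a regular expression character class from the map's keys and replace every match with its mapped value. `makeConverter` and `escapeForClass` below are hypothetical helper names, and the two-entry map is just a stand-in for a generated table.

```javascript
// Escape a single code point for use inside a character class of a
// regular expression that has the `u` flag.
const escapeForClass = (character) =>
  '\\u{' + character.codePointAt(0).toString(16) + '}';

// Build a function that replaces every key of `mapping` found in a string
// with the corresponding value. Keys are assumed to be single code points.
function makeConverter(mapping) {
  const characterClass = [...mapping.keys()].map(escapeForClass).join('');
  const pattern = new RegExp('[' + characterClass + ']', 'gu');
  return (string) => string.replace(pattern, (match) => mapping.get(match));
}

// Stand-in for a table generated from the Unicode data files.
const sampleUpperMap = new Map([
  ['\xDF', 'SS'],
  ['\uFB01', 'FI'],
]);
const applySpecialUppercase = makeConverter(sampleUpperMap);

console.log(applySpecialUppercase('stra\xDFe')); // 'straSSe'
```

Characters that are not in the map pass through untouched, so the result can still be handed to the native `toUpperCase()` afterwards.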
    // Instead of…
    function normalize(string) {
      const normalized = string.normalize('NFKC');
      const lowercased = normalized.toLowerCase();
      return lowercased;
    }

    // …one could do something like:
    function lowerCaseSpecialCases(string) {
      // TODO: replace all SpecialCasing.txt characters with their lowercase
      // mapping.
      return string.replace(/TODO/g, fn);
    }
    function normalize(string) {
      const normalized = string.normalize('NFKC');
      const fixed = lowerCaseSpecialCases(normalized); // Workaround for old Firefox 54 behavior.
      const lowercased = fixed.toLowerCase();
      return lowercased;
    }
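If you only want to pay for the workaround in engines that actually need it, a feature check might look like the sketch below. It compares the native results for a handful of characters against the values listed in SpecialCasing.txt. The function name `nativeCasingLooksCompliant` and the choice of sample characters are mine, and I have not verified which of these (if any) old Firefox versions got wrong, so treat the sample list as an assumption.

```javascript
// A few unconditional SpecialCasing.txt mappings to spot-check.
// Each entry: [input, expected lowercase, expected uppercase].
const samples = [
  ['\xDF', '\xDF', 'SS'],               // LATIN SMALL LETTER SHARP S
  ['\u0130', 'i\u0307', '\u0130'],      // LATIN CAPITAL LETTER I WITH DOT ABOVE
  ['\u1F88', '\u1F80', '\u1F08\u0399'], // GREEK CAPITAL ALPHA WITH PSILI AND PROSGEGRAMMENI
];

// Returns true if the native methods agree with the expected values for
// every sample. This is a spot check, not a full conformance test.
function nativeCasingLooksCompliant() {
  return samples.every(
    ([input, lower, upper]) =>
      input.toLowerCase() === lower && input.toUpperCase() === upper
  );
}

console.log(nativeCasingLooksCompliant());
```

A caller could then apply the special-case replacement step only when this check fails.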
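I wrote a script that parses SpecialCasing.txt and generates a small JavaScript library implementing the lowerCaseSpecialCases functionality mentioned above (as toLower), along with toUpper: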
    const reToLower = /[\u0130\u1F88-\u1F8F\u1F98-\u1F9F\u1FA8-\u1FAF\u1FBC\u1FCC\u1FFC]/g;
    const toLowerMap = new Map([
      ['\u0130', 'i\u0307'],
      ['\u1F88', '\u1F80'],
      ['\u1F89', '\u1F81'],
      ['\u1F8A', '\u1F82'],
      ['\u1F8B', '\u1F83'],
      ['\u1F8C', '\u1F84'],
      ['\u1F8D', '\u1F85'],
      ['\u1F8E', '\u1F86'],
      ['\u1F8F', '\u1F87'],
      ['\u1F98', '\u1F90'],
      ['\u1F99', '\u1F91'],
      ['\u1F9A', '\u1F92'],
      ['\u1F9B', '\u1F93'],
      ['\u1F9C', '\u1F94'],
      ['\u1F9D', '\u1F95'],
      ['\u1F9E', '\u1F96'],
      ['\u1F9F', '\u1F97'],
      ['\u1FA8', '\u1FA0'],
      ['\u1FA9', '\u1FA1'],
      ['\u1FAA', '\u1FA2'],
      ['\u1FAB', '\u1FA3'],
      ['\u1FAC', '\u1FA4'],
      ['\u1FAD', '\u1FA5'],
      ['\u1FAE', '\u1FA6'],
      ['\u1FAF', '\u1FA7'],
      ['\u1FBC', '\u1FB3'],
      ['\u1FCC', '\u1FC3'],
      ['\u1FFC', '\u1FF3']
    ]);
    const toLower = (string) => string.replace(reToLower, (match) => toLowerMap.get(match));

    const reToUpper = /[\xDF\u0149\u01F0\u0390\u03B0\u0587\u1E96-\u1E9A\u1F50\u1F52\u1F54\u1F56\u1F80-\u1FAF\u1FB2-\u1FB4\u1FB6\u1FB7\u1FBC\u1FC2-\u1FC4\u1FC6\u1FC7\u1FCC\u1FD2\u1FD3\u1FD6\u1FD7\u1FE2-\u1FE4\u1FE6\u1FE7\u1FF2-\u1FF4\u1FF6\u1FF7\u1FFC\uFB00-\uFB06\uFB13-\uFB17]/g;
    const toUpperMap = new Map([
      ['\xDF', 'SS'],
      ['\uFB00', 'FF'],
      ['\uFB01', 'FI'],
      ['\uFB02', 'FL'],
      ['\uFB03', 'FFI'],
      ['\uFB04', 'FFL'],
      ['\uFB05', 'ST'],
      ['\uFB06', 'ST'],
      ['\u0587', '\u0535\u0552'],
      ['\uFB13', '\u0544\u0546'],
      ['\uFB14', '\u0544\u0535'],
      ['\uFB15', '\u0544\u053B'],
      ['\uFB16', '\u054E\u0546'],
      ['\uFB17', '\u0544\u053D'],
      ['\u0149', '\u02BCN'],
      ['\u0390', '\u0399\u0308\u0301'],
      ['\u03B0', '\u03A5\u0308\u0301'],
      ['\u01F0', 'J\u030C'],
      ['\u1E96', 'H\u0331'],
      ['\u1E97', 'T\u0308'],
      ['\u1E98', 'W\u030A'],
      ['\u1E99', 'Y\u030A'],
      ['\u1E9A', 'A\u02BE'],
      ['\u1F50', '\u03A5\u0313'],
      ['\u1F52', '\u03A5\u0313\u0300'],
      ['\u1F54', '\u03A5\u0313\u0301'],
      ['\u1F56', '\u03A5\u0313\u0342'],
      ['\u1FB6', '\u0391\u0342'],
      ['\u1FC6', '\u0397\u0342'],
      ['\u1FD2', '\u0399\u0308\u0300'],
      ['\u1FD3', '\u0399\u0308\u0301'],
      ['\u1FD6', '\u0399\u0342'],
      ['\u1FD7', '\u0399\u0308\u0342'],
      ['\u1FE2', '\u03A5\u0308\u0300'],
      ['\u1FE3', '\u03A5\u0308\u0301'],
      ['\u1FE4', '\u03A1\u0313'],
      ['\u1FE6', '\u03A5\u0342'],
      ['\u1FE7', '\u03A5\u0308\u0342'],
      ['\u1FF6', '\u03A9\u0342'],
      ['\u1F80', '\u1F08\u0399'],
      ['\u1F81', '\u1F09\u0399'],
      ['\u1F82', '\u1F0A\u0399'],
      ['\u1F83', '\u1F0B\u0399'],
      ['\u1F84', '\u1F0C\u0399'],
      ['\u1F85', '\u1F0D\u0399'],
      ['\u1F86', '\u1F0E\u0399'],
      ['\u1F87', '\u1F0F\u0399'],
      ['\u1F88', '\u1F08\u0399'],
      ['\u1F89', '\u1F09\u0399'],
      ['\u1F8A', '\u1F0A\u0399'],
      ['\u1F8B', '\u1F0B\u0399'],
      ['\u1F8C', '\u1F0C\u0399'],
      ['\u1F8D', '\u1F0D\u0399'],
      ['\u1F8E', '\u1F0E\u0399'],
      ['\u1F8F', '\u1F0F\u0399'],
      ['\u1F90', '\u1F28\u0399'],
      ['\u1F91', '\u1F29\u0399'],
      ['\u1F92', '\u1F2A\u0399'],
      ['\u1F93', '\u1F2B\u0399'],
      ['\u1F94', '\u1F2C\u0399'],
      ['\u1F95', '\u1F2D\u0399'],
      ['\u1F96', '\u1F2E\u0399'],
      ['\u1F97', '\u1F2F\u0399'],
      ['\u1F98', '\u1F28\u0399'],
      ['\u1F99', '\u1F29\u0399'],
      ['\u1F9A', '\u1F2A\u0399'],
      ['\u1F9B', '\u1F2B\u0399'],
      ['\u1F9C', '\u1F2C\u0399'],
      ['\u1F9D', '\u1F2D\u0399'],
      ['\u1F9E', '\u1F2E\u0399'],
      ['\u1F9F', '\u1F2F\u0399'],
      ['\u1FA0', '\u1F68\u0399'],
      ['\u1FA1', '\u1F69\u0399'],
      ['\u1FA2', '\u1F6A\u0399'],
      ['\u1FA3', '\u1F6B\u0399'],
      ['\u1FA4', '\u1F6C\u0399'],
      ['\u1FA5', '\u1F6D\u0399'],
      ['\u1FA6', '\u1F6E\u0399'],
      ['\u1FA7', '\u1F6F\u0399'],
      ['\u1FA8', '\u1F68\u0399'],
      ['\u1FA9', '\u1F69\u0399'],
      ['\u1FAA', '\u1F6A\u0399'],
      ['\u1FAB', '\u1F6B\u0399'],
      ['\u1FAC', '\u1F6C\u0399'],
      ['\u1FAD', '\u1F6D\u0399'],
      ['\u1FAE', '\u1F6E\u0399'],
      ['\u1FAF', '\u1F6F\u0399'],
      ['\u1FB3', '\u0391\u0399'],
      ['\u1FBC', '\u0391\u0399'],
      ['\u1FC3', '\u0397\u0399'],
      ['\u1FCC', '\u0397\u0399'],
      ['\u1FF3', '\u03A9\u0399'],
      ['\u1FFC', '\u03A9\u0399'],
      ['\u1FB2', '\u1FBA\u0399'],
      ['\u1FB4', '\u0386\u0399'],
      ['\u1FC2', '\u1FCA\u0399'],
      ['\u1FC4', '\u0389\u0399'],
      ['\u1FF2', '\u1FFA\u0399'],
      ['\u1FF4', '\u038F\u0399'],
      ['\u1FB7', '\u0391\u0342\u0399'],
      ['\u1FC7', '\u0397\u0342\u0399'],
      ['\u1FF7', '\u03A9\u0342\u0399']
    ]);
    const toUpper = (string) => string.replace(reToUpper, (match) => toUpperMap.get(match));