Java: Which Locale should I specify when I call String toLowerCase?

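In Java, the String toLowerCase() method uses the system default locale to determine how lowercasing is handled. If I am lowercasing some ASCII text and want to be sure that it is processed as expected, which Locale should I use?

EDIT: I am mainly concerned with programming identifiers such as table and column names in a schema. As such, I want English lowercasing.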

Locale.ROOT is documented as the language/country-neutral locale for locale-sensitive operations.

Locale.ENGLISH is presumably a safe choice as well.


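Yes, Locale.ENGLISH is a safe choice for case operations on things like programming language identifiers and URL parts, because it involves no special casing rules and, under English casing, every 7-bit ASCII character converts to a 7-bit ASCII character.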
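As a minimal sketch of that advice, the helper below lowercases an identifier with an explicit locale so the result does not depend on how the machine is configured. The class name IdentifierCase and the method normalizeIdentifier are hypothetical names chosen for illustration; only String.toLowerCase(Locale) and the Locale constants come from the JDK.

```java
import java.util.Locale;

class IdentifierCase {
    // Hypothetical helper: lowercases a programming identifier without
    // consulting the JVM's default locale.
    static String normalizeIdentifier(String identifier) {
        return identifier.toLowerCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        Locale original = Locale.getDefault();
        try {
            // Simulate a machine whose default locale is Turkish.
            Locale.setDefault(Locale.forLanguageTag("tr-TR"));

            // Explicit locale: the result stays within ASCII.
            System.out.println(normalizeIdentifier("USER_ID").equals("user_id"));

            // Default locale: 'I' becomes dotless U+0131, so this differs.
            System.out.println("USER_ID".toLowerCase().equals("user_id"));
        } finally {
            // Restore the default so the rest of the JVM is unaffected.
            Locale.setDefault(original);
        }
    }
}
```

With a Turkish default locale, the first comparison should hold and the second should not, which is the reason to pass a locale explicitly for identifiers.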

This is not the case elsewhere. In Turkish, the "I" and "i" characters are not case-converted to one another.

"Dotted and dotless I" explains:

The Turkish alphabet, which is a variant of the Latin alphabet, includes two distinct versions of the letter I, one dotted and the other dotless.

In Unicode, U+0131 is a lower case letter dotless i (ı). U+0130 (İ) is capital i with dot. ISO-8859-9 has them at positions 0xFD and 0xDD respectively. In normal typography, when lower case i is combined with other diacritics, the dot is generally removed before the diacritic is added; however, Unicode still lists the equivalent combining sequences as including the dotted i, since logically it is the normal dotted i character that is being modified.

Most Unicode software uppercases ı to I and lowercases İ to i, but, unless specifically set up for Turkish, it lowercases I to i and uppercases i to I. Thus uppercasing then lowercasing, or vice versa, changes the letters.
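The behavior described in the quote can be checked with a short sketch like the one below. The class name TurkishCaseDemo and the helper roundTrip are hypothetical; the example assumes nothing beyond the standard String and Locale classes, and it writes the Turkish letters as Unicode escapes so the source file encoding does not matter.

```java
import java.util.Locale;

class TurkishCaseDemo {
    static final Locale TURKISH = Locale.forLanguageTag("tr");

    // Uppercases and then lowercases s under a single locale.
    static String roundTrip(String s, Locale locale) {
        return s.toUpperCase(locale).toLowerCase(locale);
    }

    public static void main(String[] args) {
        // Turkish rules pair I with dotless U+0131 and i with dotted U+0130.
        System.out.println("TITLE".toLowerCase(TURKISH).equals("t\u0131tle"));
        System.out.println("title".toUpperCase(TURKISH).equals("T\u0130TLE"));

        // Outside Turkish, dotless i does not survive a round trip:
        // it uppercases to I, which then lowercases to plain i.
        System.out.println(roundTrip("\u0131", Locale.ROOT).equals("i"));

        // Under Turkish rules the same round trip preserves the letter.
        System.out.println(roundTrip("\u0131", TURKISH).equals("\u0131"));
    }
}
```

Each line should print true: the first two show the Turkish case pairs, and the last two show that the round trip changes the letter only when the locale is not Turkish.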

The list of special exceptions is maintained at http://unicode.org/public/unidata/specialcasing.txt.

# ================================================================================

# Turkish and Azeri

# I and i-dotless; I-dot and i are case pairs in Turkish and Azeri
# The following rules handle those cases.

0130; 0069; 0130; 0130; tr; # LATIN CAPITAL LETTER I WITH DOT ABOVE
0130; 0069; 0130; 0130; az; # LATIN CAPITAL LETTER I WITH DOT ABOVE

# When lowercasing, remove dot_above in the sequence I + dot_above, which will turn into i.
# This matches the behavior of the canonically equivalent I-dot_above

0307; ; 0307; 0307; tr After_I; # COMBINING DOT ABOVE
0307; ; 0307; 0307; az After_I; # COMBINING DOT ABOVE

...
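To see the first pair of rules in the excerpt at work, the sketch below lowercases U+0130 under a Turkish locale and under Locale.ROOT and prints the resulting code points. The class name DottedCapitalIDemo and the helper codePoints are hypothetical names for illustration, and the comparison with Locale.ROOT reflects how the JDK applies the unconditional mapping for this character.

```java
import java.util.Locale;
import java.util.stream.Collectors;

class DottedCapitalIDemo {
    // Formats a string as a space-separated list of code points, e.g. "U+0069".
    static String codePoints(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format(Locale.ROOT, "U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        String capitalDottedI = "\u0130";

        // Turkish rule from the excerpt: 0130 lowercases to 0069.
        String turkish = capitalDottedI.toLowerCase(Locale.forLanguageTag("tr"));
        System.out.println(codePoints(turkish));

        // Without the Turkish rule, the dot is kept as COMBINING DOT ABOVE.
        String root = capitalDottedI.toLowerCase(Locale.ROOT);
        System.out.println(codePoints(root));
    }
}
```

Under the Turkish locale the result is the single code point U+0069, while under Locale.ROOT it is U+0069 followed by U+0307, so the lowercased string is one char longer than its input.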


If I am lowercasing some ASCII text and want to be sure that this is processed as expected which Locale should I use?

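That depends on what "as expected" means to you. The whole point of allowing a locale to be specified is that uppercasing and lowercasing do not work the same way in all languages, even when they use the same letters. So specify the locale in which you and/or your customers live, and it will probably work as you or they expect.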