What are the differences between utf8_general_ci and utf8_unicode_ci?

Possible Duplicate:
What's the difference between utf8_general_ci and utf8_unicode_ci

I have two Unicode options that look promising for a MySQL database.

utf8_general_ci    unicode (multilingual), case-insensitive
utf8_unicode_ci    unicode (multilingual), case-insensitive

Can you please explain the difference between utf8_general_ci and utf8_unicode_ci? What are the effects of choosing one over the other when designing a database?

Related discussion

utf8_general_ci is a very simple collation, and on Unicode a very broken one, that gives incorrect results on general Unicode text. What it does is:

converts to Unicode normalization form D for canonical decomposition
removes any combining characters
converts to upper case
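The three steps above can be sketched in a few lines of Python. This is only a rough approximation of the behavior described here, not MySQL's actual implementation; the helper name `general_ci_like_key` is made up for illustration, and the one-to-one uppercasing rule is an assumption.

```python
import unicodedata

def general_ci_like_key(text):
    """Rough approximation of the three steps listed above (hypothetical helper)."""
    # Step 1: canonical decomposition (Unicode normalization form D).
    decomposed = unicodedata.normalize("NFD", text)
    # Step 2: drop combining characters (accents and other marks).
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Step 3: uppercase one character at a time, keeping only one-to-one mappings.
    return "".join(ch.upper() if len(ch.upper()) == 1 else ch for ch in stripped)

print(general_ci_like_key("résumé") == general_ci_like_key("RESUME"))  # True
```

Accented Latin letters collapse onto their base letters, which is why such a scheme looks fine on simple Western European text and falls apart elsewhere.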

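This does not work correctly on Unicode, because it does not understand Unicode casing. Unicode casing alone is much more complicated than an ASCII-minded approach can handle. For example: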

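The lowercase of "ẞ" is "ß", but the uppercase of "ß" is "SS".
There are two lowercase Greek sigmas, but only one uppercase one; consider "Σίσυφος".
Letters like "ø" do not decompose to an "o" plus a diacritic, meaning that they won't sort correctly.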
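These casing and decomposition quirks are easy to check with Python's built-in Unicode support. The snippet below is just a quick demonstration of the Unicode facts themselves; it says nothing about how MySQL implements either collation, and the variable names are arbitrary.

```python
import unicodedata

# "ẞ" (U+1E9E) lowercases to "ß" (U+00DF), but "ß" uppercases to two letters.
capital_sharp_s_lower = "\u1e9e".lower()
sharp_s_upper = "\u00df".upper()
print(capital_sharp_s_lower == "\u00df")  # True
print(sharp_s_upper)                      # SS

# Both lowercase sigmas, "σ" and word-final "ς", share one uppercase "Σ".
sigmas_share_upper = "σ".upper() == "ς".upper() == "Σ"
print(sigmas_share_upper)                 # True

# "ø" has no canonical decomposition, unlike "ö" (o + combining diaeresis).
o_slash_decomposition = unicodedata.decomposition("ø")
o_umlaut_decomposition = unicodedata.decomposition("ö")
print(o_slash_decomposition == "")        # True
print(o_umlaut_decomposition)             # 006F 0308
```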

There are many other subtleties.

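utf8_unicode_ci uses the standard Unicode Collation Algorithm and supports so-called expansions and ligatures. For example, the German letter ß (U+00DF LATIN SMALL LETTER SHARP S) is sorted near "ss", and the letter Œ (U+0152 LATIN CAPITAL LIGATURE OE) is sorted near "OE".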

utf8_general_ci does not support expansions or ligatures; it sorts all of these letters as single characters, and sometimes in the wrong order.
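To make the idea of an expansion concrete, here is a toy model in Python. The tables `EXPANSIONS` and `ONE_TO_ONE` and the function `toy_key` are invented for illustration; real collations use weight tables, not string substitution. The only collation facts assumed are that ß compares as "ss" under utf8_unicode_ci and as a single "s" under utf8_general_ci.

```python
# Hypothetical expansion table: one character compares as several.
EXPANSIONS = {"ß": "ss", "Œ": "oe", "œ": "oe"}
# Stand-in for a collation limited to one-to-one comparisons.
ONE_TO_ONE = {"ß": "s"}

def toy_key(text, table):
    """Build a case-insensitive comparison key using the given mapping table."""
    return "".join(table.get(ch, ch.lower()) for ch in text)

# With expansions, "Straße" and "STRASSE" compare as equal.
print(toy_key("Straße", EXPANSIONS) == toy_key("STRASSE", EXPANSIONS))  # True
# Without them, they do not.
print(toy_key("Straße", ONE_TO_ONE) == toy_key("STRASSE", ONE_TO_ONE))  # False

words = ["Oz", "Œuvre", "Of"]
print(sorted(words, key=lambda w: toy_key(w, EXPANSIONS)))  # ['Œuvre', 'Of', 'Oz']
print(sorted(words, key=lambda w: toy_key(w, ONE_TO_ONE)))  # ['Of', 'Oz', 'Œuvre']
```

In the second sort, "Œuvre" lands after "Oz" because the ligature is treated as one character with its own position rather than as "oe".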

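utf8_unicode_ci is generally more accurate for all scripts. Take the Cyrillic block as an example: utf8_unicode_ci is fine for Russian, Bulgarian, Belarusian, Macedonian, Serbian, and Ukrainian, while utf8_general_ci is fine only for the Russian and Bulgarian subset of Cyrillic. The extra letters used in Belarusian, Macedonian, Serbian, and Ukrainian are not sorted well.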

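The cost of utf8_unicode_ci is that it is a little bit slower than utf8_general_ci. But that's the price you pay for correctness. Either you can have a fast answer that's wrong, or a very slightly slower answer that's right. Your choice. It is very hard to justify giving wrong answers, so it's best to assume that utf8_general_ci doesn't exist and to always use utf8_unicode_ci. Well, unless you want wrong answers.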

Source: http://forums.mysql.com/read.php?103,187048,188748#msg-188748

Related discussion

From the MySQL documentation, Unicode Character Sets:

For any Unicode character set, operations performed using the _general_ci collation are faster than those for the _unicode_ci collation. For example, comparisons for the utf8_general_ci collation are faster, but slightly less correct, than comparisons for utf8_unicode_ci. The reason for this is that utf8_unicode_ci supports mappings such as expansions; that is, when one character compares as equal to combinations of other characters. For example, in German and some other languages "ß" is equal to "ss". utf8_unicode_ci also supports contractions and ignorable characters. utf8_general_ci is a legacy collation that does not support expansions, contractions, or ignorable characters. It can make only one-to-one comparisons between characters.