Unicode surrogates character encoding c#
I have a problem with Unicode characters. When I try to encode surrogate characters (in
I am using the following code:
Encoding code:
    string unicodeChars = "a\uD800\uDA65";
    FileStream stream = new FileStream(@"unicode_encoding.txt", FileMode.Create, FileAccess.Write);
    byte[] buffer = Encoding.Unicode.GetBytes(unicodeChars);
    stream.Write(buffer, 0, buffer.Length);
    stream.Close();
Decoding code:
    string decodedUnicodeChars;
    FileStream stream2 = new FileStream(@"unicode_encoding.txt", FileMode.Open, FileAccess.Read);
    StreamReader reader = new StreamReader(stream2, Encoding.Unicode);
    decodedUnicodeChars = reader.ReadToEnd();
    foreach (char c in decodedUnicodeChars)
    {
        // Print each UTF-16 code unit as four hex digits, separated by spaces.
        Console.Write("{0} ", Convert.ToInt32(c).ToString("X4"));
    }
The output is:

    0061 FFFD FFFD
    string unicodeChars = "a\uD800\uD565";
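This is a GIGO (garbage in, garbage out) example. The surrogate pair is invalid: the second surrogate must be in the range \uDC00..\uDFFF. Neither string satisfies that: \uDA65 is another high surrogate, and \uD565 is not a surrogate at all.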
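To make the pairing rule concrete, here is a minimal sketch that checks a string for well-formed surrogate pairs before encoding it. The class and method names (`SurrogateDemo`, `IsWellFormedUtf16`, `RoundTrip`, `StrictEncodeSucceeds`) are made up for illustration, and the replacement string `"a\uD800\uDE65"` is just one arbitrary valid pair, not necessarily the character the question intended.

```csharp
using System;
using System.Text;

public static class SurrogateDemo
{
    // Hypothetical helper: true when every high surrogate is immediately
    // followed by a low surrogate and no low surrogate stands alone.
    public static bool IsWellFormedUtf16(string s)
    {
        for (int i = 0; i < s.Length; i++)
        {
            if (char.IsHighSurrogate(s[i]))
            {
                if (i + 1 >= s.Length || !char.IsLowSurrogate(s[i + 1]))
                    return false;
                i++; // skip the low half of the pair
            }
            else if (char.IsLowSurrogate(s[i]))
            {
                return false;
            }
        }
        return true;
    }

    // Encode to UTF-16 bytes and decode again with the default (lenient) encoding.
    public static string RoundTrip(string s)
    {
        byte[] bytes = Encoding.Unicode.GetBytes(s);
        return Encoding.Unicode.GetString(bytes);
    }

    // Encode with throwOnInvalidBytes = true, so bad input fails instead of
    // being silently replaced.
    public static bool StrictEncodeSucceeds(string s)
    {
        var strict = new UnicodeEncoding(false, false, true);
        try
        {
            strict.GetBytes(s);
            return true;
        }
        catch (ArgumentException) // EncoderFallbackException derives from this
        {
            return false;
        }
    }

    public static void Main()
    {
        string invalid = "a\uD800\uDA65"; // two high surrogates in a row
        string valid = "a\uD800\uDE65";   // high surrogate + low surrogate

        Console.WriteLine(IsWellFormedUtf16(invalid)); // False
        Console.WriteLine(IsWellFormedUtf16(valid));   // True
        Console.WriteLine(RoundTrip(valid) == valid);  // True
        Console.WriteLine(StrictEncodeSucceeds(invalid));
        Console.WriteLine(char.ConvertToUtf32(valid, 1).ToString("X")); // 10265
    }
}
```

If silent replacement with U+FFFD is not acceptable, constructing the encoding with `throwOnInvalidBytes` set to `true` is one way to surface the bad input early rather than discovering it after the file has been written.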