C# string comparison: 'ö', 'oe', 'o'


Possible Duplicate:
how to recognize similar words with difference in spelling

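I am trying to get true when comparing these three strings: "voest", "vost" and "vöst" (German culture), because they are the same word. (Strictly speaking, only "oe" and "ö" are the same, but a case-insensitive (CI) database collation, for example, treats all three as equal, which is correct here because "vost" is a mistyped "voest".)

No matter what arguments I pass to the method, string.Compare(..) / string.Equals(..) always returns false.

How can I make string.Compare() / string.Equals(..) return true?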


You could create a custom comparer that ignores umlauts:

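using System.Collections.Generic;
using System.Linq;

// Treats ä/ae/a, ö/oe/o and ü/ue/u as equal when comparing strings.
// Note: the digraph replacement also shortens words where "ae", "oe" or "ue"
// is not an umlaut transcription, so unrelated words may compare equal.
class IgnoreUmlautComparer : IEqualityComparer<string>
{
    // Real umlauts mapped to their base vowel.
    static readonly Dictionary<char, char> umlautReplacer = new Dictionary<char, char>()
    {
        {'ä','a'}, {'Ä','A'},
        {'ö','o'}, {'Ö','O'},
        {'ü','u'}, {'Ü','U'},
    };
    // Two-letter transcriptions of umlauts mapped to the same base vowel.
    static readonly Dictionary<string, string> pseudoUmlautReplacer = new Dictionary<string, string>()
    {
        {"ae","a"}, {"Ae","A"},
        {"oe","o"}, {"Oe","O"},
        {"ue","u"}, {"Ue","U"},
    };

    // Returns s with umlauts and their transcriptions reduced to the base vowel.
    private static string ignoreUmlaut(string s)
    {
        char value;
        string replaced = new string(s.Select(c => umlautReplacer.TryGetValue(c, out value) ? value : c).ToArray());
        foreach (var kv in pseudoUmlautReplacer)
            replaced = replaced.Replace(kv.Key, kv.Value);
        return replaced;
    }

    public bool Equals(string x, string y)
    {
        // Two nulls are equal; a null never equals a non-null string.
        if (x == null || y == null) return x == y;
        return ignoreUmlaut(x) == ignoreUmlaut(y);
    }

    public int GetHashCode(string obj)
    {
        // Hash the reduced form so that equal strings get equal hash codes.
        return ignoreUmlaut(obj).GetHashCode();
    }
}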

Now you can use this comparer with Enumerable methods such as Distinct:

string[] allStrings = new[]{"voest","vost","vöst"};
bool allEqual = allStrings.Distinct(new IgnoreUmlautComparer()).Count() == 1;
// --> true


You can try the CompareOptions.IgnoreNonSpace option when comparing. It won't solve voest vs. vost, but it will help with vost vs. vöst:

int a = new CultureInfo("de-DE").CompareInfo.Compare("vost","vöst", CompareOptions.IgnoreNonSpace);
// a = 0; strings are equal.
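
Since IgnoreNonSpace alone does not cover the "oe" spelling, one possible way to handle all three variants is to collapse the digraphs first and then let the culture-aware comparison ignore the diacritics. The following is only a rough sketch: GermanLooseCompare, FoldDigraphs and LooselyEqual are made-up names for illustration, not part of the framework, and the digraph folding will also match unrelated words where "oe", "ae" or "ue" is not an umlaut transcription.

```csharp
using System;
using System.Globalization;

static class GermanLooseCompare
{
    // Hypothetical helper: collapses the transcriptions ae/oe/ue to the base vowel.
    static string FoldDigraphs(string s)
    {
        return s.Replace("ae", "a").Replace("oe", "o").Replace("ue", "u")
                .Replace("Ae", "A").Replace("Oe", "O").Replace("Ue", "U");
    }

    // Returns true when x and y differ only by umlaut spelling (ö / oe / o).
    public static bool LooselyEqual(string x, string y)
    {
        CompareInfo de = new CultureInfo("de-DE").CompareInfo;
        // IgnoreNonSpace makes the comparison skip diacritics, so "ö" matches "o".
        return de.Compare(FoldDigraphs(x), FoldDigraphs(y), CompareOptions.IgnoreNonSpace) == 0;
    }
}
```

With this, GermanLooseCompare.LooselyEqual("voest", "vöst") and GermanLooseCompare.LooselyEqual("vost", "vöst") should both come out as equal, assuming the runtime has culture data available (i.e. it is not running in invariant globalization mode).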