C: What does strcmp return if two similar strings are of different lengths?

I know that if you pass 'cat' (string1) and 'dog' (string2) to strcmp (this is a C question), the return value will be less than 0 (since 'cat' is lexically less than 'dog').

However, I'm not sure what strcmp does in this case:

string1: 'dog'
string2: 'dog2'.

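What will strcmp return? Less than zero, zero, or greater than zero? For context, I'm trying to write a comparator function that compares strings, and I want to account for strings that start with the same characters. One string may have an extension (for example, the '2' in 'dog2' above).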

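Edit: This is not a duplicate. The question it is said to resemble asks what the return type represents. I'm asking what happens when the strings are identical up to the point where one of them stops and the other continues.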


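The C standard specifies the sign of the result in terms of the difference between the first pair of mismatching characters, but the actual value returned varies widely between implementations. The only thing they have in common is that the return value is zero for equal strings, and <0 or >0 for str1 < str2 and str1 > str2 respectively.
From ISO/IEC 9899:201x, §7.23.4 Comparison functions: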

The sign of a nonzero value returned by the comparison functions
memcmp, strcmp, and strncmp is determined by the sign of the
difference between the values of the first pair of characters (both
interpreted as unsigned char) that differ in the objects being
compared.

However, some implementations return only the canonical values 0, 1, and -1. See the Apple implementation (http://opensource.apple.com//source/Libc/Libc-262/ppc/gen/strcmp.c):

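int
strcmp(const char *s1, const char *s2)
{
    /* Advance while the characters match; equal strings end at the shared '\0'. */
    for ( ; *s1 == *s2; s1++, s2++)
        if (*s1 == '\0')
            return 0;
    /* First mismatch: compare as unsigned char and return only -1 or +1. */
    return ((*(unsigned char *)s1 < *(unsigned char *)s2) ? -1 : +1);
}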

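Edit:
In the Android boot library on the donut-release branch (https://android.googlesource.com/platform/bootable/bootloader/legacy/+/donut-release/libc/strcmp.c), the function returns 0 if the strings are equal and 1 in both other cases, so it is only usable as an equal/not-equal test: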

int strcmp(const char *a, const char *b)
{
    while(*a && *b) {
        if(*a++ != *b++) return 1;
    }
    if(*a || *b) return 1;
    return 0;
}
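
Since the magnitude differs from one implementation to another, portable code is probably best off looking only at the sign of the result. A minimal sketch, assuming a conforming strcmp (the equality-only variant above would not qualify); `cmp_sign` is a hypothetical helper name, not part of any library:

```c
#include <string.h>

/* Hypothetical helper: collapse strcmp's result to -1, 0, or +1 so that
   callers depend only on the sign, which is all the standard specifies. */
int cmp_sign(const char *a, const char *b)
{
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);   /* branch-free sign of r */
}
```

With this, `cmp_sign("dog", "dog2")` is -1 regardless of whether the underlying strcmp returns -1, -50, or some other negative value.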

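It returns the difference between the octets at the first position where the strings differ. In your example, '\0' < '2', so a negative number is returned.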
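
To make the "difference of the differing octets" idea concrete, here is an illustrative sketch. `ref_strcmp` is a made-up name and this is not the code of any particular libc; it only shows one conforming way the result could be computed:

```c
/* Illustrative reference comparison (hypothetical, not a real libc's code):
   returns the difference of the first pair of differing bytes, each read
   as unsigned char. */
int ref_strcmp(const char *s1, const char *s2)
{
    const unsigned char *p = (const unsigned char *)s1;
    const unsigned char *q = (const unsigned char *)s2;

    /* Walk forward while the bytes match and s1 has not ended. */
    while (*p != '\0' && *p == *q) {
        p++;
        q++;
    }
    /* Either a mismatch or the end of s1: for "dog" vs "dog2" this is
       '\0' - '2', which is negative. */
    return (int)*p - (int)*q;
}
```

For "dog" and "dog2" the loop stops at the terminating null of "dog", so the result is 0 minus the value of '2', a negative number.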


C11 quotes

C11 N1570 standard draft

I believe "dog" < "dog2" is guaranteed by the following quotes:

7.23.4 Comparison functions
1
The sign of a nonzero value returned by the comparison functions memcmp, strcmp,
and strncmp is determined by the sign of the difference between the values of the first
pair of characters (both interpreted as unsigned char) that differ in the objects being
compared.

So the characters are interpreted as numbers, and '\0' is guaranteed to be 0.

Then:

7.23.4.2 The strcmp function
2
The strcmp function compares the string pointed to by s1 to the string pointed to by
s2.

says that, obviously, strings are compared, and:

7.1.1 Definitions of terms
1 A string is a contiguous sequence of characters terminated by and including the first null
character.

says that the null character is part of the string.

Finally:

5.2.1 Character sets
2 [...] A byte with
all bits set to 0, called the null character, shall exist in the basic execution character set; it
is used to terminate a character string.

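so '\0' is equal to zero.

Since the characters are interpreted as unsigned char, and every other character in a string is non-null, zero is the smallest possible value. The string that ends first therefore compares as less.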
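
Two consequences of this reasoning can be checked with a small sketch: a proper prefix sorts before the longer string, and bytes with the high bit set sort after ASCII letters even on platforms where plain char is signed. `sorts_before` is a hypothetical helper introduced only for illustration:

```c
#include <string.h>

/* Hypothetical helper: returns 1 if a sorts strictly before b under
   strcmp's ordering, 0 otherwise. Only the sign of strcmp is used. */
int sorts_before(const char *a, const char *b)
{
    return strcmp(a, b) < 0;
}
```

Under a conforming strcmp, `sorts_before("dog", "dog2")`, `sorts_before("", "a")`, and `sorts_before("a", "\x80")` should all be true: the first two because '\0' is the smallest unsigned char value, the last because 0x80 is read as 128 rather than as a negative number.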


From man strcmp:

The strcmp() and strncmp() functions return an integer less than,
equal to, or greater than zero if s1 (or the first n bytes thereof) is
found, respectively, to be less than, to match, or be greater than s2.

This is usually implemented the way @hroptatyr describes.
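
For the comparator mentioned in the question, a sketch along these lines may be enough, assuming the data is an array of `const char *` sorted with the standard qsort. The names `cmp_cstr` and `sort_names` are hypothetical:

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array whose elements are const char *.
   qsort passes pointers to the elements, hence the extra dereference. */
static int cmp_cstr(const void *pa, const void *pb)
{
    const char *a = *(const char *const *)pa;
    const char *b = *(const char *const *)pb;
    return strcmp(a, b);   /* qsort only needs the sign */
}

/* Hypothetical wrapper: sorts n strings in strcmp order, in place. */
void sort_names(const char **names, size_t n)
{
    qsort(names, n, sizeof names[0], cmp_cstr);
}
```

Sorting {"dog2", "cat", "dog"} this way should give "cat", "dog", "dog2", with the shorter "dog" placed before its extended form.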