C: When will strcmp not return -1, 0 or 1?

From the man page:

The strcmp() and strncmp() functions return an integer less than, equal
to, or greater than zero if s1 (or the first n bytes thereof) is found,
respectively, to be less than, to match, or be greater than s2.

Sample code in C (prints -15 on my machine; swapping test1 and test2 inverts the value):

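#include <stdio.h>
#include <string.h>

int main(void) {
    const char *test1 = "hello";
    const char *test2 = "world";
    /* Only the sign of the result is specified; the magnitude varies by implementation. */
    printf("%d\n", strcmp(test1, test2));
    return 0;
}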

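I found this code (taken from this question) that relies on the values of strcmp being something other than -1, 0 and 1 (it uses the return value in qsort). To me, this is terrible style and depends on undocumented features.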

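I guess I have two related questions:

  • Is there something in the C standard that defines what the return values are besides less than, greater than, or equal to zero? If not, what does the standard implementation do?
  • Is the return value consistent across the Linux, Windows and the BSDs?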

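Edit:

After leaving my computer for 5 minutes, I realized that there is in fact no error in the code in question. I struck out the parts I had figured out before reading the comments/answers, but I left them there to keep the comments relevant. I think this is still an interesting question, and it may trip up programmers used to other languages that always return -1, 0 or 1 (e.g. Python seems to do this, but it is not documented that way).

FWIW, I think that relying on anything other than the documented behavior is bad style.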


Is there something in the C standard that defines what the return values are besides less than, greater than, or equal to zero?

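No. The tightest constraint is that it should be zero, less than zero or greater than zero, as specified in the documentation of this particular function.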

If not, what does the standard implementation do?

There's no such thing as "the standard implementation". Even if there were, it would probably just be

return zero, less than zero or more than zero;

:-)

Is the return value consistent across the Linux, Windows and the BSDs?

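I can confirm that it is consistent across Linux and OS X as of 10.7.4 (specifically, it's -1, 0 or +1). I have no idea about Windows, but I bet the Microsoft guys use -2 and +3 just to break code :P

Also, let me point out that you have completely misunderstood what the code does.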

I found this code (taken from this question) that relies on the values of strcmp being something other than -1, 0 and 1 (it uses the return value in qsort). To me, this is terrible style and depends on undocumented features.

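No, it actually doesn't. The C standard library was designed with consistency and ease of use in mind. That is, what qsort() requires is that its comparator function return a negative number, a positive number, or zero, which is exactly what strcmp() is guaranteed to do. So this is not "terrible style"; it is perfectly standards-conformant code that does not depend on undocumented features.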


Is there something in the C standard that defines what the return values are besides less than, greater than, or equal to zero? If not, what does the standard implementation do?

No. As you mentioned yourself, the man page says less than, equal to, or greater than zero, and that is what the standard says as well.

Is the return value consistent across the Linux, Windows and the BSDs?

No.

On Linux (openSUSE 12.1, kernel 3.1) with gcc, I get -15 / 15 depending on whether test1 or test2 comes first. On Windows 7 (VS 2010) I get -1 / 1.

Based on the loose definition of strcmp(), both are fine.

...that relies on the values of strcmp being something other than -1, 0 and 1 (it uses the return value in qsort).

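An interesting aside for you: if you look at the example for qsort(), it is almost identical to the code you posted that uses strcmp(). The reason is that the comparator function qsort() needs is actually a great fit for the return value of strcmp():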

The comparison function must return an integer less than, equal to, or
greater than zero if the first argument is considered to be
respectively less than, equal to, or greater than the second.


In the C99 standard:

7.21.4.2 The strcmp function

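"The strcmp function returns an integer greater than, equal to, or less than zero,
accordingly as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2."

This means the standard does not guarantee -1, 0 or 1; the value may vary by operating system.

The value you are getting is the difference between w and h, which is 15.

In your case the strings are hello and world, so 'h' - 'w' = -15 < 0, which is why strcmp returns -15.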


From this page:

The strcmp() function compares the string pointed to by s1 to the string pointed to by s2.
The sign of a non-zero return value is determined by the sign of the difference between the values of the first pair of bytes (both interpreted as type unsigned char) that differ in the strings being compared.

Here is an implementation of strcmp from FreeBSD.

#include <string.h>

/*
 * Compare strings.
 */

int
strcmp(s1, s2)
    register const char *s1, *s2;
{
    while (*s1 == *s2++)
        if (*s1++ == 0)
            return (0);
    return (*(const unsigned char *)s1 - *(const unsigned char *)(s2 - 1));
}


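In practice, the return value of strcmp is likely to be the difference between the byte values at the first position that differs, simply because returning this difference is more efficient than performing an extra conditional branch to convert it to -1 or 1. Unfortunately, some broken software is known to assume that the result fits in 8 bits, which has led to serious vulnerabilities. In short, you should never use anything but the sign of the result.

For details on these issues, read the following article: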

https://communities.coverity.com/blogs/security/2012/07/19/more-defects-like-the-mysql-memcmp-vulnerability


Nothing in the C standard says anything about the value returned by strcmp() (that is, other than the sign of that value):

7.21.4.2 The strcmp function

Synopsis

#include <string.h>
int strcmp(const char *s1, const char *s2);

Description

The strcmp function compares the string pointed to by s1
to the string pointed to by s2.

Returns

The strcmp function returns an integer greater than, equal
to, or less than zero, accordingly as the string pointed to by s1 is
greater than, equal to, or less than the string pointed to by s2.

So it is clear that using anything other than the sign of the return value is bad practice.


From the man page:

RETURN VALUE
The strcmp() and strncmp() functions return an integer less than, equal
to, or greater than zero if s1 (or the first n bytes thereof) is found,
respectively, to be less than, to match, or be greater than s2.

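It only specifies that the value is greater than or less than 0; it says nothing about the specific values, which I assume are implementation specific.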

CONFORMING TO
SVr4, 4.3BSD, C89, C99.
This says in which standards it is included. The function must exist and behave as specified, but the specification doesn't say anything about the actual returned values, so you can't rely on them.