What is the difference between char array vs char pointer in C?

I am trying to understand pointers in C, but I am currently confused by the following:

  • char *p = "hello";

    This is a char pointer that points to a character array, starting at h.

  • char p[] = "hello";

    This is an array that stores hello.

What is the difference when I pass both of these variables into this function?

void printSomething(char *p)
{
    printf("p: %s", p);
}


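char* and char[] are different types, but this is not immediately apparent in all cases. That is because arrays decay into pointers: if an expression of type char[] is provided where one of type char* is expected, the compiler automatically converts the array into a pointer to its first element.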

Your example function printSomething expects a pointer, so if you try to pass an array to it like this:

char s[10] = "hello";
printSomething(s);

the compiler pretends that you wrote this:

char s[10] = "hello";
printSomething(&s[0]);


Let's see:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char *p = "hello";
    char q[] = "hello"; // no need to count this

    printf("%zu\n", sizeof(p)); // => size of pointer to char -- 4 on x86, 8 on x86-64
    printf("%zu\n", sizeof(q)); // => size of char array in memory -- 6 on both

    // size_t strlen(const char *s) and we don't get any warnings here:
    printf("%zu\n", strlen(p)); // => 5
    printf("%zu\n", strlen(q)); // => 5

    return 0;
}

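foo* and foo[] are different types, and the compiler handles them differently (pointer = address + representation of the pointer's type; array = pointer + optional length of the array, if known, for example when the array is statically allocated). The details can be found in the standard. At the runtime level there is no difference between them (in assembler, well, almost; see below).

Also, there is a related question in the C FAQ: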

Q: What is the difference between these initializations?

char a[] = "string literal";
char *p  = "string literal";

My program crashes if I try to assign a new value to p[i].

A: A string literal (the formal term for a double-quoted string in C source) can be used in two slightly different ways:

  • As the initializer for an array of char, as in the declaration of char a[] , it specifies the initial values of the characters in that array (and, if necessary, its size).
  • Anywhere else, it turns into an unnamed, static array of characters, and this unnamed array may be stored in read-only memory, and which therefore cannot necessarily be modified. In an expression context, the array is converted at once to a pointer, as usual (see section 6), so the second declaration initializes p to point to the unnamed array's first element.
Some compilers have a switch controlling whether string literals are writable or not (for compiling old code), and some may have options to cause string literals to be formally treated as arrays of const char (for better error catching).

See also questions 1.31, 6.1, 6.2, 6.8, and 11.8b.

References: K&R2 Sec. 5.5 p. 104; ISO Sec. 6.1.4, Sec. 6.5.7; Rationale Sec. 3.1.4; H&S Sec. 2.7.4 pp. 31-2


    C99 N1256 draft

    There are two different uses of character string literals:

  • Initialize char[]:

    char c[] = "abc";

    This is "more magic", and is described at 6.7.8/14 "Initialization":

    An array of character type may be initialized by a character string literal, optionally
    enclosed in braces. Successive characters of the character string literal (including the
    terminating null character if there is room or if the array is of unknown size) initialize the
    elements of the array.

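    So this is just a shortcut for:

    char c[] = {'a', 'b', 'c', '\0'};

    Like any other regular array, c can be modified.

  • Everywhere else: it generates an:

    • unnamed
    • array of char (see: What is the type of string literals in C and C++?)
    • with static storage
    • that gives UB if modified

    So when you write:

    char *c = "abc";

    this is similar to:

    /* __unnamed is magic because modifying it gives UB. */
    static char __unnamed[] = "abc";
    char *c = __unnamed;

    Note the implicit conversion from char[] to char *, which is always legal.

    Then, if you modify c[0], you also modify __unnamed, which is UB.

    This is documented at 6.4.5 "String literals":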

    5 In translation phase 7, a byte or code of value zero is appended to each multibyte
    character sequence that results from a string literal or literals. The multibyte character
    sequence is then used to initialize an array of static storage duration and length just
    sufficient to contain the sequence. For character string literals, the array elements have
    type char, and are initialized with the individual bytes of the multibyte character
    sequence [...]

    6 It is unspecified whether these arrays are distinct provided their elements have the
    appropriate values. If the program attempts to modify such an array, the behavior is
    undefined.

    6.7.8/32 "Initialization" gives a direct example:

    EXAMPLE 8: The declaration

    char s[] = "abc", t[3] = "abc";

    defines "plain" char array objects s and t whose elements are initialized with character string literals.

    This declaration is identical to

    char s[] = { 'a', 'b', 'c', '\0' },
    t[] = { 'a', 'b', 'c' };

    The contents of the arrays are modifiable. On the other hand, the declaration

    char *p = "abc";

    defines p with type "pointer to char" and initializes it to point to an object with type "array of char" with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.

    GCC 4.8 x86-64 ELF implementation

    Program:

    #include <stdio.h>

    int main(void) {
        char *s = "abc";
        printf("%s\n", s);
        return 0;
    }

    Compile and decompile:

    gcc -ggdb -std=c99 -c main.c
    objdump -Sr main.o

    Output contains:

     char *s = "abc";
    8:  48 c7 45 f8 00 00 00    movq   $0x0,-0x8(%rbp)
    f:  00
            c: R_X86_64_32S .rodata

    Conclusion: GCC stores the string literal that the char* points to in the .rodata section, not in .text.

    If we do the same for char[]:

     char s[] = "abc";

    we obtain:

    17:   c7 45 f0 61 62 63 00    movl   $0x636261,-0x10(%rbp)

    so it gets stored on the stack (relative to %rbp).

    Note, however, that the default linker script puts .rodata and .text in the same segment, which has execute but no write permission. This can be observed with:

    readelf -l a.out

    which contains:

     Section to Segment mapping:
      Segment Sections...
       02     .text .rodata


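    You are not allowed to change the contents of a string constant, which is what the first p points to. The second p is an array initialized with a string constant, and you can change its contents.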


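    For cases like this, the effect is the same: you end up passing the address of the first character in a string of characters.

    The declarations are obviously not the same, though.

    The following sets aside memory for a string and also for a character pointer, and then initializes the pointer to point at the first character in the string.

    char *p = "hello";

    The following, on the other hand, sets aside memory just for the string, so it can actually use less memory.

    char p[10] = "hello";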


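    As far as I can remember, array indexing is actually defined in terms of pointer arithmetic. For example,

    p[1] == *(p + 1)

    is a true statement. (Note that &p + 1 would not work here: &p is a pointer to the whole array, so adding 1 steps past the entire array.)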


    From APUE, Section 5.14:

    char    good_template[] = "/tmp/dirXXXXXX"; /* right way */
    char    *bad_template = "/tmp/dirXXXXXX";   /* wrong way */

    ... For the first template, the name is allocated on the stack, because we use an
    array variable. For the second name, however, we use a pointer. In this case, only the
    memory for the pointer itself resides on the stack; the compiler arranges for the string to
    be stored in the read-only segment of the executable. When the mkstemp function tries
    to modify the string, a segmentation fault occurs.

    The quoted text matches @Ciro Santilli's explanation.


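    An explicitly sized array that holds "hello" should be declared as char p[6] = "hello"; remember that there is a '\0' character at the end of a "string" in C.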

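    Anyway, in most expressions an array in C behaves like a pointer to the first of a run of adjacent objects in memory; the differences are in the semantics. While you can change the value of a pointer to make it point to a different location in memory, an array, once created, always refers to the same location. Also, when you use an array, the allocation and deallocation (the "new" and "delete") are done for you automatically.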