What is the difference between char array vs char pointer in C?
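I'm trying to understand pointers in C, but I'm currently confused by the following:

- `char *p = "hello";`
  This is a char pointer pointing at a character array, starting at h.
- `char p[] = "hello";`
  This is an array that stores hello.

What is the difference when I pass both of these variables to this function (called `printSomething` in the answers below)?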
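Your example function `printSomething` expects a pointer, so if you try to pass an array to it like this:

    char s[10] = "hello";
    printSomething(s);

the compiler pretends that you wrote:

    char s[10] = "hello";
    printSomething(&s[0]);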
Let's see:

    #include <stdio.h>
    #include <string.h>

    int main()
    {
        char *p = "hello";
        char q[] = "hello"; // no need to count this

        printf("%zu\n", sizeof(p)); // => size of pointer to char -- 4 on x86, 8 on x86-64
        printf("%zu\n", sizeof(q)); // => size of char array in memory -- 6 on both

        // size_t strlen(const char *s) and we don't get any warnings here:
        printf("%zu\n", strlen(p)); // => 5
        printf("%zu\n", strlen(q)); // => 5

        return 0;
    }
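`foo*` and `foo[]` are different types, and the compiler handles them differently (pointer = address + representation of the pointer's type; array = pointer + optional length of the array, if known, for example when the array is statically allocated). The details can be found in the standard. At the runtime level there is no difference between them (in assembly, well, almost; see below).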
Also, there is a related question in the C FAQ:
Q: What is the difference between these initializations?
    char a[] = "string literal";
    char *p  = "string literal";

My program crashes if I try to assign a new value to p[i].

A: A string literal (the formal term for a double-quoted string in C source) can be used in two slightly different ways:

1. As the initializer for an array of char, as in the declaration of char a[], it specifies the initial values of the characters in that array (and, if necessary, its size).
2. Anywhere else, it turns into an unnamed, static array of characters, and this unnamed array may be stored in read-only memory, and which therefore cannot necessarily be modified. In an expression context, the array is converted at once to a pointer, as usual (see section 6), so the second declaration initializes p to point to the unnamed array's first element.

Some compilers have a switch controlling whether string literals are writable or not (for compiling old code), and some may have options to cause string literals to be formally treated as arrays of const char (for better error catching).
See also questions 1.31, 6.1, 6.2, 6.8, and 11.8b.
References: K&R2 Sec. 5.5 p. 104
ISO Sec. 6.1.4, Sec. 6.5.7
Rationale Sec. 3.1.4
H&S Sec. 2.7.4 pp. 31-2
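C99 N1256 draft

There are two different uses of character string literals:

Initialize `char[]`:

    char c[] = "abc";

This is "more magic", and is described at 6.7.8/14 "Initialization":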
An array of character type may be initialized by a character string literal, optionally
enclosed in braces. Successive characters of the character string literal (including the
terminating null character if there is room or if the array is of unknown size) initialize the
elements of the array.
So this is just a shortcut for:

    char c[] = {'a', 'b', 'c', '\0'};

Like any other regular array, `c` can be modified.
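Everywhere else, it generates an array that is:
- unnamed
- an array of char (see: What is the type of string literals in C and C++?)
- with static storage duration
- UB to modify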
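So when you write:

    char *c = "abc";

this is similar to:

    /* __unnamed is magic because modifying it gives UB. */
    static char __unnamed[] = "abc";
    char *c = __unnamed;

Note the implicit conversion from `char[]` to `char *`.
Then, if you modify the string through `c`, you also modify `__unnamed`, which is UB.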
This is documented at 6.4.5 "String literals":
5 In translation phase 7, a byte or code of value zero is appended to each multibyte
character sequence that results from a string literal or literals. The multibyte character
sequence is then used to initialize an array of static storage duration and length just
sufficient to contain the sequence. For character string literals, the array elements have
type char, and are initialized with the individual bytes of the multibyte character
sequence [...]

6 It is unspecified whether these arrays are distinct provided their elements have the
appropriate values. If the program attempts to modify such an array, the behavior is
undefined.
6.7.8/32 "Initialization" gives a direct example:
EXAMPLE 8: The declaration

    char s[] = "abc", t[3] = "abc";

defines "plain" char array objects s and t whose elements are initialized with character string literals. This declaration is identical to

    char s[] = { 'a', 'b', 'c', '\0' },
         t[] = { 'a', 'b', 'c' };

The contents of the arrays are modifiable. On the other hand, the declaration

    char *p = "abc";

defines p with type "pointer to char" and initializes it to point to an object with type "array of char" with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.
GCC 4.8 x86-64 ELF implementation

Program (listing missing from this copy):
Compile and decompile:

    gcc -ggdb -std=c99 -c main.c
    objdump -Sr main.o

Output contains:

     char *s = "abc";
    8:  48 c7 45 f8 00 00 00    movq   $0x0,-0x8(%rbp)
    f:  00
            c: R_X86_64_32S .rodata

Conclusion: GCC stores the string literal that `char *s` points to in the `.rodata` section.
If we do the same for `char[]`:

    char s[] = "abc";

we obtain:

    17:  c7 45 f0 61 62 63 00    movl   $0x636261,-0x10(%rbp)

so it gets stored on the stack (relative to `%rbp`).
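Note, however, that the default linker script puts `.rodata` and `.text` in the same segment. This can be observed with:

    readelf -l a.out

which contains:

     Section to Segment mapping:
      Segment Sections...
       02     .text .rodata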
You are not allowed to change the contents of a string constant, which is what the first `p` points to.
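For cases like this, the effect is the same: you end up passing the address of the first character in a string of characters.

The declarations are obviously not the same, though.

The following sets aside memory for a string and also for a character pointer, and then initializes the pointer to point to the first character in the string.

    char *p = "hello";

While the following sets aside memory just for the string, so it can actually use less memory.

    char p[10] = "hello";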
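As far as I remember, array indexing is actually defined in terms of pointer arithmetic. For example

    p[1] == *(p + 1)

is a true statement. (Note that this uses `p`, not `&p`: `&p + 1` would point past the whole array.)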
From APUE, Section 5.14:

    char good_template[] = "/tmp/dirXXXXXX"; /* right way */
    char *bad_template = "/tmp/dirXXXXXX";   /* wrong way */
... For the first template, the name is allocated on the stack, because we use an
array variable. For the second name, however, we use a pointer. In this case, only the
memory for the pointer itself resides on the stack; the compiler arranges for the string to
be stored in the read-only segment of the executable. When the mkstemp function tries
to modify the string, a segmentation fault occurs.
The quoted text matches @Ciro Santilli's explanation.
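In any case, in most expressions an array in C behaves like a pointer to the first of a run of adjacent objects in memory. The main difference is semantic: you can change the value of a pointer so that it points to a different location in memory, whereas an array, once created, always refers to the same location. Also, when you use an array, its storage is allocated and released for you automatically (the equivalent of "new" and "delete").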