C++: accessing static members using a null pointer
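Recently I tried the following program. It compiles, runs fine, and produces the expected output instead of any runtime error.

    #include <iostream>

    class demo
    {
    public:
        static void fun()
        {
            std::cout << "fun() is called\n";
        }
        static int a;
    };

    int demo::a = 9;

    int main()
    {
        demo* d = nullptr;
        d->fun();
        std::cout << d->a;
        return 0;
    }

If accessing class and/or struct members through an uninitialized pointer is undefined behavior, why is it allowed to access static members through a null pointer? Is there any harm in my program?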
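TL;DR: Your example is well-defined. Merely dereferencing a null pointer does not invoke UB.
There is a lot of debate over this topic, which basically boils down to whether indirection through a null pointer is itself UB. The only questionable thing that happens in your example is the evaluation of the object expression. In particular, d->a is equivalent to (*d).a according to [expr.ref]/2: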
The expression E1->E2 is converted to the equivalent form (*(E1)).E2; the remainder of 5.2.5 will address only the first option (dot).
The postfix expression before the dot or arrow is evaluated; [65] the result of that evaluation, together with the id-expression, determines the result of the entire postfix expression.
65) If the class member access expression is evaluated, the subexpression evaluation happens even if the result is unnecessary to determine the value of the entire postfix expression, for example if the id-expression denotes a static member.
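Let's extract the critical part of the code. Consider the expression statement

    *d;

In this statement, *d is a discarded-value expression, so it is merely evaluated [1], just as it is in d->a. Hence, if evaluating *d is valid, so is your example.
There is an open CWG issue 232, created over fifteen years ago, which concerns this exact question. It raises a very important argument. The report starts with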
At least a couple of places in the IS state that indirection through a null pointer produces undefined behavior: 1.9 [intro.execution] paragraph 4 gives "dereferencing the null pointer" as an example of undefined behavior, and 8.3.2 [dcl.ref] paragraph 4 (in a note) uses this supposedly undefined behavior as justification for the nonexistence of "null references."
Note that the example mentioned there has since been changed.
However, 5.3.1 [expr.unary.op] paragraph 1, which describes the unary "*" operator, does not say that the behavior is undefined if the operand is a null pointer, as one might expect. Furthermore, at least one passage gives dereferencing a null pointer well-defined behavior: 5.2.8 [expr.typeid] paragraph 2 says
If the lvalue expression is obtained by applying the unary * operator to a pointer and the pointer is a null pointer value (4.10 [conv.ptr]), the typeid expression throws the bad_typeid exception (18.7.3 [bad.typeid]).
This is inconsistent and should be cleaned up.
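The last point is especially important. The quote in [expr.typeid] still exists and applies to glvalues of polymorphic class type, which is the case in the following example:

    #include <iostream>
    #include <typeinfo>

    int main() try {
        // Polymorphic type
        class A
        {
            virtual ~A(){}
        };
        typeid( *((A*)0) );
    }
    catch (const std::bad_typeid&)
    {
        std::cerr << "bad_exception\n";
    }

The behavior of this program is well-defined (an exception will be thrown and caught), and the expression *((A*)0) is evaluated, as it isn't part of an unevaluated operand. Now, if indirection through null pointers induced UB, then the expression written as

    *((A*)0);

would be doing just that, inducing UB, which seems nonsensical when compared to the typeid scenario. If the above expression is merely evaluated, as every discarded-value expression is [1], where is the crucial difference that makes the evaluation in the second snippet UB? No existing implementation analyzes the typeid operand, finds the innermost corresponding dereference, and surrounds its operand with a check; doing so would also cost performance. A note in that issue then ends the short discussion: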
We agreed that the approach in the standard seems okay: p = 0; *p; is not inherently an error. An lvalue-to-rvalue conversion would give it undefined behavior.
That is, the committee agreed on this. Although the resolution proposed in this report, which introduced so-called "empty lvalues", was never adopted…
However, "not modifiable" is a compile-time concept, while in fact this deals with runtime values and thus should produce undefined behavior instead. Also, there are other contexts in which lvalues can occur, such as the left operand of . or .*, which should also be restricted. Additional drafting is required.
…that does not affect the rationale. Then again, it should be noted that this issue predates even C++03, which makes it less convincing as we approach C++17.
CWG issue 315 seems to cover your case as well:
Another instance to consider is that of invoking a member function from a null pointer:

    struct A { void f () { } };
    int main ()
    {
        A* ap = 0;
        ap->f ();
    }

[…]
Rationale (October 2003):
We agreed the example should be allowed. p->f() is rewritten as (*p).f() according to 5.2.5 [expr.ref]. *p is not an error when p is null unless the lvalue is converted to an rvalue (4.1 [conv.lval]), which it isn't here.
According to this rationale, indirection through a null pointer per se does not invoke UB without further lvalue-to-rvalue conversions (that is, accesses to the stored value), reference bindings, value computations, or the like. (Nota bene: calling a non-static member function through a null pointer should invoke UB, although it is only vaguely disallowed by [class.mfct.non-static]/2. The rationale is outdated in this respect.)
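That is, the mere evaluation of *d does not suffice to invoke UB. The identity of the object is not required, and neither is its previously stored value. On the other hand, for example,

    *p = 123;

is undefined, since there is a value computation of the left operand, [expr.ass]/1:
In all cases, the assignment is sequenced after the value computation of the right and left operands
Because the left operand is expected to be a glvalue, the identity of the object referred to by that glvalue must be determined, as stated by the definition of evaluation of an expression in [intro.execution]/12, which is impossible (and thus leads to UB).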
[1] [expr]/11:
In some contexts, an expression only appears for its side effects. Such an expression is called a discarded-value expression. The expression is evaluated and its value is discarded. […]. The lvalue-to-rvalue conversion (4.1) is applied if and only if the expression is a glvalue of volatile-qualified type and […]
From the C++ draft standard N3337:
9.4 Static members
2 A static member s of class X may be referred to using the qualified-id expression X::s; it is not necessary to use the class member access syntax (5.2.5) to refer to a static member. A static member may be referred to using the class member access syntax, in which case the object expression is evaluated.
And in the section about the object expression...
5.2.5 Class member access
4 If E2 is declared to have type "reference to T," then E1.E2 is an lvalue; the type of E1.E2 is T. Otherwise, one of the following rules applies.
- If E2 is a static data member and the type of E2 is T, then E1.E2 is an lvalue; the expression designates the named member of the class. The type of E1.E2 is T.
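Based on the last paragraph of the standard quoted above, the expressions

    d->fun();
    std::cout << d->a;

work because they both designate the named member of the class, regardless of the value of d.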
runs fine and produces expected output instead of any runtime error.
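That is a basic error of assumption. What you are doing is undefined behavior, which means that your claim of any kind of "expected output" is faulty.
Addendum: note that, while there is a CWG defect report (315) that was closed as "agreement" not to make the above UB, it relies on the positive closing of another CWG defect (232) that is still active, and hence none of it has been added to the standard.
Let me quote part of a comment by James McNellis on an answer to a similar Stack Overflow question: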
I don't think CWG defect 315 is as "closed" as its presence on the "closed issues" page implies. The rationale says that it should be allowed because "*p is not an error when p is null unless the lvalue is converted to an rvalue." However, that relies on the concept of an "empty lvalue," which is part of the proposed resolution to CWG defect 232, but which has not been adopted.
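What you are seeing here is what I would consider an ill-conceived and unfortunate design choice in the specification of the C++ language and of many other languages in the same general family of programming languages.
These languages allow you to refer to static members of a class through a reference to an instance of the class. The actual value of the instance reference is of course ignored, since no instance is required to access static members.
So, in d->fun(); the compiler uses the d pointer only during compilation, to figure out that you are referring to a member of the demo class, and then it ignores it. The compiler emits no code to dereference the pointer, so the fact that it will be null at runtime does not matter. What you see happening is therefore in perfect accordance with the specification of the language, and in my opinion the specification suffers in this respect, because it allows an illogical thing to happen: using an instance reference to refer to a static member.
P.S. Most compilers for most languages are actually capable of issuing warnings for this kind of thing. I do not know about your compiler, but you might want to check, because the fact that you received no warning for what you did might mean that you do not have enough warnings enabled.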