Is std::regex safe for user-defined expressions?

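Is it safe to use std::regex with user-defined expressions (for example, in a server-side search)? Does the standard library guarantee safe behavior when the expression is malformed?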

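Basically no, it is not safe. A perfectly valid regular expression can be crafted so that it takes a very long time to evaluate, which results in a denial of service.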

From the Wikipedia article on ReDoS:

The regular expression denial of service (ReDoS)[1] is an algorithmic complexity attack that produces a denial-of-service by providing a regular expression that takes a very long time to evaluate. The attack exploits the fact that most regular expression implementations have exponential time worst case complexity: the time taken can grow exponentially in relation to input size. An attacker can thus cause a program to spend an unbounded amount of time processing by providing such a regular expression, either slowing down or becoming unresponsive.
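To make the quoted description concrete, here is a minimal sketch that times a single failing match against a pattern with nested quantifiers. The pattern `(a+)+b`, the helper `time_nested_quantifier`, and the `MatchTiming` struct are illustrative assumptions, not part of the original answer. Whether the slowdown actually appears depends on the standard library in use, and some implementations may abort the match with a `std::regex_error` instead.

```cpp
#include <chrono>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical result type for this sketch.
struct MatchTiming {
    bool matched;        // result of regex_match; stays false if the library gave up
    bool gave_up;        // true if a std::regex_error was thrown while matching
    double milliseconds; // wall-clock time spent in the matching call
};

// Times one failing match of "(a+)+b" against n 'a' characters followed by '!'.
MatchTiming time_nested_quantifier(std::size_t n) {
    // Nested quantifiers: a run of 'a' can be split between the inner and
    // outer '+' in many ways, and a backtracking matcher may try them all
    // before concluding that the subject cannot match.
    const std::regex pattern("(a+)+b");
    const std::string subject = std::string(n, 'a') + "!";  // never matches

    MatchTiming out{false, false, 0.0};
    const auto start = std::chrono::steady_clock::now();
    try {
        out.matched = std::regex_match(subject, pattern);
    } catch (const std::regex_error&) {
        // Some implementations stop with error_complexity or error_stack.
        out.gave_up = true;
    }
    const auto stop = std::chrono::steady_clock::now();
    out.milliseconds =
        std::chrono::duration<double, std::milli>(stop - start).count();
    return out;
}
```

Calling this helper with slowly increasing values of `n` and comparing the reported times is one way to check how a given implementation behaves. Keep `n` small at first: on a backtracking implementation the time can grow very quickly, and the exact figures depend on the library and the hardware.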

The standard requires the implementation to throw an error when the regex passed in is invalid.

[regex.construct-3]:

explicit basic_regex(const charT* p, flag_type f = regex_constants::ECMAScript);

Requires: p shall not be a null pointer.

Throws: regex_error if p is not a valid regular expression.

Effects: Constructs an object of class basic_regex; the object's internal finite state machine is constructed from the regular expression contained in the array of charT of length char_traits<charT>::length(p) whose first element is designated by p, and interpreted according to the flags f.

Ensures: flags() returns f. mark_count() returns the number of marked sub-expressions within the expression.

There is even a table that details the various possible errors.

So, as long as you do not pass a null pointer, constructing a regex from a user-supplied string should not result in undefined behavior.
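As a minimal sketch of what this guarantee looks like in practice, the helper below constructs a regex from a user-supplied string and reports the error code when construction fails. The function names `error_name` and `check_pattern` are hypothetical. The enumerators are the `std::regex_constants::error_type` values from the standard's error table; which code a given malformed pattern produces can differ between implementations.

```cpp
#include <regex>
#include <string>

// Maps the error codes from the standard's table to readable names.
std::string error_name(std::regex_constants::error_type code) {
    using namespace std::regex_constants;
    switch (code) {
        case error_collate:    return "error_collate";
        case error_ctype:      return "error_ctype";
        case error_escape:     return "error_escape";
        case error_backref:    return "error_backref";
        case error_brack:      return "error_brack";
        case error_paren:      return "error_paren";
        case error_brace:      return "error_brace";
        case error_badbrace:   return "error_badbrace";
        case error_range:      return "error_range";
        case error_space:      return "error_space";
        case error_badrepeat:  return "error_badrepeat";
        case error_complexity: return "error_complexity";
        case error_stack:      return "error_stack";
        default:               return "implementation-specific error";
    }
}

// Returns "ok" if the pattern compiles, otherwise the name of the error code.
std::string check_pattern(const std::string& user_pattern) {
    try {
        // Constructing from std::string avoids the null-pointer precondition
        // of the const charT* constructor altogether.
        const std::regex re(user_pattern);
        (void)re;  // only the success of construction matters here
        return "ok";
    } catch (const std::regex_error& e) {
        return error_name(e.code());
    }
}
```

For example, `check_pattern("a+b")` returns "ok", while a pattern with an unclosed group such as `"(abc"` is reported as an error rather than causing undefined behavior.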

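Note that any actual implementation may still have bugs that lead to security vulnerabilities. The standard also obviously does not guarantee that a malicious user cannot DoS your system by submitting a very complex or self-referential regex that generates too many matches, uses too much memory or CPU, and so on, so you will have to take care of that yourself. But if you are only worried about whether an invalid regex is free to lead to UB, the answer is "no, you are fine".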
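One way to "take care of that yourself" is to bound what you hand to the library. The sketch below is an assumption-laden illustration, not a complete defense: `guarded_search`, `kMaxPatternLength`, and `kMaxSubjectLength` are hypothetical names, and the limit values are arbitrary. Capping sizes reduces the damage a hostile pattern can do, but it does not remove exponential behavior, and the standard `std::regex` functions take no timeout parameter.

```cpp
#include <cstddef>
#include <optional>
#include <regex>
#include <string>

// Hypothetical limits chosen for illustration only; tune them per service.
constexpr std::size_t kMaxPatternLength = 64;
constexpr std::size_t kMaxSubjectLength = 256;

// Returns true or false for a completed search, or std::nullopt when the
// request was rejected up front or the library reported an error.
std::optional<bool> guarded_search(const std::string& user_pattern,
                                   const std::string& subject) {
    if (user_pattern.size() > kMaxPatternLength ||
        subject.size() > kMaxSubjectLength) {
        return std::nullopt;  // refuse oversized input before compiling
    }
    try {
        const std::regex re(user_pattern);      // throws on invalid syntax
        return std::regex_search(subject, re);  // may throw, e.g. error_complexity
    } catch (const std::regex_error&) {
        return std::nullopt;
    }
}
```

A service that needs a hard time bound would typically have to add one outside the regex call, for instance by running matches in a separate process that can be killed, or by choosing a regex engine designed for untrusted patterns.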

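The C++ standard defines what correct behavior is. Even for functions that throw exceptions, the standard defines which functions throw, which exceptions they throw, and under what circumstances those exceptions are thrown. Such code has behavior that is well defined by the standard.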

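If an implementation acts contrary to the behavior the standard defines (that is, it is "broken"), the standard does not and cannot specify what will happen. If the standard did define such behavior, then by definition the implementation would not be violating the standard, so it would no longer be "broken".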

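So whether a regex implementation can avoid pathological behavior caused by externally supplied strings that you have not sanitized is a quality-of-implementation matter, not behavior defined by the standard.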