Accessing a Python traceback from the C API

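I'm having trouble finding the proper way to walk a Python traceback using the C API. I'm writing an application that embeds the Python interpreter. I want to be able to execute arbitrary Python code and, if it raises an exception, translate it into my own application-specific C++ exception. For now, extracting just the file name and line number where the Python exception was raised is enough. This is what I have so far: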

PyObject* pyresult = PyObject_CallObject(someCallablePythonObject, someArgs);
if (!pyresult)
{
    PyObject *excType, *excValue, *excTraceback;
    PyErr_Fetch(&excType, &excValue, &excTraceback);
    PyErr_NormalizeException(&excType, &excValue, &excTraceback);

    // Cast the fetched traceback object (it may be NULL if no traceback was recorded)
    PyTracebackObject* traceback = (PyTracebackObject*)excTraceback;
    // Advance to the last frame (Python puts the most-recent call at the end)
    while (traceback != NULL && traceback->tb_next != NULL)
        traceback = traceback->tb_next;

    // At this point I have access to the line number via traceback->tb_lineno,
    // but where do I get the file name from?

    // ...
}

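After digging through the Python source code, I see that it accesses the file name and module name of the current frame via the _frame structure, which looks like a private struct. My next idea was to load the Python "traceback" module programmatically and call its functions through the C API. Is this sane? Is there a better way to access a Python traceback from C?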

Related discussion

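This is an old question, but for future reference: you can get the current stack frame from the thread state object and then walk the frames backward. A traceback object isn't needed unless you want to preserve the state for later.

For example: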

// Note: this snippet uses PyString_AsString, which exists only in the Python 2 C API.
PyThreadState *tstate = PyThreadState_GET();
if (NULL != tstate && NULL != tstate->frame) {
    PyFrameObject *frame = tstate->frame;

    printf("Python stack trace:\n");
    while (NULL != frame) {
        // int line = frame->f_lineno;
        /*
         frame->f_lineno will not always return the correct line number
         you need to call PyCode_Addr2Line().
        */
        int line = PyCode_Addr2Line(frame->f_code, frame->f_lasti);
        const char *filename = PyString_AsString(frame->f_code->co_filename);
        const char *funcname = PyString_AsString(frame->f_code->co_name);
        printf("    %s(%d): %s\n", filename, line, funcname);
        frame = frame->f_back;
    }
}
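For comparison, the same outward walk over f_back can be sketched from the Python side. This is only an illustration, not part of the original answer: the names walk_stack, outer and inner are hypothetical, and sys._getframe() is a CPython-specific helper.

```python
import sys

def walk_stack():
    """Collect (filename, lineno, function) for every frame, innermost first."""
    frame = sys._getframe()  # CPython-specific; the frame of walk_stack itself
    entries = []
    while frame is not None:
        code = frame.f_code
        entries.append((code.co_filename, frame.f_lineno, code.co_name))
        frame = frame.f_back  # step outward to the caller
    return entries

def outer():
    return inner()

def inner():
    return walk_stack()

if __name__ == "__main__":
    print("Python stack trace:")
    for filename, lineno, funcname in outer():
        print(f"    {filename}({lineno}): {funcname}")
```

Each tuple mirrors one line printed by the C loop: file name, line number, function name.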

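I've discovered that _frame is actually defined in the frameobject.h header that ships with Python. Armed with this, plus a look at traceback.c in the Python C implementation, we have: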

#include <Python.h>
#include <frameobject.h>

PyTracebackObject* traceback = get_the_traceback();

int line = traceback->tb_lineno;
const char* filename = PyString_AsString(traceback->tb_frame->f_code->co_filename);

But this still seems really dirty to me.

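One principle I've found useful when writing C extensions is to use each language where it is best suited. If a task is best implemented in Python, implement it in Python; if it is best implemented in C, do it in C. Interpreting tracebacks is best done in Python for two reasons: first, Python has the tools to do it, and second, it isn't speed-critical.

I would write a Python function that extracts the information you need from the traceback, then call it from C.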
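A minimal sketch of such a function, assuming Python 3 and that only the innermost location is needed. The name innermost_location is hypothetical; it follows tb_next the same way the C loop in the question does.

```python
def innermost_location(tb):
    """Return (filename, lineno, function) for the last entry of a traceback chain."""
    if tb is None:
        return None
    while tb.tb_next is not None:  # the most recent call is at the end
        tb = tb.tb_next
    code = tb.tb_frame.f_code
    return code.co_filename, tb.tb_lineno, code.co_name

def fail():
    raise ValueError("boom")

if __name__ == "__main__":
    try:
        fail()
    except ValueError as exc:
        print(innermost_location(exc.__traceback__))
```

From C, the traceback object obtained from PyErr_Fetch could be passed to this function as an ordinary argument, for example with PyObject_CallFunctionObjArgs, and the returned tuple unpacked there.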

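You could even go as far as writing a Python wrapper for executing the callable. Instead of invoking someCallablePythonObject directly, pass it as an argument to your Python function: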

def invokeSomeCallablePythonObject(obj, args):
    try:
        result = obj(*args)
        ok = True
    except:
        # Do some mumbo-jumbo with the traceback, etc.
        result = myTraceBackMunger(...)
        ok = False
    return ok, result

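Then, in your C code, call this Python function to do the work. The key here is to decide pragmatically which side of the C-Python split each piece of code belongs on.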
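One way the wrapper could look with the traceback handling filled in is sketched below. This assumes Python 3; describe_exception and invoke are hypothetical names, the dictionary layout is just one possible shape for the C side to unpack, and it catches Exception rather than using a bare except, which is a choice of this sketch.

```python
import sys
import traceback

def describe_exception(exc_type, exc_value, tb):
    """Flatten an exception into plain values that are easy to unpack from C."""
    frames = traceback.extract_tb(tb)
    last = frames[-1] if frames else None  # innermost frame, if any
    return {
        "type": exc_type.__name__,
        "message": str(exc_value),
        "filename": last.filename if last else None,
        "lineno": last.lineno if last else None,
    }

def invoke(obj, args):
    """Call obj(*args); return (True, result) or (False, error description)."""
    try:
        return True, obj(*args)
    except Exception:
        return False, describe_exception(*sys.exc_info())

if __name__ == "__main__":
    print(invoke(lambda a, b: a + b, (1, 2)))
    print(invoke(lambda: 1 / 0, ()))
```

The C caller then only has to check the first element of the returned tuple and read a few plain values, instead of touching frame or traceback structs.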

Related discussion

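I recently had reason to do this while writing an allocation tracker for numpy. The previous answers are close, but frame->f_lineno will not always return the correct line number; you need to call PyFrame_GetLineNumber(). Here is an updated code snippet: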

#include "frameobject.h"
...

PyFrameObject* frame = PyEval_GetFrame();
int lineno = PyFrame_GetLineNumber(frame);
PyObject *filename = frame->f_code->co_filename;

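The full state of the thread is also available in the PyFrameObject; if you want to walk the stack, keep iterating on f_back until it is NULL. Check out the full data structure in frameobject.h: http://svn.python.org/projects/python/trunk/include/frameobject.h

See also: https://docs.python.org/2/c-api/reflection.html

I extracted the error body of a Python exception with the following code. strExcType stores the exception type and strExcValue stores the exception body. Sample values are:

strExcType: "<class 'ImportError'>"
strExcValue: "ImportError("No module named 'nonexistingmodule'",)"

C++ code: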

if (PyErr_Occurred() != NULL) {
    PyObject *pyExcType;
    PyObject *pyExcValue;
    PyObject *pyExcTraceback;
    PyErr_Fetch(&pyExcType, &pyExcValue, &pyExcTraceback);
    PyErr_NormalizeException(&pyExcType, &pyExcValue, &pyExcTraceback);

    PyObject* str_exc_type = PyObject_Repr(pyExcType);
    PyObject* pyStr = PyUnicode_AsEncodedString(str_exc_type, "utf-8", "Error ~");
    const char *strExcType = PyBytes_AS_STRING(pyStr);

    PyObject* str_exc_value = PyObject_Repr(pyExcValue);
    PyObject* pyExcValueStr = PyUnicode_AsEncodedString(str_exc_value, "utf-8", "Error ~");
    const char *strExcValue = PyBytes_AS_STRING(pyExcValueStr);

    // Use or copy strExcType and strExcValue here: they point into pyStr and
    // pyExcValueStr and become invalid once those objects are released below.

    // When using PyErr_Restore() there is no need to use Py_XDECREF for these 3 pointers
    //PyErr_Restore(pyExcType, pyExcValue, pyExcTraceback);

    Py_XDECREF(pyExcType);
    Py_XDECREF(pyExcValue);
    Py_XDECREF(pyExcTraceback);

    Py_XDECREF(str_exc_type);
    Py_XDECREF(pyStr);

    Py_XDECREF(str_exc_value);
    Py_XDECREF(pyExcValueStr);
}

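You can access the Python traceback in a way similar to the tb_printinternal function, which iterates over the PyTracebackObject list. I also tried the suggestions above to iterate over frames, but that did not work for me (I only saw the last stack frame).

Excerpt from the CPython code: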

static int
tb_displayline(PyObject *f, PyObject *filename, int lineno, PyObject *name)
{
    int err;
    PyObject *line;

    if (filename == NULL || name == NULL)
        return -1;
    line = PyUnicode_FromFormat("  File \"%U\", line %d, in %U\n",
                                filename, lineno, name);
    if (line == NULL)
        return -1;
    err = PyFile_WriteObject(line, f, Py_PRINT_RAW);
    Py_DECREF(line);
    if (err != 0)
        return err;
    /* ignore errors since we can't report them, can we? */
    if (_Py_DisplaySourceLine(f, filename, lineno, 4))
        PyErr_Clear();
    return err;
}

static int
tb_printinternal(PyTracebackObject *tb, PyObject *f, long limit)
{
    int err = 0;
    long depth = 0;
    PyTracebackObject *tb1 = tb;
    while (tb1 != NULL) {
        depth++;
        tb1 = tb1->tb_next;
    }
    while (tb != NULL && err == 0) {
        if (depth <= limit) {
            err = tb_displayline(f,
                                 tb->tb_frame->f_code->co_filename,
                                 tb->tb_lineno,
                                 tb->tb_frame->f_code->co_name);
        }
        depth--;
        tb = tb->tb_next;
        if (err == 0)
            err = PyErr_CheckSignals();
    }
    return err;
}