Python: how to know exactly which tag is not closed in XML
I have an XML file, and I check whether it is really well-formed XML like this:
```python
try:
    self.doc = etree.parse(attributesXMLFilePath)
except IOError:
    error_message = "Error: Couldn't find attribute XML file path {0}".format(attributesXMLFilePath)
    raise XMLFileNotFoundException(error_message)
except XMLSyntaxError:
    error_message = "The file {0} is not a good XML file, recheck please".format(attributesXMLFilePath)
    raise NotGoodXMLFormatException(error_message)
```
As you can see, I am catching XMLSyntaxError, which is an error from:
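That works well, but it only tells me whether or not the file is well-formed XML. What I would like to ask is whether there is a way to know which tag is wrong, because in my case, when I do this: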
```xml
<name>Marco</name1>
```
I get the error. Is there a way to know exactly which tag is not closed?
Update
After some people gave me the idea of using the line and position, I came up with this code:
```python
class XMLFileNotFoundException(GeneralSpiderException):
    def __init__(self, message):
        super(XMLFileNotFoundException, self).__init__(message, self)

class GeneralSpiderException(Exception):
    def __init__(self, message, e):
        super(GeneralSpiderException, self).__init__(message+" line of Exception = {0}, position of Exception = {1}".format(e.lineno, e.position))
```
I still raise the error like this:
```python
raise XMLFileNotFoundException(error_message)
```
and now I get this error:
```
    super(GeneralSpiderException, self).__init__(message+" line of Exception = {0}, position of Exception = {1}".format(e.lineno, e.position))
exceptions.AttributeError: 'XMLFileNotFoundException' object has no attribute 'lineno'
```
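The AttributeError comes from the `self` passed as the second argument in `XMLFileNotFoundException.__init__`: `e` ends up being the new exception itself, not the parser error that was caught, so it has no `lineno` or `position`. Here is a minimal sketch of one way around it, assuming Python 3. The `cause` parameter, the `FakeSyntaxError` stand-in and its `(3, 7)` position are all made up for illustration and are not part of lxml. An `IOError` carries no position, so the sketch falls back to the plain message in that case.

```python
class GeneralSpiderException(Exception):
    def __init__(self, message, cause=None):
        # `cause` is the exception that was caught (for example the
        # XMLSyntaxError), not the exception that is being constructed.
        position = getattr(cause, "position", None)
        if position is not None:
            message = "{0} (line {1}, column {2})".format(
                message, position[0], position[1]
            )
        super().__init__(message)
        self.cause = cause


class NotGoodXMLFormatException(GeneralSpiderException):
    pass


class FakeSyntaxError(Exception):
    """Hypothetical stand-in for a parser error that exposes `position`."""

    position = (3, 7)  # made-up (line, column), for illustration only


try:
    raise FakeSyntaxError("tag mismatch")
except FakeSyntaxError as exc:
    # Pass the caught exception along instead of `self`.
    error = NotGoodXMLFormatException("The file is not a good XML file", exc)

print(error)
```

In the original code, this would mean raising with `raise NotGoodXMLFormatException(error_message, e)` inside an `except XMLSyntaxError as e:` block.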
This may not be exactly what you want, but you can get the exact line and column where the error was detected from the exception:
```python
import lxml.etree
import StringIO

xml_fragment = "<name>Marco</name1>"
#               12345678901234
try:
    lxml.etree.parse(StringIO.StringIO(xml_fragment))
except lxml.etree.XMLSyntaxError as exc:
    line, column = exc.position
```
In this example,
You can print the details of the error. For example:
```python
try:
    self.doc = etree.parse(attributesXMLFilePath)
except XMLSyntaxError as e:
    error_message = "The file {0} is not correct XML, {1}".format(attributesXMLFilePath, e.msg)
    raise NotGoodXMLFormatException(error_message)
```
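To try the idea without lxml installed, here is a rough Python 3 sketch that swaps in the standard library's `xml.etree.ElementTree`. Its `ParseError` also exposes `position` as a `(line, column)` tuple, and `str(exc)` gives the parser's message. The helper name `describe_xml_error` and the dict layout are assumptions for illustration. Message wording and column numbering may well differ from what lxml reports.

```python
import xml.etree.ElementTree as ET  # stand-in for lxml.etree in this sketch


def describe_xml_error(xml_text):
    """Return None for well-formed XML, otherwise a dict describing the error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # position is a (line, column) tuple set by the parser.
        line, column = exc.position
        return {"line": line, "column": column, "message": str(exc)}
    return None


report = describe_xml_error("<name>Marco</name1>")
print(report)
```

Note that a parser reports the place where it detected the mismatch, which for this fragment is the closing tag. The actual typo may sit in an earlier opening tag.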