Excel VBA Regular Expressions Working with Quotes

I need to declare a string to use as a regular expression pattern.

The string is:
(?<="[a-zA-Z0-9.-]*\d{8}.xml(?=")

Normally, to declare a string in VBA for use in a RegExp, you enclose it in double quotes, like so:
"(?<="[a-zA-Z0-9.-]*\d{8}.xml(?=")" But this results in a VBA compile error: Expected: end of statement, with [a-zA-Z0-9.-] highlighted.

This:
"(?<="[a-zA-Z0-9.-]*\d{8}.xml(?=")" results in the same error.

This
"(?<="[a-zA-Z0-9.-]*\d{8}.xml(?=")"

works, but when I use MsgBox to look at the pattern, it looks like this:

(?<="[a-zA-Z0-9.-]*\d{8}.xml(?=")

and so it doesn't work properly in the RegEx.

Arghhhh!

Here is the code I'm using for testing:

    Sub tester()
        Dim PATH_TO_FILINGS As String
        'PATH_TO_FILINGS ="www.sec.gov/Archives/edgar/data/1084869/000110465913082760"
        PATH_TO_FILINGS = "www.sec.gov/Archives/edgar/data/1446896/000144689612000023"
        MsgBox GetInstanceDocumentPath(PATH_TO_FILINGS)
    End Sub

    Function GetInstanceDocumentPath(PATH_TO_FILINGS As String)

        'this part launches IE and goes to the correct directory
        If IEbrowser Is Nothing Then
            Set IEbrowser = CreateObject("InternetExplorer.application")
            IEbrowser.Visible = False
        End If

        IEbrowser.Navigate URL:=PATH_TO_FILINGS

        While IEbrowser.Busy Or IEbrowser.readyState <> 4: DoEvents: Wend

        'this part starts the regular expression engine and searches for the reg exp pattern (i.e. the file name)
        Dim RE As Object
        Set RE = CreateObject("vbscript.regexp")

        RE.Pattern = "(?<="[a-zA-Z0-9.-]*\d{8}.xml(?=")"   '"\w+(?=-)(-)\d{8}(.xml)"
        MsgBox RE.Pattern
        RE.IgnoreCase = True

        Dim INSTANCEDOCUMENT As Object

        Set INSTANCEDOCUMENT = RE.Execute(IEbrowser.Document.body.innerhtml)

        If INSTANCEDOCUMENT.Count = 1 Then

            GetInstanceDocumentPath = PATH_TO_FILINGS & "/" & INSTANCEDOCUMENT.Item(0)

        End If

    End Function

Any ideas on how to solve this are appreciated.


Try this:

    Sub Test()
        Dim RealQ As String
        Dim Pattern As String

        'Chr(34) is the double-quote character
        RealQ = Chr(34)
        Pattern = "(?<=" & RealQ & ")[a-zA-Z0-9.-]*\d{8}.xml(?=" & RealQ & ")"
        MsgBox Pattern
    End Sub

Result: the message box displays the pattern with a literal double quote in each position where RealQ was concatenated.
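As a side note, the same literal quote can also be written without Chr(34), because a doubled quote inside a VBA string literal stands for a single quote character. The following is a minimal sketch comparing the two spellings on a shortened fragment of the pattern; the Sub name and variable names are made up for illustration.

```vba
Sub TestDoubledQuotes()
    Dim viaChr As String
    Dim viaDoubled As String

    ' Chr(34) is the double-quote character, concatenated in as in the answer.
    viaChr = "(?=" & Chr(34) & ")"

    ' Inside a VBA string literal, "" stands for one literal " character.
    viaDoubled = "(?="")"

    ' Both lines print the same text to the Immediate window: (?=")
    Debug.Print viaChr
    Debug.Print viaDoubled

    ' The two strings compare equal.
    Debug.Print (viaChr = viaDoubled)
End Sub
```

Either form produces the same string, so the choice is mostly about readability: Chr(34) keeps the quotes easy to spot, while doubled quotes keep the pattern in one literal.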

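Also, VBA (through the VBScript RegExp engine) does not support lookbehind, but it does support lookahead. A better reference can be found here.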
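Since lookbehind is not available, one common workaround is to match the opening quote directly and pull the file name out of a capture group. The sketch below is only an illustration under stated assumptions: the sample HTML string and the Sub name are hypothetical, and the dot before xml is escaped as \. here, which is slightly stricter than the pattern in the question.

```vba
Sub TestCaptureInsteadOfLookbehind()
    Dim re As Object
    Dim matches As Object
    Dim sample As String

    ' Hypothetical sample input standing in for the page HTML.
    sample = "<a href=""abc-20120630.xml"">instance</a>"

    Set re = CreateObject("VBScript.RegExp")

    ' Match the opening quote itself, capture the file name in group 1,
    ' and use a lookahead (which is supported) for the closing quote.
    re.Pattern = """([a-zA-Z0-9.-]*\d{8}\.xml)(?="")"
    re.IgnoreCase = True

    Set matches = re.Execute(sample)
    If matches.Count = 1 Then
        ' SubMatches(0) is the first capture group, without the leading quote.
        Debug.Print matches.Item(0).SubMatches(0)
    End If
End Sub
```

With the sample string above, the captured group is the file name without the surrounding quotes. In the question's function, the same idea would mean reading SubMatches(0) from the match rather than using the match itself.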