How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops
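- An in-cell function that returns a matched pattern or a replaced value in a string.
- A Sub that loops through a column of data and extracts matches to adjacent cells.
- What setup is necessary?
- What are Excel's special characters for regular expressions?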
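Step 1: Add a VBA reference to "Microsoft VBScript Regular Expressions 5.5"
- Select the "Developer" tab (if you don't have this tab, enable it under File > Options > Customize Ribbon)
- Select the "Visual Basic" icon from the "Code" ribbon section
- In the "Microsoft Visual Basic for Applications" window, select "Tools" from the top menu.
- Select "References"
- Check the box next to "Microsoft VBScript Regular Expressions 5.5" to include it in your workbook.
- Click "OK"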
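If setting the reference is not possible, late binding is an alternative. The following is a minimal sketch, assuming a Windows installation of Excel where the VBScript regex library is registered; the procedure name `CheckRegexAvailable` is hypothetical and only used for illustration.

```vba
' Hypothetical helper: reports whether the VBScript regex engine can be created.
' Uses late binding (CreateObject), so no reference has to be ticked first.
Sub CheckRegexAvailable()
    Dim re As Object

    On Error Resume Next                      ' CreateObject raises an error if the library is missing
    Set re = CreateObject("VBScript.RegExp")
    On Error GoTo 0

    If re Is Nothing Then
        MsgBox "VBScript.RegExp could not be created on this machine."
    Else
        MsgBox "VBScript.RegExp is available."
    End If
End Sub
```

With early binding (the reference from Step 1) you get IntelliSense and can write `Dim regEx As New RegExp`; with late binding the variable has to be declared `As Object`.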
- For example, a-z matches lowercase letters from a to z.
- For example, 0-5 matches any digit from 0 to 5.
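- For example, [a] matches the letter a.
- For example, [abc] matches a single letter, which can be a, b or c.
- For example, [a-z] matches any single lowercase letter of the alphabet.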
- For example, [a]{2} matches two consecutive lowercase letters a: aa.
- For example, [a]{1,3} matches at least one and at most three lowercase letters a: a, aa, aaa.
- For example, a+ matches consecutive a's: a, aa, aaa, and so on.
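- For example, the pattern may or may not be present, but it can be matched at most once.
- For example, [a-z]? matches an empty string or any single lowercase letter.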
- For example, a. matches a two-character string that starts with a and ends with any character except a newline (\n).
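- For example, a|b means that either a or b can be matched.
- For example, an alternation of color names such as red|white|orange matches exactly one of the colors.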
- For example, [^0-9] means the character cannot be a digit.
- For example, [^aA] means the character cannot be a lowercase a or an uppercase A.
- For example: \. , \\ , \( , \? , \$ , \^ (a backslash escapes the special character that follows it).
- For example, ^a means the first character must be a lowercase a.
- For example, ^[0-9] means the first character must be a digit.
- For example, a$ means the last character must be a lowercase a.
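Several of the pattern examples above can be tried directly from the Immediate window. This is a minimal sketch, not part of the original answer: the names `PatternMatches` and `DemoPatterns` are hypothetical, and late binding is assumed so that the code runs without the reference from Step 1.

```vba
' Hypothetical helper: True if strPattern matches somewhere in strInput.
Function PatternMatches(ByVal strInput As String, ByVal strPattern As String) As Boolean
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.IgnoreCase = False
    re.Pattern = strPattern
    PatternMatches = re.Test(strInput)
End Function

' Prints the results to the Immediate window (Ctrl+G in the VBA editor).
Sub DemoPatterns()
    Debug.Print PatternMatches("aa", "^[a]{2}$")                 ' True
    Debug.Print PatternMatches("aaaa", "^[a]{1,3}$")             ' False: four a's exceed the upper bound
    Debug.Print PatternMatches("white", "^(red|white|orange)$")  ' True
    Debug.Print PatternMatches("7", "^[^0-9]$")                  ' False: 7 is a digit
    Debug.Print PatternMatches("a.b", "a\.b")                    ' True: \. is a literal dot
    Debug.Print PatternMatches("axb", "a\.b")                    ' False
End Sub
```

The anchors ^ and $ are added in most of these calls so that the whole string has to fit the pattern; without them, `Test` succeeds as soon as any substring matches.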
    Order  Name                 Representation
    1      Parentheses          ( )
    2      Multipliers          ? + * {m,n} {m,n}?
    3      Sequence & Anchors   abc ^ $
    4      Alternation          |
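The practical effect of this precedence order is easiest to see with alternation, which binds last. The sketch below is illustrative only; `DemoPrecedence` is a hypothetical name and late binding is assumed.

```vba
' Illustrates that alternation has the lowest precedence and parentheses the highest.
Sub DemoPrecedence()
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    ' Without parentheses this reads as (^a) OR (b$).
    re.Pattern = "^a|b$"
    Debug.Print re.Test("axx")   ' True: the string starts with a
    Debug.Print re.Test("xxb")   ' True: the string ends with b

    ' With parentheses the anchors apply to both alternatives.
    re.Pattern = "^(a|b)$"
    Debug.Print re.Test("axx")   ' False: the whole string must be a or b
    Debug.Print re.Test("b")     ' True
End Sub
```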
    Abbr.  Same as           Meaning
    \d     [0-9]             Any single digit
    \D     [^0-9]            Any single character that's not a digit
    \w     [a-zA-Z0-9_]      Any word character
    \W     [^a-zA-Z0-9_]     Any non-word character
    \s     [ \f\n\r\t\v]     Any space character
    \S     [^ \f\n\r\t\v]    Any non-space character
    \n     [\n]              New line
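To confirm that an abbreviation and its bracket form behave the same, one can count matches for both. This is a small sketch under the assumption of late binding; `CountMatches`, `DemoAbbreviations` and the sample string are made up for illustration.

```vba
' Hypothetical helper: number of non-overlapping matches of strPattern in strInput.
Function CountMatches(ByVal strInput As String, ByVal strPattern As String) As Long
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True              ' find all matches, not only the first
    re.Pattern = strPattern
    CountMatches = re.Execute(strInput).Count
End Function

Sub DemoAbbreviations()
    Const SAMPLE As String = "Order 66, item_7"
    Debug.Print CountMatches(SAMPLE, "\d")      ' 3 digits: 6, 6, 7
    Debug.Print CountMatches(SAMPLE, "[0-9]")   ' 3, same as \d
    Debug.Print CountMatches(SAMPLE, "\w+")     ' 3 words: Order, 66, item_7
    Debug.Print CountMatches(SAMPLE, "\s")      ' 2 spaces
End Sub
```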
    ' Macro: strips a leading one- or two-digit number from cell A1 and shows the result.
    Private Sub simpleRegex()
        Dim strPattern As String: strPattern = "^[0-9]{1,2}"
        Dim strReplace As String: strReplace = ""
        Dim regEx As New RegExp
        Dim strInput As String
        Dim Myrange As Range

        Set Myrange = ActiveSheet.Range("A1")

        If strPattern <> "" Then
            strInput = Myrange.Value

            With regEx
                .Global = True
                .MultiLine = True
                .IgnoreCase = False
                .Pattern = strPattern
            End With

            If regEx.Test(strInput) Then
                MsgBox (regEx.Replace(strInput, strReplace))
            Else
                MsgBox ("Not matched")
            End If
        End If
    End Sub
    ' In-cell function: strips a leading one- to three-digit number from the given cell.
    ' Use it on the worksheet as =simpleCellRegex(A1).
    Function simpleCellRegex(Myrange As Range) As String
        Dim regEx As New RegExp
        Dim strPattern As String
        Dim strInput As String
        Dim strReplace As String
        Dim strOutput As String

        strPattern = "^[0-9]{1,3}"

        If strPattern <> "" Then
            strInput = Myrange.Value
            strReplace = ""

            With regEx
                .Global = True
                .MultiLine = True
                .IgnoreCase = False
                .Pattern = strPattern
            End With

            If regEx.Test(strInput) Then
                simpleCellRegex = regEx.Replace(strInput, strReplace)
            Else
                simpleCellRegex = "Not matched"
            End If
        End If
    End Function
    ' Macro: same replacement as the single-cell macro, applied to every cell in A1:A5.
    ' Note: this Sub shares its name with the single-cell macro; keep only one of them
    ' per module or rename one.
    Private Sub simpleRegex()
        Dim strPattern As String: strPattern = "^[0-9]{1,2}"
        Dim strReplace As String: strReplace = ""
        Dim regEx As New RegExp
        Dim strInput As String
        Dim Myrange As Range
        Dim cell As Range          ' loop variable, declared so the code compiles under Option Explicit

        Set Myrange = ActiveSheet.Range("A1:A5")

        For Each cell In Myrange
            If strPattern <> "" Then
                strInput = cell.Value

                With regEx
                    .Global = True
                    .MultiLine = True
                    .IgnoreCase = False
                    .Pattern = strPattern
                End With

                If regEx.Test(strInput) Then
                    MsgBox (regEx.Replace(strInput, strReplace))
                Else
                    MsgBox ("Not matched")
                End If
            End If
        Next
    End Sub
    ' Macro: splits values such as 123a4567 in A1:A3 into three groups and writes
    ' each group to the next columns on the same row.
    Private Sub splitUpRegexPattern()
        Dim regEx As New RegExp
        Dim strPattern As String
        Dim strInput As String
        Dim Myrange As Range
        Dim C As Range             ' loop variable, declared so the code compiles under Option Explicit

        Set Myrange = ActiveSheet.Range("A1:A3")

        For Each C In Myrange
            strPattern = "(^[0-9]{3})([a-zA-Z])([0-9]{4})"

            If strPattern <> "" Then
                strInput = C.Value

                With regEx
                    .Global = True
                    .MultiLine = True
                    .IgnoreCase = False
                    .Pattern = strPattern
                End With

                If regEx.Test(strInput) Then
                    C.Offset(0, 1) = regEx.Replace(strInput, "$1")
                    C.Offset(0, 2) = regEx.Replace(strInput, "$2")
                    C.Offset(0, 3) = regEx.Replace(strInput, "$3")
                Else
                    C.Offset(0, 1) = "(Not matched)"
                End If
            End If
        Next
    End Sub
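The macro above extracts each group with `Replace` and `$1`, `$2`, `$3`. `Replace` leaves any text outside the match in place, so an alternative worth considering is reading the groups from `SubMatches`, which returns only the captured text. The following is a sketch of that alternative, not the original author's method; `SplitParts` and `DemoSplitParts` are hypothetical names and late binding is assumed.

```vba
' Hypothetical helper: returns the three captured groups as an array,
' or an empty array if the input does not match.
Function SplitParts(ByVal strInput As String) As Variant
    Dim re As Object
    Dim m As Object

    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "^([0-9]{3})([a-zA-Z])([0-9]{4})"

    If re.Test(strInput) Then
        Set m = re.Execute(strInput)(0)     ' first (and only) match
        SplitParts = Array(m.SubMatches(0), m.SubMatches(1), m.SubMatches(2))
    Else
        SplitParts = Array()
    End If
End Function

Sub DemoSplitParts()
    Dim parts As Variant
    parts = SplitParts("123a4567")
    Debug.Print Join(parts, " | ")          ' 123 | a | 4567
End Sub
```

`SubMatches` is zero-based, so `SubMatches(0)` corresponds to `$1`.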
    String   Regex Pattern                 Explanation
    a1aaa    [a-zA-Z][0-9][a-zA-Z]{3}      Single alpha, single digit, three alpha characters
    a1aaa    [a-zA-Z]?[0-9][a-zA-Z]{3}     May or may not have a preceding alpha character
    a1aaa    [a-zA-Z][0-9][a-zA-Z]{0,3}    Single alpha, single digit, 0 to 3 alpha characters
    a1aaa    [a-zA-Z][0-9][a-zA-Z]*        Single alpha, single digit, followed by any number of alpha characters
    </i8>    \<\/[a-zA-Z][0-9]\>           Literal </ and > around any single alpha followed by any single digit
    =regex("Peter Gordon: [email protected], 47","\w+@\w+\.\w+")
    =regex("Peter Gordon: [email protected], 47","\w+@\w+\.\w+","$0")
    =regex("Peter Gordon: [email protected], 47","^(.+): (.+), (\d+)$","E-Mail: $2, Name: $1")
    =regex("Peter Gordon: [email protected], 47","^(.+): (.+), (\d+)$","$" & 1)
    =regex("Peter Gordon: [email protected], 47","^(.+): (.+), (\d+)$","$" & 2)

    ' In-cell function: matches strInput against matchPattern and returns outputPattern
    ' with $0 replaced by the whole first match and $1, $2, ... by its capture groups.
    ' Returns False if nothing matches and #VALUE! if a $ tag exceeds the group count.
    Function regex(strInput As String, matchPattern As String, Optional ByVal outputPattern As String = "$0") As Variant
        Dim inputRegexObj As New VBScript_RegExp_55.RegExp
        Dim outputRegexObj As New VBScript_RegExp_55.RegExp
        Dim outReplaceRegexObj As New VBScript_RegExp_55.RegExp
        Dim inputMatches As Object, replaceMatches As Object, replaceMatch As Object
        Dim replaceNumber As Integer

        With inputRegexObj
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = matchPattern
        End With
        ' Finds the $n tags inside the output pattern.
        With outputRegexObj
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = "\$(\d+)"
        End With
        With outReplaceRegexObj
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
        End With

        Set inputMatches = inputRegexObj.Execute(strInput)
        If inputMatches.Count = 0 Then
            regex = False
        Else
            Set replaceMatches = outputRegexObj.Execute(outputPattern)
            For Each replaceMatch In replaceMatches
                replaceNumber = replaceMatch.SubMatches(0)
                outReplaceRegexObj.Pattern = "\$" & replaceNumber

                If replaceNumber = 0 Then
                    outputPattern = outReplaceRegexObj.Replace(outputPattern, inputMatches(0).Value)
                Else
                    If replaceNumber > inputMatches(0).SubMatches.Count Then
                        'regex = "A too high $ tag found. Largest allowed is $" & inputMatches(0).SubMatches.Count & "."
                        regex = CVErr(xlErrValue)
                        Exit Function
                    Else
                        outputPattern = outReplaceRegexObj.Replace(outputPattern, inputMatches(0).SubMatches(replaceNumber - 1))
                    End If
                End If
            Next
            regex = outputPattern
        End If
    End Function
Save and close the Microsoft Visual Basic for Applications editor window.

    ' In-cell function: returns the first match of regexPattern in strInput.
    Function RegxFunc(strInput As String, regexPattern As String) As String
        Dim regEx As New RegExp
        Dim matches As Object      ' match collection, declared so the code compiles under Option Explicit

        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = regexPattern
        End With

        If regEx.Test(strInput) Then
            Set matches = regEx.Execute(strInput)
            RegxFunc = matches(0).Value
        Else
            RegxFunc = "not matched"
        End If
    End Function
    ' In-cell function: returns the first capture group of the first match, or "#N/A".
    Function RegParse(ByVal pattern As String, ByVal html As String)
        Dim regex As RegExp
        Dim mStr As String         ' text of the first match
        Set regex = New RegExp

        With regex
            .IgnoreCase = True     ' ignore case while the regex engine performs the search
            .Pattern = pattern     ' set the regex pattern
            .Global = False        ' restrict the regex to the first match only

            If .Test(html) Then    ' test whether the pattern matches
                mStr = .Execute(html)(0)            ' the string that matches the regex
                RegParse = .Replace(mStr, "$1")     ' keep only the first capture group (first set of parentheses)
            Else
                RegParse = "#N/A"
            End If
        End With
    End Function
    Function REGPLACE(myRange As Range, matchPattern As String, outputPattern As String) As Variant
        Dim regex As New VBScript_RegExp_55.RegExp
        Dim strInput As String

        strInput = myRange.Value

        With regex
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = matchPattern
        End With

        REGPLACE = regex.Replace(strInput, outputPattern)
    End Function
    =regex_subst("watermellon","[aeiou]","")   ---> wtrmlln
    =regex_subst("watermellon","[^aeiou]","")  ---> aeeo
    Function regex_subst( _
         strInput As String _
       , matchPattern As String _
       , Optional ByVal replacePattern As String = "" _
    ) As Variant
        Dim inputRegexObj As New VBScript_RegExp_55.RegExp

        With inputRegexObj
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = matchPattern
        End With

        regex_subst = inputRegexObj.Replace(strInput, replacePattern)
    End Function
    Public Sub RegExSearch()
        ' Late-bound version: works without the VBScript Regular Expressions reference.
        Dim regexp As Object
        'Dim regex As New VBScript_RegExp_55.regexp 'Caused "User Defined Type Not Defined" Error
        Dim rng As Range, rcell As Range
        Dim strInput As String, strPattern As String

        Set regexp = CreateObject("vbscript.regexp")
        Set rng = ActiveSheet.Range("A1:A1")

        For Each rcell In rng.Cells
            strPattern = "([a-z]{2})([0-9]{8})"
            'Search for 2 letters then 8 digits, e.g. XY12345678 = Matched

            If strPattern <> "" Then
                strInput = rcell.Value

                With regexp
                    .Global = False
                    .MultiLine = False
                    .IgnoreCase = True
                    .Pattern = strPattern
                End With

                If regexp.Test(strInput) Then
                    MsgBox rcell & " Matched in Cell " & rcell.Address
                Else
                    MsgBox "No Matches!"
                End If
            End If
        Next
    End Sub