How do you access the matched groups in a JavaScript regular expression?
I want to match a portion of a string using a regular expression and then access that parenthesized substring:
    var myString = "something format_abc"; // I want "abc"
    var arr = /(?:^|\s)format_(.*?)(?:\s|$)/.exec(myString);
    console.log(arr);    // Prints: [" format_abc", "abc"] .. so far so good.
    console.log(arr[1]); // Prints: undefined (???)
    console.log(arr[0]); // Prints: format_undefined (!!!)
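What am I doing wrong?
I have found that there is nothing wrong with the regular expression code above: the actual string I was testing against was: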
    "date format_%A"
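Reporting that "%A" is undefined seems like very strange behaviour, but it is not directly related to this question, so I have opened a new one: Why is a matched substring returning "undefined" in JavaScript?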
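The issue was that console.log treats its first argument as a printf-style format string, so the "%A" in the logged string was interpreted as a format specifier instead of being printed literally.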
You can access capturing groups like this:
    var myString = "something format_abc";
    var myRegexp = /(?:^|\s)format_(.*?)(?:\s|$)/g;
    var match = myRegexp.exec(myString);
    console.log(match[1]); // abc
If there are multiple matches, you can iterate over them:
    var myString = "something format_abc";
    var myRegexp = /(?:^|\s)format_(.*?)(?:\s|$)/g;
    var match = myRegexp.exec(myString);
    while (match != null) {
      // matched text: match[0]
      // match start: match.index
      // capturing group n: match[n]
      console.log(match[0]);
      match = myRegexp.exec(myString);
    }
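The loop above depends on the g flag: exec on a global regex advances lastIndex after each match, whereas a non-global regex keeps returning the first match and the loop never terminates. Here is a minimal sketch of a guard around that loop. The helper name collectMatches is made up for illustration and does not come from the answer.

```javascript
// Hypothetical helper: collects every match array, refusing non-global
// patterns that would make the exec() loop spin forever.
function collectMatches(str, regex) {
  if (!regex.global) {
    throw new TypeError('collectMatches needs a regex with the g flag');
  }
  regex.lastIndex = 0; // start from the beginning even if the regex was used before
  var results = [];
  var match;
  while ((match = regex.exec(str)) !== null) {
    // A zero-length match does not advance lastIndex on its own.
    if (match[0] === '') {
      regex.lastIndex++;
    }
    results.push(match);
  }
  return results;
}

var found = collectMatches('format_abc format_def', /format_(\w+)/g);
console.log(found.map(function (m) { return m[1]; })); // ["abc", "def"]
```

Resetting lastIndex up front is a design choice for reused regex objects; drop it if you deliberately want to resume from a previous position.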
Here is a method you can use to get the nth capturing group for each match:
    function getMatches(string, regex, index) {
      index || (index = 1); // default to the first capturing group
      var matches = [];
      var match;
      while (match = regex.exec(string)) {
        matches.push(match[index]);
      }
      return matches;
    }

    // Example:
    var myString = 'something format_abc something format_def something format_ghi';
    var myRegEx = /(?:^|\s)format_(.*?)(?:\s|$)/g;

    // Get an array containing the first capturing group for every match
    var matches = getMatches(myString, myRegEx, 1);

    // Log results
    document.write(matches.length + ' matches found: ' + JSON.stringify(matches));
    console.log(matches);
    var myString = "something format_abc";
    var arr = myString.match(/\bformat_(.*?)\b/);
    console.log(arr[0] + " " + arr[1]);
Regarding the multi-match parentheses examples above, I came here looking for an answer after not getting what I wanted from:
    var matches = mystring.match(/(?:neededToMatchButNotWantedInResult)(matchWanted)/igm);
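After looking at the slightly convoluted function calls with while and .push() above, it dawned on me that the problem can be solved very elegantly with mystring.replace() instead (the replacing is not the point, and isn't even done; the clean, built-in recursive function call option for the second parameter is!):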
    var yourstring = 'something format_abc something format_def something format_ghi';
    var matches = [];
    yourstring.replace(/format_([^\s]+)/igm, function (m, p1) { matches.push(p1); });
After this, I don't think I will ever use .match() for much of anything again.
Last but not least, I found one line of code that worked fine for me (JS ES6):
    let reg = /#([\S]+)/igm; // Get hashtags.
    let string = 'mi alegría es total! ??? #fiestasdefindeaño #PadreHijo #buenosmomentos #france #paris';
    let matches = (string.match(reg) || []).map(e => e.replace(reg, '$1'));
    console.log(matches);
This will return:
    ['fiestasdefindeaño', 'PadreHijo', 'buenosmomentos', 'france', 'paris']
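Your syntax is probably not the best one to keep. FF/Gecko defines RegExp as an extension of Function. (FF2 went as far as
This appears to be specific to FF: IE, Opera, and Chrome all throw exceptions for it.
Instead, use either of the methods mentioned earlier: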
    var regex = /(?:^|\s)format_(.*?)(?:\s|$)/;
    var input = "something format_abc";
    regex(input);       //=> [" format_abc", "abc"] (the FF-specific call form; other engines throw here)
    regex.exec(input);  //=> [" format_abc", "abc"]
    input.match(regex); //=> [" format_abc", "abc"]
Terminology used in this answer:
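- Match indicates the result of running your regex pattern against your string, like so: someString.match(regexPattern).
- Matched patterns indicate all matched portions of the input string, which all reside inside the match array. These are all instances of your pattern inside the input string.
- Matched groups indicate all groups to catch, defined in the regex pattern (the patterns inside parentheses, as in /format_(.*?)/g, where (.*?) is a matched group). These reside within the matched patterns.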
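To make the distinction concrete, here is a small sketch (the variable names are illustrative only). With the g flag, String.prototype.match returns just the matched patterns and drops the groups; without it, the result holds the first matched pattern followed by its groups.

```javascript
var text = 'format_abc format_def';

// With g: every matched pattern, but no capturing groups.
var allPatterns = text.match(/format_(\w+)/g);
console.log(allPatterns); // ["format_abc", "format_def"]

// Without g: the first matched pattern at index 0, its groups after it.
var firstOnly = text.match(/format_(\w+)/);
console.log(firstOnly[0]); // "format_abc"
console.log(firstOnly[1]); // "abc"
```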
Description
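To access the matched groups in each of the matched patterns, you need a function or something similar to iterate over the match. There are a number of ways to do this, as many of the other answers show. Most other answers use a while loop to iterate over all matched patterns, but I think we all know the potential dangers of that approach. It is necessary to match against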
Below is a function
Since they basically implement
    // Concise ES6/ES2015 syntax
    const searchString = (string, pattern) =>
      string
        .match(new RegExp(pattern.source, pattern.flags))
        .map(match => new RegExp(pattern.source, pattern.flags).exec(match));

    // Or, if you will, with ES5 syntax (named searchStringES5 here so that both
    // definitions can coexist; pattern.flags itself still needs an ES2015 engine)
    function searchStringES5(string, pattern) {
      return string
        .match(new RegExp(pattern.source, pattern.flags))
        .map(function (match) {
          return new RegExp(pattern.source, pattern.flags).exec(match);
        });
    }

    let string = "something format_abc",
        pattern = /(?:^|\s)format_(.*?)(?:\s|$)/;

    let result = searchString(string, pattern);
    // [[" format_abc", "abc"], null]
    // The trailing `null` disappears if you add the `global` flag
Performant version (more code, less syntactic sugar)
    // Performant ES6/ES2015 syntax
    const searchString = (string, pattern) => {
      let result = [];
      const matches = string.match(new RegExp(pattern.source, pattern.flags));
      for (let i = 0; i < matches.length; i++) {
        result.push(new RegExp(pattern.source, pattern.flags).exec(matches[i]));
      }
      return result;
    };

    // Same thing, but with ES5 syntax (named searchStringES5 here so that both
    // definitions can coexist)
    function searchStringES5(string, pattern) {
      var result = [];
      var matches = string.match(new RegExp(pattern.source, pattern.flags));
      for (var i = 0; i < matches.length; i++) {
        result.push(new RegExp(pattern.source, pattern.flags).exec(matches[i]));
      }
      return result;
    }

    let string = "something format_abc",
        pattern = /(?:^|\s)format_(.*?)(?:\s|$)/;

    let result = searchString(string, pattern);
    // [[" format_abc", "abc"], null]
    // The trailing `null` disappears if you add the `global` flag
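I have yet to compare these alternatives with the ones mentioned in the other answers, but I doubt this approach is any less performant or less fail-safe than the others.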
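One case worth guarding against: String.prototype.match returns null when nothing matches, so calling .map or reading .length on the result throws. Below is a minimal null-safe sketch of the same idea. The name searchStringSafe is hypothetical and not part of the answer.

```javascript
// Hypothetical null-safe variant: an input with no match yields an empty array.
function searchStringSafe(string, pattern) {
  var matches = string.match(new RegExp(pattern.source, pattern.flags)) || [];
  return matches.map(function (matched) {
    // A fresh regex per element, so lastIndex always starts at 0.
    return new RegExp(pattern.source, pattern.flags).exec(matched);
  });
}

console.log(searchStringSafe('nothing here', /format_(\w+)/g)); // []

var hits = searchStringSafe('format_abc format_def', /format_(\w+)/g);
console.log(hits.map(function (m) { return m[1]; })); // ["abc", "def"]
```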
There is no need to call
    var str = "This is cool";
    var matches = str.match(/(This is)( cool)$/);
    console.log(JSON.stringify(matches)); // will print ["This is cool","This is"," cool"] or something like that...
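Position 0 holds the whole matched string. Position 1 holds the first group captured by parentheses, and position 2 holds the second group. Nested parentheses are tricky, so beware!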
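On the nesting point: groups are numbered by the position of their opening parenthesis, counting from the left, and an optional group that takes no part in the match shows up as undefined. Here is a short sketch with made-up input strings:

```javascript
// Group 1 wraps groups 2 and 3; group 4 follows them.
var dateMatch = '2014-12-02'.match(/((\d{4})-(\d{2}))-(\d{2})/);
console.log(dateMatch[1]); // "2014-12"
console.log(dateMatch[2]); // "2014"
console.log(dateMatch[3]); // "12"
console.log(dateMatch[4]); // "02"

// The optional group (x)? does not participate, so its slot is undefined.
var optional = 'ab'.match(/(a)(x)?(b)/);
console.log(optional[2]); // undefined
console.log(optional[3]); // "b"
```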
A one-liner that is practical only if you have a single pair of parentheses:
    while ((match = myRegex.exec(myStr)) && matches.push(match[1])) {}
Using your code:
    console.log(arr[1]); // prints: abc
    console.log(arr[0]); // prints: format_abc
Edit: Safari 3, if it matters.
    function getMatches(string, regex, index) {
      index || (index = 1); // default to the first capturing group
      var matches = [];
      var match;
      while (match = regex.exec(string)) {
        matches.push(match[index]);
      }
      return matches;
    }

    // Example:
    var myString = 'Rs.200 is Debited to A/c ...2031 on 02-12-14 20:05:49 (Clear Bal Rs.66248.77) AT ATM. TollFree 1800223344 18001024455 (6am-10pm)';
    var myRegEx = /clear bal.+?(\d+\.?\d{2})/gi;

    // Get an array containing the first capturing group for every match
    var matches = getMatches(myString, myRegEx, 1);

    // Log results
    document.write(matches.length + ' matches found: ' + JSON.stringify(matches));
    console.log(matches);
With matchAll available, you can avoid the while loop and exec with /g ... Instead, by using matchAll, you get back an iterator which you can use with the more convenient for...of, array spread, or Array.from() constructs.
This method produces output similar to its counterpart in C#.
See a JS demo (tested in Google Chrome 73.0.3683.67 (official build), beta (64-bit)):
    var myString = "key1:value1, key2-value2!!@key3=value3";
    var matches = myString.matchAll(/(\w+)[:=-](\w+)/g);
    console.log([...matches]); // All matches with capturing group values
You can also use the following:
    let matchData = "key1:value1, key2-value2!!@key3=value3".matchAll(/(\w+)[:=-](\w+)/g);
    var matches = [...matchData]; // Note: the matchAll result is not re-iterable
    console.log(Array.from(matches, m => m[0])); // All match (Group 0) values
    // => ["key1:value1", "key2-value2", "key3=value3"]
    console.log(Array.from(matches, m => m[1])); // All match (Group 1) values
    // => ["key1", "key2", "key3"]
Note: see the browser compatibility details.
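Where matchAll is not available, the same result can be approximated with an exec loop. The sketch below uses a hypothetical helper name, allMatches, and assumes the engine at least supports RegExp.prototype.flags (ES2015). It works on a copy of the regex so the caller's lastIndex is left untouched.

```javascript
// Hypothetical helper: returns an array of match arrays, one per match.
function allMatches(str, regex) {
  // Native path: matchAll exists and the regex is global (matchAll throws
  // a TypeError for non-global regexes).
  if (typeof str.matchAll === 'function' && regex.global) {
    return Array.from(str.matchAll(regex));
  }
  // Fallback path: loop with exec on a global copy of the regex.
  var flags = regex.global ? regex.flags : regex.flags + 'g';
  var copy = new RegExp(regex.source, flags);
  var results = [];
  var match;
  while ((match = copy.exec(str)) !== null) {
    if (match[0] === '') {
      copy.lastIndex++; // step past zero-length matches
    }
    results.push(match);
  }
  return results;
}

var pairs = allMatches('key1:value1, key2-value2!!@key3=value3', /(\w+)[:=-](\w+)/g);
console.log(pairs.map(function (m) { return m[2]; })); // ["value1", "value2", "value3"]
```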
Your code works for me (FF3 on Mac), even though I agree with Philo that the regex could probably be:
    /\bformat_(.*?)\b/
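(Of course, I can't be sure, because I don't know the context of the regex.)
You don't need an explicit loop to parse multiple matches: pass a replacement function as the second argument, as in: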
    var str = "Our chief weapon is {1}, {0} and {2}!";
    var params = ['surprise', 'fear', 'ruthless efficiency'];
    var patt = /{([^}]+)}/g;
    str = str.replace(patt, function (m0, m1, position) { return params[parseInt(m1, 10)]; });
    document.write(str);
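For reference, the replacement function receives the whole match first, then one argument per capturing group, then the match offset and the full input string. A small sketch that records these arguments (the input string and variable names are made up for illustration):

```javascript
var groupsSeen = [];
'John Smith, Jane Doe'.replace(/(\w+)\s(\w+)/g, function (whole, first, last, offset) {
  groupsSeen.push({ whole: whole, first: first, last: last, offset: offset });
  return whole; // return the match unchanged; only the callback's side effect matters
});

console.log(groupsSeen[0].first);  // "John"
console.log(groupsSeen[1].last);   // "Doe"
console.log(groupsSeen[1].offset); // 12
```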
    /* Regex function for extracting an object from a "window.location.search" string. */
    var search = "?a=3&b=4&c=7"; // Example search string

    var getSearchObj = function (searchString) {
      var match, obj = {};
      var pattern = /(\w+)=(\w+)/g;
      var query = searchString.slice(1); // Remove '?'
      while ((match = pattern.exec(query)) !== null) {
        obj[match[1]] = match[2]; // group 1 is the key, group 2 is the value
      }
      return obj;
    };

    console.log(getSearchObj(search));
With ES2018, you can now use named groups:
    const url = 'https://stackoverflow.com/questions/432493/how-do-you-access-the-matched-groups-in-a-javascript-regular-expression?some=parameter';
    const regex = /(?<protocol>https?):\/\/(?<hostname>[\w-\.]*)\/(?<pathname>[\w-\./]+)\??(?<querystring>.*?)?$/;
    const { groups: segments } = url.match(regex);
    console.log(segments);
You will get:
    {protocol: "https", hostname: "stackoverflow.com", pathname: "questions/432493/how-do-you-access-the-matched-groups-in-a-javascript-regular-expression", querystring: "some=parameter"}
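Named groups are also exposed on the result of exec, remain reachable by number, and can be referenced in a replacement string as $<name>. A brief sketch with an invented date pattern:

```javascript
const dateRe = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

const dateResult = dateRe.exec('Released on 2019-03-12.');
const { year, month, day } = dateResult.groups;
console.log(year, month, day);       // 2019 03 12
console.log(dateResult[1] === year); // true (numbered access still works)

// $<name> refers to a named group inside a replacement string.
const swapped = '2019-03-12'.replace(dateRe, '$<day>/$<month>/$<year>');
console.log(swapped); // "12/03/2019"
```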