JavaScript Split Array
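I'm trying to write a custom string-splitting function, and it's harder than I expected.
Basically, I pass in a string and an array of values to split the string on, and it returns an array of substrings, with empty strings removed and the split values included. If the string can be split at the same position by two different values, the longer value takes precedence.
That is,

    split("Go ye away, I want some peace && quiet. & Thanks.", ["Go", ",", "&&", "&", "."]);

should return

    ["Go", " ye away", ",", " I want some peace ", "&&", " quiet", ".", " ", "&", " Thanks", "."]

Can you think of a reasonably simple algorithm? If there is a built-in way to do this in JavaScript (I don't think there is), even better.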
Something like this?

    function mySplit(input, delimiters) {
        // Sort a copy of the delimiters, longest first, so "&&" wins over "&"
        var sorted = delimiters.slice().sort(function(a, b) {
            return b.length - a.length;
        });

        var result = [];
        var i = 0;

        // Examine input one position at a time
        while (i < input.length) {
            var matched = false;
            for (var j = 0; j < sorted.length; j++) {
                var d = sorted[j];
                if (d.length > 0 && input.substr(i, d.length) === d) {
                    // Add the chunk before the delimiter, skipping empty chunks
                    if (i > 0) {
                        result.push(input.substr(0, i));
                    }
                    result.push(d);

                    // Drop the consumed prefix and restart the scan
                    input = input.substr(i + d.length);
                    i = 0;
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                i++;
            }
        }

        // Keep whatever is left after the last delimiter
        if (input.length > 0) {
            result.push(input);
        }
        return result;
    }

    var input = "Go ye away, I want some peace && quiet. & Thanks.";
    var delimiters = ["Go", ",", "&&", "&", "."];

    console.log(mySplit(input, delimiters));
    // Output: ["Go", " ye away", ",", " I want some peace ",
    //          "&&", " quiet", ".", " ", "&", " Thanks", "."]
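On the question of a built-in: `String.prototype.split` accepts a regular expression, and when that expression contains a capturing group the matched delimiters are kept in the result. The following is a minimal sketch of that idea, not part of the answers here. The names `escapeRegExp` and `regexSplit` are hypothetical, and it assumes every delimiter is a plain literal string.

```javascript
// Hypothetical helper: escape regex metacharacters so a delimiter such as "."
// is matched literally.
function escapeRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical name. Builds one alternation, longest delimiter first, so that
// "&&" is tried before "&" at any given position.
function regexSplit(input, delimiters) {
    var alternatives = delimiters
        .filter(function (d) { return d.length > 0; })
        .sort(function (a, b) { return b.length - a.length; })
        .map(escapeRegExp);

    if (alternatives.length === 0) {
        return input ? [input] : [];
    }

    // The capturing group makes split() keep the delimiters in the output.
    var pattern = new RegExp("(" + alternatives.join("|") + ")");
    return input.split(pattern).filter(Boolean);
}

console.log(regexSplit(
    "Go ye away, I want some peace && quiet. & Thanks.",
    ["Go", ",", "&&", "&", "."]
));
// ["Go", " ye away", ",", " I want some peace ", "&&", " quiet", ".", " ", "&", " Thanks", "."]
```

Because the alternation is tried left to right, sorting longest first is what gives the longer delimiter precedence.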
The exact solution asked for:

    // Relies on the Array.prototype helpers (sorted, intersperse, flatten)
    // defined later in this answer.
    function megasplit(toSplit, splitters) {
        // Sort by length, longest first. Kept here for readability; it would be
        // trivial to move the rest of the function into a helper.
        var sortedSplitters = splitters.sorted(function(a, b) { return b.length - a.length; });

        if (!sortedSplitters.length)
            return toSplit;
        else {
            var token = sortedSplitters[0];
            return toSplit
                .split(token)                     // split on token
                .map(function(segment) {          // recurse on segments
                    return megasplit(segment, sortedSplitters.slice(1));
                })
                .intersperse(token)               // re-insert token
                .flatten()                        // rejoin segments
                .filter(Boolean);                 // drop empty strings
        }
    }
Demo:

    > megasplit(
          "Go ye away, I want some peace && quiet. & Thanks.",
          ["Go", ",", "&&", "&", "."]
      )
    ["Go", " ye away", ",", " I want some peace ", "&&", " quiet", ".", " ", "&", " Thanks", "."]
Machinery (reusable!):

    // Shallow copy of the array
    Array.prototype.copy = function() {
        return this.slice();
    };

    // Non-mutating sort: sorts a copy and returns it
    Array.prototype.sorted = function() {
        var copy = this.copy();
        copy.sort.apply(copy, arguments);
        return copy;
    };

    // Flatten one level: [[1,2],[3]] -> [1,2,3]
    Array.prototype.flatten = function() {
        return [].concat.apply([], this);
    };

    // map, then flatten the result one level
    Array.prototype.mapFlatten = function() {
        return this.map.apply(this, arguments).flatten();
    };

    // [1,2,3].intersperse('x') -> [1,'x',2,'x',3]
    Array.prototype.intersperse = function(token) {
        return this.mapFlatten(function(x) { return [token, x]; }).slice(1);
    };
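Notes:
- This took a lot of research to do elegantly:
  - (Deep) copying an array using jQuery
  - What is the most efficient way to concatenate N arrays in JavaScript? (created my own, less ugly way)
  - How can I split text on commas not within double quotes, while keeping the quotes? (garbage answers; again created my own way)
- It is further complicated by the fact that the specification requires that tokens (although they are kept in the output) must not themselves be split (otherwise you would get "&","&"). This rules out simply splitting on each token in turn over the accumulated pieces, and calls for recursion.
- Personally, I would also not drop the empty strings produced by splitting. I can understand not wanting to recursively split the tokens, but I would simplify the function so that the output behaves like a normal .split, i.e. ["", "Go", " ye away", ",", " I want some peace ", "&&", " quiet", ".", " ", "&", " Thanks", ".", ""].
- I should point out that if you are willing to relax your requirements a little, this goes from a 15/20-liner to a 1/3-liner: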
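The point about tokens being split can be seen in a small sketch. It is illustrative only and not from the original answer: `naiveSplit` is a hypothetical name for a version that processes the splitters one after another and re-inserts each token as it goes. Because the re-inserted "&&" is still in the list when "&" is processed, it gets broken apart.

```javascript
// Hypothetical name. Splits on each token in turn (longest first), keeping the
// tokens in the output and dropping empty segments.
function naiveSplit(toSplit, splitters) {
    var sorted = splitters.slice().sort(function (a, b) {
        return b.length - a.length;
    });

    return sorted.reduce(function (pieces, token) {
        var next = [];
        pieces.forEach(function (piece) {
            piece.split(token).forEach(function (segment, index) {
                if (index > 0) next.push(token); // re-insert the token between segments
                if (segment) next.push(segment); // skip empty segments
            });
        });
        return next;
    }, [toSplit]);
}

console.log(naiveSplit("peace && quiet", ["&&", "&"]));
// ["peace ", "&", "&", " quiet"]
```

The recursive version avoids this because a token is only re-inserted after all shorter tokens have already been applied to the segments around it.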
A 1-liner, if you follow the canonical split behavior:

    Array.prototype.mapFlatten = function() { ... }

    function megasplit(toSplit, splitters) {
        return splitters.sorted(...).reduce(function(strings, token) {
            return strings.mapFlatten(function(s) { return s.split(token); });
        }, [toSplit]);
    }
A 3-liner, if the above is hard to read:

    Array.prototype.mapFlatten = function() { ... }

    function megasplit(toSplit, splitters) {
        var strings = [toSplit];
        splitters.sorted(...).forEach(function(token) {
            strings = strings.mapFlatten(function(s) { return s.split(token); });
        });
        return strings;
    }
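For trying the relaxed behavior without the prototype helpers, here is a self-contained sketch. It makes two assumptions: the elided sort comparator is taken to be "longest first", as in the full version, and the `mapFlatten` step is written inline. The name `relaxedSplit` is hypothetical. As with a normal `.split`, the tokens are dropped and empty strings are kept.

```javascript
// Hypothetical name. Splits on every token in turn; the tokens themselves do
// not appear in the output.
function relaxedSplit(toSplit, splitters) {
    // Assumed ordering: longest token first.
    var sorted = splitters.slice().sort(function (a, b) {
        return b.length - a.length;
    });

    return sorted.reduce(function (strings, token) {
        var pieces = strings.map(function (s) { return s.split(token); });
        return [].concat.apply([], pieces); // flatten one level
    }, [toSplit]);
}

console.log(relaxedSplit("a, b && c", [",", "&&", "&"]));
// ["a", " b ", " c"]
```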