How to efficiently concatenate strings in Go?
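In Go, strings are immutable, so every operation on a string creates a new string.
So if I want to concatenate strings many times without knowing the length of the resulting string, what is the best way to do it?
The naive way would be:

```go
s := ""
for i := 0; i < 1000; i++ {
	s += getShortStringFromSomewhere()
}
return s
```

But that does not seem very efficient.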
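A rough way to see the cost: since each += generally builds a new string, every step copies everything accumulated so far. The sketch below models that under the assumption of n pieces of k bytes each and no compiler optimization; bytesCopiedNaive is a made-up helper for illustration, not a library function.

```go
package main

import "fmt"

// bytesCopiedNaive models s += piece repeated n times with k-byte pieces,
// assuming every step writes the whole new string. Hypothetical helper.
func bytesCopiedNaive(n, k int) int {
	total := 0
	length := 0
	for i := 0; i < n; i++ {
		length += k     // length of the string after this step
		total += length // every byte of the new string is written once
	}
	return total
}

func main() {
	// 1000 one-byte pieces: 1 + 2 + ... + 1000 = n(n+1)/2 bytes written.
	fmt.Println(bytesCopiedNaive(1000, 1)) // 500500
}
```

Under this model the work grows quadratically with the number of pieces, while a growing buffer keeps it roughly linear.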
Note added in 2018
As of Go 1.10 there is a strings.Builder type, described further down.
The best way is to use the bytes package, which has a Buffer type:

```go
package main

import (
	"bytes"
	"fmt"
)

func main() {
	var buffer bytes.Buffer

	for i := 0; i < 1000; i++ {
		buffer.WriteString("a")
	}

	fmt.Println(buffer.String())
}
```

This runs in O(n) time.
The most efficient way to concatenate strings is to use the built-in function copy.
I wrote a benchmark to demonstrate this; the results are:

```
BenchmarkConcat    1000000     64497 ns/op   502018 B/op   0 allocs/op
BenchmarkBuffer  100000000      15.5 ns/op        2 B/op   0 allocs/op
BenchmarkCopy    500000000      5.39 ns/op        0 B/op   0 allocs/op
```

Here is the test code:

```go
package main

import (
	"bytes"
	"strings"
	"testing"
)

func BenchmarkConcat(b *testing.B) {
	var str string
	for n := 0; n < b.N; n++ {
		str += "x"
	}
	b.StopTimer()

	if s := strings.Repeat("x", b.N); str != s {
		b.Errorf("unexpected result; got=%s, want=%s", str, s)
	}
}

func BenchmarkBuffer(b *testing.B) {
	var buffer bytes.Buffer
	for n := 0; n < b.N; n++ {
		buffer.WriteString("x")
	}
	b.StopTimer()

	if s := strings.Repeat("x", b.N); buffer.String() != s {
		b.Errorf("unexpected result; got=%s, want=%s", buffer.String(), s)
	}
}

func BenchmarkCopy(b *testing.B) {
	bs := make([]byte, b.N)
	bl := 0

	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		bl += copy(bs[bl:], "x")
	}
	b.StopTimer()

	if s := strings.Repeat("x", b.N); string(bs) != s {
		b.Errorf("unexpected result; got=%s, want=%s", string(bs), s)
	}
}

// Go 1.10
func BenchmarkStringBuilder(b *testing.B) {
	var strBuilder strings.Builder

	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		strBuilder.WriteString("x")
	}
	b.StopTimer()

	if s := strings.Repeat("x", b.N); strBuilder.String() != s {
		b.Errorf("unexpected result; got=%s, want=%s", strBuilder.String(), s)
	}
}
```
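The strings package has a library function named Join.
See the documentation for strings.Join.
Usage:

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	s := []string{"this", "is", "a", "joined", "string"}
	// The second argument is the separator placed between the elements.
	fmt.Println(strings.Join(s, " "))
}
```

```
$ ./test.bin
this is a joined string
```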
As of Go 1.10 there is a strings.Builder. From its documentation:
A Builder is used to efficiently build a string using Write methods. It minimizes memory copying. The zero value is ready to use.
Usage:
It is used almost the same way as bytes.Buffer:

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	var str strings.Builder

	for i := 0; i < 1000; i++ {
		str.WriteString("a")
	}

	fmt.Println(str.String())
}
```
Methods and interfaces supported by strings.Builder:
Its methods are implemented with the existing interfaces in mind, so you can switch to the new Builder easily in your code.
- Grow(int) -> bytes.Buffer#Grow
- Len() int -> bytes.Buffer#Len
- Reset() -> bytes.Buffer#Reset
- String() string -> fmt.Stringer
- Write([]byte) (int, error) -> io.Writer
- WriteByte(byte) error -> io.ByteWriter
- WriteRune(rune) (int, error) -> bufio.Writer#WriteRune, bytes.Buffer#WriteRune
- WriteString(string) (int, error) -> io.StringWriter
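Because a *strings.Builder satisfies io.Writer, it can be handed to anything that writes to a writer, such as fmt.Fprintf. A small sketch of that, plus Grow, Len and Reset; the helper name describe and its inputs are made up for illustration, and it assumes values is at least as long as names.

```go
package main

import (
	"fmt"
	"strings"
)

// describe is a hypothetical helper that formats name/value pairs into one
// string using a single Builder. It assumes len(values) >= len(names).
func describe(names []string, values []int) string {
	var sb strings.Builder
	sb.Grow(16 * len(names)) // optional capacity hint; 16 is a rough guess
	for i, name := range names {
		if i > 0 {
			sb.WriteByte(',')
		}
		// *strings.Builder implements io.Writer, so Fprintf can write into it.
		fmt.Fprintf(&sb, "%s=%d", name, values[i])
	}
	return sb.String()
}

func main() {
	fmt.Println(describe([]string{"a", "b"}, []int{1, 2})) // a=1,b=2

	var sb strings.Builder
	sb.WriteString("hello")
	fmt.Println(sb.Len()) // 5
	sb.Reset()
	fmt.Println(sb.Len()) // 0
}
```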
Zero value usage:

```go
var buf strings.Builder
```
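Differences from bytes.Buffer:
It can only grow or reset.
In bytes.Buffer, the underlying bytes can escape through (*Buffer).Bytes(); strings.Builder prevents this. Sometimes that escape is not a problem but the desired behavior (for example, peeking at the bytes when they are passed to an io.Reader).
It also has a built-in copy check that guards against accidental copying (func (b *Builder) copyCheck() { ... }).
Its source code is in src/strings/builder.go in the Go repository.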
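The copy check can be observed directly. In this sketch (writeToCopy is an illustrative name), a Builder that already holds data is copied by value, and the first write to the copy panics; the function recovers and reports that.

```go
package main

import (
	"fmt"
	"strings"
)

// writeToCopy writes to a Builder, copies it by value, then writes to the
// copy. It reports whether that second write panicked.
func writeToCopy() (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()

	var a strings.Builder
	a.WriteString("x")
	b := a // by-value copy of a Builder that has already been written to
	b.WriteString("y")
	return false
}

func main() {
	fmt.Println(writeToCopy()) // true
}
```

Passing a *strings.Builder instead of the value avoids the problem.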
I just benchmarked the top answer posted above in my own code (a recursive tree walk), and the simple concat operator is actually faster than the bytes.Buffer version:

```go
// record is the poster's tree node type, defined elsewhere.
func (r *record) String() string {
	buffer := bytes.NewBufferString("")
	fmt.Fprint(buffer, "(", r.name, "[")
	for i := 0; i < len(r.subs); i++ {
		fmt.Fprint(buffer, "\t", r.subs[i])
	}
	fmt.Fprint(buffer, "]", r.size, ") ")
	return buffer.String()
}
```

This took 0.81 seconds, whereas the following code:

```go
func (r *record) String() string {
	s := "(\"" + r.name + "\" ["
	for i := 0; i < len(r.subs); i++ {
		s += r.subs[i].String()
	}
	s += "]" + strconv.FormatInt(r.size, 10) + ") "
	return s
}
```

took only 0.61 seconds. This is probably due to the overhead of creating the new buffer.
Update: I also benchmarked a version using strings.Join:

```go
func (r *record) String() string {
	var parts []string
	parts = append(parts, "(\"", r.name, "\" [")
	for i := 0; i < len(r.subs); i++ {
		parts = append(parts, r.subs[i].String())
	}
	parts = append(parts, strconv.FormatInt(r.size, 10), ") ")
	return strings.Join(parts, "")
}
```
You could create a big slice of bytes and copy the bytes of the short strings into it using string slices. "Effective Go" gives a function for this:

```go
func Append(slice, data []byte) []byte {
	l := len(slice)
	if l+len(data) > cap(slice) { // reallocate
		// Allocate double what's needed, for future growth.
		newSlice := make([]byte, (l+len(data))*2)
		// Copy data (could use the built-in copy).
		for i, c := range slice {
			newSlice[i] = c
		}
		slice = newSlice
	}
	slice = slice[0 : l+len(data)]
	for i, c := range data {
		slice[l+i] = c
	}
	return slice
}
```

Then, when the operations are finished, use string() on the big slice of bytes to convert it back into a string.
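The benchmark code in @cd1's answer and in other answers is wrong.
A benchmark function should run the same test b.N times. In those benchmarks every iteration of the loop is a different test, because the string keeps growing.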
```go
package main

import (
	"bytes"
	"strings"
	"testing"
)

const (
	sss = "xfoasneobfasieongasbg"
	cnt = 10000
)

var (
	bbb      = []byte(sss)
	expected = strings.Repeat(sss, cnt)
)

func BenchmarkCopyPreAllocate(b *testing.B) {
	var result string
	for n := 0; n < b.N; n++ {
		bs := make([]byte, cnt*len(sss))
		bl := 0
		for i := 0; i < cnt; i++ {
			bl += copy(bs[bl:], sss)
		}
		result = string(bs)
	}
	b.StopTimer()
	if result != expected {
		b.Errorf("unexpected result; got=%s, want=%s", result, expected)
	}
}

func BenchmarkAppendPreAllocate(b *testing.B) {
	var result string
	for n := 0; n < b.N; n++ {
		data := make([]byte, 0, cnt*len(sss))
		for i := 0; i < cnt; i++ {
			data = append(data, sss...)
		}
		result = string(data)
	}
	b.StopTimer()
	if result != expected {
		b.Errorf("unexpected result; got=%s, want=%s", result, expected)
	}
}

func BenchmarkBufferPreAllocate(b *testing.B) {
	var result string
	for n := 0; n < b.N; n++ {
		buf := bytes.NewBuffer(make([]byte, 0, cnt*len(sss)))
		for i := 0; i < cnt; i++ {
			buf.WriteString(sss)
		}
		result = buf.String()
	}
	b.StopTimer()
	if result != expected {
		b.Errorf("unexpected result; got=%s, want=%s", result, expected)
	}
}

func BenchmarkCopy(b *testing.B) {
	var result string
	for n := 0; n < b.N; n++ {
		data := make([]byte, 0, 64) // same size as bootstrap array of bytes.Buffer
		for i := 0; i < cnt; i++ {
			off := len(data)
			if off+len(sss) > cap(data) {
				temp := make([]byte, 2*cap(data)+len(sss))
				copy(temp, data)
				data = temp
			}
			data = data[0 : off+len(sss)]
			copy(data[off:], sss)
		}
		result = string(data)
	}
	b.StopTimer()
	if result != expected {
		b.Errorf("unexpected result; got=%s, want=%s", result, expected)
	}
}

func BenchmarkAppend(b *testing.B) {
	var result string
	for n := 0; n < b.N; n++ {
		data := make([]byte, 0, 64)
		for i := 0; i < cnt; i++ {
			data = append(data, sss...)
		}
		result = string(data)
	}
	b.StopTimer()
	if result != expected {
		b.Errorf("unexpected result; got=%s, want=%s", result, expected)
	}
}

func BenchmarkBufferWrite(b *testing.B) {
	var result string
	for n := 0; n < b.N; n++ {
		var buf bytes.Buffer
		for i := 0; i < cnt; i++ {
			buf.Write(bbb)
		}
		result = buf.String()
	}
	b.StopTimer()
	if result != expected {
		b.Errorf("unexpected result; got=%s, want=%s", result, expected)
	}
}

func BenchmarkBufferWriteString(b *testing.B) {
	var result string
	for n := 0; n < b.N; n++ {
		var buf bytes.Buffer
		for i := 0; i < cnt; i++ {
			buf.WriteString(sss)
		}
		result = buf.String()
	}
	b.StopTimer()
	if result != expected {
		b.Errorf("unexpected result; got=%s, want=%s", result, expected)
	}
}

func BenchmarkConcat(b *testing.B) {
	var result string
	for n := 0; n < b.N; n++ {
		var str string
		for i := 0; i < cnt; i++ {
			str += sss
		}
		result = str
	}
	b.StopTimer()
	if result != expected {
		b.Errorf("unexpected result; got=%s, want=%s", result, expected)
	}
}
```
Environment: OS X 10.11.6, 2.2 GHz Intel Core i7.
Test results:

```
BenchmarkCopyPreAllocate-8      20000      84208 ns/op      425984 B/op      2 allocs/op
BenchmarkAppendPreAllocate-8    10000     102859 ns/op      425984 B/op      2 allocs/op
BenchmarkBufferPreAllocate-8    10000     166407 ns/op      426096 B/op      3 allocs/op
BenchmarkCopy-8                 10000     160923 ns/op      933152 B/op     13 allocs/op
BenchmarkAppend-8               10000     175508 ns/op     1332096 B/op     24 allocs/op
BenchmarkBufferWrite-8          10000     239886 ns/op      933266 B/op     14 allocs/op
BenchmarkBufferWriteString-8    10000     236432 ns/op      933266 B/op     14 allocs/op
BenchmarkConcat-8                  10  105603419 ns/op  1086685168 B/op  10000 allocs/op
```
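Conclusion: in the results above, CopyPreAllocate is the fastest and AppendPreAllocate is close behind, while Concat is by far the slowest in both time and memory.
Suggestion: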
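This is the fastest solution that does not require you to know or calculate the overall buffer size first:

```go
var data []byte
for i := 0; i < 1000; i++ {
	data = append(data, getShortStringFromSomewhere()...)
}
return string(data)
```

By my benchmark it is 20% slower than the copy solution (8.1 ns per append rather than 6.72 ns), but still 55% faster than using bytes.Buffer.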
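When the pieces are already collected in a slice, their total length is cheap to compute, which allows the kind of pre-allocation the PreAllocate benchmarks above rely on. A minimal sketch using strings.Builder (Go 1.10+) rather than a raw byte slice; concatAll is an illustrative name, and for this exact case strings.Join(parts, "") does much the same job.

```go
package main

import (
	"fmt"
	"strings"
)

// concatAll is a hypothetical helper: it sums the lengths first so the
// Builder can reserve its buffer once, then writes every piece.
func concatAll(parts []string) string {
	n := 0
	for _, p := range parts {
		n += len(p)
	}

	var sb strings.Builder
	sb.Grow(n) // reserve room for all pieces up front
	for _, p := range parts {
		sb.WriteString(p)
	}
	return sb.String()
}

func main() {
	fmt.Println(concatAll([]string{"foo", "bar", "baz"})) // foobarbaz
}
```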
```go
package main

import (
	"fmt"
)

func main() {
	var str1 = "string1"
	var str2 = "string2"
	out := fmt.Sprintf("%s %s", str1, str2)
	fmt.Println(out)
}
```
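My original suggestion was:

```go
s12 := fmt.Sprint(s1, s2)
```

But the answer above, using bytes.Buffer with WriteString(), is the most efficient way.
My original suggestion relies on reflection and a type switch; see the implementation of fmt.Sprint.
But Sprint() at least uses a byte buffer internally, so

```
s12 := fmt.Sprint(s1, s2, s3, s4, ..., s1000)
```

is acceptable in terms of memory allocations.
=> Sprint() concatenation can be used for quick debug output.
=> Otherwise use bytes.Buffer with WriteString.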
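Expanding on cd1's answer: you can use append() instead of copy(). append() over-allocates in progressively larger steps, which costs a little more memory but saves time. I added two more benchmarks at the top of yours. Run locally with:

```
go test -bench=. -benchtime=100ms
```

On my ThinkPad T400s it yields:

```
BenchmarkAppendEmpty     50000000   5.0 ns/op
BenchmarkAppendPrealloc  50000000   3.5 ns/op
BenchmarkCopy            20000000  10.2 ns/op
```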
This is a version of @cd1's benchmark in which every iteration does the same amount of work:

```go
package performance_test

import (
	"bytes"
	"testing"
)

const (
	concatSteps = 100
)

func BenchmarkConcat(b *testing.B) {
	for n := 0; n < b.N; n++ {
		var str string
		for i := 0; i < concatSteps; i++ {
			str += "x"
		}
	}
}

func BenchmarkBuffer(b *testing.B) {
	for n := 0; n < b.N; n++ {
		var buffer bytes.Buffer
		for i := 0; i < concatSteps; i++ {
			buffer.WriteString("x")
		}
	}
}
```

Timings:

```
BenchmarkConcat-4   300000  6869 ns/op
BenchmarkBuffer-4  1000000  1186 ns/op
```
JoinBetween, built on strings.Builder:

```go
import "strings"

// JoinBetween joins in[startIndex:endIndex] with separator between elements.
func JoinBetween(in []string, separator string, startIndex, endIndex int) string {
	if in == nil {
		return ""
	}

	noOfItems := endIndex - startIndex

	if noOfItems <= 0 {
		return ""
	}

	var builder strings.Builder

	for i := startIndex; i < endIndex; i++ {
		if i > startIndex {
			builder.WriteString(separator)
		}
		builder.WriteString(in[i])
	}
	return builder.String()
}
```
I do it the following way:

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	// The second argument is the separator; empty here for plain concatenation.
	concatenation := strings.Join([]string{"a", "b", "c"}, "")
	fmt.Println(concatenation)
}
```

For those coming from the Java world, where StringBuilder is used for efficient string concatenation, recent Go versions seem to have an equivalent called Builder: https://github.com/golang/go/blob/master/src/strings/builder.go

```go
package main

import (
	"fmt"
)

func main() {
	var str1 = "string1"
	var str2 = "string2"
	result := make([]byte, 0)
	// A string can be appended to a []byte directly with the ... form.
	result = append(result, str1...)
	result = append(result, str2...)
	result = append(result, str1...)
	result = append(result, str2...)

	fmt.Println(string(result))
}
```
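Take a look at Go's strconv package. It provides several AppendXxx functions (such as AppendInt, AppendBool and AppendQuote) that append the text form of other data types to a byte slice, which lets us concatenate strings with strings and with other types.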
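A short sketch of that approach, mixing the built-in append with strconv.AppendInt and strconv.AppendBool. The helper name formatEntry and the output layout are made up for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatEntry is a hypothetical helper that builds text such as
// "size=42 ok=true" in one byte slice, without intermediate strings.
func formatEntry(name string, count int64, ok bool) string {
	buf := make([]byte, 0, 32) // initial capacity is a rough guess
	buf = append(buf, name...)
	buf = append(buf, '=')
	buf = strconv.AppendInt(buf, count, 10) // decimal text of count
	buf = append(buf, " ok="...)
	buf = strconv.AppendBool(buf, ok) // "true" or "false"
	return string(buf)
}

func main() {
	fmt.Println(formatEntry("size", 42, true)) // size=42 ok=true
}
```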
Benchmark results with memory allocation statistics. Check GitHub for the benchmark code.
Use strings.Builder to optimize performance.

```
go test -bench . -benchmem
goos: darwin
goarch: amd64
pkg: github.com/hechen0/goexp/exps
BenchmarkConcat-8             1000000   60213 ns/op   503992 B/op   1 allocs/op
BenchmarkBuffer-8           100000000    11.3 ns/op        2 B/op   0 allocs/op
BenchmarkCopy-8             300000000    4.76 ns/op        0 B/op   0 allocs/op
BenchmarkStringBuilder-8   1000000000    4.14 ns/op        6 B/op   0 allocs/op
PASS
ok      github.com/hechen0/goexp/exps   70.071s
```
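```go
s := fmt.Sprintf("%s%s", []byte(s1), []byte(s2))
```

Use strings.Join() from the strings package.
If the types do not match (for example, when joining an int and a string), convert the value to a string first, e.g. with strconv.Itoa for an int.
Example:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

var intEX = 0
var stringEX = "hello all you"
var stringEX2 = " people in here"

func main() {
	// strings.Join takes a slice of strings and a separator.
	// strconv.Itoa yields the decimal text of an int; string(intEX) would
	// instead produce the character with that code point.
	fmt.Println(strings.Join([]string{stringEX, strconv.Itoa(intEX), stringEX2}, ""))
}
```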