I used to reach for []byte over string in performance-sensitive code, assuming strings were somehow more expensive because “immutability must cost something.” That intuition was wrong. Understanding what a string actually is — a read-only slice header — corrected my assumptions and simplified a lot of code I had unnecessarily complicated.
The Problem
Strings in Go are everywhere, and most developers use them without thinking about what they are at the memory level. This leads to a few common mistakes:
- Unnecessary conversions between string and []byte that cause allocations
- Confusion about why modifying a []byte derived from a string doesn’t affect the string
- Suboptimal string concatenation patterns in loops
- Misunderstanding why comparing strings is cheap even for long strings (sometimes)
The root cause of all these is not knowing that a Go string is literally just two words: a pointer to an immutable byte array and a length. Once you see that, the behavior follows naturally.
The Idiomatic Way
A Go string is represented identically to a slice header, minus the capacity field:
// Conceptually, a string looks like this:
type StringHeader struct {
Data uintptr // pointer to underlying bytes
Len int // number of bytes
}
You can inspect this directly:
package main
import (
"fmt"
"reflect"
"unsafe"
)
func main() {
s := "hello, world"
// A string is always 16 bytes on 64-bit (pointer + length)
fmt.Println("size of string:", unsafe.Sizeof(s)) // 16
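// Access the string header via reflect (reflect.StringHeader is deprecated in
// recent Go releases in favor of unsafe.String and unsafe.StringData, but it
// still shows the layout)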
hdr := (*reflect.StringHeader)(unsafe.Pointer(&s))
fmt.Printf("data ptr: 0x%x\n", hdr.Data)
fmt.Printf("length: %d\n", hdr.Len)
// Substrings share the same backing array — no allocation!
sub := s[7:] // "world"
subHdr := (*reflect.StringHeader)(unsafe.Pointer(&sub))
fmt.Printf("sub data ptr: 0x%x\n", subHdr.Data)
fmt.Printf("original data: 0x%x\n", hdr.Data)
fmt.Printf("offset: %d\n", subHdr.Data-hdr.Data) // 7
// sub.Data == s.Data + 7 — they share the same bytes
}
The critical implication: taking a substring creates a new string header (two words) but does not copy any bytes. The substring’s data pointer is just the original pointer advanced by an offset, so a substring operation costs the same no matter how long the string is.
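The same sharing can be checked without reflect.StringHeader. The sketch below assumes Go 1.20 or later for unsafe.StringData; offsetWithin is a hypothetical helper introduced only for illustration, not part of any library.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// offsetWithin is a hypothetical helper: it reports whether sub's bytes lie
// inside parent's backing bytes and, if so, at which byte offset.
func offsetWithin(parent, sub string) (int, bool) {
	if len(parent) == 0 || len(sub) == 0 {
		return 0, false // empty strings have no meaningful data pointer
	}
	p := uintptr(unsafe.Pointer(unsafe.StringData(parent)))
	q := uintptr(unsafe.Pointer(unsafe.StringData(sub)))
	if q < p || q+uintptr(len(sub)) > p+uintptr(len(parent)) {
		return 0, false
	}
	return int(q - p), true
}

func main() {
	s := "  hello, world  "

	// TrimSpace returns a slice of s, so its bytes sit inside s.
	trimmed := strings.TrimSpace(s)
	off, shared := offsetWithin(s, trimmed)
	fmt.Println("trimmed shares bytes:", shared, "offset:", off)

	// Clone copies the bytes into a fresh allocation.
	clone := strings.Clone(trimmed)
	_, shared = offsetWithin(s, clone)
	fmt.Println("clone shares bytes:", shared)
}
```

The pointers are converted to uintptr only to compare addresses; they are never converted back, which keeps the unsafe usage read-only.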
package main
import (
"fmt"
"strings"
"unicode"
)
// Demonstrating zero-copy substring operations
func trimAndSplit(input string) []string {
// strings.TrimSpace and strings.FieldsFunc return substrings
// sharing the original backing bytes
trimmed := strings.TrimSpace(input)
parts := strings.FieldsFunc(trimmed, unicode.IsSpace)
// Each element of parts is a string header pointing into `input`
// Only the []string slice itself is allocated; no string bytes are copied
return parts
}
// String comparison: compares lengths first, then bytes
// Short-circuits on length mismatch — O(1) for different-length strings
func compareStrings() {
s1 := strings.Repeat("a", 1_000_000)
s2 := strings.Repeat("a", 999_999) // one byte shorter
// This is O(1) — length check short-circuits immediately
fmt.Println(s1 == s2) // false — instant
s3 := s1 + "x"
s4 := s1 + "y"
// These have the same length — comparison must scan all bytes: O(n)
// In practice the runtime uses optimized SIMD comparison routines
fmt.Println(s3 == s4) // false — scanned to last byte
}
func main() {
parts := trimAndSplit(" hello world go ")
fmt.Println(parts)
compareStrings()
}
Now, string-to-[]byte conversion. This is where allocation happens — and where many developers are more cautious than they need to be:
package main
import (
"fmt"
"unsafe"
)
func main() {
s := "immutable string"
// Standard conversion: allocates a new byte slice, copies the data
// Required because []byte is mutable and string is not
b := []byte(s)
b[0] = 'I' // modifying the byte slice
fmt.Println(string(b)) // "Immutable string"
fmt.Println(s) // "immutable string" — unchanged
// The Go compiler avoids the copy in certain contexts.
// For example, `for i, c := range []byte(s)` iterates over the string's
// bytes without allocating a new slice
// Zero-copy read-only access via unsafe (Go 1.20+). Don't do this in
// production unless you're absolutely sure about lifetime and immutability:
bSlice := unsafe.Slice(unsafe.StringData(s), len(s))
fmt.Println(bSlice[0]) // 105 ('i') — reading is fine
// bSlice[0] = 'x' // DO NOT DO: undefined behavior, string is immutable
}
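Read-only access to a string's bytes needs no conversion at all, because indexing and len work directly on the string. A minimal sketch, using the hypothetical helpers countByte and upperASCII: the first only reads, so it never converts; the second has to mutate, so it pays for a copy in each direction.

```go
package main

import "fmt"

// countByte only reads, so it indexes the string directly.
func countByte(s string, c byte) int {
	n := 0
	for i := 0; i < len(s); i++ {
		if s[i] == c {
			n++
		}
	}
	return n
}

// upperASCII must mutate, so it copies into a []byte first.
func upperASCII(s string) string {
	b := []byte(s) // copy: the original string stays untouched
	for i, c := range b {
		if c >= 'a' && c <= 'z' {
			b[i] = c - ('a' - 'A')
		}
	}
	return string(b) // convert back to an immutable string
}

func main() {
	s := "immutable string"
	fmt.Println(countByte(s, 't'))
	fmt.Println(upperASCII(s))
	fmt.Println(s) // the original is unchanged
}
```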
In The Wild
The most common string performance issue I see is naive concatenation in loops:
package main
import (
"fmt"
"strings"
)
// BAD: O(n) allocations and O(n²) bytes copied. Each iteration builds a new, longer string
func buildStringBad(parts []string) string {
result := ""
for _, p := range parts {
result += p + ", " // allocates a new string each time
}
return result
}
// GOOD: strings.Builder — single allocation, amortized growth
func buildStringGood(parts []string) string {
var sb strings.Builder
// Optional but helpful: pre-size if you know the total length
total := 0
for _, p := range parts {
total += len(p) + 2
}
sb.Grow(total)
for i, p := range parts {
sb.WriteString(p)
if i < len(parts)-1 {
sb.WriteString(", ")
}
}
return sb.String()
}
// ALSO GOOD: strings.Join for simple cases
func buildStringJoin(parts []string) string {
return strings.Join(parts, ", ")
}
func main() {
data := make([]string, 1000)
for i := range data {
data[i] = fmt.Sprintf("item%d", i)
}
// Don't run buildStringBad with a large slice in production
r1 := buildStringGood(data)
r2 := buildStringJoin(data)
fmt.Println(len(r1) == len(r2)) // true — same result
}
strings.Builder is backed by a []byte that grows as needed. Its String() method returns the accumulated bytes as a string without copying them: it reinterprets the internal buffer through unsafe (unsafe.String in Go 1.20 and later). This is safe because a Builder only ever appends and never rewrites bytes it has already written.
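To see that pre-sizing really does avoid regrowth, the sketch below records the builder's capacity right after Grow and checks it again after all writes. joinPresized is a hypothetical helper written for this illustration; it relies only on the documented Grow, Cap, WriteString, and String methods.

```go
package main

import (
	"fmt"
	"strings"
)

// joinPresized is a hypothetical helper. It returns the joined string and
// whether the builder's capacity changed after the initial Grow call.
func joinPresized(parts []string, sep string) (string, bool) {
	if len(parts) == 0 {
		return "", false
	}
	// Exact final length: all parts plus one separator between each pair.
	total := len(sep) * (len(parts) - 1)
	for _, p := range parts {
		total += len(p)
	}

	var sb strings.Builder
	sb.Grow(total) // reserve room for every byte we are about to write
	capAfterGrow := sb.Cap()

	for i, p := range parts {
		if i > 0 {
			sb.WriteString(sep)
		}
		sb.WriteString(p)
	}
	return sb.String(), sb.Cap() != capAfterGrow
}

func main() {
	joined, regrew := joinPresized([]string{"a", "bb", "ccc"}, ", ")
	fmt.Println(joined)
	fmt.Println("buffer regrew:", regrew)
}
```

Because the writes fit inside the reserved space, the capacity should stay the same from Grow to String.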
Another real pattern: using strings.Reader to stream a string through an io.Reader without converting it to []byte first:
package main
import (
"fmt"
"io"
"strings"
)
func processFromString(data string) (int, error) {
// strings.NewReader wraps a string as an io.Reader
// No copy: the reader holds the string header directly
r := strings.NewReader(data)
buf := make([]byte, 256)
total := 0
for {
n, err := r.Read(buf)
total += n
if err == io.EOF {
break
}
if err != nil {
return total, err
}
}
return total, nil
}
func main() {
payload := strings.Repeat("Go is great! ", 100)
n, err := processFromString(payload)
fmt.Printf("processed %d bytes, err=%v\n", n, err)
// The string was never duplicated as a whole; each Read copied one chunk into buf
}
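When the destination is an io.Writer, the manual read loop can often be replaced by io.Copy. A hedged sketch, with countBytes as a hypothetical helper: *strings.Reader implements io.WriterTo, and io.Copy is documented to use WriteTo when the source provides it, so no intermediate buffer is needed. io.Discard requires Go 1.16 or later.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// countBytes is a hypothetical helper that drains a string through io.Copy
// and returns how many bytes were written to the destination.
func countBytes(data string) (int64, error) {
	r := strings.NewReader(data)
	// io.Copy detects that r implements io.WriterTo and calls r.WriteTo,
	// skipping the intermediate buffer it would otherwise allocate.
	return io.Copy(io.Discard, r)
}

func main() {
	payload := strings.Repeat("Go is great! ", 100)
	n, err := countBytes(payload)
	fmt.Printf("processed %d bytes, err=%v\n", n, err)
}
```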
The Gotchas
Unicode and byte indexing: Go strings are sequences of bytes, not characters. A single Unicode code point (rune) may be 1–4 bytes in UTF-8. Indexing a string with s[i] gives you a byte, not a character:
package main
import "fmt"
func main() {
s := "Hello, 世界"
fmt.Println(len(s)) // 13 — bytes, not characters
fmt.Println(s[7]) // 228 — first byte of '世' (0xE4), not a character
// To iterate by character (rune), use range:
for i, r := range s {
fmt.Printf("s[%d] = %c (U+%04X)\n", i, r, r)
}
// Note: i jumps by the rune's byte width, not by 1
}
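The byte-versus-rune distinction matters most when truncating text. The sketch below uses a hypothetical helper, firstRunes, that cuts only on rune boundaries by relying on the byte offsets that range yields; slicing by a raw byte count can split a multi-byte character instead.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstRunes is a hypothetical helper returning the prefix of s that holds
// at most n runes, cutting only on rune boundaries.
func firstRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	count := 0
	for i := range s { // i is the byte offset where each rune starts
		if count == n {
			return s[:i] // a substring, so no bytes are copied
		}
		count++
	}
	return s // s has n runes or fewer
}

func main() {
	s := "Hello, 世界"
	fmt.Println(len(s), utf8.RuneCountInString(s)) // bytes vs runes
	fmt.Println(firstRunes(s, 8))                  // keeps whole runes
	fmt.Println(utf8.ValidString(s[:8]))           // a byte cut can break UTF-8
}
```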
The compiler’s string([]byte) optimization: the Go compiler avoids allocation in specific narrow cases when converting []byte to string — notably in map lookups and string comparisons. You don’t need to cache these conversions manually:
package main
import "fmt"
func main() {
m := map[string]int{"hello": 1, "world": 2}
key := []byte("hello")
// This does NOT allocate a new string in modern Go:
// the compiler uses the byte slice bytes directly for the map lookup
val := m[string(key)]
fmt.Println(val) // 1
// Similarly for switch statements with string(b)
switch string(key) {
case "hello":
fmt.Println("matched")
}
}
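A typical place this pays off is tokenizing a byte buffer and checking each token against a set. The sketch below assumes the current behavior of the standard gc compiler; countKnown is a hypothetical helper. The point is to write string(tok) directly inside the index expression rather than storing it in a variable first, since a stored string has to be materialized.

```go
package main

import (
	"bytes"
	"fmt"
)

// countKnown is a hypothetical helper: it splits line into whitespace-separated
// tokens and counts how many of them appear in the known set.
func countKnown(line []byte, known map[string]struct{}) int {
	n := 0
	for _, tok := range bytes.Fields(line) {
		// string(tok) appears directly in the map index expression, which is
		// the form the compiler can evaluate without allocating a new string.
		if _, ok := known[string(tok)]; ok {
			n++
		}
	}
	return n
}

func main() {
	known := map[string]struct{}{"get": {}, "put": {}}
	fmt.Println(countKnown([]byte("get put frob get"), known))
}
```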
String interning: the Go toolchain typically deduplicates compile-time string constants, so two string literals with identical content usually share the same backing bytes. This is an implementation detail, not a language guarantee. Comparing data pointers (for example via unsafe.StringData) can sometimes tell you whether two strings are “the same allocation,” but you should never rely on this for correctness. Use == for equality, which compares bytes.
Large string substrings and GC: because a substring keeps the entire original backing array alive (it holds a pointer into it), a small substring of a very large string can prevent the large string from being collected. If you extract a small piece of a large string and need to hold it long-term, make an independent copy with strings.Clone (Go 1.18+) or, on older versions, with string([]byte(...)):
package main
import "fmt"
func extractSmallPiece(huge string) string {
// This keeps the entire huge string alive in memory!
// return huge[1000:1010]
// This creates an independent copy — the huge string can be GC'd
piece := huge[1000:1010]
independent := string([]byte(piece)) // force copy
return independent
}
func main() {
large := make([]byte, 1_000_000)
for i := range large {
large[i] = byte('a' + i%26)
}
s := string(large)
piece := extractSmallPiece(s)
fmt.Println(piece) // 10 bytes, independently allocated
// s and large are now eligible for GC
}
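On Go 1.18 and later, strings.Clone expresses the same intent more directly, and its documentation guarantees a copy into a new allocation. A minimal sketch: extractCopy and sameBacking are hypothetical names, and the pointer comparison (unsafe.StringData, Go 1.20+) is there only to make the independence visible.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// extractCopy is a hypothetical variant of extractSmallPiece that uses
// strings.Clone to detach the piece from the large backing array.
func extractCopy(huge string, start, end int) string {
	return strings.Clone(huge[start:end])
}

// sameBacking reports whether a and b start at the same address.
// It is for demonstration only; never use it to test equality.
func sameBacking(a, b string) bool {
	return unsafe.StringData(a) == unsafe.StringData(b)
}

func main() {
	huge := strings.Repeat("abcdefghij", 100_000)
	view := huge[1000:1010]                // pins all of huge
	piece := extractCopy(huge, 1000, 1010) // independent 10-byte copy

	fmt.Println(piece == view)            // same content
	fmt.Println(sameBacking(piece, view)) // different storage
}
```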
Key Takeaway
A Go string is a read-only two-word value: a pointer to bytes and a length. This design means:
- Substrings are free — they’re just adjusted headers pointing into the same bytes
- String copying is cheap — only 16 bytes, regardless of content
- string to []byte conversion allocates and copies (required for mutability)
- []byte to string conversion also normally allocates, but the compiler optimizes specific patterns (map lookups, comparisons) to avoid it
- Long-lived substrings of large strings prevent GC of the backing bytes, so copy them if needed
The practical rules:
- Use strings.Builder for concatenation in loops, not +=
- Use strings.Reader to wrap strings as io.Reader without copying
- Use range loops (not index access) when you care about Unicode characters
- Be aware that a small substring can pin a large backing array — copy when holding long-term
Series: Go Memory Model & Internals
🎓 Course Complete! You’ve finished the Go Memory Model & Internals series. From interface two-word representation to string headers, you now have a mental model of how Go manages memory at the runtime level. These foundations will inform every performance-sensitive design decision you make going forward.