
Atharva Pandey/Lesson 7: go:generate and Code Generation — Let the machine write the boring code

Created Tue, 10 Jun 2025 00:00:00 +0000 Modified Tue, 10 Jun 2025 00:00:00 +0000

go:generate is Go’s mechanism for attaching arbitrary code generation commands to your source files. It is not magic: it is a convention that tells go generate which commands to run and where. When you run go generate ./..., the tool scans the Go files in the matched packages for //go:generate comments and runs each command in the order it appears, with the working directory set to the directory of the file that contains the directive. What those commands produce is up to you: String() methods for enum types, serialization code, mock implementations, database query functions, or anything else you can express as a code generator.
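To make the "it is only a convention" point concrete, here is a minimal sketch of the scanning step. findDirectives is a hypothetical helper written for this lesson, not part of the Go toolchain, and it deliberately skips what the real tool also does (expanding variables such as $GOFILE and $GOPACKAGE, handling quoted arguments, and actually running the commands).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// findDirectives returns the command text of every //go:generate line in src.
// Hypothetical helper for illustration only; it does not execute anything.
func findDirectives(src string) []string {
	const prefix = "//go:generate "
	var cmds []string
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := sc.Text()
		// A directive must start at column one, with no space after "//".
		if strings.HasPrefix(line, prefix) {
			cmds = append(cmds, strings.TrimSpace(strings.TrimPrefix(line, prefix)))
		}
	}
	return cmds
}

func main() {
	src := "package model\n\n" +
		"//go:generate stringer -type=Status\n\n" +
		"// go:generate this line is ignored because of the space\n" +
		"type Status int\n"
	for _, cmd := range findDirectives(src) {
		fmt.Println(cmd)
	}
}
```

Only the first comment is reported as a directive. The second is an ordinary comment, which is a common reason a directive silently never runs.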

Code generation occupies a specific niche in the Go ecosystem: it produces code that would be tedious to write by hand, brittle to maintain manually, and that benefits from being regular and consistent. The best use of go:generate is when you find yourself writing the same structural pattern over and over — and the structure is determined by a type definition you already have.

The Problem

The classic motivation for code generation is String() methods on integer constants that represent enumerated values:

// WRONG — hand-written String() that needs manual updates
type Status int

const (
    StatusPending Status = iota
    StatusActive
    StatusSuspended
    StatusDeleted
)

func (s Status) String() string {
    switch s {
    case StatusPending:
        return "pending"
    case StatusActive:
        return "active"
    case StatusSuspended:
        return "suspended"
    case StatusDeleted:
        return "deleted"
    }
    return fmt.Sprintf("Status(%d)", s)
}

This compiles and works. But every time you add a new status constant, you must remember to add a case to String(). If you forget, the method silently returns "Status(4)" instead of the name — a bug that is very easy to miss in review.
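The failure mode is easy to reproduce. In the sketch below, StatusArchived is a hypothetical fifth constant added for illustration, and String() is left exactly as it was:

```go
package main

import "fmt"

type Status int

const (
	StatusPending Status = iota
	StatusActive
	StatusSuspended
	StatusDeleted
	StatusArchived // hypothetical new constant; String() below was not updated
)

// String is the hand-written method from above, unchanged.
func (s Status) String() string {
	switch s {
	case StatusPending:
		return "pending"
	case StatusActive:
		return "active"
	case StatusSuspended:
		return "suspended"
	case StatusDeleted:
		return "deleted"
	}
	return fmt.Sprintf("Status(%d)", s)
}

func main() {
	// The compiler accepts this; only the output reveals the missing case.
	fmt.Println(StatusActive, StatusArchived) // prints "active Status(4)"
}
```

Nothing fails at build time, so the gap only shows up in logs or API responses.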

For larger enumerations, mock generation, and serialization code, the hand-written approach scales worse:

// Imagine writing this for every enum type in a codebase with 20 enums
// Or maintaining mocks for every interface with 8 methods
// These are solved problems — code generation should own them

The Idiomatic Way

stringer is the standard tool for enum String() methods. Install it once (go install golang.org/x/tools/cmd/stringer@latest), add a //go:generate directive, and regenerate whenever the constants change:

// status.go
package model

//go:generate stringer -type=Status -linecomment

type Status int

const (
    StatusPending   Status = iota // pending
    StatusActive                  // active
    StatusSuspended               // suspended
    StatusDeleted                 // deleted
)

Running go generate ./... produces status_string.go. Its String() method uses a lookup table (an index into one concatenated string of names) instead of a switch, and it matches the constants exactly as of the last generate run. The -linecomment flag uses the comment text as the string value, so StatusPending prints as "pending".
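For intuition, here is a hand-written sketch of the lookup-table technique. It approximates the shape of what stringer emits but is not its literal output; the identifiers statusNames and statusIndex are made up for this example.

```go
package main

import (
	"fmt"
	"strconv"
)

type Status int

const (
	StatusPending Status = iota
	StatusActive
	StatusSuspended
	StatusDeleted
)

// All names concatenated into one string. statusIndex[i] is the offset where
// name i starts, and statusIndex[i+1] is where it ends.
const statusNames = "pendingactivesuspendeddeleted"

var statusIndex = [...]uint8{0, 7, 13, 22, 29}

// String slices the name out of statusNames. Out-of-range values fall back
// to a numeric form such as "Status(9)".
func (s Status) String() string {
	if s < 0 || int(s) >= len(statusIndex)-1 {
		return "Status(" + strconv.Itoa(int(s)) + ")"
	}
	return statusNames[statusIndex[s]:statusIndex[s+1]]
}

func main() {
	fmt.Println(StatusPending, StatusSuspended, Status(9)) // prints "pending suspended Status(9)"
}
```

The table is tedious to maintain by hand, since every offset shifts when a name changes. That is why it belongs in generated code.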

For more complex code generation, text/template is the standard tool. Here is a generator that produces type-specific Set implementations (avoiding reflection for simple contains checks):

// generate/set_gen.go — generator program, lives in a generate/ subdirectory
package main

import (
    "bytes"
    "fmt"
    "go/format"
    "os"
    "strings"
    "text/template"
)

const setTemplate = `// Code generated by generate/set_gen.go; DO NOT EDIT.

package {{.Package}}

// {{.TypeName}}Set is a set of {{.ElementType}} values.
type {{.TypeName}}Set map[{{.ElementType}}]struct{}

func New{{.TypeName}}Set(vals ...{{.ElementType}}) {{.TypeName}}Set {
    s := make({{.TypeName}}Set, len(vals))
    for _, v := range vals {
        s[v] = struct{}{}
    }
    return s
}

func (s {{.TypeName}}Set) Contains(v {{.ElementType}}) bool {
    _, ok := s[v]
    return ok
}

func (s {{.TypeName}}Set) Add(v {{.ElementType}}) {
    s[v] = struct{}{}
}

func (s {{.TypeName}}Set) Remove(v {{.ElementType}}) {
    delete(s, v)
}

func (s {{.TypeName}}Set) Len() int {
    return len(s)
}
`

type SetConfig struct {
    Package     string
    TypeName    string
    ElementType string
}

func main() {
    configs := []SetConfig{
        {Package: "model", TypeName: "String", ElementType: "string"},
        {Package: "model", TypeName: "Int",    ElementType: "int"},
        {Package: "model", TypeName: "UserID", ElementType: "UserID"},
    }

    tmpl := template.Must(template.New("set").Parse(setTemplate))

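    for _, cfg := range configs {
        // Render into memory first so a template error never leaves a
        // half-written file on disk.
        var buf bytes.Buffer
        if err := tmpl.Execute(&buf, cfg); err != nil {
            panic(err)
        }

        // Format the generated code with gofmt rules; this also fails
        // if the template produced invalid Go.
        formatted, err := format.Source(buf.Bytes())
        if err != nil {
            panic(err)
        }

        // go generate runs this program from the model/ directory,
        // so the output path is relative to it.
        filename := fmt.Sprintf("../model/%s_set_gen.go",
            strings.ToLower(cfg.TypeName))
        if err := os.WriteFile(filename, formatted, 0o644); err != nil {
            panic(err)
        }
    }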
}

The //go:generate directive in the model package:

// model/types.go
package model

//go:generate go run ../generate/set_gen.go

Running go generate ./... produces string_set_gen.go, int_set_gen.go, and userid_set_gen.go. Each is a zero-reflection, fast, type-safe set implementation. The UserID variant compiles only if package model already declares a comparable UserID type.
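As a usage sketch, the block below reproduces what the template renders for the {TypeName: "String", ElementType: "string"} config. It is placed in package main here so the example runs on its own, and the role names are made-up sample data.

```go
package main

import "fmt"

// StringSet is a set of string values, as the template would render it.
type StringSet map[string]struct{}

func NewStringSet(vals ...string) StringSet {
	s := make(StringSet, len(vals))
	for _, v := range vals {
		s[v] = struct{}{}
	}
	return s
}

func (s StringSet) Contains(v string) bool {
	_, ok := s[v]
	return ok
}

func (s StringSet) Add(v string) { s[v] = struct{}{} }

func (s StringSet) Remove(v string) { delete(s, v) }

func (s StringSet) Len() int { return len(s) }

func main() {
	roles := NewStringSet("admin", "editor")
	roles.Add("viewer")
	roles.Remove("editor")

	// Passing a non-string value to Contains would be a compile error,
	// which is the type safety the generated code buys.
	fmt.Println(roles.Contains("admin"), roles.Contains("editor"), roles.Len()) // prints "true false 2"
}
```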

In The Wild

The most impactful code generation I have set up was for database query builders. We had a pattern where every repository method needed the same boilerplate: open connection, build query, scan results, handle errors, close rows. Writing this by hand for thirty repository functions was tedious and error-prone. A generator that read //go:generate annotations and produced the boilerplate from SQL file templates cut the hand-maintained part of the repository layer from ~2,000 lines to ~400 lines of generator definition. The other ~1,600 lines still existed, but they were generated, and so correct by construction as long as the templates were.

The key discipline for working with generated code:

// ALWAYS add this comment at the top of every generated file
// Code generated by generate/queries.go; DO NOT EDIT.

// Also pick one convention for generated files: list them in .gitignore or commit them.
// Committing is often better because CI can verify that checked-in generated code
// matches what the generator produces (catching forgotten regenerations).

CI check for regeneration drift:

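# In CI pipeline: regenerate and fail if the working tree changed.
# git status --porcelain also reports new, untracked generated files,
# which git diff alone would miss.
go generate ./...
if [ -n "$(git status --porcelain)" ]; then
    echo "Generated files are out of date. Run 'go generate ./...' and commit."
    git status --short
    git diff
    exit 1
fi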

This catches the common mistake of editing a source type without regenerating its derived code.

The Gotchas

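Generated files should start with // Code generated ... DO NOT EDIT. This is a recognized Go convention: a line matching ^// Code generated .* DO NOT EDIT\.$ that appears before the first non-comment, non-blank text in the file. Tools that recognize the marker, including many linters and coverage reporters, can treat the file as generated and apply different rules (for example, no coverage requirements and no lint warnings for generated patterns).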
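A tool-side check for that marker can be sketched as follows. isGenerated is a hypothetical helper that works line by line and ignores /* */ block comments; recent Go releases also provide ast.IsGenerated in go/ast for files that have already been parsed.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// generatedRE is the marker pattern from the Go convention for generated files.
var generatedRE = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

// isGenerated reports whether src carries the marker before its first line
// of code. Simplified sketch: only // line comments are understood.
func isGenerated(src string) bool {
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := sc.Text()
		if generatedRE.MatchString(line) {
			return true
		}
		trimmed := strings.TrimSpace(line)
		if trimmed != "" && !strings.HasPrefix(trimmed, "//") {
			return false // reached code without seeing the marker
		}
	}
	return false
}

func main() {
	generated := "// Code generated by generate/queries.go; DO NOT EDIT.\n\npackage model\n"
	handWritten := "// Package model holds domain types.\npackage model\n"
	fmt.Println(isGenerated(generated), isGenerated(handWritten)) // prints "true false"
}
```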

go generate does not run automatically. It requires an explicit go generate ./... invocation. It is not part of go build or go test. You must add it to your CI pipeline and developer workflow explicitly.

Generator programs need to be available. If your generator is an external binary (stringer, mockgen), it must be installed. The Go community convention is to add generator tools as dependencies in tools.go:

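// tools.go: blank imports that make go.mod record (and pin) generator tool versions.
// The "tools" build tag keeps this file out of normal builds.
//go:build tools

package tools

import (
    _ "golang.org/x/tools/cmd/stringer"
    // github.com/golang/mock is archived upstream; go.uber.org/mock is the maintained fork.
    _ "github.com/golang/mock/mockgen"
)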

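This makes go mod tidy track the tools as module dependencies, so go install (run inside the module, without an @version suffix) installs them at the pinned version. From Go 1.24 onward, a tool directive in go.mod, added with go get -tool, serves the same purpose without the extra file.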

Key Takeaway

🎓 Course Complete! You have reached the end of “Go Reflection & Metaprogramming.” The arc of these seven lessons is a deliberate progression: understand when reflection is justified, measure its costs honestly, learn to use struct tags effectively, build a serializer to internalize the patterns, apply those patterns to validation, recognize when generics or code generation are better alternatives, and finally embrace code generation as a first-class tool for the regular, structural code that would otherwise be written tediously by hand. Reflection, generics, and code generation are three points on a spectrum from maximum flexibility (reflection) to maximum performance and safety (code generation), with generics as the ergonomic middle ground for typed algorithms. Using each in its appropriate place produces Go code that is efficient where it must be, flexible where it needs to be, and maintainable everywhere.

