Lesson 29: State Machine DP — Track what state you're in -

Most DP problems have a clean one-dimensional state: position in an array, remaining capacity, current index. State machine DP adds another dimension that isn’t just a number — it’s a mode or status that your system is in. “Am I currently holding a stock? Am I in a cooldown period? Which color did I just paint the last house?”

The stock trading problems are the canonical example. LeetCode has an entire series of them (I, II, III, IV, with Cooldown, with Transaction Fee) that all share the same structure but add constraints one by one. Solving them all with the same mental framework is satisfying once the state machine clicks.

The key skill: identify the finite set of states, define a DP value for each state at each time step, and write transitions that respect the rules for moving between states.

The Pattern

State machine DP follows this shape:

Define the states: what distinct “modes” can the system be in?
For each day/step, maintain a value for each state.
Transitions: how can you move from state A to state B on the current step? What cost/reward does that transition carry?
Initialize states at day 0 (or day 1) carefully — some states are impossible at the start.

Problem 1: Best Time to Buy and Sell Stock — Variants II, III, IV, and Cooldown

Let me work through the variants systematically, because the state machine approach makes all of them fall from a single template.

Variant II: Unlimited Transactions (LeetCode 122)

States: held (holding a stock), free (not holding, no restriction).

Transitions:

free → held: buy on this day. held[i] = max(held[i-1], free[i-1] - prices[i])
held → free: sell on this day. free[i] = max(free[i-1], held[i-1] + prices[i])
Stay in same state (do nothing).

func maxProfitII(prices []int) int {
    held := -prices[0] // bought on day 0
    free := 0          // didn't buy on day 0

    for i := 1; i < len(prices); i++ {
        newHeld := held
        if free-prices[i] > newHeld {
            newHeld = free - prices[i]
        }
        newFree := free
        if held+prices[i] > newFree {
            newFree = held + prices[i]
        }
        held = newHeld
        free = newFree
    }
    return free
}

O(n) time, O(1) space. The newHeld/newFree temporaries prevent using today’s updated values in the same step.

Variant with Cooldown (LeetCode 309)

After selling, you must wait one day before buying again. Add a cooldown state:

States: held (holding), cooldown (just sold, must wait), free (can buy freely).

Transitions:

free → held: buy. held[i] = max(held[i-1], free[i-1] - prices[i])
held → cooldown: sell. cooldown[i] = held[i-1] + prices[i]
cooldown → free: cooldown expires. free[i] = max(free[i-1], cooldown[i-1])

func maxProfitCooldown(prices []int) int {
    held := -prices[0]
    cooldown := 0
    free := 0

    for i := 1; i < len(prices); i++ {
        newHeld := held
        if free-prices[i] > newHeld {
            newHeld = free - prices[i]
        }
        newCooldown := held + prices[i]
        newFree := free
        if cooldown > newFree {
            newFree = cooldown
        }
        held = newHeld
        cooldown = newCooldown
        free = newFree
    }
    return free
}

The cooldown variant is where the state machine framing pays off. Without it, you’d try to handle the “skip next day” rule inline with awkward index arithmetic.

Variant III: At Most 2 Transactions (LeetCode 123)

Now transactions are limited. Add a “transaction count” dimension to the state.

States: hold1 (holding after first buy), free1 (sold once, or never bought), hold2 (holding after second buy), free2 (sold twice — or reached after any second sale).

Transitions in order of state progression:

free1 → hold1: first buy. hold1 = max(hold1, free1 - prices[i])
hold1 → free1': first sell. free1 = max(free1, hold1 + prices[i])` ← this updates in-place because we use the sell state
Actually cleaner to track: hold1, sold1, hold2, sold2.

func maxProfitIII(prices []int) int {
    hold1 := -prices[0]
    sold1 := 0
    hold2 := -prices[0]
    sold2 := 0

    for i := 1; i < len(prices); i++ {
        p := prices[i]
        // Update in reverse order of state chain to avoid using today's values
        sold2 = max2(sold2, hold2+p)
        hold2 = max2(hold2, sold1-p)
        sold1 = max2(sold1, hold1+p)
        hold1 = max2(hold1, -p)
    }
    return sold2
}

func max2(a, b int) int {
    if a > b {
        return a
    }
    return b
}

Why update in reverse order: sold2 uses hold2 from yesterday. If we update hold2 first, we’d be using today’s hold2 to update sold2. Updating in reverse (sold2, hold2, sold1, hold1) ensures each state uses yesterday’s values for the previous state.

Variant IV: At Most k Transactions (LeetCode 188)

Generalize: maintain arrays hold[k] and sold[k].

func maxProfitIV(k int, prices []int) int {
    n := len(prices)
    if n == 0 {
        return 0
    }
    // If k >= n/2, it's equivalent to unlimited transactions
    if k >= n/2 {
        return maxProfitII(prices)
    }

    hold := make([]int, k)
    sold := make([]int, k)
    for i := range hold {
        hold[i] = -prices[0]
    }

    for i := 1; i < n; i++ {
        p := prices[i]
        for j := k - 1; j >= 0; j-- {
            sold[j] = max2(sold[j], hold[j]+p)
            if j == 0 {
                hold[j] = max2(hold[j], -p)
            } else {
                hold[j] = max2(hold[j], sold[j-1]-p)
            }
        }
    }
    return sold[k-1]
}

The k >= n/2 short-circuit is important: with enough allowed transactions, you can capture every upward move, which equals the unlimited variant.

Problem 2: Paint Houses

LeetCode 256 / 265. There are n houses to paint. Paint house i with one of k colors. Adjacent houses cannot be the same color. The cost of painting house i with color c is costs[i][c]. Minimize total cost.

Define the subproblem

dp[i][c] = minimum cost to paint houses 0 through i such that house i is painted color c.

Recurrence

dp[i][c] = costs[i][c] + min over all c' != c of dp[i-1][c']

For k = 3 colors, this is:

dp[i][0] = costs[i][0] + min(dp[i-1][1], dp[i-1][2])
dp[i][1] = costs[i][1] + min(dp[i-1][0], dp[i-1][2])
dp[i][2] = costs[i][2] + min(dp[i-1][0], dp[i-1][1])

Build bottom-up (3 colors)

func minCostPaint(costs [][]int) int {
    n := len(costs)
    if n == 0 {
        return 0
    }

    dp := make([][]int, n)
    for i := range dp {
        dp[i] = make([]int, 3)
    }
    dp[0][0] = costs[0][0]
    dp[0][1] = costs[0][1]
    dp[0][2] = costs[0][2]

    for i := 1; i < n; i++ {
        dp[i][0] = costs[i][0] + min2(dp[i-1][1], dp[i-1][2])
        dp[i][1] = costs[i][1] + min2(dp[i-1][0], dp[i-1][2])
        dp[i][2] = costs[i][2] + min2(dp[i-1][0], dp[i-1][1])
    }

    result := dp[n-1][0]
    if dp[n-1][1] < result {
        result = dp[n-1][1]
    }
    if dp[n-1][2] < result {
        result = dp[n-1][2]
    }
    return result
}

func min2(a, b int) int {
    if a < b {
        return a
    }
    return b
}

Space optimization and scaling to k colors

For k colors (LeetCode 265), the naive transition is O(k) per state, giving O(nk²) total. But you only need the two smallest values from the previous row — not the full array — to compute each transition in O(1):

func minCostII(costs [][]int) int {
    n, k := len(costs), len(costs[0])
    if n == 0 {
        return 0
    }

    prev := make([]int, k)
    copy(prev, costs[0])

    for i := 1; i < n; i++ {
        // Find two smallest values in prev
        min1, min2 := 1<<31-1, 1<<31-1
        min1Idx := -1
        for c := 0; c < k; c++ {
            if prev[c] < min1 {
                min2 = min1
                min1 = prev[c]
                min1Idx = c
            } else if prev[c] < min2 {
                min2 = prev[c]
            }
        }

        curr := make([]int, k)
        for c := 0; c < k; c++ {
            if c == min1Idx {
                curr[c] = costs[i][c] + min2
            } else {
                curr[c] = costs[i][c] + min1
            }
        }
        prev = curr
    }

    result := prev[0]
    for _, v := range prev[1:] {
        if v < result {
            result = v
        }
    }
    return result
}

O(nk) time. The two-minimum trick is broadly applicable: any time a DP transition says “minimum of all neighbors except self,” precompute the global min and second min, and handle the “except self” case by using the second min only when the current index matches the global minimum’s index.

How to Recognize This Pattern

You’re looking at state machine DP when:

The problem has a finite set of modes the system can be in at each time step.
The mode affects what transitions are legal (can’t buy while holding, can’t buy during cooldown).
The state isn’t just “which index am I at” but also “what status am I currently in.”

Watch for: transactions with limits, cooldown periods, consecutive constraints (can’t do X twice in a row), color/category constraints on adjacent items.

The number of states is usually small (2-5). If you find yourself needing a large number of states, the state probably encodes too much — look for a simpler reformulation.

Key Takeaway

State machine DP names the modes explicitly and writes one transition equation per valid state change. The stock problems are all the same framework: states represent (holding/not holding) with added constraints (cooldown, transaction count) expanding the state space.

The practical rule: write the state transition diagram first (on paper or mentally). Each arrow is one line of your recurrence. Then implement.

Update states in reverse dependency order to avoid using today’s computed values instead of yesterday’s.

Previous: Lesson 28: Interval DP · Next: Lesson 30: Bitmask DP

Atharva Pandey/Lesson 29: State Machine DP — Track what state you're in