
Atharva Pandey/Lesson 1: clap — Argument parsing done right

Created Sun, 01 Sep 2024 14:22:00 +0000 Modified Sun, 01 Sep 2024 14:22:00 +0000

I’ve written argument parsers by hand in four different languages. Every single time, I ended up with a tangled mess of string matching, edge cases around flags that take optional values, and help text that drifted out of sync with reality within a week. Then I found clap.

The Problem With Rolling Your Own

Parsing command-line arguments seems simple. You grab std::env::args(), maybe split on =, handle some --flag and -f cases. Works great until someone passes --output= with no value. Or uses -vvv for triple verbosity. Or expects --help to just work. Or wants --color=always|never|auto. Suddenly your 40-line parser is 400 lines of spaghetti.
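To make that growth concrete, here is a minimal sketch of a hand-rolled parser that uses only the standard library. The `Args` struct and `parse_args` function are hypothetical names made up for illustration. It covers just `--output` in two spellings and stacked `-v` flags, and already needs one branch per edge case:

```rust
use std::env;

/// Hypothetical options, for illustration only.
#[derive(Debug, Default, PartialEq)]
struct Args {
    output: Option<String>,
    verbose: u8,
    input: Option<String>,
}

/// Parse arguments by hand (the program name is already stripped).
fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<Args, String> {
    let mut parsed = Args::default();
    let mut iter = args.into_iter();

    while let Some(arg) = iter.next() {
        if let Some(value) = arg.strip_prefix("--output=") {
            // `--output=` with nothing after the `=` is an easy case to miss.
            if value.is_empty() {
                return Err("--output requires a value".to_string());
            }
            parsed.output = Some(value.to_string());
        } else if arg == "--output" {
            // The space-separated form consumes the next argument.
            let value = iter.next().ok_or("--output requires a value")?;
            parsed.output = Some(value);
        } else if arg.len() > 1 && arg.starts_with('-') && arg[1..].chars().all(|c| c == 'v') {
            // `-v`, `-vv`, `-vvv`: count the stacked flags.
            parsed.verbose = parsed.verbose.saturating_add((arg.len() - 1) as u8);
        } else if arg.starts_with('-') {
            return Err(format!("unknown flag: {arg}"));
        } else if parsed.input.is_none() {
            parsed.input = Some(arg);
        } else {
            return Err(format!("unexpected argument: {arg}"));
        }
    }
    Ok(parsed)
}

fn main() {
    match parse_args(env::args().skip(1)) {
        Ok(args) => println!("{args:?}"),
        Err(msg) => {
            eprintln!("error: {msg}");
            std::process::exit(2);
        }
    }
}
```

Note what is still missing from this sketch: `--help`, short `-o`, value validation, and any help text at all.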

The real issue isn’t parsing — it’s everything around parsing. Validation, help generation, shell completions, error messages that don’t make users want to throw their laptop. That’s what clap handles.

Setting Up clap

Add it to your Cargo.toml:

[package]
name = "myapp"
version = "0.1.0"
edition = "2021"

[dependencies]
clap = { version = "4", features = ["derive"] }

The derive feature is what makes clap worth using. Without it, you’re writing builder-pattern code that’s fine but verbose. With derive, your argument definitions are your struct:

use clap::Parser;

#[derive(Parser, Debug)]
#[command(name = "myapp")]
#[command(about = "A tool that does something useful")]
#[command(version)]
struct Cli {
    /// Input file to process
    input: String,

    /// Output file (defaults to stdout)
    #[arg(short, long)]
    output: Option<String>,

    /// Enable verbose output
    #[arg(short, long, action = clap::ArgAction::Count)]
    verbose: u8,
}

fn main() {
    let cli = Cli::parse();

    println!("Input: {}", cli.input);

    if let Some(ref out) = cli.output {
        println!("Output: {}", out);
    }

    match cli.verbose {
        0 => println!("Quiet mode"),
        1 => println!("Verbose"),
        2 => println!("Very verbose"),
        _ => println!("Maximum verbosity"),
    }
}

Run it with --help and you get clean, formatted output — for free. The doc comments (///) become the help descriptions. The type system drives the validation. An Option<String> means the flag is optional. A plain String means it’s required. A u8 with ArgAction::Count means you can stack -vvv.

Real-World Patterns

Here’s what a production-quality CLI actually looks like. I’ll build a file search tool:

use clap::Parser;
use std::path::PathBuf;

#[derive(Parser, Debug)]
#[command(
    name = "hunt",
    about = "Search files for patterns",
    version,
    after_help = "Examples:\n  hunt 'TODO' src/\n  hunt -i 'fixme' --ext rs,toml ."
)]
struct Cli {
    /// Pattern to search for (regex supported)
    pattern: String,

    /// Directory to search in
    #[arg(default_value = ".")]
    path: PathBuf,

    /// Case-insensitive search
    #[arg(short, long)]
    ignore_case: bool,

    /// File extensions to include (comma-separated)
    #[arg(long, value_delimiter = ',')]
    ext: Vec<String>,

    /// Maximum search depth
    #[arg(short = 'd', long, default_value = "10")]
    max_depth: usize,

    /// Number of context lines before and after match
    #[arg(short = 'C', long, default_value = "0")]
    context: usize,

    /// Output format
    #[arg(long, value_enum, default_value = "text")]
    format: OutputFormat,

    /// Suppress all output except matches
    #[arg(short, long)]
    quiet: bool,
}

#[derive(clap::ValueEnum, Clone, Debug)]
enum OutputFormat {
    Text,
    Json,
    Count,
}

fn main() {
    let cli = Cli::parse();

    if !cli.path.exists() {
        eprintln!("Error: path '{}' does not exist", cli.path.display());
        std::process::exit(1);
    }

    println!("Searching for '{}' in {}", cli.pattern, cli.path.display());
    println!("Case insensitive: {}", cli.ignore_case);
    println!("Extensions: {:?}", cli.ext);
    println!("Max depth: {}", cli.max_depth);
    println!("Context lines: {}", cli.context);
    println!("Format: {:?}", cli.format);
}

A few things worth noting:

value_delimiter — the --ext rs,toml flag splits on commas and gives you a Vec<String>. No manual splitting needed.

ValueEnum — the OutputFormat enum restricts the --format flag to exactly text, json, or count. Pass anything else and clap prints an error with the valid options.

default_value — sets sensible defaults without Option<T>. The user doesn’t need to know about max depth unless they want to change it.

after_help — those usage examples at the bottom of --help that actually make your tool usable.

Validation Beyond Types

Sometimes type-level validation isn’t enough. You want to check that a port number is in range, or that a file exists, or that two flags aren’t used together:

use clap::Parser;
use std::path::PathBuf;

#[derive(Parser, Debug)]
#[command(name = "serve")]
struct Cli {
    /// Port to listen on
    // `..=` makes the upper bound inclusive, so port 65535 is accepted.
    #[arg(short, long, default_value = "8080", value_parser = clap::value_parser!(u16).range(1024..=65535))]
    port: u16,

    /// Config file (must exist)
    #[arg(short, long, value_parser = existing_file)]
    config: Option<PathBuf>,

    /// Run in daemon mode
    #[arg(long)]
    daemon: bool,

    /// Run in foreground mode
    #[arg(long, conflicts_with = "daemon")]
    foreground: bool,

    /// Workers count (requires daemon mode)
    #[arg(long, requires = "daemon", default_value = "4")]
    workers: usize,
}

fn existing_file(s: &str) -> Result<PathBuf, String> {
    let path = PathBuf::from(s);
    if path.exists() {
        Ok(path)
    } else {
        Err(format!("file '{}' does not exist", s))
    }
}

fn main() {
    let cli = Cli::parse();
    println!("{:#?}", cli);
}

The value_parser on port restricts it to the range 1024 to 65535. The existing_file function is a custom value parser: clap calls it with the raw string and reports a returned Err as an ordinary usage error. The conflicts_with attribute means you can’t pass both --daemon and --foreground. The requires attribute means --workers is only valid with --daemon. Try breaking any of these rules and clap gives you a clear error, with no validation code in main.

Environment Variable Fallbacks

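Almost every production CLI needs to read from environment variables too. clap supports this through the env attribute, which sits behind the crate’s env Cargo feature, so the dependency line becomes clap = { version = "4", features = ["derive", "env"] }: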

use clap::Parser;

#[derive(Parser, Debug)]
#[command(name = "deploy")]
struct Cli {
    /// API token for authentication
    #[arg(long, env = "DEPLOY_API_TOKEN")]
    api_token: String,

    /// Target environment
    #[arg(long, env = "DEPLOY_ENV", default_value = "staging")]
    environment: String,

    /// Enable dry run
    #[arg(long, env = "DEPLOY_DRY_RUN")]
    dry_run: bool,
}

fn main() {
    let cli = Cli::parse();
    println!("Deploying to {} (dry_run: {})", cli.environment, cli.dry_run);
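    // Show at most the first 8 characters; slicing with [..8] would panic on a shorter token.
    let preview: String = cli.api_token.chars().take(8).collect();
    println!("Token: {}...", preview);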
}

Now --api-token can be passed on the command line or via DEPLOY_API_TOKEN. The flag takes priority if both are set. The --help output even shows [env: DEPLOY_API_TOKEN] next to the flag description.

Global Options With Flattening

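When your CLI gets complex, you’ll want to share options across subcommands. #[command(flatten)] pulls a shared options struct into a command, and global = true on each option makes it usable from subcommands as well: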

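use clap::{Args, Parser};

// Args (rather than Parser) is the derive intended for structs that are flattened into another command.
#[derive(Args, Debug)]
struct GlobalOpts {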
    /// Enable verbose output
    #[arg(short, long, global = true)]
    verbose: bool,

    /// Config file path
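    // Note: clap passes "~" through literally; expanding it to the home directory is up to the program.
    #[arg(short, long, global = true, default_value = "~/.config/myapp/config.toml")]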
    config: String,

    /// Disable color output
    #[arg(long, global = true)]
    no_color: bool,
}

#[derive(Parser, Debug)]
#[command(name = "myapp")]
struct Cli {
    #[command(flatten)]
    global: GlobalOpts,

    /// What to do
    #[arg(default_value = "run")]
    action: String,
}

fn main() {
    let cli = Cli::parse();
    if cli.global.verbose {
        println!("Verbose mode enabled");
    }
    println!("Action: {}", cli.action);
}

This is how you avoid repeating --verbose, --config, and --no-color across every subcommand definition. Define them once, flatten them in.

Common Mistakes

Forgetting #[arg(short, long)]. Without these attributes, clap treats the field as a positional argument. If you meant it to be --output, you need to say so explicitly.

Using String when PathBuf works. If the argument is a file path, use PathBuf. It handles platform-specific path separators and integrates with Rust’s filesystem APIs.

Adding action = ArgAction::SetTrue to every boolean. You don’t need to: clap 4 applies it automatically to bool fields. But if you’re migrating from clap 3, this tripped a lot of people up.

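Ignoring try_parse. Cli::parse() prints the message and exits the process on failure: exit code 2 for usage errors, and exit code 0 when the user asked for --help or --version. If you’re in a library or want custom error handling, use Cli::try_parse(), which returns a Result.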

clap is one of those crates that gets out of your way once you learn the derive API. You describe what you want, the library handles the rest. The next lesson covers what happens after you’ve parsed your arguments — actually reading from stdin, writing to stdout, and handling I/O in a CLI context.