Skip to main content

Capture Patterns and Variable Binding: Extracting Values

Capture patterns let you extract values from matched data and bind them to variables. Unlike literal patterns that only test equality, capture patterns extract the matched value so you can use it in the case body. A capture pattern is a bare identifier (like x, user, or value) that binds whatever was matched to that name. Combined with as-patterns, you can destructure complex nested data in a single line—a feature that makes processing JSON, API responses, and configuration files far simpler than manual dictionary access.

I've used capture patterns to process thousands of API events daily. Where old code had data = response['payload'] followed by 10 manual key lookups, a single pattern now extracts and validates the structure simultaneously. This article covers capture variables, the as keyword, binding in nested patterns, and avoiding common pitfalls like variable shadowing.

How Capture Patterns Extract and Bind Values

A capture pattern is the simplest way to extract a value. Use a lowercase identifier to bind the matched expression to a variable available in the case body. Any value that reaches a capture pattern matches—it's essentially a wildcard that also assigns the value.

def process_message(msg):
"""Extract a message and print it."""
match msg:
case 1 | 2 | 3:
print("Small number")
case value: # Capture pattern: binds matched value to 'value'
print(f"Caught: {value}")

process_message(42) # Output: Caught: 42
process_message(3) # Output: Small number (literal pattern matches first)
process_message("hi") # Output: Caught: hi

The identifier value is a capture pattern. Since it comes after the literal cases, it matches anything that didn't match 1, 2, or 3. Inside the case body, value is bound to the actual argument passed to process_message. This is safer than a bare _ because you can reference the matched value without accessing the original variable.

Using As-Patterns for Structured Extraction

The as keyword binds a pattern to a variable, letting you name a matched structure. This is powerful with sequences and mappings (covered later), but it works with any pattern. Syntax: pattern as name. The name is bound in the case body.

def validate_and_process(data: dict) -> str:
"""Extract and validate a user record."""
match data:
case {"name": str(name), "age": int(age)} as user:
return f"Valid user: {name}, {age} years old. Data: {user}"
case {"name": str(name)} as user:
return f"User {name} is missing age. Data: {user}"
case _:
return "Invalid data structure"

result1 = validate_and_process({"name": "Alice", "age": 30})
print(result1)
# Output: Valid user: Alice, 30 years old. Data: {'name': 'Alice', 'age': 30}

result2 = validate_and_process({"name": "Bob"})
print(result2)
# Output: User Bob is missing age. Data: {'name': 'Bob'}

Here, as user captures the entire matched dictionary. Inside the case body, user is the original dictionary, while name and age are extracted. This is useful for logging or further validation—you have both the parts and the whole.

Binding Values in Nested Structures

Capture patterns shine when extracting from nested data like API responses. You can bind multiple variables from a single pattern, extracting exactly what you need while ignoring the rest.

def handle_api_response(response: dict) -> str:
"""Process an API response and extract error or user data."""
match response:
case {"status": "error", "code": code, "message": msg}:
return f"Error {code}: {msg}"
case {"status": "success", "data": {"user": {"id": user_id, "email": email}}}:
return f"User ID {user_id} email: {email}"
case {"status": status}:
return f"Unknown status: {status}"
case _:
return "Invalid response format"

# Test with nested JSON-like data
response1 = {
"status": "success",
"data": {
"user": {"id": 123, "email": "[email protected]", "role": "admin"}
}
}
print(handle_api_response(response1))
# Output: User ID 123 email: [email protected]

response2 = {"status": "error", "code": 401, "message": "Unauthorized"}
print(handle_api_response(response2))
# Output: Error 401: Unauthorized

Each capture variable (code, msg, user_id, email) extracts a nested value without manual dictionary lookups. Python validates the structure simultaneously—if a key is missing or a type is wrong, the pattern doesn't match. Compare this to data["status"]["user"]["id"], which would raise KeyError if any level is missing.

Understanding Variable Scope and Shadowing

Capture variables are scoped to the case body. Each case has its own scope, so you can reuse variable names across different cases. However, within a case, you can't capture the same name twice—Python raises a SyntaxError. This prevents accidental shadowing.

def process_transaction(txn: dict) -> str:
"""Process a transaction and return its status."""
match txn:
case {"type": "deposit", "amount": amount, "account": account}:
# amount and account are bound here
return f"Deposit of {amount} to {account}"
case {"type": "withdrawal", "amount": amount, "account": account}:
# Different scope: amount and account can be reused
return f"Withdrawal of {amount} from {account}"
case {"type": tx_type, "amount": amount}:
# amount is bound again; different case scope is fine
return f"Transaction type {tx_type} for {amount}"
case _:
return "Invalid transaction"

# Each case has independent scope
print(process_transaction({"type": "deposit", "amount": 100, "account": "savings"}))
# Output: Deposit of 100 to savings

print(process_transaction({"type": "withdrawal", "amount": 50, "account": "checking"}))
# Output: Withdrawal of 50 from checking

Each case has its own scope for captured variables. You can safely use amount in multiple cases because they don't overlap. However, within a single case, duplicating a capture name is an error:

# This raises SyntaxError: binding to name 'x' is ambiguous
# match value:
# case [x, x]: # Can't bind x twice
# pass

This restriction prevents subtle bugs where duplicate captures could hide the actual data structure.

Comparison: Capture Patterns vs. Manual Extraction

Capture patterns make your intent explicit and validate structure simultaneously. Here's how they compare to manual dictionary access:

ApproachReadabilitySafetyPerformance
Manual dict access: data['key']['nested']Low; syntax is verboseLow; KeyError if missingFastest (direct lookup)
.get() chaining: data.get('key', {}).get('nested')Medium; more codeMedium; returns None silentlyGood (one lookup per key)
Capture pattern: case {"key": {"nested": val}}:High; intent is clearHigh; pattern doesn't match if missingGood; one comparison, one binding
# Manual approach: verbose and error-prone
def get_user_email_manual(data: dict):
if "user" in data and "email" in data["user"]:
return data["user"]["email"]
return None

# Capture pattern approach: clean and safe
def get_user_email_pattern(data: dict):
match data:
case {"user": {"email": email}}:
return email
case _:
return None

# Both return None if structure is wrong, but pattern version is clearer
test_data = {"user": {"email": "[email protected]"}}
print(get_user_email_manual(test_data)) # Output: [email protected]
print(get_user_email_pattern(test_data)) # Output: [email protected]

Key Takeaways

  • Capture patterns are identifiers that bind matched values to variables in the case body.
  • The as keyword attaches a name to any pattern, binding the entire matched structure as well as its extracted parts.
  • Nested capture patterns extract deeply nested values in a single expression, avoiding manual dictionary lookups and KeyError.
  • Variable scope is per-case: you can reuse variable names across different cases without conflict.
  • You cannot capture the same name twice within a single case (Python raises SyntaxError).
  • Capture patterns validate structure and extract values simultaneously, making code more concise and less error-prone than manual .get() chaining.

Frequently Asked Questions

Can I use underscore _ as a capture variable?

No. The underscore _ is reserved as the wildcard/ignore pattern. Use _ when you don't need the matched value. To capture a value, use any other identifier like x, value, or data.

What's the difference between case x: and case x as y:?

case x: requires x to be an already-defined variable and matches if the matched value equals x. case y as z: doesn't work (invalid syntax). Use case pattern as name: to bind a pattern to a variable. To capture the value itself, use case value: where value is a bare identifier.

Can I use type hints in capture patterns?

Not directly in the pattern syntax. However, you can extract the value and check its type with a guard clause: case value if isinstance(value, int):. Type checking in patterns is done via type patterns (covered in article 4).

How do I extract a value and check if it's not None?

Use a guard: case value if value is not None: or combine with a type pattern: case int(x) if x > 0:. Guards (article 3) let you add conditions after a pattern matches.

Does binding a variable in a pattern modify the original data?

No. Capture patterns extract and bind values but don't modify the original data structure. The binding is read-only—you can't assign back to the matched data through the pattern variable.

Further Reading