Generators: Creating Iterators with `yield`
In the last article, we learned how to create a custom iterator by building a class that implements the full iterator protocol (__iter__ and __next__). While this is powerful, it's also quite verbose for many common use cases.
Python provides a much more elegant and concise way to create iterators: generator functions. A generator function looks like a normal function, but instead of using return to send back a single value, it uses the yield keyword to produce a sequence of values over time.
📚 Prerequisites
You should understand the concepts of iterables and iterators from the previous article.
🎯 Article Outline: What You'll Master
In this article, you will learn:
- ✅ What a Generator Is: Understand that a generator is a simpler way to create an iterator.
- ✅ The `yield` Keyword: Learn how `yield` pauses a function's execution and produces a value, without terminating the function.
- ✅ Generator Functions: How to write a function that uses `yield` to create a generator object.
- ✅ Why Generators are Powerful: Appreciate their memory efficiency and clean syntax.
🧠 Section 1: The Magic of yield
The yield keyword is the heart of generators. It might look like return, but its behavior is completely different.
- `return`: Exits a function completely, and the function's local state is destroyed.
- `yield`: Pauses the function, saves its current state, and sends a value back to the caller. When the caller asks for the next value, the function resumes execution right where it left off, with its local state intact.
Any function that contains a yield keyword is automatically a generator function. When you call a generator function, it doesn't run the code. Instead, it immediately returns a generator object, which is a special kind of iterator.
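To make the pause-and-resume behavior concrete, here is a minimal sketch. The function name `running_total` and its input list are made up for illustration. The point is that the local variable `total` survives between calls to `next()`, and that `StopIteration` is raised once the function body finishes.

```python
def running_total(numbers):
    """Yield the cumulative sum after each number (illustrative example)."""
    total = 0
    for n in numbers:
        total += n
        yield total  # pause here; `total` is preserved until the next request


gen = running_total([10, 20, 30])  # nothing in the body has run yet

first = next(gen)   # runs until the first yield
second = next(gen)  # resumes after the yield, with total still 10
third = next(gen)

print(first, second, third)  # 10 30 60

# Once the function body ends, the generator signals exhaustion.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

print(exhausted)  # True
```

A regular function using `return` inside the loop would hand back 10 and forget everything else. The generator keeps its place in the loop instead.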
💻 Section 2: Your First Generator Function
Let's rewrite the Counter iterator class from our last article as a generator function. The goal is to produce a sequence of numbers from a start to an end value.
The Old Way (Class-based Iterator):
class Counter:
    def __init__(self, start, end):
        self.current = start
        self.end = end

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.end:
            raise StopIteration
        value = self.current
        self.current += 1
        return value
This is 12 lines of code for a simple counter.
The New Way (Generator Function):
# generator_example.py
def counter_generator(start, end):
    """A generator that yields numbers from start up to, but not including, end."""
    print("Generator started...")
    current = start
    while current < end:
        print(f"Yielding {current}")
        yield current
        current += 1
    print("Generator finished.")

# --- Let's use it ---
my_gen = counter_generator(5, 8)
print(f"The type of my_gen is: {type(my_gen)}")

# The code inside the generator does NOT run until we ask for a value
print("\nAbout to ask for the first value...")
print(f"Received: {next(my_gen)}")

print("\nAbout to ask for the second value...")
print(f"Received: {next(my_gen)}")

print("\nNow, let's use it in a for loop:")
# The loop will pick up where the generator left off
for num in my_gen:
    print(f"For loop received: {num}")
Output:
The type of my_gen is: <class 'generator'>

About to ask for the first value...
Generator started...
Yielding 5
Received: 5

About to ask for the second value...
Yielding 6
Received: 6

Now, let's use it in a for loop:
Yielding 7
For loop received: 7
Generator finished.
This is much more concise and, for many, more intuitive. The while loop, the current variable, and the incrementing logic are all handled naturally within the function's scope. The generator automatically implements the __iter__ and __next__ methods for us.
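One way to check that claim, as a sketch: the hypothetical `count_up` generator below is a stripped-down stand-in for `counter_generator` (without the `print` calls), and the checks rely only on the built-ins `hasattr()`, `iter()`, and `list()`.

```python
def count_up(start, end):
    """Stand-in for counter_generator without the print calls (illustrative)."""
    current = start
    while current < end:
        yield current
        current += 1


gen = count_up(5, 8)

# Both protocol methods exist even though we never wrote them.
has_iter = hasattr(gen, "__iter__")
has_next = hasattr(gen, "__next__")

# Like the class-based Counter, a generator is its own iterator.
is_own_iterator = iter(gen) is gen

# Because it is an iterator, it can be consumed only once.
first_pass = list(gen)
second_pass = list(gen)

print(has_iter, has_next, is_own_iterator)  # True True True
print(first_pass)   # [5, 6, 7]
print(second_pass)  # []
```

The empty second pass is worth remembering: to iterate again, call the generator function again to get a fresh generator object.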
🛠️ Section 3: Why Use Generators?
Generators provide two major benefits:
- Readability and Simplicity: They are much easier to write and read than a full class-based iterator. The logic is contained in a simple function instead of being split between `__init__`, `__iter__`, and `__next__`.
- Memory Efficiency: This is the most important advantage. Because generators produce values one at a time (lazy evaluation), they don't store the entire sequence in memory.
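The memory difference can be observed with `sys.getsizeof`, as a rough sketch. The `squares` function and the value of `n` are invented for this illustration. Note that `sys.getsizeof` reports only the size of the container object itself (not the integers a list refers to), and the exact byte counts vary by Python version and platform, so treat the printed numbers as indicative only.

```python
import sys


def squares(n):
    """Yield the squares of 0..n-1 one at a time (illustrative example)."""
    for i in range(n):
        yield i * i


n = 100_000

as_list = list(squares(n))  # every value is built and stored up front
as_gen = squares(n)         # no values are computed yet

list_size = sys.getsizeof(as_list)  # size of the list object itself, in bytes
gen_size = sys.getsizeof(as_gen)    # size of the generator object, in bytes

print(f"list: {list_size} bytes, generator: {gen_size} bytes")

# The generator still produces the same values when consumed.
total = sum(as_gen)
```

The generator's size stays small no matter how large `n` is, because it holds only its paused state rather than the values themselves.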
Example: Processing a Large File
Imagine you need to process a log file that is several gigabytes in size.
Bad Approach (using a list):
def read_log_bad(filepath):
    with open(filepath, 'r') as f:
        return f.readlines()  # This loads the ENTIRE file into memory!
On a multi-gigabyte file, this can exhaust the available memory and crash your program.
Good Approach (using a generator):
def read_log_good(filepath):
    with open(filepath, 'r') as f:
        for line in f:
            # Yield one line at a time, using very little memory
            yield line.strip()

# You can now process the huge file line by line without memory issues
# for log_entry in read_log_good('massive.log'):
#     if "ERROR" in log_entry:
#         print(log_entry)
This pattern is perfect for data streaming, processing large files, or working with infinite sequences (like sensor readings).
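For the infinite-sequence case, here is a minimal sketch. The `powers_of_two` generator is a made-up example; `itertools.islice` is the standard-library helper for taking a bounded number of items from any iterator. Because values are produced only on demand, the `while True` loop never runs away on its own.

```python
from itertools import islice


def powers_of_two():
    """Yield 1, 2, 4, 8, ... forever (illustrative infinite generator)."""
    value = 1
    while True:  # never ends on its own; the consumer decides when to stop
        yield value
        value *= 2


# islice takes a bounded number of items from the endless generator.
first_five = list(islice(powers_of_two(), 5))
print(first_five)  # [1, 2, 4, 8, 16]

# A loop with an explicit break works too.
for p in powers_of_two():
    if p > 1000:
        break
first_above_1000 = p
print(first_above_1000)  # 1024
```

Calling `list(powers_of_two())` directly would never finish, so an infinite generator always needs a consumer that stops on its own terms.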
✨ Conclusion & Key Takeaways
Generators are a cornerstone of idiomatic and efficient Python. They provide a clean and simple syntax for creating iterators, which are fundamental to how Python handles data sequences.
Let's summarize the key takeaways:
- A generator is a simpler way to create an iterator.
- Any function containing the `yield` keyword is a generator function.
- `yield` pauses the function and produces a value, saving the function's state for the next call.
- Generators are memory-efficient because they produce items one at a time ("lazy evaluation"), making them ideal for large datasets.
Challenge Yourself:
Write a generator function called fibonacci_sequence() that yields the Fibonacci numbers one by one indefinitely. (The sequence starts 1, 1, 2, 3, 5, 8, ...)
➡️ Next Steps
Generators are a powerful tool, and Python provides an even more concise way to write them for simple cases. In the next article, we'll explore "Generator Expressions," which look like list comprehensions but create generators instead.
Happy yielding!