Iterators and Iterables: The Iterator Protocol
One of the most powerful and frequently used features in Python is the for loop. We use it to loop over lists, strings, dictionaries, and files. But how does it actually work? How does the for loop know how to handle all these different types of objects?
The answer lies in the iterator protocol, a fundamental design pattern in Python. Understanding this protocol is key to understanding a huge part of the language, and it's the foundation for more advanced concepts like generators. This article will demystify the difference between an iterable and an iterator.
📚 Prerequisites
You should be comfortable with basic Python data structures like lists and the for loop. A basic understanding of classes and dunder methods (__init__, etc.) is also helpful.
🎯 Article Outline: What You'll Master
In this article, you will learn:
- ✅ What an Iterable Is: An object capable of returning its members one at a time.
- ✅ What an Iterator Is: The object that actually does the iterating and keeps track of the state.
- ✅ The Iterator Protocol: The two magic methods, __iter__() and __next__(), that make everything work.
- ✅ Creating a Custom Iterator: How to build your own class that follows the iterator protocol.
🧠 Section 1: Iterable vs. Iterator
These two terms sound similar but have distinct meanings.
Iterable
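An iterable is any Python object that you can loop over with a for loop; in other words, an object that can give you an iterator. If an object has an __iter__() method, it is an iterable. (For older sequence-style classes, Python also falls back to calling __getitem__() with integer indexes starting at 0, but __iter__() is the standard route.)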
- Examples: Lists, tuples, strings, dictionaries, sets, files.
- Analogy: An iterable is like a book. It contains all the items (the pages), but it isn't responsible for keeping track of your reading progress.
Iterator
An iterator is the object that produces the next value in a sequence. It's responsible for keeping track of the current state (i.e., which item comes next). An iterator is identified by its __next__() method; to satisfy the full protocol it also has an __iter__() method that returns the iterator itself.
- Analogy: An iterator is like a bookmark. It knows where you are in the book (the iterable) and can give you the next page when you ask for it.
Key Relationship: You get an iterator from an iterable. Every iterator is also an iterable (it can return itself from its __iter__ method), but not every iterable is an iterator.
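The relationship above can be checked directly. The sketch below uses the abstract base classes in collections.abc, which are one common way to test for the two roles; the variable names are illustrative only.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]           # a list is an iterable...
numbers_iter = iter(numbers)  # ...and iter() hands back an iterator over it

# A list can produce an iterator, but it is not one itself.
print(isinstance(numbers, Iterable))       # True
print(isinstance(numbers, Iterator))       # False

# The iterator satisfies both roles.
print(isinstance(numbers_iter, Iterable))  # True
print(isinstance(numbers_iter, Iterator))  # True

# Calling iter() on an iterator returns the very same object.
print(iter(numbers_iter) is numbers_iter)  # True
```

In the book analogy, the list is the book and numbers_iter is a bookmark placed in it.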
💻 Section 2: The Iterator Protocol in Action
The protocol consists of two methods:
- __iter__(self): Called on an iterable to get an iterator. It should return an object that has a __next__ method.
- __next__(self): Called on an iterator to get the next item. When there are no more items, it must raise the StopIteration exception.
The for loop uses this protocol under the hood. When you write for item in my_list:, Python does the following:
- Calls iter(my_list), which in turn calls my_list.__iter__() to get an iterator object.
- On each loop, it calls next() on the iterator object to get the next item.
- When the iterator raises StopIteration, the loop knows all items have been processed and it terminates cleanly.
Let's see this manually:
my_list = ['a', 'b', 'c']
# 1. Get an iterator from the iterable list
my_iterator = iter(my_list)
print(f"The type of my_list is: {type(my_list)}")
print(f"The type of my_iterator is: {type(my_iterator)}")
# 2. Call next() on the iterator to get items
print(next(my_iterator))
print(next(my_iterator))
print(next(my_iterator))
# 3. Calling next() again will raise StopIteration
try:
    next(my_iterator)
except StopIteration:
    print("StopIteration was raised: no more items!")
Output:
The type of my_list is: <class 'list'>
The type of my_iterator is: <class 'list_iterator'>
a
b
c
StopIteration was raised: no more items!
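Putting the three steps together, a for loop behaves roughly like the while loop below. This is a simplified sketch of the observable behavior, not a description of how the interpreter is implemented; the name collected is hypothetical and exists only to record what the loop body sees.

```python
my_list = ['a', 'b', 'c']

collected = []             # hypothetical helper list for this illustration
iterator = iter(my_list)   # step 1: ask the iterable for an iterator
while True:
    try:
        item = next(iterator)   # step 2: request the next item
    except StopIteration:
        break                   # step 3: the iterator is exhausted, so stop
    collected.append(item)      # the body of the for loop would run here

print(collected)  # ['a', 'b', 'c']
```

Note that the StopIteration exception never reaches your code in a real for loop; the loop catches it and simply ends.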
🛠️ Section 3: Building a Custom Iterator
The best way to understand the protocol is to implement it yourself. Let's create a class that acts like range(), counting up from a start value to an end value.
# custom_iterator.py
class Counter:
    """A simple iterator that counts from start up to, but not including, end."""
    def __init__(self, start, end):
        self.current = start
        self.end = end
    # This makes our class an iterable
    def __iter__(self):
        # It returns itself because this object is also the iterator
        return self
    # This makes our class an iterator
    def __next__(self):
        if self.current >= self.end:
            # Signal that the iteration is complete
            raise StopIteration
        else:
            # Return the current value and increment for the next call
            value = self.current
            self.current += 1
            return value
# --- Let's use our custom iterator ---
my_counter = Counter(5, 8)
# Because it's an iterable, we can use it in a for loop!
for num in my_counter:
    print(num)
Output:
5
6
7
Our Counter class successfully implements the iterator protocol. The for loop is able to get an iterator from it (via __iter__) and then repeatedly call __next__ on it until StopIteration is raised.
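One consequence of returning self from __iter__ is that such an object can be traversed only once: after StopIteration, it stays exhausted. If you want an object that can be looped over repeatedly, one possible approach is to split the two roles. The sketch below assumes two hypothetical classes, CountRange (the iterable) and CountRangeIterator (the iterator); both names are invented for illustration.

```python
class CountRangeIterator:
    """Iterator: tracks the position for a single pass."""

    def __init__(self, start, end):
        self.current = start
        self.end = end

    def __iter__(self):
        return self  # iterators return themselves

    def __next__(self):
        if self.current >= self.end:
            raise StopIteration
        value = self.current
        self.current += 1
        return value


class CountRange:
    """Iterable: stores the bounds and creates a new iterator per loop."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __iter__(self):
        return CountRangeIterator(self.start, self.end)


single_use = CountRangeIterator(5, 8)
print(list(single_use))  # [5, 6, 7]
print(list(single_use))  # [] (already exhausted)

reusable = CountRange(5, 8)
print(list(reusable))    # [5, 6, 7]
print(list(reusable))    # [5, 6, 7] (a fresh iterator each time)
```

This mirrors how a list behaves: the list itself holds no iteration state, so every loop over it starts from the beginning.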
✨ Conclusion & Key Takeaways
The iterator protocol is a core concept that underpins a huge amount of Python's functionality. It provides a consistent and memory-efficient way to process sequences of data.
Let's summarize the key takeaways:
- An iterable is an object that can be looped over (it has an __iter__ method).
- An iterator is an object that produces the next value in a sequence and maintains state (it has a __next__ method).
- The for loop works by getting an iterator from an iterable and then calling next() on it until StopIteration is raised.
- Lazy Evaluation: Iterators produce items one at a time, only when requested. This keeps memory use low for large datasets, because the full sequence never has to be held in memory at once.
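To make the lazy-evaluation point concrete, here is a minimal sketch of an iterator over an unbounded sequence. The Squares class is hypothetical; itertools.islice is a standard-library helper that stops after a given number of items.

```python
from itertools import islice


class Squares:
    """Hypothetical endless iterator over 0, 1, 4, 9, ..."""

    def __init__(self):
        self.n = 0

    def __iter__(self):
        return self

    def __next__(self):
        value = self.n * self.n
        self.n += 1
        return value  # never raises StopIteration: the sequence is unbounded


# islice pulls only the first five values; nothing beyond them is computed.
first_five = list(islice(Squares(), 5))
print(first_five)  # [0, 1, 4, 9, 16]
```

A list could never hold every square, but the iterator needs to remember only a single integer.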
Challenge Yourself:
Create a custom iterator class called ReverseString that takes a string in its constructor and, when looped over, yields the characters of the string in reverse order.
➡️ Next Steps
While creating a class-based iterator is powerful, it can be verbose for simple cases. Python provides a more concise way to create iterators: generators. The next article, "Generators: Creating iterators with yield," explores how the yield keyword makes this possible.
Happy iterating!