
Sets: Unordered collections of unique items. Set operations.

We've explored ordered collections like lists and tuples, and the key-value world of dictionaries. Now we introduce the final core collection type: the set. Sets are all about uniqueness and are modeled after the mathematical concept of a set.


📚 Prerequisites

Before we begin, please ensure you have a solid grasp of the following concepts:

  • Basic Python syntax (variables, data types).
  • A general understanding of Python lists.

🎯 Article Outline: What You'll Master

In this article, you will learn:

  • Foundational Theory: The core properties of a set (unordered, mutable, unique items).
  • Core Implementation: How to create sets and add or remove items.
  • Set Operations: How to perform mathematical operations like union, intersection, and difference.
  • Practical Application: Building a program to find the unique and common skills between two developers.

🧠 Section 1: The Core Concepts of Python Sets

A set is a collection that is unordered, mutable, and does not allow duplicate items.

Key Principles:

  • Unique Items: This is the most important feature of a set. If you try to add an item to a set that already contains that item, the set will not change. This makes sets perfect for removing duplicates from other collections.
  • Unordered: Sets do not maintain any order. When you print a set, the items may appear in an arbitrary order that you should not rely on. You cannot access items by an index.
  • Mutable: You can add and remove items from a set after it has been created. However, the items themselves must be hashable, which in practice means immutable values such as strings, numbers, or tuples that contain only hashable items. Lists, dictionaries, and other sets cannot be set items.
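
The three principles above can be checked in a few lines. This is a minimal sketch; the variable names are made up for illustration.

```python
# Unique items: adding an existing item leaves the set unchanged.
colors = {"red", "green", "red"}   # the duplicate "red" is dropped
size_before = len(colors)
colors.add("green")
size_after = len(colors)

# Unordered: sets do not support indexing.
try:
    colors[0]
    indexing_works = True
except TypeError:
    indexing_works = False

# Items must be hashable: a tuple of strings and numbers is fine, a list is not.
points = {("x", 1), ("y", 2)}
try:
    bad = {["x", 1]}
    list_allowed = True
except TypeError:
    list_allowed = False

print(size_before, size_after, indexing_works, list_allowed)  # 2 2 False False
```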

💻 Section 2: Deep Dive - Creating and Modifying Sets

Let's see how to create and work with sets.

2.1 - Creating a Set

You can create a set using curly braces {} or the set() constructor.

# CodeBlock1.py
# Creating sets

# Create a set with curly braces
fruits = {"apple", "banana", "cherry"}
print(f"A set of fruits: {fruits}")

# Create a set from a list (duplicates are automatically removed)
numbers_list = [1, 2, 2, 3, 4, 3]
numbers_set = set(numbers_list)
print(f"Set from a list with duplicates: {numbers_set}")

# IMPORTANT: Creating an empty set
empty_dict = {}
empty_set = set() # You MUST use set() for an empty set
print(f"Type of {{}}: {type(empty_dict)}")
print(f"Type of set(): {type(empty_set)}")

Key Point: To create an empty set, you must use set(). Using {} creates an empty dictionary.
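
A related detail that is easy to trip over: set() iterates over its argument, so passing a string produces a set of characters, while curly braces keep the string whole. A short sketch, with illustrative variable names:

```python
letters = set("hello")    # iterates over the characters of the string
one_word = {"hello"}      # a set holding a single string

print(sorted(letters))    # ['e', 'h', 'l', 'o']
print(one_word)           # {'hello'}

# {} is a dict, set() is the empty set.
print(type({}).__name__, type(set()).__name__)  # dict set
```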

2.2 - Adding and Removing Items

# CodeBlock2.py
# Adding and removing set items

skills = {"Python", "Git"}
print(f"Original skills: {skills}")

# Add a single item
skills.add("SQL")
print(f"After add('SQL'): {skills}")

# Add multiple items from another iterable
skills.update(["HTML", "CSS"])
print(f"After update(['HTML', 'CSS']): {skills}")

# Remove an item - raises a KeyError if the item is not found
skills.remove("Git")
print(f"After remove('Git'): {skills}")

# Remove an item - does NOT raise an error if not found
skills.discard("JavaScript") # 'JavaScript' is not in the set
print(f"After discard('JavaScript'): {skills}")
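
The difference between remove() and discard() only shows up when the item is missing. The sketch below triggers that case on purpose and catches the resulting KeyError.

```python
skills = {"Python", "SQL"}

# discard() is silent when the item is missing.
skills.discard("JavaScript")

# remove() raises KeyError when the item is missing.
try:
    skills.remove("JavaScript")
    removed = True
except KeyError:
    removed = False

# A membership check before remove() is another way to avoid the error.
if "SQL" in skills:
    skills.remove("SQL")

print(removed, skills)  # False {'Python'}
```

Which one to use is a design choice: remove() fits cases where a missing item signals a bug, and discard() fits cases where the item may or may not be present.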

🚀 Section 3: Set Operations

This is where sets truly shine. They allow you to perform mathematical set operations to compare collections.

# SetOperations.py
# Demonstrating set operations

dev1_skills = {"Python", "JavaScript", "HTML", "CSS"}
dev2_skills = {"Python", "SQL", "HTML", "PowerBI"}

# Union: All unique skills from both developers
all_skills = dev1_skills.union(dev2_skills)
# Or using the | operator: all_skills = dev1_skills | dev2_skills
print(f"Union (all skills): {all_skills}")

# Intersection: Skills both developers have in common
common_skills = dev1_skills.intersection(dev2_skills)
# Or using the & operator: common_skills = dev1_skills & dev2_skills
print(f"Intersection (common skills): {common_skills}")

# Difference: Skills that dev1 has but dev2 does not
unique_to_dev1 = dev1_skills.difference(dev2_skills)
# Or using the - operator: unique_to_dev1 = dev1_skills - dev2_skills
print(f"Difference (skills unique to dev1): {unique_to_dev1}")

# Symmetric Difference: Skills that are in one set or the other, but not both
unique_skills_overall = dev1_skills.symmetric_difference(dev2_skills)
# Or using the ^ operator: unique_skills_overall = dev1_skills ^ dev2_skills
print(f"Symmetric Difference (skills unique to one dev or the other): {unique_skills_overall}")

Walkthrough:

  • .union() (|): Combines two sets into a new set with all unique items from both.
  • .intersection() (&): Creates a new set containing only the items that are present in both sets.
  • .difference() (-): Creates a new set with items from the first set that are not in the second set.
  • .symmetric_difference() (^): Creates a new set with items that are in either the first set or the second set, but not in both.
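
The sketch below checks the four operations on two small sets of numbers, where the results are easy to verify by hand. It also shows one practical difference between the two spellings: the methods accept any iterable, while the operators require both operands to be sets.

```python
a = {1, 2, 3}
b = {3, 4}

# Each operator gives the same result as its method.
assert a | b == a.union(b) == {1, 2, 3, 4}
assert a & b == a.intersection(b) == {3}
assert a - b == a.difference(b) == {1, 2}
assert a ^ b == a.symmetric_difference(b) == {1, 2, 4}

# Difference depends on the order of the operands; the other three do not.
assert b - a == {4}

# The originals are untouched: every operation returned a new set.
assert a == {1, 2, 3} and b == {3, 4}

# Methods accept any iterable; operators require both operands to be sets.
from_list = a.union([4, 5])
try:
    a | [4, 5]
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False

print(from_list == {1, 2, 3, 4, 5}, operator_accepts_list)  # True False
```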

🛠️ Section 4: Project-Based Example: Find Unique and Common Skills

Let's formalize the example from the previous section into a small program.

# ProjectExample.py
# The full Python code for the mini-project.

def analyze_developer_skills(dev1_skills, dev2_skills):
    """
    Analyzes and prints the relationship between two sets of skills.
    """
    # Ensure inputs are sets to remove any duplicates
    dev1_set = set(dev1_skills)
    dev2_set = set(dev2_skills)

    common = dev1_set.intersection(dev2_set)
    unique_to_dev1 = dev1_set.difference(dev2_set)
    unique_to_dev2 = dev2_set.difference(dev1_set)
    all_skills = dev1_set.union(dev2_set)

    print("--- Skill Analysis ---")
    print(f"Common Skills: {common if common else 'None'}")
    print(f"Skills only Dev1 has: {unique_to_dev1 if unique_to_dev1 else 'None'}")
    print(f"Skills only Dev2 has: {unique_to_dev2 if unique_to_dev2 else 'None'}")
    print(f"Total Skill Pool: {all_skills}")
    print("----------------------")


# --- Main Execution ---
developer1_skills = ["Python", "JavaScript", "React", "CSS", "Git"]
developer2_skills = ["Java", "Python", "SQL", "Git", "Docker"]

analyze_developer_skills(developer1_skills, developer2_skills)
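
If you want to reuse the results rather than only print them, one possible variant returns the sets in a dictionary. This is a sketch, not part of the project above: the function name compare_skills and the dictionary keys are hypothetical, and it uses the operator forms of the same four operations.

```python
def compare_skills(dev1_skills, dev2_skills):
    """Return the skill comparison as a dict of sets instead of printing it."""
    dev1_set, dev2_set = set(dev1_skills), set(dev2_skills)
    return {
        "common": dev1_set & dev2_set,
        "only_dev1": dev1_set - dev2_set,
        "only_dev2": dev2_set - dev1_set,
        "all": dev1_set | dev2_set,
    }


result = compare_skills(
    ["Python", "JavaScript", "React", "CSS", "Git"],
    ["Java", "Python", "SQL", "Git", "Docker"],
)

# sorted() gives a stable display order, since sets have none of their own.
print(sorted(result["common"]))     # ['Git', 'Python']
print(sorted(result["only_dev1"]))  # ['CSS', 'JavaScript', 'React']
print(sorted(result["only_dev2"]))  # ['Docker', 'Java', 'SQL']
print(len(result["all"]))           # 8
```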

💡 Conclusion & Key Takeaways

Sets are the ideal data structure when the uniqueness of items is your primary concern. Their ability to perform high-level mathematical operations makes them invaluable for data comparison and filtering tasks.

Let's summarize the key Python takeaways:

  • Sets store unordered collections of unique, hashable items.
  • Create them with curly braces around the items, such as {1, 2, 3}, or with set(). An empty set requires set(), because {} is an empty dictionary.
  • The main power of sets comes from their methods: .union(), .intersection(), .difference(), etc.
  • Membership tests (e.g., if item in my_set:) take constant time on average, and converting a list to a set removes its duplicates.
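
The last takeaway in a few lines, as a minimal sketch. The email addresses are made-up sample data.

```python
emails = ["a@example.com", "b@example.com", "a@example.com"]

# Converting to a set drops the duplicate address.
unique_emails = set(emails)
print(len(emails), len(unique_emails))    # 3 2

# Membership testing uses the in operator.
print("b@example.com" in unique_emails)   # True
print("c@example.com" in unique_emails)   # False

# A set has no order, so sort it if you need a predictable list back.
deduped = sorted(unique_emails)
print(deduped)                            # ['a@example.com', 'b@example.com']
```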

Challenge Yourself (Python Edition): You have two lists of customer IDs from two different marketing campaigns. Write a short script that uses sets to find out how many unique customers were reached across both campaigns.


➡️ Next Steps

Congratulations! You've now covered all of Python's core collection types: lists, tuples, dictionaries, and sets. In the next article, "Introduction to Python Collections Module: namedtuple, deque, Counter", we'll look at some specialized collection types that can be very useful in specific situations.

Keep practicing, keep exploring, and enjoy your Python coding adventure!


Glossary (Python Terms)

  • Set: An unordered and mutable Python collection of unique, hashable items.
  • Union: An operation that combines all elements from two or more sets.
  • Intersection: An operation that finds the common elements between two or more sets.
  • Difference: An operation that finds the elements present in one set but not in another.

Further Reading (Python Resources)