Sets — Unordered Collections of Unique Elements
A **set** in Python is an **unordered**, **mutable** collection that automatically ensures **uniqueness** of its elements.
Sets are incredibly useful for eliminating duplicates, performing mathematical set operations, and testing membership efficiently.
Behind the scenes, Python implements sets as hash tables, allowing average-case O(1) lookup time — meaning checking whether an element exists in a set is extremely fast.
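To make the hash-table claim concrete, the sketch below counts how many equality comparisons a membership test triggers in a list versus a set. `Probe` is a hypothetical helper class written only for this illustration; the exact count for the set depends on the interpreter, so only the list count is stated.

```python
class Probe:
    """Hypothetical wrapper that counts how often __eq__ is called."""

    eq_calls = 0  # counter shared by all instances

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        # A set uses the hash to jump straight to a candidate slot.
        return hash(self.value)

    def __eq__(self, other):
        Probe.eq_calls += 1
        return isinstance(other, Probe) and self.value == other.value


items = [Probe(i) for i in range(1000)]
as_list = list(items)
as_set = set(items)
target = Probe(999)  # equal to the last item, but a different object

Probe.eq_calls = 0
found_in_list = target in as_list
list_calls = Probe.eq_calls  # 1000: the list compares element by element

Probe.eq_calls = 0
found_in_set = target in as_set
set_calls = Probe.eq_calls  # a handful at most: hash first, then compare

print(found_in_list, list_calls)
print(found_in_set, set_calls)
```

The list needs one comparison per element until it finds a match, while the set only compares against the few candidates that share the target's hash slot.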
🧠 Why Use Sets?
Sets are perfect for:
- ✅ Storing unique items only (no duplicates)
- ⚡ Performing fast membership checks (`in`, `not in`)
- 🧮 Doing mathematical set operations like union, intersection, and difference
- 🧱 Cleaning and deduplicating datasets
⚙️ Creating Sets
Using Curly Braces {}
colors = {"red", "green", "blue"}
Using the set() Constructor
numbers = set([1, 2, 2, 3, 4])
print(numbers) # {1, 2, 3, 4}
Empty Set
⚠️ `{}` creates an empty dictionary, not a set; use `set()` instead.
empty_set = set()
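A quick check with `type()` confirms that empty braces produce a dictionary while `set()` produces a set. The variable names here are illustrative only.

```python
empty_braces = {}
empty_set = set()

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# Braces that contain plain values (no key: value pairs) do create a set.
single = {"red"}
print(type(single).__name__)        # set
```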
From Strings or Ranges
letters = set("banana") # {'b', 'a', 'n'}
even_numbers = set(range(0, 10, 2)) # {0, 2, 4, 6, 8}
🧱 Adding and Removing Elements
| Method | Description | Example |
|---|---|---|
| `add(x)` | Add element x to the set | `colors.add("yellow")` |
| `remove(x)` | Remove element x; raises `KeyError` if missing | `colors.remove("green")` |
| `discard(x)` | Remove element x if present (no error if missing) | `colors.discard("pink")` |
| `pop()` | Remove and return an arbitrary element; raises `KeyError` if the set is empty | `item = colors.pop()` |
| `clear()` | Remove all elements | `colors.clear()` |
Example:
colors = {"red", "green", "blue"}
colors.add("yellow")
colors.discard("green")
print(colors) # {'red', 'yellow', 'blue'}
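The difference between `remove()` and `discard()` only shows up when the element is missing. A minimal sketch of both behaviours, plus `pop()` on an empty set:

```python
colors = {"red", "blue"}

# discard() silently does nothing when the element is absent.
colors.discard("pink")

# remove() raises KeyError when the element is absent.
try:
    colors.remove("pink")
    removed = True
except KeyError:
    removed = False

print(removed)         # False
print(sorted(colors))  # ['blue', 'red']

# pop() on an empty set also raises KeyError.
empty = set()
try:
    empty.pop()
    popped = True
except KeyError:
    popped = False

print(popped)          # False
```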
🧩 Membership Testing
Sets are optimized for membership operations.
fruits = {"apple", "banana", "cherry"}
print("apple" in fruits) # True
print("grape" not in fruits) # True
On average these operations take O(1) time regardless of the set's size, whereas the same check on a list scans the elements one by one in O(n) time.
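The gap can be measured with the standard-library `timeit` module. This is a rough sketch: the collection size and repeat count are arbitrary assumptions, and absolute timings vary by machine, so no expected output is shown.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
missing = -1  # worst case for the list: every element gets checked

# Time 100 membership tests against each container.
list_time = timeit.timeit(lambda: missing in as_list, number=100)
set_time = timeit.timeit(lambda: missing in as_set, number=100)

print(f"list: {list_time:.6f} s")
print(f"set:  {set_time:.6f} s")
```

On typical hardware the set lookup should come out several orders of magnitude faster, and the gap widens as `n` grows.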
🧮 Set Operations
Python sets support all major mathematical set operations.
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
| Operation | Symbol | Example | Result |
|---|---|---|---|
| Union | `\|` | `A \| B` | {1, 2, 3, 4, 5, 6} |
| Intersection | & | A & B | {3, 4} |
| Difference | - | A - B | {1, 2} |
| Symmetric Difference | ^ | A ^ B | {1, 2, 5, 6} |
Each of these also has a corresponding method version:
| Method | Equivalent Operator |
|---|---|
| `union()` | `\|` |
| `intersection()` | `&` |
| `difference()` | `-` |
| `symmetric_difference()` | `^` |
Example:
A = {1, 2, 3}
B = {3, 4, 5}
print(A.union(B)) # {1, 2, 3, 4, 5}
print(A.intersection(B)) # {3}
print(A.difference(B)) # {1, 2}
print(A.symmetric_difference(B))  # {1, 2, 4, 5}
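The method and operator forms are not fully interchangeable: the methods accept any iterable, while the operators require both operands to be sets. The sketch below also shows the in-place operator variants; results are printed with `sorted()` so the output does not depend on set ordering.

```python
A = {1, 2, 3}

# Method versions accept any iterable, and more than one at a time.
print(sorted(A.union([3, 4], (5,))))         # [1, 2, 3, 4, 5]
print(sorted(A.intersection(range(2, 10))))  # [2, 3]

# Operator versions raise TypeError for non-set operands.
try:
    A | [3, 4]
    operator_ok = True
except TypeError:
    operator_ok = False
print(operator_ok)  # False

# In-place variants modify the left-hand set.
A |= {4}         # same effect as A.update({4})
A &= {1, 4, 9}   # same effect as A.intersection_update({1, 4, 9})
print(sorted(A))  # [1, 4]
```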
🧱 Subsets, Supersets, and Disjoint Sets
| Operation | Example | Description |
|---|---|---|
| `issubset()` | `A.issubset(B)` | Returns True if A is a subset of B |
| `issuperset()` | `A.issuperset(B)` | Returns True if A contains all elements of B |
| `isdisjoint()` | `A.isdisjoint(B)` | True if A and B have no common elements |
Example:
A = {1, 2}
B = {1, 2, 3}
print(A.issubset(B)) # True
print(B.issuperset(A)) # True
print(A.isdisjoint({4}))  # True
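Subset and superset checks can also be written with comparison operators, which additionally distinguish proper subsets. A short sketch:

```python
A = {1, 2}
B = {1, 2, 3}

print(A <= B)  # True: A is a subset of B
print(A < B)   # True: A is a proper subset (subset and not equal)
print(B >= A)  # True: B is a superset of A
print(A <= A)  # True: every set is a subset of itself
print(A < A)   # False: a set is never a proper subset of itself
```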
🧩 Set Comprehensions
Like list comprehensions, you can build sets dynamically.
squares = {x**2 for x in range(10)}
print(squares)
With conditions:
even_squares = {x**2 for x in range(10) if x % 2 == 0}
print(even_squares)
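Because the result is a set, a comprehension also removes duplicates as it builds. The sketch below uses a made-up word list and prints with `sorted()` so the output is deterministic:

```python
words = ["Apple", "apple", "BANANA", "banana", "Cherry"]

# Lower-casing collapses case variants into a single element.
normalized = {w.lower() for w in words}
print(sorted(normalized))    # ['apple', 'banana', 'cherry']

even_squares = {x**2 for x in range(10) if x % 2 == 0}
print(sorted(even_squares))  # [0, 4, 16, 36, 64]
```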
🧱 Immutable Sets — frozenset
If you need a hashable and immutable version of a set (usable as a dictionary key), use frozenset.
frozen = frozenset([1, 2, 3])
print(frozen)
# frozen.add(4)  # ❌ AttributeError: frozenset is immutable and has no add() method
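As a sketch of where hashability matters, a `frozenset` can be used as a dictionary key or as an element of another set. The names `team_scores` and `groups` are hypothetical examples.

```python
# frozenset is hashable, so it can serve as a dictionary key.
team_scores = {
    frozenset({"alice", "bob"}): 42,
    frozenset({"carol"}): 17,
}
# Element order does not matter when looking up the key.
print(team_scores[frozenset({"bob", "alice"})])  # 42

# A regular set cannot contain sets, but it can contain frozensets.
groups = {frozenset({1, 2}), frozenset({2, 1}), frozenset({3})}
print(len(groups))  # 2

# Mutating methods simply do not exist on frozenset.
frozen = frozenset([1, 2, 3])
print(hasattr(frozen, "add"))  # False
```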
⚙️ Real-World Example — Removing Duplicates and Comparing Preferences
1️⃣ Removing Duplicates from Data
emails = ["a@example.com", "b@example.com", "a@example.com"]
unique_emails = list(set(emails))
print(unique_emails)  # e.g. ['a@example.com', 'b@example.com'] (order is not guaranteed)
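Converting through a set discards the original order. If order matters, one common alternative is `dict.fromkeys()`, which relies on dictionaries keeping insertion order (Python 3.7+). A minimal sketch with made-up addresses:

```python
emails = ["b@example.com", "a@example.com", "b@example.com"]

# dict keys are unique and keep insertion order (Python 3.7+).
unique_in_order = list(dict.fromkeys(emails))
print(unique_in_order)      # ['b@example.com', 'a@example.com']

# Alternative: sort the set for a deterministic result.
print(sorted(set(emails)))  # ['a@example.com', 'b@example.com']
```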
2️⃣ Comparing User Preferences
user1_tags = {"python", "ai", "machine-learning", "automation"}
user2_tags = {"ai", "webdev", "automation", "cloud"}
common = user1_tags & user2_tags
unique_to_user1 = user1_tags - user2_tags
print("Common Interests:", common)
print("Unique to User1:", unique_to_user1)
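The same comparison can be wrapped in a small function. `compare_tags` is a hypothetical helper, not a library function; it returns each relationship between the two tag sets under a descriptive key.

```python
def compare_tags(first, second):
    """Hypothetical helper: summarize how two tag sets relate."""
    return {
        "common": first & second,
        "only_first": first - second,
        "only_second": second - first,
        "all": first | second,
    }


user1_tags = {"python", "ai", "machine-learning", "automation"}
user2_tags = {"ai", "webdev", "automation", "cloud"}

summary = compare_tags(user1_tags, user2_tags)
for label, tags in summary.items():
    # sorted() gives a stable display order for the unordered sets.
    print(label, sorted(tags))
```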
3️⃣ Fast Membership Testing for Blacklists
blacklist = {"spam.com", "malware.net", "phishing.org"}
domain = "spam.com"
if domain in blacklist:
    print("Blocked domain!")
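For more than one domain, the lookup can sit inside a loop. `filter_blocked` below is a hypothetical helper, and the case and whitespace normalization is an added assumption about how the incoming data looks.

```python
def filter_blocked(domains, blacklist):
    """Hypothetical helper: split domains into allowed and blocked lists."""
    allowed, blocked = [], []
    for domain in domains:
        # Assumption: blacklist entries are lowercase with no padding.
        if domain.strip().lower() in blacklist:
            blocked.append(domain)
        else:
            allowed.append(domain)
    return allowed, blocked


blacklist = {"spam.com", "malware.net", "phishing.org"}
incoming = ["example.com", "SPAM.com", "phishing.org", "python.org"]

allowed, blocked = filter_blocked(incoming, blacklist)
print(allowed)  # ['example.com', 'python.org']
print(blocked)  # ['SPAM.com', 'phishing.org']
```

Each check is an average O(1) set lookup, so the total cost grows with the number of incoming domains rather than with the size of the blacklist.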
⚡ Performance Insights
- Sets use hashing, providing O(1) average lookup and insertion.
- Duplicate insertions are ignored automatically.
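- Elements must be hashable, which in practice means immutable values such as strings, numbers, and tuples of hashable items.
- Unhashable types (like lists, dicts, or other sets) cannot be added; attempting it raises `TypeError`.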
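The points above can be checked directly. `try_add` is a hypothetical helper that reports whether a value could be added instead of letting the `TypeError` propagate.

```python
def try_add(target, value):
    """Hypothetical helper: return True if value could be added to target."""
    try:
        target.add(value)
        return True
    except TypeError:
        return False


items = set()
print(try_add(items, (1, 2)))           # True: tuple of hashable values
print(try_add(items, "text"))           # True
print(try_add(items, frozenset({1})))   # True
print(try_add(items, [1, 2]))           # False: lists are unhashable
print(try_add(items, {"k": "v"}))       # False: dicts are unhashable
print(try_add(items, (1, [2, 3])))      # False: tuple containing a list

# Duplicate insertions leave the set unchanged.
size_before = len(items)
items.add("text")
print(len(items) == size_before)        # True
```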
🧱 Comparing Sets with Other Collections
| Feature | Set | List | Tuple | Dictionary |
|---|---|---|---|---|
| Duplicates | ❌ Not allowed | ✅ Allowed | ✅ Allowed | ❌ Keys unique |
| Ordered | ⚠️ Not guaranteed | ✅ Yes | ✅ Yes | ✅ (3.7+) |
| Mutable | ✅ Yes | ✅ Yes | ❌ No | ✅ Yes |
| Hashable | ❌ | ❌ | ✅ (if all elements are hashable) | ❌ |
| Lookup Speed | ⚡ O(1) | 🐢 O(n) | 🐢 O(n) | ⚡ O(1) |
| Use Case | Membership, uniqueness | Ordered data | Fixed sequences | Mapped key-value data |
🧠 Best Practices for Sets
- Use sets for fast lookups and duplicate removal.
- Use only hashable objects (no lists, dicts, or sets) as set elements.
- Use `frozenset` when you need an immutable, hashable version.
- Avoid depending on element order; sets are unordered by design.
- Prefer `discard()` over `remove()` when a missing element is not an error, since `remove()` raises `KeyError`.
🧾 Key Takeaways
- Sets store unique, unordered elements.
- Enable fast membership testing and mathematical operations.
- Support comprehensions and an immutable variant (`frozenset`), which also makes nested sets possible.
- Perfect for real-world tasks like deduplication, tag comparison, and data filtering.
Sets are one of Python’s most efficient data structures — combining simplicity, mathematical elegance, and blazing-fast performance.