Sets — Unordered Collections of Unique Elements

A **set** in Python is an **unordered**, **mutable** collection that automatically ensures **uniqueness** of its elements.

Chapter 3: Data Structures

Sets are incredibly useful for eliminating duplicates, performing mathematical set operations, and testing membership efficiently.

Behind the scenes, Python implements sets as hash tables, which gives average-case O(1) lookups: checking whether an element exists takes roughly constant time regardless of how large the set is. The same design means every element of a set must be hashable.
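One practical consequence of the hash-table design is that only hashable values can be stored. The following is a minimal sketch of that rule; the variable names are illustrative and not part of the original text.

```python
# Hashable values (ints, strings, tuples of hashables) can be set elements.
points = {(0, 0), (1, 2), (1, 2)}  # the duplicate tuple collapses into one
print(len(points))  # 2

# Mutable containers such as lists are unhashable and are rejected.
try:
    bad = {[1, 2], [3, 4]}
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```

If list-like data needs to go into a set, converting each item to a tuple first is the usual workaround.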


🧠 Why Use Sets?

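Sets are perfect for:

- Removing duplicates from a collection
- Testing membership efficiently
- Performing mathematical set operations such as union and intersection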


⚙️ Creating Sets

Using Curly Braces {}

colors = {"red", "green", "blue"}

Using the set() Constructor

numbers = set([1, 2, 2, 3, 4])
print(numbers)  # {1, 2, 3, 4}

Empty Set

⚠️ {} creates an empty dictionary, not a set — use set() instead.

empty_set = set()
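A quick check makes the difference visible. This is only a sketch; `empty_dict` is a name introduced here for illustration.

```python
empty_dict = {}      # curly braces alone build a dict
empty_set = set()    # the constructor is the only way to get an empty set

print(type(empty_dict).__name__)  # dict
print(type(empty_set).__name__)   # set
print(len(empty_set))             # 0
```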

From Strings or Ranges

letters = set("banana")  # {'b', 'a', 'n'} (display order may vary)
even_numbers = set(range(0, 10, 2))  # {0, 2, 4, 6, 8}

🧱 Adding and Removing Elements

| Method | Description | Example |
|---|---|---|
| add(x) | Add element x to the set | colors.add("yellow") |
| remove(x) | Remove element x; raises KeyError if x is missing | colors.remove("green") |
| discard(x) | Remove element x if present; no error if it is missing | colors.discard("pink") |
| pop() | Remove and return an arbitrary element; raises KeyError if the set is empty | item = colors.pop() |
| clear() | Remove all elements | colors.clear() |

Example:

colors = {"red", "green", "blue"}
colors.add("yellow")
colors.discard("green")
print(colors)  # {'red', 'yellow', 'blue'} (display order may vary)
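The difference between remove() and discard(), and the behavior of pop(), is easiest to see when the element is missing or the set is empty. A small sketch, reusing the colors set from the example above:

```python
colors = {"red", "green", "blue"}

colors.discard("pink")      # missing element: silently does nothing

try:
    colors.remove("pink")   # missing element: raises KeyError
except KeyError as exc:
    missing = exc.args[0]
    print("Not in set:", missing)

removed = colors.pop()      # removes and returns an arbitrary element
print(removed in colors)    # False

colors.clear()
try:
    colors.pop()            # pop() on an empty set also raises KeyError
except KeyError:
    pop_failed = True
    print("Cannot pop from an empty set")
```

As a rule of thumb, discard() suits cases where absence is acceptable, while remove() suits cases where a missing element indicates a bug.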

🧩 Membership Testing

Sets are optimized for membership operations.

fruits = {"apple", "banana", "cherry"}
print("apple" in fruits)     # True
print("grape" not in fruits) # True

These checks take O(1) time on average, compared with O(n) for a list, which has to be scanned element by element.
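The gap can be measured with the standard timeit module. This is a rough sketch, not a rigorous benchmark: the collection size and repeat count are arbitrary choices, and the absolute numbers depend on the machine.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # worst case for the list: the last element

# Time 100 membership checks against each container.
list_seconds = timeit.timeit(lambda: target in as_list, number=100)
set_seconds = timeit.timeit(lambda: target in as_set, number=100)

print(f"list: {list_seconds:.6f} s for 100 lookups")
print(f"set:  {set_seconds:.6f} s for 100 lookups")
```

On typical hardware the set lookups should finish orders of magnitude sooner, and the gap widens as n grows.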


🧮 Set Operations

Python sets support all major mathematical set operations.

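A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

| Operation | Symbol | Example | Result |
|---|---|---|---|
| Union | \| | A \| B | {1, 2, 3, 4, 5, 6} |
| Intersection | & | A & B | {3, 4} |
| Difference | - | A - B | {1, 2} |
| Symmetric Difference | ^ | A ^ B | {1, 2, 5, 6} |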

Each of these also has a corresponding method version:

| Method | Equivalent Operator |
|---|---|
| union() | \| |
| intersection() | & |
| difference() | - |
| symmetric_difference() | ^ |

Example:

A = {1, 2, 3}
B = {3, 4, 5}

print(A.union(B))               # {1, 2, 3, 4, 5}
print(A.intersection(B))        # {3}
print(A.difference(B))          # {1, 2}
print(A.symmetric_difference(B))# {1, 2, 4, 5}
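The method and operator forms are not fully interchangeable: the methods accept any iterable, while the operators require a set or frozenset on both sides. A short sketch of the difference, where `other` is an illustrative name:

```python
A = {1, 2, 3}
other = [3, 4, 5]  # a list, not a set

# Methods accept any iterable.
print(A.union(other))         # {1, 2, 3, 4, 5}
print(A.intersection(other))  # {3}

try:
    A | other                 # operators need sets on both sides
except TypeError:
    operator_failed = True
    print("| does not accept a list")

# Converting first makes the operator form work.
print(A | set(other))         # {1, 2, 3, 4, 5}
```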

🧱 Subsets, Supersets, and Disjoint Sets

| Operation | Example | Description |
|---|---|---|
| issubset() | A.issubset(B) | Returns True if every element of A is also in B |
| issuperset() | A.issuperset(B) | Returns True if A contains all elements of B |
| isdisjoint() | A.isdisjoint(B) | Returns True if A and B have no common elements |

Example:

A = {1, 2}
B = {1, 2, 3}
print(A.issubset(B))    # True
print(B.issuperset(A))  # True
print(A.isdisjoint({4}))# True
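Subset and superset tests can also be written with comparison operators, which additionally distinguish proper subsets. A brief sketch using the same A and B:

```python
A = {1, 2}
B = {1, 2, 3}

print(A <= B)   # True: A is a subset of B
print(A < B)    # True: A is a proper subset (subset and not equal)
print(B >= A)   # True: B is a superset of A
print(A <= A)   # True: every set is a subset of itself
print(A < A)    # False: but never a proper subset of itself
```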

🧩 Set Comprehensions

Like list comprehensions, you can build sets dynamically.

squares = {x**2 for x in range(10)}
print(squares)

With conditions:

even_squares = {x**2 for x in range(10) if x % 2 == 0}
print(even_squares)
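Because the result is a set, a comprehension also removes duplicates as it transforms. The sketch below normalizes a hypothetical list of words; the sample data is invented for illustration.

```python
words = ["Apple", "banana", "APPLE", "Cherry", "banana"]

# Lowercasing maps "Apple" and "APPLE" to the same element.
normalized = {word.lower() for word in words}

# sorted() gives a stable order for display.
print(sorted(normalized))  # ['apple', 'banana', 'cherry']
```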

🧱 Immutable Sets — frozenset

If you need a hashable and immutable version of a set (usable as a dictionary key), use frozenset.

frozen = frozenset([1, 2, 3])
print(frozen)
# frozen.add(4)  # ❌ AttributeError: frozenset has no add() method
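A sketch of where hashability helps: using a frozenset as a dictionary key or as an element of another set. The shipping data here is hypothetical and exists only to illustrate the idea.

```python
# Hypothetical data: shipping cost keyed by an unordered pair of cities.
shipping_cost = {
    frozenset({"Paris", "Berlin"}): 12.5,
    frozenset({"Paris", "Madrid"}): 15.0,
}

# The order of the pair does not matter: frozensets compare by content.
route = frozenset({"Berlin", "Paris"})
print(shipping_cost[route])  # 12.5

# frozensets can also be elements of a regular set.
groups = {frozenset({1, 2}), frozenset({2, 1}), frozenset({3})}
print(len(groups))  # 2

try:
    route.add("Rome")  # frozenset has no mutating methods
except AttributeError:
    add_failed = True
```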

⚙️ Real-World Example — Removing Duplicates and Comparing Preferences

1️⃣ Removing Duplicates from Data

emails = ["a@example.com", "b@example.com", "a@example.com"]
unique_emails = list(set(emails))
print(unique_emails)  # e.g. ['a@example.com', 'b@example.com'] (order is not guaranteed)
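Converting through a set discards the original order. When order matters, two common alternatives are dict.fromkeys() and sorted(). This sketch uses a slightly longer, made-up email list:

```python
emails = ["b@example.com", "a@example.com", "b@example.com", "c@example.com"]

# dict keys are unique and, since Python 3.7, keep insertion order.
unique_in_order = list(dict.fromkeys(emails))
print(unique_in_order)  # ['b@example.com', 'a@example.com', 'c@example.com']

# sorted() is another way to get a deterministic result from a set.
unique_sorted = sorted(set(emails))
print(unique_sorted)    # ['a@example.com', 'b@example.com', 'c@example.com']
```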

2️⃣ Comparing User Preferences

user1_tags = {"python", "ai", "machine-learning", "automation"}
user2_tags = {"ai", "webdev", "automation", "cloud"}

common = user1_tags & user2_tags
unique_to_user1 = user1_tags - user2_tags

print("Common Interests:", common)
print("Unique to User1:", unique_to_user1)

3️⃣ Fast Membership Testing for Blacklists

blacklist = {"spam.com", "malware.net", "phishing.org"}
domain = "spam.com"

if domain in blacklist:
    print("Blocked domain!")
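In practice the input usually needs normalizing before the lookup, since set membership is an exact match. Below is a minimal sketch, assuming that trimming whitespace and lowercasing is sufficient; `is_blocked` and the `incoming` list are hypothetical names introduced here.

```python
blacklist = {"spam.com", "malware.net", "phishing.org"}

def is_blocked(domain: str) -> bool:
    """Return True if the normalized domain is in the blacklist."""
    return domain.strip().lower() in blacklist

# Filter a batch of domains, keeping only those not on the blacklist.
incoming = ["Spam.com", "example.org", " phishing.org "]
allowed = [d for d in incoming if not is_blocked(d)]
print(allowed)  # ['example.org']
```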

⚡ Performance Insights


🧱 Comparing Sets with Other Collections

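| Feature | Set | List | Tuple | Dictionary |
|---|---|---|---|---|
| Duplicates | ❌ Not allowed | ✅ Allowed | ✅ Allowed | ❌ Keys unique |
| Ordered | ❌ No | ✅ Yes | ✅ Yes | ✅ Insertion order (3.7+) |
| Mutable | ✅ Yes | ✅ Yes | ❌ No | ✅ Yes |
| Hashable | ❌ No (frozenset is) | ❌ No | ✅ Yes, if all elements are hashable | ❌ No |
| Lookup Speed | ⚡ O(1) average | 🐢 O(n) | 🐢 O(n) | ⚡ O(1) average, by key |
| Use Case | Membership, uniqueness | Ordered data | Fixed sequences | Mapped key-value data |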

🧠 Best Practices for Sets


🧾 Key Takeaways


Sets are one of Python's most efficient data structures for uniqueness and membership problems, combining a small API, direct support for set algebra, and average-case O(1) lookups.