Chapter 12: Advanced NumPy and Pandas — Advanced Array Manipulation
⚙️ Advanced Array Manipulation with NumPy
NumPy’s true power lies in its ability to handle large, multi-dimensional arrays efficiently.
In this chapter, we’ll explore advanced techniques for manipulating, reshaping, and combining arrays — key skills for numerical and scientific computing.
🚀 1. Why NumPy Is So Fast
NumPy arrays are stored in contiguous memory blocks, allowing vectorized operations to run at C speed — much faster than Python loops.
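Example: comparing plain Python vs NumPy speed
import time
import numpy as np
x = np.arange(1_000_000)
y = np.arange(1_000_000)
xs, ys = x.tolist(), y.tolist()  # Plain Python lists for the loop version
start = time.perf_counter()
z_loop = [a + b for a, b in zip(xs, ys)]  # Python-level loop
print("Python loop:", time.perf_counter() - start)
start = time.perf_counter()
z = x + y  # Vectorized
print("NumPy:", time.perf_counter() - start)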
💡 NumPy avoids Python-level loops by performing operations in compiled C code.
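To make the "contiguous memory" point concrete, the sketch below inspects an array's layout. The array name `grid` and the explicit `int64` dtype are choices made for this illustration only.

```python
import numpy as np

# A small 2D array of 64-bit integers (dtype fixed explicitly for the example).
grid = np.arange(12, dtype=np.int64).reshape(3, 4)

# C_CONTIGUOUS means the rows sit back to back in one memory block.
is_contiguous = bool(grid.flags["C_CONTIGUOUS"])

# Strides: how many bytes to step to reach the next row / next column.
row_stride, col_stride = grid.strides

print(is_contiguous)           # True
print(row_stride, col_stride)  # 32 8
print(grid.itemsize)           # 8 bytes per element
```

Each row holds 4 elements of 8 bytes, so moving down one row skips 32 bytes. Compiled loops can walk this block directly without touching Python objects.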
🧮 2. Broadcasting — Working with Different Shapes
Broadcasting allows NumPy to perform operations between arrays of different shapes without copying data.
Example — Scalar Broadcasting
import numpy as np
a = np.array([1, 2, 3])
result = a + 10
print(result) # [11 12 13]
Example — 1D and 2D Broadcasting
A = np.array([[1], [2], [3]]) # Shape (3,1)
B = np.array([10, 20, 30]) # Shape (3,)
C = A + B
print(C)
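NumPy treats B as shape (1, 3), then virtually stretches every size-1 dimension to match the other operand, so C has shape (3, 3).
Rule of Thumb: NumPy compares shapes starting from the trailing dimensions; each pair of sizes must be equal or one of them must be 1. Missing leading dimensions are treated as 1.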
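The rule can be checked without doing any arithmetic. This sketch assumes NumPy 1.20 or newer (for `np.broadcast_shapes`); the variable names are illustrative.

```python
import numpy as np

col = np.array([[1], [2], [3]])  # shape (3, 1)
row = np.array([10, 20, 30])     # shape (3,)

# Shapes are aligned from the right: (3, 1) and (3,) give (3, 3).
result_shape = np.broadcast_shapes(col.shape, row.shape)
print(result_shape)  # (3, 3)

# broadcast_to exposes the stretched operand as a read-only view.
# A stride of 0 means the same row is reused rather than copied.
stretched = np.broadcast_to(row, (3, 3))
print(stretched.strides[0])  # 0

# Trailing dimensions 3 and 2 are neither equal nor 1, so this fails.
try:
    np.ones((2, 3)) + np.ones((2,))
    compatible = True
except ValueError:
    compatible = False
print(compatible)  # False
```

The zero stride is what "without copying data" means in practice: the stretched dimension costs no extra memory.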
⚡ 3. Universal Functions (ufuncs)
Universal functions (ufuncs) perform element-wise operations efficiently.
They are vectorized wrappers around C functions for fast computation.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(np.add(a, b)) # [5 7 9]
print(np.multiply(a, b)) # [4 10 18]
print(np.sqrt(b)) # [2. 2.236 2.449]
print(np.exp(a)) # [2.718 7.389 20.085]
Custom Ufuncs
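You can wrap a Python function as a ufunc using np.frompyfunc(). The result broadcasts like a built-in ufunc, but it still calls the Python function once per element and returns an array of dtype object, so it does not run at C speed:
def add_exclamation(x):
    return f"{x}!"
u = np.frompyfunc(add_exclamation, 1, 1)  # 1 input, 1 output
print(u(["Hi", "NumPy", "World"]))  # ['Hi!' 'NumPy!' 'World!']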
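One practical consequence of `np.frompyfunc()` is that numeric results come back as Python objects. The sketch below uses a hypothetical helper, `clipped_double`, to show the object dtype, a cast back to a numeric dtype, and an equivalent built from native ufuncs.

```python
import numpy as np

def clipped_double(x):
    """Hypothetical helper: double x, but cap the result at 10."""
    return min(2 * x, 10)

clip_ufunc = np.frompyfunc(clipped_double, 1, 1)  # 1 input, 1 output

values = np.array([1, 4, 7])
raw = clip_ufunc(values)
print(raw.dtype)        # object

# Cast back to a numeric dtype before doing further math.
as_int = raw.astype(np.int64)
print(as_int.tolist())  # [2, 8, 10]

# The same result from built-in ufuncs, which stay in compiled code.
fast = np.minimum(2 * values, 10)
print(fast.tolist())    # [2, 8, 10]
```

When a combination of built-in ufuncs can express the operation, it is usually the better choice; `np.frompyfunc()` is mainly a convenience for logic that has no vectorized equivalent.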
🔢 4. Axis Operations
Many NumPy functions operate along a specific axis.
matrix = np.array([[1, 2, 3],
[4, 5, 6]])
print("Sum of columns:", np.sum(matrix, axis=0))
print("Sum of rows:", np.sum(matrix, axis=1))
| Function | Description |
|---|---|
| np.sum() | Sum of elements |
| np.mean() | Average |
| np.max() / np.min() | Find extremes |
| np.std() | Standard deviation |
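A reduction along an axis removes that axis from the result. The sketch below shows the resulting shapes and uses `keepdims=True` so the reduced result still broadcasts against the original; the float values are made up for the example.

```python
import numpy as np

matrix = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

# axis=0 collapses the rows, leaving one value per column.
col_means = matrix.mean(axis=0)                 # shape (3,)

# keepdims=True keeps the reduced axis as size 1: shape (2, 1).
row_means = matrix.mean(axis=1, keepdims=True)

# (2, 3) minus (2, 1) broadcasts, subtracting each row's own mean.
centered = matrix - row_means

print(col_means.tolist())  # [2.5, 3.5, 4.5]
print(row_means.shape)     # (2, 1)
print(centered.tolist())   # [[-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]]
```

Without `keepdims=True`, `row_means` would have shape (2,), which does not broadcast against a (2, 3) array.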
🧱 5. Reshaping and Transposing Arrays
Reshaping
a = np.arange(6)
reshaped = a.reshape(2, 3)
print(reshaped)
Transposing
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.T) # Swap rows and columns
Flattening
flat = matrix.flatten()
print(flat)
Use ravel() when a view is acceptable (it avoids a copy whenever the memory layout allows) and flatten() when you need an independent copy.
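The difference between the two can be verified with `np.shares_memory()`. This is a minimal sketch; it also shows a case where `ravel()` has to fall back to copying.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])

raveled = matrix.ravel()      # view, because matrix is C-contiguous
flattened = matrix.flatten()  # always a copy

print(np.shares_memory(matrix, raveled))    # True
print(np.shares_memory(matrix, flattened))  # False

# The transpose is not C-contiguous, so ravel() must copy here.
raveled_t = matrix.T.ravel()
print(np.shares_memory(matrix, raveled_t))  # False
```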
🧩 6. Stacking and Splitting Arrays
Stacking
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(np.vstack((a, b))) # Vertical stack
print(np.hstack((a, b))) # Horizontal stack
Splitting
x = np.arange(10)
print(np.split(x, 2)) # Split into equal halves
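`np.split()` with an integer requires the array to divide evenly. The sketch below shows two alternatives for other cases: explicit split indices, and `np.array_split()`, which tolerates uneven pieces.

```python
import numpy as np

x = np.arange(10)

# Split at explicit indices: x[:3], x[3:7], x[7:]
parts = np.split(x, [3, 7])
print([p.tolist() for p in parts])  # [[0, 1, 2], [3, 4, 5, 6], [7, 8, 9]]

# 10 elements cannot be divided into 3 equal sections.
try:
    np.split(x, 3)
    even_split_ok = True
except ValueError:
    even_split_ok = False
print(even_split_ok)  # False

# array_split allows uneven sections; earlier sections get the extras.
chunks = np.array_split(x, 3)
print([len(c) for c in chunks])  # [4, 3, 3]
```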
🧠 7. Combining Arrays Efficiently
You can also concatenate arrays of compatible shapes.
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6]])
C = np.concatenate((A, B), axis=0)
print(C)
Use np.concatenate(), np.stack(), or np.column_stack() for more control.
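The three functions differ mainly in whether they join along an existing axis or create a new one. A short sketch comparing the resulting shapes for two 1D inputs:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# concatenate joins along an existing axis: (3,) + (3,) gives (6,)
joined = np.concatenate((a, b))

# stack inserts a new axis at the requested position.
stacked_rows = np.stack((a, b), axis=0)  # shape (2, 3)
stacked_cols = np.stack((a, b), axis=1)  # shape (3, 2)

# column_stack treats each 1D input as a column: shape (3, 2)
columns = np.column_stack((a, b))

print(joined.shape, stacked_rows.shape, stacked_cols.shape, columns.shape)
print(columns.tolist())  # [[1, 4], [2, 5], [3, 6]]
```

For 1D inputs, `np.column_stack()` and `np.stack(..., axis=1)` produce the same result.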
🧬 8. Copy vs View — Memory Efficiency
NumPy often returns views instead of copies to optimize memory.
a = np.arange(5)
b = a.view() # View (shares data)
c = a.copy() # Independent copy
b[0] = 99
print("a:", a) # a changes
print("c:", c) # c stays the same
| Operation | Shares Memory? |
|---|---|
| view() | ✅ Yes |
| reshape() | ✅ Often |
| ravel() | ✅ Often |
| copy() | ❌ No |
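The "Often" entries depend on memory layout, so it helps to check rather than guess. This sketch uses `np.shares_memory()` and the `.base` attribute; the variable names are illustrative.

```python
import numpy as np

a = np.arange(6)

sliced = a[1:4]             # basic slicing returns a view
reshaped = a.reshape(2, 3)  # contiguous input, so this is a view
copied = a.copy()

print(sliced.base is a)               # True
print(np.shares_memory(a, reshaped))  # True
print(np.shares_memory(a, copied))    # False

# Reshaping a transposed array to 1D cannot be done with strides
# alone, so NumPy has to copy in this case.
t_flat = reshaped.T.reshape(6)
print(np.shares_memory(a, t_flat))    # False

# Writing through a view changes the original array.
sliced[0] = 99
print(a[1])  # 99
```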
⚙️ 9. Fancy Indexing and Boolean Masks
Fancy indexing allows you to extract arbitrary elements efficiently.
arr = np.array([10, 20, 30, 40, 50])
indices = [0, 3, 4]
print(arr[indices]) # [10 40 50]
Boolean masks make filtering easy:
mask = arr > 25
print(arr[mask]) # [30 40 50]
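Two details are easy to miss here: fancy indexing returns a copy, not a view, and masks can be combined and used for assignment. A minimal sketch:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Fancy indexing copies the selected elements.
picked = arr[[0, 3, 4]]
picked[0] = -1
print(arr[0])  # 10 (the original is unchanged)

# Combine conditions with & and |, parenthesizing each comparison.
mask = (arr > 15) & (arr < 45)
print(arr[mask].tolist())  # [20, 30, 40]

# Assigning through a mask modifies the original array in place.
arr[mask] = 0
print(arr.tolist())  # [10, 0, 0, 0, 50]

# flatnonzero returns the positions where a condition holds.
positions = np.flatnonzero(arr == 0)
print(positions.tolist())  # [1, 2, 3]
```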
🧠 10. Useful Array Manipulation Functions
| Function | Description |
|---|---|
| reshape() | Change shape without changing data |
| ravel() | Flatten array (returns a view when possible) |
| flatten() | Flatten array (always returns a copy) |
| stack() / vstack() / hstack() | Combine arrays |
| split() | Split into subarrays |
| transpose() | Permute axes (reverses them by default) |
| moveaxis() | Move specific axis to new position |
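Of the functions in the table, `moveaxis()` is the least self-explanatory. The sketch below assumes a hypothetical image batch laid out as (batch, height, width, channels) and moves the channel axis forward.

```python
import numpy as np

# Hypothetical batch: 2 images, 4x5 pixels, 3 channels.
images = np.zeros((2, 4, 5, 3))

# Move the last axis (channels) to position 1.
channels_first = np.moveaxis(images, -1, 1)
print(channels_first.shape)  # (2, 3, 4, 5)

# The same reordering written as an explicit axis permutation.
same = images.transpose(0, 3, 1, 2)
print(same.shape)  # (2, 3, 4, 5)

# Both results are views; no pixel data is copied.
print(np.shares_memory(images, channels_first))  # True
```

`moveaxis()` tends to read more clearly when only one axis moves, while `transpose()` with a full permutation suits larger reorderings.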
⚡ 11. Performance Tips
✅ Use vectorized operations instead of Python loops.
✅ Avoid unnecessary copies: prefer operations that return views (basic slicing, reshape(), view()) when you do not need independent data.
✅ Pre-allocate arrays instead of appending in loops.
✅ Use astype() carefully — conversions can be costly.
✅ Use in-place operations (+=, *=) for large arrays.
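A few of these tips can be sketched in code. The array size is kept tiny for readability, and the buffer-address check through `__array_interface__` is included only to show that no new array is allocated.

```python
import numpy as np

n = 5  # tiny size, for illustration only

# Pre-allocate once and fill, instead of calling np.append in a loop.
squares = np.empty(n, dtype=np.int64)
for i in range(n):
    squares[i] = i * i
print(squares.tolist())  # [0, 1, 4, 9, 16]

# In-place operators reuse the existing buffer.
data = np.ones(n)
address_before = data.__array_interface__["data"][0]
data *= 2.0
data += 1.0
address_after = data.__array_interface__["data"][0]
print(address_before == address_after)  # True

# Ufuncs accept out= to write into an existing array.
np.multiply(data, 10.0, out=data)
print(data.tolist())  # [30.0, 30.0, 30.0, 30.0, 30.0]
```

The loop that fills `squares` is there only to contrast with appending; in real code `np.arange(n) ** 2` would avoid the loop entirely.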
🧭 Summary
Advanced array manipulation in NumPy unlocks high-performance data transformations.
Mastering reshaping, stacking, broadcasting, and memory handling lets you process large datasets efficiently.
The key to NumPy’s power lies in vectorization + broadcasting + memory views — once you understand those, you can manipulate any data shape efficiently.