Chapter 12: Advanced NumPy and Pandas — Advanced Array Manipulation

⚙️ Advanced Array Manipulation with NumPy

NumPy’s true power lies in its ability to handle large, multi-dimensional arrays efficiently.
In this chapter, we’ll explore advanced techniques for manipulating, reshaping, and combining arrays — key skills for numerical and scientific computing.


🚀 1. Why NumPy Is So Fast

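NumPy arrays store elements of a single data type in one block of memory, usually contiguous, so vectorized operations run in compiled C loops instead of the Python interpreter. This is typically much faster than an equivalent Python loop.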

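Example: comparing plain Python vs NumPy speed

import time
import numpy as np

x = np.arange(1_000_000)
y = np.arange(1_000_000)

# Plain Python: element-by-element loop over two lists
xs, ys = x.tolist(), y.tolist()
start = time.perf_counter()
z_list = [a + b for a, b in zip(xs, ys)]
print("Python loop:", time.perf_counter() - start)

# NumPy: one vectorized call executed in compiled code
start = time.perf_counter()
z = x + y
print("NumPy:", time.perf_counter() - start)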

💡 NumPy avoids Python-level loops by performing operations in compiled C code.


🧮 2. Broadcasting — Working with Different Shapes

Broadcasting allows NumPy to perform operations between arrays of different shapes without copying data.

Example — Scalar Broadcasting

import numpy as np

a = np.array([1, 2, 3])
result = a + 10
print(result)  # [11 12 13]

Example — 1D and 2D Broadcasting

A = np.array([[1], [2], [3]])  # Shape (3,1)
B = np.array([10, 20, 30])     # Shape (3,)
C = A + B
print(C)
# [[11 21 31]
#  [12 22 32]
#  [13 23 33]]

NumPy expands dimensions automatically to make shapes compatible.

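Rule of Thumb: NumPy compares shapes one dimension at a time, starting from the trailing (rightmost) dimension. Each pair of dimensions must be equal, or one of them must be 1. Missing leading dimensions are treated as 1.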
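
To make the rule concrete, here is a minimal sketch that checks a few shape pairs. It assumes NumPy 1.20 or later, where np.broadcast_shapes() is available; the array values are arbitrary and chosen only for illustration.

```python
import numpy as np

# Shapes are compared right to left; a missing leading dimension counts as 1.
print(np.broadcast_shapes((3, 1), (3,)))        # (3, 3)
print(np.broadcast_shapes((2, 3, 4), (3, 1)))   # (2, 3, 4)

# Trailing dimensions 3 and 2 differ and neither is 1, so this fails.
try:
    np.ones((2, 3)) + np.ones(2)
except ValueError as err:
    print("Cannot broadcast:", err)

# np.newaxis adds a length-1 axis, which makes the intent explicit.
col = np.array([1, 2, 3])[:, np.newaxis]   # shape (3, 1)
row = np.array([10, 20, 30])               # shape (3,)
table = col + row                          # shape (3, 3)
print(table)
```

Writing the extra axis out with np.newaxis is optional, but it tends to make broadcasting code easier to read than relying on implicit alignment.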


⚡ 3. Universal Functions (ufuncs)

Universal functions (ufuncs) perform element-wise operations efficiently.
They are vectorized wrappers around C functions for fast computation.

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(np.add(a, b))      # [5 7 9]
print(np.multiply(a, b)) # [4 10 18]
print(np.sqrt(b))        # [2. 2.236 2.449]
print(np.exp(a))         # [2.718 7.389 20.085]

Custom Ufuncs

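You can wrap a plain Python function as a ufunc with np.frompyfunc(). The result supports broadcasting, but it still calls the Python function once per element and returns an array of dtype object, so it does not run at C speed:

def add_exclamation(x):
    return f"{x}!"

# Arguments: the function, number of inputs, number of outputs
u = np.frompyfunc(add_exclamation, 1, 1)
print(u(["Hi", "NumPy", "World"]))  # ['Hi!' 'NumPy!' 'World!']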

🔢 4. Axis Operations

Many NumPy functions operate along a specific axis.

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

print("Sum of columns:", np.sum(matrix, axis=0))
print("Sum of rows:", np.sum(matrix, axis=1))

| Function | Description |
| --- | --- |
| np.sum() | Sum of elements |
| np.mean() | Average |
| np.max() / np.min() | Find extremes |
| np.std() | Standard deviation |
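
A common pattern is to reduce along an axis and then combine the result with the original array. The sketch below is one possible illustration; the variable names (col_means, row_means, centered) are made up for this example.

```python
import numpy as np

matrix = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

# axis=0 collapses the rows, leaving one value per column.
col_means = matrix.mean(axis=0)                 # shape (3,)

# keepdims=True keeps the reduced axis with length 1, so the result
# broadcasts back against the original array.
row_means = matrix.mean(axis=1, keepdims=True)  # shape (2, 1)
centered = matrix - row_means                   # each row now has mean 0

print(col_means)   # [2.5 3.5 4.5]
print(centered)
```

Without keepdims=True, row_means would have shape (2,), which does not broadcast against a (2, 3) array.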

🧱 5. Reshaping and Transposing Arrays

Reshaping

a = np.arange(6)
reshaped = a.reshape(2, 3)
print(reshaped)

Transposing

matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.T)  # Swap rows and columns

Flattening

flat = matrix.flatten()
print(flat)

ravel() returns a view whenever the memory layout allows it (otherwise it copies); flatten() always returns a copy.
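
One way to see the difference is np.shares_memory(), which reports whether two arrays overlap in memory. This is a small sketch with arbitrary example data:

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])

r = matrix.ravel()     # a view here, because matrix is C-contiguous
f = matrix.flatten()   # always a new array

print(np.shares_memory(matrix, r))   # True
print(np.shares_memory(matrix, f))   # False

# On a transposed (non-contiguous) array, ravel() has to copy.
rt = matrix.T.ravel()
print(np.shares_memory(matrix, rt))  # False
```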


🧩 6. Stacking and Splitting Arrays

Stacking

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(np.vstack((a, b)))  # Vertical stack
print(np.hstack((a, b)))  # Horizontal stack

Splitting

x = np.arange(10)
print(np.split(x, 2))   # Split into equal halves
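
np.split() only accepts a section count that divides the array evenly. The following sketch shows two alternatives for other cases: np.array_split() for uneven parts and a list of indices for explicit cut points. The cut points used here are arbitrary.

```python
import numpy as np

x = np.arange(10)

# 10 elements cannot be divided into 3 equal parts, so np.split raises.
try:
    np.split(x, 3)
except ValueError as err:
    print("np.split failed:", err)

# np.array_split allows parts of unequal length.
parts = np.array_split(x, 3)
print([p.tolist() for p in parts])   # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]

# A list of indices gives explicit cut points instead of a count.
head, middle, tail = np.split(x, [2, 7])
print(head.tolist(), middle.tolist(), tail.tolist())
```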

🧠 7. Combining Arrays Efficiently

You can also concatenate arrays of compatible shapes.

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6]])
C = np.concatenate((A, B), axis=0)
print(C)

Use np.concatenate(), np.stack(), or np.column_stack() for more control.
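
The three functions differ mainly in how they treat axes. A short sketch comparing the resulting shapes for two 1D inputs:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# concatenate joins along an existing axis, so the result stays 1D.
print(np.concatenate((a, b)).shape)     # (6,)

# stack creates a new axis; its position is chosen with `axis`.
print(np.stack((a, b), axis=0).shape)   # (2, 3)
print(np.stack((a, b), axis=1).shape)   # (3, 2)

# column_stack treats 1D inputs as columns of a 2D result.
cols = np.column_stack((a, b))
print(cols)
```

For 1D inputs, np.column_stack((a, b)) and np.stack((a, b), axis=1) give the same array.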


🧬 8. Copy vs View — Memory Efficiency

NumPy often returns views instead of copies to optimize memory.

a = np.arange(5)
b = a.view()    # View (shares data)
c = a.copy()    # Independent copy

b[0] = 99
print("a:", a)  # a changes
print("c:", c)  # c stays the same

| Operation | Shares Memory? |
| --- | --- |
| view() | ✅ Yes |
| reshape() | ✅ Often |
| ravel() | ✅ Often |
| copy() | ❌ No |
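
The "Often" entries depend on memory layout. As a rough illustration, the sketch below shows a slice and a reshape that share memory with the original, followed by a reshape that cannot:

```python
import numpy as np

a = np.arange(6)

s = a[1:4]            # basic slicing returns a view
s[0] = 100
print(a.tolist())     # [0, 100, 2, 3, 4, 5]

m = a.reshape(2, 3)   # contiguous input: reshape returns a view
print(m.base is a)    # True

t = m.T.reshape(6)    # transposed input: reshape has to copy
print(np.shares_memory(a, t))   # False
```

When in doubt, np.shares_memory() or the .base attribute can confirm whether an operation produced a view.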

⚙️ 9. Fancy Indexing and Boolean Masks

Fancy indexing allows you to extract arbitrary elements efficiently.

arr = np.array([10, 20, 30, 40, 50])
indices = [0, 3, 4]
print(arr[indices])  # [10 40 50]

Boolean masks make filtering easy:

mask = arr > 25
print(arr[mask])  # [30 40 50]
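
Two details are easy to miss: reading with a fancy index or a mask returns a copy, while assigning through one modifies the original array. A minimal sketch using the same example values:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Fancy indexing returns a copy, so edits do not reach the original.
picked = arr[[0, 3, 4]]
picked[0] = -1
print(arr[0])                          # 10

# Conditions are combined with & and |, each wrapped in parentheses.
middle = arr[(arr > 15) & (arr < 45)]
print(middle.tolist())                 # [20, 30, 40]

# Assigning through a mask modifies the original array in place.
arr[arr > 25] = 0
print(arr.tolist())                    # [10, 20, 0, 0, 0]
```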

🧠 10. Useful Array Manipulation Functions

| Function | Description |
| --- | --- |
| reshape() | Change shape without changing data |
| ravel() | Flatten array (returns a view when possible) |
| flatten() | Flatten array (always returns a copy) |
| stack() / vstack() / hstack() | Combine arrays |
| split() | Split into subarrays |
| transpose() | Reorder axes |
| moveaxis() | Move specific axis to new position |
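
moveaxis() is the one entry in this table without an example so far. The sketch below uses a hypothetical batch of images (the shape and the name batch are assumptions made for illustration) to show how it relates to transpose():

```python
import numpy as np

# Hypothetical batch: 4 images, 32x32 pixels, 3 channels (channels last).
batch = np.zeros((4, 32, 32, 3))

# moveaxis relocates one axis and keeps the others in their original order.
channels_first = np.moveaxis(batch, -1, 1)
print(channels_first.shape)   # (4, 3, 32, 32)

# transpose with an explicit axis order gives the same result.
same = batch.transpose(0, 3, 1, 2)
print(same.shape)             # (4, 3, 32, 32)

# Both return views, so no element data is copied.
print(np.shares_memory(batch, channels_first))   # True
```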

⚡ 11. Performance Tips

✅ Use vectorized operations instead of Python loops.
✅ Avoid unnecessary copies (prefer views such as slices and reshape() where possible).
✅ Pre-allocate arrays instead of appending in loops.
✅ Use astype() carefully — conversions can be costly.
✅ Use in-place operations (+=, *=) for large arrays.
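
As a rough sketch of the pre-allocation and in-place tips, the example below fills a pre-allocated array from a list of chunks and then updates it without creating new arrays. The chunks list is a made-up stand-in for data arriving in batches.

```python
import numpy as np

# Hypothetical input: five batches of 1000 values each.
chunks = [np.full(1000, k, dtype=np.float64) for k in range(5)]

# Pre-allocate the result once and fill slices, rather than growing
# an array with np.append (which copies everything on each call).
out = np.empty(sum(len(c) for c in chunks), dtype=np.float64)
pos = 0
for c in chunks:
    out[pos:pos + len(c)] = c
    pos += len(c)

# In-place operators reuse the existing buffer.
address_before = out.ctypes.data
out *= 2.0
out += 1.0
print(out.ctypes.data == address_before)   # True: same memory

# The out= argument of a ufunc does the same for function calls.
np.sqrt(out, out=out)
```

The loop here runs once per chunk, not once per element, so the per-element work still happens in vectorized code.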


🧭 Summary

Advanced array manipulation in NumPy unlocks high-performance data transformations.
Mastering reshaping, stacking, broadcasting, and memory handling lets you process large datasets efficiently.

The key to NumPy’s power lies in vectorization + broadcasting + memory views — once you understand those, you can manipulate any data shape efficiently.