Advanced Slicing: Multi-Dimensional Manipulation
Advanced slicing goes beyond simple index ranges: np.newaxis, ellipsis (...), negative indices, and multi-dimensional fancy indexing let you reshape, reorder, and extract data from complex array structures without explicit loops. Basic slices and np.newaxis return views that share memory with the original array, while fancy indexing returns a copy. These techniques are essential for handling batches in deep learning, extracting time-series subsets, and reshaping data for operations requiring specific dimensional layouts. Mastering advanced slicing makes your code concise, readable, and memory-efficient.
Dimension Expansion with np.newaxis
np.newaxis (equivalent to None in index position) adds a new axis of size 1 at that location, enabling broadcasting and reshape operations without copying.
import numpy as np
# Start with a 1D array
vec = np.array([1, 2, 3])
print(f"Original shape: {vec.shape}") # (3,)
# Add axis at the beginning: convert to row vector
row_vec = vec[np.newaxis, :] # or vec[None, :]
print(f"Row vector shape: {row_vec.shape}") # (1, 3)
# Add axis at the end: convert to column vector
col_vec = vec[:, np.newaxis]
print(f"Column vector shape: {col_vec.shape}") # (3, 1)
# Multiple newaxis for higher dimensions
matrix = np.arange(6).reshape(2, 3)
print(f"Original matrix shape: {matrix.shape}") # (2, 3)
# Add axes: (2, 3) → (1, 2, 1, 3)
expanded = matrix[np.newaxis, :, np.newaxis, :]
print(f"Expanded shape: {expanded.shape}") # (1, 2, 1, 3)
# Practical: subtract each image's own mean (keepdims=True keeps size-1 axes, like np.newaxis)
data = np.random.randn(100, 50, 30) # batch of 100 images, 50x30 each
mean_per_image = data.mean(axis=(1, 2), keepdims=True) # shape (100, 1, 1)
normalized = data - mean_per_image # broadcasts (100, 50, 30) against (100, 1, 1)
np.newaxis creates a view (no copy), making it memory-efficient even for large arrays.
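As a small illustration of why the size-1 axes matter for broadcasting, the sketch below builds a table of pairwise differences from a 1D array. The array name points and its values are made up for the example.

```python
import numpy as np

# Hypothetical 1D data, chosen only for illustration
points = np.array([1.0, 4.0, 6.0])

# Shape (3, 1) minus shape (1, 3) broadcasts to a (3, 3) table
# where diffs[i, j] == points[i] - points[j]
diffs = points[:, np.newaxis] - points[np.newaxis, :]

print(diffs.shape)   # (3, 3)
print(diffs[0, 1])   # -3.0
```

Without the inserted axes, points - points would simply be an element-wise subtraction of shape (3,).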
Ellipsis: Flexible Multi-Dimensional Slicing
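The ellipsis (...) expands to as many full slices (:) as are needed to cover every dimension you did not index explicitly, which is useful when you do not know the array's dimensionality in advance. An index expression may contain at most one ellipsis.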
import numpy as np
# 4D array: (batches, channels, height, width)
images = np.random.randn(32, 3, 224, 224)
# Traditional slicing: verbose for high-dimensional data
first_image_first_channel = images[0, 0, :, :]
# Using ellipsis: concise and dimension-agnostic
first_image_first_channel_ellipsis = images[0, 0, ...]
assert np.array_equal(first_image_first_channel, first_image_first_channel_ellipsis)
# Ellipsis at different positions
all_but_last = images[..., 0] # shape (32, 3, 224): index 0 on the last axis
all_but_first = images[1, ...] # shape (3, 224, 224): index 1 on the first axis
# Practical: extract last channel from all images
last_channel = images[:, -1, ...] # same as images[:, -1, :, :]
print(f"Last channel shape: {last_channel.shape}") # (32, 224, 224)
Ellipsis is especially useful for functions that accept arrays of any dimensionality.
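One way to picture this is a helper that works on the trailing axes and lets the ellipsis absorb any leading ones. The function name last_timestep and the (..., timesteps, features) layout are assumptions made for this sketch.

```python
import numpy as np

def last_timestep(x):
    """Return the final timestep of an array laid out as (..., timesteps, features).

    Hypothetical helper: the ellipsis stands in for any number of leading axes.
    """
    return x[..., -1, :]

single = np.zeros((128, 10))          # one sequence
batch = np.zeros((1000, 128, 10))     # a batch of sequences
nested = np.zeros((4, 8, 128, 10))    # two leading batch axes

print(last_timestep(single).shape)    # (10,)
print(last_timestep(batch).shape)     # (1000, 10)
print(last_timestep(nested).shape)    # (4, 8, 10)
```

The same function body handles all three inputs because only the last two axes are named explicitly.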
Negative Indexing and Slicing
Negative indices count from the end of an axis. This is powerful for extracting trailing elements without knowing array size.
import numpy as np
arr = np.arange(10)
print(arr[-1]) # 9 — last element
print(arr[-3:]) # [7, 8, 9] — last 3 elements
print(arr[:-2]) # [0, 1, 2, 3, 4, 5, 6, 7] — all but last 2
# Multi-dimensional negative indexing
matrix = np.arange(12).reshape(3, 4)
print(matrix[-1, -1]) # 11 — bottom-right element
print(matrix[-2:, -3:]) # bottom-right 2x3 block
# Practical: last timestep in time-series data
time_series = np.random.randn(1000, 128, 10) # (samples, timesteps, features)
last_timestep = time_series[:, -1, :] # shape (1000, 10)
Step (Stride) in Slicing
The third element in a slice (start:stop:step) controls the stride. A negative step walks the axis in reverse, in which case start should lie after stop.
import numpy as np
arr = np.arange(20)
print(arr[::2]) # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18] — every other element
print(arr[1::2]) # [1, 3, 5, 7, 9, 11, 13, 15, 17, 19] — every other, start at 1
print(arr[::-1]) # [19, 18, 17, ..., 1, 0] — reversed
print(arr[10:0:-1]) # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1] — reverse slice
# Multi-dimensional striding
matrix = np.arange(16).reshape(4, 4)
print(matrix[::2, ::2]) # every other row and column
# [[0, 2],
# [8, 10]]
# Practical: downsampling time-series or images
downsampled = matrix[::2, ::2] # 4x4 → 2x2
Warning: Strided slices are usually non-contiguous views, which can slow subsequent operations that benefit from contiguous memory.
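To see the contiguity change directly, the sketch below inspects the flags and strides attributes before and after a strided slice. The dtype is fixed to int64 here only so that the byte strides in the comments are predictable.

```python
import numpy as np

matrix = np.arange(16, dtype=np.int64).reshape(4, 4)
strided = matrix[::2, ::2]  # every other row and column: a view

print(matrix.flags["C_CONTIGUOUS"])    # True
print(strided.flags["C_CONTIGUOUS"])   # False
print(matrix.strides)                  # (32, 8)
print(strided.strides)                 # (64, 16)

# np.ascontiguousarray copies only when the input is not already contiguous
compact = np.ascontiguousarray(strided)
print(compact.flags["C_CONTIGUOUS"])       # True
print(np.shares_memory(matrix, compact))   # False
```

Whether the copy pays off depends on how often the slice is reused; for a single pass the view is often fine.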
Combining Techniques: Index Arrays and Slicing
You can combine fancy indexing (integer arrays) with simple slicing:
import numpy as np
# 2D array
data = np.arange(20).reshape(4, 5)
print(data)
# [[ 0, 1, 2, 3, 4],
# [ 5, 6, 7, 8, 9],
# [10, 11, 12, 13, 14],
# [15, 16, 17, 18, 19]]
# Fancy row indexing with simple column slicing
rows = np.array([0, 2, 3])
result = data[rows, 1:4] # select rows 0, 2, 3 and columns 1-3
print(result)
# [[ 1, 2, 3],
# [11, 12, 13],
# [16, 17, 18]]
# Practical: pick one (sample, feature) pair per query
sample_indices = np.array([0, 2, 1, 3, 2]) # which sample (row) for each query
feature_indices = np.array([4, 0, 3, 1, 2]) # which feature (column) for each query
picked = data[sample_indices, feature_indices] # index arrays are paired element-wise
print(picked) # [4, 10, 8, 16, 12]
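One consequence worth checking when mixing the two styles: as soon as an integer array appears in the index, NumPy returns a copy, whereas an index made only of basic slices returns a view. The sketch below uses np.shares_memory to show the difference; the variable names are illustrative.

```python
import numpy as np

data = np.arange(20).reshape(4, 5)
rows = np.array([0, 2, 3])

picked = data[rows, 1:4]   # integer array on axis 0: result is a copy
window = data[1:3, 1:4]    # basic slices only: result is a view

print(np.shares_memory(data, picked))  # False
print(np.shares_memory(data, window))  # True

picked[0, 0] = -1   # changes the copy only
window[0, 0] = -1   # writes through to data[1, 1]
print(data[0, 1], data[1, 1])  # 1 -1
```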
Reshaping vs Slicing: When to Use Each
Reshaping reinterprets a flat memory layout; slicing selects subsets. Choose based on your goal:
import numpy as np
data = np.arange(24).reshape(2, 3, 4) # 24 elements
# Slicing: select a subset
subset = data[:, 1, :] # index 1 on axis 1: the middle row of each 3x4 block
print(f"Slice shape: {subset.shape}") # (2, 4)
# Reshape: change dimensionality without selecting
reshaped = data.reshape(6, 4) # merge the first two axes: (2, 3, 4) → (6, 4)
print(f"Reshape shape: {reshaped.shape}") # (6, 4)
# Flatten: collapse all dimensions
flattened = data.reshape(-1) # -1 infers size
print(f"Flattened shape: {flattened.shape}") # (24,)
# Transpose: swap axes (view, no copy)
transposed = data.transpose(1, 0, 2) # swap first two axes
print(f"Transposed shape: {transposed.shape}") # (3, 2, 4)
# Moveaxis: move one axis (often clearer than transpose)
moved = np.moveaxis(data, 0, -1) # move axis 0 to the end
print(f"Moveaxis shape: {moved.shape}") # (3, 4, 2)
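Whether reshape returns a view depends on the memory layout of its input. The sketch below, which reuses the same data array, suggests one way to check: a contiguous array reshapes as a view, while reshaping a transposed array forces a copy because the new layout cannot be described by strides alone.

```python
import numpy as np

data = np.arange(24).reshape(2, 3, 4)

merged = data.reshape(6, 4)            # contiguous input: a view
print(np.shares_memory(data, merged))  # True

transposed = data.transpose(1, 0, 2)       # view with permuted strides
print(np.shares_memory(data, transposed))  # True

flat = transposed.reshape(-1)          # non-contiguous input: NumPy must copy
print(np.shares_memory(data, flat))    # False
```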
Practical: Complex Data Extraction
Real-world example: extracting train/test splits, batches, and features:
import numpy as np
# Dataset: 1000 samples, 50 timesteps, 30 features
dataset = np.random.randn(1000, 50, 30)
# Split into train/test (80/20)
split_idx = 800
train, test = dataset[:split_idx], dataset[split_idx:]
print(f"Train shape: {train.shape}, Test shape: {test.shape}")
# Create batches (32 per batch)
batch_size = 32
num_batches = len(train) // batch_size
# Basic slicing extracts each batch without copying
for batch_num in range(num_batches):
    start = batch_num * batch_size
    end = start + batch_size
    batch = train[start:end] # basic slice: a view into train, not a copy
    print(f"Batch {batch_num}: shape {batch.shape}")
# Extract every 5th timestep and first 10 features
subsampled = dataset[:, ::5, :10] # (1000, 10, 10)
print(f"Subsampled: {subsampled.shape}")
# Fancy indexing: select specific features (returns a copy)
important_features = np.array([5, 12, 19, 28])
selected = dataset[:, :, important_features] # (1000, 50, 4)
print(f"Selected features: {selected.shape}")
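The batching loop above uses floor division, so it would skip trailing samples whenever the training set size is not a multiple of the batch size. A possible alternative, sketched here with a hypothetical helper named iter_batches and a deliberately small array, steps through the array with range and lets slicing clip the final batch.

```python
import numpy as np

def iter_batches(array, batch_size):
    """Yield consecutive views along axis 0, including a shorter final batch.

    Hypothetical helper for illustration; slicing past the end is clipped by NumPy.
    """
    for start in range(0, len(array), batch_size):
        yield array[start:start + batch_size]

train = np.zeros((100, 50, 30))  # 100 samples does not divide evenly by 32
batches = list(iter_batches(train, 32))

print(len(batches))                           # 4
print(batches[0].shape)                       # (32, 50, 30)
print(batches[-1].shape)                      # (4, 50, 30)
print(np.shares_memory(train, batches[0]))    # True
```

If a short final batch is unwanted, the original floor-division loop remains the simpler choice.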
Key Takeaways
- np.newaxis adds axes of size 1 without copying; use it to reshape for broadcasting.
- Ellipsis (...) represents all remaining dimensions; especially useful for high-dimensional arrays.
- Negative indices count from the end; combined with slicing, they extract trailing data without knowing size.
- Step in slicing (arr[::2]) downsamples; stride breaks contiguity, potentially slowing operations.
- Combining fancy indexing with slicing enables complex selection patterns on multiple axes.
- Slicing creates views (fast, no copy); reshaping reinterprets layout (also fast, usually a view).
Frequently Asked Questions
Does arr[np.newaxis, :] create a copy?
No, it creates a view with modified shape and strides. Memory is shared, so writing to the view also changes the original array (and vice versa). This makes it memory-efficient, but call .copy() if you need an independent array.
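A quick way to confirm the shared memory is to write through the expanded array and read the original back. This is a minimal check, with vec and row as illustrative names.

```python
import numpy as np

vec = np.array([1, 2, 3])
row = vec[np.newaxis, :]   # shape (1, 3), same underlying buffer

row[0, 0] = 99             # write through the view
print(vec[0])              # 99
print(row.base is vec)     # True
```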
How do I combine row and column fancy indexing?
Use np.ix_() to convert separate row/column indices into a proper 2D fancy index:
import numpy as np
arr = np.arange(16).reshape(4, 4)
rows = np.array([0, 2])
cols = np.array([1, 3])
result = arr[np.ix_(rows, cols)] # 2x2 block: rows 0 and 2 crossed with columns 1 and 3
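The reason np.ix_ is needed becomes clearer when compared with passing the two index arrays directly, which pairs them element-wise instead of forming a grid. The sketch below assumes a small 4x4 example array.

```python
import numpy as np

arr = np.arange(16).reshape(4, 4)   # example array, assumed for illustration
rows = np.array([0, 2])
cols = np.array([1, 3])

paired = arr[rows, cols]            # elements (0, 1) and (2, 3) only
grid = arr[np.ix_(rows, cols)]      # every combination of the rows and columns

print(paired.tolist())  # [1, 11]
print(grid.tolist())    # [[1, 3], [9, 11]]

# np.ix_ is equivalent to reshaping rows into a column so the two arrays broadcast
same = arr[rows[:, np.newaxis], cols]
print(np.array_equal(grid, same))   # True
```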
What is the difference between arr[:, :] and arr?
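arr[:, :] returns a new view object that shares arr's data; arr is the array itself. Reading from either gives the same values. The difference shows up on assignment: arr[:, :] = value writes into the existing buffer, while arr = value only rebinds the name and leaves the original data untouched.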
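The distinction is easiest to see with a second name bound to the same array. In this sketch, alias is a hypothetical extra reference used only to observe what happens to the original buffer.

```python
import numpy as np

arr = np.ones((2, 3))
alias = arr                  # second name for the same array object

view = arr[:, :]
print(view is arr)           # False: slicing builds a new ndarray object
print(view.base is arr)      # True: but it shares arr's data

arr[:, :] = 0                # in-place write into the shared buffer
print(alias.sum())           # 0.0

arr = np.full((2, 3), 7.0)   # rebinding: arr now names a different array
print(alias.sum())           # 0.0, the old buffer is unchanged
print(arr.sum())             # 42.0
```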
Can slicing break array contiguity?
Yes. Strided slices (arr[::2]) create non-contiguous views. If you will run many operations on such a slice, copy it first with arr[::2].copy() (or np.ascontiguousarray(arr[::2])) to get C-contiguous memory.
How do I transpose a multi-dimensional array flexibly?
Use np.transpose(arr, axes) or arr.transpose(axes) to specify the new axis order. For moving one axis, np.moveaxis(arr, src, dst) is clearer than transposing.
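As a sketch of the two approaches side by side, the example below converts a channels-first layout to channels-last. The small shape and the name images are assumptions chosen to keep the example light.

```python
import numpy as np

images = np.zeros((2, 3, 4, 5))   # assumed layout: (batch, channels, height, width)

# Move the channel axis (1) to the end; the other axes keep their relative order
channels_last = np.moveaxis(images, 1, -1)
print(channels_last.shape)        # (2, 4, 5, 3)

# The same result with transpose requires spelling out the full axis order
same = images.transpose(0, 2, 3, 1)
print(same.shape)                 # (2, 4, 5, 3)

# Both are views onto the original data
print(np.shares_memory(images, channels_last))  # True
```

With moveaxis only the axis being moved is named, which tends to stay readable as the number of dimensions grows.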