PyTorch Tensors: Beginner's Guide

A PyTorch tensor is a multidimensional array and the core data structure behind every deep learning computation in PyTorch. Tensors can live on CPUs or GPUs, support automatic differentiation, and enable efficient batch processing of numerical data. Understanding how to create, manipulate, and operate on tensors is essential before building any neural network.

What are PyTorch tensors?

PyTorch tensors are similar to NumPy arrays but add two capabilities: GPU support and automatic differentiation. A tensor is described by its shape (dimensions), data type (dtype), and device (CPU or GPU). Unlike a Python list, which holds references to separate objects, a tensor stores elements of a single dtype in one block of memory (contiguous by default), which makes mathematical operations efficient. According to PyTorch's official documentation (2026), tensors are the fundamental building blocks for all neural networks and scientific computing workflows.
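
To make the automatic differentiation point concrete, here is a minimal sketch. The function y = sum(x^2) is chosen only for illustration; it is not from the original article.

```python
import torch

# Gradient tracking is off by default; requires_grad=True turns it on
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# y = sum(x^2), so the gradient dy/dx is 2x
y = (x ** 2).sum()
y.backward()  # populates x.grad

print(x.grad)  # tensor([2., 4., 6.])
```

A NumPy array has no equivalent of .grad; this bookkeeping is what lets PyTorch train neural networks.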

Creating and inspecting tensors

There are multiple ways to create tensors in PyTorch. The most common approaches include direct instantiation, conversion from NumPy arrays, and using factory functions.

Creating tensors from data

import torch

# Create a tensor from a Python list
tensor_from_list = torch.tensor([1, 2, 3, 4, 5])
print(tensor_from_list) # Output: tensor([1, 2, 3, 4, 5])

# Create a 2D tensor (matrix)
matrix = torch.tensor([[1, 2, 3], [4, 5, 6]])
print(matrix.shape) # Output: torch.Size([2, 3])

# Create a tensor with specific data type
float_tensor = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float32)
print(float_tensor.dtype) # Output: torch.float32

Using factory functions

import torch

# Zeros, ones, and random tensors
zeros = torch.zeros(3, 4) # 3x4 matrix of zeros
ones = torch.ones(2, 5) # 2x5 matrix of ones
random_tensor = torch.randn(3, 3) # 3x3 matrix of random normal values

# Tensors with specific ranges
range_tensor = torch.arange(0, 10, step=2) # [0, 2, 4, 6, 8]
linspace_tensor = torch.linspace(0, 1, 5) # 5 evenly spaced values from 0 to 1, endpoints included

print(f"Zeros shape: {zeros.shape}")
print(f"Random tensor:\n{random_tensor}")

Tensor properties and attributes

Every tensor has key properties you'll access constantly: shape (the dimensions), dtype (numeric type), and device (CPU or GPU).

Inspecting tensor properties

import torch

# Create a tensor
t = torch.randn(2, 3, 4)

print(f"Shape: {t.shape}") # torch.Size([2, 3, 4])
print(f"Number of dimensions: {t.ndim}") # 3
print(f"Total elements: {t.numel()}") # 24
print(f"Data type: {t.dtype}") # torch.float32
print(f"Device: {t.device}") # device(type='cpu')

# Reshape without changing data
reshaped = t.reshape(6, 4)
print(f"Reshaped to: {reshaped.shape}") # torch.Size([6, 4])

# Flatten to 1D
flat = t.flatten()
print(f"Flattened size: {flat.shape}") # torch.Size([24])

Tensor operations and arithmetic

PyTorch tensors support element-wise operations, matrix multiplication, and reduction operations.

Operation | Example | Result
Element-wise addition | a + b | Adds corresponding elements
Matrix multiplication | torch.matmul(a, b) or a @ b | Multiplies matrices following linear algebra rules
Element-wise multiplication | a * b | Multiplies corresponding elements (Hadamard product)
Transpose | a.T or a.transpose(0, 1) | Swaps the two dimensions of a 2D tensor
Sum reduction | a.sum() | Sums all elements to a scalar

Performing arithmetic operations

import torch

# Create sample tensors
a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.tensor([[2.0, 0.0], [1.0, 2.0]])

# Element-wise operations
add_result = a + b
mul_result = a * b # Element-wise, not matrix multiplication

# Matrix multiplication
matmul_result = a @ b
print(f"Matrix multiplication:\n{matmul_result}")

# Reduction operations
sum_all = a.sum() # Sum of all elements: 10.0
sum_rows = a.sum(dim=1) # Sum within each row (collapses dim 1): tensor([3., 7.])
mean_val = a.mean() # Mean of all elements: 2.5

print(f"Sum of all elements: {sum_all}")
print(f"Sum along dimension 1: {sum_rows}")
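
The dim argument names the dimension that gets collapsed, which is easy to mix up. The following sketch reuses the same 2x2 values and contrasts dim=0 with dim=1. It also shows keepdim=True, an option the article does not cover.

```python
import torch

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# dim=0 collapses the row dimension, leaving one sum per column
col_sums = a.sum(dim=0)
print(col_sums)  # tensor([4., 6.])

# dim=1 collapses the column dimension, leaving one sum per row
row_sums = a.sum(dim=1)
print(row_sums)  # tensor([3., 7.])

# keepdim=True keeps the collapsed dimension with size 1
row_sums_2d = a.sum(dim=1, keepdim=True)
print(row_sums_2d.shape)  # torch.Size([2, 1])
```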

Indexing and slicing tensors

Access specific elements, rows, or sub-tensors using NumPy-style indexing.

Extracting and modifying tensor values

import torch

# Create a 3x4 tensor
t = torch.arange(12).reshape(3, 4).float()
print(f"Original tensor:\n{t}")

# Single element access
element = t[0, 1] # First row, second column
print(f"Element at [0, 1]: {element}")

# Row and column slicing
first_row = t[0, :] # Entire first row
first_col = t[:, 0] # Entire first column
submatrix = t[1:3, 1:3] # Rows 1-2, columns 1-2

print(f"First row: {first_row}")
print(f"Submatrix:\n{submatrix}")

# Modify elements
t[0, 0] = 99
print(f"Modified tensor:\n{t}")
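
NumPy-style indexing also includes boolean masks and index lists. The sketch below is an illustrative addition: it shows both forms and the difference between basic slicing, which returns a view, and list indexing, which returns a copy.

```python
import torch

t = torch.arange(12).reshape(3, 4).float()

# Boolean mask: selects matching elements into a 1D tensor
mask = t > 6
selected = t[mask]
print(selected)  # values 7 through 11

# Basic indexing returns a view: writing to it changes t
row = t[0]
row[0] = -1.0
print(t[0, 0])  # tensor(-1.)

# Indexing with a list of indices returns a copy: t is unaffected
picked = t[[0, 2]]
picked[0, 1] = 50.0
print(t[0, 1])  # tensor(1.)
```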

Broadcasting in PyTorch

Broadcasting allows operations between tensors of different shapes, automatically expanding smaller tensors to match larger ones without copying data.

Understanding and using broadcasting

import torch

# Broadcasting example: adding a scalar to a matrix
matrix = torch.ones(2, 3)
scalar = torch.tensor(5.0)
result = matrix + scalar # Scalar broadcasts to (2, 3)
print(f"Scalar broadcast result:\n{result}")

# Broadcasting a row vector to a matrix
row_vector = torch.tensor([[1.0, 2.0, 3.0]])
matrix = torch.ones(4, 3)
result = matrix + row_vector # (1, 3) broadcasts to (4, 3)
print(f"Row broadcast result shape: {result.shape}") # torch.Size([4, 3])

# Broadcasting fails with incompatible shapes
try:
    incompatible = torch.ones(2, 3) + torch.ones(3, 2)
except RuntimeError as e:
    print(f"Broadcasting error (expected): {e}")
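
The compatibility rule behind these examples can be checked without building any tensors. This sketch uses torch.broadcast_shapes to compute the result shape, and expand() to show what "without copying data" means. The shapes are arbitrary illustrations.

```python
import torch

# Shapes are compared from the last dimension backwards; each pair of sizes
# must be equal, or one of them must be 1 (or missing entirely).
shape_a = torch.broadcast_shapes((4, 3), (1, 3))
shape_b = torch.broadcast_shapes((2, 1, 5), (3, 1))
print(shape_a)  # torch.Size([4, 3])
print(shape_b)  # torch.Size([2, 3, 5])

# expand() builds the broadcast view explicitly. The expanded dimension gets
# stride 0, so all four rows read the same three values in memory.
row = torch.tensor([[1.0, 2.0, 3.0]])
expanded = row.expand(4, 3)
print(expanded.stride())  # (0, 1)
```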

Tensor conversion and memory layout

Convert tensors to and from NumPy arrays, and understand how data is stored in memory.

Working with NumPy arrays

import torch
import numpy as np

# NumPy array to PyTorch tensor
np_array = np.array([[1, 2], [3, 4]])
tensor = torch.from_numpy(np_array)
print(f"Tensor from NumPy:\n{tensor}")

# PyTorch tensor to NumPy array
tensor = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)
np_result = tensor.numpy()
print(f"NumPy from tensor:\n{np_result}")
print(f"Type: {type(np_result)}")

# Important: torch.from_numpy() shares memory with the source array,
# so modifying the NumPy array also changes the tensor (and vice versa)
np_array = np.array([1, 2, 3], dtype=np.float32)
tensor = torch.from_numpy(np_array)
np_array[0] = 99
print(f"Modified tensor (shared memory): {tensor}")
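
Memory sharing depends on which conversion function is used. As a short illustrative comparison, torch.from_numpy() shares the array's buffer, while torch.tensor() copies the data:

```python
import numpy as np
import torch

source = np.array([1.0, 2.0, 3.0], dtype=np.float32)

shared = torch.from_numpy(source)  # shares memory with source
copied = torch.tensor(source)      # independent copy of the data

source[0] = 99.0

print(shared[0].item())  # 99.0, follows the change to the array
print(copied[0].item())  # 1.0, unaffected
```

When the array and tensor must stay independent, torch.tensor() (or calling .clone() on the tensor) is the safer choice.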

Key Takeaways

  • PyTorch tensors are multidimensional arrays that, unlike NumPy arrays, support GPU acceleration and automatic differentiation.
  • Create tensors using constructors like torch.tensor(), torch.zeros(), torch.ones(), and torch.randn() for different initialization patterns.
  • Every tensor has a shape, dtype, and device—understanding these properties is critical for debugging shape mismatches and ensuring computations run on the correct hardware.
  • Tensor operations include element-wise arithmetic, matrix multiplication (@ operator), and reductions (sum(), mean()) that operate along specified dimensions.
  • Broadcasting automatically aligns tensor shapes for compatible operations, reducing the need for manual reshaping.
  • NumPy arrays and PyTorch tensors can be converted back and forth; with torch.from_numpy() and .numpy() on CPU, they share memory, so modifying one affects the other.

Frequently Asked Questions

What is the difference between reshape() and view()?

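reshape() returns a view when the tensor's memory layout allows it and silently copies the data otherwise, for example when the tensor is not contiguous. view() never copies: it returns a tensor that shares the original's memory and raises a RuntimeError if the requested shape is incompatible with the current layout. Use view() when you want a guarantee that no copy is made, or contiguous().view() if the tensor might not be contiguous (contiguous() itself copies when needed).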
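
A transposed tensor is a common non-contiguous case and makes the difference visible. The tensor values here are arbitrary and only for illustration.

```python
import torch

t = torch.arange(6).reshape(2, 3)
transposed = t.t()  # shape (3, 2), same memory, no longer contiguous
print(transposed.is_contiguous())  # False

# view() cannot describe this layout as a flat tensor, so it raises
try:
    transposed.view(6)
except RuntimeError as e:
    print(f"view() failed: {e}")

# reshape() succeeds by copying the data into a new layout
flat = transposed.reshape(6)
print(flat)  # tensor([0, 3, 1, 4, 2, 5])

# On a contiguous tensor, view() shares memory with the original
v = t.view(6)
v[0] = 100
print(t[0, 0])  # tensor(100)
```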

Can I perform in-place operations on tensors?

Yes, PyTorch functions ending with _ perform in-place modifications: a.add_(b) adds b to a directly. In-place operations save memory but can cause issues in autograd if the tensor is part of a computation graph with requires_grad=True.
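
The sketch below shows an in-place add and one way the autograd restriction shows up. The example with a leaf tensor and torch.no_grad() is an illustrative case, not the only situation where in-place operations interact with autograd.

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([10.0, 20.0, 30.0])

a.add_(b)  # modifies a directly, no new tensor is allocated
print(a)   # tensor([11., 22., 33.])

# A leaf tensor that requires grad rejects in-place modification
w = torch.tensor([1.0, 2.0], requires_grad=True)
try:
    w.add_(1.0)
except RuntimeError as e:
    print(f"In-place update blocked: {e}")

# Inside torch.no_grad() the update is allowed (as in a manual optimizer step)
with torch.no_grad():
    w.add_(1.0)
print(w.detach())  # tensor([2., 3.])
```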

How do I move a tensor to the GPU?

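Use .to(device) or .cuda(): tensor = tensor.to('cuda') or tensor = tensor.cuda(). Both return a new tensor rather than moving the original in place, so reassign the result. Check whether a CUDA GPU is usable with torch.cuda.is_available(). To move back to CPU: tensor = tensor.cpu().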
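
A common pattern is to pick the device once and fall back to the CPU when no GPU is present, so the same script runs on either machine. This is a minimal sketch; the variable names are illustrative.

```python
import torch

# Choose the GPU if one is usable, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

t = torch.ones(2, 3)
t = t.to(device)  # .to() returns a tensor on the target device
print(t.device)

# Tensors in one operation must be on the same device,
# so create the other operand there directly
other = torch.zeros(2, 3, device=device)
result = t + other

# Move back to the CPU, e.g. before converting to NumPy
back = result.cpu()
print(back.device)  # cpu
```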

What is the difference between torch.tensor() and torch.Tensor()?

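torch.tensor() creates a tensor from data (list, array, etc.) and infers the dtype from the input. torch.Tensor is the tensor class itself: calling it with sizes, as in torch.Tensor(2, 3), returns an uninitialized tensor, and calling it with data converts the values to the default dtype (float32 unless changed). Always use torch.tensor() for clarity when you have data, and torch.empty() when you want an uninitialized tensor.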
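
The difference is easiest to see by passing the same integer list to both. This sketch assumes the default dtype has not been changed with torch.set_default_dtype().

```python
import torch

# torch.tensor() infers the dtype from the data
from_data = torch.tensor([1, 2, 3])
print(from_data.dtype)  # torch.int64

# torch.Tensor() converts the same data to the default dtype
via_class = torch.Tensor([1, 2, 3])
print(via_class.dtype)  # torch.float32

# torch.empty() is the explicit way to get an uninitialized tensor;
# its contents are arbitrary until you write to it
buffer = torch.empty(2, 3)
print(buffer.shape)  # torch.Size([2, 3])
```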

Why does my tensor have dtype int64 when I expected float32?

PyTorch infers dtype from input data: integer lists become int64, float lists become float32. Explicitly specify dtype: torch.tensor([1, 2, 3], dtype=torch.float32) to override the default inference.
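
The inference rules, and two ways to convert after the fact, can be sketched as follows. This assumes the default float dtype is float32.

```python
import torch

ints = torch.tensor([1, 2, 3])
floats = torch.tensor([1.0, 2.0, 3.0])
mixed = torch.tensor([1, 2.5])  # a single float makes the whole tensor float

print(ints.dtype)    # torch.int64
print(floats.dtype)  # torch.float32
print(mixed.dtype)   # torch.float32

# Convert an existing tensor; both calls return a new tensor
converted = ints.to(torch.float32)
also_converted = ints.float()
print(converted.dtype)  # torch.float32

# Mixing int and float tensors promotes the result to float
promoted = ints + floats
print(promoted.dtype)  # torch.float32
```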

Further Reading