Tensors play a very important role in deep learning and machine learning. They are at the heart of algorithms such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) that are widely used in many applications such as image recognition, natural language processing, and time series analysis.
In this article, we will first briefly review some linear algebra concepts that will be useful for understanding tensors. We will then see how tensors can be represented using matrices and vectors. Finally, we will look at some examples of how tensors are used in deep learning.
What is a tensor?
Simply put, a tensor is a generalization of vectors and matrices to higher dimensions. Just as a vector can be written as a matrix with a single column, and a matrix can be viewed as a collection of column vectors placed side by side, tensors can be thought of as “higher-dimensional matrices” with an arbitrary number of dimensions.
A vector is a one-dimensional array of numbers, while a matrix is a two-dimensional array of numbers. A tensor is an n-dimensional array of numbers, where n is any non-negative integer.
One way to think about tensors is that they are like multidimensional arrays. In fact, in many programming languages such as Python, they are actually implemented as arrays.
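As a quick illustration, here is a minimal sketch of tensors of different ranks written as NumPy arrays (the values and sizes are made up purely for demonstration):

```python
import numpy as np

# A scalar: a single number (rank-0 tensor)
scalar = np.array(3.0)

# A vector: a one-dimensional array (rank-1 tensor)
vector = np.array([1.0, 2.0, 3.0])

# A matrix: a two-dimensional array (rank-2 tensor)
matrix = np.array([[1.0, 2.0],
                   [3.0, 4.0]])

# A rank-3 tensor, e.g. a tiny RGB image (height, width, channels)
tensor3 = np.zeros((4, 4, 3))

print(scalar.ndim, vector.ndim, matrix.ndim, tensor3.ndim)  # 0 1 2 3
```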
However, tensors have some important differences from arrays. One key difference is that tensors can be used in conjunction with linear algebra, which is a powerful set of tools for mathematical manipulation.
Linear algebra is a branch of mathematics that deals with vector spaces and the properties of linear transformations between those spaces. It’s a very general area of mathematics, and it’s useful in physics, engineering, and machine learning.
Tensors are also related to another area of mathematics called differential geometry, which is concerned with the study of shapes and curves in higher-dimensional space. Differential geometry is used in computer graphics and robotics.
Tensors are also useful in machine learning because they can be used to represent data sets with arbitrary numbers of dimensions. For example, a data set might have two dimensions (height and width), three dimensions (height, width, and depth), or more.
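For instance, a whole batch of images is commonly stored as a single four-dimensional tensor. A minimal NumPy sketch (the sizes and the axis order are just one common convention, not the only one):

```python
import numpy as np

# A hypothetical data set of 32 RGB images, each 64x64 pixels,
# stored as one tensor with shape (batch, height, width, channels)
images = np.random.rand(32, 64, 64, 3)

print(images.shape)  # (32, 64, 64, 3)
print(images.ndim)   # 4 dimensions
```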
Linear algebra review
To fully understand tensors, you need to know a little bit about linear algebra. If you’re not familiar with linear algebra, don’t worry, you can still follow along with this article.
A vector is a one-dimensional array of numbers and it can be represented as a column matrix:
$$\vec{v} = \begin{bmatrix} v_0 \\ v_1 \\ \vdots \\ v_n \end{bmatrix}$$
A matrix is a two-dimensional array of numbers and it can be represented as a rectangular array:
$$M = \begin{bmatrix} m_{00} & \cdots & m_{0n} \\ \vdots & \ddots & \vdots \\ m_{n0} & \cdots & m_{nn} \end{bmatrix}$$
In deep learning, we often work with higher-dimensional arrays called tensors. A tensor is simply an n-dimensional array of numbers. For example, a vector is a one-dimensional tensor, a matrix is a two-dimensional tensor, and an image is a three-dimensional tensor (width, height, and depth, where depth is typically the number of color channels).
A scalar is a tensor with a single element and no indices, i.e. just a number such as $s = 0$. By contrast, $v = [0]$ is a one-element vector and $M = [[0]]$ is a $1 \times 1$ matrix: they hold the same value but have different numbers of indices.
A vector can be represented as a matrix with one row or one column. For example, the following are equivalent:
$$\vec{v} = \begin{bmatrix} v_0 \\ v_1 \\ \vdots \\ v_n \end{bmatrix}$$
$$\vec{v} = \begin{bmatrix} v_0 & v_1 & \cdots & v_n \end{bmatrix}$$
Similarly, a matrix can be represented as a tensor with two indices. For example, the following two ways of writing the same $2 \times 2$ matrix are equivalent: a rectangular grid indexed by row and column, or a list of its rows:
$$M = \begin{bmatrix} m_{00} & m_{01} \\ m_{10} & m_{11} \end{bmatrix}$$
$$M = \begin{bmatrix} [m_{00}, m_{01}] \\ [m_{10}, m_{11}] \end{bmatrix}$$
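In code, these different representations correspond to different shapes of the same data. A small NumPy sketch (illustrative values only):

```python
import numpy as np

v = np.array([1, 2, 3])   # a plain vector, shape (3,)
col = v.reshape(3, 1)     # the same data as a column matrix, shape (3, 1)
row = v.reshape(1, 3)     # the same data as a row matrix, shape (1, 3)

# A matrix is indexed by two indices (row, column) ...
M = np.array([[1, 2],
              [3, 4]])
print(M[1, 0])            # 3

# ... and can also be viewed as a flat list of its entries
print(M.flatten())        # [1 2 3 4]
```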
An image can be represented as a tensor with three indices (width, height, and depth). For example, writing out one depth slice, the following are equivalent:
$$I = \begin{bmatrix} i_{000} & i_{001} & i_{002} & \cdots \\ i_{010} & i_{011} & i_{012} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}$$
$$I = \begin{bmatrix} [i_{000}, i_{001}, i_{002}, \cdots] \\ [i_{010}, i_{011}, i_{012}, \cdots] \\ \vdots \end{bmatrix}$$
A video can be represented as a tensor with four indices (width, height, depth, and time). For example, writing out one slice, the following are equivalent:
$$V = \begin{bmatrix} v_{0000} & v_{0001} & v_{0002} & \cdots \\ v_{0010} & v_{0011} & v_{0012} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}$$
$$V = \begin{bmatrix} [v_{0000}, v_{0001}, v_{0002}, \cdots] \\ [v_{0010}, v_{0011}, v_{0012}, \cdots] \\ \vdots \end{bmatrix}$$
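In practice we rarely write such tensors out entry by entry; we just work with their shapes. A small sketch (the sizes are arbitrary, and the axis order is just one common convention):

```python
import numpy as np

# An RGB image: (height, width, channels)
image = np.zeros((480, 640, 3))

# A video: (frames, height, width, channels) -- the extra index is time
video = np.zeros((100, 480, 640, 3))

print(image.ndim, video.ndim)  # 3 4
print(video[0].shape)          # one frame: (480, 640, 3)
```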
A tensor is a generalization of a matrix to an arbitrary number of indices. A scalar is a tensor with zero indices. A vector is a tensor with one index. A matrix is a tensor with two indices. And so on.
So, what's the advantage of using tensors over matrices?
In general, the more indices a tensor has, the more degrees of freedom it represents. This means that tensors can compactly represent very high-dimensional data.
For example, an image with a million pixels can be represented as a tensor with just two indices, width and height (or three, if we include the color channels). By contrast, if we flattened the image into a plain list of numbers, it would have a million entries and we would lose all of its spatial structure.
Tensors also make many operations easier to perform than if we were working with matrices alone. In particular, tensors are easily manipulated using automatic differentiation software like TensorFlow and PyTorch, which can automatically compute the gradient of one tensor (typically a scalar loss) with respect to any tensor it depends on.
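Here is a minimal PyTorch sketch of that idea: we build a scalar from a tensor and let autograd compute the gradient (the numbers are made up for illustration):

```python
import torch

# A tensor that we want gradients with respect to
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# A scalar computed from x
y = (x ** 2).sum()

# Automatic differentiation: dy/dx = 2 * x
y.backward()
print(x.grad)  # tensor([2., 4., 6.])
```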
So what does all this mean for deep learning?
Deep learning models are often described as being composed of layers. Each layer takes some input data, transforms it in some way, and outputs transformed data.
The most common type of layer is the fully connected layer, which takes a vector as input and outputs a new vector. But there are many other types of layers, including convolutional layers, pooling layers, and recurrent layers.
Layers are usually parameterized by tensors. For example, a fully connected layer is typically parameterized by two tensors: a weight matrix and a bias vector. When we apply this layer to an input tensor, we multiply the input by the weight matrix and add the bias vector.
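A minimal sketch of a fully connected layer written directly with tensors (PyTorch, with made-up sizes):

```python
import torch

# The layer is parameterized by a weight matrix and a bias vector
W = torch.randn(4, 3)   # maps 3 input features to 4 output features
b = torch.randn(4)

x = torch.randn(3)      # input vector

# Apply the layer: multiply the input by the weight matrix and add the bias
y = W @ x + b
print(y.shape)          # torch.Size([4])
```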
Tensors are also used to represent the weights of neural networks. A weight tensor is simply a tensor that is used as a parameter in a layer. When we train a neural network, we optimize the values of these weight tensors to minimize a loss function.
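A tiny sketch of what optimizing the weight tensors looks like in PyTorch, using one gradient-descent step on a made-up squared-error loss:

```python
import torch

W = torch.randn(4, 3, requires_grad=True)   # weight tensor to optimize
b = torch.zeros(4, requires_grad=True)      # bias tensor to optimize

x = torch.randn(3)                          # one input example
target = torch.randn(4)                     # its desired output

# Forward pass and loss
loss = ((W @ x + b - target) ** 2).mean()

# Backward pass computes gradients of the loss w.r.t. W and b
loss.backward()

# One step of gradient descent on the weight tensors
with torch.no_grad():
    W -= 0.1 * W.grad
    b -= 0.1 * b.grad
```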
To summarize, tensors are data structures that you can think of as multi-dimensional arrays. They’re used extensively in linear algebra, and they’re what make deep learning possible. If you understand how they work, you’ll be well on your way to understanding deep learning!