Matrix

A brief introduction to matrices.

11/10/2024 · 4 min read

This post is dedicated to building an intuitive understanding of matrices and their operations, including matrix multiplication, the determinant, and the inverse.

A matrix represents a unique linear transformation. A linear transformation is a transformation of a space that satisfies three key properties: 1) the origin remains fixed, 2) grid lines stay parallel, and 3) grid lines maintain equal spacing. Examples of linear transformations include rotation and scaling.
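To make this concrete, here is a minimal sketch (using numpy; the specific matrices and vectors are just illustrative choices, not from the figures in this post) checking the defining property of linearity for a rotation and a scaling: scaled sums of vectors map to the same scaled sums of their images, and the origin stays put.

```python
import numpy as np

# Rotation by 45 degrees and an axis-aligned scaling: both are linear transformations.
theta = np.pi / 4
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
scaling = np.array([[2.0, 0.0],
                    [0.0, 3.0]])

u = np.array([1.0, 2.0])
w = np.array([-3.0, 0.5])
a, b = 2.0, -1.5

for T in (rotation, scaling):
    # Linearity: T(a*u + b*w) == a*T(u) + b*T(w)
    assert np.allclose(T @ (a * u + b * w), a * (T @ u) + b * (T @ w))
    # The origin remains fixed.
    assert np.allclose(T @ np.zeros(2), np.zeros(2))
```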

An m*n matrix can be written as m row vectors or n column vectors. The column vectors of a matrix M represent the new basis vectors after the linear transformation. For example, consider the matrix M = [[1, 3], [-2, 0]]: its column vectors are [1, -2] and [3, 0], which means the unit vector i = [1, 0] lands on [1, -2] and the unit vector j = [0, 1] lands on [3, 0] after the linear transformation encoded by M. As mentioned in the previous blog, any vector v = [a, b] can be decomposed as v = a*i + b*j, where i and j are the basis unit vectors. After a linear transformation, this decomposition remains unchanged because the grid lines stay parallel and evenly spaced. Thus v' = a*i' + b*j', where v' is the transformed vector and i' and j' are the new basis vectors. This captures the essence of matrix-vector multiplication: M*v = a*M_c1 + b*M_c2, where M_ci is the ith column vector of M. We can think of matrix-matrix multiplication as performing multiple matrix-vector multiplications, one for each column of the right-hand matrix. The images below show the multiplication of the matrix M = [[1, 3], [-2, 0]] and the vector v = [-1, 2].
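For readers who prefer code to pictures, here is a small numpy sketch (an illustration, using the same M and v as above) verifying that M*v is exactly the combination of M's columns weighted by v's components.

```python
import numpy as np

M = np.array([[ 1, 3],
              [-2, 0]])
v = np.array([-1, 2])                         # v = -1*i + 2*j

# Linear combination of M's columns, weighted by v's components:
by_columns = v[0] * M[:, 0] + v[1] * M[:, 1]  # -1*[1, -2] + 2*[3, 0]
print(M @ v)                                  # [5 2]
print(by_columns)                             # [5 2] -- the same vector
```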

The determinant of a square matrix A, denoted det(A), can be thought of as a function that takes a square matrix as input and returns a scalar. It describes the change in the volume (or area, in 2D) formed by the new basis vectors compared with the original basis vectors. For example, on the left side of the image above, the area formed by the original basis vectors is a unit square with an area of 1, while the area formed by the new basis vectors is a parallelogram with an area of 6. Therefore, the determinant of the matrix M = [[1, 3], [-2, 0]] is 6, which is how much the area scales under the linear transformation encoded by M. A simple geometric argument shows that the determinant of a 2D matrix M = [[a, b], [c, d]] is ad - bc. In 3D, a 3*3 matrix maps the unit cube to a parallelepiped, and the determinant of M = [[a, b, c], [d, e, f], [g, h, i]] is a(ei - fh) - b(di - fg) + c(dh - eg). When the determinant of a square matrix is zero, the linear transformation reduces the dimensionality of the space. For example, a 2*2 matrix with a determinant of zero maps the 2D plane onto a line, collapsing the unit square's area to zero. Similarly, a 3*3 matrix with a determinant of zero squishes 3D space onto a plane or a line, producing zero volume. Note that the determinant is only defined for square matrices: a non-square matrix maps vectors from an n-dimensional space to a space of a different dimension, so there is no consistent, well-defined notion of how it scales volume or area, and the concept of a determinant doesn't apply.
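A quick numpy sketch (again just an illustration with the example matrix above, plus a made-up singular matrix) shows both cases: a non-zero determinant and a determinant of zero.

```python
import numpy as np

M = np.array([[ 1, 3],
              [-2, 0]])
print(np.linalg.det(M))      # 6.0 (up to floating-point error): ad - bc = 1*0 - 3*(-2)

# A matrix whose columns are collinear collapses the plane onto a line:
S = np.array([[1, 2],
              [2, 4]])
print(np.linalg.det(S))      # 0.0 -- the unit square's area collapses to zero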

The span of a set of vectors is the set of all possible vectors you can reach with linear combinations of those vectors. A linear combination involves only scalar multiplication and vector addition. For instance, if you have two 2D vectors that are not collinear (not lying on the same line), their span is the entire 2D plane. Conversely, if the two vectors are collinear, their span is the line they lie on. A set of vectors is said to be linearly dependent if at least one vector in the set can be expressed as a linear combination of the others. Equivalently, the set is linearly dependent if there exists a non-trivial linear combination (i.e., not all scalars are zero) that results in the zero vector. On the other hand, a set of vectors is linearly independent if no vector in the set can be written as a linear combination of the others. In this case, each vector in the set contributes a new dimension to the span.
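For two 2D vectors, this connects back to the determinant: stacking them as columns of a matrix, the determinant is non-zero exactly when the vectors are linearly independent. A small sketch (illustrative vectors of my choosing):

```python
import numpy as np

# Two 2D vectors are linearly independent exactly when the matrix built
# from them (as columns) has a non-zero determinant.
v1, v2 = np.array([1.0, -2.0]), np.array([3.0, 0.0])
w1, w2 = np.array([1.0, 2.0]),  np.array([2.0, 4.0])   # w2 = 2*w1, collinear

print(np.linalg.det(np.column_stack([v1, v2])))   # 6.0 -> independent, span is the whole plane
print(np.linalg.det(np.column_stack([w1, w2])))   # 0.0 -> dependent, span is only a line
```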

The column space of a matrix A is the span of A's column vectors.

The rank of a matrix A is the number of dimensions in the output of the linear transformation encoded by A, which is the same as the dimension of the column space of A. For example, rank 1 means the output of the linear transformation is a line, and rank 2 means the output is a plane. For a 2*2 matrix, the maximum possible rank is 2, which means the transformation spans the full 2D space and its determinant is non-zero. A rank of 1 means the 2D space collapses onto a line after the transformation. When the rank is as high as it can be, equal to the number of column vectors of the matrix, the matrix is said to be full rank.
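numpy exposes this directly through np.linalg.matrix_rank; here is a sketch reusing the two example matrices from above.

```python
import numpy as np

full_rank = np.array([[ 1, 3],
                      [-2, 0]])
rank_one  = np.array([[1, 2],
                      [2, 4]])    # second column is 2x the first

print(np.linalg.matrix_rank(full_rank))   # 2: the column space is the whole plane
print(np.linalg.matrix_rank(rank_one))    # 1: the plane collapses onto a line
```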

The inverse of a matrix A, denoted A⁻¹, encodes the linear transformation that reverses the linear transformation represented by A. This means applying A⁻¹ after A restores the original input, satisfying A⁻¹*A = I, where I is the identity matrix.
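A short numpy sketch (with the running example matrix) makes this concrete: applying A and then A⁻¹ returns the original vector, and A⁻¹*A is the identity.

```python
import numpy as np

A = np.array([[ 1.0, 3.0],
              [-2.0, 0.0]])
A_inv = np.linalg.inv(A)

v = np.array([-1.0, 2.0])
print(A_inv @ (A @ v))                     # [-1.  2.]: A⁻¹ undoes A
print(np.allclose(A_inv @ A, np.eye(2)))   # True: A⁻¹ * A = I
```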

A system of linear equations is a collection of two or more linear equations involving the same variables, where each equation involves only scaling and addition of the variables. Such a system can be rewritten as a matrix-vector multiplication, with the matrix A containing the constant coefficients and the vector x containing the variables, so that their product equals a vector of constants: A*x = v. The geometric interpretation is that we are searching for a vector x that lands on v after the linear transformation encoded by A. The nature of the solution depends on whether that transformation squishes the space into a lower dimension. 1) When A is a square matrix (the number of equations equals the number of variables) and its determinant is non-zero, A is full rank, the transformation doesn't change the dimension of the space, and there is exactly one x that lands on v: x = A⁻¹*v. 2) When the determinant of A is zero, A has no inverse, because the transformation squishes the space into a lower dimension and we can't reverse a lower dimension back to a higher one (e.g., we can't un-squish a line back into a 2D plane); in this case there are infinitely many solutions if v is in the column space of A, and no solution if it is not. 3) When A is a non-square matrix with rank(A) less than the number of columns, there are infinitely many solutions if v is in the column space of A, and no solution otherwise.
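As a final sketch (illustrative only, using the running example for case 1 and a made-up singular matrix for case 2), numpy's solver handles the unique-solution case directly and refuses the degenerate one.

```python
import numpy as np

# Case 1: square matrix, non-zero determinant -> exactly one solution.
A = np.array([[ 1.0, 3.0],
              [-2.0, 0.0]])
v = np.array([5.0, 2.0])
print(np.linalg.solve(A, v))     # [-1.  2.], the same as A⁻¹ @ v

# Case 2: square matrix, zero determinant -> no unique solution;
# numpy raises LinAlgError rather than picking one of the zero or
# infinitely many solutions.
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])
try:
    np.linalg.solve(S, np.array([1.0, 1.0]))
except np.linalg.LinAlgError:
    print("singular matrix: no unique solution")
```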