Linear Algebra: Vectors, Matrices and their properties
February 02, 2018
Introduction
Large datasets often comprise hundreds to millions of individual data items. It is easier to work with this data and operate on it when it is represented in the form of vectors and matrices. Linear algebra is the branch of mathematics that deals with vectors and operations on them, and it is thus an important prerequisite for machine learning and data processing algorithms.
This tutorial covers the basics of vectors and matrices, as well as the concepts that are required for data science and machine learning. It also introduces you to terminology such as “dot product” and “trace of a matrix”.
Vectors and their properties
What is a Vector?
A vector can be thought of as an array of numbers in which the order of the numbers also matters. Vectors are typically represented by a lowercase bold letter such as x. The individual numbers are denoted by writing the vector name with a subscript indicating the position of the member: $x_1$ is the first number, $x_2$ is the second number, and so on. If we want to write the vector with its members explicitly, we enclose the individual elements in square brackets:
$$ \textbf{x}= \begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{bmatrix} $$
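In code, a vector maps naturally to a one-dimensional array. The short snippets in this tutorial use Python with NumPy as a running illustration; this is a choice of tooling on our part, not something the mathematics requires.

```python
import numpy as np

# A vector is an ordered array of numbers.
x = np.array([1, 2, 3])

print(x[0])     # 1 -- indices encode the order of the elements
print(x.shape)  # (3,) -- a vector with 3 components
```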
Vector Addition and Multiplication

Addition of two vectors is performed by adding the corresponding elements of each vector. For example,
$$ \textbf{x}=\begin{bmatrix} 1 \\ 2 \end{bmatrix} , \textbf{y}=\begin{bmatrix} 3 \\ 4 \end{bmatrix} $$
$$ \textbf{x}+\textbf{y}= \begin{bmatrix} 1 \\ 2 \end{bmatrix} + \begin{bmatrix} 3 \\ 4 \end{bmatrix} = \begin{bmatrix} 1+3 \\ 2+4 \end{bmatrix}=\begin{bmatrix} 4 \\ 6 \end{bmatrix} $$

When a vector is multiplied by a scalar, each element gets multiplied by the scalar. For example:
$$ 2\textbf{x}=2\begin{bmatrix} 1 \\ 2 \end{bmatrix}=\begin{bmatrix} 2*1 \\ 2*2 \end{bmatrix}=\begin{bmatrix} 2\\ 4 \end{bmatrix} $$
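Both operations are element-wise, which is exactly how NumPy implements them; a quick sketch of the two examples above:

```python
import numpy as np

x = np.array([1, 2])
y = np.array([3, 4])

print(x + y)  # [4 6] -- element-wise addition
print(2 * x)  # [2 4] -- each element multiplied by the scalar
```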
Norm of a Vector

The norm of a vector x, denoted as ||x||, is a measure of the vector’s magnitude (specifically, the form below is the Euclidean norm). Mathematically,
$$ ||\textbf{x}|| = \sqrt{\sum_{i=1}^{n}x_i^2} = \sqrt {x_1^2 + x_2^2 + x_3^2 + \ldots + x_n^2} $$

For example, if we have the vector x:
$$ \textbf{x}=\begin{bmatrix} 1 \\ 2 \end{bmatrix} $$

then ||x|| is equal to:
$$ ||\textbf{x}|| = \sqrt{1^2+2^2} = \sqrt{5} \approx 2.236 $$
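The same computation in NumPy, both from the definition and with the built-in helper:

```python
import numpy as np

x = np.array([1, 2])

# Norm from the definition: square root of the sum of squares
print(np.sqrt(np.sum(x ** 2)))  # 2.2360679...

# NumPy's built-in Euclidean norm
print(np.linalg.norm(x))        # 2.2360679...
```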
Dot product of two vectors

The scalar product, or dot product, of two vectors is the sum of the products of the corresponding components of the two vectors. If we have two vectors x and y, the dot product is defined as:
$$ \textbf{x}\cdot\textbf{y} = x_{1}y_{1} + x_{2}y_{2} + \ldots + x_{n}y_{n} $$

Example:
$$ \textbf{x}=\begin{bmatrix} 1 \\ 2 \end{bmatrix}, \textbf{y}=\begin{bmatrix} -2 \\ 3 \end{bmatrix} $$
$$ \textbf{x}\cdot\textbf{y} = x_{1}y_{1}+x_{2}y_{2} =(1 \times -2)+(2 \times 3) = -2 + 6 = 4 $$
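A NumPy check of the example, computed two equivalent ways:

```python
import numpy as np

x = np.array([1, 2])
y = np.array([-2, 3])

# Sum of the element-wise products
print(np.sum(x * y))  # 4

# NumPy's built-in dot product
print(np.dot(x, y))   # 4
```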
Relation between norm and dot product

From the definitions of the dot product and the norm, it is easy to deduce that the dot product of a vector with itself is equal to the norm squared. That is,
$$ \textbf{x}\cdot\textbf{x} = ||\textbf{x}||^2 $$
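A quick numerical sanity check of this identity (expect a tiny floating-point discrepancy from the square root):

```python
import numpy as np

x = np.array([1, 2])

print(np.dot(x, x))            # 5 -- exact, integer arithmetic
print(np.linalg.norm(x) ** 2)  # ~5.0 -- up to floating-point rounding
```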
Orthogonal and Orthonormal Vectors

Two vectors a and b are said to be orthogonal to each other if their dot product is 0, i.e.
$$ \textbf{a} \cdot \textbf{b} = 0 $$

If both orthogonal vectors also have unit norm (that is, if their norm is 1), then they are called orthonormal vectors.
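A small sketch: the two vectors below are orthogonal, and dividing each by its norm makes the pair orthonormal.

```python
import numpy as np

a = np.array([2, 3])
b = np.array([-3, 2])
print(np.dot(a, b))  # 0 -- a and b are orthogonal

# Dividing each vector by its norm gives an orthonormal pair
a_hat = a / np.linalg.norm(a)
b_hat = b / np.linalg.norm(b)
print(np.linalg.norm(a_hat))  # 1.0 -- unit norm
print(np.dot(a_hat, b_hat))   # 0.0 -- still orthogonal
```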
Linear Independence of vectors
We call a set of vectors $(\textbf{v}_1, \textbf{v}_2, \ldots, \textbf{v}_n)$ linearly independent if no vector in the set can be represented as a linear combination (using only scalar multiplication and vector addition) of the other vectors. If some vector can be represented in that way, the vectors are called linearly dependent. For example, let’s say
$$ \textbf{v}_1=\begin{bmatrix} 1 \\ 2 \end{bmatrix} , \textbf{v}_2=\begin{bmatrix} -1 \\ 2 \end{bmatrix} , \textbf{v}_3=\begin{bmatrix} 1 \\ 0 \end{bmatrix} $$

The vectors $\textbf{v}_1$, $\textbf{v}_2$, $\textbf{v}_3$ given above are linearly dependent, as $\textbf{v}_1$ can be represented as a linear combination of $\textbf{v}_2$ and $\textbf{v}_3$ in the following way:
$$ \textbf{v}_1 = \textbf{v}_2+2*\textbf{v}_3 $$
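One way to check this numerically: verify the combination directly, or compare the rank of the matrix whose columns are the vectors against the number of vectors (a rank below the count signals dependence).

```python
import numpy as np

v1 = np.array([1, 2])
v2 = np.array([-1, 2])
v3 = np.array([1, 0])

# Verify the linear combination v1 = v2 + 2*v3
print(np.array_equal(v1, v2 + 2 * v3))  # True

# Rank test: 3 vectors but rank 2, so the set is linearly dependent
M = np.column_stack([v1, v2, v3])
print(np.linalg.matrix_rank(M))  # 2
```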
Matrices and their properties

Matrix
A matrix is a two-dimensional array of numbers. Matrices are generally represented by an uppercase bold letter such as A. Since a matrix is two-dimensional, each element is denoted by a lowercase letter with two indices, such as $a_{ij}$, where i represents the row and j represents the column. A representation of an $m \times n$ matrix is shown below,
$$ \textbf{A}= \begin{bmatrix} a_{11} & a_{12} & \ldots & a_{1n}\\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{bmatrix} $$

For example, W below is a matrix of dimensions $2 \times 3$:
$$ \textbf{W}= \begin{bmatrix} 3 & 7 & 8 \\ 5 & 1 & 7 \end{bmatrix} $$

Relation between vectors and matrix
A vector of dimension n can be viewed as a matrix with n rows and 1 column. A matrix of dimensions $m \times n$ can be thought of as being composed of n column vectors or m row vectors.
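In NumPy terms, a matrix is a two-dimensional array, and its rows and columns can be sliced out as vectors; a brief sketch using the W above:

```python
import numpy as np

# The 2x3 matrix W: 2 rows, 3 columns
W = np.array([[3, 7, 8],
              [5, 1, 7]])
print(W.shape)  # (2, 3)
print(W[0, 1])  # 7 -- the element in row 0, column 1

print(W[0, :])  # [3 7 8] -- first row vector
print(W[:, 0])  # [3 5]   -- first column vector
```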
Matrix multiplication
The matrix product of two matrices A and B, denoted by C, has elements $c_{ij}$, where $c_{ij}$ is the dot product of the ith row of A with the jth column of B. Written explicitly,
$$ c_{ij}= \sum_{k=1}^{n}a_{ik}*b_{kj} = a_{i1}*b_{1j} + a_{i2}*b_{2j} + \ldots + a_{in}*b_{nj} $$

Here n is the number of columns of A (equivalently, the number of rows of B). For two matrices to be compatible for multiplication, the number of columns of the first matrix has to be equal to the number of rows of the second matrix (this ensures that the dot product is between two vectors of the same dimensionality). That is, if A is an $m_1 \times n_1$ matrix and B is an $m_2 \times n_2$ matrix, then $\textbf{A} \times \textbf{B}$ is legal only if $n_1 = m_2$. The resulting product matrix has dimensions $m_1 \times n_2$.
Let’s look at an example to see how matrices are multiplied,
$$ A = \begin{bmatrix} 1&2\\ 3&4\\ 5&6 \end{bmatrix}, B=\begin{bmatrix} 3&2&1&2\\ 2&1&4&5 \end{bmatrix} $$

Multiplying A and B is legal, since the number of columns in A is equal to the number of rows in B (A has 2 columns and B has 2 rows).
$$ C=A*B = \begin{bmatrix} 1&2\\ 3&4\\ 5&6 \end{bmatrix}*\begin{bmatrix} 3&2&1&2\\ 2&1&4&5 \end{bmatrix} $$
$$ = \begin{bmatrix} 1*3+2*2&1*2+2*1&1*1+2*4&1*2+2*5\\ 3*3+4*2&3*2+4*1&3*1+4*4&3*2+4*5\\ 5*3+6*2&5*2+6*1&5*1+6*4&5*2+6*5 \end{bmatrix} $$
$$ = \begin{bmatrix} 7&4&9&12\\ 17&10&19&26\\ 27&16&29&40 \end{bmatrix} $$
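The same product in NumPy, where the @ operator performs matrix multiplication:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])        # 3x2
B = np.array([[3, 2, 1, 2],
              [2, 1, 4, 5]])  # 2x4

C = A @ B                     # legal: A has 2 columns, B has 2 rows
print(C.shape)                # (3, 4)
print(C)
# [[ 7  4  9 12]
#  [17 10 19 26]
#  [27 16 29 40]]
```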
Diagonal Matrix

A square matrix whose non-diagonal elements are zero is called a diagonal matrix. For example:
$$ \begin{bmatrix} 5&0&0\\ 0&10&0\\0&0&2 \end{bmatrix} \text{is a diagonal matrix} $$

Identity Matrix
A square matrix whose non-diagonal elements are zero and whose diagonal elements are all 1 is called an identity matrix. For example:
$$ \textbf{I} = \begin{bmatrix} 1&0&0\\ 0&1&0\\0&0&1 \end{bmatrix} \text{is an identity matrix} $$

In the context of linear algebra, the letter I is used to denote an identity matrix.
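NumPy has constructors for both of these special matrices; a short sketch, including the defining property of the identity:

```python
import numpy as np

# Diagonal matrix built from its diagonal entries
D = np.diag([5, 10, 2])

# 3x3 identity matrix
I = np.eye(3)

# Multiplying by the identity leaves a matrix unchanged
print(np.array_equal(D @ I, D))  # True
```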
Transpose of a Matrix
The transpose of a matrix X, denoted by $X^T$, is the result of flipping the rows and columns of X. When we take the transpose, the element at position (i, j) moves to position (j, i).
For example, if
$$ X=\begin{bmatrix} 1&2&3\\ 4&5&6 \end{bmatrix} $$

then the transpose of X is given by,
$$ X^T=\begin{bmatrix} 1&4\\ 2&5\\ 3&6 \end{bmatrix} $$
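In NumPy the transpose is available as the .T attribute; note how the shape flips from (2, 3) to (3, 2):

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])

print(X.T)
# [[1 4]
#  [2 5]
#  [3 6]]

print(X.shape, X.T.shape)  # (2, 3) (3, 2)
```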
Inverse of a Matrix

The inverse of a square matrix X, denoted by $X^{-1}$, is the special matrix whose product with X is the identity matrix:
$$ XX^{-1} = I = X^{-1}X $$

Above, I is an identity matrix, and X and $X^{-1}$ are inverses of each other. An example of an inverse matrix is shown below:
$$ X = \begin{bmatrix} 3 & 1 \\ 4 & 2 \end{bmatrix}, X^{-1} = \begin{bmatrix} 1 & -\frac{1}{2} \\ -2 & \frac{3}{2} \end{bmatrix} $$

These two matrices are inverses of each other. When we multiply them, we get an identity matrix.
$$ X X^{-1} = \begin{bmatrix} 3 & 1 \\ 4 & 2 \end{bmatrix}\begin{bmatrix} 1 & -\frac{1}{2} \\ -2 & \frac{3}{2} \end{bmatrix} $$
$$ =\begin{bmatrix} 3-2 & -\frac{3}{2}+ \frac{3}{2} \\ 4-4 & -2+3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I $$

We get the same result when we compute the product $X^{-1}X$.
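NumPy computes inverses with np.linalg.inv; a quick check of the example (np.allclose is used because inverses are computed in floating point):

```python
import numpy as np

X = np.array([[3., 1.],
              [4., 2.]])

X_inv = np.linalg.inv(X)
print(X_inv)
# [[ 1.  -0.5]
#  [-2.   1.5]]

# Both products recover the identity, up to floating-point rounding
print(np.allclose(X @ X_inv, np.eye(2)))  # True
print(np.allclose(X_inv @ X, np.eye(2)))  # True
```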
Matrix Eigenvalues
Let A be a square matrix, v a nonzero vector, and λ a scalar such that $A\textbf{v} = \lambda\textbf{v}$. Then λ is called the eigenvalue associated with the eigenvector v of A. As an example, you can verify that $A\textbf{v} = \lambda\textbf{v}$ holds for the values below:
$$ A = \begin{bmatrix} 3&0 \\ 2&-1 \end{bmatrix}, \textbf{v} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}, \lambda=3 $$
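The verification in NumPy, plus the library's eigen-solver for comparison:

```python
import numpy as np

A = np.array([[3, 0],
              [2, -1]])
v = np.array([2, 1])

# Check A v = lambda v for lambda = 3
print(A @ v)  # [6 3], which equals 3 * [2 1]

# np.linalg.eig returns all eigenvalues (and eigenvectors) of A
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)  # 3. and -1. (order may vary)
```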