Matrix Inverse
Linear system
We have matrix $\boldsymbol{A}: m \times n$, and vectors $\boldsymbol{b}: m \times 1$ and $\boldsymbol{x}: n \times 1$ where $\boldsymbol{b}$ is given and $\boldsymbol{x}$ is unknown. We use the equation
$$ \boldsymbol{Ax} = \boldsymbol{b} $$
We’re interested in when a solution $\boldsymbol{x}$ exists, and how to find it.
A linear system is consistent when there exists at least one $\boldsymbol{x}$ satisfying the equation. We can see that the following statements are equivalent:

- A linear system is consistent.
- $\boldsymbol{b} \in \mathcal{C}(\boldsymbol{A})$.
- $\mathcal{C}([\boldsymbol{A}, \boldsymbol{b}]) = \mathcal{C}(\boldsymbol{A})$.
- $r([\boldsymbol{A}, \boldsymbol{b}]) = r(\boldsymbol{A})$.
If $\boldsymbol{A}$ has full row rank $m$, then $\mathcal{C}(\boldsymbol{A}) = \mathbb{R}^m$, and $\boldsymbol{Ax} = \boldsymbol{b}$ is consistent for any $\boldsymbol{b}$.
The next fact is important in statistical contexts. The linear system $\boldsymbol{A}'\boldsymbol{Ax} = \boldsymbol{A}'\boldsymbol{b}$ is consistent for any $\boldsymbol{A}$, $\boldsymbol{b}$. This is because $\boldsymbol{A}'\boldsymbol{b} \in \mathcal{C}(\boldsymbol{A}') = \mathcal{C}(\boldsymbol{A}'\boldsymbol{A})$.
If $\boldsymbol{A}$ has full column rank, $\boldsymbol{A}'\boldsymbol{A}$ is nonsingular, and the solution $\boldsymbol{x}$ is unique. This is a standard assumption in our regression analyses: it says there is no redundancy among the predictor variables.
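As a quick numerical illustration (a minimal NumPy sketch; the matrix `A` and vector `b` below are made-up examples), when $\boldsymbol{A}$ has full column rank the normal equations $\boldsymbol{A}'\boldsymbol{Ax} = \boldsymbol{A}'\boldsymbol{b}$ have a unique solution, which agrees with the least-squares solution:

```python
import numpy as np

# A made-up 4x3 design matrix with full column rank.
A = np.array([[1.0, 2.0, 0.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0],
              [1.0, 3.0, 2.0]])
b = np.array([1.0, 2.0, 0.0, 1.0])

# Since A has full column rank, A'A is nonsingular and the
# normal equations A'A x = A'b have a unique solution.
AtA = A.T @ A
Atb = A.T @ b
x = np.linalg.solve(AtA, Atb)

# The same solution obtained directly via least squares.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_ls))
```

Solving the normal equations directly and calling `lstsq` give the same $\boldsymbol{x}$ here precisely because $\boldsymbol{A}'\boldsymbol{A}$ is nonsingular.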
Matrix inverse
The inverse of a matrix only makes sense when the matrix is square. Before talking about it, we introduce two other concepts for nonsquare matrices.
Left and right inverses
If $\boldsymbol{A}: m \times n$, a right inverse $\boldsymbol{R}$ is an $n \times m$ matrix such that
$$ \boldsymbol{AR} = \boldsymbol{I}_m $$
The right inverse exists when $\boldsymbol{A}$ has full row rank, i.e. $r(\boldsymbol{A}) = m$ ($m \leq n$). This is because if $r(\boldsymbol{A}) = m$, then $\boldsymbol{Ax} = \boldsymbol{b}$ is consistent for any $\boldsymbol{b}$. Let
$$ \boldsymbol{b}_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}_{m \times 1}, \boldsymbol{b}_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \cdots, \boldsymbol{b}_m = \begin{pmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix} $$
Let $\boldsymbol{x}_1$ be a solution of $\boldsymbol{Ax} = \boldsymbol{b}_1$, and similarly obtain $\boldsymbol{x}_2, \cdots, \boldsymbol{x}_m$. If we put $\boldsymbol{x}_1, \cdots, \boldsymbol{x}_m$ into an $n \times m$ matrix
$$ \boldsymbol{R} = [\boldsymbol{x}_1, \cdots, \boldsymbol{x}_m], $$
we have $\boldsymbol{AR} = \boldsymbol{A}[\boldsymbol{x}_1, \cdots, \boldsymbol{x}_m] = \boldsymbol{I}_m$. The construction also shows that $\boldsymbol{R}$ is not unique: when $n > m$, each system $\boldsymbol{Ax} = \boldsymbol{b}_i$ has infinitely many solutions.
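The column-by-column construction above can be sketched in NumPy (the matrix `A` here is a made-up full-row-rank example; `lstsq` simply returns one particular solution of each system):

```python
import numpy as np

# A made-up 2x3 matrix with full row rank (r(A) = m = 2).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
m, n = A.shape

# Solve A x_i = b_i for each standard basis vector b_i and
# stack the solutions as columns: R = [x_1, ..., x_m].
R = np.column_stack([
    np.linalg.lstsq(A, e, rcond=None)[0]
    for e in np.eye(m)
])

print(np.allclose(A @ R, np.eye(m)))  # AR = I_m
```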
Similarly, a left inverse $\boldsymbol{L}: n \times m$ is a matrix that satisfies $\boldsymbol{LA} = \boldsymbol{I}_n$. $\boldsymbol{L}$ exists when $\boldsymbol{A}$ has full column rank, i.e. $r(\boldsymbol{A}) = n$ ($m \geq n$).
Inverse
If $\boldsymbol{A}$ is nonsingular (square with full rank), then its right inverse and left inverse both exist, and they are the same, which we call the inverse
of $\boldsymbol{A}$ denoted by $\boldsymbol{A}^{-1}$. $\boldsymbol{A}^{-1}$ is unique. To prove this, let $\boldsymbol{L}$ and $\boldsymbol{R}$ be any left and right inverses and note that
$$ \boldsymbol{L} = \boldsymbol{LI}_n = \boldsymbol{LAR} = \boldsymbol{I}_n \boldsymbol{R} = \boldsymbol{R} $$
Properties
If $\boldsymbol{A}$ has full column rank $n$ ($n \leq m$), then there exists $\boldsymbol{L}$ such that $\boldsymbol{LA} = \boldsymbol{I}_n$, and:

$\boldsymbol{B} = \boldsymbol{C} \Longleftrightarrow \boldsymbol{AB} = \boldsymbol{AC}$
This is because
$$ \boldsymbol{AB} = \boldsymbol{AC} \Rightarrow \boldsymbol{LAB} = \boldsymbol{LAC} \Rightarrow \boldsymbol{I}_n\boldsymbol{B} = \boldsymbol{I}_n \boldsymbol{C} \Rightarrow \boldsymbol{B} = \boldsymbol{C} $$

For any $\boldsymbol{B}$, $r(\boldsymbol{AB}) = r(\boldsymbol{B})$. Premultiplying by a full-column-rank matrix does not change the rank.

For any $\boldsymbol{B}$, $\mathcal{R}(\boldsymbol{AB}) = \mathcal{R}(\boldsymbol{B})$ when $\boldsymbol{A}$ has full column rank.
If $\boldsymbol{A}$ has full column rank $n$, then $\boldsymbol{A}'\boldsymbol{A}$ is nonsingular and vice versa. Similarly if $\boldsymbol{A}$ has full row rank, $\boldsymbol{AA}'$ is nonsingular.
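A small NumPy illustration of $r(\boldsymbol{A}'\boldsymbol{A}) = r(\boldsymbol{A})$ (the matrix is a made-up rank-deficient example, so $\boldsymbol{A}'\boldsymbol{A}$ is singular here):

```python
import numpy as np

# A made-up 5x3 matrix whose third column duplicates the first,
# so A does NOT have full column rank.
A = np.array([[1.0, 2.0, 1.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 2.0],
              [0.0, 3.0, 0.0]])

# r(A'A) = r(A) = 2 < 3, so A'A is singular for this A.
print(np.linalg.matrix_rank(A))        # 2
print(np.linalg.matrix_rank(A.T @ A))  # 2
```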
Inverse of nonsingular matrix
Suppose $\boldsymbol{A}$ is an $m \times m$ nonsingular matrix.

If $m = 2$, we have
$$ \boldsymbol{A} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \quad \boldsymbol{A}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} $$
Check that $\boldsymbol{AA}^{-1} = \boldsymbol{I}_2$ to make sure.
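For instance, a quick check in NumPy (the entries are arbitrary, chosen so that $ad - bc = 10 \neq 0$):

```python
import numpy as np

# Arbitrary 2x2 example with ad - bc = 4*6 - 7*2 = 10 != 0.
a, b, c, d = 4.0, 7.0, 2.0, 6.0
A = np.array([[a, b],
              [c, d]])

# Closed-form inverse: swap the diagonal, negate the
# off-diagonal, divide by the determinant ad - bc.
A_inv = np.array([[ d, -b],
                  [-c,  a]]) / (a * d - b * c)

print(np.allclose(A @ A_inv, np.eye(2)))     # A A^{-1} = I_2
print(np.allclose(A_inv, np.linalg.inv(A)))  # matches np.linalg.inv
```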

If $\boldsymbol{A} = \mathrm{diag}(d_1, \cdots, d_m)$ is diagonal and all diagonal entries $d_i$ are nonzero,
$$ \boldsymbol{A}^{-1} = \mathrm{diag}\left(d_1^{-1}, \cdots, d_m^{-1}\right) $$

$\boldsymbol{Ax} = \boldsymbol{b}$ has a unique solution $\boldsymbol{x} = \boldsymbol{A}^{-1}\boldsymbol{b}$.

$(k\boldsymbol{A})^{-1} = \frac{1}{k}\boldsymbol{A}^{-1}$ for any scalar $k \neq 0$.

$\boldsymbol{A}'$ is also nonsingular, and its inverse is
$$ (\boldsymbol{A}')^{-1} = (\boldsymbol{A}^{-1})' $$
This is because if we take the transpose of both sides of $\boldsymbol{AA}^{-1} = \boldsymbol{I}_m$, we get $(\boldsymbol{A}^{-1})'\boldsymbol{A}' = \boldsymbol{I}_m$. Since $\boldsymbol{A}'$ is nonsingular, $(\boldsymbol{A}^{-1})'$ is the inverse of $\boldsymbol{A}'$.

$\boldsymbol{A}^{-1}$ is also nonsingular, i.e. $r(\boldsymbol{A}^{-1}) = m$. This is because
$$ m = r(\boldsymbol{AA}^{-1}) \leq r(\boldsymbol{A}^{-1}) \leq m $$

If $\boldsymbol{B}$ is also an $m \times m$ nonsingular matrix, then $\boldsymbol{AB}$ is also nonsingular, and $(\boldsymbol{AB})^{-1} = \boldsymbol{B}^{-1}\boldsymbol{A}^{-1}$. To check this claim,
$$ (\boldsymbol{AB})(\boldsymbol{B}^{-1}\boldsymbol{A}^{-1}) = \boldsymbol{AI}_m\boldsymbol{A}^{-1} = \boldsymbol{AA}^{-1} = \boldsymbol{I}_m $$
This can be generalized to the product of $k$ nonsingular matrices: $(\boldsymbol{A}_1\boldsymbol{A}_2 \cdots \boldsymbol{A}_k)^{-1} = \boldsymbol{A}_k^{-1} \cdots \boldsymbol{A}_1^{-1}$. This is similar to the transpose of a product of matrices.
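A numerical sanity check that the order reverses (random matrices, which are nonsingular with probability one):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two random 3x3 matrices; almost surely nonsingular.
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

# The inverse of a product reverses the order: (AB)^{-1} = B^{-1} A^{-1}.
AB_inv = np.linalg.inv(A @ B)
print(np.allclose(AB_inv, np.linalg.inv(B) @ np.linalg.inv(A)))
```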

If $\boldsymbol{A}: m \times n$ and $r(\boldsymbol{A}) = r$; $\boldsymbol{B}: m \times m$ is nonsingular, then $r(\boldsymbol{BA}) = r$. Similarly if $\boldsymbol{C}: n \times n$ is nonsingular, $r(\boldsymbol{AC}) = r$. In fact, $\boldsymbol{B}$ only needs to have full column rank, and $\boldsymbol{C}$ only needs to have full row rank.
Orthonormal matrices
An orthogonal matrix, or orthonormal matrix to be more precise, is a square matrix whose columns and rows are orthonormal. One way to express this is
$$ \boldsymbol{AA}' = \boldsymbol{A}'\boldsymbol{A} = \boldsymbol{I}_n $$
By definition, the columns have to be orthonormal because
$$ \begin{gathered} \boldsymbol{a}_i'\boldsymbol{a}_i = \|\boldsymbol{a}_i\|^2 = 1 \\ \boldsymbol{a}_i'\boldsymbol{a}_j = 0 \quad (i \neq j) \end{gathered} $$
Equivalently, $\boldsymbol{A}$ is orthogonal if $\boldsymbol{A}^{-1} = \boldsymbol{A}'$.
Examples
In general, any $2 \times 2$ orthogonal matrix $\boldsymbol{P}$ with determinant $1$ takes the form
$$ \boldsymbol{P} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} $$
(when the determinant is $-1$, $\boldsymbol{P}$ is instead a reflection, obtained by negating one column).
If we think of the matrix as an operator on a vector $\boldsymbol{u}$, the vector $\boldsymbol{v} = \boldsymbol{Pu}$ can be interpreted as a counterclockwise rotation of $\boldsymbol{u}$ by the angle $\theta$. This is why $\boldsymbol{P}$ is sometimes called the rotation matrix. Note that this operation applies in higher dimensions as well. For example, in PCA we can enforce zero correlation by rotation.
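A minimal sketch of the rotation interpretation in NumPy (the angle and vector are arbitrary choices):

```python
import numpy as np

theta = np.pi / 2  # rotate by a quarter turn

# Rotation matrix for angle theta.
P = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

u = np.array([1.0, 0.0])
v = P @ u  # rotate u counterclockwise by theta

print(np.allclose(v, [0.0, 1.0]))       # e_1 rotates onto e_2
print(np.allclose(P.T @ P, np.eye(2)))  # P is orthogonal: P'P = I
```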
The identity matrix is also obviously orthogonal. We can also permute the columns and the resulting matrix is still orthogonal. Geometrically, this is permuting the coordinate axes.
Properties

The product of two orthogonal matrices is orthogonal.
To prove this, let $\boldsymbol{P}$ and $\boldsymbol{Q}$ be two orthogonal matrices. We then have
$$ (\boldsymbol{PQ})(\boldsymbol{PQ})' = \boldsymbol{PQ}\boldsymbol{Q}'\boldsymbol{P}' = \boldsymbol{PP}' = \boldsymbol{I} $$

For any vector $\boldsymbol{x} \in \mathbb{R}^n$, $\|\boldsymbol{Px}\|^2 = \|\boldsymbol{x}\|^2$. Premultiplying a vector by an orthogonal matrix doesn’t change the norm of the vector because
$$ (\boldsymbol{Px})'(\boldsymbol{Px}) = \boldsymbol{x}'\boldsymbol{P}'\boldsymbol{Px} = \boldsymbol{x}'\boldsymbol{x} = \|\boldsymbol{x}\|^2 $$

For any vectors $\boldsymbol{x}, \boldsymbol{y} \in \mathbb{R}^n$, their inner product is not changed by premultiplying by an orthogonal matrix:
$$ (\boldsymbol{Px})'(\boldsymbol{Py}) = \boldsymbol{x}'\boldsymbol{y} $$
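Both invariance properties are easy to verify numerically (a NumPy sketch; the orthogonal matrix is built from the QR decomposition of a random matrix, and the vectors are random):

```python
import numpy as np

rng = np.random.default_rng(1)

# Build an orthogonal matrix as the Q factor of a random square matrix.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))

x = rng.standard_normal(4)
y = rng.standard_normal(4)

# Norms and inner products are invariant under an orthogonal map.
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))
print(np.isclose((Q @ x) @ (Q @ y), x @ y))
```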

We can make similar arguments if $\boldsymbol{A}$ has full row rank; just switch the roles of the left and right inverses.