Suppose that A is a square matrix. We want to find a matrix (which we naturally denote by A^{-1})
of the same size, such that \( {\bf A}^{-1}{\bf A} = {\bf I} , \) the identity matrix.
Whatever A does or transforms, A^{-1} undoes or transforms back. Their product is the
identity matrix, which does nothing. The problem is that such an inverse matrix might not exist. The notion of the
inverse of a matrix first appears in an 1855 note of Arthur Cayley (1821--1895). In 1842, Cayley graduated from
Trinity College, Cambridge, England, but could not find a suitable teaching post. So for 14 years, from 1849 to 1863,
he worked as a lawyer. During this time, he published about 300 mathematical papers, and finally, in 1863, he
became a professor at Cambridge.
A square n × n matrix A is invertible, or nonsingular, if there exists a matrix A^{-1}
of the same size such that
\[
{\bf A}^{-1}{\bf A} = {\bf A}\,{\bf A}^{-1} = {\bf I}_n .
\]
A square matrix for which no inverse matrix exists is called singular.
Non-square matrices (m-by-n matrices for which m ≠ n) do
not have an inverse. However, in some cases such a matrix may have a
left inverse or right inverse.
If A is m-by-n and the rank of A is equal to n (full column rank), then
A has a left inverse: an n-by-m matrix B such that
BA = I_{n}. If A has rank m (full row rank), then it has a
right inverse: an n-by-m matrix B such that
AB = I_{m}.
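To make the left inverse concrete, here is a minimal Python sketch (the matrix A and the helper functions are our own illustrative choices, not from the text). For a full-column-rank A, one standard choice of left inverse is \( {\bf B} = \left( {\bf A}^{\mathrm T}{\bf A} \right)^{-1}{\bf A}^{\mathrm T} : \)

```python
from fractions import Fraction

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    # matrix product of X (p-by-q) and Y (q-by-r)
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inv2(M):
    # closed-form inverse of a 2-by-2 matrix
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

# a 3-by-2 matrix with linearly independent columns (our illustrative choice)
A = [[1, 0], [0, 1], [1, 1]]
At = transpose(A)
B = matmul(inv2(matmul(At, A)), At)   # left inverse (A^T A)^{-1} A^T
print(matmul(B, A))                   # B A = I_2
```

Note that BA = I_{2}, while AB is only a 3-by-3 projection, so B is not a two-sided inverse.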
Many other generalized inverses are known.
In other sections, we will work with the pseudoinverse, sometimes called the
Moore–Penrose inverse, when we discuss
least squares approximations and the singular value decomposition. There are also others, such as the
group inverse and the Drazin inverse.
Theorem: (Raymond Beauregard)
For any n × n matrices A and C over a field of
scalars, if \( {\bf A}\,{\bf C} = {\bf I} , \) then \( {\bf C}\,{\bf A} = {\bf I} . \)
Recall that if A is an n × n matrix, then its reduced row-echelon form
H is a
matrix of the same size with zeros in the pivot columns except for the pivots themselves,
which are equal to 1. It is achieved by applying elementary row operations
(row swapping, row addition, row scaling) to A. An elementary matrix is
one obtained by applying a single elementary row operation to the n × n
identity matrix I. Elementary matrices have inverses that are also
elementary matrices. Left multiplication of A by an elementary matrix
E effects the same row operation on A that was used to create
E. If P is a product of those elementary matrices that reduce
A to H, then P is invertible and PA = H.
Let H be the reduced row echelon form of A, and let P be
the product of those elementary matrices (in the appropriate order) that
reduce A to H. P is an invertible matrix such that
P A = H. Notice that H is the identity matrix if and only
if it has n pivots.
Beginning with AC = I, we left multiply this equation by P
obtaining \( {\bf P}\,{\bf A}\,{\bf C} = {\bf P} \)
or \( {\bf H}\,{\bf C} = {\bf P} . \) If H
is not the identity matrix, it must have a bottom row of zeros, forcing P = HC
to have a bottom row of zeros as well, which contradicts the
invertibility of P. Thus H = I, C = P, and
the equation \( {\bf P}\,{\bf A} = {\bf H} \) is
actually \( {\bf C}\,{\bf A} = {\bf I} . \)
This argument shows at once that (i) a matrix is invertible if and only if its
reduced row echelon form is the identity matrix, and (ii) the set of
invertible matrices is precisely the set of products of elementary matrices.
From Beauregard's theorem, it follows that the inverse matrix is unique. However, we can also show directly that a matrix
A cannot have two different inverses. Suppose the opposite is true, so there exist two matrices B and
C such that BA = I and AC = I. According to the identities
\[
{\bf B} = {\bf B}\,{\bf I} = {\bf B} \left( {\bf A}\,{\bf C} \right) = \left( {\bf B}\,{\bf A} \right) {\bf C} = {\bf I}\,{\bf C} = {\bf C} ,
\]
the two inverses coincide.
The formula \( \left( {\bf A}\,{\bf B} \right)^{-1} = {\bf B}^{-1}{\bf A}^{-1} \) confirms common sense: if one puts on socks and then sneakers, the sneakers are the first to be taken off.
Given A ∈ ℝ^{m×n}, the transpose of A = [a_{i,j}], denoted
A^{T} or just A', is an element of
ℝ^{n×m} defined by \( \left[ a_{i,j}
\right]^{\mathrm T} = \left[ a_{j,i} \right] . \)
Theorem: Let A be a rectangular m × n matrix with real entries. Then the product
\( {\bf A}^{\mathrm T}{\bf A} \) is an invertible n × n matrix if and only if
A has linearly independent columns (full column rank). The statement is also valid for complex-valued matrices,
but instead of \( {\bf A}^{\mathrm T}{\bf A} \) one should consider
\( {\bf A}^{\ast}{\bf A} , \) the product of the adjoint (conjugate transpose) of A with A.
▣
The product is a square matrix (n-by-n). For every matrix A, we are going to show that
\( {\bf A}^{\mathrm T}{\bf A} \) has the same kernel as A. When the columns of
A are linearly independent, its nullspace contains only the zero vector. Then
\( {\bf A}^{\mathrm T}{\bf A} , \) with this same nullspace, is invertible.
Let A be any matrix. If vector x is in its kernel, then Ax = 0. Multiplying by
\( {\bf A}^{\mathrm T} \) gives
\( {\bf A}^{\mathrm T}{\bf A} \,{\bf x} = {\bf 0} . \) So x is also in the
nullspace of \( {\bf A}^{\mathrm T}{\bf A} . \)
Now start with the kernel of \( {\bf A}^{\mathrm T}{\bf A} . \) From
\( {\bf A}^{\mathrm T}{\bf A} \,{\bf x} = {\bf 0} , \) we must prove that
Ax = 0. We multiply from the left by \( {\bf x}^{\mathrm T}: \)
\[
{\bf x}^{\mathrm T}{\bf A}^{\mathrm T}{\bf A} \,{\bf x} = \left( {\bf A} \,{\bf x} \right)^{\mathrm T}{\bf A} \,{\bf x} = \| {\bf A}\,{\bf x} \|^2 = 0 .
\]
The vector \( {\bf A} \,{\bf x} \) has length zero. Therefore,
\( {\bf A} \,{\bf x} = {\bf 0} . \) Every vector x in one nullspace is in the other nullspace.
Hence, if A has full column rank (linearly independent columns), then
\( {\bf A}^{\mathrm T}{\bf A} \) has only the zero vector in its nullspace, so it is invertible.
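A quick numerical check of the theorem (a sketch; the test matrices and function names are our own): when the columns of A are independent, the Gram matrix \( {\bf A}^{\mathrm T}{\bf A} \) has nonzero determinant, and when they are dependent, it is singular.

```python
def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def det2(M):
    # determinant of a 2-by-2 matrix
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def gram_det(A):
    # determinant of the 2-by-2 Gram matrix A^T A
    return det2(matmul(transpose(A), A))

full_rank = [[1, 0], [0, 1], [1, 1]]     # independent columns
deficient = [[1, 2], [2, 4], [3, 6]]     # second column = 2 * first column
print(gram_det(full_rank), gram_det(deficient))   # prints: 3 0
```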
A diagonal matrix has an inverse provided no diagonal entries are zero:
\[
\left[ \mbox{diag} \left( d_1 , d_2 , \ldots , d_n \right) \right]^{-1} = \mbox{diag} \left( d_1^{-1} , d_2^{-1} , \ldots , d_n^{-1} \right) .
\]
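A sketch in Python (the function name and example are ours): the inverse of a diagonal matrix is the diagonal matrix of reciprocals, and a single zero entry makes the matrix singular.

```python
from fractions import Fraction

def diag_inverse(d):
    # reciprocal of each diagonal entry; a zero entry means the matrix is singular
    if any(x == 0 for x in d):
        raise ValueError("singular: zero diagonal entry")
    return [Fraction(1) / Fraction(x) for x in d]

print(diag_inverse([2, -1, 4]))   # diagonal of the inverse: 1/2, -1, 1/4
```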
Now we answer a natural question: how does one find an inverse matrix? Two basic
methods are known for determining the inverse of a square (nonsingular)
matrix: the cofactor method and the Gauss--Jordan elimination procedure. We
discuss an iterative method in the Numerical Applications section. We
start with the former, which is the most numerically expensive. For completeness,
the cofactor matrices are defined again. We also need the definition of a minor, introduced by the English mathematician James Sylvester in 1850.
Given A = [a_{i,j}] ∈ ℂ^{n×n}, the minor M_{i,j} of the element a_{i,j} is the
determinant of the (n-1)×(n-1) matrix obtained by removing row i and column j from A.
For a given n × n matrix \( {\bf A} = \left[ a_{ij} \right] , \) the (i,j)-cofactor
of A is the number C_{ij} given by \( C_{ij} = (-1)^{i+j} \det{\bf A}_{i,j} , \)
where A_{i,j} is the (n-1) × (n-1) matrix obtained from A by deleting the i-th row and
j-th column.
The n × n matrix \( {\bf C} = \left[ C_{ij} \right] \) is called the cofactor matrix (or the matrix of cofactors);
its transpose \( {\bf C}^{\mathrm T} \) is called the adjugate of A, denoted \( \mbox{adj}\,{\bf A} . \)
▣
Our starting point is the cofactor expansion:
\[
\det {\bf A} = \sum a_{j,k}\,A_{j,k} ,
\]
where A_{j,k} is the cofactor of the element a_{j,k}, and the sum is either over j
(for any fixed value of k) or over k (for any fixed value of j). If we take the sum over j,
then
\[
\sum_j a_{j,k} A_{j,i} = \begin{cases} \det {\bf A} , & \ \mbox{ if } \ i=k , \\
0, & \ \mbox{ if } \ i \ne k \end{cases}
\]
since we get two identical columns when i ≠ k. Upon division by det A, we have
\[
\sum_j \left( \frac{A_{j,i}}{\det {\bf A}} \right) a_{j,k} = \delta_{i,k} = \begin{cases} 1 , & \ \mbox{ if } \ i=k , \\
0, & \ \mbox{ if } \ i \ne k . \end{cases}
\]
This scalar statement (which holds for \( 1 \le i \le n, \ 1 \le k \le n \) ) is equivalent, according to the definition of matrix multiplication, to the
matrix equation
\[
\frac{1}{\det {\bf A}} \left( \mbox{adj}\,{\bf A} \right) {\bf A} = {\bf I} \qquad \Longrightarrow \qquad {\bf A}^{-1} = \frac{1}{\det {\bf A}} \, \mbox{adj}\,{\bf A} .
\]
The cofactor formula for the inverse matrix allows us to make the following two statements.
Theorem: A square matrix A is invertible if and only if its determinant is
not zero: detA ≠ 0.
▣
Theorem: Let A be an invertible square matrix. Then the linear equation
Ax = b has a unique solution \( {\bf x} = {\bf A}^{-1} {\bf b} \)
for any vector b.
▣
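The cofactor formula can be turned into a short (though computationally expensive) routine. The following Python sketch (function names are our own) computes \( {\bf A}^{-1} = \mbox{adj}\,{\bf A} / \det {\bf A} \) with exact rational arithmetic:

```python
from fractions import Fraction

def det(M):
    # determinant by cofactor expansion along the first row
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def adjugate_inverse(M):
    n = len(M)
    d = Fraction(det(M))
    if d == 0:
        raise ValueError("singular matrix")
    # cofactor C_{ij} = (-1)^{i+j} det(M with row i and column j deleted)
    cof = [[(-1) ** (i + j) * det([r[:j] + r[j + 1:] for k, r in enumerate(M) if k != i])
            for j in range(n)] for i in range(n)]
    # transpose the cofactor matrix (the adjugate), then divide by det A
    return [[cof[j][i] / d for j in range(n)] for i in range(n)]

A = [[2, 1, 0], [1, 1, 0], [0, 0, 1]]
print(adjugate_inverse(A))
```

The recursive determinants make this approach factorially expensive as n grows, which is why elimination is preferred in practice.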
Now we compute the inverse matrix using the Gauss-Jordan elimination procedure.
If A is an n × n nonsingular matrix, then for any column vector b, which is n × 1 matrix, the
linear system Ax = b has a unique solution. Partition the identity matrix I_{n} into
its columns that we denote as \( {\bf e}_i , \ i=1,2,\ldots , n . \) Since the identity matrix
can be represented as a collection of column-vectors
\( {\bf I}_n = \left[ {\bf e}_1 \ {\bf e}_2 \ \cdots \ {\bf e}_n \right] , \) every linear equation
\( {\bf A}\,{\bf x} = {\bf e}_i \) has a unique solution, which we denote by \( {\bf c}_i , \) for each i. Hence, we have
\[
{\bf A} \left[ {\bf c}_1 \ {\bf c}_2 \ \cdots \ {\bf c}_n \right] = \left[ {\bf e}_1 \ {\bf e}_2 \ \cdots \ {\bf e}_n \right] = {\bf I}_n .
\]
Letting \( {\bf C} = \left[ {\bf c}_1 \ {\bf c}_2 \ \cdots \ {\bf c}_n \right] , \) we
have just shown that C is the inverse of A. To find C, we introduce the augmented matrix
\( \left[ {\bf A} \ {\bf I} \right] . \) Then apply the following algorithm.
Theorem: Inversion Algorithm To find the inverse of a square matrix A,
use two steps
Step 1: form the augmented matrix \( \left[ {\bf A} \ {\bf I} \right] . \)
Step 2: apply the Gauss--Jordan method to attempt to reduce \( \left[ {\bf A} \ {\bf I} \right] \)
to \( \left[ {\bf I} \ {\bf C} \right] . \) If the reduction can be carried out, then
A^{-1} = C. Otherwise, A^{-1} does not exist. ▣
A simple method for carrying out this algorithm is given in the following example, where
we find the inverse of a 3 × 3 matrix A using the Gauss--Jordan method applied to the augmented matrix
\( \left[ {\bf A} \ {\bf I} \right] . \) Of course, using Mathematica, it is an easy job:
However, we show all steps required to determine the inverse matrix without applying the standard commands. We start with
the augmented matrix, which we denote as B:
The next move is to multiply the second row by 7/2 and add it to the third row. This is equivalent to
left multiplication by the elementary matrix:
The forward elimination procedure finishes here, because Gauss is happy with this part, called the row echelon form.
The pivots 2, 1, 1/2 sit on the diagonal of the upper triangular submatrix U.
The contribution of the geodesist Wilhelm Jordan
is to continue the elimination. Jordan goes all the way to the reduced row echelon form: multiples of rows are added to other rows to
produce zeros above the pivots:
The last step is to divide each row (except the second) by its pivot. The new pivots are all ones, and we have reached I
in the first half of the matrix. The last three columns then provide the inverse matrix A^{-1}:
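Outside of Mathematica, the same algorithm is easy to code directly. Here is a minimal Python sketch (our own, with exact fractions and a different 2 × 2 test matrix) that reduces \( \left[ {\bf A} \ {\bf I} \right] \) to \( \left[ {\bf I} \ {\bf A}^{-1} \right] : \)

```python
from fractions import Fraction

def gauss_jordan_inverse(A):
    """Reduce [A | I] to [I | A^{-1}]; raise ValueError if A is singular."""
    n = len(A)
    # augmented matrix [A | I] with exact arithmetic
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # find a nonzero pivot in this column (swap rows if needed)
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]          # scale the pivot row so the pivot is 1
        for r in range(n):
            if r != col and M[r][col] != 0:       # clear the column above and below
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]                 # right half is A^{-1}

A = [[2, 1], [7, 4]]
print(gauss_jordan_inverse(A))   # the inverse is [[4, -1], [-7, 2]]
```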
Theorem: Equivalent Statements regarding a square n × n matrix A.
A is invertible.
A is row equivalent to the identity matrix I_{n}.
Ax = 0 has only the trivial (= zero) solution.
A is expressible as a product of elementary matrices.
Ax = b has a unique solution for each n-component column vector b (which is n × 1 matrix).
The span (= set of all linear combinations) of the column vectors of A is \( \mathbb{R}^n . \)
\( \det {\bf A} \ne 0. \)
The spectrum of A does not contain zero.
\( \left( {\bf A}^{-1} \right)^{\ast} = \left( {\bf A}^{\ast} \right)^{-1} \qquad
\Longrightarrow \qquad \left( {\bf A}^{-1} \right)^{\mathrm T} = \left( {\bf A}^{\mathrm T} \right)^{-1} . \)
This means that A is invertible if and only if its adjoint (or transpose) matrix is invertible. ▣
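The last identity is easy to check numerically. A small Python sketch (closed-form 2 × 2 inverse; the matrix is our own choice) confirms that the inverse of the transpose equals the transpose of the inverse:

```python
from fractions import Fraction

def inv2(M):
    # closed-form inverse of a 2-by-2 matrix
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def transpose(M):
    return [list(row) for row in zip(*M)]

A = [[2, 1], [7, 4]]
# inverse of the transpose equals transpose of the inverse
print(inv2(transpose(A)) == transpose(inv2(A)))   # True
```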
Inverse of partitioned matrices
Suppose we are given a square n×n matrix A, with n = r + s, that is partitioned as follows:
\[
{\bf A} = \begin{bmatrix} {\bf P} & {\bf Q} \\ {\bf R} & {\bf S} \end{bmatrix} ,
\]
and suppose its inverse is partitioned conformably:
\[
{\bf A}^{-1} = \begin{bmatrix} {\bf X} & {\bf Y} \\ {\bf Z} & {\bf W} \end{bmatrix} ,
\]
where the X_{r×r}, Y_{r×s}, Z_{s×r}, and W_{s×s}
matrices are to be found.
Observe that X, Y, Z, and W are of the same dimensions as P, Q, R, and S, respectively. Since A^{−1}A = I, the identity matrix, we have
\[
{\bf X}\,{\bf P} + {\bf Y}\,{\bf R} = {\bf I}_r , \qquad {\bf X}\,{\bf Q} + {\bf Y}\,{\bf S} = {\bf 0} , \qquad
{\bf Z}\,{\bf P} + {\bf W}\,{\bf R} = {\bf 0} , \qquad {\bf Z}\,{\bf Q} + {\bf W}\,{\bf S} = {\bf I}_s .
\]
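One standard way to find X, Y, Z, and W is via the Schur complement \( {\bf P} - {\bf Q}\,{\bf S}^{-1}{\bf R} \) (assuming both S and this complement are invertible). A Python sketch with scalar 1 × 1 blocks, our own illustrative choice, shows the resulting formulas:

```python
from fractions import Fraction

def block_inverse_1x1(P, Q, R, S):
    # inverse of [[P, Q], [R, S]] with scalar blocks, via the Schur complement
    S = Fraction(S)
    schur = Fraction(P) - Fraction(Q) * R / S     # P - Q S^{-1} R, assumed nonzero
    X = 1 / schur                                 # X = (P - Q S^{-1} R)^{-1}
    Y = -X * Q / S                                # Y = -X Q S^{-1}
    Z = -(R / S) * X                              # Z = -S^{-1} R X
    W = 1 / S + (R / S) * X * (Q / S)             # W = S^{-1} + S^{-1} R X Q S^{-1}
    return X, Y, Z, W

X, Y, Z, W = block_inverse_1x1(4, 3, 2, 1)
print(X, Y, Z, W)
```

For the matrix [[4, 3], [2, 1]] this reproduces the familiar 2 × 2 inverse \( \frac{1}{-2}\begin{bmatrix} 1 & -3 \\ -2 & 4 \end{bmatrix} . \)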