Determinants and Inverse Matrices

This page supports the main stream of the web site by reviewing some background information regarding determinants and inverse matrices. We also demonstrate the capabilities of Sage for this topic.

Determinants

The general definition of the determinant is difficult and counterintuitive, so we define it recursively. For a \( 1 \times 1 \) matrix that consists of one element, \( {\bf A} = [a] , \) its determinant is \( \det {\bf A} = \left\vert {\bf A} \right\vert = a . \) Note that it is customary to use two notations for determinants: one is to write ``det'' in front of a square matrix, and the other is to enclose the square array with vertical lines. For a \( 2\times 2 \) matrix \( {\bf A} = \left[ \begin{array}{cc} a_{11} & a_{12} \\ a_{21} & a_{22} \end{array} \right] , \) its determinant is \( \det {\bf A} = \left\vert {\bf A} \right\vert = a_{11} a_{22} - a_{12} a_{21} . \)

In general, we define the determinant (the term determinant was first introduced by the German mathematician Carl Friedrich Gauss in 1801) using cofactor expansion. To do so, we first need a definition.

If A is a square matrix, then the minor (the term was introduced by James Sylvester in 1850) of entry \( a_{ij} \) is denoted by \( M_{ij} \) and is defined to be the determinant of the submatrix that remains after the i-th row and j-th column are deleted from A. The number \( (-1)^{i+j} M_{ij} \) is denoted by \( C_{ij} \) and is called the cofactor of entry \( a_{ij} . \) The transpose of the matrix of cofactors of a square matrix A is called the adjugate of A.

Sage provides several equivalent commands to evaluate the determinant of a square matrix:

sage: A=matrix(QQ,3,3,[2,-4,1,4,-8,7,-2,4,-3])
sage: A.determinant()
0
sage: A.det()
0
sage: det(A)
0
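For the same matrix A, minors, cofactors, and the adjugate introduced above can also be computed; a minimal sketch (the adjugate method is called adjugate() in recent Sage versions and adjoint() in older ones):

sage: M12 = A.matrix_from_rows_and_columns([1,2], [0,2]).det()   # minor of a_{12}: delete row 1 and column 2
sage: M12
2
sage: C12 = (-1)^(1+2) * M12                                     # cofactor of a_{12}
sage: C12
-2
sage: A.adjugate()                                               # transpose of the cofactor matrix
[ -4  -8 -20]
[ -2  -4 -10]
[  0   0   0]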

There are two popular methods to evaluate the determinant. One is based on cofactor expansion, which reduces the evaluation of an \( n \times n \) determinant to n determinants of order \( n-1 \):

\[ \det {\bf A} = a_{n1} { C}_{n1} + a_{n2} { C}_{n2} + \cdots + a_{nn} { C}_{nn} . \]

Each entry \( a_{ni} \) of the last row of the matrix A is multiplied by its corresponding cofactor \( { C}_{ni} , \) and the results are summed. Finding a determinant in this way is often referred to as cofactor expansion along the last row of the matrix (any other row or column could be used instead).

Example. Consider a square matrix

\[ {\bf A} = \left[ \begin{array}{ccc} 5& -2 & 1 \\ 0&4&-3 \\ 2&-7 & 6 \end{array} \right] . \]
Multiplying every entry of the last row by its cofactor, and summing, we have
\[ \det {\bf A} = a_{31} \, C_{31} + a_{32} \, C_{32} + a_{33} \, C_{33} = 2(2) + (-7)(15) + 6(20) = 19 , \]
where
\[ { C}_{31} = \begin{vmatrix} -2 & 1 \\ 4 & -3 \end{vmatrix} , \qquad {C}_{32} = -\begin{vmatrix} 5& 1 \\ 0 & -3 \end{vmatrix} , \qquad {C}_{33} = \begin{vmatrix} 5&-2 \\ 0&4 \end{vmatrix} . \]
We obtain the same answer if we use the middle row:
\[ \det {\bf A} = a_{21}\, {C}_{21} + a_{22}\, {C}_{22} + a_{23} \, {C}_{23} = 0(5) + 4(28) + (-3)(31) = 19 , \]
where
\[ C_{21} = -\begin{vmatrix} -2 & 1 \\ -7 & 6 \end{vmatrix} , \qquad {C}_{22} = \begin{vmatrix} 5& 1 \\ 2 & 6 \end{vmatrix} , \qquad {C}_{23} = -\begin{vmatrix} 5&-2 \\ 2&-7 \end{vmatrix} . \]
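As a quick check of this example, Sage gives the same value:

sage: A = matrix(QQ, 3, 3, [5,-2,1, 0,4,-3, 2,-7,6])
sage: A.det()
19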

Another method is based on the following theorem:

Theorem: Let A be an upper or lower triangular \( n \times n \) matrix. Then its determinant is \( \det {\bf A} = a_{11} a_{22} \cdots a_{nn} , \) the product of the entries of A along the main diagonal. ■
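This is easy to confirm in Sage; the triangular matrix below is just an illustration:

sage: U = matrix(QQ, 3, 3, [2,5,-1, 0,3,4, 0,0,7])   # upper triangular
sage: U.det() == 2*3*7
True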

The next theorem describes explicitly how row operations affect the determinant:

Theorem: Let A be an \( n \times n \) matrix with determinant \( | {\bf A} | , \) and let c be a scalar; each rule below is illustrated in the Sage snippet after the list.

  • If \( R_1 \) is the row operation of multiplication by a constant \( (i) \,\leftarrow \, c(i) ,\) then \( | R_1 ({\bf A})| = c\,|{\bf A}| . \)
  • If \( R_2 \) is the row operation of adding one row to another \( (j) \,\leftarrow \, (i) + (j) ,\) then \( | R_2 ({\bf A})| = |{\bf A}| . \)
  • If \( R_3 \) is the row operation of switching rows \( (i) \,\leftrightarrow (j) ,\) then \( | R_3 ({\bf A})| = -|{\bf A}| . \)
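The following Sage snippet verifies each rule on the example matrix used earlier (a minimal sketch; copy() keeps the original matrix intact):

sage: A = matrix(QQ, 3, 3, [5,-2,1, 0,4,-3, 2,-7,6])
sage: A1 = copy(A); A1.rescale_row(0, 3)              # R1: multiply row 1 by c = 3
sage: A1.det() == 3*A.det()
True
sage: A2 = copy(A); A2.add_multiple_of_row(2, 0, 1)   # R2: add row 1 to row 3
sage: A2.det() == A.det()
True
sage: A3 = copy(A); A3.swap_rows(0, 1)                # R3: switch rows 1 and 2
sage: A3.det() == -A.det()
True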

Other useful properties of square matrices (checked in the Sage snippet after this list):

  • If A is an \( n \times n \) matrix, and c is any scalar, then \( | c{\bf A} | = c^n |{\bf A} | . \)

  • \( \det {\bf A} = \det{\bf A}^{\mathrm T} . \)
  • If A is a matrix with two proportional rows or two proportional columns, then \( \det {\bf A} =0. \)
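These properties are easy to check in Sage; for instance:

sage: A = matrix(QQ, 3, 3, [5,-2,1, 0,4,-3, 2,-7,6])
sage: (2*A).det() == 2^3 * A.det()
True
sage: A.det() == A.transpose().det()
True
sage: P = matrix(QQ, [[1,2,3],[2,4,6],[0,1,5]])    # first two rows are proportional
sage: P.det()
0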

We now present a method for evaluating determinants that involves substantially fewer computations than cofactor expansion. The idea of the method is to reduce the given matrix to upper triangular form by elementary row operations, then compute the determinant of the upper triangular matrix (an easy computation), and finally relate that determinant to that of the original matrix. Here is an example.

Example. Evaluate the determinant of the matrix \( {\bf A} = \left[ \begin{array}{ccc} 0&1&5 \\ 3& -6 & 9 \\ 2&6&1 \end{array} \right] , \) using row echelon form.

\begin{align*} \det ({\bf A} ) &= \left\vert \begin{array}{ccc} 0&1&5 \\ 3& -6 & 9 \\ 2&6&1 \end{array} \right\vert = - \left\vert \begin{array}{ccc} 3& -6 & 9 \\ 0&1&5 \\ 2&6&1 \end{array} \right\vert \\ &= -3 \left\vert \begin{array}{ccc} 1& -2 & 3 \\ 0&1&5 \\ 2&6&1 \end{array} \right\vert = -3 \left\vert \begin{array}{ccc} 1& -2 & 3 \\ 0&1&5 \\ 0&10&-5 \end{array} \right\vert \quad\mbox{$-2$ times the first row was added to the third row} \\ &= -3 \left\vert \begin{array}{ccc} 1& -2 & 3 \\ 0&1&5 \\ 0&0&-55 \end{array} \right\vert \quad \mbox{$-10$ times the second row was added to the third row} \\ &= (-3)(-55) \left\vert \begin{array}{ccc} 1& -2 & 3 \\ 0&1&5 \\ 0&0&1 \end{array} \right\vert = (-3)(-55) = 165 . \end{align*}
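Sage confirms the value obtained by row reduction:

sage: matrix(QQ, 3, 3, [0,1,5, 3,-6,9, 2,6,1]).det()
165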
The following statements are all equivalent for an \( n \times n \) matrix A:

  • A is singular (\( {\bf A}^{-1} \) does not exist);
  • Rank \( ({\bf A}) \ne n \);
  • \( \det ({\bf A}) =0 \);
  • A is not row equivalent to \( {\bf I}_n \);
  • \( {\bf A}\,{\bf x} = {\bf 0} \) has a nontrivial solution for x;
  • \( {\bf A}\,{\bf x} = {\bf b} \) does not have a unique solution for x.

Likewise, the following statements are all equivalent for an \( n \times n \) matrix A:

  • A is nonsingular (\( {\bf A}^{-1} \) exists);
  • Rank \( ({\bf A}) = n \);
  • \( \det ({\bf A}) \ne 0 \);
  • A is row equivalent to \( {\bf I}_n \);
  • \( {\bf A}\,{\bf x} = {\bf 0} \) has only the trivial solution for x;
  • \( {\bf A}\,{\bf x} = {\bf b} \) has a unique solution \( {\bf x} = {\bf A}^{-1} {\bf b} . \)
Block matrices are ubiquitous in physics and applied mathematics, appearing naturally in the description of systems with multiple discrete variables (e.g., quantum spin, quark color and flavor). In addition, block matrices are exploited in many computational methods familiar to researchers of fluid dynamics. The need to calculate determinants of these matrices is almost equally widespread, for both analytical and numerical applications.

The problem of calculating the determinant of a \( 2 \times 2 \) block matrix has long been studied, and is the most important case, since it can be extended to any larger matrix in the same way that the determinant of an arbitrary square matrix can be expressed in terms of the determinants of \( 2 \times 2 \) matrices, via minor expansion. If a square matrix M can be partitioned into submatrices in the following manner

\[ {\bf M} = \begin{bmatrix} {\bf A} & {\bf B} \\ {\bf C} & {\bf D} \end{bmatrix} , \]
you cannot always use block determinants. However, if one of the off-diagonal submatrices is zero, you can:

\[ \det \begin{bmatrix} {\bf A} & {\bf 0} \\ {\bf C} & {\bf D} \end{bmatrix} = \begin{vmatrix} {\bf A} & {\bf 0} \\ {\bf C} & {\bf D} \end{vmatrix} = \left \vert {\bf A} \right\vert \, \left \vert {\bf D} \right\vert \qquad\mbox{and} \qquad \det \begin{bmatrix} {\bf A} & {\bf B} \\ {\bf 0} & {\bf D} \end{bmatrix} = \left \vert {\bf A} \right\vert \, \left \vert {\bf D} \right\vert . \]
To find the determinant of a block \( 2 \times 2 \) matrix, we introduce Schur complements:
\[ {\bf M} / {\bf A} \,\stackrel{\mathrm{def}}{=} \, {\bf D} - {\bf C}\,{\bf A}^{-1} {\bf B} \qquad\mbox{and} \qquad {\bf M} / {\bf D} \,\stackrel{\mathrm{def}}{=} \, {\bf A} - {\bf B}\,{\bf D}^{-1} {\bf C} . \]
Consider the equation
\[ \begin{bmatrix} {\bf A}_{m\times m} & {\bf B}_{m\times n} \\ {\bf C}_{n\times m} & {\bf D}_{n\times n} \end{bmatrix} \, \begin{bmatrix} {\bf I}_{m\times m} & {\bf 0}_{m\times n} \\ -{\bf D}^{-1}{\bf C} & {\bf I}_{n\times n} \end{bmatrix} = \begin{bmatrix} {\bf A} - {\bf B}\,{\bf D}^{-1} {\bf C} & {\bf B} \\ {\bf 0} & {\bf D} \end{bmatrix} , \]
where A, B, C, and D are \( m \times m , \quad m\times n , \quad n\times m , \quad \mbox{and} \quad n \times n \) matrices, respectively, I and 0 are the identity and zero matrix, respectively (taken to be of the appropriate dimensions), and we have assumed that D is invertible. If \( {\bf D}^{-1} \) does not exist, the above equation can be rewritten in terms of \( {\bf A}^{-1} , \) with the right side lower-triangular. If neither inverse exists, the notion of generalized inverses must be employed. We do not consider these complications in the present analysis, but rather simply assume the existence of all inverses that arise in the following calculations. In practice, this assumption is of nearly no consequence, since the blocks will generally be functions of at least one continuous variable (e.g., momentum), and any singularities will be isolated points of measure zero. Taking the determinant of the above equation yields the result
\[ \det \begin{bmatrix} {\bf A}_{m\times m} & {\bf B}_{m \times n} \\ {\bf C}_{n\times m} & {\bf D}_{n\times n} \end{bmatrix} = \det \left( {\bf A} - {\bf B}\,{\bf D}^{-1} {\bf C} \right) \det {\bf D} , \]
where we have exploited the fact that the determinant of a block triangular matrix is the product of the determinants of its diagonal blocks. The matrices appearing on the right side are the \( n \times n \) matrix D, and the \( m \times m \) Schur complement with respect to D.

If one considers the lower-triangular block matrix instead, leaving the Schur complement \( {\bf D} - {\bf C}\,{\bf A}^{-1} {\bf B} \) in the lower right corner:

\[ \begin{bmatrix} {\bf I} & {\bf 0} \\ -{\bf C}\,{\bf A}^{-1} & {\bf I} \end{bmatrix} \, \begin{bmatrix} {\bf A} & {\bf B} \\ {\bf C} & {\bf D} \end{bmatrix} = \begin{bmatrix} {\bf A} & {\bf B} \\ {\bf 0} & {\bf D} - {\bf C}\,{\bf A}^{-1} {\bf B} \end{bmatrix} , \]
then
\[ \begin{vmatrix} {\bf A} & {\bf B} \\ {\bf C} & {\bf D} \end{vmatrix} = \left\vert {\bf A} \right\vert \, \left\vert {\bf D} - {\bf C}\,{\bf A}^{-1} {\bf B} \right\vert , \]
provided that \( {\bf A}^{-1} \) exists.

If square matrices A and C commute (\( {\bf A}\,{\bf C} = {\bf C}\,{\bf A} \) ), then

\[ \begin{vmatrix} {\bf A}_{n\times n} & {\bf B}_{n\times n} \\ {\bf C}_{n\times n} & {\bf D}_{n\times n} \end{vmatrix} = \left\vert {\bf A}\, {\bf D} - {\bf C}\, {\bf B} \right\vert . \]
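These block-determinant formulas can be checked numerically in Sage with the block_matrix command; the blocks below are chosen purely for illustration (C is a scalar multiple of the identity, so it commutes with A):

sage: A = matrix(QQ, [[1,2],[3,4]]);  B = matrix(QQ, [[0,1],[1,0]])
sage: C = 2*identity_matrix(QQ, 2);   D = matrix(QQ, [[2,0],[1,3]])
sage: M = block_matrix([[A, B], [C, D]], subdivide=False)
sage: M.det() == A.det() * (D - C*A.inverse()*B).det()   # Schur complement formula
True
sage: M.det() == (A*D - C*B).det()                       # valid since A*C == C*A
True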

While the above equations have proven immensely useful, there remain many applications for which this result is not ideally suited. For example, there are situations in which physics dictates that we partition a matrix in a different form.

Inverse Matrices

Let A be an \( n \times n \) matrix. If there exists an \( n \times n \) matrix B such that

\[ {\bf A} \,{\bf B} = {\bf B}\,{\bf A} = {\bf I}_n , \]
where \( {\bf I}_n \) is the \( n \times n \) identity matrix, then the matrix A is said to be nonsingular or invertible. The matrix B is called the inverse matrix of A and is denoted by \( {\bf A}^{-1} . \) Otherwise, the matrix A is called singular; this happens exactly when \( \det ({\bf A}) =0. \)

Theorem: If A is an invertible matrix, then \( {\bf A} {\bf A}^{\mathrm T} \) and \( {\bf A}^{\mathrm T} {\bf A} \) are also invertible.

Theorem: The inverse of a lower/upper triangular matrix is again lower/upper triangular. The complexity of computing the inverse of an \( n \times n \) triangular matrix is

\[ n^2 + (n-1)^2 + \cdots + 1 = \frac{1}{6} \, n(n+1)(2n+1) \approx \frac{n^3}{3} \quad \mbox{flops}. \]

Theorem: If A is an invertible symmetric matrix (so it is equal to its transpose), then \( {\bf A}^{-1} \) is symmetric. ■
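The last two theorems can be illustrated in Sage; the matrices below are arbitrary examples:

sage: L = matrix(QQ, [[2,0,0],[1,3,0],[4,5,6]])     # lower triangular
sage: all(L.inverse()[i,j] == 0 for i in range(3) for j in range(3) if j > i)
True
sage: S = matrix(QQ, [[2,1,0],[1,3,1],[0,1,4]])     # symmetric
sage: S.inverse().is_symmetric()
True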

For the inverse matrix, the standard commands are

sage: M = matrix(SR, 2, var('a,b,c,d'))  # SR stands for symbolic entries
sage: ~M
[1/a - b*c/(a^2*(b*c/a - d))            b/(a*(b*c/a - d))]
[           c/(a*(b*c/a - d))             -1/(b*c/a - d)]
sage: (~M*M).simplify_rational()
[1 0]
[0 1]
sage: (M^(-1)*M).simplify_rational()
[1 0]
[0 1]

Of course, you can also evaluate the inverse matrix using the Gauss--Jordan procedure. We demonstrate it with an example.

Example. Find the inverse of the following square matrix \( {\bf A} = \left[ \begin{array}{ccc} 1&-4&1 \\ 4&-1&2 \\ 2&2&-3 \end{array} \right] . \)

sage: A=matrix(QQ,3,3,[[1,-4,1],[4,-1,2],[2,2,-3]]); A
[ 1 -4 1]
[ 4 -1 2]
[ 2 2 -3]
sage: m = identity_matrix(3);m
[1 0 0]
[0 1 0]
[0 0 1]
sage: m3=A.augment(m)
sage: m3
[ 1 -4 1 1 0 0]
[ 4 -1 2 0 1 0]
[ 2 2 -3 0 0 1]
sage: m4=m3.rref(); m4
[ 1 0 0 1/55 2/11 7/55]
[ 0 1 0 -16/55 1/11 -2/55]
[ 0 0 1 -2/11 2/11 -3/11]
Now we extract the last three columns to obtain the inverse matrix:
sage: Ain=m4.matrix_from_columns([3,4,5])
sage: Ain
[ 1/55 2/11 7/55]
[-16/55 1/11 -2/55]
[ -2/11 2/11 -3/11]
sage: Ain*A  # to check the answer
[1 0 0]
[0 1 0]
[0 0 1]

Suppose that a square \( (m+n) \times (m+n) \) nonsingular matrix M can be partitioned into four blocks

\[ {\bf M} = \left[ \begin{array}{cc} {\bf A}_{m\times m} & {\bf B}_{m\times n} \\ {\bf C}_{n\times m} & {\bf D}_{n\times n} \end{array} \right] , \]
where A, B, C, and D have sizes \( m \times m , \) \( m \times n , \) \( n \times m , \) and \( n \times n , \) respectively. (A and D must be square, so that they can be inverted. Furthermore, the formulas below require the relevant blocks and Schur complements to be nonsingular: either D and \( {\bf M}/{\bf D} = {\bf A} - {\bf B} {\bf D}^{-1} {\bf C} , \) or A and \( {\bf M}/{\bf A} = {\bf D} - {\bf C} {\bf A}^{-1} {\bf B} . \))

The inverse matrix \( {\bf M}^{-1} \) will be partitioned in exactly the same manner as the original matrix:

\[ {\bf M}^{-1} = \left[ \begin{array}{cc} {\bf Q}_{m\times m} & {\bf R}_{m\times n} \\ {\bf S}_{n\times m} & {\bf T}_{n\times n} \end{array} \right] \]
so that Q is \( m \times m ,\) T is \( n \times n ,\) S is \( n \times m ,\) and R is \( m \times n .\)

From the definition of an inverse (\( {\bf M}\,{\bf M}^{-1} = {\bf M}^{-1} {\bf M} = {\bf I}\) ) and the rules for multiplying partitioned matrices, the following four equations are obtained:
\[ {\bf A}\,{\bf Q} + {\bf B}\,{\bf S} = {\bf I} , \quad {\bf A} \, {\bf R} + {\bf B}\,{\bf T} = {\bf 0}, \quad {\bf C}\,{\bf Q} + {\bf D}\,{\bf S} = {\bf 0} , \quad {\bf C}\,{\bf R} + {\bf D}\,{\bf T} = {\bf I} . \]
These equations can be solved for the submatrices Q, R, S, and T to obtain
\[ {\bf Q} = \left( {\bf A} - {\bf B} \,{\bf D}^{-1} {\bf C} \right)^{-1} , \quad {\bf R} = - {\bf Q}\, {\bf B}\, {\bf D}^{-1} , \quad {\bf S} = -{\bf D}^{-1} {\bf C}\, {\bf Q} , \quad {\bf T} = {\bf D}^{-1} - {\bf D}^{-1} {\bf C}\,{\bf R} . \]

Equivalently, the inverse of M exists and equals

\[ {\bf M}^{-1} = \left[ \begin{array}{cc} {\bf A}^{-1} + {\bf A}^{-1} {\bf B} \left( {\bf D} - {\bf C} {\bf A}^{-1} {\bf B} \right)^{-1} {\bf C}\,{\bf A}^{-1} & -{\bf A}^{-1} {\bf B} \left( {\bf D} - {\bf C} {\bf A}^{-1} {\bf B} \right)^{-1} \\ - \left( {\bf D} - {\bf C} {\bf A}^{-1} {\bf B} \right)^{-1} {\bf C}\, {\bf A}^{-1} & \left( {\bf D} - {\bf C} {\bf A}^{-1} {\bf B} \right)^{-1} \end{array} \right] , \]

provided that all of the matrices on the right side exist. For a triangular block matrix, we have

\[ \begin{bmatrix} {\bf A} & {\bf B} \\ {\bf 0} & {\bf D} \end{bmatrix}^{-1} = \begin{bmatrix} {\bf A}^{-1} & - {\bf A}^{-1} {\bf B} {\bf D}^{-1} \\ {\bf 0} & {\bf D}^{-1} \end{bmatrix} . \]
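These block-inverse formulas can be verified numerically in Sage; a minimal sketch with small illustrative blocks (chosen so that all the required inverses exist):

sage: A = matrix(QQ, [[2,1],[1,1]]);  B = matrix(QQ, [[1,0],[2,1]])
sage: C = matrix(QQ, [[0,1],[1,0]]);  D = matrix(QQ, [[3,1],[1,2]])
sage: M = block_matrix([[A, B], [C, D]], subdivide=False)
sage: Q = (A - B*D.inverse()*C).inverse()
sage: R = -Q*B*D.inverse();  S = -D.inverse()*C*Q;  T = D.inverse() - D.inverse()*C*R
sage: M.inverse() == block_matrix([[Q, R], [S, T]], subdivide=False)
True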

In particular, suppose that a square matrix A is a block diagonal matrix having main diagonal block square matrices, such that the off-diagonal blocks are zero matrices:

\[ {\bf A} = \left[ \begin{array}{cccc} {\bf A}_{1} & {\bf 0} & \cdots & {\bf 0} \\ {\bf 0} & {\bf A}_{2} & \cdots & {\bf 0} \\ \vdots & \vdots & \ddots & \vdots \\ {\bf 0} & {\bf 0} & \cdots & {\bf A}_{n} \end{array} \right] , \]

where \( {\bf A}_k , \ (k=1,2,\ldots , n) \) is a square matrix; in other words, it is the direct sum of \( {\bf A}_1 , \ {\bf A}_2 , \ \ldots , \ {\bf A}_n \) . It can also be indicated as \( {\bf A} = {\bf A}_1 \oplus {\bf A}_2 \oplus \,\cdots \, \oplus {\bf A}_n \) or diag(\( {\bf A}_1 , \ {\bf A}_2 , \ \ldots , \ {\bf A}_n \) ). Any square matrix can trivially be considered a block diagonal matrix with only one block. For any arbitrary matrices A (of size \( m \times n \) ) and B (of size \( p \times q \) ), we have the direct sum of A and B, denoted by \( {\bf A} \oplus {\bf B} \) and defined as

\[ {\bf A} \oplus {\bf B} = \left[ \begin{array}{ccccccc} a_{1,1}&a_{1,2}&\cdots & a_{1,n} & 0 & \cdots & 0 \\ \vdots&\vdots&\ddots& \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} & 0 & \cdots & 0 \\ 0&0&\cdots& 0& b_{1,1} & \cdots & b_{1,q} \\ \vdots& \vdots& \ddots & \vdots & \vdots & \ddots & \vdots \\ 0&0& \cdots & 0 & b_{p,1} & \cdots & b_{p,q} \end{array} \right] . \]
For instance,
\[ \left[ \begin{array}{ccc} 1&2&3 \\ 4&5&6 \end{array} \right] \oplus \left[ \begin{array}{ccc} 7&8 \\ 9&1 \end{array} \right] = \left[ \begin{array}{ccccc} 1&2&3&0&0 \\ 4&5&6&0&0 \\ 0&0&0& 7&8 \\ 0&0&0&9&1 \end{array} \right] . \]
Note that any element in the direct sum of two vector spaces of matrices could be represented as a direct sum of two matrices. ■

The inverse of a block diagonal matrix is another block diagonal matrix, composed of the inverse of each block, as follows:

\[ {\bf A}^{-1} = \left[ \begin{array}{cccc} {\bf A}_{1} & {\bf 0} & \cdots & {\bf 0} \\ {\bf 0} & {\bf A}_{2} & \cdots & {\bf 0} \\ \vdots & \vdots & \ddots & \vdots \\ {\bf 0} & {\bf 0} & \cdots & {\bf A}_{n} \end{array} \right]^{-1} = \left[ \begin{array}{cccc} {\bf A}_{1}^{-1} & {\bf 0} & \cdots & {\bf 0} \\ {\bf 0} & {\bf A}_{2}^{-1} & \cdots & {\bf 0} \\ \vdots & \vdots & \ddots & \vdots \\ {\bf 0} & {\bf 0} & \cdots & {\bf A}_{n}^{-1} \end{array} \right] . \]
For the determinant and trace, the following properties hold:
\begin{align*} \det {\bf A} &= \det {\bf A}_1 \times \det {\bf A}_2 \times \cdots \times \det {\bf A}_n ; \\ \mbox{tr} {\bf A} &= \mbox{tr} {\bf A}_1 + \mbox{tr} {\bf A}_2 + \cdots + \mbox{tr} {\bf A}_n . \end{align*}
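In Sage, block diagonal matrices are built with block_diagonal_matrix, and these properties can be confirmed directly (a small illustration):

sage: A1 = matrix(QQ, [[1,2],[3,4]]);  A2 = matrix(QQ, [[0,1],[-1,2]])
sage: A = block_diagonal_matrix(A1, A2, subdivide=False)
sage: A.det() == A1.det() * A2.det()
True
sage: A.trace() == A1.trace() + A2.trace()
True
sage: A.inverse() == block_diagonal_matrix(A1.inverse(), A2.inverse(), subdivide=False)
True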
A block tridiagonal matrix is another special block matrix, which is just like the block diagonal matrix, having square matrices (blocks) in the lower diagonal, main diagonal and upper diagonal, with all other blocks being zero matrices. It is essentially a tridiagonal matrix but has submatrices in places of scalars. A block tridiagonal matrix A has the form:
\[ {\bf A} = \left[ \begin{array}{ccccccc} {\bf B}_{1} & {\bf C}_1 && & \cdots && {\bf 0} \\ {\bf A}_2 & {\bf B}_{2} & {\bf C}_2 & && & {\bf 0} \\ &\ddots & \ddots & \ddots & && \vdots \\ && {\bf A}_k & {\bf B}_{k} & {\bf C}_k & & \\ \vdots &&& \ddots & \ddots & \ddots & \\ &&&& {\bf A}_{n-1} & {\bf B}_{n-1} & {\bf C}_{n-1} \\ {\bf 0} & & \cdots & {\bf 0} & \cdots & {\bf A}_{n} & {\bf B}_n \end{array} \right] , \]
where \( {\bf A}_k , \ {\bf B}_k , \ \mbox{ and } \ {\bf C}_k \ (k=1,2,\ldots , n) \) are square sub-matrices of the lower, main and upper diagonal respectively.