This section presents the concept of converting a single ordinary differential equation (linear or nonlinear) into an equivalent system of first order differential equations. More precisely, if a single differential equation in normal form is of order n, then it is possible to transfer it to an equivalent system containing n (or more) differential equations of first order. Although this conversion is not unique, it is always the case that the single equation and the system of differential equations have the same set of solutions. Moreover, upon introducing additional auxiliary dependent variables, it is possible to simplify the problem and/or obtain useful information about solutions.

The reverse procedure of converting a system of linear ODEs into a single equation is discussed in section viii of Part 2. Not every system of n linear first order differential equations can be transfered to a single n-th order differential equation.

Given a single ordinary differential equation, one method of finding numerical solutions entails transferring it into an equivalent system of differential equations of the first order. When a single differential equation has an isolated highest derivative, it is always possible to transfer the differential equation into an equivalent system of differential equations of the first order. This can be achieved by denoting sequential derivatives of the unknown variable with a new dependent variable with the exception of the last derivative, which is used to incorporate the given single differential equation. Other options are also possible for such a conversion, which will be clear from the examples.

The word "equivalent" indicates that both a single differential equation and the corresponding system of equations have the same set of solutions (so no solution is lost or added). The opposite is not always true, and some systems of n ordinary differential equations cannot be reduced to an equivalent single n-th order differential equation. The general solution of the single n-th order differential equation depends on n arbitrary constants and it belongs to the space of functions having n continuous derivatives, which is usually denoted by Cn. A system of n differential equations has the same properties and its general solution also depends on n arbitrary constants. Therefore, a single n-th order ODE can be equivalent to the system of (n+1) first order ODEs only when the latter does not require extra differentiation of the unknown function. However, a single n-th order ODE cannot be equivalent to the (n−1) system of first order ODEs because the latter contains (n−1) arbitrary constants and requires utilization of Cn-1. A key illustration highlighting this is the pendulum equation, presented in Example 5, which does not use the derivative operator, and where a single second order equation is transformed into four first order equations representing an equivalent system. On the other hand, the derivative operator is used in Example 1, showing that the resulting system of equations is not equivalent to the original equation. Therefore, the key takeaway is to avoid using the derivative operator because it is an unbounded operation.

There are a couple of reasons why it is convenient to transfer a single differential equation of an order higher than one to a system of differential equations of first order. First of all, it is simpler to analyze theoretically the first order vector differential equation than deal with a higher order differential equation because the latter can be naturally included into a broader topic: systems of differential equations. Second, it is easier to use and implement a universal numerical algorithm for a first order derivative operator instead of utilizing special numerical procedures for higher order derivatives. It is more accurate to monitor errors of numerical calculations for a system of first order differential equations than for a single equation. Also, there are many situations in which we not only want to know the solution to an ODE (acronym for Ordinary Differential Equations), but also the derivative or acceleration of the solution. This information can be naturally extracted from the solution to the system of first order differential equations (with no extra work).

When treating the independent variable as a time, it is common to use Newton's notation \( \dot{y} \) for the derivative instead of Lagrange notation y' or Leibniz's notation \( \displaystyle {\text d}y/{\text d}t . \) Sometimes, we also utilize Euler's notation for the derivative operator: \( \displaystyle \texttt{D} = {\text d}/{\text d}t . \)

Example 1A: We consider a similar differential equation, but now it is a nonhomogeneous equation:
\begin{equation*} %\label{EqConv.1a} x''(t) + 2x'(t) - 3x(t) = 10\,\sin t, \qquad x(0) =3, \quad x' (0) =-1. \tag{1A.1} \end{equation*}
This is an inhomogeneous, linear, second-order ODE with constant coefficients. We find its particlular solution using method of undetermined coefficients:
\[ x_p (t) = A\,\cos t + B\,\sin t . \]
Numerical values of coefficients A and B are determined upon substitution xp into the given differential equation:
\[ -4 \left( A\,\cos t + B\,\sin t \right) + 2 \left( -A\,\sin t + B\,\cos t \right) = 10\,\sin t . \]
Equating coefficients of sine and cosine functions (that are linearly independent), we obtain the following system of algebraic equations:
\[ -4A + 2B =0, \qquad -4B -2A =10 . \]
Using matlab, we get the general solution to be
\[ x(t) = c_{1}e^{t} + c_{2}e^{-3t} - \cos t -2\,\sin t, \]
where c1 and c2 are arbitrary constants. Substitution of the general solution into the initial conditions yields
\[ x(0) = c_{1} + c_{2} -1 = 3, \qquad x'(0) = c_{1} -3\, c_{2} -2 = -1. \]
Asking matlab for help, we get Therefore, the solution of the given initial value problem becomes
\[ x(t) = \frac{13}{4}\, e^t + \frac{3}{4}\,e^{-3t} - \cos t -2\,\sin t . \]
matlab confirms Now we convert the given problem to a vector equation by letting x1 = x and its first derivative we denote by x2. Note that second derivative is known from the given equation \( x'' = 3\,x(t) - 2\, x' (t) = 3\, x_1 - 2\, x_2 . \) This leads to the system of first order differential equations
\[ \begin{split} x'_1 &= x_2 = x', \\ x'_2 &= x'' = -2\, x_2 + 3\,x_1 + 10\,\sin t. \end{split} \]
Upon introducing a column vector x = [ x1, x2]T (here "T" stands for transposition; see first chapter of this tutorial), we can rewrite the system of ODEs in more appropriate vector form:
\begin{equation*} %\label{EqConv.3a} \frac{\text d}{{\text d}t} \begin{bmatrix} x_1 (t) \\ x_2 (t) \end{bmatrix} = \begin{bmatrix} 0& \phantom{-}1 \\ 3 & -2 \end{bmatrix} \begin{bmatrix} x_1 (t) \\ x_2 (t) \end{bmatrix} + \begin{bmatrix} 0 \\ 10\,\sin t \end{bmatrix} , \qquad \begin{bmatrix} x_1 (0) \\ x_2 (0) \end{bmatrix} = \begin{bmatrix} \phantom{-}3 \\ -1 \end{bmatrix} . \tag{1A.2} \end{equation*}
In vector form, this initial value problem can be written as
\[ \frac{{\text d}{\bf x}}{{\text d}t} = {\bf A}\, {\bf x}(t) + {\bf f}(t) , \qquad {\bf x}(0) = \begin{bmatrix} \phantom{-}3 \\ -1 \end{bmatrix} , \quad \mbox{with} \quad {\bf A} = \begin{bmatrix} 0& \phantom{-}1 \\ 3 & -2 \end{bmatrix} , \quad {\bf f} (t) = \begin{bmatrix} 0 \\ 10\,\sin t \end{bmatrix} . \]

   

Now we generalize the previous examples by considering the general linear differential equation. Suppose we are given an n-th order linear differential equation

\begin{equation} \label{EqConv.5} x^{(n)} (t) + a_{n-1} x^{(n-1)} (t) + \cdots + a_1 x' (t) + a_0 x(t) = f(t) , \end{equation}
where all coefficients 𝑎i and function f are known. Upon introducing n new dependent variables
\[ y_1 (t) = x(t), \quad y_2 (t) = y'_1 (t) = x' (t), \quad y_3 (t) = y'_2 (t) = x'' (t), \quad \ldots , \quad y_n (t) = y'_{n-1} (t) = x^{(n-1)} (t) , \]
we reduce the given n-th order linear differential equation \eqref{EqConv.5} to an equivalent system of the first order equations.

We can assign to equation \eqref{EqConv.5} the differential operator \( L\left[ t, \texttt{D} \right] = \texttt{D}^n + a_{n-1} (t)\, \texttt{D}^{n-1} + \cdots + a_1 (t)\,\texttt{D} + a_0 (x)\,\texttt{I} , \) where \( \texttt{D} = {\text d}/{\text d}t \) and \( \texttt{D}^0 = \texttt{I} \) are the derivative operator and the identical operator, respectively. Then we rewrite the differential equation in a more concise form:

\[ L\left[ t, \texttt{D} \right] x(t) = f(t) . \]
When coefficients in this differential operator of order n do not depend on t, we obtain a constant coefficient differential operator \( L\left[ \texttt{D} \right] = \texttt{D}^n + a_{n-1} \, \texttt{D}^{n-1} + \cdots + a_1 \,\texttt{D} + a_0 \,\texttt{I}, \) to which we can assign a polynomial
\[ p(\lambda ) = \lambda^n + a_{n-1} \lambda^{n-1} + \cdots + a_1 \lambda + a_0 , \]
called the characteristic polynomial. Upon introducing n new dependent variables
\[ y_1 (t) = x(t), \quad y_2 = \dot{y}_1 , \quad y_3 = \ddot{y}_1 = \dot{y}_2 , \ldots , \quad y_{n} = \dot{y}_{n-1} , \]
we convert the given differential equation into the system of first order equations:
\[ \frac{\text d}{{\text d}t} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = {\bf C}_{p} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ f(t) \end{bmatrix} . \]
where matrix
\begin{equation} \label{EqConv.6} {\bf C}_{p} = \begin{bmatrix} 0&1&0& \cdots &0&0 \\ 0&0&1& \cdots &0&0 \\ 0&0&0 \cdots &0&0 \\ \vdots&\vdots&\vdots& \ddots&\vdots&\vdots \\ 0&0&0& \cdots &0&1 \\ - a_0 & -a_1 & -a_2 & \cdots & -a_{n-2}& -a_{n-1} \end{bmatrix} \end{equation}
is called companion matrix for polynomial p(λ).
Theorem: For any polynomial p(λ), its companion matrix has the characteristic polynomial
\begin{equation} \label{EqConv.7} \chi (\lambda ) = \det \left( \lambda {\bf I} - {\bf C}_{p} \right) = p(\lambda ) \end{equation}
equal to the given polynomial.    ▣

The same approach can be used to convert a nonlinear differential equation to an equivalent system of first order equations. It should be noted that nonlinear systems of differential equations can be solved only numerically (except for some rare cases). When a numerical solver is applied, it is common to introduce an auxiliary variable---the residual---in order to control accuracy. This approach is demonstrated in the following series of examples.

  1. Brand, L., The companion matrix and its properties, The American Mathematical Monthly, 1964, Vol. 71, No. 6, pp. 629--634.
  2. Eastham, M.S.P., "The spectral theory of periodic differential equations" , Scottish Acad. Press (1973)
  3. Hill, G. W. "On the Part of the Motion of Lunar Perigee Which is a Function of the Mean Motions of the Sun and Moon." Acta Math. 8, 1-36, 1886.
  4. Magnus,W., Winkler, S., "Hill's equation" , Dover, reprint (1979)
  5. V.A. Yakubovich, V.M. Starzhinskii, "Linear differential equations with periodic coefficients" , Wiley (1975) (Translated from Russian)