This section presents the concept of converting a single ordinary differential
equation (linear or nonlinear) into an equivalent system of first order
differential equations. More precisely, if a single differential equation in normal form is of order n, then it is possible to transfer it to an equivalent system containing n (or more) differential equations of first order. Although this conversion is not unique, it is always the case that the single
equation and the system of differential equations have the same set of
solutions. Moreover, upon introducing additional auxiliary dependent
variables, it is possible to simplify the problem and/or obtain useful
information about solutions.
The reverse procedure of converting a system of linear ODEs into a single equation is discussed in section viii of Part 2. Not every system of n linear first order differential equations can be transfered to a single n-th order differential equation.
Return to computing page for the first course APMA0330
Return to computing page for the second course APMA0340
Return to Mathematica tutorial for the first course APMA0330
Return to Mathematica tutorial for the second course APMA0340
Return to the main page for the first course APMA0330
Return to the main page for the second course APMA0340
Introduction to Linear Algebra with Mathematica
Given a single ordinary differential equation, one method of finding numerical solutions entails transferring it into an equivalent system of differential equations of the first order.
When a single differential equation has an isolated highest derivative, it
is always possible to transfer the differential equation into
an equivalent system of differential equations of the first order. This can be
achieved by
denoting sequential derivatives of the unknown variable with a new dependent
variable with the exception of the last derivative, which is used to incorporate the
given single differential equation. Other options are also possible
for such a conversion, which will be clear from the examples.
The word "equivalent" indicates that both a single differential equation and the corresponding system of equations have the same set of solutions (so no solution is lost or added). The opposite is not always true, and some systems of n ordinary differential equations cannot be reduced to an equivalent single n-th order differential equation.
The general solution of the single n-th order differential equation depends on n arbitrary constants and it belongs to the space of functions having n continuous derivatives, which is usually denoted by C^{n}. A system of n differential equations has the same properties and its general solution also depends on n arbitrary constants. Therefore, a single n-th order ODE can be equivalent to the system of (n+1) first order ODEs only when the latter does not require extra differentiation of the unknown function. However, a single n-th order ODE cannot be equivalent to the (n−1) system of first order ODEs because the latter contains (n−1) arbitrary constants and requires utilization of C^{n-1}.
A key illustration highlighting this is the pendulum equation, presented in Example 5, which does not use
the derivative operator, and where
a single second order equation is transformed into four first order equations representing an equivalent system. On the other hand, the derivative operator is used in Example 1, showing that the resulting system of equations is not equivalent to the original equation. Therefore, the key takeaway is to avoid using the derivative operator because it is an unbounded operation.
There are a couple of reasons why it is convenient to transfer a single differential equation of an order higher than one to a system of differential equations of first order. First of all, it is simpler to analyze theoretically the first order vector differential equation than deal with a higher order differential equation because the latter can be naturally included into a broader topic: systems of differential equations. Second, it is easier to use and implement a universal numerical algorithm for a first order derivative operator instead of utilizing special numerical procedures for higher order derivatives. It is more accurate to monitor errors of numerical calculations for a system of first order differential equations than for a single equation. Also, there are many situations in which we not only want to know the solution to an ODE (acronym for Ordinary Differential Equations), but also the derivative or acceleration of the solution. This information can be naturally extracted from the solution to the system of first order differential equations (with no extra work).
When treating the independent variable as a time, it is common to use
Newton's notation \( \dot{y} \) for the derivative
instead of Lagrange notation
y' or Leibniz's notation \( \displaystyle
{\text d}y/{\text d}t . \) Sometimes, we also utilize Euler's notation for the derivative operator: \( \displaystyle
\texttt{D} = {\text d}/{\text d}t . \)
Example 1:
Let us start with a simple case of constant coefficient linear differential equation when all manipulations become transparent. So we
consider the following initial value problem for a linear second order
differential equation
This is a homogeneous, linear, second-order ODE with constant coefficients. As
a brief refresher from the previous course, recall that its solution can be computed by finding the
roots of the corresponding characteristic equation:
\[
\lambda^2 + 2\,\lambda - 3 = 0.
\tag{1.2}
\]
There are two distinct, real roots to the quadratic (characteristic) equation:
λ_{1} = 1 and λ_{2} = -3, and the general
solution is a linear combination of two exponent functions:
where c_{1} and c_{2} are arbitrary constants.
To satisfy the initial conditions, we have to choose these constants so that
x(0) = c_{1} + c_{2} = 3 and
x'(0) = c_{1} -3 c_{2} = -1. This yields
c_{1} = 2 and c_{2} = 1; and the solution becomes
Now we convert the given problem to a vector equation by letting
x_{1} = x and its first derivative we denote by
x_{2}. Note that second derivative is known from the given equation \( x'' = 3\,x(t) - 2\, x' (t) = 3\, x_1 - 2\, x_2
. \) This leads to the system of first order differential
equations
Upon introducing a column vector x = [ x_{1},
x_{2}]^{T} (here "T" stands for transposition; see first chapter of this tutorial), we can rewrite the system of ODEs in a more appropriate vector form:
In order to find the solution to this vector differential
equation, we first need to find the
eigenvalues of the corresponding matrix (which is called the companion
matrix):
where I is the identity matrix.
From observation, the eigenvalues of matrix A
are the same as the roots of the characteristic equation
λ² + 2 λ −3 = 0
for the given second order differential equation.
Sometimes, we need to monitor the residual of numerical calculations by
introducing an additional auxiliary variable y_{3} =
x'', the second derivative of the unknown solution
x(t), and keep other variables: y_{1} = x and
y_{2} = x'. Then its derivative, according to the given differential equation, will be:
This allows us to compose the system of three equations. There are two options to achieve it. First, we recall that the derivative of y_{2} is our new dependent variable y_{3}. So we get
Therefore, we obtain a vector differential equation \( \dot{\bf y} = {\bf C}\,{\bf y}(t) \) that is equivalent (we hope?) to the the original IVP (1.1)
Note that the derivative of y_{3} is expressed through the
original function x(t) as \( y'_3 = 7\, y_2 -6\, y_1 . \) So another option is to rewrite the given system of ODEs as
the system of diferential equations (1.6) is not equivalent to (1.1) because the corresponding matrix B has three eigenvalues {−3, 2, 1}. Recall that matrix A has two eigenvalues {−3, 1}. It becomes clear when we rewrite the system of three equations with matrix B in the following form with Euler's notation \( \texttt{D} = {\text d}/{\text d}t \) for derivative operator:
The system above contains three unbounded derivative operators, but the original differential equation (1.1) is of order two. So its general solution is
with some arbitrary constants C_{1}, C_{2}, and C_{3}.
Upon using the appropriate initial conditions we eliminated unwanted additional constant C_{3} from its general solution. So we obtain the same solution from the system of three equations as the original IVP, at least from theoretical point of view. However, matrix B has a dominant eigenvalue λ = 2>1 and the corresponding numerical algorithm should avoid it.
Then the difference of y_{3} and \(
3\, x(t) - 2\,x' (t) = 3\,y_1 - 2\, y_2 \) gives the residual (error)
of calculations.
We use the standard Mathematica command NDSolve to solve numerically the system of three equations (1.6) and plot the residual of calculations for the equation. Blue curve gives the error with the true second derivative, and the red curve gives the residual based on numerical calculation.
Figure 1. Residuals of the second derivative: the error between the second derivative y_{3}(t) and the true value in blue.
Figure 2. The residual of the second derivative y_{3}(t) = d²x/dt² and the numerical solution 3x_{2}(t) - 2x_{1}(t).
Note that introducing the system of three equations with artificial differentiation does not provide an equivalent system for the given equation because the derivative operator is unbounded.
The transformation to a system of first order ODEs is straight forward,
but it is not unique. Let us try another set of new dependent variables:
This is an inhomogeneous, linear, second-order ODE with constant coefficients.
We find its particlular solution using method of undetermined coefficients:
\[
x_p (t) = A\,\cos t + B\,\sin t .
\]
Numerical values of coefficients A and B are determined upon substitution x_{p} into the given differential equation:
\[
-4 \left( A\,\cos t + B\,\sin t \right) + 2 \left( -A\,\sin t + B\,\cos t \right) = 10\,\sin t .
\]
Equating coefficients of sine and cosine functions (that are linearly independent), we obtain the following system of algebraic equations:
Now we convert the given problem to a vector equation by letting
x_{1} = x and its first derivative we denote by
x_{2}. Note that second derivative is known from the given equation \( x'' = 3\,x(t) - 2\, x' (t) = 3\, x_1 - 2\, x_2
. \) This leads to the system of first order differential
equations
Upon introducing a column vector x = [ x_{1},
x_{2}]^{T} (here "T" stands for transposition; see first chapter of this tutorial), we can rewrite the system of ODEs in more appropriate vector form:
Example 2:
With our next example, we show that variable coefficients do not prevent conversion of a single equation of the second order into a system of two equations of the first order.
The differential equation for modeling non-linear electrical oscillations has the form
where \( \ddot{x} = \frac{{\text d}^2 x}{{\text d} t^2} = \texttt{D}^2 x(t) \) stands for the second derivative with respect to time variable t, and q(t) is a single-valued periodic function. This linear differential equation has practical importance in the analysis of time-variable circuits. If function q(t) is a periodic function, then Eq.(2.1) is called Hill's equation. It was discovered by an American astronomer and mathematician from Rutgers University George William Hill (1838--1914) in 1886.
Hill's differential equation is widely applicable to many problems involving sawtooth variations, rectangular ripple analysis, exponential variations, etc.
A particular case of Hill's differential equation when
the function q(t) has the form
\[
q(t) = a_0 - a_1 \cos \omega t
\tag{2.2}
\]
is known as Mathieu's equation.
The problem of vibrations of an elliptical membrane was first seriously studied by the French mathematician Émile Léonard Mathieu (1835--1890) in 1868, in his celebrated paper, “Mémoire sur le mouvement vibratoire d’une membrane de forme elliptique”, J. Math. Pures Appl. 13, 137 (1868). (http://eudml.org/doc/234720).
Mathieu proved that the vertical displacement produced from oscillations on an elliptical drumhead is governed, in general, by the 2D Helmholtz equation. Furthermore, Mathieu showed that when the 2D Helmholtz equation is transformed into elliptical coordinates, the differential equations, respectively, for the radial and angular coordinates of the elliptical system become separable. The result is two almost identical linear second-order differential equations, each of which is aptly named the “Mathieu equation”. So, to this end, Mathieu’s equation has the canonical form
where the separation constant λ and the physical parameter q are both real. Since a Wronskian of the Mathieu equation is a constant, if y_{1}(t) is a solution, then another linearly independent solution will be
where A is a parameter. Here double superdot \( \ddot{x} = \frac{{\text d}^2 x}{{\text d} t^2} \) stands for the second derivative with respect to time variable t. Upon converting the given
differential equation to the system of first order equations
So we see the behavior of solutions depends on the value of parameter A:
when it is greater than 1, solutions have different properties. Therefore,
such value of A = 1 is called the bifurcation point (or node).
Mathematica has a dedicated command NDSolve to find an a
approximate solution to the initial value problem. Correspondingly, we use it
to plot solutions
of the given initial value problem for different values of parameter A.
When you give a setting for WorkingPrecision (which is an option for various numerical operations that specifies how many digits of precision should be maintained in internal computations), this typically defines an upper limit on the precision of the results from a computation. But within this constraint you can tell the Wolfram Language how much precision and accuracy you want to get. You should realize that for many kinds of numerical operations, increasing precision and accuracy goals by only a few digits can greatly increase the computation time required. Given a particular setting for WorkingPrecision, each of the functions for numerical operations in the Wolfram Language uses certain default settings for PrecisionGoal (which is an option for various numerical operations that specifies how many effective digits of precision should be sought in the final result) and AccuracyGoal (which is an option for various numerical operations which specifies how many effective digits of accuracy should be sought in the final result). Typical is the case of NDSolve, in which these default settings are equal to half the settings given for WorkingPrecision.
One can try to control the accuracy of numerical calculations by introducing
the residual variable: \( z (t) = \ddot{x} = -
\left( A - \sin 2t \right) x = \left( \sin 2t - A \right) x . \)
Upon its differentiation, we get another system of differential equations with
three unknowns:
This system of 3 differential equations is not equivalent to the original Mathieu equation because its general solution depends on 3 arbitrary constants and its existence requires a solution to be from the space C³. To eliminate extra constant, we need to impose the third initial condition, so the initial value problem will have the same solution as the original IVP for the Mathieu equation.
⁎ ✱ ✲ ✳ ✺ ✻ ✼
✽ ❋
◾
Now we generalize the previous examples by considering the general linear
differential equation.
Suppose we are given an n-th order linear
differential equation
we reduce the given n-th order linear differential equation \eqref{EqConv.5} to an equivalent system of the first order equations.
We can assign to equation \eqref{EqConv.5} the differential
operator \( L\left[ t, \texttt{D} \right] =
\texttt{D}^n + a_{n-1} (t)\, \texttt{D}^{n-1} + \cdots + a_1 (t)\,\texttt{D} +
a_0 (x)\,\texttt{I} , \) where \( \texttt{D} =
{\text d}/{\text d}t \) and \( \texttt{D}^0 = \texttt{I} \) are the derivative operator and the identical operator, respectively. Then we rewrite the differential equation in a more concise
form:
\[
L\left[ t, \texttt{D} \right] x(t) = f(t) .
\]
When coefficients in this differential operator of order
n do not depend on t, we obtain a constant coefficient
differential operator \( L\left[ \texttt{D} \right] =
\texttt{D}^n + a_{n-1} \, \texttt{D}^{n-1} + \cdots + a_1 \,\texttt{D} +
a_0 \,\texttt{I}, \) to which we can assign a polynomial
Example 3:
Consider a polynomial of the fourth degree
\( p(\lambda ) = \lambda^4 + a\,\lambda^3 + b\, \lambda^2 +
c\,\lambda + d , \) where d ≠ 0, and its companion matrix
The same approach can be used to convert a nonlinear differential equation to an
equivalent system of first order equations. It should be noted that nonlinear
systems of differential equations can be solved only numerically (except for some
rare cases). When a numerical solver is applied, it is common to introduce an
auxiliary variable---the residual---in order to control accuracy. This
approach is demonstrated in the following series of examples.
Example 4:
Consider the initial value problem for the ideal pendulum equation
where initial displacement 𝑎 and velocity v are specified, and
ω² = (g/ℓ), with g being acceleration due to
gravity and ℓ being length of the rod containing as point mass. This
equation is called the ideal pendulum equation because friction is not taken
into account. When resistance is presented and assumed to be proportional to
the velocity of the bob, the equation becomes
Phase portrait for undriven pendulum with resistance
The ideal pendulum equation can be converted to a system of three equations
upon introducing the residual
variable:
\[
\begin{cases}
\dot{x} = y , \\
\dot{y} = z , \\
\dot{z} = - \omega^2 y\,\cos x ,
\end{cases} \qquad x(0) =a, \quad y(0) = v , \quad z(0) = - \omega^2 \sin a .
\]
So z is actually the second derivative of the original unknown function
x(t). Therefore, the last equation is obtained by differentiation of the original pendulum equation with respect to time variable t.
◾
Example 5:
In the previous example, we converted the ideal pendulum equation
\( \ddot{\theta} + \sin \theta =0 \) into the
system of first order equations upon substituting y_{1} =
θ and let y_{2} to be its derivative. Now we may define a
new variable y_{3} = sinθ. According to the chain rule, its
derivative is
where f(t) is a periodic forcing function, which can be chosen
as \( f(t) = A\,\cos \omega t \) or
\( f(t) = A\,\sin \omega t \) or their combination
\( f(t) = A\,\sin \left( \omega t + \phi \right) . \) Here A is the amplitude, k is the dumping coefficient, and ε is a real number.
We convert it to the system of two differential equations
It is not possible to specify a boundary condition at ∞ numerically,
so we will have to use a large number, and verify it is "large
enough". From the solution, we evaluate the derivatives at y =
0, and we have f''(0) = f_{3}(0).
We have to provide initial guesses
for f_{1}, f_{2}
and f_{3}. This is the hardest part about this
problem. We know that f_{1} starts at zero, and is flat
there
(f'(0)=0), but at large y, it has a constant slope
of one. We will guess a simple line of slope = 1
for f_{1}. That is correct at large y, and
is zero at y = 0. If the slope of the function is constant at large y, then the values of higher derivatives must tend to zero. We choose an exponential decay as a guess.
Finally, we let a solver iteratively find a solution for us, and find the answer we want.
We get the initial second derivative to be f''(0) =
0.3324911095517483. For the sake of simplicity, we approximate the Blasius constant as 1/3.
Now we use Picard iteration procedure starting with the initial
conditions. Since the derivative operator is unbounded, we apply its inverse to reduce the given initial value problem into fixed point form:
where \( \displaystyle h(x) = \sum_{k=1}^n (-1)^k x^{2k} \) and 𝑎, b, and c are arbitrary real parameters. We consider a simple version of the Genesio equation when chaotic jerk is observed:
which is a nonlinear vector differential equation for three unknowns.
We make a numerical experiment and plot two solutions of the Genesio equation for small values of parameters 𝑎 = 0.1 and c = 0.1, and then compare with the case when they are zero.
V.A. Yakubovich, V.M. Starzhinskii, "Linear differential equations with periodic coefficients" , Wiley (1975) (Translated from Russian)
Return to Mathematica page
Return to the main page (APMA0340)
Return to the Part 1 Matrix Algebra
Return to the Part 2 Linear Systems of Ordinary Differential Equations
Return to the Part 3 Non-linear Systems of Ordinary Differential Equations
Return to the Part 4 Numerical Methods
Return to the Part 5 Fourier Series
Return to the Part 6 Partial Differential Equations
Return to the Part 7 Special Functions