This section presents the famous Lagrange inversion theorem that provides an explicit expression for the inverse of an analytic function.


Lagrange inversion theorem

Joseph-Louis Lagrange.

The Lagrange inversion theorem (or Lagrange inversion formula, which we abbreviate as LIT), also known as the Lagrange--Bürmann formula, gives the Taylor series expansion of the inverse function of an analytic function. The theorem was proved by Joseph-Louis Lagrange (1736--1813) and generalized by the German mathematician and teacher Hans Heinrich Bürmann ( --1817), both in the late 18th century. The Lagrange inversion formula is one of the fundamental formulas of combinatorics. In its simplest form it gives the power series coefficients of the solution f(x) of the functional equation \( f(x) = x\, G\left( f(x) \right) \) in terms of the coefficients of powers of G.

Theorem: Suppose z is defined as a function of w by an equation of the form

\[ f(w) =z , \]
where f is analytic at a point \( a \) and \( f' (a) \ne 0 . \) Then the equation can be inverted, that is, solved for w in a neighborhood of \( z = f(a) , \) in the form of a power series:
\[ w = a + \sum_{n\ge 1} \frac{g_n}{n!} \left( z- f(a) \right)^n , \]
where
\[ g_n = \lim_{w \to a} \, \frac{{\text d}^{n-1}}{{\text d} w^{n-1}} \left( \frac{w-a}{f(w) - f(a)} \right)^n , \qquad n=1,2,\ldots . \]

Theorem: If \( f(z) = \sum_{n\ge 1} a_n z^n , \) with \( a_1 \ne 0 , \) (interpreted either as an analytic function or as a formal power series), then the inverse function has the following power series representation

\[ f^{-1} (w) = \sum_{n \ge 1} \, \frac{w^n}{n} \left[ z^{n-1} \right] \left( \frac{z}{f(z)} \right)^n , \]
where \( [z^n ] \,g(z) = g^{(n)} (0)/n! \) denotes the coefficient of \( z^n \) in the power series g(z). More generally, if \( h(z) = \sum_{m\ge 0} b_m z^m , \) we have
\[ h \left( f^{-1} (w) \right) = h(0) + \sum_{n \ge 1} \, \frac{w^n}{n} \left[ z^{n-1} \right] h' (z) \left( \frac{z}{f(z)} \right)^n . \qquad ■ \]

It is convenient to introduce the function (or formal power series) \( g(z) = z/f(z) . \) Then the equation \( f(z) = w \) can be rewritten as \( z = g(z)\, w , \) and its solution \( \varphi (w) = f^{-1} (w) \) is given by the power series

\[ \varphi (w) = f^{-1} (w) = \sum_{n \ge 1} \, \frac{w^n}{n} \left[ z^{n-1} \right] \left( g(z) \right)^n , \]
\[ h \left( \varphi (w) \right) = h(0) + \sum_{n \ge 1} \, \frac{w^n}{n} \left[ z^{n-1} \right] h' (z) \left( g(z) \right)^n . \]
There is also an alternate form
\[ h \left( \varphi (w) \right) = h(0) + \sum_{n \ge 1} \, w^n \left[ z^{n} \right] h (z) \left[ g^n (z) - z\, g' (z) \, g^{n-1} (z) \right] . \]
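The coefficient formula \( [w^n]\, \varphi (w) = \frac{1}{n} \left[ z^{n-1} \right] g^n (z) \) is easy to evaluate mechanically with truncated power series. The following is a minimal Python sketch, not part of the original tutorial; the helper names `series_mul`, `series_inv`, and `lagrange_inverse` are our own, and exact rational arithmetic is used so that the coefficients can be compared with closed forms.

```python
from fractions import Fraction

def series_mul(a, b, order):
    """Product of two truncated power series (coefficient lists), kept through z**(order-1)."""
    out = [Fraction(0)] * order
    for i, ai in enumerate(a[:order]):
        if ai == 0:
            continue
        for j, bj in enumerate(b[:order - i]):
            out[i + j] += ai * bj
    return out

def series_inv(a, order):
    """Reciprocal 1/a(z) of a power series with a[0] != 0, kept through z**(order-1)."""
    inv = [Fraction(1) / a[0]] + [Fraction(0)] * (order - 1)
    for n in range(1, order):
        s = sum(a[k] * inv[n - k] for k in range(1, min(n, len(a) - 1) + 1))
        inv[n] = -s / a[0]
    return inv

def lagrange_inverse(f, order):
    """Coefficients c[0..order] of f^{-1}(w) for f = [0, a1, a2, ...] with a1 != 0.

    Uses c[n] = (1/n) [z^(n-1)] g(z)^n with g(z) = z / f(z).
    """
    f = [Fraction(c) for c in f]
    # f(z)/z has constant term a1 != 0, so g = z/f(z) is a genuine power series.
    g = series_inv(f[1:], order)
    coeffs = [Fraction(0)]
    power = [Fraction(1)] + [Fraction(0)] * (order - 1)   # g^0
    for n in range(1, order + 1):
        power = series_mul(power, g, order)               # now g^n
        coeffs.append(power[n - 1] / n)
    return coeffs

# Inverse of f(z) = z - z^2, through w^5.
print(lagrange_inverse([0, 1, -1], 5))
```

For \( f(z) = z - z^2 \) we have \( g(z) = 1/(1-z) , \) and the computed coefficients are the Catalan numbers 1, 1, 2, 5, 14, in agreement with the closed form \( f^{-1} (w) = \left( 1 - \sqrt{1-4w} \right) /2 . \) To get the inverse through \( w^N , \) the input list must contain the coefficients of f through \( z^N . \)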


The more general implicit-function problem \( F(z,w) =0 \) was solved by the Russian mathematician A. P. Yuzhakov in 1975. We rewrite the equation as \( z= G(z,w) , \) where \( G(0,0) =0 \) and \( \left\vert \left( \partial G/ \partial z \right) (0,0) \right\vert <1 . \) Then its solution \( z = \varphi (w) \) is given by the series of functions

\[ \varphi (w) = \sum_{n \ge 1} \, \frac{1}{n} \left[ z^{n-1} \right] \left( G(z,w) \right)^n . \]
More generally, for any analytic function (or formal power series) \( H(z,w) , \)
\begin{align*} H \left( \varphi (w) , w \right) &= H(0,w) + \sum_{n\ge 1} \frac{1}{n}\, \left[ z^{n-1} \right] \frac{\partial H(z,w)}{\partial z} \, G^n (z,w) \\ &= H(0,w) + \sum_{n\ge 1} \left[ z^{n} \right] H(z,w) \left[ G^n (z,w) - z\, \frac{\partial G(z,w)}{\partial z} \, G^{n-1} (z,w) \right] . \end{align*}

Example: Let \( G(z,w) = \alpha (w)\,z + \beta (w) , \) with \( \beta (0) =0 . \) Then the solution \( z = \varphi (w) \) of the equation \( z = G(z,w) \) has the following series representation
\[ \varphi (w) = \sum_{n \ge 1} \, \frac{1}{n} \left[ z^{n-1} \right] \left( G(z,w) \right)^n = \sum_{n\ge 1} \alpha^{n-1} (w) \, \beta (w) = \frac{\beta (w)}{1- \alpha (w)} , \]
provided \( \left\vert \alpha (w) \right\vert <1 . \)    ▣
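The coefficient extraction in this example is a one-line application of the binomial theorem: only the term with \( z^{n-1} \) survives, so
\[ \left[ z^{n-1} \right] \left( \alpha z + \beta \right)^n = \binom{n}{n-1} \alpha^{n-1} \beta = n\, \alpha^{n-1} \beta , \]
and the factor n cancels the prefactor 1/n. As a check, solving the linear equation \( z = \alpha z + \beta \) directly gives the same result, \( z = \beta / (1 - \alpha ) . \)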

The inverse-function problem \( f(z) = w \) with \( f(z) = \sum_{n\ge 1} a_n z^n \quad (a_1 \ne 0 ) \) can be written in the form \( z = G(z,w) \) in a variety of different ways. The most obvious choice is

\[ G(z,w) = \frac{z}{f(z)} \, w , \]
which leads to the usual form of the Lagrange inversion formula:
\[ f^{-1} (w) = \sum_{n \ge 1} \, \frac{w^n}{n} \, a_1^{-n} \left[ z^{n-1} \right] \left( 1 + \sum_{j\ge 2} \frac{a_j}{a_1} \, z^{j-1} \right)^{-n} . \]
An alternative choice, proposed by A. P. Yuzhakov, is
\[ G(z,w) = \frac{w}{a_1} - \frac{f(z) - a_1 z}{a_1} , \]
which leads to
\[ f^{-1} (w) = \sum_{n \ge 1} \, \frac{a_1^{-n}}{n} \, \sum_{k=0}^n \binom{n}{k} w^{n-k} \left( -1 \right)^k \left[ z^{n-1} \right] \left( \sum_{j\ge 2} a_j \, z^{j} \right)^{k} . \]
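The double sum above is not yet ordered by powers of w. Collecting the terms with \( n-k = m \) gives the coefficient of \( w^m ; \) since \( \left( \sum_{j\ge 2} a_j z^j \right)^k \) starts at \( z^{2k} , \) only \( m \le n \le 2m-1 \) contribute, so each coefficient is a finite sum. The Python sketch below (our own illustration, with the hypothetical helper names `poly_mul` and `yuzhakov_inverse`) evaluates this regrouped sum in exact rational arithmetic.

```python
from fractions import Fraction
from math import comb

def poly_mul(a, b, order):
    """Product of two truncated power series, kept through z**(order-1)."""
    out = [Fraction(0)] * order
    for i, ai in enumerate(a[:order]):
        if ai:
            for j, bj in enumerate(b[:order - i]):
                out[i + j] += ai * bj
    return out

def yuzhakov_inverse(f, order):
    """Coefficients c[0..order] of f^{-1}(w) from the regrouped double sum.

    f = [0, a1, a2, ...] with a1 != 0; c[m] is the coefficient of w^m.
    """
    a1 = Fraction(f[1])
    size = 2 * order                       # powers z^0 .. z^(2*order - 1) suffice
    tail = [Fraction(0), Fraction(0)] + [Fraction(c) for c in f[2:size]]
    tail += [Fraction(0)] * (size - len(tail))
    # powers[k] holds (sum_{j>=2} a_j z^j)^k, truncated to `size` terms
    powers = [[Fraction(1)] + [Fraction(0)] * (size - 1)]
    for _ in range(order):
        powers.append(poly_mul(powers[-1], tail, size))
    coeffs = [Fraction(0)] * (order + 1)
    for m in range(1, order + 1):
        for n in range(m, 2 * m):          # terms with n > 2m - 1 vanish
            k = n - m
            sign_binom = Fraction((-1) ** k * comb(n, k), n)
            coeffs[m] += sign_binom * powers[k][n - 1] / a1 ** n
    return coeffs

# Inverse of f(z) = z - z^2, through w^5.
print(yuzhakov_inverse([0, 1, -1], 5))
```

For \( f(z) = z - z^2 \) this reproduces the Catalan numbers 1, 1, 2, 5, 14, the same coefficients that the usual form of the Lagrange inversion formula gives.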

For instance, if \( f(z) = z\, e^{-z} , \) then the usual Lagrange inversion formula gives

\[ f^{-1} (w) = \sum_{n\ge 1} n^{n-1} \, \frac{w^n}{n!} . \]
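The intermediate step is short. Here \( g(z) = z/f(z) = e^{z} , \) so \( g^n (z) = e^{nz} \) and
\[ \frac{1}{n} \left[ z^{n-1} \right] e^{nz} = \frac{1}{n} \cdot \frac{n^{n-1}}{(n-1)!} = \frac{n^{n-1}}{n!} , \]
which is the coefficient of \( w^n \) in the series above.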
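Example: Use Lagrange inversion to explicitly solve the fifth-degree algebraic equation
\[ x^5 - x - a =0 . \]
Writing the equation as \( f(x) = -a \) with \( f(x) = x - x^5 , \) so that \( x/f(x) = \left( 1 - x^4 \right)^{-1} , \) the LIT yields, for sufficiently small |a|, the root that vanishes at \( a=0 : \)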
\[ x = -\sum_{k\ge 0} \binom{5k}{k} \, \frac{a^{4k+1}}{4k+1} . \]
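A quick numerical check of this series is worthwhile. The sketch below is our own illustration in Python (the function name `quintic_root` and the sample value a = 0.3 are arbitrary choices); it sums the first terms of the series and substitutes the result back into the quintic. The ratio test applied to the binomial coefficients indicates convergence for \( |a| < 4 \cdot 5^{-5/4} \approx 0.535 , \) so the sample value lies well inside that range.

```python
from math import comb

def quintic_root(a, terms=40):
    """Partial sum of x = -sum_{k>=0} C(5k, k) a^(4k+1) / (4k+1)."""
    return -sum(comb(5 * k, k) * a ** (4 * k + 1) / (4 * k + 1)
                for k in range(terms))

a = 0.3
x = quintic_root(a)
residual = x**5 - x - a      # should vanish up to rounding error
print(x, residual)
```

The series picks out the root that tends to 0 as a tends to 0; the other four roots of the quintic are not represented by it.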
Johann Heinrich Lambert.
Example: The Lambert W function is named after Johann Heinrich Lambert, and it is the function \( {\displaystyle W(z)} \) that is implicitly defined by the equation
\[ W(z)\, e^{W(z)} =z . \]
For every \( z \ne 0 , \) this equation has infinitely many solutions, most of them complex, and so W is a multivalued function. The different solutions are labeled by an integer k called the branch of W. Thus, the proper way to talk about the solutions of the Lambert equation is to say that they are \( W_k (z) , \quad k=0, \pm 1, \pm 2, \ldots . \) There is always special interest in solutions that are purely real. When z is real, the Lambert equation has two real solutions, \( W_0 (z) \quad\mbox{and}\quad W_{-1} (z) , \) for \( -1/e < z < 0 ; \) it has exactly one real solution, \( W_0 (z) , \) for \( z \ge 0 ; \) at \( z = -1/e \) the two real branches meet at the value \( -1 ; \) and it has no real solution for \( z < -1/e . \) Even if z is real, the branches other than \( k =0,-1 \) are always complex. In Mathematica, the function is called ProductLog. Therefore, as soon as a problem is solved in terms of W, numerical values, plots, derivatives and integrals can be easily obtained.

The function first received a name in the early 1980s, when the program Maple defined a function that was named simply W. A historical search found work by the eighteenth-century scientist J. H. Lambert that foreshadowed the definition of the function; even though his work did not actually define the function, it was named in his honor. We may use the LIT to compute the Taylor series of the principal branch of the Lambert function centered at the origin. Since W is the inverse of \( f(w) = w\,e^{w} , \) we have \( \left( w/f(w) \right)^n = e^{-nw} , \) and therefore
\[ W(z) = \sum_{n\ge 1} \frac{z^n}{n!} \, \lim_{w \to 0} \left( \frac{{\text d}^{n-1}}{{\text d} w^{n-1}} \, e^{-nw} \right) = \sum_{n\ge 1} \left( -n \right)^{n-1} \frac{z^n}{n!} . \]
The radius of convergence of this series is \( e^{-1} . \)
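The series can be checked against the defining equation \( W e^{W} = z \) for any z inside the disk of convergence. The following Python sketch is our own illustration (the name `lambert_w_series` and the test point z = 0.2 are assumptions made for the example); it sums the first 60 terms.

```python
import math

def lambert_w_series(z, terms=60):
    """Partial sum of W(z) = sum_{n>=1} (-n)^(n-1) z^n / n!, for |z| < 1/e."""
    total = 0.0
    for n in range(1, terms + 1):
        total += (-n) ** (n - 1) / math.factorial(n) * z ** n
    return total

z = 0.2
w = lambert_w_series(z)
print(w, w * math.exp(w) - z)   # the second number should be close to 0
```

Near the boundary \( |z| = e^{-1} \) the terms decay slowly, so many more terms would be needed there; a library routine such as Mathematica's ProductLog is the practical choice in that regime.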

Johann Heinrich Lambert (1728--1777) was a Swiss polymath who made important contributions to the subjects of mathematics, physics (particularly optics), philosophy, astronomy and map projections.

By implicit differentiation, one can show that all branches of W satisfy the differential equation

\[ z \left( 1 + W \right) \frac{{\text d}W}{{\text d} z} = W \qquad\mbox{for} \quad z \ne -1/e . \]
Using the identity \( e^{W} = z/W , \) we get the following equivalent equation
\[ \frac{{\text d}W}{{\text d} z} = \frac{1}{z + e^{W(z)}} \qquad\mbox{for} \quad z \ne -1/e . \]
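Both forms come from a single differentiation of the defining relation \( W e^{W} = z \) with respect to z:
\[ \left( 1 + W \right) e^{W} \, \frac{{\text d}W}{{\text d} z} = 1 . \]
Replacing \( e^{W} \) by \( z/W \) and multiplying by W gives the first equation. Alternatively, expanding \( \left( 1 + W \right) e^{W} = e^{W} + W e^{W} = e^{W} + z \) gives the second one.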
Example: Wien’s displacement law. The spectral distribution of black body radiation is a function of the wavelength λ and absolute temperature T, and is described by \( \rho (\lambda , T ) , \) defined such that \( \rho (\lambda , T ) \,{\text d} \lambda \) is the power emitted in a wavelength interval \( {\text d} \lambda \) per unit area from a black body at absolute temperature T. The wavelength \( \lambda_{\max} \) at which ρ is a maximum obeys Wien’s displacement law \( \lambda_{\max} T =b , \) where b is Wien’s displacement constant. This law was proposed by Wien in 1893 from general thermodynamic arguments. Once Planck's spectral distribution law is known, Wien’s law can be deduced and the value of b determined.

Planck's spectral distribution law is

\[ \rho (\lambda , T) = \frac{8\pi hc/\lambda^5}{\exp (hc/\lambda kT) -1} . \]
The value of λ for which this function is a maximum can be obtained by solving \( \partial \rho / \partial\lambda =0 . \) After simplification, this leads to the equation
\[ -5\,\exp \left( \frac{hc}{\lambda k T} \right) + 5 + \exp \left( \frac{hc}{\lambda kT} \right) \frac{hc}{\lambda kT} =0, \]
which, on substituting \( x = hc /\lambda kT , \) can be written concisely as the transcendental equation
\[ \left( x-5 \right) e^x = -5 . \]
This equation has the trivial solution x=0 and the nontrivial one:
\[ x=5 + W_0 \left( -5\,e^{-5} \right) . \]
Therefore, Wien's law is obtained with a new expression for Wien's displacement constant:
\[ b = \frac{hc/k}{5 + W_0 \left( -5\,e^{-5} \right)} \approx 2.8978 \times 10^{-3} \,\mbox{m}\cdot\mbox{K} . \]
In the past, one would have obtained the numerical value of the constant by programming a Newton--Raphson or similar solver for the equation in x; now one can start up a computer package and obtain the value without programming. Time is saved not only because no programming is needed, but also because the system developers have implemented the fastest and most accurate method of evaluation.
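For comparison, the Newton--Raphson route mentioned above takes only a few lines. The Python sketch below is our own illustration, not the method of any particular package: it solves \( (x-5)\,e^x + 5 = 0 \) starting from x = 5, which lies to the right of the nontrivial root, and then forms b. The numerical values of h, c, and k are the exact SI (2019) defining constants; they are an assumption of this example, since the text above does not list them.

```python
import math

# Exact SI (2019) values; assumed here, not given in the text above.
H_PLANCK = 6.62607015e-34     # J s
C_LIGHT = 299792458.0         # m / s
K_BOLTZMANN = 1.380649e-23    # J / K

def wien_root(tol=1e-14, max_iter=50):
    """Nontrivial root of (x - 5) e^x + 5 = 0 by Newton's method, started at x = 5."""
    x = 5.0
    for _ in range(max_iter):
        fx = (x - 5.0) * math.exp(x) + 5.0
        dfx = (x - 4.0) * math.exp(x)      # derivative of (x - 5) e^x
        step = fx / dfx
        x -= step
        if abs(step) < tol:
            break
    return x

x = wien_root()
b = H_PLANCK * C_LIGHT / (K_BOLTZMANN * x)   # Wien's displacement constant, in m K
print(x, b)
```

The root found this way is the same number as \( 5 + W_0 \left( -5\,e^{-5} \right) , \) roughly 4.9651, so in Mathematica the whole computation reduces to one call of ProductLog.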


Example: Maxwell--Boltzmann distribution. The Maxwell--Boltzmann distribution of velocity is written as
\[ f_{v} = 4\pi \left( \frac{m}{2\pi kT} \right)^{3/2} v^2 \exp \left\{ - \frac{mv^2}{2kT} \right\} , \]
where the velocity v is defined as the square root of the sum of squares of the three independent velocity components \( v = \sqrt{v_x^2 + v_y^2 + v_z^2} . \) Note that the unit of \( f_{v} \) in the above formula is probability per velocity v of a particle of mass m. The most probable velocity \( v_0 = \sqrt{2kT/m} \) is the velocity most likely to be possessed by any molecule in the system and corresponds to the maximum value \( f_0 = 4 / \left( v_0 e \sqrt{\pi} \right) \) of \( f_{v} . \)

Applying the definition of the W-function, we may express the velocity v as a function of the probability \( f_{v} : \)

\[ v^2 = -v_0^2 \, W_k \left( -\frac{f_v}{e\,f_0} \right) , \qquad 0 \le f_v \le f_0 , \quad k=0,-1. \]
It can be seen from the formula above that for each value of the probability \( f_{v} \quad (0< f_v < f_0 ) \) there are two possible velocities, \( v < v_0 \) and \( v > v_0 , \) given by the two real branches of the W-function, k=0 and k=-1, respectively. When \( f_v = f_0 , \) the property \( W_0 \left( - e^{-1} \right) = W_{-1} \left( - e^{-1} \right) = -1 \) gives \( v = v_0 . \) The velocity \( v = v_0 \sqrt{2} \) is one of the two possible velocities corresponding to \( f_v = 2\,e^{-1} f_0 , \) which results from the equality \( W_{-1} \left( -2\, e^{-2} \right) = -2 . \)
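The two-branch structure can be explored numerically. The Python standard library has no Lambert W function, so the sketch below substitutes a plain bisection on \( w\,e^{w} = x \) for the two real branches; this is a simple stand-in chosen for transparency, not the algorithm used by Mathematica's ProductLog, and the names `lambert_w_real` and `speeds` are our own.

```python
import math

def lambert_w_real(x, k=0, iters=200):
    """Real branch W_k(x), k in {0, -1}, for -1/e <= x < 0, by bisection on w*exp(w) = x."""
    if not (-1.0 / math.e <= x < 0.0):
        raise ValueError("expected -1/e <= x < 0")
    if k == 0:
        lo, hi = -1.0, 0.0              # w*exp(w) increases from -1/e to 0 on [-1, 0]
    elif k == -1:
        lo = -2.0
        while lo * math.exp(lo) < x:    # move lo left until w*exp(w) >= x there
            lo *= 2.0
        hi = -1.0                       # w*exp(w) decreases toward -1/e as w rises to -1
    else:
        raise ValueError("only the real branches k = 0 and k = -1 are handled")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        val = mid * math.exp(mid)
        # Branch 0: the function increases with w; branch -1: it decreases.
        if (val < x) == (k == 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def speeds(ratio):
    """Return (v/v0 on branch k=0, v/v0 on branch k=-1) for ratio = f_v/f_0 in (0, 1]."""
    arg = -ratio / math.e
    return tuple(math.sqrt(-lambert_w_real(arg, k)) for k in (0, -1))

slow, fast = speeds(2.0 / math.e)
print(slow, fast)
```

For \( f_v = 2\,e^{-1} f_0 \) the k=-1 branch returns \( v/v_0 = \sqrt{2} \) up to rounding, matching the special value quoted above, while the k=0 branch returns the companion speed below \( v_0 . \)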



