Cracking the Taylor Series Code

From Daunting Formula to Intuitive Understanding

8/6/2024 · 2 min read

When I first learned about the Taylor Series, I found it daunting to memorize the long formula. If you struggled with it too, here's a trick to help you quickly grasp the concept.

The essence of the Taylor series is that it approximates a function with a polynomial around a given point, using the information carried by the function's derivatives at that point. Intuitively, a function p(x) approximates another function f(x) around a point a if their values are equal at that point: p(a) = f(a). But that's not enough: the slope around that point should also be the same, which means the first derivatives of f(x) and p(x) should be equal at point a.

Moreover, the rate of change of the slope should be the same for a better fit, which means the second derivatives of f(x) and p(x) should match at point a. For even greater accuracy, we match higher-order derivatives, and this pattern continues for the third, fourth, and nth derivatives. Therefore p(x) = f(a) + f'(a)(x-a) + f''(a)(x-a)²/2! + ... + f⁽ⁿ⁾(a)(x-a)ⁿ/n!, letting n go to infinity. p(x) is an approximation of f(x) around the point a. In other words, if you know the value of a function and all of its derivatives at a single point, the Taylor series lets you reconstruct the function at every other point where the series converges.
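To make this concrete, here is a minimal Python sketch (my own illustration, not part of the formula above) that builds the degree-n Taylor polynomial of e^x around a = 1, where every derivative equals e^a, and watches the error shrink as n grows:

import math

def taylor_exp(x, a, n):
    # Degree-n Taylor polynomial of e^x around a.
    # Every derivative of e^x is e^x, so f^(k)(a) = e^a for every k.
    return sum(math.exp(a) * (x - a) ** k / math.factorial(k) for k in range(n + 1))

x, a = 2.0, 1.0
for n in (1, 2, 4, 8):
    p = taylor_exp(x, a, n)
    print(f"n={n}: p(x)={p:.6f}  error={abs(p - math.exp(x)):.2e}")

Each extra term uses one more derivative at a, and the approximation tightens accordingly.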

A Taylor series is called a Maclaurin series when the function is expanded around the point 0. It's fun to work out the Maclaurin series for functions like 1/(1-x), ln(1-x), and e^x. It's important to note that p(x) approximates f(x) only around the point a, not everywhere. The radius of convergence of the Taylor series is the maximum distance from a within which the polynomial converges to f(x). This radius determines the region where the polynomial approximation is valid and accurately represents the original function; the function is said to be analytic in this region.
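If you want to check your hand-derived Maclaurin series, SymPy can expand them symbolically; a small sketch, assuming SymPy is installed:

import sympy as sp

x = sp.symbols('x')
for f in (1 / (1 - x), sp.log(1 - x), sp.exp(x)):
    # series(f, x, 0, 6): expand f around x = 0, keeping terms below x**6
    print(f, "=", sp.series(f, x, 0, 6))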

Functions exhibit varying behaviors regarding the convergence of their Taylor series. Some functions have Taylor series that converge for all real numbers, while others converge only within a limited radius. For instance, the Taylor series of the exponential function e^x, the trigonometric functions sin(x) and cos(x), and all polynomial functions converge for every real number; these are examples where the radius of convergence is infinite. In contrast, other functions have Taylor series with a finite, sometimes quite small, radius of convergence, such as the square root, logarithm, tangent, and arctangent.
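A quick numeric way to see the difference: the geometric series for 1/(1-x) has radius of convergence 1, so its partial sums settle down for |x| < 1 and blow up for |x| > 1, while the series for e^x converges everywhere. A small Python sketch (the test values are my own choice, just for illustration):

import math

def geometric_partial_sum(x, n):
    # Partial sum of the Maclaurin series of 1/(1-x): 1 + x + x^2 + ... + x^n
    return sum(x ** k for k in range(n + 1))

for x in (0.5, 1.5):
    sums = [round(geometric_partial_sum(x, n), 2) for n in (5, 10, 20)]
    print(f"x={x}: true value={1 / (1 - x):.2f}, partial sums={sums}")

# e^x has an infinite radius of convergence: the series converges even far from 0
print(f"e^5: true={math.exp(5):.4f}, 30-term sum={sum(5.0**k / math.factorial(k) for k in range(30)):.4f}")

At x = 0.5 the partial sums home in on 2, while at x = 1.5 they explode even though 1/(1-x) is perfectly well defined there.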

One application of the Taylor Series is the Delta Method, which derives the asymptotic distribution of a random variable. The method applies when the random variable is a differentiable function of another random variable that is asymptotically Gaussian. "Asymptotically Gaussian" means the random variable approaches a normal distribution as the sample size increases; the sample mean is the classic example.

The intuition behind the delta method is that any differentiable function g, over a "small enough" range, can be approximated by its first-order Taylor series; "small enough" means the higher-order terms of the Taylor series are negligible near the point of expansion. Such a range can be obtained by expanding the function near the mean, since the Central Limit Theorem (CLT) shows that the variance of the sample mean shrinks as the sample size increases. Therefore g(Xn) ≈ g(mu) + g'(mu)(Xn - mu), where Xn is an asymptotically Gaussian random variable, such as the sample mean, and mu is the expectation of Xn. Because g(Xn) is then approximately a linear transformation of a Gaussian variable, it is itself asymptotically Gaussian, with mean g(mu) and variance g'(mu)² Var(Xn).
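As a sanity check on this linearization, here is a small Monte Carlo sketch (my own example: Exponential(1) draws and g(x) = x², chosen only for illustration) that compares the simulated variance of g(Xn) with the delta-method prediction g'(mu)² · sigma²/n, assuming NumPy is available:

import numpy as np

rng = np.random.default_rng(0)
n, reps = 1000, 20000
mu, sigma2 = 1.0, 1.0        # mean and variance of the Exponential(1) distribution

# Simulate the sample mean Xn of n draws, many times over
xbar = rng.exponential(1.0, size=(reps, n)).mean(axis=1)

# g(x) = x^2, so g'(mu) = 2*mu; delta method: Var[g(Xn)] ≈ g'(mu)^2 * sigma2 / n
empirical_var = (xbar ** 2).var()
predicted_var = (2 * mu) ** 2 * sigma2 / n
print(f"empirical variance: {empirical_var:.5f}, delta-method prediction: {predicted_var:.5f}")

With these settings both values should land near 0.004, and the agreement tightens as n grows.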

Therefore, in practice, the Delta method is more effective for large samples, as the approximation becomes more robust and the influence of higher-order terms diminishes.