Extreme value theorem

From Wikipedia, the free encyclopedia
A continuous function {\displaystyle f(x)} on the closed interval {\displaystyle [a,b]} showing the absolute max (red) and the absolute min (blue).

In real analysis, a branch of mathematics, the extreme value theorem states that if a real-valued function {\displaystyle f} is continuous on the closed and bounded interval {\displaystyle [a,b]}, then {\displaystyle f} must attain a maximum and a minimum, each at least once.[1][2] That is, there exist numbers {\displaystyle c} and {\displaystyle d} in {\displaystyle [a,b]} such that: {\displaystyle f(c)\leq f(x)\leq f(d)\quad \forall x\in [a,b].}

The extreme value theorem is more specific than the related boundedness theorem, which states merely that a continuous function {\displaystyle f} on the closed interval {\displaystyle [a,b]} is bounded on that interval; that is, there exist real numbers {\displaystyle m} and {\displaystyle M} such that: {\displaystyle m\leq f(x)\leq M\quad \forall x\in [a,b].}

This does not say that {\displaystyle M} and {\displaystyle m} are necessarily the maximum and minimum values of {\displaystyle f} on the interval {\displaystyle [a,b],} which is what the extreme value theorem stipulates must also be the case.
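
As a rough numerical illustration of the statement (not a proof), the sketch below samples a continuous function on a closed interval and picks out sample points where the largest and smallest sampled values occur. The function `grid_extrema`, the grid size, and the example function are assumptions made for illustration only.

```python
def grid_extrema(f, a, b, n=10_000):
    """Approximate points c, d with f(c) <= f(x) <= f(d) by sampling
    n + 1 equally spaced points of the closed interval [a, b]."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    c = min(xs, key=f)  # first sample point with the smallest value
    d = max(xs, key=f)  # first sample point with the largest value
    return c, d

# Hypothetical example: f(x) = x^3 - 3x on [-2, 3].
f = lambda x: x**3 - 3 * x
c, d = grid_extrema(f, -2.0, 3.0)
print(c, f(c))  # a point where the sampled minimum occurs
print(d, f(d))  # a point where the sampled maximum occurs
```

In this example the minimum value is taken at two points of the interval (x = -2 and x = 1), which matches the phrase "each at least once" in the statement of the theorem.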

The extreme value theorem is used to prove Rolle's theorem. In a formulation due to Karl Weierstrass, the extreme value theorem states that a continuous function from a non-empty compact space to a subset of the real numbers attains a maximum and a minimum.

History

The extreme value theorem was originally proven by Bernard Bolzano in the 1830s in a work Function Theory but the work remained unpublished until 1930. Bolzano's proof consisted of showing that a continuous function on a closed interval was bounded, and then showing that the function attained a maximum and a minimum value. Both proofs involved what is known today as the Bolzano–Weierstrass theorem.[3]

Functions to which the theorem does not apply

The following examples show why the function domain must be closed and bounded in order for the theorem to apply. Each fails to attain a maximum on the given interval.

  1. {\displaystyle f(x)=x} defined over {\displaystyle [0,\infty )} is not bounded from above.
  2. {\displaystyle f(x)={\frac {x}{1+x}}} defined over {\displaystyle [0,\infty )} is bounded but does not attain its least upper bound {\displaystyle 1}.
  3. {\displaystyle f(x)={\frac {1}{x}}} defined over {\displaystyle (0,1]} is not bounded from above.
  4. {\displaystyle f(x)=1-x} defined over {\displaystyle (0,1]} is bounded but never attains its least upper bound {\displaystyle 1}.

Defining {\displaystyle f(0)=0} in the last two examples shows that both theorems require continuity on {\displaystyle [a,b]}.
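
The behaviour in examples 2 and 4 can be observed numerically. The following sketch, with sample points chosen arbitrarily for illustration, evaluates both functions along sequences that leave every closed bounded subinterval of the domain: the values climb towards 1 but stay strictly below it.

```python
def f2(x):
    """Example 2: x / (1 + x) on [0, infinity)."""
    return x / (1 + x)

def f4(x):
    """Example 4: 1 - x on (0, 1]."""
    return 1 - x

# Sample points chosen for illustration: x -> infinity for f2, x -> 0+ for f4.
values2 = [f2(10.0**k) for k in range(1, 7)]
values4 = [f4(10.0**-k) for k in range(1, 7)]

for v2, v4 in zip(values2, values4):
    print(v2, v4)  # both columns increase towards 1 without reaching it
```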

Generalization to metric and topological spaces

When moving from the real line {\displaystyle \mathbb {R} } to metric spaces and general topological spaces, the appropriate generalization of a closed bounded interval is a compact set. A set {\displaystyle K} is said to be compact if it has the following property: from every collection of open sets {\displaystyle U_{\alpha }} such that {\textstyle \bigcup U_{\alpha }\supset K}, a finite subcollection {\displaystyle U_{\alpha _{1}},\ldots ,U_{\alpha _{n}}} can be chosen such that {\textstyle \bigcup _{i=1}^{n}U_{\alpha _{i}}\supset K}. This is usually stated in short as "every open cover of {\displaystyle K} has a finite subcover". The Heine–Borel theorem asserts that a subset of the real line is compact if and only if it is both closed and bounded. Correspondingly, a metric space has the Heine–Borel property if every closed and bounded set is also compact.

The concept of a continuous function can likewise be generalized. Given topological spaces {\displaystyle V,\ W}, a function {\displaystyle f:V\to W} is said to be continuous if for every open set {\displaystyle U\subset W}, {\displaystyle f^{-1}(U)\subset V} is also open. Given these definitions, continuous functions can be shown to preserve compactness:[4]

Theorem. If {\displaystyle V,\ W} are topological spaces, {\displaystyle f:V\to W} is a continuous function, and {\displaystyle K\subset V} is compact, then {\displaystyle f(K)\subset W} is also compact.

In particular, if {\displaystyle W=\mathbb {R} }, then this theorem implies that {\displaystyle f(K)} is closed and bounded for any compact set {\displaystyle K}, which in turn implies that {\displaystyle f} attains its supremum and infimum on any (nonempty) compact set {\displaystyle K}. Thus, we have the following generalization of the extreme value theorem:[4]

Theorem. If {\displaystyle K} is a nonempty compact set and {\displaystyle f:K\to \mathbb {R} } is a continuous function, then {\displaystyle f} is bounded and there exist {\displaystyle p,q\in K} such that {\displaystyle f(p)=\sup _{x\in K}f(x)} and {\displaystyle f(q)=\inf _{x\in K}f(x)}.

Slightly more generally, an upper semicontinuous function on a nonempty compact set is bounded above and attains its supremum (see compact space#Functions and compact spaces).

Proving the theorems

We look at the proof for the upper bound and the maximum of {\displaystyle f}. By applying these results to the function {\displaystyle -f}, the existence of the lower bound and the result for the minimum of {\displaystyle f} follows. Also note that everything in the proof is done within the context of the real numbers.

We first prove the boundedness theorem, which is a step in the proof of the extreme value theorem. The basic steps involved in the proof of the extreme value theorem are:

  1. Prove the boundedness theorem.
  2. Find a sequence so that its image converges to the supremum of {\displaystyle f}.
  3. Show that there exists a subsequence that converges to a point in the domain.
  4. Use continuity to show that the image of the subsequence converges to the supremum.

Proof of the boundedness theorem

Boundedness Theorem. If {\displaystyle f(x)} is continuous on {\displaystyle [a,b],} then it is bounded on {\displaystyle [a,b].}

Proof

Suppose the function {\displaystyle f} is not bounded above on the interval {\displaystyle [a,b]}. Pick a sequence {\displaystyle (x_{n})_{n\in \mathbb {N} }} such that {\displaystyle x_{n}\in [a,b]} and {\displaystyle f(x_{n})>n}. Because {\displaystyle [a,b]} is bounded, the Bolzano–Weierstrass theorem implies that there exists a convergent subsequence {\displaystyle (x_{n_{k}})_{k\in \mathbb {N} }} of {\displaystyle ({x_{n}})}. Denote its limit by {\displaystyle x}. As {\displaystyle [a,b]} is closed, it contains {\displaystyle x}. Because {\displaystyle f} is continuous at {\displaystyle x}, we know that {\displaystyle f(x_{{n}_{k}})} converges to the real number {\displaystyle f(x)} (as {\displaystyle f} is sequentially continuous at {\displaystyle x}). But {\displaystyle f(x_{{n}_{k}})>n_{k}\geq k} for every {\displaystyle k}, which implies that {\displaystyle f(x_{{n}_{k}})} diverges to {\displaystyle +\infty }, a contradiction. Therefore, {\displaystyle f} is bounded above on {\displaystyle [a,b]}.

Alternative proof

Consider the set {\displaystyle B} of points {\displaystyle p} in {\displaystyle [a,b]} such that {\displaystyle f(x)} is bounded on {\displaystyle [a,p]}. We note that {\displaystyle a} is one such point, for {\displaystyle f(x)} is bounded on {\displaystyle [a,a]} by the value {\displaystyle f(a)}. If {\displaystyle e>a} is another point, then all points between {\displaystyle a} and {\displaystyle e} also belong to {\displaystyle B}. In other words {\displaystyle B} is an interval closed at its left end by {\displaystyle a}.

Now {\displaystyle f} is continuous on the right at {\displaystyle a}, hence there exists {\displaystyle \delta >0} such that {\displaystyle |f(x)-f(a)|<1} for all {\displaystyle x} in {\displaystyle [a,a+\delta ]}. Thus {\displaystyle f} is bounded by {\displaystyle f(a)-1} and {\displaystyle f(a)+1} on the interval {\displaystyle [a,a+\delta ]} so that all these points belong to {\displaystyle B}.

So far, we know that {\displaystyle B} is an interval of non-zero length, closed at its left end by {\displaystyle a}.

Next, {\displaystyle B} is bounded above by {\displaystyle b}. Hence the set {\displaystyle B} has a supremum in {\displaystyle [a,b]}; let us call it {\displaystyle s}. From the non-zero length of {\displaystyle B} we can deduce that {\displaystyle s>a}.

Suppose {\displaystyle s<b}. Now {\displaystyle f} is continuous at {\displaystyle s}, hence there exists {\displaystyle \delta >0} such that {\displaystyle |f(x)-f(s)|<1} for all {\displaystyle x} in {\displaystyle [s-\delta ,s+\delta ]} so that {\displaystyle f} is bounded on this interval. But it follows from the supremacy of {\displaystyle s} that there exists a point belonging to {\displaystyle B}, {\displaystyle e} say, which is greater than {\displaystyle s-\delta /2}. Thus {\displaystyle f} is bounded on {\displaystyle [a,e]} which overlaps {\displaystyle [s-\delta ,s+\delta ]} so that {\displaystyle f} is bounded on {\displaystyle [a,s+\delta ]}. This however contradicts the supremacy of {\displaystyle s}.

We must therefore have {\displaystyle s=b}. Now {\displaystyle f} is continuous on the left at {\displaystyle s}, hence there exists {\displaystyle \delta >0} such that {\displaystyle |f(x)-f(s)|<1} for all {\displaystyle x} in {\displaystyle [s-\delta ,s]} so that {\displaystyle f} is bounded on this interval. But it follows from the supremacy of {\displaystyle s} that there exists a point belonging to {\displaystyle B}, {\displaystyle e} say, which is greater than {\displaystyle s-\delta /2}. Thus {\displaystyle f} is bounded on {\displaystyle [a,e]} which overlaps {\displaystyle [s-\delta ,s]} so that {\displaystyle f} is bounded on {\displaystyle [a,s]}. Since {\displaystyle s=b}, this shows that {\displaystyle f} is bounded on {\displaystyle [a,b]}.

Proofs of the extreme value theorem

Proof of the Extreme Value Theorem

By the boundedness theorem, f is bounded from above; hence, by the Dedekind-completeness of the real numbers, the least upper bound (supremum) M of f exists. It is necessary to find a point d in [a, b] such that M = f(d). Let n be a natural number. As M is the least upper bound, M − 1/n is not an upper bound for f. Therefore, there exists {\displaystyle d_{n}} in [a, b] so that {\displaystyle M-1/n<f(d_{n})}. This defines a sequence {\displaystyle (d_{n})}. Since M is an upper bound for f, we have {\displaystyle M-1/n<f(d_{n})\leq M} for all n. Therefore, the sequence {\displaystyle (f(d_{n}))} converges to M.

The Bolzano–Weierstrass theorem tells us that there exists a subsequence {\displaystyle (d_{n_{k}})}, which converges to some d and, as [a, b] is closed, d is in [a, b]. Since f is continuous at d, the sequence {\displaystyle (f(d_{n_{k}}))} converges to f(d). But {\displaystyle (f(d_{n_{k}}))} is a subsequence of {\displaystyle (f(d_{n}))}, which converges to M, so M = f(d). Therefore, f attains its supremum M at d.
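
The construction of the points d_n can be imitated numerically. The sketch below is only an illustration under stated assumptions: the supremum M is supplied in advance, and the helper `first_point_above` (a hypothetical name) finds each d_n by a grid search, whereas the proof obtains d_n non-constructively from the definition of the supremum. The example function is the sine on [0, 3], for which M = 1.

```python
import math

def first_point_above(f, a, b, threshold, n_grid=10_000):
    """Return the first grid point x of [a, b] with f(x) > threshold, or None."""
    for i in range(n_grid + 1):
        x = a + (b - a) * i / n_grid
        if f(x) > threshold:
            return x
    return None

# Assumed example: f = sin on [0, 3], whose supremum M = 1 is known in advance.
a, b, M = 0.0, 3.0, 1.0
d = [first_point_above(math.sin, a, b, M - 1.0 / n) for n in range(1, 51)]

# Each d_n satisfies M - 1/n < f(d_n) <= M, so f(d_n) is squeezed towards M.
print(d[0], d[9], d[49])
print(math.sin(d[49]))
```

In this particular example the sequence d_n is monotone and approaches π/2 without passing to a subsequence. In general the points d_n may oscillate, which is why the proof invokes the Bolzano–Weierstrass theorem.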

Alternative Proof of the Extreme Value Theorem

The set {y ∈ ℝ : y = f(x) for some x ∈ [a, b]} is a bounded set. Hence, its least upper bound exists by the least upper bound property of the real numbers. Let M = sup(f(x)) on [a, b]. If there is no point x on [a, b] so that f(x) = M, then f(x) < M on [a, b]. Therefore, 1/(M − f(x)) is continuous on [a, b].

However, to every positive number ε, there is always some x in [a, b] such that M − f(x) < ε because M is the least upper bound. Hence, 1/(M − f(x)) > 1/ε, which means that 1/(M − f(x)) is not bounded. Since every continuous function on [a, b] is bounded, this contradicts the conclusion that 1/(M − f(x)) was continuous on [a, b]. Therefore, there must be a point x in [a, b] such that f(x) = M.
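
A short worked example, taken from the list of functions to which the theorem does not apply, may help show where this argument uses the closedness of the interval. On the half-open interval (0, 1] the function f(x) = 1 − x has supremum M = 1, which it never attains, and the auxiliary function of the proof becomes:

```latex
f(x) = 1 - x \ \text{ on } (0,1], \qquad
M = \sup_{x \in (0,1]} f(x) = 1, \qquad
\frac{1}{M - f(x)} = \frac{1}{x}.
```

Here 1/x is continuous and unbounded on (0, 1]. No contradiction arises, because the boundedness theorem applies only to closed bounded intervals.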

Proof using the hyperreals

Proof

In the setting of non-standard calculus, let N be an infinite hyperinteger. The interval [0, 1] has a natural hyperreal extension. Consider its partition into N subintervals of equal infinitesimal length 1/N, with partition points {\displaystyle x_{i}=i/N} as i "runs" from 0 to N. The function ƒ is also naturally extended to a function ƒ* defined on the hyperreals between 0 and 1. Note that in the standard setting (when N is finite), a point with the maximal value of ƒ can always be chosen among the N+1 points {\displaystyle x_{i}}, by induction. Hence, by the transfer principle, there is a hyperinteger {\displaystyle i_{0}} such that {\displaystyle 0\leq i_{0}\leq N} and {\displaystyle f^{*}(x_{i_{0}})\geq f^{*}(x_{i})} for all i = 0, ..., N. Consider the real point {\displaystyle c=\mathbf {st} (x_{i_{0}})} where st is the standard part function. An arbitrary real point x lies in a suitable sub-interval of the partition, namely {\displaystyle x\in [x_{i},x_{i+1}]}, so that {\displaystyle \mathbf {st} (x_{i})=x}. Applying st to the inequality {\displaystyle f^{*}(x_{i_{0}})\geq f^{*}(x_{i})}, we obtain {\displaystyle \mathbf {st} (f^{*}(x_{i_{0}}))\geq \mathbf {st} (f^{*}(x_{i}))}. By continuity of ƒ we have

{\displaystyle \mathbf {st} (f^{*}(x_{i_{0}}))=f(\mathbf {st} (x_{i_{0}}))=f(c)}.

Hence ƒ(c) ≥ ƒ(x), for all real x, proving c to be a maximum of ƒ.[5]
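
The standard (finite N) step of this argument is easy to carry out on a computer. The sketch below is a finite analogue only, not an implementation of the hyperreal construction: for a few finite values of N it picks the partition point x_i = i/N with the largest value of ƒ. The helper name `best_partition_point` and the example function sin(3x), whose maximum on [0, 1] is at π/6, are assumptions for illustration.

```python
import math

def best_partition_point(f, N):
    """Return x_i = i/N (0 <= i <= N) maximizing f among the N + 1
    partition points of [0, 1]; ties go to the smallest index."""
    i0 = max(range(N + 1), key=lambda i: f(i / N))
    return i0 / N

# Assumed example: f(x) = sin(3x) on [0, 1], maximized at x = pi/6.
f = lambda x: math.sin(3 * x)
points = [best_partition_point(f, N) for N in (10, 100, 1000)]

for p in points:
    print(p, f(p))  # the best partition point approaches pi/6 as N grows
```

As N grows, the best partition point moves closer to the true maximizer. In the hyperreal proof the infinite hyperinteger N together with the standard part function plays the role of this limiting process.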

Proof from first principles

Statement. If {\displaystyle f(x)} is continuous on {\displaystyle [a,b]} then it attains its supremum on {\displaystyle [a,b]}.

Proof

By the Boundedness Theorem, {\displaystyle f(x)} is bounded above on {\displaystyle [a,b]} and by the completeness property of the real numbers has a supremum on {\displaystyle [a,b]}. Let us call it {\displaystyle M}, or {\displaystyle M[a,b]}. It is clear that the restriction of {\displaystyle f} to the subinterval {\displaystyle [a,x]} where {\displaystyle x\leq b} has a supremum {\displaystyle M[a,x]} which is less than or equal to {\displaystyle M}, and that {\displaystyle M[a,x]} increases from {\displaystyle f(a)} to {\displaystyle M} as {\displaystyle x} increases from {\displaystyle a} to {\displaystyle b}.

If {\displaystyle f(a)=M} then we are done. Suppose therefore that {\displaystyle f(a)<M} and let {\displaystyle d=M-f(a)}. Consider the set {\displaystyle L} of points {\displaystyle x} in {\displaystyle [a,b]} such that {\displaystyle M[a,x]<M}.

Clearly {\displaystyle a\in L}; moreover if {\displaystyle e>a} is another point in {\displaystyle L} then all points between {\displaystyle a} and {\displaystyle e} also belong to {\displaystyle L} because {\displaystyle M[a,x]} is monotonic increasing. Hence {\displaystyle L} is a non-empty interval, closed at its left end by {\displaystyle a}.

Now {\displaystyle f} is continuous on the right at {\displaystyle a}, hence there exists {\displaystyle \delta >0} such that {\displaystyle |f(x)-f(a)|<d/2} for all {\displaystyle x} in {\displaystyle [a,a+\delta ]}. Thus {\displaystyle f} is less than {\displaystyle M-d/2} on the interval {\displaystyle [a,a+\delta ]} so that all these points belong to {\displaystyle L}.

Next, {\displaystyle L} is bounded above by {\displaystyle b} and has therefore a supremum in {\displaystyle [a,b]}: let us call it {\displaystyle s}. We see from the above that {\displaystyle s>a}. We will show that {\displaystyle s} is the point we are seeking i.e. the point where {\displaystyle f} attains its supremum, or in other words {\displaystyle f(s)=M}.

Suppose the contrary viz. {\displaystyle f(s)<M}. Let {\displaystyle d=M-f(s)} and consider the following two cases:

  1. {\displaystyle s<b}.   As {\displaystyle f} is continuous at {\displaystyle s}, there exists {\displaystyle \delta >0} such that {\displaystyle |f(x)-f(s)|<d/2} for all {\displaystyle x} in {\displaystyle [s-\delta ,s+\delta ]}. This means that {\displaystyle f} is less than {\displaystyle M-d/2} on the interval {\displaystyle [s-\delta ,s+\delta ]}. But it follows from the supremacy of {\displaystyle s} that there exists a point, {\displaystyle e} say, belonging to {\displaystyle L} which is greater than {\displaystyle s-\delta }. By the definition of {\displaystyle L}, {\displaystyle M[a,e]<M}. Let {\displaystyle d_{1}=M-M[a,e]} then for all {\displaystyle x} in {\displaystyle [a,e]}, {\displaystyle f(x)\leq M-d_{1}}. Taking {\displaystyle d_{2}} to be the minimum of {\displaystyle d/2} and {\displaystyle d_{1}}, we have {\displaystyle f(x)\leq M-d_{2}} for all {\displaystyle x} in {\displaystyle [a,s+\delta ]}.
    Hence {\displaystyle M[a,s+\delta ]<M} so that {\displaystyle s+\delta \in L}. This however contradicts the supremacy of {\displaystyle s} and completes the proof.
  2. {\displaystyle s=b}.   As {\displaystyle f} is continuous on the left at {\displaystyle s}, there exists {\displaystyle \delta >0} such that {\displaystyle |f(x)-f(s)|<d/2} for all {\displaystyle x} in {\displaystyle [s-\delta ,s]}. This means that {\displaystyle f} is less than {\displaystyle M-d/2} on the interval {\displaystyle [s-\delta ,s]}. But it follows from the supremacy of {\displaystyle s} that there exists a point, {\displaystyle e} say, belonging to {\displaystyle L} which is greater than {\displaystyle s-\delta }. By the definition of {\displaystyle L}, {\displaystyle M[a,e]<M}. Let {\displaystyle d_{1}=M-M[a,e]} then for all {\displaystyle x} in {\displaystyle [a,e]}, {\displaystyle f(x)\leq M-d_{1}}. Taking {\displaystyle d_{2}} to be the minimum of {\displaystyle d/2} and {\displaystyle d_{1}}, we have {\displaystyle f(x)\leq M-d_{2}} for all {\displaystyle x} in {\displaystyle [a,b]}. This contradicts the supremacy of {\displaystyle M} and completes the proof.
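
The role of the running supremum M[a, x] and of the point s can be illustrated on a grid. The following sketch is a discrete analogue under stated assumptions, not part of the proof: it tracks the running maximum of the sampled values and returns the first sample point at which that running maximum reaches the overall sampled maximum. The function name and the example f(x) = 1 − (x² − 1)² on [−2, 2], which attains its maximum 1 at both x = −1 and x = 1, are chosen for illustration.

```python
def first_point_attaining_max(f, a, b, n=10_000):
    """Grid analogue of s = sup L: the first sample point x at which the
    running maximum of f over [a, x] equals the maximum over all samples."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    ys = [f(x) for x in xs]
    M = max(ys)                    # grid analogue of M = M[a, b]
    running = float("-inf")
    for x, y in zip(xs, ys):
        running = max(running, y)  # grid analogue of M[a, x]
        if running == M:
            return x, M

# Assumed example with two maximizers, x = -1 and x = 1.
f = lambda x: 1 - (x * x - 1) ** 2
s, M = first_point_attaining_max(f, -2.0, 2.0)
print(s, M)
```

Of the two maximizers, the procedure returns the leftmost one, mirroring the fact that s in the proof is the first point at which M[a, x] reaches M.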

Extension to semi-continuous functions

If the continuity of the function f is weakened to semi-continuity, then the corresponding half of the boundedness theorem and the extreme value theorem hold and the values −∞ or +∞, respectively, from the extended real number line can be allowed as possible values.

A function {\displaystyle f:[a,b]\to [-\infty ,\infty )} is said to be upper semi-continuous if {\displaystyle \limsup _{y\to x}f(y)\leq f(x)\quad \forall x\in [a,b].}

Theorem. If a function f : [a, b] → [–∞, ∞) is upper semi-continuous, then f is bounded above and attains its supremum.

Proof

If {\displaystyle f(x)=-\infty } for all x in [a,b], then the supremum is also {\displaystyle -\infty } and the theorem is true. In all other cases, the proof is a slight modification of the proofs given above. In the proof of the boundedness theorem, the upper semi-continuity of f at x only implies that the limit superior of the subsequence {\displaystyle (f(x_{n_{k}}))} is bounded above by f(x) < ∞, but that is enough to obtain the contradiction. In the proof of the extreme value theorem, upper semi-continuity of f at d implies that the limit superior of the subsequence {\displaystyle (f(d_{n_{k}}))} is bounded above by f(d), but this suffices to conclude that f(d) = M.

Applying this result to −f proves a similar result for the infima of lower semi-continuous functions. A function {\displaystyle f:[a,b]\to (-\infty ,\infty ]} is said to be lower semi-continuous if {\displaystyle \liminf _{y\to x}f(y)\geq f(x)\quad \forall x\in [a,b].}

Theorem. If a function f : [a, b] → (–∞, ∞] is lower semi-continuous, then f is bounded below and attains its infimum.

A real-valued function is both upper and lower semi-continuous if and only if it is continuous in the usual sense. Hence these two theorems imply the boundedness theorem and the extreme value theorem.
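
A small example may make the one-sided nature of these theorems concrete. The function g below is a hypothetical example chosen for illustration: it equals x on (0, 1] and jumps up to 1 at x = 0, so it is upper semi-continuous but not lower semi-continuous at 0. It attains its supremum 1, while its infimum 0 is not attained.

```python
def g(x):
    """Hypothetical upper semi-continuous function on [0, 1]:
    g(0) = 1 and g(x) = x for 0 < x <= 1."""
    return 1.0 if x == 0 else x

# Sample points accumulating at 0, chosen for illustration.
samples = [0.0] + [10.0**-k for k in range(0, 8)]
values = [g(x) for x in samples]

print(max(values))  # the supremum 1 is attained (at x = 0 and x = 1)
print(min(values))  # positive, and smaller the closer the samples get to 0
```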

References

  1. ^ Spivak, Michael (September 1994). Calculus. Publish or Perish publishing. ISBN 978-0-914098-89-8.
  2. ^ Abbott, Stephen (2001). Understanding Analysis. Undergraduate Texts in Mathematics. New York: Springer-Verlag. ISBN 978-0387950600.
  3. ^ Rusnock, Paul; Kerr-Lawson, Angus (2005). "Bolzano and Uniform Continuity". Historia Mathematica. 32 (3): 303–311. doi:10.1016/j.hm.2004.11.003.
  4. ^ a b Rudin, Walter (1976). Principles of Mathematical Analysis. New York: McGraw Hill. pp. 89–90. ISBN 0-07-054235-X.
  5. ^ Keisler, H. Jerome (1986). Elementary Calculus : An Infinitesimal Approach (PDF). Boston: Prindle, Weber & Schmidt. p. 164. ISBN 0-87150-911-3.
