The proof marries two metric facts, the Lebesgue number lemma and the diameter-shrinking property of barycentric subdivision, with an iterated application of the chain homotopy \(T\) from Lemma 2.12.12.
Step 1: Lebesgue numbers. Given an open cover \(\{V_\beta\}\) of a compact metric space \((K,\rho)\text{,}\) a Lebesgue number for the cover is a real \(\delta>0\) with the property that every subset of \(K\) of diameter less than \(\delta\) is contained in some \(V_\beta\text{.}\) A Lebesgue number always exists: pass to a finite subcover \(V_1,\ldots,V_k\text{,}\) and define
\begin{equation*}
f\colon K\to\mathbb{R}, \qquad f(x) := \max_{1\le i\le k}\rho(x,\, K\setminus V_i).
\end{equation*}
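Each distance-to-complement function \(x\mapsto\rho(x,K\setminus V_i)\) is continuous and vanishes exactly on the closed set \(K\setminus V_i\text{.}\) (If some \(V_i=K\text{,}\) every subset of \(K\) lies in \(V_i\) and any \(\delta>0\) works, so assume each complement is nonempty.) Since every \(x\) lies in some \(V_i\text{,}\) at least one of these functions is positive at \(x\text{,}\) so \(f(x)>0\) on all of \(K\text{.}\) As a maximum of finitely many continuous functions, \(f\) is continuous, so by compactness it attains its minimum and \(\delta:=\min_{x\in K}f(x)\) is strictly positive. If \(A\subseteq K\) has diameter \(<\delta\) and contains a point \(x\text{,}\) choose \(i\) attaining the maximum in \(f(x)\text{.}\) Then \(\rho(x,K\setminus V_i)=f(x)\ge\delta\text{,}\) so the open ball \(B_\rho(x,\delta)\) misses \(K\setminus V_i\text{,}\) and \(A\subseteq B_\rho(x,\delta)\subseteq V_i\text{.}\)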
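As a small illustration (an example chosen here, not part of the original argument), take \(K=[0,1]\) with the usual metric and the cover \(V_1=[0,\tfrac23)\text{,}\) \(V_2=(\tfrac13,1]\text{.}\) The recipe above gives
\begin{equation*}
f(x) \;=\; \max\bigl\{\rho(x,[\tfrac23,1]),\ \rho(x,[0,\tfrac13])\bigr\} \;=\; \max\bigl\{\tfrac23-x,\ x-\tfrac13\bigr\},
\qquad \delta \;=\; \min_{x\in[0,1]}f(x) \;=\; f(\tfrac12) \;=\; \tfrac16.
\end{equation*}
So every subset of \([0,1]\) of diameter less than \(\tfrac16\) lies in \(V_1\) or in \(V_2\text{.}\) The construction need not produce the largest possible Lebesgue number: in this example \(\delta=\tfrac13\) also works, since a set of diameter less than \(\tfrac13\) cannot meet both \([0,\tfrac13]\) and \([\tfrac23,1]\text{.}\)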
Step 2: subdivision shrinks simplices. If \(\Delta\subseteq\mathbb{R}^N\) is an affine \(n\)-simplex with diameter \(d\text{,}\) every simplex appearing in \(S(\Delta)\) has diameter at most \(\tfrac{n}{n+1}d\text{.}\) To see this, let \(F\subseteq\Delta\) be a face with vertices \(v_0,\ldots,v_k\) (\(k\le n\)) and barycenter \(b_F=\tfrac{1}{k+1}\sum_{l}v_l\text{.}\) For any vertex \(v_l\) of \(F\text{,}\)
\begin{equation*}
v_l - b_F \;=\; \tfrac{1}{k+1}\sum_{m\neq l}(v_l - v_m),
\qquad\text{so}\qquad \|v_l - b_F\|\;\le\;\tfrac{k}{k+1}d\;\le\;\tfrac{n}{n+1}d.
\end{equation*}
The closed ball of radius \(\tfrac{n}{n+1}d\) around \(b_F\) therefore contains every vertex of \(F\text{,}\) and since balls are convex, it contains all of \(F\text{.}\) Now a simplex of \(S(\Delta)\) has vertices \(b_{F_0},\ldots,b_{F_n}\) with \(F_0\subsetneq\cdots\subsetneq F_n=\Delta\text{.}\) For \(i<j\text{,}\) \(b_{F_i}\in F_i\subseteq F_j\text{,}\) so \(\|b_{F_i}-b_{F_j}\|\le\tfrac{n}{n+1}d\) by the bound applied to \(F_j\text{.}\) The diameter of a simplex is the largest distance between two of its vertices, so every simplex of \(S(\Delta)\) has diameter at most \(\tfrac{n}{n+1}d\text{.}\)
Iterating, every simplex of \(S^m(\Delta)\) has diameter at most \(\bigl(\tfrac{n}{n+1}\bigr)^m d\text{,}\) which tends to \(0\) as \(m\to\infty\text{.}\)
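For a concrete check of the bound (an illustrative example, not taken from the text), let \(\Delta=[0,1]\subseteq\mathbb{R}\text{,}\) so that \(n=1\) and \(d=1\text{.}\) The barycenter is \(\tfrac12\text{,}\) and the two \(1\)-simplices of \(S(\Delta)\) have underlying sets \([0,\tfrac12]\) and \([\tfrac12,1]\text{,}\) so
\begin{equation*}
\operatorname{diam} \;=\; \tfrac12 \;=\; \tfrac{n}{n+1}\,d,
\qquad\text{and after } m \text{ subdivisions}\qquad
\operatorname{diam} \;=\; 2^{-m} \;=\; \bigl(\tfrac{n}{n+1}\bigr)^{m} d.
\end{equation*}
The bound is attained when \(n=1\) but is not sharp in general. For an equilateral triangle of side \(1\) (so \(n=2\text{,}\) \(d=1\)), the longest edge occurring in \(S(\Delta)\) joins a vertex to the barycenter of the triangle and has length \(\tfrac{1}{\sqrt3}\approx 0.577\text{,}\) which is strictly less than \(\tfrac23\text{.}\)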
Step 3: iteration makes chains \(\mathcal{U}\)-small. Fix a singular simplex \(\sigma\colon\Delta^n\to X\text{.}\) The sets \(\sigma^{-1}(\mathrm{int}\,U_\alpha)\) form an open cover of the compact metric space \(\Delta^n\text{,}\) so by Step 1 this cover has a Lebesgue number \(\delta_\sigma>0\text{.}\) By Step 2, there is an \(m\ge 0\) such that every simplex of \(S^{m}(\iota_n)\) has diameter less than \(\delta_\sigma\) (where \(\iota_n\colon\Delta^n\to\Delta^n\) is the identity), and each such sub-simplex then has image under \(\sigma\) contained in some \(U_\alpha\text{.}\) Let \(m(\sigma)\ge 0\) be the least integer \(m\) for which every simplex \(\lambda\) of \(S^{m}(\iota_n)\) has \(\sigma(\lambda)\) contained in some \(U_\alpha\text{;}\) in particular \(m(\sigma)=0\) when \(\sigma\) itself is \(\mathcal{U}\)-small. Then
\begin{equation*}
S^{m(\sigma)}(\sigma) \;=\; \sigma_\sharp\bigl(S^{m(\sigma)}(\iota_n)\bigr) \;\in\; C_n^{\mathcal{U}}(X).
\end{equation*}
The integer \(m(\sigma)\) is defined on generators only; it is not extended linearly. Further subdivision only shrinks images, so \(S^m(\sigma)\) is \(\mathcal{U}\)-small for every \(m\ge m(\sigma)\text{.}\) Moreover, \(m(\tau)\le m(\sigma)\) whenever \(\tau\) is a face of \(\sigma\text{:}\) every simplex in the \(m\)-fold subdivision of a face of \(\Delta^n\) is a face of a simplex of \(S^m(\iota_n)\text{,}\) so once the simplices of \(S^m(\sigma)\) have images in members of \(\mathcal{U}\text{,}\) so do those of \(S^m(\tau)\text{.}\)
Step 4: a telescoping chain homotopy. For each fixed \(m\ge 0\text{,}\) define \(T_m\colon C_n(X)\to C_{n+1}(X)\) (with \(T_0=0\text{,}\) the empty sum) by
\begin{equation*}
T_m \;:=\; \sum_{i=0}^{m-1} T\circ S^i.
\end{equation*}
Using \(\partial T = (S-\mathrm{id}) - T\partial\) on each summand, together with \(\partial S^i = S^i\partial\text{,}\)
\begin{equation*}
\partial T_m \;=\; \sum_{i=0}^{m-1}\bigl[(S-\mathrm{id})S^i - T\partial S^i\bigr]
\;=\; (S^m - \mathrm{id}) - T_m\,\partial,
\end{equation*}
i.e. \(\partial T_m + T_m\partial = S^m - \mathrm{id}\text{.}\) (The \(S^{i+1}-S^i\) terms telescope.)
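To see the telescoping in the smallest nontrivial case, here is the computation for \(m=2\) written out in full (a worked instance of the formula above, using only \(\partial T=(S-\mathrm{id})-T\partial\) and \(\partial S=S\partial\)). With \(T_2=T+TS\text{,}\)
\begin{equation*}
\partial T_2 \;=\; \bigl[(S-\mathrm{id}) - T\partial\bigr] + \bigl[(S-\mathrm{id})S - T\partial S\bigr]
\;=\; (S-\mathrm{id}) + (S^2 - S) - (T + TS)\,\partial
\;=\; (S^2-\mathrm{id}) - T_2\,\partial.
\end{equation*}
The two copies of \(S\) cancel, leaving only the endpoints \(S^2\) and \(\mathrm{id}\text{.}\)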
Step 5: from fixed \(m\) to variable \(m(\sigma)\text{.}\) Define \(D\colon C_n(X)\to C_{n+1}(X)\) on generators by
\begin{equation*}
D(\sigma) \;:=\; T_{m(\sigma)}(\sigma),
\end{equation*}
and the candidate retraction \(\rho\colon C_n(X)\to C_n^{\mathcal{U}}(X)\) (unrelated to the metric \(\rho\) of Step 1) by
\begin{equation*}
\rho(\sigma) \;:=\; \sigma + \partial D(\sigma) + D(\partial\sigma).
\end{equation*}
Applying Step 4 on the generator \(\sigma\) (with \(m=m(\sigma)\)) gives \(\partial D(\sigma) = S^{m(\sigma)}(\sigma) - \sigma - T_{m(\sigma)}(\partial\sigma)\text{,}\) so
\begin{equation*}
\rho(\sigma) \;=\; S^{m(\sigma)}(\sigma) \,+\, \bigl(D(\partial\sigma) - T_{m(\sigma)}(\partial\sigma)\bigr).
\end{equation*}
The first summand is \(\mathcal{U}\)-small by Step 3. For the correction term, on each face \(\tau\) of \(\sigma\) appearing in \(\partial\sigma\) (so that \(m(\tau)\le m(\sigma)\)),
\begin{equation*}
D(\tau) - T_{m(\sigma)}(\tau)
\;=\; T_{m(\tau)}(\tau) - T_{m(\sigma)}(\tau)
\;=\; -\sum_{i=m(\tau)}^{m(\sigma)-1} T\bigl(S^i(\tau)\bigr).
\end{equation*}
For \(i\ge m(\tau)\text{,}\) the chain \(S^i(\tau)\) is \(\mathcal{U}\)-small; and because \(T\) applied to any singular simplex \(\eta\) is supported in the image of \(\eta\) (since \(T(\eta)=\eta_\sharp T(\iota_{n-1})\)), \(T(S^i(\tau))\) is also \(\mathcal{U}\)-small. So the correction lies in \(C_\bullet^{\mathcal{U}}(X)\text{,}\) and \(\rho(\sigma)\in C_n^{\mathcal{U}}(X)\) as required.
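As a sanity check on the correction term, consider the hypothetical case \(m(\sigma)=2\) and a face \(\tau\) with \(m(\tau)=1\) (values chosen purely for illustration). Then
\begin{equation*}
D(\tau) - T_{2}(\tau)
\;=\; T_1(\tau) - T_2(\tau)
\;=\; T(\tau) - \bigl(T(\tau) + T(S(\tau))\bigr)
\;=\; -\,T\bigl(S(\tau)\bigr).
\end{equation*}
The term \(T(\tau)\text{,}\) which need not be \(\mathcal{U}\)-small, cancels. What remains is \(T\) applied to \(S(\tau)\text{,}\) which is \(\mathcal{U}\)-small because \(1\ge m(\tau)\text{.}\)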
Step 6: \(\rho\) is a chain-homotopy inverse to \(\iota\text{.}\) A direct computation using \(\partial^2=0\) shows that \(\rho\) is a chain map:
\begin{equation*}
\partial\rho(\sigma) \;=\; \partial\sigma + \partial^2 D(\sigma) + \partial D(\partial\sigma)
\;=\; \partial\sigma + \partial D(\partial\sigma)
\;=\; \rho(\partial\sigma),
\end{equation*}
using \(D(\partial^2\sigma)=0\) in the last step. If \(\sigma\) is already \(\mathcal{U}\)-small, then \(m(\sigma)=0\text{,}\) \(D(\sigma)=0\text{,}\) and each face is also \(\mathcal{U}\)-small with \(m(\tau)=0\text{,}\) so \(D(\partial\sigma)=0\text{;}\) hence \(\rho\iota=\mathrm{id}\text{.}\) Finally, the defining formula rearranges to
\begin{equation*}
\iota\rho - \mathrm{id} \;=\; \partial D + D\partial,
\end{equation*}
exhibiting \(D\) as a chain homotopy from \(\mathrm{id}\) to \(\iota\rho\text{.}\) Therefore \(\iota\) is a chain homotopy equivalence, and the induced map on homology is an isomorphism.