<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://connorboyle.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://connorboyle.io/" rel="alternate" type="text/html" /><updated>2026-05-12T21:59:59+00:00</updated><id>https://connorboyle.io/feed.xml</id><title type="html">Connor Boyle</title><entry><title type="html">Fast Fourier Transforms Part 3: Bluestein’s Algorithm</title><link href="https://connorboyle.io/2026/05/02/fft-bluesteins-algorithm.html" rel="alternate" type="text/html" title="Fast Fourier Transforms Part 3: Bluestein’s Algorithm" /><published>2026-05-02T00:00:00+00:00</published><updated>2026-05-02T00:00:00+00:00</updated><id>https://connorboyle.io/2026/05/02/fft-bluesteins-algorithm</id><content type="html" xml:base="https://connorboyle.io/2026/05/02/fft-bluesteins-algorithm.html"><![CDATA[<p><em>This post is part 3 in my ongoing series on fast Fourier transform algorithms. You may wish to
read <a href="/2025/09/11/fft-cooley-tukey.html">part 1</a> and <a href="/2026/04/27/fft-convolution-theorem.html">part 2</a> before reading this article.</em></p>

<p>The Cooley-Tukey algorithm works very well when the input sequence’s length is a product of (mostly or entirely) small
prime factors, which it can use to “break down” the problem. However, it provides little speedup once those small prime
factors (if any exist) are exhausted. In particular, for prime input lengths, Cooley-Tukey provides <em>no speedup at all</em>,
degrading to \(O(\lvert x \rvert^2)\) time complexity. In order to
guarantee \(O(\lvert x \rvert \log \lvert x \rvert )\) time complexity regardless of what prime factors the length has,
we will need another algorithm, such as Bluestein’s algorithm.</p>

<h2 id="bluesteins-algorithm">Bluestein’s Algorithm</h2>

<p>Recall the definition of the discrete Fourier transform:</p>

\[\mathcal{F} \{ x \}[k] = \sum_{j=0}^{|x| - 1} x[j] \cdot W_{|x|}^{jk}\]

<p>Substitute the identity \(jk = \frac{j^2 + k^2 - (k-j)^2}{2}\):</p>

\[= \sum_{j=0}^{|x| - 1} x[j] \cdot W_{|x|}^{\frac{j^2 + k^2 - (k-j)^2}{2}}\]

\[= W_{|x|}^{\frac{k^2}{2}} \cdot \sum_{j=0}^{|x| - 1} x[j] \cdot W_{|x|}^{\frac{j^2}{2}} \cdot W_{|x|}^{\frac{-(k-j)^2}{2}}\]

<p>At this point, the right-hand side should look very similar to a cyclic convolution; however, it still needs a few more
manipulations. First, replace every \(k\) with \(k + 1 - \lvert x \rvert\):</p>

\[\mathcal{F} \{ x \}[k + 1 - |x|] = W_{|x|}^{\frac{(k + 1 - |x|)^2}{2}} \cdot \sum_{j=0}^{|x| - 1} x[j] \cdot W_{|x|}^{\frac{j^2}{2}} \cdot W_{|x|}^{\frac{-(k + 1 - |x| - j)^2}{2}}\]

<p>Let there be sequences \(y\) and \(z\), both of some (yet-undecided) length \(N\) defined as follows:</p>

\[y[j] = \left\{
\begin{array}{lll}
x[j] \cdot W_{|x|}^{\frac{j^2}{2}} &amp; \textrm{if} &amp; 0 \leq j \leq |x| - 1 \\
0 &amp; \textrm{if} &amp; |x| \leq j \leq N - 1 \\
\end{array}
\right.\]

\[z[j] = W_{|x|}^{\frac{-(j + 1 - |x|)^2}{2}}\]

\[|y| = |z| = N\]

<p>Substituting these sequences into our previous equation:</p>

\[\mathcal{F} \{ x \}[k + 1 - |x|] = W_{|x|}^{\frac{(k + 1 - |x|)^2}{2}} \cdot \sum_{j=0}^{|x| - 1} y[j] \cdot z[k-j]\]

<p>For the discrete Fourier transform \(\mathcal{F}\{ x \}\), the only valid indices are from 0
to \(\lvert x \rvert - 1\), therefore:</p>

\[0 \leq k + 1 - |x| \leq |x| - 1\]

\[|x| - 1 \leq k \leq 2|x| - 2\]

<p>Inside the summation, we know that:</p>

\[0 \leq j \leq |x| - 1\]

\[1 - |x| \leq -j \leq 0\]

\[0 \leq k - j \leq 2|x| - 2\]

<p>If we choose an integer value for \(N\) such that \(2 \lvert x \rvert - 2 &lt; N\), then \((k - j) \% N = k-j\):</p>

\[\mathcal{F} \{ x \}[k + 1 - |x|] = W_{|x|}^{\frac{(k + 1 - |x|)^2}{2}} \cdot \sum_{j=0}^{|x| - 1} y[j] \cdot z[(k-j) \% N]\]

<p>Since \(y[j] = 0\) for all \(\lvert x \rvert \leq j \leq N - 1\), we can extend the summation’s bounds without
changing its value:</p>

\[\mathcal{F} \{ x \}[k + 1 - |x|] = W_{|x|}^{\frac{(k + 1 - |x|)^2}{2}} \cdot \sum_{j=0}^{N - 1} y[j] \cdot z[(k-j) \% N]\]

\[= W_{|x|}^{\frac{(k + 1 - |x|)^2}{2}} \cdot ( y \circledast z )[k]\]

\[\mathcal{F} \{ x \}[k] = W_{|x|}^{\frac{k^2}{2}} \cdot ( y \circledast z )[k + |x| - 1]\]
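<p>As a sanity check, the final identity can be evaluated directly and compared against the definition of the DFT. The
sketch below is only an illustration, not a fast implementation: it computes the cyclic convolution naively, so it still
takes quadratic time. It assumes the usual convention \(W_n = e^{-2 \pi i / n}\) (the definition from part 1 is not
repeated in this post), and the function names are made up for this example.</p>

```python
import cmath


def W(n, power):
    """Return W_n ** power, assuming the convention W_n = exp(-2*pi*i / n)."""
    return cmath.exp(-2j * cmath.pi * power / n)


def naive_dft(x):
    """The DFT computed straight from its definition (quadratic time)."""
    n = len(x)
    return [sum(x[j] * W(n, j * k) for j in range(n)) for k in range(n)]


def cyclic_convolve(y, z):
    """Naive cyclic convolution of two equal-length sequences."""
    n = len(y)
    return [sum(y[m] * z[(j - m) % n] for m in range(n)) for j in range(n)]


def bluestein_dft_direct(x, N=None):
    """Evaluate F{x}[k] = W^(k^2/2) * (y (*) z)[k + |x| - 1] with a naive convolution."""
    n = len(x)
    if n == 0:
        return []
    if N is None:
        N = 2 * n - 1  # the smallest length allowed by the derivation
    if 2 * n - 1 > N:
        raise ValueError("N must be at least 2 * len(x) - 1")
    # y: the input multiplied elementwise by a chirp, then zero-padded to length N
    y = [x[j] * W(n, j * j / 2) for j in range(n)] + [0] * (N - n)
    # z: a chirp with z[j] = W^(-(j + 1 - |x|)^2 / 2)
    z = [W(n, -((j + 1 - n) ** 2) / 2) for j in range(N)]
    conv = cyclic_convolve(y, z)
    return [W(n, k * k / 2) * conv[k + n - 1] for k in range(n)]


x = [1, 2, 3, 4, 5]  # a prime length, where Cooley-Tukey alone gives no speedup
error = max(abs(a - b) for a, b in zip(bluestein_dft_direct(x), naive_dft(x)))
print(error)  # expected to be on the order of floating-point rounding error
```

<p>Passing a larger <code class="language-plaintext highlighter-rouge">N</code> (for example 16) should give the same result, since the extra entries of \(y\) are all zero.</p>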

<h2 id="discrete-fourier-transform-as-cyclic-convolution">Discrete Fourier Transform as Cyclic Convolution</h2>

<p>We have thus far shown that the discrete Fourier transform of any sequence \(x\) can be calculated as the convolution of
two sequences \(y\) and \(z\). \(y\) is a “zero-padded” copy of our input sequence, multiplied elementwise with
a <a href="https://en.wikipedia.org/wiki/Chirp">chirp signal</a>, while \(z\) is itself a chirp signal. The advantage of this
formulation lies in our freedom to choose the length of these sequences; their length \(N\) can be chosen to be any
integer greater than or equal to \(2 \lvert x \rvert - 1\).</p>

<p>If we choose \(N\) to be the smallest power of 2 greater than or equal to \(2 \lvert x \rvert - 1\), then the
worst case scenario<sup id="fnref:upper-bound" role="doc-noteref"><a href="#fn:upper-bound" class="footnote" rel="footnote">1</a></sup> occurs when \(2 \lvert x \rvert - 1 = 2^p + 1\) (\(p \in \mathbb{N}\)),
in which case \(N = 2^{p+1} = 4 \lvert x \rvert - 4\). If we apply
the <a href="/2026/04/27/fft-convolution-theorem.html#the-convolution-theorem">convolution theorem</a> here,
we see that the discrete Fourier transform \(\mathcal{F} \{ x \}\) can be calculated by taking the DFTs of \(y\)
and \(z\), then taking the inverse DFT of their elementwise product, and finally multiplying that result elementwise by
another chirp signal:</p>

\[\mathcal{F} \{ x \}[k] = W_{|x|}^{\frac{k^2}{2}} \cdot \mathcal{F}^{-1} \{ \mathcal{F} \{ y \} \odot \mathcal{F} \{ z \} \} [k + |x| - 1]\]

<p>All of these operations take linear time with respect to \(N\), except for the inverse and forward discrete
Fourier transforms. Since these DFTs are calculated on sequences whose length \(N\) is a power of 2 and at most \(4 \lvert x \rvert - 4\) (for \(\lvert x \rvert &gt; 1\)), we know they can be
calculated in \(O(\lvert x \rvert \log \lvert x \rvert)\) time using the <a href="/2025/09/11/fft-cooley-tukey.html">Cooley-Tukey algorithm</a>.</p>
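<p>Putting the pieces together, here is a minimal sketch of the whole procedure in plain Python. It assumes the
convention \(W_n = e^{-2 \pi i / n}\), uses a textbook recursive radix-2 Cooley-Tukey routine for the power-of-2
transforms, and reduces the chirp exponents modulo \(2 \lvert x \rvert\) to limit rounding error. All function names are
hypothetical, and this is not how any particular library implements it.</p>

```python
import cmath


def fft_pow2(a, inverse=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(a) must be a power of 2.

    With inverse=True the result is NOT divided by len(a); the caller does that.
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft_pow2(a[0::2], inverse)
    odd = fft_pow2(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def bluestein_fft(x):
    """DFT of a sequence of any length via Bluestein's algorithm."""
    n = len(x)
    if n == 0:
        return []
    # Smallest power of 2 that is at least 2n - 1
    N = 1
    while 2 * n - 1 > N:
        N *= 2
    # chirp[j] = W_n^(j^2 / 2) = exp(-i*pi*j^2 / n); the exponent j^2 is reduced mod 2n
    chirp = [cmath.exp(-1j * cmath.pi * ((j * j) % (2 * n)) / n) for j in range(n)]
    y = [x[j] * chirp[j] for j in range(n)] + [0] * (N - n)
    # z[j] = W_n^(-(j + 1 - n)^2 / 2)
    z = [cmath.exp(1j * cmath.pi * (((j + 1 - n) ** 2) % (2 * n)) / n) for j in range(N)]
    # Convolution theorem: y (*) z = inverse DFT of the elementwise product of the DFTs
    product = [a * b for a, b in zip(fft_pow2(y), fft_pow2(z))]
    conv = [v / N for v in fft_pow2(product, inverse=True)]
    return [chirp[k] * conv[k + n - 1] for k in range(n)]


spectrum = bluestein_fft([1, 2, 3, 4, 5, 6, 7])
print(spectrum[0])  # F{x}[0] is the sum of the inputs, so this should be close to 28
```

<p>Since \(z\) depends only on \(\lvert x \rvert\), a program that transforms many inputs of the same length could compute \(\mathcal{F} \{ z \}\) once and reuse it.</p>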

<h2 id="how-fast-is-a-fast-fourier-transform">How Fast is a Fast Fourier Transform?</h2>

<p>The guaranteed <a href="https://en.wikipedia.org/wiki/Time_complexity#Table_of_common_time_complexities">linearithmic</a>
time complexity of Bluestein’s algorithm unfortunately does not mean that the performance is just as good as we would
get for Cooley-Tukey applied to a sequence of highly-composite length. Bluestein’s algorithm inflates the problem size
to at least \(N = 2 \lvert x \rvert - 1\), while also adding an inverse Fourier transform step.</p>
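<p>To get a feel for how much the problem is inflated, the short sketch below computes the padded length \(N\) under the
power-of-2 choice described earlier. The helper name is made up for illustration, and a real library may well pick its
padded length differently.</p>

```python
def bluestein_padded_length(n):
    """Smallest power of 2 that is at least 2n - 1 (the choice of N made in this post)."""
    N = 1
    while 2 * n - 1 > N:
        N *= 2
    return N


for n in (5, 17, 1048583):
    N = bluestein_padded_length(n)
    # Each of the three transforms (two forward, one inverse) has length N
    print(n, N, N / n)
```

<p>For \(\lvert x \rvert = 5\) and \(\lvert x \rvert = 17\), \(2 \lvert x \rvert - 1\) is one more than a power of 2, so these hit the worst case \(N = 4 \lvert x \rvert - 4\) (16 and 64 respectively).</p>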

<p>The limitations of Bluestein’s algorithm are apparent when one uses any fast Fourier transform implementation that
relies on it, such
as <a href="https://numpy.org/devdocs/release/1.17.0-notes.html#replacement-of-the-fftpack-based-fft-module-by-the-pocketfft-library">NumPy’s</a>.
In a Google Colab <a href="https://colab.research.google.com/drive/194F5UyujsP71S-DWpMTY8PSLYb8YMp4G?usp=sharing">notebook</a>, I
ran <a href="https://numpy.org/doc/stable/reference/generated/numpy.fft.fft.html"><code class="language-plaintext highlighter-rouge">numpy.fft.fft()</code></a> on random real-numbered
sequences ranging in length from \(2^{20}\) to \(2^{20}+24\); below is a graph showing the length of time taken to compute
the discrete Fourier transform of each of the sequences:</p>

<div class="img-wrapper">
    <script type="text/javascript">window.PlotlyConfig = {MathJaxConfig: 'local'};</script>
    <script charset="utf-8" src="https://cdn.plot.ly/plotly-2.35.2.min.js"></script>
    <div id="401ad8fd-05ba-4605-afc0-5dc4183f26df" class="plotly-graph-div" style="height:100%; width:100%;"></div>
    <script type="text/javascript">                                    window.PLOTLYENV=window.PLOTLYENV || {};                                    if (document.getElementById("401ad8fd-05ba-4605-afc0-5dc4183f26df")) {                    Plotly.newPlot(                        "401ad8fd-05ba-4605-afc0-5dc4183f26df",                        [{"marker":{"color":"#AFA9EC","line":{"color":"#534AB7","width":1.5}},"name":"time (sec)","x":[1048576,1048577,1048578,1048579,1048580,1048581,1048582,1048583,1048584,1048585,1048586,1048587,1048588,1048589,1048590,1048591,1048592,1048593,1048594,1048595,1048596,1048597,1048598,1048599,1048600],"y":[0.07537245750427246,0.4409618377685547,0.4523799419403076,0.4962954521179199,0.15537786483764648,0.3865385055541992,0.21071290969848633,0.43471813201904297,0.4464147090911865,0.43976402282714844,0.3624691963195801,0.43639326095581055,0.5112380981445312,0.5971498489379883,0.2506849765777588,0.40683841705322266,0.6149370670318604,0.25271081924438477,0.6094908714294434,0.4449012279510498,0.43888068199157715,0.44298338890075684,0.21194171905517578,0.3494877815246582,0.11764001846313477],"yaxis":"y","type":"bar"},{"customdata":["[2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]","[17, 61681]","[2, 3, 174763]","[7, 163, 919]","[2, 2, 5, 13, 37, 109]","[3, 3, 263, 443]","[2, 29, 101, 179]","[1048583]","[2, 2, 2, 3, 43691]","[5, 209717]","[2, 7, 11, 11, 619]","[3, 349529]","[2, 2, 262147]","[1048589]","[2, 3, 3, 5, 61, 191]","[19, 229, 241]","[2, 2, 2, 2, 65537]","[3, 7, 13, 23, 167]","[2, 17, 30841]","[5, 209719]","[2, 2, 3, 87383]","[11, 95327]","[2, 43, 89, 137]","[3, 3, 3, 71, 547]","[2, 2, 2, 5, 5, 7, 7, 107]"],"hovertemplate":"%{y} prime factor(s): %{customdata}\u003cextra\u003e\u003c\u002fextra\u003e","line":{"color":"#1D9E75","shape":"spline","width":2.5},"marker":{"color":"#1D9E75","line":{"color":"white","width":2},"size":8},"mode":"lines+markers","name":"number of prime 
factors","x":[1048576,1048577,1048578,1048579,1048580,1048581,1048582,1048583,1048584,1048585,1048586,1048587,1048588,1048589,1048590,1048591,1048592,1048593,1048594,1048595,1048596,1048597,1048598,1048599,1048600],"y":[20,2,3,3,6,4,4,1,5,2,5,2,3,1,6,3,5,5,3,2,4,2,4,5,8],"yaxis":"y2","type":"scatter"},{"line":{"color":"#9e1d33","shape":"spline","width":2.5},"marker":{"color":"#9e1d33","line":{"color":"white","width":2},"size":8},"mode":"lines+markers","name":"log sum","visible":"legendonly","x":[1048576,1048577,1048578,1048579,1048580,1048581,1048582,1048583,1048584,1048585,1048586,1048587,1048588,1048589,1048590,1048591,1048592,1048593,1048594,1048595,1048596,1048597,1048598,1048599,1048600],"y":[3.6888794541139363,11.030006794457245,12.071214659083324,6.9930151229329605,5.123963979403259,6.568077911411976,5.739792912179234,13.862950286896838,10.685103381083682,12.253538123165887,6.476972362889683,12.764350395835228,12.47667595260336,13.862956008888167,5.579729825986222,6.192362489474872,11.09049220863191,5.3612921657094255,10.337216125917152,12.25354765955436,11.37813613862159,11.465183750984767,5.602118820879701,6.440946540632921,4.919980925828125],"yaxis":"y2","type":"scatter"}],                        
{"template":{"data":{"histogram2dcontour":[{"type":"histogram2dcontour","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"choropleth":[{"type":"choropleth","colorbar":{"outlinewidth":0,"ticks":""}}],"histogram2d":[{"type":"histogram2d","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"heatmap":[{"type":"heatmap","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"heatmapgl":[{"type":"heatmapgl","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"contourcarpet":[{"type":"contourcarpet","colorbar":{"outlinewidth":0,"ticks":""}}],"contour":[{"type":"contour","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"
],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"surface":[{"type":"surface","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"mesh3d":[{"type":"mesh3d","colorbar":{"outlinewidth":0,"ticks":""}}],"scatter":[{"fillpattern":{"fillmode":"overlay","size":10,"solidity":0.2},"type":"scatter"}],"parcoords":[{"type":"parcoords","line":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scatterpolargl":[{"type":"scatterpolargl","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"bar":[{"error_x":{"color":"#2a3f5f"},"error_y":{"color":"#2a3f5f"},"marker":{"line":{"color":"#E5ECF6","width":0.5},"pattern":{"fillmode":"overlay","size":10,"solidity":0.2}},"type":"bar"}],"scattergeo":[{"type":"scattergeo","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scatterpolar":[{"type":"scatterpolar","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"histogram":[{"marker":{"pattern":{"fillmode":"overlay","size":10,"solidity":0.2}},"type":"histogram"}],"scattergl":[{"type":"scattergl","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scatter3d":[{"type":"scatter3d","line":{"colorbar":{"outlinewidth":0,"ticks":""}},"marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scattermapbox":[{"type":"scattermapbox","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scatterternary":[{"type":"scatterternary","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scattercarpet":[{"type":"scattercarpet","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"carpet":[{"aaxis":{"endlinecolor":"#2a3f5f","gridcolor":"white","linecolor":"white","minorgridcolor":"white","startlinecolor":"#2a3f5f"},"baxis":{"endlinecolor":"#2a3f5f","gridcolor":"white","linecolor":"white","minorgridcolor":"white","startlinec
olor":"#2a3f5f"},"type":"carpet"}],"table":[{"cells":{"fill":{"color":"#EBF0F8"},"line":{"color":"white"}},"header":{"fill":{"color":"#C8D4E3"},"line":{"color":"white"}},"type":"table"}],"barpolar":[{"marker":{"line":{"color":"#E5ECF6","width":0.5},"pattern":{"fillmode":"overlay","size":10,"solidity":0.2}},"type":"barpolar"}],"pie":[{"automargin":true,"type":"pie"}]},"layout":{"autotypenumbers":"strict","colorway":["#636efa","#EF553B","#00cc96","#ab63fa","#FFA15A","#19d3f3","#FF6692","#B6E880","#FF97FF","#FECB52"],"font":{"color":"#2a3f5f"},"hovermode":"closest","hoverlabel":{"align":"left"},"paper_bgcolor":"white","plot_bgcolor":"#E5ECF6","polar":{"bgcolor":"#E5ECF6","angularaxis":{"gridcolor":"white","linecolor":"white","ticks":""},"radialaxis":{"gridcolor":"white","linecolor":"white","ticks":""}},"ternary":{"bgcolor":"#E5ECF6","aaxis":{"gridcolor":"white","linecolor":"white","ticks":""},"baxis":{"gridcolor":"white","linecolor":"white","ticks":""},"caxis":{"gridcolor":"white","linecolor":"white","ticks":""}},"coloraxis":{"colorbar":{"outlinewidth":0,"ticks":""}},"colorscale":{"sequential":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]],"sequentialminus":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]],"diverging":[[0,"#8e0152"],[0.1,"#c51b7d"],[0.2,"#de77ae"],[0.3,"#f1b6da"],[0.4,"#fde0ef"],[0.5,"#f7f7f7"],[0.6,"#e6f5d0"],[0.7,"#b8e186"],[0.8,"#7fbc41"],[0.9,"#4d9221"],[1,"#276419"]]},"xaxis":{"gridcolor":"white","linecolor":"white","ticks":"","title":{"standoff":15},"zerolinecolor":"white","automargin":tr
ue,"zerolinewidth":2},"yaxis":{"gridcolor":"white","linecolor":"white","ticks":"","title":{"standoff":15},"zerolinecolor":"white","automargin":true,"zerolinewidth":2},"scene":{"xaxis":{"backgroundcolor":"#E5ECF6","gridcolor":"white","linecolor":"white","showbackground":true,"ticks":"","zerolinecolor":"white","gridwidth":2},"yaxis":{"backgroundcolor":"#E5ECF6","gridcolor":"white","linecolor":"white","showbackground":true,"ticks":"","zerolinecolor":"white","gridwidth":2},"zaxis":{"backgroundcolor":"#E5ECF6","gridcolor":"white","linecolor":"white","showbackground":true,"ticks":"","zerolinecolor":"white","gridwidth":2}},"shapedefaults":{"line":{"color":"#2a3f5f"}},"annotationdefaults":{"arrowcolor":"#2a3f5f","arrowhead":0,"arrowwidth":1},"geo":{"bgcolor":"white","landcolor":"#E5ECF6","subunitcolor":"white","showland":true,"showlakes":true,"lakecolor":"white"},"title":{"x":0.05},"mapbox":{"style":"light"}}},"legend":{"orientation":"h","y":1.08,"x":0},"xaxis":{"tickformat":",d","title":{"text":"$\\textrm{Input Signal Size (} |x| \\textrm{)}$"}},"yaxis":{"title":{"text":"FFT Runtime (seconds)","font":{"color":"#534AB7"}},"tickfont":{"color":"#534AB7"}},"yaxis2":{"title":{"text":"Prime Factors (count\u002fsum)"},"overlaying":"y","side":"right","showgrid":false},"hovermode":"x unified","bargap":0.28,"plot_bgcolor":"white","paper_bgcolor":"white"},                        {"responsive": true}                    )                }; </script>
</div>

<p>You may notice that a <em>longer</em> sequence can take a <em>shorter</em> time to compute;<sup id="fnref:log-sum" role="doc-noteref"><a href="#fn:log-sum" class="footnote" rel="footnote">2</a></sup> for example, the DFT for the sequence of
length 1,048,580 took 0.155 seconds to compute, while for length 1,048,579 the execution time was 0.496 seconds, more
than 3 times as long. This is because the greater length, 1,048,580, is highly composite, having 6 prime factors: 2, 2,
5, 13, 37, and 109. Meanwhile, the lesser length, 1,048,579, has only 3 prime factors: 7, 163, and 919.<sup id="fnref:numpy-specific-factors" role="doc-noteref"><a href="#fn:numpy-specific-factors" class="footnote" rel="footnote">3</a></sup></p>
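<p>The factorizations quoted above, along with the “log sum” series in the graph, can be reproduced with a few lines of
Python. This is a simple trial-division sketch with hypothetical function names; “log sum” is taken here to mean the
natural logarithm of the sum of the prime factors counted with multiplicity, which matches the values plotted.</p>

```python
import math


def prime_factors(n):
    """Prime factors of n (with multiplicity) in ascending order, by trial division."""
    factors = []
    p = 2
    while n >= p * p:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors


def log_sum(n):
    """Natural log of the sum of the prime factors of n."""
    return math.log(sum(prime_factors(n)))


for length in (1048579, 1048580):
    print(length, prime_factors(length), log_sum(length))
```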

<h3 id="the-ugly-truth">The “Ugly” Truth</h3>

<p>Rather than being an alternative to Cooley-Tukey, Bluestein’s algorithm is better described as an (apparently
high-overhead) “compatibility layer” for Cooley-Tukey. Bluestein’s algorithm allows Cooley-Tukey to efficiently process
awkwardly-sized inputs, but never remotely as quickly as it processes inputs of highly-composite length (even those of
somewhat greater size).</p>

<p>The fast Fourier transform makes perennial appearances in discussions on internet fora and in attention-grabbing
listicles as among the most “beautiful” algorithms. If “fast Fourier transform” refers only to Cooley-Tukey applied to
inputs of highly-composite lengths, I’m tempted to concur; that algorithm is just complex enough to be both interesting
and elegant. But if “fast Fourier transform” refers to a program that uses a combination of Cooley-Tukey and Bluestein’s
algorithm to efficiently process inputs of <em>any</em> length, then I can’t be sure I agree. Bluestein’s algorithm is, in my
opinion, both fascinating and at least a bit unsatisfying. The fact that we can calculate a Fourier transform as a
convolution is mind-boggling. The fact that this convolution is calculated using three Fourier transforms (two forward
and one inverse), each at least twice the length of our input, is a little disappointing.</p>

<h2 id="sources--related-reading">Sources &amp; Related Reading</h2>

<ul>
  <li>Smith, J.O., <a href="https://ccrma.stanford.edu/~jos/st/Bluestein_s_FFT_Algorithm.html">“Bluestein’s FFT Algorithm”</a> in
<em>Mathematics of the Discrete Fourier Transform (DFT) with Audio Applications</em>, Second
Edition, <a href="http://ccrma.stanford.edu/~jos/mdft/">http://ccrma.stanford.edu/~jos/mdft/</a>, online book, 2007 edition, accessed 2026-05-01.</li>
  <li>L. Bluestein, “<a href="https://ieeexplore.ieee.org/abstract/document/1162132">A linear filtering approach to the computation of discrete Fourier transform</a>,”
in <em>IEEE Transactions on Audio and Electroacoustics</em>, vol. 18, no. 4, pp. 451-455, December 1970, doi:
10.1109/TAU.1970.1162132.</li>
</ul>

<p><strong>Footnotes:</strong></p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:upper-bound" role="doc-endnote">
      <p>Assuming that \(\lvert x \rvert &gt; 1\) <a href="#fnref:upper-bound" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:log-sum" role="doc-endnote">
      <p>Thanks to some quite unrigorous experimentation, I’ve noticed that (within this particular range) the logarithm of the
sum of the prime factors of the input length is a fairly good predictor of the running time of the FFT algorithm. You
may show this value on the graph by clicking on the legend where it says “log sum”. <a href="#fnref:log-sum" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:numpy-specific-factors" role="doc-endnote">
      <p>There are other factors that will be specific to NumPy’s (or rather its
dependency, <a href="https://gitlab.mpcdf.mpg.de/mtr/pocketfft">PocketFFT</a>’s) implementation. I haven’t taken the time to
research PocketFFT’s implementation, but I’ll speculate that its implementation of Cooley-Tukey only “breaks down”
the original DFT problem using small primes (such as 2, 3, 5, and 7), and then once no small primes remain, applies
Bluestein’s algorithm or a naive DFT to the divided input sequences. Therefore, the final running time is likely a
somewhat difficult-to-predict result of the size of the prime factors and the proximity of the “divided length” to
the nearest highly-composite number that is at least twice as large as it. <a href="#fnref:numpy-specific-factors" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Connor Boyle</name></author><category term="mathematics" /><category term="software" /><summary type="html"><![CDATA[This post is part 3 in my ongoing series on fast Fourier transform algorithms. You may wish to read part 1 and part 2 before reading this article.]]></summary></entry><entry><title type="html">Fast Fourier Transforms Part 2: Cyclic Convolution</title><link href="https://connorboyle.io/2026/04/27/fft-convolution-theorem.html" rel="alternate" type="text/html" title="Fast Fourier Transforms Part 2: Cyclic Convolution" /><published>2026-04-27T00:00:00+00:00</published><updated>2026-04-27T00:00:00+00:00</updated><id>https://connorboyle.io/2026/04/27/fft-convolution-theorem</id><content type="html" xml:base="https://connorboyle.io/2026/04/27/fft-convolution-theorem.html"><![CDATA[<p><em>This post is part 2 in my ongoing series on fast Fourier transform algorithms. You may wish to
read <a href="/2025/09/11/fft-cooley-tukey.html">part 1</a> before reading this article.</em></p>

<p>One of the most useful applications of the fast Fourier transform is in efficiently
calculating <a href="https://en.wikipedia.org/wiki/Circular_convolution">cyclic convolutions</a>. Paradoxically, cyclic
convolutions can also be used to help efficiently calculate discrete Fourier transforms. This post will show the former;
I will explain the latter point in future blog posts.</p>

<h2 id="discrete-cyclic-convolution">Discrete Cyclic Convolution</h2>

<p>If \(y\) and \(z\) are two sequences of equal length \(N = \lvert y \rvert = \lvert z \rvert\), then the <strong>cyclic
convolution</strong> of those two sequences, \(y \circledast z\), is defined as follows:</p>

\[\left( y \circledast z \right) [j] \triangleq \sum_{m=0}^{N-1} y[m] \cdot z[(j-m) \% N]\]

\[| y \circledast z | \triangleq N = |y| = |z|\]

<p>Where \(\%\) is the <a href="https://en.wikipedia.org/wiki/Modulo">modulo operator</a>.<sup id="fnref:modulo" role="doc-noteref"><a href="#fn:modulo" class="footnote" rel="footnote">1</a></sup></p>
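<p>The definition translates almost word for word into code. Here is a minimal Python sketch (the function name is my
own choice for this example); it takes quadratic time, one multiplication for every pair of elements.</p>

```python
def cyclic_convolve(y, z):
    """Cyclic convolution of two equal-length sequences, straight from the definition."""
    if len(y) != len(z):
        raise ValueError("y and z must have the same length")
    n = len(y)
    # Python's % returns a nonnegative result for a positive modulus,
    # so a negative j - m wraps around to the end of z as the definition requires.
    return [sum(y[m] * z[(j - m) % n] for m in range(n)) for j in range(n)]


print(cyclic_convolve([1, 2, 3], [4, 5, 6]))  # prints [31, 31, 28]
```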

<h3 id="cyclic-convolution-visualization">Cyclic Convolution Visualization</h3>

<p>The visualization below shows two input sequences, \(y\) (vertically, along the left) and \(z\) (horizontally, along the
top), with the product of each pair of their elements in the large matrix in the middle. The output convolution, \(y \circledast z\),
is shown on the bottom. You can discover the value and origin of an element by hovering the cursor over it. Those with a
mouse can click-and-drag the input sequences to change their values.</p>

<p>\(N = \lvert y \rvert = \lvert z \rvert =\) <input type="number" min="1" max="24" id="convolution-length-selector" value="12" /></p>

<div id="ct-math-output" style="font-size: x-large; height: 100px; display: flex; justify-content: center; align-items: center; flex-direction: column"></div>

<div class="img-wrapper">
<canvas id="cyclic-convolution-canvas" width="700px" height="700px"></canvas>
</div>

<script src="/static/fft/cyclic_convolution_visualization.js"></script>

<h2 id="the-convolution-theorem">The Convolution Theorem</h2>

<p>A cyclic convolution between two sequences \(y\) and \(z\) (length \(\lvert y \rvert = \lvert z \rvert =
N\))—calculated naively (see <a href="#cyclic-convolution-visualization">above</a>)—would require \(O(N^2)\) basic operations.
However, using the Fourier transform, we can reduce this requirement to \(O(N \log(N))\) operations (assuming \(N\) is
highly composite and we calculate the Fourier transforms using the
<a href="/2025/09/11/fft-cooley-tukey.html">Cooley-Tukey algorithm</a>).</p>

<p>Let’s prove this:</p>

\[\mathcal{F} \{ y \circledast z \}[k] = \sum_{j=0}^{|y \circledast z| - 1} (y \circledast z)[j] \cdot W_{|y \circledast z|}^{jk}\]

\[= \sum_{j=0}^{N - 1} \sum_{m=0}^{N-1} y[m] \cdot z[(j-m) \% N] \cdot W_{N}^{jk}\]

\[= \sum_{m=0}^{N-1} y[m] \sum_{j=0}^{N - 1} z[(j-m) \% N] \cdot W_{N}^{jk}\]

<p>The indices of the inner summation can be split<sup id="fnref:invalid-range" role="doc-noteref"><a href="#fn:invalid-range" class="footnote" rel="footnote">2</a></sup> into two ranges, from \(0\) to \(m-1\) and from \(m\)
to \(N-1\):</p>

\[= \sum_{m=0}^{N-1} y[m] \left( \left( \sum_{j=0}^{m - 1} z[(j-m) \% N] \cdot W_{N}^{jk} \right) + \left( \sum_{j=m}^{N - 1} z[(j-m) \% N] \cdot W_{N}^{jk} \right) \right)\]

<p>For the entire outer summation, note that:</p>

\[0 \leq m \leq N-1\]

\[1-N \leq -m \leq 0\]

\[-N \leq -m \leq 0\]

<style>
.left-right {
  display: grid;
  grid-template-columns: 1fr 1fr;
  overflow: scroll;
}
.left-right-item {
  padding: 10px;
  border: solid 1px;
}
</style>

<div class="left-right">
  <div class="left-right-item">

    <p>For the <strong>first inner summation</strong>:</p>

\[\sum_{j=0}^{m - 1} z[(j-m) \% N] \cdot W_{N}^{jk}\]

\[0 \leq j \leq m-1\]

    <p>Since \(-N \leq -m\):<sup id="fnref:modulo-property" role="doc-noteref"><a href="#fn:modulo-property" class="footnote" rel="footnote">3</a></sup></p>

\[-N \leq j-m \leq -1\]

\[(j - m) \% N = j - m + N\]

    <p>Therefore:</p>

\[\sum_{j=0}^{m - 1} z[(j-m) \% N] \cdot W_{N}^{jk}\]

\[= \sum_{j=0}^{m - 1} z[j-m + N] \cdot W_{N}^{jk}\]

\[= \sum_{j=N}^{N + m - 1} z[j-m] \cdot W_{N}^{(j-N)k}\]

\[= \sum_{j=N}^{N + m - 1} z[j-m] \cdot W_{N}^{jk}\]

  </div>
  <div class="left-right-item">

    <p>For the <strong>second inner summation</strong>:</p>

\[\sum_{j=m}^{N - 1} z[(j-m) \% N] \cdot W_{N}^{jk}\]

\[m \leq j \leq N-1\]

    <p>Since \(-m \leq 0\):<sup id="fnref:modulo-property:1" role="doc-noteref"><a href="#fn:modulo-property" class="footnote" rel="footnote">3</a></sup></p>

\[0 \leq j-m \leq N-1\]

\[(j - m) \% N = j - m\]

    <p>Therefore:</p>

\[\sum_{j=m}^{N - 1} z[(j-m) \% N] \cdot W_{N}^{jk}\]

\[= \sum_{j=m}^{N - 1} z[j-m] \cdot W_{N}^{jk}\]

  </div>
</div>

\[\mathcal{F} \{ y \circledast z \}[k] = \sum_{m=0}^{N-1} y[m] \left( \left( \sum_{j=N}^{N + m - 1} z[j-m] \cdot W_{N}^{jk} \right) + \left( \sum_{j=m}^{N - 1} z[j-m] \cdot W_{N}^{jk} \right) \right)\]

\[= \sum_{m=0}^{N-1} y[m] \sum_{j=m}^{N + m - 1} z[j-m] \cdot W_{N}^{jk}\]

\[= \sum_{m=0}^{N-1} y[m] \sum_{j=0}^{N - 1} z[j] \cdot W_{N}^{(j+m)k}\]

\[= \sum_{m=0}^{N-1} y[m] \cdot W_{N}^{mk} \sum_{j=0}^{N - 1} z[j] \cdot W_{N}^{jk}\]

\[= \mathcal{F} \{ y \} [k] \cdot \mathcal{F} \{ z \} [k]\]

<p>or, if we define \(\mathcal{F} \{ y \} \odot \mathcal{F} \{ z \}\) as the elementwise product
of \(\mathcal{F} \{ y \}\) and \(\mathcal{F} \{ z \}\), i.e.</p>

\[(\mathcal{F} \{ y \} \odot \mathcal{F} \{ z \})[k] \triangleq \mathcal{F} \{ y \}[k] \cdot \mathcal{F} \{ z \}[k]\]

<p>then, applying the inverse discrete Fourier transform:<sup id="fnref:inverse-dft" role="doc-noteref"><a href="#fn:inverse-dft" class="footnote" rel="footnote">4</a></sup></p>

\[(y \circledast z)[k] = \mathcal{F}^{-1} \{ \mathcal{F} \{ y \} \odot \mathcal{F} \{ z \} \}[k]\]
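<p>The theorem is easy to check numerically. The sketch below compares a convolution computed from the definition with
one computed through the DFT. It uses naive quadratic-time transforms, so it demonstrates only the identity and not the
speedup, and it assumes the convention \(W_N = e^{-2 \pi i / N}\); the function names are hypothetical.</p>

```python
import cmath


def dft(x, inverse=False):
    """Naive forward DFT, or inverse DFT (including the 1/N factor) when inverse=True."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def cyclic_convolve(y, z):
    """Cyclic convolution computed directly from its definition."""
    n = len(y)
    return [sum(y[m] * z[(j - m) % n] for m in range(n)) for j in range(n)]


def convolve_via_dft(y, z):
    """Cyclic convolution as the inverse DFT of the elementwise product of the DFTs."""
    product = [a * b for a, b in zip(dft(y), dft(z))]
    return dft(product, inverse=True)


y = [1, 2, 3, 4, 5, 6]
z = [0.5, -1, 2, 0, 3, 1]
direct = cyclic_convolve(y, z)
through_dft = convolve_via_dft(y, z)
print(max(abs(a - b) for a, b in zip(direct, through_dft)))  # close to zero
```

<p>Swapping the naive <code class="language-plaintext highlighter-rouge">dft</code> for a fast Fourier transform is what brings the cost down from \(O(N^2)\) to \(O(N \log(N))\).</p>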

<h2 id="a-note-on-other-proofs">A Note on Other Proofs</h2>

<p>Most other proofs of the convolution theorem found on the public-facing web seem to forgo an explicit modulo operator in
favor of (sometimes implicitly) treating the input sequences as “\(N\)-periodic” infinitely-repeating signals. For
example, Professor Julius Orion Smith III’s free educational book, <a href="http://ccrma.stanford.edu/~jos/mdft/"><em>Mathematics of the Discrete Fourier Transform (DFT)
with Audio Applications</em></a>, does exactly this in
its <a href="https://ccrma.stanford.edu/%7Ejos/mdft/Convolution_Theorem.html">passage</a> on the convolution theorem.</p>

<p>I suspect this convention is intuitive to anyone with sufficient background in signal processing. However, for newcomers
such as myself, the unexplained elision of the modulo operator can be rather confusing. It’s also not immediately
obvious how to represent and interact with an \(N\)-periodic infinitely-repeating signal in a computer program; the
modulo operator, on the other hand, should be quite intuitive to anyone with a computer science background, and much
more easily translatable into computer code.</p>

<h2 id="source-and-additional-reading">Source and Additional Reading</h2>

<ul>
  <li>Smith, J.O. <em>Mathematics of the Discrete Fourier Transform (DFT) with Audio Applications</em>, Second
Edition, <a href="http://ccrma.stanford.edu/~jos/mdft/">http://ccrma.stanford.edu/~jos/mdft/</a>, online book, 2007 edition,
accessed 2026-04-27.</li>
</ul>

<p><strong>Footnotes:</strong></p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:modulo" role="doc-endnote">
      <p>For our purposes, let the modulo of an integer \(a\) by an integer \(b\) be defined using the remainder from
floored division, i.e.</p>

\[a \, \% \, b \triangleq a - b \lfloor \frac{a}{b} \rfloor\]

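      <p>Note that using this definition, the result of the modulo operator is always nonnegative when \(b\) is positive,
unlike the remainder operator found in many programming languages such as JavaScript and C, which can return a negative
result when \(a\) is negative. (Python’s <code class="language-plaintext highlighter-rouge">%</code> operator already follows the floored definition.) <a href="#fnref:modulo" class="reversefootnote" role="doc-backlink">&#8617;</a></p>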
    </li>
    <li id="fn:invalid-range" role="doc-endnote">
      <p>It is possible for the upper bound of the first sum to be less than its lower bound
(i.e. \(m = 0 \implies m - 1 &lt; 0\)). Let summations with invalid ranges be defined as 0, i.e.:</p>

\[m = 0 \implies \sum_{j=0}^{m - 1} z[(j-m) \% N] \cdot W_{N}^{jk} = 0\]
      <p><a href="#fnref:invalid-range" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:modulo-property" role="doc-endnote">
      <p>This proof uses the following property of modular arithmetic; for any \(c \in \mathbb{Z}\):</p>

\[cb \leq a &lt; (c+1)b \implies a \% b = a - cb\]
      <p><a href="#fnref:modulo-property" class="reversefootnote" role="doc-backlink">&#8617;</a> <a href="#fnref:modulo-property:1" class="reversefootnote" role="doc-backlink">&#8617;<sup>2</sup></a></p>
    </li>
    <li id="fn:inverse-dft" role="doc-endnote">

      <p>The inverse discrete Fourier transform \(\mathcal{F}^{-1} \{ x \}\) is very similar to the “forward”
DFT:</p>

\[\mathcal{F}^{-1} \{ x \}[k] = \frac{1}{|x|} \sum_{j=0}^{|x| - 1} x[j] \cdot W_{|x|}^{-jk}\]

      <p>The Cooley-Tukey algorithm, with very slight modifications, can also be used to calculate the inverse DFT (in fact,
the original Cooley-Tukey paper described an algorithm to calculate the <em>inverse</em> DFT). Demonstrating this is left
as an exercise to the reader. <a href="#fnref:inverse-dft" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Connor Boyle</name></author><category term="mathematics" /><category term="software" /><summary type="html"><![CDATA[This post is part 2 in my ongoing series on fast Fourier transform algorithms. You may wish to read part 1 before reading this article.]]></summary></entry><entry><title type="html">Confucius has Never Been more Right than Today (Unless You’re Reading this After July 26, 2102)</title><link href="https://connorboyle.io/2026/04/09/confucius-has-never-been-more-right.html" rel="alternate" type="text/html" title="Confucius has Never Been more Right than Today (Unless You’re Reading this After July 26, 2102)" /><published>2026-04-09T00:00:00+00:00</published><updated>2026-04-09T00:00:00+00:00</updated><id>https://connorboyle.io/2026/04/09/confucius-has-never-been-more-right</id><content type="html" xml:base="https://connorboyle.io/2026/04/09/confucius-has-never-been-more-right.html"><![CDATA[<p>Confucius once said:</p>

<blockquote>
  <p>為政以德，譬如北辰，居其所而眾星共之</p>

  <p>He who exercises government by means of his virtue may be compared to the north polar star,<sup id="fnref:beichen-north-star" role="doc-noteref"><a href="#fn:beichen-north-star" class="footnote" rel="footnote">1</a></sup> which keeps its place
and all the stars turn towards it.</p>

  <p>(<a href="https://ctext.org/analects/wei-zheng"><em>Analects</em> 2.1, trans. James Legge</a>)</p>
</blockquote>

<p>A 21st-century reader like you or me might think of those long-exposure photos of the northern sky, showing the North
Star, <a href="https://en.wikipedia.org/wiki/Polaris"><strong>Polaris</strong></a>, appearing almost perfectly still while all the other stars
seem to revolve around it over the course of a night.</p>

<div class="img-wrapper">
  <p><img src="/images/confucius/Steve_Ryan_-_Stars_around_Polaris_-_Day_62_(by-sa).jpg" alt="A night sky full of stars appears to rotate around a single star in the center of the image" /></p>

  <p><a href="https://creativecommons.org/licenses/by-sa/2.0/deed.en"><em>Stars around Polaris - Day 62</em></a> by Steve Ryan, (<a href="https://creativecommons.org/licenses/by-sa/2.0/deed.en">CC BY-SA 2.0 license</a>)</p>
</div>

<p>Is this the star that Confucius had in mind? Is Polaris (also known as α Ursae Minoris) a model for the ideal Confucian
ruler? Well, no, in fact, because Earth’s north pole was not pointed anywhere near Polaris in Confucius’s time. The
Earth’s celestial poles—each being a projection of its axis of rotation—“wobble”, a bit like a spinning top.</p>

<p>As every 2nd grader should know, the Earth’s axis of rotation is tilted (about 23.4°) relative to the plane of its orbit
around the Sun. While the <em>amount</em> of tilt does not change much, the <em>direction</em> of that tilt has slowly but
significantly wobbled, on a roughly 26,000-year cycle.</p>

<div class="img-wrapper">
  <p><img src="/images/confucius/Precession_animation_small_new.gif" alt="An animated gif showing the earth rotating and wobbling, with the northern pole of its axis of rotation inscribing a circle on the sky" /></p>

  <p>from Wikimedia user <a href="https://commons.wikimedia.org/wiki/File:Precession_animation_small_new.gif">Tfr000</a>, (<a href="https://creativecommons.org/licenses/by-sa/3.0/deed.en">CC BY-SA 3.0 license</a>)</p>

</div>

<p>While thinking about the above passage by Confucius, I got the idea to try simulating what the night sky would look like
at different points in the Earth’s cycle of axial precession. For illustrative purposes, I’ve removed the Sun and sped
up the rate of rotation to 6 degrees per second (in reality, the Earth’s rate of rotation is about 15.04 degrees per
hour).</p>

<p>First, here is what you would see looking due north in the night sky, assuming a view unobstructed by clouds (or, as
would be the case at southerly latitudes, the ground beneath your feet):<sup id="fnref:precession-only" role="doc-noteref"><a href="#fn:precession-only" class="footnote" rel="footnote">2</a></sup></p>

<div class="img-wrapper">
<canvas width="500" height="500" id="present-day"></canvas>

<p><i>The circumpolar region of the northern sky, as it appears today (click to toggle pole star highlighting)</i></p>
</div>

<p>By virtue of its proximity to the north celestial pole, Polaris (the bright star at the center of our view) appears to
move extremely little throughout the night.</p>

<p>Meanwhile, this is what the night sky would have looked like circa 479 BC, the last year Confucius was alive:</p>

<div class="img-wrapper">
<canvas width="500" height="500" id="confucius-time"></canvas>

<p><i>The circumpolar region of the northern sky around 479 BC</i></p>
</div>

<p>Not only did the Earth’s north pole point nowhere near Polaris, the north pole wasn’t particularly close to <em>any</em> bright
star. At this time, <a href="https://en.wikipedia.org/wiki/Kochab">Kochab</a>, at just north of +82° declination, was the most
northerly star of roughly the same brightness as Polaris (the celestial north pole is at +90° declination). Polaris was
itself all the way down at +76° declination, quite far from the north pole. For a given Earthbound observer, Polaris
could end the night more than 28° from where it had started; such a star can hardly be claimed to (as Confucius had
described) “keep its place”!</p>
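<p>To see where a figure like 28° comes from, here is a rough sketch that computes how far apart two positions of the same star are after the sky has turned by some angle. It assumes pure rotation about the celestial pole and uses the standard spherical law of cosines; the function name <code class="language-plaintext highlighter-rouge">angular_separation_deg</code> is made up for this example.</p>

```python
import math


def angular_separation_deg(dec_deg, rotation_deg):
    """Angle between a star's positions before and after the sky rotates.

    Both positions share the same declination; only the hour angle changes.
    """
    dec = math.radians(dec_deg)
    turn = math.radians(rotation_deg)
    # Spherical law of cosines for two points at equal declination.
    cos_d = math.sin(dec) ** 2 + math.cos(dec) ** 2 * math.cos(turn)
    cos_d = max(-1.0, min(1.0, cos_d))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(cos_d))


# A star at +76 declination, half a turn later (opposite side of the pole).
half_turn_shift = angular_separation_deg(76.0, 180.0)
```

<p>With the rounded declination of +76° the half-turn displacement comes out at twice the star’s 14° distance from the pole; a declination slightly below +76° pushes it past 28°.</p>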

<h3 id="finding-confuciuss-beichen">Finding Confucius’s <em>Beichen</em></h3>

<p>So what, then, should we make of Confucius’s statement? Well, the term I gave in translation above as “north polar
star” might refer to something else entirely, so—in order not to bias the reader—I’ll leave it in transliterated
Chinese: <em>beichen</em> (北辰). As for what on Earth (or rather, above it) <em>beichen</em> refers to, sources disagree:</p>

<h4 id="another-pole-star">Another Pole Star</h4>

<p>Joseph Needham proposes<sup id="fnref:needham-citation" role="doc-noteref"><a href="#fn:needham-citation" class="footnote" rel="footnote">3</a></sup> that the ancient Chinese used a series of northern pole stars over the
centuries. Needham does not outright say which of these stars he thinks Confucius is alluding to, but logically it must
be:</p>

<ul>
  <li><a href="https://simbad.cds.unistra.fr/simbad/sim-id?Ident=bet+UMi">Kochab (β Ursae Minoris, HR 5563)</a>, a star nearly as
bright as Polaris (though significantly further from the north pole at any time), which according to Needham was used
as the pole star around 1000 BC, or</li>
  <li>“4339 Camelopardi”, an obscure star that was apparently a quite exact pole star circa 800 AD.<sup id="fnref:needham-circle" role="doc-noteref"><a href="#fn:needham-circle" class="footnote" rel="footnote">4</a></sup> I
believe<sup id="fnref:needham-nomenclature" role="doc-noteref"><a href="#fn:needham-nomenclature" class="footnote" rel="footnote">5</a></sup> Needham is referring to the double
star <a href="https://en.wikipedia.org/wiki/Struve_1694">Σ 1694</a>, or perhaps the brighter of its two
components, <a href="https://simbad.cds.unistra.fr/simbad/sim-id?Ident=HD+112028">HR 4893</a>. According to Needham, this star
was used as a pole star in the Han dynasty.</li>
</ul>

<h4 id="an-asterism">An Asterism</h4>

<p>David Pankenier, in “A Brief History of <em>Beiji</em> (Northern Culmen)”,<sup id="fnref:pankenier" role="doc-noteref"><a href="#fn:pankenier" class="footnote" rel="footnote">6</a></sup> alternatively proposes that Confucius was referring
collectively to the stars of the <a href="https://en.wikipedia.org/wiki/Big_Dipper#Asian_traditions">Northern Dipper</a> (also
known as the Big Dipper or the Plough, an asterism within Ursa Major). Indeed, Classical Chinese does not have grammatical number, so
“<em>beichen</em>” could just as easily be “the north star<em>s</em>” as “the north star”.</p>

<h4 id="the-pole-itself">The Pole Itself</h4>

<p>Finally, E. Bruce Brooks and A. Taeko Brooks write:</p>

<blockquote>
  <p>There was in this period no literal pole star, the immediate circumpolar region being essentially empty until much
later times… Whether we imagine a polar void or (as the text seems to require) a polar star, the thrust of the
saying is the magical power of inactivity.<sup id="fnref:brooks-and-brooks" role="doc-noteref"><a href="#fn:brooks-and-brooks" class="footnote" rel="footnote">7</a></sup></p>
</blockquote>

<p>Brooks &amp; Brooks’s (tentative) identification of <em>beichen</em> with the empty celestial pole—rather than any luminous
heavenly body—may receive some support from the oldest Chinese dictionary,<sup id="fnref:erya" role="doc-noteref"><a href="#fn:erya" class="footnote" rel="footnote">8</a></sup> though I think that evidence is weak
at best.</p>

<h4 id="words-from-the-ancients">Words from the Ancients</h4>

<p>Finally, I’ll contribute a hypothesis of my own. Let’s recall that Confucius described himself as:</p>

<blockquote>
  <p>A transmitter and not a maker, believing in and loving the ancients</p>

  <p>述而不作，信而好古</p>

  <p>(<a href="https://ctext.org/analects/shu-er"><em>Analects</em> 7.1</a>)</p>
</blockquote>

<p>Could this ancient philosopher be transmitting a statement about (or even <em>from</em>) a yet-more distant antiquity? As it
happens, <a href="https://simbad.cds.unistra.fr/simbad/sim-id?Ident=Alpha+Draconis">Thuban (α Draconis, HR 5291)</a>, a star
significantly less bright than Polaris, though still easily visible with the naked eye, was at one point even closer to
the celestial north pole than Polaris will ever be:</p>

<div class="img-wrapper">
<canvas width="500" height="500" id="thuban"></canvas>

<p><i>The circumpolar region of the northern sky around 2800 BC (click to toggle pole star highlighting)</i></p>
</div>

<p>It would be quite remarkable indeed if Confucius were talking about this erstwhile pole star; the Earth’s celestial
north pole was at its closest to Thuban around 2800 BC, almost as distant from the time of Confucius as his time is
from mine. And there is another reason to discount this extraordinary possibility: Confucius makes no note of the north
pole having drifted away from this former pole star, a shift that I imagine would carry great cosmological significance
and merit much commentary from Confucius.</p>

<hr />

<p>It might be tempting to just pick one of these hypotheses and present it as the truth, but I think the most honest
answer is that we simply don’t know! Confucius’s “<em>beichen</em> 北辰” could be an asterism, a point in the sky, or one of
several stars that have approached the pole over the course of human history. Perhaps we will never know which one it
was.</p>

<h2 id="rendering-the-sky-of-spring-and-autumn">Rendering the Sky of Spring and Autumn</h2>

<p>To make the night-sky renderings you see above in this post, I created a program I’ve named “Beichen”. I made this
program using Rust, WebAssembly, and JavaScript. I’m sure I could have easily used any of a number of pre-existing
programs, but I decided to make something new anyway, for fun. For my star dataset I used
the <a href="http://tdc-www.harvard.edu/catalogs/bsc5.html">Yale Bright Star Catalogue</a> (hereafter, “YBS”).</p>

<p>This catalogue and others generally give stars using right ascension and declination, which are a type of spherical
(rather than Cartesian) coordinates. If you’re like me, you might imagine it would be inherently easier to rotate
spherical coordinates, which are just a pair of angles.<sup id="fnref:and-radius" role="doc-noteref"><a href="#fn:and-radius" class="footnote" rel="footnote">9</a></sup> Indeed, it is very easy to rotate around the polar
axis by simply adding to or subtracting from the <em>azimuthal</em> angle (which in our case would be right ascension), but it
is apparently a pain to rotate around any other axis, so I instead simply convert the star coordinates to Cartesian
coordinates using these formulae (where \(\theta\) is right ascension and \(\phi\) is declination):</p>

\[x = \cos (\theta) \cos (\phi)\]

\[y = \sin (\theta) \cos (\phi)\]

\[z = \sin (\phi)\]

<p>and then rotate them using the usual <a href="https://mathworld.wolfram.com/RotationMatrix.html">rotation matrices</a> that you’d
learn about in a typical linear algebra class (specifically using the
<a href="https://docs.rs/ndarray/latest/ndarray/"><code class="language-plaintext highlighter-rouge">ndarray</code></a> crate, my only non-web-related dependency).</p>
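<p>The conversion and rotation above can be sketched in a few lines. This is an illustrative Python version rather than the Rust code Beichen actually uses; the function names are hypothetical, and the choice of the x-axis as the rotation axis is an assumption made only to keep the example short.</p>

```python
import math


def equatorial_to_cartesian(ra_deg, dec_deg):
    """Unit vector for right ascension (theta) and declination (phi)."""
    theta = math.radians(ra_deg)
    phi = math.radians(dec_deg)
    return (
        math.cos(theta) * math.cos(phi),
        math.sin(theta) * math.cos(phi),
        math.sin(phi),
    )


def rotate_about_x(v, angle_deg):
    """Apply the standard 3D rotation matrix about the x-axis."""
    a = math.radians(angle_deg)
    x, y, z = v
    return (
        x,
        y * math.cos(a) - z * math.sin(a),
        y * math.sin(a) + z * math.cos(a),
    )


def cartesian_to_equatorial(v):
    """Inverse conversion, returning (right ascension, declination) in degrees."""
    x, y, z = v
    ra = math.degrees(math.atan2(y, x)) % 360.0
    dec = math.degrees(math.asin(max(-1.0, min(1.0, z))))
    return ra, dec


# A point on the celestial equator at RA 90 degrees, tilted by 23.4 degrees.
tilted = rotate_about_x(equatorial_to_cartesian(90.0, 0.0), 23.4)
tilted_ra, tilted_dec = cartesian_to_equatorial(tilted)
```

<p>Rotating in Cartesian space and converting back only at the end avoids special-casing every axis other than the polar one.</p>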

<p>Rather than using WebGL (the recommended way of rendering 3D content on a canvas), the program uses a
<a href="https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D"><code class="language-plaintext highlighter-rouge">CanvasRenderingContext2D</code></a> as a
rasterization engine for the stars (projected using a pinhole camera matrix). In
retrospect, my approach was likely far from optimal for performance; making all of those rendering calls from inside my
WebAssembly program means crossing the Wasm-JS boundary many times.</p>
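<p>For readers unfamiliar with pinhole projection, here is a minimal sketch of the idea. Everything about the camera convention is an assumption for illustration (camera at the origin looking along +z, focal length given in pixels, screen y growing downward), and <code class="language-plaintext highlighter-rouge">project_pinhole</code> is a hypothetical name, not a function from Beichen.</p>

```python
def project_pinhole(v, focal_px, width, height):
    """Project a camera-space direction (x, y, z) onto 2D canvas pixels.

    Returns None for points behind (or exactly beside) the camera.
    """
    x, y, z = v
    if not z > 0:
        return None  # not in front of the camera, so nothing to draw
    # Perspective divide, then shift the origin to the canvas centre.
    u = width / 2 + focal_px * x / z
    w = height / 2 - focal_px * y / z  # canvas y axis points down
    return (u, w)


centre = project_pinhole((0.0, 0.0, 1.0), 500.0, 500, 500)
behind = project_pinhole((0.0, 0.0, -1.0), 500.0, 500, 500)
```

<p>Each visible star then becomes one small filled circle on the 2D context, which is where the many boundary crossings come from.</p>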

<h3 id="skies-past-present-and-future">Skies Past, Present, and Future</h3>

<p>Here is an interactive version of Beichen, the night sky viewer. You can click and drag to move, or scroll to zoom.
Hovering over a star will show its designation, magnitude, and coordinates (unless you deselect that option). Try
out the checkboxes below to show equatorial coordinate lines, the ecliptic, and the north and south poles’ circles of
precession. The intended purpose of this program is to show the night sky at whatever year<sup id="fnref:precession-only:1" role="doc-noteref"><a href="#fn:precession-only" class="footnote" rel="footnote">2</a></sup> in
the past or future the user wants, but ultimately you can use it however you like.</p>

<div class="img-wrapper">
    <canvas id="interactive-canvas" style="touch-action: none" width="700" height="500"></canvas>
    <table style="text-align: center">
        <thead><tr><th colspan="3">Sky Parameters</th></tr></thead>
            <tr>
                <td>Precession (years)</td><td colspan="2"><input id="precession-years" type="number" value="0.0" step="10" /></td>
            </tr><tr>
                <td>Precession (°)</td><td colspan="2"><input id="precession-angle" type="number" value="0.0" step="0.1" min="-360" max="360" /></td>
            </tr>
            <tr><td colspan="3"></td></tr>
        <thead><tr><th colspan="3">View Settings</th></tr></thead>
        <tbody>
            <tr>
                <td>Roll (°)</td><td colspan="2"><input id="roll" type="number" value="0.0" min="-360" max="360" /></td>
            </tr><tr>
                <td>Brightness scaling parameter</td><td colspan="2"><input id="brightness-degree" type="number" min="0.1" max="3.0" value="2.0" step="0.01" /></td>
            </tr><tr>
                <td>Rotation speed (°/sec)</td><td colspan="2"><input id="rotation-speed" type="number" min="0.0" max="360.0" value="0.0" step="0.1" /></td>
            </tr><tr>
                <td>Show star info on hover</td><td colspan="2"><input id="star-hover" type="checkbox" checked="" /></td>
            </tr><tr>
                <th></th><th>Lines</th><th>Labels</th>
            </tr><tr>
                <td>Equatorial coordinates</td>
                <td><input id="celestial-lines" type="checkbox" checked="" /></td>
                <td><input id="celestial-text" type="checkbox" /></td>
            </tr><tr>
                <td>Circles of precession</td>
                <td><input id="orbital-lines" type="checkbox" checked="" /></td>
                <td><input id="orbital-text" type="checkbox" /></td>
            </tr><tr>
                <td>Ecliptic</td>
                <td>
                    <input id="ecliptic" type="checkbox" />
                </td><td>
                    <input id="quarters" type="checkbox" />
                </td>
            </tr><tr>
                <td>Orientation</td>
                <td></td>
                <td>
                    <input id="orientation-text" type="checkbox" checked="" />
                </td>
            </tr>
        </tbody>
    </table>
</div>

<h2 id="whats-so-special-about-2102">What’s so Special About 2102?</h2>

<p>Regardless of whether I’m reading a millennia-old Chinese text or conversing with my fellow 21st-century Americans, I
often find that people believe themselves to be living in an era of decline—a lesser shadow of some glorious past.
Back then, things were better; sons were filial, names were rectified, and coffee only cost a dime. I can’t say with
certainty whether things used to be better here on Earth, but I can tell you that the state of affairs up in the heavens
is very much the opposite. Whereas our ancestors (at least, for those of us with roots in the Northern Hemisphere) lived
under a sky that appeared to revolve around an empty point, our 21st-century celestial dome converges on a brilliant gem
of a star almost precisely at its center—that is, Polaris.<sup id="fnref:brightness" role="doc-noteref"><a href="#fn:brightness" class="footnote" rel="footnote">10</a></sup> What’s more, not only is the present greater
than the past, the future promises to outdo our present!</p>

<p><a href="https://pwg.gsfc.nasa.gov/stargaze/Sprecess.htm">Online pop science sources</a> tend to be a bit vague about exactly how
long the Earth’s precession cycle takes, generally claiming that the Earth’s axis of rotation will return to pointing at
the same spot after approximately 26,000 years. Rather than doing more research to find a rigorous source, I thought
it would be fun to try to calculate the cycle’s length using the data in the Yale Bright Star Catalogue. The YBS reports
observations for two <a href="https://astronomy.swin.edu.au/cosmos/*/Epoch">epochs</a>: B1900.0 &amp; J2000.0. By comparing the
apparent position of stars in 1900 vs. 2000, we can estimate the amount that the Earth’s axis precesses in a period of
100 “years”.<sup id="fnref:year-length" role="doc-noteref"><a href="#fn:year-length" class="footnote" rel="footnote">11</a></sup><sup id="fnref:equatorial-to-ecliptic" role="doc-noteref"><a href="#fn:equatorial-to-ecliptic" class="footnote" rel="footnote">12</a></sup> If we
take <a href="https://en.wikipedia.org/w/index.php?title=Orbital_pole&amp;oldid=1273206857">Wikipedia</a> as our source of truth,
trusting that the Earth’s northern orbital pole is at a declination of +66° 33′ 38.84″, then we find that the median
star observed in the YBS has appeared to “rotate” about 1.39651° around the orbital pole.<sup id="fnref:orbital-rotation" role="doc-noteref"><a href="#fn:orbital-rotation" class="footnote" rel="footnote">13</a></sup> Therefore:</p>

\[\frac{\textrm{precession}}{\textrm{year}} = \frac{1.39651°}{100 \textrm{ yrs.}}\]

\[= \frac{360°}{25{,}778.47 \textrm{ yrs.}}\]

\[\frac{\textrm{year}}{\textrm{precession}} = \frac{25{,}778.47 \textrm{ yrs.}}{360°}\]

\[\textrm{precession cycle} = 25{,}778.47 \textrm{ yrs.}\]
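<p>The same arithmetic as a short sketch, using the rounded 1.39651° figure quoted above. The function name is hypothetical. Because the input is rounded, the result lands within a fraction of a year of 25,778.47 rather than matching it digit for digit.</p>

```python
def precession_cycle_years(degrees_moved, years_elapsed):
    """Scale an observed rotation up to a full 360-degree cycle."""
    return 360.0 * years_elapsed / degrees_moved


# Median rotation between the B1900.0 and J2000.0 epochs, as quoted in the text.
cycle = precession_cycle_years(1.39651, 100.0)  # roughly 25,778.5 years
```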

<p>If we convert Polaris’s location to ecliptic coordinates<sup id="fnref:equatorial-to-ecliptic:1" role="doc-noteref"><a href="#fn:equatorial-to-ecliptic" class="footnote" rel="footnote">12</a></sup>, we find that its ecliptic longitude
is about 1.4323328° away<sup id="fnref:polaris-distance" role="doc-noteref"><a href="#fn:polaris-distance" class="footnote" rel="footnote">14</a></sup> from that of the celestial north pole (both positions according to the J2000.0 epoch).
Calculating the years from this angle, we find that the Earth’s celestial north pole will point closest to Polaris
in \(\frac{1.4323328°}{360°} \cdot 25{,}778.47 \textrm{ yrs.} = 102.56\) years.<sup id="fnref:precession-length-nasa" role="doc-noteref"><a href="#fn:precession-length-nasa" class="footnote" rel="footnote">15</a></sup>
Taking a “year” as 365.25 days (see note<sup id="fnref:year-length:1" role="doc-noteref"><a href="#fn:year-length" class="footnote" rel="footnote">11</a></sup>), we can use
a <a href="https://www.timeanddate.com/date/dateadded.html?m1=01&amp;d1=01&amp;y1=2000&amp;type=add&amp;ay=&amp;am=&amp;aw=&amp;ad=37461&amp;rec=">date calculator</a>
to find that our celestial north pole will point closest to Polaris on July 26, 2102, after which it will appear to
slowly drift further and further away.</p>
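<p>For anyone who would rather not trust a date calculator, the calculation can be reproduced with Python’s standard <code class="language-plaintext highlighter-rouge">datetime</code> module. This sketch assumes, as the text does, a start date of January 1, 2000 and a 365.25-day “year”, and it truncates to whole days, which is my own simplification; the function name is hypothetical.</p>

```python
from datetime import date, timedelta


def closest_approach(longitude_gap_deg, cycle_years,
                     start=date(2000, 1, 1), days_per_year=365.25):
    """Return (years until closest approach, calendar date of closest approach)."""
    years = longitude_gap_deg / 360.0 * cycle_years
    whole_days = int(years * days_per_year)  # drop the fractional day
    return years, start + timedelta(days=whole_days)


years_ahead, when = closest_approach(1.4323328, 25778.47)
```

<p>With the inputs above this gives about 102.56 years and July 26, 2102; swapping in a different cycle length shifts the date accordingly.</p>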

<div class="img-wrapper">
<canvas width="500" height="500" id="sky-2102"></canvas>

<p><i>The circumpolar region of the northern sky as it will appear in 2102, when Polaris is closest to the celestial north pole (click to toggle pole star highlighting)</i></p>
</div>

<p>It’s also possible (practically guaranteed, really) that other factors such as nutation, proper motion, and who knows
what else, could influence this date—but I believe only very slightly (as a sanity check, I’ve also played around
with <a href="https://stellarium.org/">Stellarium</a> and found that it generally agrees with my own software). If you think that
my calculations are wrong, if you have any new arguments, evidence, or hypotheses for the meaning of the word <em>beichen</em>
as used by Confucius, or if you just enjoyed my article, please leave a comment below, and I’ll try to get back to you!
As always, thanks for reading.</p>

<h2 id="footnotes">Footnotes</h2>

<script type="module" src="/static/confucius_axial_precession.js">
</script>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:beichen-north-star" role="doc-endnote">

      <p>As will become clear, “north polar star” is very likely a mistranslation; see below for discussion of the likely
meaning of the original Chinese “<em>beichen</em> (北辰)”. <a href="#fnref:beichen-north-star" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:precession-only" role="doc-endnote">

      <p>The software and calculations I present here only account for the Earth’s axial precession, and don’t factor
in <a href="https://earth.gsfc.nasa.gov/geo/multimedia/nutation-and-precession">nutation</a>,
<a href="http://hyperphysics.phy-astr.gsu.edu/hbase/Astro/para.html">parallax</a>,
nor <a href="https://www.esa.int/Science_Exploration/Space_Science/Gaia/Proper_motion">proper motion</a>. <a href="#fnref:precession-only" class="reversefootnote" role="doc-backlink">&#8617;</a> <a href="#fnref:precession-only:1" class="reversefootnote" role="doc-backlink">&#8617;<sup>2</sup></a></p>
    </li>
    <li id="fn:needham-citation" role="doc-endnote">

      <p>See <a href="https://archive.org/details/Science-and-Civilisation-in-China/Vol.3%201959%20Mathematics%20and%20the%20Sciences%20of%20the%20Heavens%20and%20the%20Earth/page/259/mode/2up">pages 259-261 of volume 3 of <em>Science and Civilisation in China</em></a> <a href="#fnref:needham-citation" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:needham-circle" role="doc-endnote">

      <p>Needham appears to have drawn his conclusions about pole stars using an incorrect projection of the north pole’s
circle of precession onto a star map. This figure, from page 260, incorrectly shows the north pole’s circle of
precession almost perfectly intersecting Edasich (ι Draconis, HR 5744) while skirting far from Thuban (α Draconis, HR
5291); the reality is the opposite: Thuban comes within 15″ of the north pole, while Edasich at its closest is
more than 4° away. The projection also shows Errai (γ Cephei, HR 8974) even closer to the circle of precession
than Polaris, when in fact it is over 1°50′ from it.</p>

      <div class="img-wrapper">
<img src="/images/confucius/needham-star-map.png" width="500" />
</div>

      <p>Most importantly, this chart likely led Needham to overestimate Kochab’s (β Ursae Minoris, HR 5563) viability as
a pole star. Needham claims <a href="https://simbad.cds.unistra.fr/simbad/sim-id?Ident=10+Dra">10 Draconis</a> (HR 5226) is
“not much further away” from the north pole’s circle of precession than Kochab. This is, again, the opposite of reality;
at its closest, 10 Draconis was less than 1°15′ from the north pole, <em>much closer</em> than Kochab, which never got within
6° of the celestial north pole! <a href="#fnref:needham-circle" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:needham-nomenclature" role="doc-endnote">
      <p>Needham refers to stars using a designation system that I have been totally unable to
penetrate. I am tempted to think that the “4339” in “4339 Camelopardi” is a Lalande number for a star that happens
to be in the Camelopardalis constellation; in fact, the <em>Catalogue of the British Association</em> (BAC)
<a href="https://archive.org/details/catalogueofstars00britrich/catalogueofstars00britrich/page/194/mode/2up">shows 4339</a> as a
star that—judging by its coordinates—appears to be one of the stars in Σ 1694 (which I believe is now considered
part of Camelopardalis, but is marked as part of Ursa Minor in the catalogue for some reason). However, the other
stars that Needham refers to using this convention cannot be the stars corresponding to these numbers in BAC; he
refers to <a href="https://simbad.cds.unistra.fr/simbad/sim-id?Ident=5+UMi">5 Ursae Minoris</a>
(HR 5430) as “a3233 Ursae minoris”, <a href="https://simbad.cds.unistra.fr/simbad/sim-id?Ident=4+UMi">4 Ursae Minoris</a> (HR 5321) as
“b3162 Ursae minoris”, and <a href="https://simbad.cds.unistra.fr/simbad/sim-id?Ident=10+Dra">10 Draconis</a> (HR 5226) as
“3067i Draconis”. While the <em>letters</em> in these designations correspond nicely to their
<a href="https://en.wikipedia.org/wiki/Bayer_designation">Bayer designations</a>, none of the four-digit numbers, when checked in
BAC, corresponds to a star that could plausibly be the same star as the one given by Needham; in many cases, they are
in the Southern Hemisphere! (To avoid cursing future generations with yet more confusion: the number-constellation
designations that <em>I</em> used above are <a href="https://en.wikipedia.org/wiki/Flamsteed_designation">Flamsteed designations</a>).</p>

      <p>Needham gives “\(32^2\) H” as an alternate designation for 4339 Camelopardi. Some secondary sources, such as
<a href="https://en.wikipedia.org/w/index.php?title=Struve_1694&amp;oldid=1253498411">Wikipedia</a>, claim that Σ 1694 is given as 32
Camelopardi in <a href="https://en.wikipedia.org/wiki/Johannes_Hevelius">Hevelius’s</a> catalogue. Indeed, the superscripted “2”
hints to me that the referenced star may in fact be a member of a double star, such as Σ 1694. However, this claim
does not survive a cursory investigation; <a href="https://cdsarc.cds.unistra.fr/ftp/J/A+A/516/A29/hevelius.dat">the catalogue</a>
shows Hevelius’s 32nd star of Camelopardalis at an ecliptic latitude of scarcely more than 44° “borealis”. Rather than
attempting to calculate its equatorial coordinates, I’ll just say that this star’s declination cannot be more than
+68°, meaning that Hevelius’s 32 Camelopardi can be neither Σ 1694 nor any pole star of the Han
dynasty.</p>

      <p>With any luck, the world’s experts in history of astronomy are currently yelling at their computer screens as they
are reading this, and I’ll soon receive a deluge of comments telling me exactly which obscure star catalogue
Needham’s 4-digit star designations come from. <a href="#fnref:needham-nomenclature" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:pankenier" role="doc-endnote">

      <p>Pankenier, David W. 2004. “A Brief History of <em>Beiji</em> 北極 (Northern Culmen), with an Excursus on the Origin
of the Character <em>Di</em> 帝.” <em>Journal of the American Oriental Society</em> 124 (2): 211–36. <a href="#fnref:pankenier" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:brooks-and-brooks" role="doc-endnote">

      <p>Brooks, E. Bruce, and A. Taeko Brooks. <em>The Original Analects: Sayings of Confucius and His Successors: A New
Translation and Commentary</em>. New York: Columbia University Press, 1998. (Page 109) <a href="#fnref:brooks-and-brooks" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:erya" role="doc-endnote">

      <p>The <em>Erya</em> 爾雅, the oldest Chinese dictionary (some of whose content may date to the
time of Confucius), <a href="https://ctext.org/er-ya/shi-tian">defines</a> <em>beichen 北辰</em> simply as “north pole” (<em>beiji</em> 北極):</p>

      <blockquote>
        <p>北極謂之北辰。</p>

        <p>(爾雅 - 釋天)</p>
      </blockquote>

      <p>Frustratingly, the <em>Erya</em> here is simply defining one word using another word. <em>Beiji</em> 北極 quite likely means “north pole”
(at least on some level) but I can’t be certain it isn’t also a metonym for some star or asterism. <a href="#fnref:erya" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:and-radius" role="doc-endnote">

      <p>Spherical coordinates also have a radius. For our purposes, stars are points-at-infinity, so I simply represent them
as vectors on the unit sphere, i.e. having a radius of 1. <a href="#fnref:and-radius" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:brightness" role="doc-endnote">

      <p>While Thuban was at one point closer to the north celestial pole, Polaris is <em>much</em> brighter than Thuban.
Using <a href="https://simbad.cds.unistra.fr/simbad/sim-id?Ident=Polaris">Polaris</a>
and <a href="https://simbad.cds.unistra.fr/simbad/sim-id?Ident=Alpha+Draconis">Thuban</a>’s visual magnitudes of 2.02 and 3.68
respectively, I find that Polaris is \(\sqrt[5]{100}^{(3.68 - 2.02)} \approx 4.61\) times as bright as Thuban. <a href="#fnref:brightness" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:year-length" role="doc-endnote">

      <p>“Years” is in scare quotes because there are 36,525 days between the B1900.0 epoch (December 31, 1899) and the J2000.0
epoch (January 1, 2000), therefore a “year” for our purposes is precisely 365.25 days long, slightly longer than the
365.2422 days of an actual year. <a href="#fnref:year-length" class="reversefootnote" role="doc-backlink">&#8617;</a> <a href="#fnref:year-length:1" class="reversefootnote" role="doc-backlink">&#8617;<sup>2</sup></a></p>
    </li>
    <li id="fn:equatorial-to-ecliptic" role="doc-endnote">

      <p>Before comparing the stars’ coordinates across the two epochs, I had to convert them from equatorial to ecliptic
coordinates. This can be achieved by simply “un-tilting” the Earth; i.e. rotating the stars’ coordinates by the
obliquity of the ecliptic. Technically, my value for the obliquity of the ecliptic is only accurate for the present
day; it has probably changed slightly since J2000.0, and certainly somewhat since B1900.0. <a href="#fnref:equatorial-to-ecliptic" class="reversefootnote" role="doc-backlink">&#8617;</a> <a href="#fnref:equatorial-to-ecliptic:1" class="reversefootnote" role="doc-backlink">&#8617;<sup>2</sup></a></p>
    </li>
    <li id="fn:orbital-rotation" role="doc-endnote">

      <p>My code to estimate the length of the Earth’s cycle of axial precession can be
found <a href="https://github.com/boyleconnor/beichen-wasm/blob/126d1befa6b9ef3751387e4eb28a9611dce323bd/src/sky.rs#L415">here</a> <a href="#fnref:orbital-rotation" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:polaris-distance" role="doc-endnote">

      <p>My code to estimate Polaris’s longitudinal displacement from the celestial north pole can be found
<a href="https://github.com/boyleconnor/beichen-wasm/blob/126d1befa6b9ef3751387e4eb28a9611dce323bd/src/sky.rs#L379">here</a> <a href="#fnref:polaris-distance" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:precession-length-nasa" role="doc-endnote">

      <p>This <a href="https://science.nasa.gov/science-research/earth-science/milankovitch-orbital-cycles-and-their-role-in-earths-climate/">public education article</a>
by NASA states that the Earth’s cycle of precession is 25,771.5 years long. On the off chance that I, an amateur
blogger with no formal astronomy training, am wrong, and NASA is right, then the Earth’s north pole will point
closest to Polaris 10 days earlier on July 16, 2102 (again, assuming no nutation, proper motion, etc.).
Prepare accordingly. <a href="#fnref:precession-length-nasa" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Connor Boyle</name></author><category term="astronomy" /><category term="history" /><category term="software" /><summary type="html"><![CDATA[Confucius once said:]]></summary></entry><entry><title type="html">Fast Fourier Transforms Part 1: Cooley-Tukey</title><link href="https://connorboyle.io/2025/09/11/fft-cooley-tukey.html" rel="alternate" type="text/html" title="Fast Fourier Transforms Part 1: Cooley-Tukey" /><published>2025-09-11T00:00:00+00:00</published><updated>2025-09-11T00:00:00+00:00</updated><id>https://connorboyle.io/2025/09/11/fft-cooley-tukey</id><content type="html" xml:base="https://connorboyle.io/2025/09/11/fft-cooley-tukey.html"><![CDATA[<p><strong><em>I’m planning to write a series of posts about fast Fourier transform algorithms. This first post covers the
Cooley-Tukey algorithm, which is the original and most well-known FFT algorithm.</em></strong></p>

<h2 id="the-discrete-fourier-transform">The Discrete Fourier Transform</h2>

<p>If \(x\) is a sequence of complex numbers with a length \(\lvert x \rvert\) and a starting index of 0, then the discrete
Fourier transform of \(x\), \(\mathcal{F} \{ x \}\), is defined as follows:</p>

\[|\mathcal{F} \{ x \}| = |x|\]

\[\mathcal{F} \{ x \}[k] = \sum_{j=0}^{|x|-1} x[j] \cdot e^{-i 2 \pi jk \frac{1}{|x|}}\]

<p>Since complex exponentiation is so commonly used in Fourier transforms, we’ll define a helpful term \(W_N\) as follows:</p>

\[W_N \triangleq e^{-i 2 \pi \frac{1}{N}}\]

<p>i.e. \(W_N = W_N^1\) is a \(\frac{1}{N}\)-turn rotation in the complex plane (starting at 1). \(W_N^2\) is
a \(\frac{2}{N}\)-turn rotation in the complex plane, etc. Substituting \(W_{\lvert x \rvert}\) into the original discrete Fourier transform
definition, we get:</p>

\[\mathcal{F} \{ x \}[k] = \sum_{j=0}^{|x|-1} x[j] \cdot W_{|x|}^{jk}\]
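
<p>As a quick sanity check on the rotation picture, here is a minimal Python sketch of \(W_N\). The helper name <code class="language-plaintext highlighter-rouge">W</code> is only an illustrative choice, not something taken from a library:</p>

```python
import cmath


def W(N: int, power: int = 1) -> complex:
    """Return W_N^power = e^{-2*pi*i*power/N}, a (power/N)-turn rotation."""
    return cmath.exp(-2j * cmath.pi * power / N)


# The minus sign makes W_4 a quarter turn clockwise (in the usual
# orientation of the complex plane), taking 1 to -i.
quarter_turn = W(4)

# Eight successive 1/8-turn rotations bring us back to 1.
full_turn = W(8) ** 8
```

<p>Up to floating point round-off, <code class="language-plaintext highlighter-rouge">quarter_turn</code> is \(-i\) and <code class="language-plaintext highlighter-rouge">full_turn</code> is \(1\).</p>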

<p>Naïvely evaluating this equation for <em>each</em> of the \(\lvert x \rvert\) different output frequency
buckets of the DFT (\(k = 0, 1, \ldots, \lvert x \rvert - 2, \lvert x \rvert - 1\)) requires a summation
of complex products of the \(\lvert x \rvert\) samples in the signal, thus giving any naïve DFT algorithm a time
complexity of \(O(\lvert x \rvert^2)\).</p>
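
<p>A direct transcription of the definition might look like the following sketch (the function name <code class="language-plaintext highlighter-rouge">naive_dft</code> is hypothetical). The two nested loops, one over \(k\) and one over \(j\), are where the quadratic cost comes from:</p>

```python
import cmath


def naive_dft(x):
    """Evaluate the DFT definition directly: len(x) sums of len(x) terms each."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


# An impulse at index 0 has a flat spectrum: every bucket equals 1.
impulse_spectrum = naive_dft([1, 0, 0, 0])

# A constant signal puts everything into bucket 0.
constant_spectrum = naive_dft([1, 1, 1, 1])
```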

<h2 id="the-cooley-tukey-algorithm">The Cooley-Tukey Algorithm</h2>

<p>If \(\lvert x \rvert\) is a composite number, we can pick two natural numbers \(r\) and \(d\), such that:</p>

\[\lvert x \rvert = r \cdot d\]

<p>This allows us to change the single summation over \(j\) into nested summations:</p>

\[\mathcal{F} \{ x \}[k] = \sum_{j_1=0}^{d-1} \sum_{j_0=0}^{r-1} x[j_1 r + j_0] \cdot W_{\lvert x \rvert}^{(j_1 r + j_0)k}\]

\[= \sum_{j_0=0}^{r-1} \sum_{j_1=0}^{d-1} x[j_1 r + j_0] \cdot W_{\lvert x \rvert}^{(j_1 r + j_0)k}\]

<p>Similarly, we can define variables \(k_0\) and \(k_1\) such that \(k = k_1 d + k_0\). Let:</p>

\[k_1 \triangleq \lfloor \frac{k}{d} \rfloor\]

\[k_0 \triangleq k - k_1 d\]

<p>In other words, \(k_1\) is the quotient and \(k_0\) is the remainder of
the <a href="https://en.wikipedia.org/wiki/Euclidean_division">Euclidean division</a><sup id="fnref:euclidean-division" role="doc-noteref"><a href="#fn:euclidean-division" class="footnote" rel="footnote">1</a></sup> of \(k\) by \(d\).</p>
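
<p>In Python, Euclidean division of non-negative integers is what the built-in <code class="language-plaintext highlighter-rouge">divmod</code> computes, so the split of \(k\) into \(k_1\) and \(k_0\) can be sketched as follows (the function name is just for illustration):</p>

```python
def split_output_index(k: int, d: int):
    """Return (k1, k0) such that k == k1 * d + k0, with k0 in range(d)."""
    k1, k0 = divmod(k, d)
    return k1, k0


# Example: |x| = 12 factored as r = 3, d = 4.
d = 4
pairs = [split_output_index(k, d) for k in range(12)]
```

<p>Here, for instance, \(k = 7\) splits into \(k_1 = 1\) and \(k_0 = 3\), since \(7 = 1 \cdot 4 + 3\).</p>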

<p>This allows us to again re-formulate the discrete Fourier transform:</p>

\[\mathcal{F} \{ x \}[k] = \sum_{j_0=0}^{r-1} \sum_{j_1=0}^{d-1} x[j_1 r + j_0] \cdot W_{|x|}^{(j_1 r + j_0) (k_1 d + k_0)}\]

\[= \sum_{j_0=0}^{r-1} \sum_{j_1=0}^{d-1} x[j_1 r + j_0] \cdot W_{|x|}^{j_1 r k_1 d + j_1 r k_0 + j_0 (k_1 d + k_0)}\]

\[= \sum_{j_0=0}^{r-1} \sum_{j_1=0}^{d-1} x[j_1 r + j_0] \cdot W_{|x|}^{j_1 k_1 |x| + j_1 r k_0 + j_0 k}\]

\[= \sum_{j_0=0}^{r-1} \sum_{j_1=0}^{d-1} x[j_1 r + j_0] \cdot W_{|x|}^{j_1 k_1 |x|} \cdot W_{|x|}^{j_1 r k_0} \cdot W_{|x|}^{j_0 k}\]

<p>Since \(W_{\lvert x \rvert}^{j_1 k_1 \lvert x \rvert} = (e^{-i \frac{2 \pi \lvert x \rvert}{\lvert x \rvert}})^{j_1 k_1} = 1^{j_1 k_1} = 1\), the first factor drops out:</p>

\[\mathcal{F} \{ x \}[k] = \sum_{j_0=0}^{r-1} \sum_{j_1=0}^{d-1} x[j_1 r + j_0] \cdot W_{\lvert x \rvert}^{j_1 r k_0} \cdot W_{\lvert x \rvert}^{j_0 k}\]

<p>At this point, we can split the elements of \(x\) into sub-sequences corresponding to residue classes modulo \(r\). Let
\(x_{r}^{j_0}\) be the sequence made up of the elements of \(x\) whose indices are congruent to \(j_0\)
modulo \(r\). More formally, these sequences (of which there are \(r\) total) can be defined as follows:</p>

\[x_r^{j_0}[j_1] = x[j_1 r + j_0]\]

\[|x_r^{j_0}| = \frac{|x|}{r} = d\]

<p>Substituting this sequence definition, we get:<sup id="fnref:dft-xj0" role="doc-noteref"><a href="#fn:dft-xj0" class="footnote" rel="footnote">2</a></sup></p>

\[\mathcal{F} \{ x \}[k] = \sum_{j_0=0}^{r-1} \sum_{j_1=0}^{d-1} x_r^{j_0}[j_1] W_{|x|}^{k_0 j_1 r} W_{|x|}^{j_0 k}\]

\[= \sum_{j_0=0}^{r-1} \sum_{j_1=0}^{d-1} x_r^{j_0}[j_1] W_d^{k_0 j_1} W_{|x|}^{j_0 k}\]

\[= \sum_{j_0=0}^{r-1} \mathcal{F} \{ x_r^{j_0} \}[k_0] W_{|x|}^{j_0 k}\]
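
<p>The last line above can be checked numerically. The following is a minimal sketch of a single (non-recursive) Cooley-Tukey step, in which the sub-transforms are still computed naïvely; both function names are hypothetical:</p>

```python
import cmath


def naive_dft(x):
    """Direct evaluation of the DFT definition."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def cooley_tukey_step(x, r):
    """One level of the decomposition with |x| = r * d."""
    n = len(x)
    if n % r != 0:
        raise ValueError("r must divide len(x)")
    d = n // r
    # x_r^{j0}[j1] = x[j1 * r + j0]: every r-th sample, starting at index j0.
    sub_dfts = [naive_dft(x[j0::r]) for j0 in range(r)]
    # F{x}[k] = sum over j0 of F{x_r^{j0}}[k0] * W_{|x|}^{j0 * k}, with k0 = k mod d.
    return [
        sum(
            sub_dfts[j0][k % d] * cmath.exp(-2j * cmath.pi * j0 * k / n)
            for j0 in range(r)
        )
        for k in range(n)
    ]


signal = [complex((i * i) % 7, i % 3) for i in range(12)]
direct = naive_dft(signal)
split = cooley_tukey_step(signal, 3)  # r = 3, d = 4
```

<p>Up to round-off, <code class="language-plaintext highlighter-rouge">split</code> and <code class="language-plaintext highlighter-rouge">direct</code> agree element by element, and the same holds for any other \(r\) that divides 12.</p>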

<p>Let’s consider how long it will take to evaluate the discrete Fourier transform in this formulation:</p>

<ul>
  <li>First, we need to pre-compute the discrete Fourier transforms of the \(r\) different sequences \(x_r^{j_0}\) (i.e.
\(\mathcal{F} \{ x_r^0 \}, \mathcal{F} \{ x_r^1 \}, \ldots, \mathcal{F} \{ x_r^{r-1} \}\)), each of
length \(d\). Assuming each DFT is performed naïvely, this will take \(O(r \cdot d^2) = O(\lvert x \rvert \cdot d)\)
operations.</li>
  <li>Then, we need to compute the \(\lvert x \rvert\) values of \(\mathcal{F} \{ x \}\), each of which requires a summation
over the \(r\) different values of \(j_0\) for \(\mathcal{F} \{ x_r^{j_0} \}[k_0]\),
taking \(O(\lvert x \rvert \cdot r)\) operations.</li>
</ul>

<p>Added together, these two sub-routines require \(O(\lvert x \rvert \cdot d + \lvert x \rvert \cdot r) = O(\lvert x \rvert \cdot (d + r))\)
operations, possibly a significant improvement over the \(O(\lvert x \rvert^2)\) complexity of the
naïve formulation, depending on the values of \(r\) &amp; \(d\) (e.g. for \(\lvert x \rvert = 24\), \(r = 4\), and \(d = 6\), that is \(24 \cdot 10 = 240\) versus \(24^2 = 576\)). More importantly, this manipulation can be applied
recursively. Specifically, each of the \(r\) discrete Fourier transforms of the \(d\)-length sequences \(x_r^{j_0}\) can
be broken down into \(r'\) Fourier transforms of length \(d'\), assuming that two natural numbers exist such
that \(d = r' \cdot d'\).<sup id="fnref:radix-one" role="doc-noteref"><a href="#fn:radix-one" class="footnote" rel="footnote">3</a></sup> In the ideal<sup id="fnref:ideal" role="doc-noteref"><a href="#fn:ideal" class="footnote" rel="footnote">4</a></sup> case where \(\lvert x \rvert = 2^n\), \(n \in \mathbb{N}\),
computing the DFT with the Cooley-Tukey algorithm will
require \(O(\lvert x \rvert \cdot (2 + 2 + \ldots + 2)) = O(\lvert x \rvert \cdot 2 \cdot \log_2(\lvert x \rvert)) = O (\lvert x \rvert \cdot \log(\lvert x \rvert))\)
operations.</p>
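
<p>Applied recursively, the decomposition might be sketched as below. Picking \(r\) as the smallest prime factor at each level is an assumption of this sketch (any factor of the length would work), and the function names are hypothetical. This is meant to mirror the derivation, not to be a tuned implementation:</p>

```python
import cmath


def smallest_factor(n):
    """Smallest factor of n greater than 1 (n itself when n is prime)."""
    f = 2
    while n >= f * f:
        if n % f == 0:
            return f
        f += 1
    return n


def cooley_tukey_dft(x):
    """Recursive Cooley-Tukey: split |x| = r * d, recurse on the r sub-sequences."""
    n = len(x)
    if n in (0, 1):
        return list(x)  # the DFT of a length-1 sequence is the sequence itself
    r = smallest_factor(n)
    d = n // r
    sub_dfts = [cooley_tukey_dft(x[j0::r]) for j0 in range(r)]
    return [
        sum(
            sub_dfts[j0][k % d] * cmath.exp(-2j * cmath.pi * j0 * k / n)
            for j0 in range(r)
        )
        for k in range(n)
    ]


signal = [complex(i % 5, -i) for i in range(24)]
spectrum = cooley_tukey_dft(signal)  # 24 = 2 * 2 * 2 * 3
```

<p>For a prime length, <code class="language-plaintext highlighter-rouge">smallest_factor</code> returns the length itself, so \(d = 1\) and the recursion collapses into the naïve summation.</p>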

<p>The Cooley-Tukey algorithm can also be used to calculate the inverse discrete Fourier transform with only very slight
modification. In fact, the original Cooley-Tukey paper (see “related reading”) specifically described an algorithm to
compute the <em>inverse</em> discrete Fourier transform, not the “forward” DFT. I will leave the Cooley-Tukey iDFT algorithm as
an exercise for the reader.</p>

<p>However, note that the Cooley-Tukey algorithm gives no speed-up for input sequences of prime length, and provides
relatively little speed-up when the factors of the input length contain large primes. To efficiently compute the DFT for
sequences of prime or even non-highly-composite lengths, we will need additional algorithms. Ultimately, however, these
other FFT algorithms generally depend on Cooley-Tukey for part of the computation. I plan to cover at least one of these
techniques, Bluestein’s algorithm, in a future blog post.</p>

<h3 id="cooley-tukey-interactive-visualization">Cooley-Tukey Interactive Visualization</h3>

<p>This visualization shows how the discrete Fourier transform of some signal \(x\) is computed using the Cooley-Tukey
algorithm. The black boxes at the very bottom are the input signal. While the DFT can be applied to complex signals,
I’ve restricted the sample values of the input signal to be real numbers, for simplicity’s sake (this mimics some
real-world applications, such as performing a DFT on an audio recording). You can click and drag on the input boxes to
change their values.</p>

<p>The grey circles and the sometimes visible white “clock hands” inside of them represent the complex
exponential \(W_N^x = e^{-2 i \pi \frac{x}{N}}\) for some \(N\) (e.g. \(\lvert x \rvert\), \(d\), \(d'\), etc.) and
some exponent \(x\) (not to be confused with the signal \(x\)). These complex exponentials, which are equivalent to rotations in the complex plane, are applied to the relevant
input value. Unlike the usual convention, I’ve decided to show the real component of the complex plane as vertical (“up”
is positive-real) and the imaginary component as horizontal (“right” is positive-imaginary). You can see where the input
value is drawn from by hovering the mouse over a given “rotation” box.</p>

<p>Those rotated input values are summed to calculate one element of a discrete Fourier transform. Hover
over a white “output” box of a discrete Fourier transform in the visualization to highlight the column of “rotation”
boxes that it was summed from.</p>

<p>\(|x| =\)
<input type="number" min="1" max="32" value="24" id="cooley-tukey-size-selector" /></p>

<p>Available factors:</p>
<div id="factor-check-boxes"></div>

<div id="r-array"></div>

<div id="ct-math-output" style="font-size: x-large; height: 150px; display: flex; justify-content: center; align-items: center; flex-direction: column; overflow: scroll"></div>

<canvas id="cooley-tukey-visualization" width="800" height="1000"></canvas>

<script src="/static/cooley_tukey_visualization.js"></script>

<h2 id="a-quick-rant-on-word-choice">A Quick Rant on Word Choice</h2>

<p>I’ve noticed an irritating and confusing tendency among many people–including
<a href="https://www.google.com/books/edition/The_Sparse_Fourier_Transform/I4ZTDwAAQBAJ?hl=en&amp;gbpv=1&amp;dq=%22the+FFT+of%22&amp;pg=PT78&amp;printsec=frontcover">published authors</a>–when
talking about discrete Fourier transforms. I find they often use the phrase “fast Fourier transform” (or perhaps more
often, the abbreviation “FFT”) when they mean “discrete Fourier transform” (or “DFT”). I think this is wrong and
confusing; to understand why, imagine you have a list:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">x</span> <span class="o">=</span> <span class="p">[</span><span class="mi">10</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">19</span><span class="p">,</span> <span class="o">-</span><span class="mi">2</span><span class="p">]</span>
</code></pre></div></div>

<p>would it make sense to refer to the following list as the “mergesort” of the previous list?</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">y</span> <span class="o">=</span> <span class="p">[</span><span class="o">-</span><span class="mi">2</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="mi">19</span><span class="p">]</span>
</code></pre></div></div>

<p>I think most people would find that very strange. We can say that the second list is the result of sorting the first
list, but we don’t know anything about the specific algorithm that was used to sort the list. It could have been sorted
using mergesort, quicksort, heapsort, bubblesort, bogosort, or any
other <a href="https://en.wikipedia.org/wiki/Sorting_algorithm#Popular_sorting_algorithms">sorting algorithm</a> (in reality, I
just sorted this one in my head). However, even if I <em>had</em> used mergesort, <code class="language-plaintext highlighter-rouge">y</code> still wouldn’t <em>be</em> the “mergesort” of
<code class="language-plaintext highlighter-rouge">x</code>, it would still just be “the result of sorting <code class="language-plaintext highlighter-rouge">x</code>”.</p>

<p>Similarly, the output of an FFT algorithm should not be referred to as “an/the FFT of” anything. In theory, calculating
the DFT of a sequence using Cooley-Tukey gives the exact same result as calculating that DFT using a naïvely-implemented
DFT algorithm. In practice, Cooley-Tukey will probably give a slightly more accurate result since there are fewer total
calculations and therefore fewer opportunities for floating point round-off error.</p>

<p>This is not just irritating to me; I think it causes confusion among the public. A friend of mine–an intelligent
mathematics major and software engineer–recently asked me “what would I lose by taking the fast Fourier transform
instead of the ‘normal’ Fourier transform? I’ve always just taken the normal Fourier transform”. He probably inferred
from the way many people throw around the phrase “FFT” that the “fast Fourier transform” is some kind of approximation
or related concept yet distinct from the discrete Fourier transform or Fourier transforms in general, rather than an
<em>algorithm</em> for calculating such.</p>

<h2 id="sources--related-reading">Sources &amp; Related Reading</h2>

<ul>
  <li><a href="https://web.stanford.edu/class/cme324/classics/cooley-tukey.pdf">An Algorithm for the Machine Calculation of Complex Fourier Series</a> by James W. Cooley &amp; John W. Tukey</li>
</ul>

<div style="text-align: center">
  <p><em>Thank you to my friend Andre Archer, who helped to proofread an earlier version of this post. Any mistakes are my own.</em></p>
</div>

<p><strong>Footnotes:</strong></p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:euclidean-division" role="doc-endnote">
      <p>As far as I am aware, there is no widely-standardized notation for Euclidean division in
                   mathematics. Personally, I find this quite silly. <a href="#fnref:euclidean-division" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:dft-xj0" role="doc-endnote">
\[\mathcal{F} \{ x_r^{j_0} \}[k_0] = \sum_{j_1 = 0}^{d - 1} x_r^{j_0}[j_1] \cdot W_d^{k_0 j_1}\]
      <p><a href="#fnref:dft-xj0" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:radix-one" role="doc-endnote">
      <p>If \(r\) is chosen to be equal to \(\lvert x \rvert\), and therefore \(d = 1\), then
          calculating \(\mathcal{F} \{ x \}\) using Cooley-Tukey is equivalent to calculating
          \(\mathcal{F} \{ x \}\) naïvely, taking \(O(\lvert x \rvert^2)\) steps. <a href="#fnref:radix-one" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:ideal" role="doc-endnote">
      <p>By “ideal” I do not necessarily mean optimal, in any sense. \(|x| = 2^n\) is “ideal” in that it is very simple
      to reason about. <a href="#fnref:ideal" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Connor Boyle</name></author><category term="mathematics" /><category term="software" /><summary type="html"><![CDATA[I’m planning to write a series of posts about fast Fourier transform algorithms. This first post covers the Cooley-Tukey algorithm, which is the original and most well-known FFT algorithm.]]></summary></entry><entry><title type="html">Rep. Paul Gosar’s Claims about OPT Contain Major Errors</title><link href="https://connorboyle.io/2025/05/23/gosar-wrong-on-opt.html" rel="alternate" type="text/html" title="Rep. Paul Gosar’s Claims about OPT Contain Major Errors" /><published>2025-05-23T00:00:00+00:00</published><updated>2025-05-23T00:00:00+00:00</updated><id>https://connorboyle.io/2025/05/23/gosar-wrong-on-opt</id><content type="html" xml:base="https://connorboyle.io/2025/05/23/gosar-wrong-on-opt.html"><![CDATA[<p>U.S. Congressional Representative Paul Gosar of Arizona<sup id="fnref:state" role="doc-noteref"><a href="#fn:state" class="footnote" rel="footnote">1</a></sup> recently reintroduced proposed legislation to ban the
Optional Practical Training program (OPT).<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">2</a></sup> OPT is a program that allows international students attending college &amp;
grad school to legally remain in the United States and work in their field of study for 1 year after completing their
degrees.<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">3</a></sup> Students majoring in an approved science, technology, engineering, and/or math
(STEM) <a href="https://www.ice.gov/sites/default/files/documents/stem-list.pdf">field</a> are eligible to apply for a 2-year
extension at the end of their initial 1-year OPT periods.</p>

<p>As a former volunteer at my alma mater’s international student program, I happen to have some knowledge about OPT and
the policies concerning student (F-1) visa holders that led me to suspect there were major errors in Paul
Gosar’s <a href="https://web.archive.org/web/20131031070722/http://www.irs.gov/Individuals/International-Taxpayers/Exempt-Individual-Who-is-a-Student">statement</a>
regarding the bill, as well as in the articles from advocacy groups that he cites. In addition to some superficial and
obvious errors, such as claiming that OPT was “expanded by three years by the Obama Administration” (it was in fact
expanded only by seven months), Gosar’s post claims that:</p>

<blockquote>
  <p>These foreign workers are exempt from payroll taxes[,] making them at least 10-15 percent [sic] cheaper than a
comparable American worker. NumbersUSA
<a href="https://www.numbersusa.com/wp-content/uploads/2024/11/Factsheet_-Optional-Practical-Training-OPT.pdf">reports</a>
OPT costs the Social Security and Medicare trust fund [sic] $4 billion annually.</p>
</blockquote>

<p>The document from NumbersUSA (an immigration restrictionist advocacy group) cites
a <a href="https://cis.org/Feere/Optional-Practical-Training-Foreign-Students-Now-4-Billion-Annual-Tax-Exemption">2024 article</a>
from the Center for Immigration Studies (CIS, another restrictionist group), which in turn simply scales up an earlier
estimate from a <a href="https://cis.org/North/Obscure-Immigration-Program-Hurts-US-Residents-Both-Young-and-Old">2015 CIS article</a>.</p>

<p>The author of the 2015 article, David North, tallies the number of OPT &amp; OPT STEM extension approvals, then multiplies
by 12 or 17 months respectively (the STEM extension was only 17 months long at the time) to determine the number of
years an OPT worker will have worked in the United States without being liable for Social Security and Medicare taxes.
He proposes an estimated average OPT salary based on that of the average college graduate, and estimates the total loss
to Social Security &amp; Medicare like so:</p>

\[(\textrm{FICA-exempt worker-years}) \cdot (\textrm{avg. salary}) \cdot (\textrm{total FICA rate}) = \textrm{total loss to Soc. Sec. &amp; Medicare}\]

\[= \textrm{524,021 years} \cdot \textrm{\$50,000/yr.} \cdot 15.3\% = \textrm{\$4,008,760,650}\]
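
<p>For anyone who wants to replay the arithmetic, the estimate is a single product. A minimal sketch (the function and parameter names are hypothetical, and the inputs are simply the figures from the equation above):</p>

```python
def estimated_fica_loss(worker_years, avg_salary, total_fica_rate):
    """Total FICA 'lost', assuming every worker-year is FICA-exempt."""
    return worker_years * avg_salary * total_fica_rate


# 15.3% is the combined employee + employer FICA rate.
loss = estimated_fica_loss(524_021, 50_000, 0.153)
```

<p>Note that the result scales linearly with the number of worker-years assumed to be FICA-exempt, so any error in that input carries straight through to the total.</p>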

<p><br /></p>

<h2 id="many-opt-workers-do-in-fact-pay-fica">Many OPT Workers do in Fact Pay FICA</h2>

<p>There are a few problems with this estimate, but I will start with the most glaring one: it is <em>not true</em> that all or
even nearly all post-completion OPT workers (and their employers) are exempt from FICA. The reason that <em>some</em> OPT
workers are exempt from FICA is that they are <strong>nonresident aliens</strong>. And while F-1 student visa holders (including OPT
workers) are always considered nonresident aliens for immigration purposes, they are only treated as such for tax
purposes <em>until they pass
the <a href="https://www.irs.gov/individuals/international-taxpayers/substantial-presence-test">“substantial presence” test</a></em>.<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">4</a></sup></p>

<p>The “substantial presence” test is rather complicated, but in a typical case, an international student will not be
considered substantially present for their first 5 calendar years in the United States, due
to <a href="https://www.irs.gov/individuals/international-taxpayers/exempt-individual-who-is-a-student">a policy</a> exempting F-1
visa holders for that period. The visa holder will usually come to be considered “substantially present”—assuming they
spend a majority of that year in the United States—beginning January 1st of their 6th calendar year in the United
States. The visa holder is then treated as a <strong>resident alien</strong> for tax purposes and is thus subject to the payroll
taxes funding Social Security &amp; Medicare, a.k.a. FICA. The only likely way for an international student <em>not</em> to be
considered a resident alien is if they spend fewer than 183 days in the United States in their final calendar year in
the United States.</p>
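
<p>The day-counting arithmetic behind the test can be sketched roughly as follows. This is a simplification under stated assumptions: it models only the weighted 183-day count (all days in the current year, one third of the days in the prior year, one sixth of the days in the year before that, plus a 31-day minimum in the current year) and the 5-calendar-year exemption for F-1 students. It ignores the closer-connection exception, treaty provisions, and any earlier visa history. All names are hypothetical:</p>

```python
def countable_days(days_by_year, year, first_year_in_us, exempt_years=5):
    """Days in `year` that count toward the test for an F-1 student."""
    calendar_year_number = year - first_year_in_us + 1  # 1 for the arrival year
    if calendar_year_number > exempt_years:
        return days_by_year.get(year, 0)
    return 0  # exempt years (and years before arrival) contribute nothing


def is_substantially_present(days_by_year, tax_year, first_year_in_us):
    """Simplified substantial presence test for a given tax year."""
    current = countable_days(days_by_year, tax_year, first_year_in_us)
    weighted = (
        current
        + countable_days(days_by_year, tax_year - 1, first_year_in_us) / 3
        + countable_days(days_by_year, tax_year - 2, first_year_in_us) / 6
    )
    return current >= 31 and weighted >= 183


# Hypothetical student who arrived in 2018, so 2023 is their 6th calendar year.
days = {2018: 140, 2019: 330, 2020: 340, 2021: 335, 2022: 350, 2023: 200}
resident_2022 = is_substantially_present(days, 2022, 2018)  # 5th year: still exempt
resident_2023 = is_substantially_present(days, 2023, 2018)  # 200 countable days
```

<p>In this made-up example the student is still a nonresident alien for 2022 and becomes a resident alien for 2023. Had they spent only 182 days in the United States in 2023, the test would not have been met.</p>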

<p>Substantial presence is not some niche exception; it ultimately applies to a very large number—perhaps a majority—of
post-completion OPT workers at some point in their OPT period. Keep in mind that degree programs are generally
misaligned with calendar years: a program that starts in the fall spends a full exempt calendar year on only a few months of
study, which pushes F-1 holders/OPT workers into resident alien tax status sooner than a calendar-aligned program would.<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">5</a></sup></p>

<table>
  <thead>
    <tr>
      <th>Degree type</th>
      <th>Typical length</th>
      <th>Max. years as nonresident alien (NRA) post-completion (STEM / non-STEM)</th>
      <th>Explanation</th>
      <th>Exceptions</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Bachelor’s (B.A. / B.S.)</td>
      <td>4 years</td>
      <td>0.5 / 0.5</td>
      <td>Degree completed in 5th calendar year, likely begins post-completion OPT in June or July. Resident alien tax status kicks in January of the following calendar year</td>
      <td>If the OPT worker leaves the US before early July of the following year, they will remain a nonresident alien for tax purposes</td>
    </tr>
    <tr>
      <td>Master’s (M.A. / M.S.)</td>
      <td>2 years</td>
      <td>2.5 / 1</td>
      <td>Degree completed in 3rd calendar year; resident alien tax status kicks in January 1 of 6th calendar year, roughly two and a half years after completion</td>
      <td>Many master’s students were previously bachelor’s students in the US and thus would have 0 years of nonresident alien tax status post-completion</td>
    </tr>
    <tr>
      <td>Law (J.D.)</td>
      <td>3 years</td>
      <td>N.A. / 1</td>
      <td>Degree completed in 4th calendar year; resident alien tax status would kick in January 1 of 6th calendar year, roughly one and a half years after completion; however, law students do not qualify for the STEM extension</td>
      <td>Many law students were previously bachelor’s students in the US and thus would have 0 years of nonresident alien tax status post-completion</td>
    </tr>
    <tr>
      <td>Doctorate (Ph.D.)</td>
      <td>5+ years</td>
      <td>0 / 0</td>
      <td>Degree completed in 6th calendar year (optimistic for many fields); resident alien tax status kicks in during last year of degree (or sooner, if the student had previously studied in the US)</td>
      <td> </td>
    </tr>
  </tbody>
</table>

<p>Unfortunately, I couldn’t seem to find public reporting from ICE or USCIS to break down the share of OPT participants by
level of degree attained. However, based on this report from the Niskanen Center, which obtained additional statistics
on OPT educational attainment levels via
a <a href="https://en.wikipedia.org/wiki/Freedom_of_Information_Act_(United_States)">FOIA request</a>, we can see the different
levels of education attained by OPT workers:</p>

<p><img alt="Share of all OPT authorizations by level of educational attainment" src="/images/opt_gosar/degree_shares.png" /></p>
<div style="text-align: center"><i>From <a href="https://www.niskanencenter.org/wp-content/uploads/old_uploads/2019/03/OPT.pdf">Niskanen Center</a></i></div>
<p><br /></p>

<p>I have attempted to combine the Niskanen
data, <a href="https://www.ice.gov/doclib/sevis/btn/23_0425_hsi_sevp-sevis-btn-2022-opt-growth-2007-2022.pdf">official ICE-reported numbers</a>
on OPT &amp; OPT STEM extension approvals in 2022, and no small number of naïve and arbitrary assumptions<sup id="fnref:assumptions" role="doc-noteref"><a href="#fn:assumptions" class="footnote" rel="footnote">6</a></sup>
to <a href="https://docs.google.com/spreadsheets/d/1EkBYy6-Ye0vIBNCjsKlkrP-Iaam0adIFH5ng5Su4oag/edit?usp=sharing">estimate</a> the
total number of years worked by post-completion OPT workers as nonresident aliens (NRAs) vs. as resident aliens (RAs)
for tax purposes.</p>

<p>I estimated that in 2022, a narrow majority—about 53%—of all post-completion OPT time was worked while in <em>resident</em>
alien status. In other words, if this estimate is accurate, most OPT workers and their employers at any given time are
paying into the Social Security and Medicare trust funds at exactly the same rate as a US citizen and their employer
would be. This estimate comes with much uncertainty; the result can shift quite dramatically depending on how many
master’s graduates we assume to be substantially present by the time of graduation. That said, I think this estimation
is useful as an exercise showing how complicated it is to estimate the amount of FICA paid by OPT workers, as well as a
demonstration of the magnitude of CIS, NumbersUSA, and Paul Gosar’s error.</p>

<h2 id="nonresident-alien-tax-status-is-not-always-advantageous">Nonresident Alien Tax Status is not Always Advantageous</h2>

<p>Apart from the FICA exemption, nonresident alien status generally leads to a <em>higher</em> tax burden. Nonresident aliens other
than Indian
residents<sup id="fnref:9" role="doc-noteref"><a href="#fn:9" class="footnote" rel="footnote">7</a></sup> <a href="https://www.irs.gov/individuals/international-taxpayers/nonresident-figuring-your-tax">cannot take the standard deduction</a>,
and almost no nonresident aliens<sup id="fnref:13" role="doc-noteref"><a href="#fn:13" class="footnote" rel="footnote">8</a></sup> can take the earned income tax credit, the American opportunity tax credit, the
lifetime learning credit, or many other deductions and credits that an American worker would be able to take.</p>

<p><img alt="Differences in taxation for nonresident aliens" src="/images/opt_gosar/taxation_differences.png" /></p>
<div style="text-align: center">
    <i>I generated this chart <a href="https://colab.research.google.com/drive/1HIGntXqqdbEEwDEg_2GL2LMsc-Szfd0X?usp=sharing">here</a></i>
</div>
<div style="text-align: center">
    <i>(*) Assuming the nonresident alien cannot use the standard deduction, which is not true for Indian residents</i>
</div>
<div style="text-align: center">
    <i>(†) If the worker is a full-time student working for a college or university, that income is exempt from FICA, so the "FICA not paid" would instead be zero</i>
</div>
<p><br /></p>

<p>In the above chart I have plotted the additional tax burden caused by a lack of standard deduction compared to the tax
burden avoided by being exempt from FICA.<sup id="fnref:treas" role="doc-noteref"><a href="#fn:treas" class="footnote" rel="footnote">9</a></sup> Note that the lower one’s income, the less of a tax advantage one is
conferred by nonresident alien status. At income levels of around $20,000 or less, more than the entire employee FICA
advantage for NRAs is wiped out by the increased tax burden due to a lack of standard deduction. You may think it
unlikely that a post-completion OPT worker would make only $20,000 (or less) in a year; if so, I think you would be
correct (see next section).</p>
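
<p>The comparison can be sketched in a few lines of Python. The figures below are the 2024 single-filer federal brackets and standard deduction, used here purely as an assumption for illustration; they may not match the tax year behind the chart, and the function names are hypothetical:</p>

```python
# Assumed 2024 single-filer figures (illustrative only).
STANDARD_DEDUCTION = 14_600
# (upper bound of bracket, marginal rate); only valid for taxable income up to $191,950.
BRACKETS = [(11_600, 0.10), (47_150, 0.12), (100_525, 0.22), (191_950, 0.24)]
EMPLOYEE_FICA_RATE = 0.0765  # employee share: 6.2% Social Security + 1.45% Medicare


def income_tax(taxable_income):
    """Federal income tax on `taxable_income` under the assumed brackets."""
    tax, lower = 0.0, 0
    for upper, rate in BRACKETS:
        if taxable_income > lower:
            tax += (min(taxable_income, upper) - lower) * rate
        lower = upper
    return tax


def extra_tax_without_deduction(income):
    """Additional income tax owed when the standard deduction is unavailable."""
    with_deduction = income_tax(max(income - STANDARD_DEDUCTION, 0))
    return income_tax(income) - with_deduction


def employee_fica(income):
    """Employee-side FICA that a FICA-exempt worker does not pay."""
    return income * EMPLOYEE_FICA_RATE


low_income_penalty = extra_tax_without_deduction(20_000) - employee_fica(20_000)
high_income_penalty = extra_tax_without_deduction(50_000) - employee_fica(50_000)
```

<p>Under these assumed figures, <code class="language-plaintext highlighter-rouge">low_income_penalty</code> comes out positive (losing the deduction costs more than the FICA exemption saves) while <code class="language-plaintext highlighter-rouge">high_income_penalty</code> comes out negative.</p>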

<p>However, many international students likely earn wages in this range when
they <a href="https://www.ice.gov/sevis/employment#onCE">work on-campus</a> while enrolled as a student. International students
(like their domestic peers) can take part-time jobs at their home institutions, often helping to run facilities such as
libraries and cafeterias or serving as teaching assistants (or even instructors, in the case of PhD students). In the
case of these student workers, nonresident alien status loses its main advantage (that is, FICA exemption), because all
student workers at colleges and universities—including U.S. citizens—are
universally <a href="https://www.irs.gov/charities-non-profits/student-exception-to-fica-tax">exempt from FICA</a>. In other words,
F-1 student visa holders are most likely to have nonresident alien status precisely when it is least advantageous,
often <em>increasing</em> their total tax burden relative to a US citizen’s.</p>

<p>I would like to pause very briefly to bring up just how unfair the tax treatment of international students can be.
Nonresident aliens are excluded from paying FICA for a specific reason—FICA specifically funds social insurance
programs that most nonresident aliens will never be able to collect from (granted, this logic doesn’t particularly apply
to their employers). What is the justification for taking an extra 10% out of the wages of an international student
paying their way through college by washing dishes in the cafeteria? If anyone can tell me a good reason, I’m all ears,
but until then I’ll assume it’s nothing more nor less than “because we can”.</p>

<h2 id="opt-workers-are-probably-paid-much-more-than-the-average-new-college-graduate">OPT Workers are Probably Paid Much More than the Average New College Graduate</h2>

<p>In CIS’s 2015 estimate of lost FICA income due to OPT workers, author David North estimates the average income of an OPT
worker as $50,000 per year, based on a <a href="https://money.cnn.com/2011/02/10/pf/college_graduates_salaries/">2011 report</a>
showing the average new college grad salary as $50,034.<sup id="fnref:feere" role="doc-noteref"><a href="#fn:feere" class="footnote" rel="footnote">10</a></sup> I’m a bit surprised he didn’t try to find a newer
estimate or at least adjust for general wage growth, as this would have allowed him to claim that OPT was depriving
Social Security and Medicare of an even larger amount of money.</p>

<p>On the other hand, I think this would contradict a larger narrative that Rep. Gosar and CIS seem to believe about OPT.
Namely, they assert that OPT is fundamentally a sham program to funnel “inexpensive foreign labor” (in Paul Gosar’s
words) into American jobs. Listening to immigration restrictionists, you’d get the impression that the average OPT
worker has “graduated” from some fly-by-night for-profit diploma-cum-visa mill, or at best an un-selective associate’s
degree program, in order to exploit the OPT “loophole” with their sham degree and work in some low-skill job with no
real connection to their supposed field of study. As David
North <a href="https://cis.org/North/OPT-Program-Provides-Laborers-Contractors-825-Discount">wrote</a> for CIS in 2018, “many a
pizza place is staffed with OPT workers”.</p>

<p>This certainly does not resemble the experiences of my international classmates at Macalester College, who often
out-earned me at renowned firms such as Google, Ernst &amp; Young, or Merck, to name a few. But perhaps my view from a
selective liberal arts college is not representative of the typical OPT worker and their
career. <a href="https://www.ice.gov/doclib/sevis/btn/23_0425_hsi_sevp-sevis-btn-2022-top100-prepost-opt-schools.pdf">This spreadsheet</a>
published by ICE showing the top schools for active OPT records in 2022 might give us some insight into what kind of
new college and university graduates are participating in OPT. Here are the top 20 schools:</p>

<div style="height: 500px; overflow: scroll">

  <table>
    <thead>
      <tr>
        <th>#</th>
        <th>Campus Name</th>
        <th style="text-align: right">Graduates Employed through OPT</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td>1</td>
        <td>Northeastern University</td>
        <td style="text-align: right">4,593</td>
      </tr>
      <tr>
        <td>2</td>
        <td>Columbia University</td>
        <td style="text-align: right">4,127</td>
      </tr>
      <tr>
        <td>3</td>
        <td>University of Southern California</td>
        <td style="text-align: right">3,520</td>
      </tr>
      <tr>
        <td>4</td>
        <td>New York University</td>
        <td style="text-align: right">2,996</td>
      </tr>
      <tr>
        <td>5</td>
        <td>Arizona State University</td>
        <td style="text-align: right">2,663</td>
      </tr>
      <tr>
        <td>6</td>
        <td>University of California at Berkeley</td>
        <td style="text-align: right">2,363</td>
      </tr>
      <tr>
        <td>7</td>
        <td>Carnegie Mellon University</td>
        <td style="text-align: right">2,134</td>
      </tr>
      <tr>
        <td>8</td>
        <td>The University of Texas at Dallas</td>
        <td style="text-align: right">2,096</td>
      </tr>
      <tr>
        <td>9</td>
        <td>Boston University</td>
        <td style="text-align: right">2,017</td>
      </tr>
      <tr>
        <td>10</td>
        <td>University of Illinois, Urbana-Champaign</td>
        <td style="text-align: right">1,996</td>
      </tr>
      <tr>
        <td>11</td>
        <td>The University of Texas at Arlington</td>
        <td style="text-align: right">1,894</td>
      </tr>
      <tr>
        <td>12</td>
        <td>Purdue University</td>
        <td style="text-align: right">1,797</td>
      </tr>
      <tr>
        <td>13</td>
        <td>University of Washington - Seattle</td>
        <td style="text-align: right">1,758</td>
      </tr>
      <tr>
        <td>14</td>
        <td>State University of New York at Buffalo</td>
        <td style="text-align: right">1,754</td>
      </tr>
      <tr>
        <td>15</td>
        <td>University of California San Diego</td>
        <td style="text-align: right">1,673</td>
      </tr>
      <tr>
        <td>16</td>
        <td>University of Michigan - Ann Arbor</td>
        <td style="text-align: right">1,635</td>
      </tr>
      <tr>
        <td>17</td>
        <td>University of California, Los Angeles</td>
        <td style="text-align: right">1,620</td>
      </tr>
      <tr>
        <td>18</td>
        <td>University of Pennsylvania</td>
        <td style="text-align: right">1,553</td>
      </tr>
      <tr>
        <td>19</td>
        <td>Harvard University</td>
        <td style="text-align: right">1,497</td>
      </tr>
      <tr>
        <td>20</td>
        <td>Georgia Institute of Technology</td>
        <td style="text-align: right">1,469</td>
      </tr>
      <tr>
        <td> </td>
        <td>…</td>
        <td style="text-align: right">…</td>
      </tr>
      <tr>
        <td> </td>
        <td><strong>Total for Top 100 Schools</strong></td>
        <td style="text-align: right">106,060</td>
      </tr>
      <tr>
        <td> </td>
        <td><strong><a href="https://www.ice.gov/doclib/sevis/btn/23_0425_hsi_sevp-sevis-btn-2022-opt-growth-2007-2022.pdf">Full Total (*2022)</a></strong></td>
        <td style="text-align: right">171,635</td>
      </tr>
    </tbody>
  </table>

</div>

<p>Not only are these schools reputable, legitimate institutions of higher learning, many of them are the most prestigious
universities in the entire world!<sup id="fnref:pause" role="doc-noteref"><a href="#fn:pause" class="footnote" rel="footnote">11</a></sup> The top 100 schools for OPT authorizations include almost the entire Ivy
League (with the exceptions of Dartmouth and Princeton), Stanford, MIT, Duke, Johns Hopkins, and the University of
Chicago. This is remarkable, considering that these exceptionally prestigious schools are generally not particularly
large; for example, Harvard’s
total <a href="https://web.archive.org/web/20230720173229/https://bpb-us-e1.wpmucdn.com/sites.harvard.edu/dist/6/210/files/2023/06/harvard_cds_2022-2023.pdf">enrollment</a>
of 21,278 is smaller than that of 11 California State University campuses, only 1 of which is among the top 100 OPT
schools.</p>

<p>Using data on the top schools for OPT authorizations in 2022,
I <a href="https://docs.google.com/spreadsheets/d/1d4AIca2vP4HfnnYRUuTKnIH2tcFpIVcViR-n4RXaAOI/edit?usp=sharing">compared</a>
several schools’ share of all degrees awarded to their share of OPT workers. In that year, compared to the general
population of new bachelor’s degree recipients, an OPT authorization recipient was:</p>

<ul>
  <li>over <strong>7 times</strong> as likely to have graduated from Stanford,</li>
  <li>over <strong>11 times</strong> as likely to have graduated from MIT, and</li>
  <li>over <strong>13 times</strong> as likely to have graduated from Harvard</li>
</ul>

<p>This discrepancy is due in part to the fact that these institutions grant far more master’s degrees than
similarly sized institutions. However, even compared to the average master’s recipient, an OPT worker is:</p>

<ul>
  <li><strong>2.23 times</strong> as likely to have graduated from Stanford,</li>
  <li><strong>2.77 times</strong> as likely to have graduated from MIT, and</li>
  <li><strong>1.92 times</strong> as likely to have graduated from Harvard</li>
</ul>
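<p>As a rough sketch of what “times as likely” means in the lists above, the ratio can be read as a school’s share of
OPT workers divided by its share of degrees awarded. I am assuming the spreadsheet comparison works this way; the
function name and every number below are hypothetical placeholders, not figures from the spreadsheet or from ICE.</p>

```python
from fractions import Fraction


def relative_likelihood(school_opt, total_opt, school_degrees, total_degrees):
    """Ratio of a school's share of OPT workers to its share of degrees awarded.

    A value of 1 means an OPT worker is exactly as likely as the average
    degree recipient to have graduated from the school.
    """
    opt_share = Fraction(school_opt, total_opt)
    degree_share = Fraction(school_degrees, total_degrees)
    return opt_share / degree_share


# HYPOTHETICAL inputs: a school with 2% of OPT workers and 0.5% of degrees.
ratio = relative_likelihood(
    school_opt=200,
    total_opt=10_000,
    school_degrees=500,
    total_degrees=100_000,
)
print(float(ratio))  # prints 4.0
```

<p>Exact fractions are used here only so that the shares do not pick up floating-point rounding before they are divided.</p>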

<p>Sources such as
this <a href="https://www.stlouisfed.org/on-the-economy/2024/oct/how-important-instructional-spending-college-students-future-earnings">article from the St. Louis Federal Reserve Bank</a>,
as well as common sense, tell us that graduates of more selective universities tend to significantly out-earn graduates
of less selective institutions. We also know that most OPT recipients are master’s degree holders; according
to <a href="https://www.bls.gov/careeroutlook/2023/data-on-display/education-pays.htm">this report from the Bureau of Labor Statistics</a>,
master’s degree holders out-earned bachelor’s degree holders by 15% in 2022. Contrary to Paul Gosar’s description of OPT
workers as “inexpensive foreign labor”, I suspect that OPT workers significantly out-earn the average new college
graduate.</p>

<h2 id="gosar-et-als-assumptions-about-employment-are-likely-wrong">Gosar et al.’s Assumptions about Employment are Likely Wrong</h2>

<p>Representative Gosar, NumbersUSA, and CIS seem to assume that OPT workers’ only impact on the U.S. economy is to take
jobs away from U.S. citizens and green card holders who could have worked these jobs and now must be unemployed as a
result. In their counterfactual analysis, these critics of OPT don’t consider the possibility that the quantity and
quality of labor available could itself influence the rate of formation, expansion, or closure for firms. They also
don’t consider that the quantity and quality of labor available could affect the quality and price of goods &amp; services,
and by extension the real incomes of U.S. citizens. They do not entertain the possibility that some of the jobs worked
by OPT workers could have otherwise been outsourced, taken by a foreign competitor, or simply gone unfilled.
Never mind that many former international students
have <a href="https://cdo.som.yale.edu/blog/2023/03/14/how-to-launch-a-startup-as-an-international-student/">founded multi-billion dollar companies</a>
that employ many thousands of U.S. citizens.</p>

<p>There is a large body of economics research analyzing the effects of skilled immigration on the US economy and workers.
I am not by any means an expert in economics; however, the University of Chicago, itself home to one of the most
highly-regarded economics departments in the world, conducts polls of economists on various economics questions. For
example, on the question
of <a href="https://kentclarkcenter.org/surveys/high-skilled-immigrant-visas/">whether reducing the number of H-1(b) visas would increase employment opportunities for American workers</a>,
45% of polled economists responded “disagree”, 36% responded “strongly disagree”, and 19% responded “uncertain”; 0% of
economists responded “agree” or “strongly agree”. Put differently, the field of economics generally believes that
H-1(b)—a program that allows foreign skilled workers similar to those in OPT to work temporarily in the United
States—does not significantly negatively impact the employment prospects of American workers.</p>

<p>I would be interested to see whether these experts feel the same way about OPT as they do about H-1(b); if so (and if they are
not all wrong), then each OPT worker may in fact <em>not</em> be simply displacing an American worker, but instead adding new
value to the American economy that wouldn’t have existed otherwise. In that case, I think it is only fair to consider
some share of the FICA paid by OPT workers to be a net gain relative to the counterfactual in which those OPT workers
were not allowed to work in the United States. Rather than depriving Social Security &amp; Medicare of badly needed funds,
OPT workers may be contributing more than would otherwise be present.</p>

<p><a href="https://www.richmondfed.org/-/media/RichmondFedOrg/publications/research/working_papers/2024/wp24-04.pdf">This paper</a>,
which studies the effects of the random lottery for H-1(b) visas on firms, estimates that each H-1(b) lottery win for a
firm increases that firm’s total employment by 0.83. This is only a firm-level estimate; the effect on total employment
in the United States economy may be greater or lesser than this number.</p>

<p>Out of a desire to estimate the net effect of OPT on the Social Security &amp; Medicare trust funds, I’ll split the
difference between:</p>

<ul>
  <li>Rep. Gosar, CIS, and NumbersUSA, who implicitly assume that each OPT worker adds roughly 0 new employment to the
United
States, and</li>
  <li>the 42 expert economists polled by the University of Chicago, who seem to think that each skilled worker (in their
case, in the H-1(b) program) adds approximately 1 (or more) total comparably-employed workers to the United States
economy.</li>
</ul>

<p>Therefore, I will assume that each OPT worker increases total employment (comparable in compensation to the job
worked by an OPT worker) in the United States by only 0.5. Put differently, I’ll suppose, for the sake of estimation, that
OPT workers really <em>do</em> displace American workers (for the duration of their OPT periods), but because they a) increase
firm profitability, b) lower the cost of goods &amp; services to American consumers, and c) work some jobs that may have
otherwise gone unfilled or been taken by a foreign competitor, they generate enough additional economic value in the U.S.
that the number of displaced workers is only half of the number of OPT workers.</p>

<h2 id="conclusion">Conclusion</h2>

<p>To estimate the net gain or loss to the Social Security and Medicare trust funds, we must compare the reality of the US
with OPT workers to the counterfactual without OPT workers. First, the FICA revenue generated in reality by OPT
workers (directly or indirectly):</p>

\[(\textrm{OPT}_{RA} + \textrm{OPT}_{total} \cdot \textrm{NewEmployment}) \cdot \textrm{AvgSalary} \cdot \textrm{FICA}\]

<p>minus the counterfactual (in which OPT does not exist):</p>

\[\textrm{OPT}_{total} \cdot \textrm{AvgSalary} \cdot \textrm{FICA}\]

<p>where:</p>

<ul>
  <li>\(\textrm{OPT}_{total}\) is the total number of OPT worker-years</li>
  <li>\(\textrm{OPT}_{RA}\) and \(\textrm{OPT}_{NRA}\) are the numbers of OPT worker-years spent in resident and nonresident
alien status, respectively, so that \(\textrm{OPT}_{total} = \textrm{OPT}_{RA} + \textrm{OPT}_{NRA}\)</li>
  <li>\(\textrm{NewEmployment}\) is the total (worker-years of) employment added to the US economy by each worker-year of
OPT</li>
  <li>\(\textrm{AvgSalary}\) is the average salary of OPT workers</li>
  <li>\(\textrm{FICA}\) is the total FICA rate (employer and employee shares combined)</li>
</ul>

<p>Subtracting the counterfactual from reality gives us:</p>

\[(\textrm{OPT}_{RA} + \textrm{OPT}_{total} \cdot \textrm{NewEmployment} - \textrm{OPT}_{total}) \cdot \textrm{AvgSalary} \cdot \textrm{FICA}\]

\[= (\textrm{OPT}_{total} \cdot \textrm{NewEmployment} - \textrm{OPT}_{NRA}) \cdot \textrm{AvgSalary} \cdot \textrm{FICA}\]
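<p>As a quick sanity check on that simplification, the two forms can be compared numerically. The sketch below uses
made-up round inputs (not the estimates from this post) and relies on the identity
\(\textrm{OPT}_{total} = \textrm{OPT}_{RA} + \textrm{OPT}_{NRA}\); the function names are mine.</p>

```python
from fractions import Fraction


def reality(opt_ra, opt_total, new_employment, avg_salary, fica):
    """FICA revenue with OPT: resident-alien worker-years plus added employment."""
    return (opt_ra + opt_total * new_employment) * avg_salary * fica


def counterfactual(opt_total, avg_salary, fica):
    """FICA revenue in the no-OPT counterfactual, as written in the formula above."""
    return opt_total * avg_salary * fica


def simplified(opt_total, opt_nra, new_employment, avg_salary, fica):
    """Closed form of reality minus counterfactual."""
    return (opt_total * new_employment - opt_nra) * avg_salary * fica


# HYPOTHETICAL round inputs, chosen only to exercise the algebra.
opt_ra, opt_nra = 600, 400
opt_total = opt_ra + opt_nra       # identity used in the last step
new_employment = Fraction(1, 2)
avg_salary = 50_000
fica = Fraction(153, 1000)         # 15.3% combined rate

difference = (
    reality(opt_ra, opt_total, new_employment, avg_salary, fica)
    - counterfactual(opt_total, avg_salary, fica)
)
assert difference == simplified(opt_total, opt_nra, new_employment, avg_salary, fica)
print(float(difference))  # prints 765000.0
```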

<p>Now I’ll substitute the values we estimated earlier.
This <a href="https://www.naceweb.org/about-us/press/final-average-starting-salary-for-class-of-2022-rises-more-than-7-percent">report</a>
from the National Association of Colleges and Employers says that new bachelor’s graduates in 2022 took home a median
starting salary of $60,028. To give a very rough estimate of the salary of OPT workers (who, on average, are master’s
degree holders), I’ll increase that by 15%, to get an average salary of $69,032.20.<sup id="fnref:underestimate" role="doc-noteref"><a href="#fn:underestimate" class="footnote" rel="footnote">12</a></sup></p>

\[= (\textrm{246,989 worker-years} \cdot 0.5 - \textrm{115,508 worker-years}) \cdot \textrm{\$69,032.20} \cdot \textrm{15.3%}\]

\[= \textrm{7,686.5 worker-years} \cdot \textrm{15.3%} \cdot \textrm{\$69,032.20}\]

\[= \textrm{\$81,184,248.81}\]
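<p>The last two lines of arithmetic can be reproduced with exact decimal arithmetic. This is only a sketch: the
variable names are mine, and it starts from the 7,686.5 net worker-years in the second-to-last line rather than
re-deriving that figure from the spreadsheet.</p>

```python
from decimal import Decimal, ROUND_HALF_UP

median_bachelors_salary = Decimal("60028")  # NACE median starting salary, class of 2022
masters_premium = Decimal("1.15")           # rough 15% bump for master's degree holders
fica_rate = Decimal("0.153")                # combined employer and employee FICA rate
net_worker_years = Decimal("7686.5")        # net worker-years, taken from the line above

avg_salary = median_bachelors_salary * masters_premium
net_gain = net_worker_years * fica_rate * avg_salary

cents = Decimal("0.01")
rounded_salary = avg_salary.quantize(cents, rounding=ROUND_HALF_UP)
rounded_gain = net_gain.quantize(cents, rounding=ROUND_HALF_UP)

print(f"Estimated OPT salary: ${rounded_salary:,}")  # $69,032.20
print(f"Net FICA gain: ${rounded_gain:,}")           # $81,184,248.81
```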

<p>I estimate that the total net gain to the Social Security &amp; Medicare trust funds due to the existence of OPT could be
$81,184,248.81. I don’t expect anyone to take this estimate too seriously; I had to make arbitrary guesses for many
very important values.<sup id="fnref:assumptions:1" role="doc-noteref"><a href="#fn:assumptions" class="footnote" rel="footnote">6</a></sup></p>

<p>That said, this estimate is at least as valid as that of CIS, which, through a chain of blindly trusting references,
ended up being cited by a member of the United States Congress as if it were fact, despite a blatant error that
anyone could have uncovered with a simple web search.</p>

<p>OPT workers are by and large graduates of quite selective colleges and universities, who in fact <em>do</em> contribute
significantly to American social insurance programs from which they ultimately might never collect. Even from the most
self-interested, nationalist perspective, these are among the most desirable people in the world to have in any country.
When I read Paul Gosar describe OPT workers—which many of my closest friends have been at one time or another—as
“cheap foreigners”, I was more than a little insulted on their behalf. I hope this blog post serves to correct the
record. Most OPT workers are anything but cheap, low-quality labor incentivized by unfair tax breaks;
they are some of the best and brightest of our workforce.</p>

<p>Rather than making yet another futile attempt to eliminate the OPT program, I would suggest that Congressman Gosar
instead propose legislation to reduce or eliminate the time (currently 5 calendar years) that F-1 student visa holders
are exempt from the substantial presence test. If the years of exemption were eliminated entirely, this would cause
virtually all OPT workers to be subject to FICA for the entirety of their OPT periods, making international students
contribute even more to the coffers of Social Security and Medicare than they already do. However, this would come at a
cost to the United States Treasury, increasing our national deficit, as international student workers would then be able
to take credits and deductions otherwise denied to them.</p>

<h3 id="addendum-no-minor-blunder">Addendum: No Minor Blunder</h3>

<p>David North, the author of the 2015 CIS article providing the original framework for estimating FICA revenue supposedly
lost to OPT, writes that he volunteered to help graduate students—including international students—file their taxes.
And yet, he writes:</p>

<blockquote>
  <p>The assumption is that all these years of tax-free status were used; in reality, probably a small fraction of the
tax-free status was not used because the recent graduate either moved on to another visa category, left the nation,
or, in a very few cases, died.</p>
</blockquote>

<p>North gives no indication that a typical OPT worker would, in fact, be fully liable for FICA for a significant share
(possibly most or all) of their OPT time. I’m tempted to presume that he knew how many OPT workers used OPT after
getting master’s degrees, which in some cases could allow them to be FICA-exempt for <em>nearly</em> the entirety of 1-year OPT
and a STEM extension (assuming they did not get their bachelor’s degrees in the US, too). However, he specifically uses
the average salary of <em>new college graduates</em> as the estimated salary of OPT workers, indicating he assumes they are
primarily bachelor’s degree holders.</p>

<p>This is no small blunder! Determining tax residency is literally the first (and probably most important) step in
filing taxes as a noncitizen. It determines whether the noncitizen filer can submit a normal form 1040 (resident) or
1040-NR (nonresident) as well as which tax filing software the filer can use
(e.g. <a href="https://taxprep.sprintax.com/non-resident-alien-tax-1040nr-turbotax.html">TurboTax vs. Sprintax</a>). So either:</p>

<ul>
  <li>North was woefully incompetent to assist any noncitizen living in the U.S. with their taxes, possibly causing them to
file completely incorrectly,</li>
  <li>North lied about having helped international students with their taxes, or</li>
  <li>North knew that many OPT workers would <em>not</em> be exempt from FICA, but intentionally lied to his readers by writing
that nearly all OPT workers are exempt from FICA</li>
</ul>

<p>What’s more, North did not just make this misrepresentation in a blog post (that was ultimately read and cited by a
member of the United States Congress); he made the same claim in
an <a href="https://cis.org/sites/default/files/2019-11/Accepted%20Brief.pdf">amicus brief</a> submitted to a federal court in
2019!<sup id="fnref:misrepresent" role="doc-noteref"><a href="#fn:misrepresent" class="footnote" rel="footnote">13</a></sup> This level of carelessness from a professional think-tank writer who had
written <a href="https://cis.org/North?type=All&amp;special_issues_target_id%5B0%5D=771&amp;page=5">well over a dozen articles</a> on this
topic before submitting this amicus brief is remarkable, especially considering this information was
easily <a href="https://web.archive.org/web/20131031070722/http://www.irs.gov/Individuals/International-Taxpayers/Exempt-Individual-Who-is-a-Student">available to the public on the IRS website</a>
as early as 2013.</p>

<p><em>Thank you to my friend Stephanie Hou, who helped to proofread this post. Any mistakes are my own.</em></p>

<p><strong>Footnotes:</strong></p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:state" role="doc-endnote">
      <p>In an earlier version of this post, I mistakenly wrote that Paul Gosar is the representative from Nebraska
instead of Arizona. A friend helpfully pointed out this error to me. <a href="#fnref:state" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:1" role="doc-endnote">
      <p>I will mostly discuss post-completion OPT in this post, since post-completion OPT seems to specifically be the
  part of the program that Paul Gosar and other restrictionists criticize, as opposed to pre-completion
  OPT or Curricular Practical Training (CPT). <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>F-1 visa holders can (and often do) pick a start date for their post-completion OPT period as late as sixty (60)
  days after the end of their program. The OPT work permit lasts for 365 days, and, assuming the student maintains valid
  employment for the entire period, ends with a 60-day grace period where the student cannot work but can legally remain
  in the United States. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:5" role="doc-endnote">
      <p>The “substantial presence” test has an exception, the “closer connection” test, which is unlikely to apply to
  OPT holders. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:7" role="doc-endnote">
      <p>Many international students set their OPT start date as late as possible; they can set it up to 60 days after
  their graduation. They do this for two reasons:</p>
      <ul>
        <li>USCIS can be very slow to process OPT applications, taking 90-120 days in some cases (note that F-1 visa
holders can only apply at most 90 days in advance of graduation!!). A late start date ensures that the OPT
worker maximizes their allowed work time and none is spent waiting for approval.</li>
        <li>Once their OPT period starts, F-1 visa holders can only be unemployed for a maximum of 90 days (although the STEM
extension adds 60 unemployment days) before they will be marked out-of-status and be required to leave the
country; since students have to set their OPT start date when they apply, they often set it late to ensure they
will be able to find a job before their allowed unemployment time runs out.</li>
      </ul>
      <p><a href="#fnref:7" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:assumptions" role="doc-endnote">
      <p>I made the following assumptions when calculating post-completion OPT time in RA vs. NRA tax status:</p>
      <ul>
        <li>that the share of degrees represented in OPT approvals in 2022 was identical to that in the 2004-2017 period</li>
        <li>that no OPT workers finished their degrees early or late</li>
        <li>that no bachelor’s degree OPT workers had previously studied or lived in the United States</li>
        <li>that 25% (an arbitrary figure that sounded plausible to me) of master’s degree holders had completed their bachelor’s
degree in the US or would otherwise be considered substantially present by the time of graduation</li>
        <li>that master’s degrees take 2 years</li>
        <li>that all OPT workers completed their entire OPT (and STEM extension if so approved) without early termination or
moving to another visa type (e.g. H-1b)</li>
        <li>that OPT STEM extensions are proportionately distributed among degree levels (excluding associate’s degrees, for
which the STEM extension is not allowed)</li>
      </ul>

      <p>See the spreadsheet I used for estimation <a href="https://docs.google.com/spreadsheets/d/1EkBYy6-Ye0vIBNCjsKlkrP-Iaam0adIFH5ng5Su4oag/edit?usp=sharing">here</a> <a href="#fnref:assumptions" class="reversefootnote" role="doc-backlink">&#8617;</a> <a href="#fnref:assumptions:1" class="reversefootnote" role="doc-backlink">&#8617;<sup>2</sup></a></p>
    </li>
    <li id="fn:9" role="doc-endnote">
      <p>Indian residents are allowed to take the standard deduction even when they are nonresident aliens (for tax
  purposes) due to a tax treaty between the United States and India. Some other countries have tax treaties as
  well, such as China, whose residents can take a deduction of $5,000 (while in NRA tax status), compared to the
  standard deduction of $14,600 for single filers in 2024.
  See <a href="http://web.archive.org/web/20250426123115/https://blog.sprintax.com/tax-treaties-whats-deal/">Sprintax</a> for
  more examples. <a href="#fnref:9" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:13" role="doc-endnote">
      <p>For example, if a married couple, one of whom is a U.S. resident and one is not (for tax purposes), files
   jointly, the nonresident spouse can choose to be <a href="https://www.irs.gov/individuals/international-taxpayers/nonresident-spouse">treated as a resident for tax purposes</a>. <a href="#fnref:13" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:treas" role="doc-endnote">
      <p>The extra tax burden paid by nonresident aliens due to not being able to take the standard deduction and other
      deductions and tax credits is paid to the U.S. treasury, rather than to the Social Security and Medicare trust funds, as
      would be the case with FICA. This is a distinction without a difference in my opinion. <a href="#fnref:treas" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:feere" role="doc-endnote">
      <p>In Jon Feere’s 2024 estimate he continues to use this figure ($50,000 / year) as the estimated wage for OPT
      workers. I’m particularly baffled by him continuing to use this decade-old number, especially since it would
      help his argument to update it! <a href="https://www.bls.gov/data/inflation_calculator.htm">Inflation</a> alone from 2015
      to 2024 was 35%—average wages for new college graduates must have changed a lot, too! <a href="#fnref:feere" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:pause" role="doc-endnote">
      <p>While the full “top 100” list contains overwhelmingly reputable institutions, I would say that at least one,
      possibly two, schools on the list do not quite fit most people’s idea of an ideal institution of higher
      learning. That said, the inclusion of these unorthodox schools doesn’t change the more important point: that
      OPT is <em>very disproportionately</em> used by graduates of the most prestigious schools in America. <a href="#fnref:pause" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:underestimate" role="doc-endnote">
      <p>I believe $69,032.20 / year for the average OPT worker may still be an underestimate because: 1)
we want the mean, not the median, and 2) it does not account for the more selective schools and higher-earning majors
that OPT workers disproportionately choose. <a href="#fnref:underestimate" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:misrepresent" role="doc-endnote">
      <p>In the amicus brief, not only does North misrepresent the exemption from FICA as being universal for
OPT workers, he also seems to misattribute the reason for the exemption. He conflates the F-1 “substantial presence”
exception (the real reason why <em>some</em> OPT workers are exempt from FICA) with the
<a href="https://www.irs.gov/charities-non-profits/student-exception-to-fica-tax">“student” exemption to FICA</a>, which <em>only</em>
applies to students working at their home institution where they are currently enrolled full-time. In this
<a href="https://cis.org/North/Some-Foreign-Workers-Pay-Payroll-Taxes-Some-Do-Not-Puzzling-Pattern">later post</a> from 2023,
he seems to recognize that OPT workers’ exemption from FICA stems from their nonresident status due to exemption
from the substantial presence test. <a href="#fnref:misrepresent" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Connor Boyle</name></author><category term="politics" /><category term="immigration" /><summary type="html"><![CDATA[U.S. Congressional Representative Paul Gosar of Arizona recently reintroduced proposed legislation to ban the Optional Practical Training program (OPT). OPT is a program that allows international students attending college &amp; grad school to legally remain in the United States and work in their field of study for 1 year after completing their degrees. Students majoring in an approved science, technology, engineering, and/or math (STEM) field are eligible to apply for a 2-year extension at the end of their initial 1-year OPT periods.]]></summary></entry><entry><title type="html">I Found Out I’m Colorblind, So I Made a Program to Generate Images That I Can’t Read</title><link href="https://connorboyle.io/2025/04/07/color-blind-ishihara.html" rel="alternate" type="text/html" title="I Found Out I’m Colorblind, So I Made a Program to Generate Images That I Can’t Read" /><published>2025-04-07T00:00:00+00:00</published><updated>2025-04-07T00:00:00+00:00</updated><id>https://connorboyle.io/2025/04/07/color-blind-ishihara</id><content type="html" xml:base="https://connorboyle.io/2025/04/07/color-blind-ishihara.html"><![CDATA[<p><strong><em>After realizing I was mildly colorblind, I made <a href="https://connorboyle.io/ishihara/">this program</a> to generate colorblindness tests</em></strong></p>

<p>I recently (read: several months ago) watched a <a href="https://www.youtube.com/watch?v=Ppobi8VhWwo&amp;t=49s">video</a> about
so-called “color-corrective lenses” that supposedly enable colorblind people to see the full range of colors that people
with full color vision can see. The video’s creator argues—successfully, in my opinion—that these color corrective
lenses are essentially a scam. They cannot restore full color vision, and only assist in distinguishing pairs of colors
by entirely blocking the light of one of the colors. They may have some practical use, but only at significant cost; for
example, green traffic lights may appear completely black, as occurred for one reporter who tried on the glasses and
drove his car (!!!) with them on.</p>

<p>What really got my attention was the video creator showing some of the colorblind tests that he, a colorblind person,
had tried and failed. These tests—known as Ishihara test plates—consist of circles filled with irregularly sized and
placed dots. I noticed that <em>I too</em> often could not read these test plates. Somewhat alarmed, I took several online
colorblindness tests (including <a href="https://enchroma.com/pages/test">one</a> furnished by a color corrective lens company).
These tests generally indicated that I had
mild <a href="https://en.wikipedia.org/wiki/Congenital_red%E2%80%93green_color_blindness">deutan</a> colorblindness. The cone
receptors in my retinas that should be activated by green light are lacking in quantity or quality, and therefore my
ability to distinguish red from green is significantly worse than someone with full color vision.</p>

<div style="text-align: center">
<img alt="74 written in green dots, surrounded by orange dots" src="/images/color_blind/74_ishihara.png" width="400" id="ishihara-image" />
<br />
<p><i>An example of an Ishihara plate, showing "74" in green dots surrounded by orange dots (from
  <a href="https://commons.wikimedia.org/wiki/File:Ishihara_9.svg">Wikipedia</a>) </i></p>
</div>

<p>I was particularly shaken when I showed one of the Ishihara test plates to my friends &amp; girlfriend. The plate above is
the number 74 in green dots surrounded by orange dots; my friends &amp; girlfriend told me they could easily perceive it
as 74. This 74 was not (and still isn’t) clear at all to me! I can see that there’s a number, but I originally thought
it might be a 21, and still can’t help but see it as a 21 sometimes.</p>

<div style="text-align: center">
<img alt="a t-shirt with a message 'I *heart-shape* the colorblind'. The heart-shape is filled with red dots; green dots inside it spell out the message 'secretly loathe'" src="/images/color_blind/secretly_loathe_tshirt.webp" width="400" id="secretly-loathe-shirt" />
<br />
<p><i>I can't read this shirt. You can buy it from
  <a href="https://www.redbubble.com/i/t-shirt/Ishihara-Colourblind-Test-I-Heart-the-Colourblind-AU-UK-spelling-by-ThisOnAShirt/28252545.NL9AC">here</a>, if you think it's funny </i>
</p>
</div>

<p>I became fascinated with Ishihara test plates and things like them (see above). I tried a few programs that I found to
make my own Ishihara plates but none of them quite satisfied me in terms of power and customizability, so I decided to
make my own using Rust and WebAssembly. This was my first time using WebAssembly, and I found
the <a href="https://rustwasm.github.io/book/game-of-life/introduction.html">Conway’s Game of Life WASM tutorial</a> very helpful,
as well as Carl M. Kadie’s post <a href="https://medium.com/data-science/nine-rules-for-running-rust-in-the-browser-8228353649d1"><em>Nine Rules for Running Rust in the
Browser</em></a>.</p>

<h2 id="creating-the-algorithm">Creating the Algorithm</h2>

<p>I decided to start with the simplest generation algorithm I could think of. First, we load the image as an array of
pixels and mark each pixel as “in” (“on”) or “out” (“off”) depending on whether its luma value is greater or less than
<code class="language-plaintext highlighter-rouge">0x7F</code> (i.e. 127, or roughly 50% of maximum luma). Then we generate dots with random radii and
coordinates within the image. If a dot doesn’t overlap with any already-added dot, we add the dot to the image. The
dot’s color depends on whether more than 50% of the pixels inside of it are marked as “on”; if so, it is drawn
with the “on” color, otherwise it is drawn with the “off” color.</p>

<p>This algorithm worked surprisingly well, except that it slowed down very quickly as the number of dots grew; total
checks for overlapping dots grew quadratically with the number of dots, i.e. \(O(n^2)\). To cut down on required
operations, I kept the list of added dots sorted by x-coordinate and used binary search to narrow down the list of dots
to check for overlap to just those dots whose x-coordinate could possibly be in range of the new, candidate dot.</p>
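
<p>The x-sorted lookup described above can be sketched roughly as follows. This is a minimal Python illustration, not
the Rust code behind the generator; the names <code class="language-plaintext highlighter-rouge">nearby_dots</code>, <code class="language-plaintext highlighter-rouge">add_dot</code>, and <code class="language-plaintext highlighter-rouge">max_radius</code> are
hypothetical, and dots are assumed to be stored as <code class="language-plaintext highlighter-rouge">(x, y, radius)</code> tuples:</p>

<pre><code class="language-python">import bisect
import math


def nearby_dots(dots, x, radius, max_radius):
    """Return the dots whose x-coordinate is close enough to overlap a candidate.

    `dots` must be a list of (x, y, radius) tuples sorted by x, and
    `max_radius` is the largest radius any dot is allowed to have.
    """
    reach = radius + max_radius
    # Tuples compare element by element, so these keys bracket every dot
    # whose x lies within `reach` of the candidate's x (bounds included).
    lo = bisect.bisect_left(dots, (x - reach,))
    hi = bisect.bisect_right(dots, (x + reach, math.inf))
    return dots[lo:hi]


def add_dot(dots, dot):
    """Insert an accepted dot while keeping the list sorted by x."""
    bisect.insort(dots, dot)
</code></pre>

<p>Only the dots returned by <code class="language-plaintext highlighter-rouge">nearby_dots</code> would then need the exact distance check against the candidate;
every other dot is too far away along the x-axis to overlap it.</p>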

<div style="text-align: center">
<img alt="a colorblind dot test containing the words 'HELLO WORLD'; the outline of the text is jagged and uneven" src="/images/color_blind/hw_full_tolerance.jpeg" width="400" />
<img alt="a colorblind dot test containing the words 'HELLO WORLD'; the outline of the text is sharp and well-defined" src="/images/color_blind/hw_zero_tolerance.jpeg" width="400" />
<br />
    <p><i>The text "HELLO WORLD", with (bottom) and without (top) a set maximum share of the dot's area that can cross
        the on/off boundary</i></p>
</div>

<p>I also noticed that the “in” and “out” dots sometimes crossed the “in/out” boundary quite significantly, which made the
outline of the text or number represented unclear. To make up for this, I added a user-set “tolerance” parameter, which
defines the maximum share of a dot that can contain pixels of the “wrong” on/off value.</p>

<h2 id="playing-with-the-ishihara-generator">Playing with the Ishihara Generator</h2>

<p>I still don’t have a good pipeline for generating the black-and-white text images to use as input for the Ishihara test
generator; my quick-and-dirty solution is to make a Google Doc with very large font bold text and take a screenshot of
that. I’ve noticed that if you set padding (the minimum space between dots) and tolerance very low, you can end up with
images where you can see the outline of text even with identically-colored dots:</p>

<div style="text-align: center">
<img alt="gray dots on a white background; the outline of the words 'HELLO WORLD' is faintly visible" src="/images/color_blind/hello_world_gray.png" width="400" />
<br />
    <p><i>Even without any coloring, you can sometimes see the outline of text if the tolerance parameter is set very low. Here, the tolerance is set to 0% and padding is set to 0</i></p>
</div>

<p>Technically, you don’t <em>have</em> to input a black-and-white image of text. It’s also fun to play around with wacky colors
on any image with significant numbers of high-luma and low-luma pixels. For example, check out this cool visual output
from a picture of an astronaut in Earth orbit:</p>

<div style="text-align: center">
<img alt="a picture of an astronaut in Earth orbit" src="/images/color_blind/astronaut.jpg" width="400" />
<img alt="multi-colored dots on a red background, showing the outlines of the previous astronaut image" src="/images/color_blind/trippy_astronaut_ishihara.png" width="400" />
<br />
    <p><i>Trippy!</i></p>
</div>

<p><a href="https://connorboyle.io/ishihara/">Here</a>’s the link to my program that I used to make all these images. The only thing
you should need to use it is a modern web browser. Thanks to the power of WebAssembly, computation happens on the
client-side, so you don’t even need a persistent network connection.</p>]]></content><author><name>Connor Boyle</name></author><category term="software" /><category term="colorblindness" /><summary type="html"><![CDATA[After realizing I was mildly colorblind, I made this program to generate colorblindness tests]]></summary></entry><entry><title type="html">Flipping Coins in 100,000 Universes Wouldn’t Be as Close as the Polls in Wisconsin</title><link href="https://connorboyle.io/2024/11/05/polling-margins.html" rel="alternate" type="text/html" title="Flipping Coins in 100,000 Universes Wouldn’t Be as Close as the Polls in Wisconsin" /><published>2024-11-05T00:00:00+00:00</published><updated>2024-11-05T00:00:00+00:00</updated><id>https://connorboyle.io/2024/11/05/polling-margins</id><content type="html" xml:base="https://connorboyle.io/2024/11/05/polling-margins.html"><![CDATA[<p>I just read Nate Silver’s <a href="https://www.natesilver.net/p/theres-more-herding-in-swing-state">blog post</a>, where he writes
that pollsters are systematically altering their data to roughly match the average of existing polls. According to
Silver, rather than releasing their findings as-is, they’re worried they’ll look uniquely wrong, and so they’re settling
for blending in with the crowd. He infers this bias from the numbers that the pollsters themselves report; the margins
in several swing states are too consistently close to be plausible, even if the election truly is a dead tie among
decided voters.</p>

<p>Other than a passing mention of the binomial distribution, Silver doesn’t “show his work” with much detail. Since
probability math can be really easy to get wrong (at least for me!), I thought I’d take a stab at trying the brute force
option of simulating polls in a hypothetical dead tie, i.e. exactly 50% of decided voters plan to vote for each of the
major candidates, Harris &amp; Trump (I also happen to be a computer programmer by hobby and profession, so maybe I’m just a
hammer looking for a nail).</p>

<p>This little project was made possible thanks to Nate Silver’s blog, Silver Bulletin, collecting and distributing poll
results. Here are the data files containing the poll results that I used 
for <a href="https://static.dwcdn.net/data/PMbPp.csv">Wisconsin</a>, <a href="https://static.dwcdn.net/data/uyZgi.csv">Pennsylvania</a>,
and <a href="https://static.dwcdn.net/data/nLq7K.csv">New Hampshire</a>.<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup></p>

<h2 id="simulating-wisconsin-polls">Simulating Wisconsin polls<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup></h2>

<p>If the exact same number of likely or registered voters (depending on which poll) plan to vote for Harris as Trump, we
can easily simulate the act of surveying them by flipping a coin. Even more easily, we can run the random number
generator on my computer and check whether the output floating-point number is greater than <code class="language-plaintext highlighter-rouge">0.5</code>; if it is, that’s a
Trump voter. Otherwise, that’s a Harris voter.</p>

<p>After we simulate our polls, let’s extract our statistic of interest: the <strong>mean absolute margin</strong>. For example, if I
have three polls with margins:<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup></p>

\[\text{Trump} \space \text{+5%}\]

\[\text{Harris} \space \text{+2%}\]

\[\text{Harris} \space \text{+3%}\]

<p>then their absolute margins are:</p>

\[\text{0.05}\]

\[\text{0.02}\]

\[\text{0.03}\]

<p>and the mean absolute margin for this universe of polls is:</p>

\[\frac{0.05 + 0.02 + 0.03}{3} \approx 0.03333333333\]
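
<p>The same arithmetic as a small Python sketch. The function name and the sign convention (positive for a Trump
lead, negative for a Harris lead) are assumptions made for illustration, not taken from the notebook:</p>

<pre><code class="language-python">import statistics


def mean_absolute_margin(margins):
    """Average the absolute values of a list of signed poll margins."""
    return statistics.mean([abs(margin) for margin in margins])


# The three example polls: Trump +5%, Harris +2%, Harris +3%.
example_margins = [0.05, -0.02, -0.03]
print(mean_absolute_margin(example_margins))
</code></pre>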

<p>Here’s what the actual polls<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup> in real world Wisconsin done by real pollsters look like:<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">5</a></sup></p>

<p><img alt="Wisconsin observed polling margins" src="/images/poll_margins/wisconsin_observed_margins_fixed.png" /></p>

<p>An average poll of Wisconsin has one candidate beating the other (some of them Trump beating Harris, some of them vice
versa) by about 2% (or <del>0.203</del> 0.0203, as shown in the graph). While the trendline is not terribly strong, we do find that the
absolute margin of a poll goes down as sample size goes up, as we’d expect if the race were truly tied.</p>

<p>And here’s a simulation of what those polls <em>could</em> look like in an alternate universe where pollsters perfectly
randomly sample the same number of people in a perfectly matched race between Donald Trump and Kamala Harris:</p>

<p><img alt="Wisconsin simulated polling margins" src="/images/poll_margins/wisconsin_simulated_margins.png" /></p>

<p>The output of this simulation certainly differs from our observed results; the simulated mean absolute margin is a full
percentage point higher than the observed one. That doesn’t prove anything on its own, though; maybe this simulation of the
Wisconsin polls just happened to result in a high mean absolute margin by chance.</p>

<h2 id="simulating-a-multiverse-of-polls">Simulating a multiverse of polls</h2>

<p>What happens if we run that simulation many, many times, keeping track of the resulting mean absolute margin for each
simulation? Let’s look at the histogram we get when we do that:</p>

<p><img alt="Multiverse of simulated Wisconsin polling margins" src="/images/poll_margins/wisconsin_mam_multiverse.png" /></p>

<p>Whoa! Our <em>observed</em> mean absolute margin of polls (the dashed red line to the left) is <em>way</em> lower than any of the MAMs
in the multiverse where Harris and Trump are neck-and-neck. In fact, the lowest MAM out of 100,000 simulated universes
is 0.02092 or 2.092%, still 0.06 percentage points higher than our observed MAM. Does this mean something is wrong?
Well, I can’t think of any way these polls could get consistently closer margins than our simulations while still
remaining scientifically valid. It’s hard to get a low-variance estimate of a mean without increasing your sample size;
that’s why the sample size \(n\) is so important in scientific papers.</p>
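
<p>For readers who want to try this themselves, the multiverse simulation can be sketched in a few lines of plain
Python. This is a simplified illustration under stated assumptions, not the notebook’s actual code: the function names
and sample sizes are made up, and it draws each respondent with <code class="language-plaintext highlighter-rouge">getrandbits(1)</code> (a fair coin) instead of
thresholding a floating-point number at <code class="language-plaintext highlighter-rouge">0.5</code>:</p>

<pre><code class="language-python">import random
import statistics


def simulate_margin(sample_size, rng):
    """Simulate one poll of an exactly tied race and return its signed margin."""
    # Each respondent is a fair coin flip: 1 is a Trump voter, 0 is a Harris voter.
    trump = sum(rng.getrandbits(1) for _ in range(sample_size))
    harris = sample_size - trump
    return (trump - harris) / sample_size


def simulate_universe(sample_sizes, rng):
    """Mean absolute margin across one simulated set of polls."""
    return statistics.mean([abs(simulate_margin(n, rng)) for n in sample_sizes])


def simulate_multiverse(sample_sizes, universes, seed=0):
    """Mean absolute margins for many independently simulated universes."""
    rng = random.Random(seed)
    return [simulate_universe(sample_sizes, rng) for _ in range(universes)]


# Hypothetical sample sizes; the real ones come from the poll data files.
mams = simulate_multiverse([600, 800, 1000], universes=200)
print(min(mams), statistics.mean(mams))
</code></pre>

<p>Comparing the observed mean absolute margin against the smallest value in <code class="language-plaintext highlighter-rouge">mams</code> mirrors the comparison made
with the histogram.</p>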

<p>Recall also that we generously assumed the candidates had exactly even shares of decided voters. The more imbalanced the
share of voters between the candidates, the higher we would expect the mean absolute margin to be. If you’re not
convinced, look at this graph of simulations with varied shares for the candidates:</p>

<p><img alt="Universes of simulated Wisconsin polling margins with varied Harris shares" src="/images/poll_margins/wisconsin_harris_shares.png" /></p>

<p>So, assuming the presidential race in Wisconsin isn’t <em>exactly</em> tied, the poll margins would look even more suspiciously
close to zero than they already do!</p>

<p>With all that in mind, it seems hard to deny that there could be some systemic bias distorting these polls away from
being true random samples of their populations–possibly herding driven by an aversion to publishing too strong of a
poll for one or more of the candidates.</p>

<h2 id="some-other-states-poll-margins">Some other states’ poll margins</h2>

<p>Nate Silver noted that he observed herding in Pennsylvania too, and our simulations reveal as much:</p>

<p><img alt="Multiverse of simulated Pennsylvania polling margins" src="/images/poll_margins/pennsylvania_mam_multiverse.png" /></p>

<p>(the unnatural bias towards a tie looks even worse for Pennsylvania than it did for Wisconsin). However, the polls in
New Hampshire apparently don’t suffer from herding:</p>

<p><img alt="Multiverse of simulated New Hampshire polling margins" src="/images/poll_margins/new_hampshire_mam_multiverse.png" /></p>

<p>You can see that not only is the mean absolute margin for New Hampshire not well below (to the left of) the simulation’s
distribution, it is actually far above (to the right of) it. This makes sense; New Hampshire appears to be nowhere near
tied, with nearly all polls giving a strong Harris lead. Note that our simulations actually wouldn’t be able to detect
herding if it’s not occurring around a near-tie polling average; therefore all we can say is that Nate Silver <em>could</em> be
right to acquit New Hampshire pollsters of the herding accusation.</p>

<h2 id="how-is-this-happening">How is this happening?</h2>

<p>To be clear, no individual poll–even one with a very close margin–is by itself indicative of foul play by the pollster
who created it. Rather, the aggregation of poll results for each of multiple swing states indicates systemic bias. I
know almost nothing about political polls, but I recently read a great book about systemic problems in modern science
called <a href="https://www.sciencefictions.org/p/book">Science Fictions</a>, and it seems like there are a <em>lot</em> of ways to
manipulate your data–even without intending to or realizing that you are doing it.</p>

<p>It’s totally plausible to me that pollsters are just focusing a lot more scrutiny on any result that shows a strong
swing toward one candidate or another; maybe they’re more likely to throw out outliers or keep collecting more data if
they start to see “too” wide of a margin. These hypotheses may sound very foolish to people more familiar with how
polls are typically conducted; I’ll stop speculating before I make too much of a fool of myself, but suffice it to say
there are a lot of ways for data to get distorted in any field of science and I would expect no less of political
polling.</p>

<h2 id="why-does-this-matter">Why does this matter?</h2>

<p>After election results (either from exit polls or counting the votes) are announced, it may turn out that one or more of
the swing states goes to either Trump or Harris by a very wide margin. Some people might look back at these polls and
conclude that they constitute evidence of interference, cheating, voter suppression, fraud, or the like. After all, the
pollsters nearly <em>all</em> agreed that these swing states were <em>right</em> on the margin. However, the high level of agreement
between pollsters is not evidence that we know what the results will be, but rather that we can’t trust these polls, and
therefore should be very <em>un</em>certain about the outcome of this race.</p>

<hr />

<p><strong>Footnotes:</strong></p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>I had to delete a row representing a YouGov poll from each of the Wisconsin and Pennsylvania data files. For some
  reason, these polls had their sample sizes listed as 0, which is both logically impossible and impossible to
  simulate. I don’t believe they could have made a significant difference; each one being only one of 134
  (Pennsylvania) or 100 (Wisconsin) polls, these YouGov polls could have at most impacted the observed or simulated
  mean absolute margin by a 100th of their corresponding values. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>I used a Jupyter notebook to simulate these polls, which can be found <a href="https://github.com/boyleconnor/poll-margins-2024/blob/main/simulate_polls.ipynb">here</a> <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>In order to simplify the problem, I transformed each poll into a strictly binary poll consisting of only those
  respondents who responded that they intended to vote for Trump or Harris. This introduces some numerical error,
  since we have to infer the number of strict Trump-&amp;-Harris-only respondents by dividing by the sum of the
  percentages for each candidate. Out of generosity to the quality of the polls, we consistently round up the
  inferred sample size to the nearest whole number. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4" role="doc-endnote">
      <p>This whole post rests on the assumption that the polls on Silver Bulletin represent well the full
  distribution of seemingly “good” polls. Since Silver is complaining about and drawing attention to herding among
  pollsters, I have assumed that he himself is not consciously or unconsciously selecting specifically for closer
  polls in swing states. But technically, he or his blog staff could be responsible for 100% of the apparent
  herding if they are doing this! <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:5" role="doc-endnote">
      <p>This figure in a previous version of this blog post had the observed mean absolute margin at completely the wrong
  value (just in the chart, not in the text of the blog post). A helpful
  Redditor <a href="https://www.reddit.com/r/fivethirtyeight/comments/1gk3ers/comment/lvlgf3n/">pointed this out</a> to me and
  I corrected this around 2024-11-05T21:59 UTC. <del>I show the full, transparent edit history of my entire website on
  <a href="https://github.com/boyleconnor/boyleconnor.github.io">this GitHub repo</a>.</del> (EDIT 2024-11-16: I no longer do this;
  I’m currently figuring out a good way to show edit history without exposing my work-in-progress posts) <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Connor Boyle</name></author><category term="politics" /><category term="statistics" /><summary type="html"><![CDATA[I just read Nate Silver’s blog post, where he writes that pollsters are systematically altering their data to roughly match the average of existing polls. According to Silver, rather than releasing their findings as-is, they’re worried they’ll look uniquely wrong, and so they’re settling for blending in with the crowd. He infers this bias from the numbers that the pollsters themselves report; the margins in several swing states are too consistently close to be plausible, even if the election truly is a dead tie among decided voters.]]></summary></entry><entry><title type="html">How to build &amp;amp; push a Docker image directly to Minikube</title><link href="https://connorboyle.io/2024/08/17/building-to-minikube.html" rel="alternate" type="text/html" title="How to build &amp;amp; push a Docker image directly to Minikube" /><published>2024-08-17T00:00:00+00:00</published><updated>2024-08-17T00:00:00+00:00</updated><id>https://connorboyle.io/2024/08/17/building-to-minikube</id><content type="html" xml:base="https://connorboyle.io/2024/08/17/building-to-minikube.html"><![CDATA[<p>The other day, I was attempting to develop a Knative service and try it out on my local development set-up, which was
a Minikube cluster. I assumed (incorrectly) that I could build a Docker image on the host machine and it would be
automatically available to Minikube. However, this is not true, because Minikube has its own Docker daemon, inside of
its own virtual machine (which, if your set-up is like mine, is itself running in a container on top of the host’s
Docker daemon). While there <em>is</em> an easy and simple method that allows building a Docker image and pushing directly to
your Minikube cluster’s Docker daemon, I don’t believe it is well-documented anywhere on the public web, so I thought I
would write my own walkthrough.</p>

<p>The following walkthrough assumes that you have a running Minikube cluster and have installed <code class="language-plaintext highlighter-rouge">kubectl</code>.</p>

<h3 id="un-installing-snap-docker">Un-installing Snap Docker</h3>

<p>First, if you are on Ubuntu, you need to make sure that you are <em>not</em> running
the <a href="https://ubuntu.com/core/services/guide/snaps-intro">Snap</a> version of Docker; the Docker client on your host machine
will need to authenticate to the Docker daemon on the Minikube host using a cert file that is inaccessible to the
Snap version of Docker, due to that Snap’s containment policy. So make sure you have
installed <a href="https://docs.docker.com/desktop/install/linux-install/">Docker Desktop</a> from the downloadable <code class="language-plaintext highlighter-rouge">.deb</code> file, or
add Docker’s package repository
and <a href="https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository">install Docker CE</a> using Apt.</p>

<h2 id="connecting-to-the-minikubes-docker-daemon">Connecting to the Minikube’s Docker daemon</h2>

<p>In order to connect to the Docker daemon inside the Minikube VM, we will need to change the values of certain
environment variables. Luckily, Minikube makes it easy for us to get these values with the following command:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>minikube docker-env
</code></pre></div></div>

<p>(you will need to instead run <code class="language-plaintext highlighter-rouge">minikube -p &lt;PROFILE-NAME&gt; docker-env</code> if you want to connect to a Minikube profile other
than the currently activated one)</p>

<p>This should return an output similar to the following:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://192.168.58.2:2376"
export DOCKER_CERT_PATH="/home/your-username/.minikube/certs"
export MINIKUBE_ACTIVE_DOCKERD="profile-name"

# To point your shell to minikube's docker-daemon, run:
# eval $(minikube -p profile-name docker-env)
</code></pre></div></div>

<p>Export these values to your current terminal’s environment by running the command described in the last line of the
output, i.e.:</p>

<pre><code class="language-commandline">eval $(minikube -p profile-name docker-env)
</code></pre>

<p>(note: <code class="language-plaintext highlighter-rouge">profile-name</code> will likely be a different value when run on your machine, you should copy &amp; run the output of
<em>your</em> <code class="language-plaintext highlighter-rouge">minikube docker-env</code> command, not the one on this webpage)</p>

<p>You can verify that your Docker client has successfully connected to the Minikube VM Docker daemon by running:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>docker images
REPOSITORY                                         TAG                                        IMAGE ID       CREATED         SIZE
registry.k8s.io/kube-apiserver                     v1.30.0                                    c42f13656d0b   4 months ago    117MB
registry.k8s.io/kube-controller-manager            v1.30.0                                    c7aad43836fa   4 months ago    111MB
registry.k8s.io/kube-scheduler                     v1.30.0                                    259c8277fcbb   4 months ago    62MB
registry.k8s.io/kube-proxy                         v1.30.0                                    a0bf559e280c   4 months ago    84.7MB
...
</code></pre></div></div>

<p>your output should similarly contain several images from the Kubernetes official registry.</p>

<p><strong>NOTE: the above will have to be re-run every time you open a new terminal, open a new SSH session, restart the computer,
etc.</strong></p>

<h2 id="running-the-image-on-minikube">Running the Image on Minikube</h2>

<p>To test that we can actually build an image to the Minikube VM’s Docker daemon, let’s start by making a directory named
<code class="language-plaintext highlighter-rouge">test-docker</code>, then make a <code class="language-plaintext highlighter-rouge">Dockerfile</code> in it with the following contents:</p>

<div class="language-Dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">FROM</span><span class="s"> python</span>
<span class="k">CMD</span><span class="s"> python -c "print('Hello, world. This is Python, inside a Docker container, possibly on a Kubernetes cluster')"</span>
</code></pre></div></div>

<p>In a terminal that has connected to the Minikube VM’s Docker daemon (by following the instructions above), <code class="language-plaintext highlighter-rouge">cd</code> into the
parent directory of <code class="language-plaintext highlighter-rouge">test-docker</code>, then run:</p>

<pre><code class="language-commandline">docker build --tag my-python test-docker/
</code></pre>

<p>Now check that the image is available by running:</p>

<pre><code class="language-commandline">$ docker images
REPOSITORY                                         TAG                                        IMAGE ID       CREATED         SIZE
my-python                                          latest                                     17f99b663100   10 days ago     1.02GB
registry.k8s.io/kube-apiserver                     v1.30.0                                    c42f13656d0b   4 months ago    117MB
registry.k8s.io/kube-scheduler                     v1.30.0                                    259c8277fcbb   4 months ago    62MB
registry.k8s.io/kube-controller-manager            v1.30.0                                    c7aad43836fa   4 months ago    111MB
...
</code></pre>

<p>Now run a pod using this image with the following command:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>kubectl run <span class="nt">--image</span> my-python <span class="nt">--image-pull-policy</span> Never my-python-pod
pod/my-python-pod created
</code></pre></div></div>

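<p>(the <a href="https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy"><code class="language-plaintext highlighter-rouge">--image-pull-policy Never</code></a> flag is
necessary because, for an image with no tag or the <code class="language-plaintext highlighter-rouge">latest</code> tag, Kubernetes defaults to the <code class="language-plaintext highlighter-rouge">Always</code> pull policy
and tries to pull the image from a registry instead of using the copy already in its own Docker daemon)</p>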

<p>Check that the pod has run:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>kubectl get pods
NAME                                             READY   STATUS                   RESTARTS         AGE
...
my-python-pod                                    0/1     Completed                2 <span class="o">(</span>13s ago<span class="o">)</span>      15s
</code></pre></div></div>

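<p>(the pod’s <code class="language-plaintext highlighter-rouge">STATUS</code> may eventually change to <code class="language-plaintext highlighter-rouge">CrashLoopBackOff</code>; I think this is because <code class="language-plaintext highlighter-rouge">kubectl run</code> creates
pods with a restart policy of <code class="language-plaintext highlighter-rouge">Always</code> by default, so Kubernetes keeps restarting the container each time its one
command finishes and exits)</p>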

<p>You can see that the pod has completed the command described in the Dockerfile’s CMD directive by running:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>kubectl logs my-python-pod
Hello, world. This is Python, inside a Docker container, possibly on a Kubernetes cluster
</code></pre></div></div>

<p>And there you have it! A Docker image built and run on your local Minikube cluster.</p>]]></content><author><name>Connor Boyle</name></author><category term="software" /><summary type="html"><![CDATA[The other day, I was attempting to develop a Knative service and try it out on my local development set-up, which was a Minikube cluster. I assumed (incorrectly) that I could build a Docker image on the host machine and it would be automatically available to Minikube. However, this is not true, because Minikube has its own Docker daemon, inside of its own virtual machine (which, if your set-up is like mine, is itself running in a container on top of the host’s Docker daemon). While there is an easy and simple method that allows building a Docker image and pushing directly to your Minikube cluster’s Docker daemon, I don’t believe it is well-documented anywhere on the public web, so I thought I would write my own walkthrough.]]></summary></entry><entry><title type="html">Scikit-Learn’s F-1 calculator is broken</title><link href="https://connorboyle.io/2023/12/17/sklearn-f1-bug.html" rel="alternate" type="text/html" title="Scikit-Learn’s F-1 calculator is broken" /><published>2023-12-17T00:00:00+00:00</published><updated>2023-12-17T00:00:00+00:00</updated><id>https://connorboyle.io/2023/12/17/sklearn-f1-bug</id><content type="html" xml:base="https://connorboyle.io/2023/12/17/sklearn-f1-bug.html"><![CDATA[<p><strong>TL;DR:</strong> if you are using scikit-learn 1.3.X and
use <a href="https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html"><code class="language-plaintext highlighter-rouge">f1_score()</code></a>
or <a href="https://scikit-learn.org/stable/modules/generated/sklearn.metrics.classification_report.html"><code class="language-plaintext highlighter-rouge">classification_report()</code></a>
with the argument <code class="language-plaintext highlighter-rouge">zero_division=1.0</code> or <code class="language-plaintext highlighter-rouge">zero_division=np.nan</code><sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>, then there’s a chance that the output of that function
is wrong (possibly by any amount up to 100%, depending on the number of classes in your
dataset). E.g. for <code class="language-plaintext highlighter-rouge">zero_division=1.0</code>:</p>

<pre>
&gt;&gt;&gt; sklearn.__version__
'1.3.0'
&gt;&gt;&gt; sklearn.metrics.f1_score(y_true=list(range(104)), y_pred=list(range(100)) + [101, 102, 103, 104], average='macro', zero_division=1.0)
<b>0.9809523809523809</b>  <i># incorrect</i>
</pre>

<p>compare to (the exact same expression in an earlier version of Scikit-Learn):</p>

<pre>
&gt;&gt;&gt; sklearn.__version__
'1.2.2'
&gt;&gt;&gt; sklearn.metrics.f1_score(y_true=list(range(104)), y_pred=list(range(100)) + [101, 102, 103, 104], average='macro', zero_division=1.0)
<b>0.9523809523809523</b>  <i># correct</i>
</pre>

<p>Similar cases for <code class="language-plaintext highlighter-rouge">zero_division=np.nan</code> (which was introduced in 1.3.0, so I can’t directly compare to the output in
1.2.2):</p>

<pre>
&gt;&gt;&gt; sklearn.metrics.f1_score([0, 1], [1, 0], average='macro', zero_division=np.nan)
<b>nan</b>  <i># should be 0.0</i>
&gt;&gt;&gt; sklearn.metrics.f1_score([0, 1, 2], [1, 0, 2], average='macro', zero_division=np.nan)
<b>1.0</b>  <i># should be ~0.33</i>
</pre>

<p>Both the Scikit-Learn maintainers and I consider the behavior in 1.3.X to be incorrect. <del>While a
<a href="https://github.com/scikit-learn/scikit-learn/pull/27577">pull request</a> to fix this behavior was just merged, the fix
has not yet shipped on any released version of Scikit-Learn. Therefore, the easiest solution to this specific problem is
to revert to Scikit-Learn 1.2.2, or use <code class="language-plaintext highlighter-rouge">zero_division=0.0</code> if possible, while being careful to understand how this
parameter change will affect precision, recall, &amp; F-1 (see below for an explainer on the purpose and function of
the <code class="language-plaintext highlighter-rouge">zero_division</code> parameter).</del></p>

<p>(<strong>EDIT 2024-01-24</strong>: Scikit-Learn 1.4.0 has been released as of a week ago and contains a fix for this bug. Go and
update now!)</p>

<p>The problem is that F-1 for an individual class is getting calculated as <code class="language-plaintext highlighter-rouge">1.0</code> or <code class="language-plaintext highlighter-rouge">np.nan</code> when precision &amp; recall are
both <code class="language-plaintext highlighter-rouge">0.0</code> (which is <em>not</em> the desired behavior for the <code class="language-plaintext highlighter-rouge">zero_division</code> parameter).</p>

<h2 id="how-did-this-happen">How did this happen?</h2>

<p>Let’s take a look at some formulae for classification metrics:</p>

\[\textrm{precision} = \frac{\textrm{true positive}}{\textrm{true positive} + \textrm{false positive}}\]

\[\textrm{recall} = \frac{\textrm{true positive}}{\textrm{true positive} + \textrm{false negative}}\]

\[\textrm{F}_1 = \frac{2 \cdot \textrm{precision} \cdot \textrm{recall}}{\textrm{precision} + \textrm{recall}}\]

<p>There are three different places here where a division by zero can occur:</p>

<ul>
  <li>in precision, if <code class="language-plaintext highlighter-rouge">true positive + false positive = 0</code> (the classifier made no
positive predictions for the class)</li>
  <li>in recall, if <code class="language-plaintext highlighter-rouge">true positive + false negative = 0</code> (there are no truly
positive examples of the class in the dataset)</li>
  <li>in F-1, if <code class="language-plaintext highlighter-rouge">precision = recall = 0</code> (the classifier has made a nonzero number
of exclusively incorrect predictions)</li>
</ul>

<p>Two of these are interesting cases where reasonable people could disagree on
what the correct behavior should be:</p>

<ul>
  <li>When the classifier has made <em>zero</em> positive predictions for the class, should that count as a precision of 1.0? If
“perfect precision” is interpreted as “no false positives”, then this is totally reasonable behavior.</li>
  <li>When the gold dataset has <em>zero</em> true positive examples of the class, should that count as a recall of 1.0? This is a
much more unusual scenario than the “zero positive predictions” example–a good evaluation dataset should almost never
be entirely missing a class. However, this can realistically occur when evaluating on subsets of a large multiclass
dataset. Again, if the definition of “perfect recall” is taken as “no false negatives”, then assigning a recall of 1.0
in this case is totally reasonable behavior.</li>
</ul>

<p>For F-1, however, the “division by zero” case is not interesting or controversial in any way. If a classifier has
achieved a recall of 0.0 (every truly positive example was missed) <em>and</em> a precision of 0.0 (all positive predictions are
false), I don’t think any reasonable person would disagree about what the F-1 score should be: 0.0. Indeed, this is exactly
how Scikit-Learn calculated F-1 right up to (and including) version 1.2.2, regardless of the value of
the <code class="language-plaintext highlighter-rouge">zero_division</code> parameter.</p>
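<p>To make the intended semantics concrete, here is a minimal sketch of a per-class calculator in which <code class="language-plaintext highlighter-rouge">zero_division</code> only applies to precision and recall, while F-1 is always 0.0 when both are zero. This is not Scikit-Learn’s source code: the names <code class="language-plaintext highlighter-rouge">safe_divide</code>, <code class="language-plaintext highlighter-rouge">per_class_scores</code> and <code class="language-plaintext highlighter-rouge">macro_f1</code> are hypothetical, and the sketch assumes a numeric <code class="language-plaintext highlighter-rouge">zero_division</code> (0.0 or 1.0), not <code class="language-plaintext highlighter-rouge">np.nan</code>.</p>

```python
from collections import Counter


def safe_divide(numerator, denominator, zero_division):
    """Return numerator / denominator, or zero_division if the denominator is 0."""
    if denominator == 0:
        return zero_division
    return numerator / denominator


def per_class_scores(y_true, y_pred, zero_division=0.0):
    """Map each class label to a (precision, recall, f1) tuple.

    Hypothetical helper for illustration; assumes zero_division is 0.0 or 1.0.
    """
    labels = sorted(set(y_true) | set(y_pred))
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    predicted = Counter(y_pred)  # true positives + false positives
    actual = Counter(y_true)     # true positives + false negatives
    scores = {}
    for label in labels:
        precision = safe_divide(true_pos[label], predicted[label], zero_division)
        recall = safe_divide(true_pos[label], actual[label], zero_division)
        # zero_division is deliberately NOT consulted here: when precision
        # and recall are both 0, F-1 is simply 0.
        if precision + recall == 0:
            f1 = 0.0
        else:
            f1 = 2 * precision * recall / (precision + recall)
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(y_true, y_pred, zero_division=0.0):
    """Unweighted mean of the per-class F-1 scores."""
    scores = per_class_scores(y_true, y_pred, zero_division)
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)
```

<p>Under these assumptions, <code class="language-plaintext highlighter-rouge">macro_f1(list(range(104)), list(range(100)) + [101, 102, 103, 104], zero_division=1.0)</code> comes out to 100/105, or roughly 0.952, which agrees with the 1.2.2 output shown at the top of this post.</p>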

<p>However, in Scikit-Learn 1.3.0, the <code class="language-plaintext highlighter-rouge">zero_division</code> parameter was turned into a kind
of <a href="https://en.wiktionary.org/wiki/monkey%27s_paw">monkey’s paw</a> that defines the behavior of <em>any</em> division-by-zero
that happens to occur during the calculation of an F-1 score, leading to the bizarre scenario where a 100% wrong
classifier can get an F-1 score of 100%:<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; sklearn.__version__
'1.3.0'
&gt;&gt;&gt; print(sklearn.metrics.classification_report(y_true=[0, 1, 2, 3, 4], y_pred=[1, 2, 3, 4, 0], zero_division=1.0))
              precision    recall  f1-score   support

           0       0.00      0.00      1.00       1.0
           1       0.00      0.00      1.00       1.0
           2       0.00      0.00      1.00       1.0
           3       0.00      0.00      1.00       1.0
           4       0.00      0.00      1.00       1.0

    accuracy                           1.00       5.0
   macro avg       0.00      0.00      1.00       5.0
weighted avg       0.00      0.00      1.00       5.0

</code></pre></div></div>

<p>Why? Because precision and recall are both 0, which means the denominator of the F-1 formula is 0,
and <code class="language-plaintext highlighter-rouge">zero_division=1.0</code> now (as of Scikit-Learn 1.3.0) applies to the F-1 calculation itself, so that means F-1 is
calculated (incorrectly) as 1.0!</p>
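<p>The difference between the two behaviors can be sketched in a few lines. The function below is a hypothetical illustration of the behavior described in this post, not a copy of Scikit-Learn’s code; the <code class="language-plaintext highlighter-rouge">apply_to_f1</code> flag is an assumption introduced only to switch between the 1.3.X-style handling and the handling in 1.2.2.</p>

```python
def f1_from_precision_recall(precision, recall, zero_division=1.0, apply_to_f1=False):
    """Combine precision and recall into F-1.

    apply_to_f1=True mimics the 1.3.X behavior described in this post, where
    zero_division also governs the F-1 denominator. apply_to_f1=False mimics
    the earlier behavior, where F-1 is 0.0 whenever precision + recall is 0.
    """
    denominator = precision + recall
    if denominator == 0:
        return zero_division if apply_to_f1 else 0.0
    return 2 * precision * recall / denominator


# Five classes, each with precision 0.0 and recall 0.0, as in the report above.
per_class = [(0.0, 0.0)] * 5

buggy = [f1_from_precision_recall(p, r, 1.0, apply_to_f1=True) for p, r in per_class]
intended = [f1_from_precision_recall(p, r, 1.0, apply_to_f1=False) for p, r in per_class]

buggy_macro = sum(buggy) / len(buggy)           # every class contributes 1.0
intended_macro = sum(intended) / len(intended)  # every class contributes 0.0
print(buggy_macro, intended_macro)
```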

<h2 id="why-does-this-matter">Why does this matter?</h2>

<p>I don’t know if there are rigorous statistics on this, but I’d wager that macro average F-1 is the most commonly used
metric for multiclass classification by a wide margin. Scikit-Learn’s <code class="language-plaintext highlighter-rouge">f1_score()</code> function is in turn very likely the
most commonly used implementation of F-1. Try asking Google or ChatGPT how to calculate F-1; the first results will very
likely tell you to use this exact function in Scikit-Learn.</p>

<p>The kinds of tasks F-1 could be used for range from low-risk, like sentiment analysis on customer reviews, to
conceivably safety-critical ones. Imagine a researcher at an autonomous car company thinks their computer
vision system is performing really well at recognizing all categories of objects &amp; entities on the road. But actually,
their classifier is completely missing every single example of a few classes!</p>

<p>Ideally, any machine learning practitioner probably <em>should</em> notice this bug well before a classifier is put into
production or its results are reported in a submitted journal paper. On the other hand, you really would not expect the
definition of F-1 to change from one version of Scikit-Learn to the next! While just about any programmer should be able
to implement an F-1 calculator in very little time, most of us prefer to just import Scikit-Learn’s implementation <em>specifically</em> to
avoid gotcha edge cases like this one.</p>

<h2 id="what-should-i-do-now">What should I do now?</h2>

<p>If your project:</p>

<ul>
  <li>is using, has used, or may have used any Scikit-Learn 1.3.X version (1.3.0 was released 2023-06-30),</li>
  <li>contains any call to <code class="language-plaintext highlighter-rouge">classification_report()</code>, <code class="language-plaintext highlighter-rouge">f1_score()</code>, or <code class="language-plaintext highlighter-rouge">fbeta_score()</code>, and</li>
  <li>that call contains the parameter <code class="language-plaintext highlighter-rouge">zero_division=1.0</code> or <code class="language-plaintext highlighter-rouge">zero_division=np.nan</code></li>
</ul>

<p>it may have been affected by this bug. To determine whether a particular F-1 score calculation was impacted,
first change that calculation to a <code class="language-plaintext highlighter-rouge">classification_report()</code> call if possible. If any class in that classification
report contains a precision of <code class="language-plaintext highlighter-rouge">0.0</code>, a recall of <code class="language-plaintext highlighter-rouge">0.0</code>, and an f1-score of <code class="language-plaintext highlighter-rouge">1.0</code> or <code class="language-plaintext highlighter-rouge">nan</code>, then the F-1 score for this
classifier has been calculated incorrectly.</p>
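<p>That check can also be automated. The sketch below is a hypothetical helper (<code class="language-plaintext highlighter-rouge">find_suspect_classes</code> is not part of Scikit-Learn) that scans per-class scores for the signature described above. It assumes you already have per-class precision, recall and F-1 values, for example from <code class="language-plaintext highlighter-rouge">sklearn.metrics.precision_recall_fscore_support()</code> called with <code class="language-plaintext highlighter-rouge">average=None</code> and the same <code class="language-plaintext highlighter-rouge">zero_division</code> argument as your original call.</p>

```python
import math


def find_suspect_classes(labels, precision, recall, f1):
    """Return the labels whose scores match the bug's signature.

    A class is suspect when precision and recall are both 0.0 but the
    reported F-1 is 1.0 or nan. Hypothetical helper for illustration.
    """
    suspects = []
    for label, p, r, f in zip(labels, precision, recall, f1):
        if p == 0.0 and r == 0.0 and (f == 1.0 or math.isnan(f)):
            suspects.append(label)
    return suspects


# Per-class values copied from the 1.3.0 classification report shown earlier.
labels = [0, 1, 2, 3, 4]
precision = [0.0, 0.0, 0.0, 0.0, 0.0]
recall = [0.0, 0.0, 0.0, 0.0, 0.0]
f1 = [1.0, 1.0, 1.0, 1.0, 1.0]

suspects = find_suspect_classes(labels, precision, recall, f1)
print(suspects)
```

<p>If the returned list is non-empty, any macro or weighted average computed from those per-class F-1 values should be treated as unreliable.</p>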

<p><del>Any call using <code class="language-plaintext highlighter-rouge">zero_division=1.0</code> can be fixed by reverting to Scikit-Learn version 1.2.2. Unfortunately, the
parameter <code class="language-plaintext highlighter-rouge">zero_division=np.nan</code> did not exist in Scikit-Learn 1.2.2, and I don’t believe there is any easy way to
replicate it.</del></p>

<p>(<strong>EDIT 2024-01-24</strong>: Scikit-Learn 1.4.0 has been released, and you should update to it ASAP!)</p>

<hr />

<p><strong>Footnotes:</strong></p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>In this post, <code class="language-plaintext highlighter-rouge">np.nan</code> refers to <code class="language-plaintext highlighter-rouge">numpy.nan</code> <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>A completely wrong classifier can also get an F-1 score of 0.0 in Scikit-Learn 1.3.X, for example:</p>
      <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; print(sklearn.metrics.classification_report(y_true=[0, 0, 0], y_pred=[1, 1, 1], zero_division=1.0))
              precision    recall  f1-score   support

           0       1.00      0.00      0.00       3.0
           1       0.00      1.00      0.00       0.0

    accuracy                           1.00       3.0
   macro avg       0.50      0.50      0.00       3.0
weighted avg       1.00      0.00      0.00       3.0
</code></pre></div>      </div>

      <p>This classifier (correctly) receives an F-1 of 0.0, because in each class, <em>either</em> precision <em>or</em> recall (but never both) is zero,
which means that the denominator of the F-1 score for each class is nonzero. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Connor Boyle</name></author><category term="software" /><category term="statistics" /><summary type="html"><![CDATA[TL;DR: if you are using scikit-learn 1.3.X and use f1_score() or classification_report() with the argument zero_division=1.0 or zero_division=np.nan1, then there’s a chance that the output of that function is wrong (possibly by any amount up to 100%, depending on the number of classes in your dataset). E.g. for zero_division=1.0: In this post, np.nan refers to numpy.nan &#8617;]]></summary></entry></feed>