Sunspots and influenza

Several papers present data and arguments purporting to show a correlation between peaks in sunspot activity and pandemic levels of influenza. A crystal-clear paper by statistician Simon Towers shows that this is not the case: the papers contained several errors of fact, and even when these errors are corrected, no significant correlation is found.

The paper explains in some detail the specific mistakes made by Tapping and Yeung and gives a good general lesson on the sort of mistakes that can be made by those who do not specialise in statistical analysis.

The Tapping paper is important because it is cited by Arthur Firstenberg in his influential book “The Invisible Rainbow”.

Sunspot activity and influenza pandemics: a statistical assessment of the purported association – Simon Towers

As mentioned, there were mistakes made in copying dates etc. prior to statistical analysis. These were corrected by Towers and the resulting data displayed in the chart below.

We see sunspot numbers plotted through the years with influenza pandemics plotted as circles on the chart.

At first sight it might look as if there is a good correlation between pandemics and sunspot peaks, as many blue circles seem to be at or near peak sunspot activity. Looking closer, however, there appear to be as many pandemics in low sunspot years as in peak years. This has led some researchers to speculate that both high and low sunspot conditions are somehow causal in pandemic influenza.

Possibly, but if we are now considering a causal relationship rather than merely a correlation, we should look at the pandemics in 1830 and 1847. Both occurred near a sunspot peak, but both occurred before the peak, so the pandemics cannot have been caused by an event that happened a year later.

What about absolute sunspot values? Maybe there is a correlation not with sunspot peaks as such but with the total number of sunspots at any time? Looking at the chart, however, we see pandemics at high, mid and low sunspot values, and at peaks, troughs and intermediate years in the cycle, so this possibility looks very unlikely indeed.

When is sunspot activity considered to be ‘high’?
The Yeung paper looked at times when the number of sunspots exceeded the ‘60th percentile’ and found a significant correlation (‘p < 0.05’), but this procedure is rightly criticised by Towers.

Why was the 60th percentile chosen when 90% or 95% is more common? The figure of 60 seems arbitrary in this respect. The chart shows the significance of the results when the percentile is varied, with results that are unlikely to happen at random appearing below the dotted line.

What we see is that, as the percentile varies, the significance of the results varies wildly. The outcome of the analysis is not in any way robust to the arbitrary choice of a cut-off point at 60%. The claimed significance is therefore a function of the methodology as much as of the data itself. This is bad practice.
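To see how a cut-off choice can drive the result, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the sunspot series is synthetic (an 11-year sine wave plus noise), the ‘pandemic’ years are picked at random, and the one-sided binomial test is a simple stand-in, not necessarily the test used by Yeung or Towers. The point is only that the same data can give quite different p-values as the percentile cut-off moves.

```python
import math
import random


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def percentile_cutoff(values, pct):
    """Nearest-rank percentile: the value at rank ceil(pct% of N)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def p_value_at(pct, sunspots, pandemic_years):
    """One-sided p-value: do pandemic years exceed the cut-off more often than chance?"""
    cutoff = percentile_cutoff(list(sunspots.values()), pct)
    # Fraction of ALL years above the cut-off is the chance rate under the null.
    p_high = sum(1 for v in sunspots.values() if v > cutoff) / len(sunspots)
    hits = sum(1 for y in pandemic_years if sunspots[y] > cutoff)
    return binom_tail(hits, len(pandemic_years), p_high)


rng = random.Random(0)
years = range(1700, 2001)
# SYNTHETIC data (assumption): an 11-year cycle plus noise, not real sunspot numbers.
sunspots = {
    y: max(0.0, 80 + 70 * math.sin(2 * math.pi * (y - 1700) / 11) + rng.gauss(0, 20))
    for y in years
}
# HYPOTHETICAL pandemic years, drawn at random so there is no real association.
pandemic_years = sorted(rng.sample(list(years), 12))

results = {pct: p_value_at(pct, sunspots, pandemic_years) for pct in range(50, 96, 5)}
for pct, p in results.items():
    print(f"cut-off at {pct}th percentile: p = {p:.3f}")
```

If a conclusion only holds for one value of `pct`, that is a warning sign that the cut-off, rather than the data, is producing the result.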

More arbitrary choices are made which, when taken together, amount to ‘cherry-picking’ and are actually controlling the outcome of the study:

  • How is a ‘pandemic’ defined? (This is subjective)
  • How are sunspots counted? (Two different methods are available)
  • What is defined as ‘near’ a sunspot peak?
  • How is ‘significance’ calculated? (Many statistical techniques are available)
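A rough way to see why a pile of free choices matters: if an analyst can try several variants of the analysis and report whichever one ‘works’, the chance of a false positive grows quickly. The sketch below makes two simplifying assumptions that are not from the papers: the variants are treated as independent, and the counts of options (2 pandemic definitions, 2 sunspot counts, 3 meanings of ‘near’, 2 tests) are hypothetical. Real analysis variants are correlated, so the true figure would be somewhat lower, but the direction of the effect is the same.

```python
import random


def chance_of_false_positive(n_variants, alpha=0.05):
    """P(at least one of n independent null tests gives p < alpha)."""
    return 1 - (1 - alpha) ** n_variants


def simulate(n_variants, alpha=0.05, trials=20000, seed=1):
    """Monte Carlo check: under the null, each p-value is uniform on [0, 1]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # The analyst keeps the best-looking (smallest) p-value among the variants.
        if min(rng.random() for _ in range(n_variants)) < alpha:
            hits += 1
    return hits / trials


# HYPOTHETICAL option counts for the four choices listed above.
n_variants = 2 * 2 * 3 * 2
print(f"{n_variants} variants, formula:    {chance_of_false_positive(n_variants):.3f}")
print(f"{n_variants} variants, simulation: {simulate(n_variants):.3f}")
```

With these made-up counts there are 24 variants, and under the independence assumption the chance of at least one ‘significant’ result from pure noise is about 71%.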

There are additional problems with the analysis:

  • There is really not enough data for a meaningful result
  • There is no purported mechanism by which sunspots can cause influenza
  • Pandemics were identified by counting citations from other researchers who tended to cite each other
  • Mistakes were made in transcribing data
  • Data was ‘bucketed’ or ‘categorised’ which again is an arbitrary process and will necessarily discard useful information
  • Mistakes were made in the authors’ own calculations
  • One list of pandemics was excluded without explanation

Cosmic Factors of Pandemic Influenza – A paper (see references) claims that pandemics occur during sunspot minima but only when a large comet comes within 0.03 astronomical units of the sun. This introduces another arbitrary variable into the model (why 0.03?) and again, no credible mechanism for this phenomenon is supplied.

If more variables are introduced into a model, then more data is needed to confirm its validity and to distinguish its results from those of cherry-picking. No such additional data is supplied here, so we have no need to explore this idea further; whatever correlation is reported, the results will be meaningless because the sample size is not large enough.


Towers corrected the mistakes of the authors and used the amended data to perform analysis using several established statistical techniques to eliminate these arbitrary choices and thus eliminate bias from the calculations – but still found no significant correlation.

“When corrected lists of pandemic years were used, along with more powerful un-binned non-parametric tests to compare the distribution of sunspot numbers for pandemic years to that of all years, no significant result was obtained” – Simon Towers
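For readers wondering what an ‘un-binned non-parametric’ comparison might look like, here is a minimal Python sketch using a permutation test. This is a generic stand-in chosen for illustration, not the specific tests Towers ran, and the data are again assumptions: a synthetic sunspot-like series and randomly chosen ‘pandemic’ years. No percentile cut-off or bucketing is involved; the raw sunspot values for the pandemic years are compared directly against random sets of years of the same size.

```python
import math
import random
from statistics import mean


def permutation_p_value(sunspots, pandemic_years, n_resamples=2000, seed=0):
    """Two-sided permutation test on the mean sunspot number of pandemic years.

    Asks: how often does a random set of years of the same size have a mean
    at least as far from the overall mean as the pandemic years do?
    """
    rng = random.Random(seed)
    all_years = list(sunspots)
    overall = mean(sunspots.values())
    observed = abs(mean(sunspots[y] for y in pandemic_years) - overall)
    k = len(pandemic_years)
    extreme = 0
    for _ in range(n_resamples):
        sample = rng.sample(all_years, k)
        if abs(mean(sunspots[y] for y in sample) - overall) >= observed:
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_resamples + 1)


rng = random.Random(0)
years = range(1700, 2001)
# SYNTHETIC data (assumption): an 11-year cycle plus noise, not real sunspot numbers.
sunspots = {
    y: max(0.0, 80 + 70 * math.sin(2 * math.pi * (y - 1700) / 11) + rng.gauss(0, 20))
    for y in years
}
# HYPOTHETICAL pandemic years, drawn at random.
pandemic_years = sorted(rng.sample(list(years), 12))

print(f"permutation p-value: {permutation_p_value(sunspots, pandemic_years):.3f}")
```

Because the test uses every sunspot value as it is, there is no threshold for the analyst to tune, which removes one of the arbitrary choices discussed earlier.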


Sunspot activity and influenza pandemics: a statistical assessment of the purported association – Simon Towers

Tapping KF, Mathias RG, Surkan DL. Influenza pandemics and solar activity. Canadian Journal of Infectious Diseases 2001; 12: 61–62

Yeung JWK. A hypothesis: sunspot cycles may detect pandemic influenza A in 1700–2000 A.D. Medical Hypotheses 2006; 67(5): 1016–1022.

Cosmic Factors of Pandemic Influenza – Darejan Japaridze, Natela Oghrapishvil
Georgian Scientists, Vol. 5, Issue 4, 2023
The graph shows (Fig. 2) that influenza pandemics usually occur during solar activity maxima, and during solar activity minima, pandemics occur only when large comets approach the Sun within 0.03 AU.

Ertel S. Influenza pandemics and sunspots—easing the
Naturwissenschaften 1994; 81(7): 308–311