Logging the Virus

There are a lot of coronavirus graphs these days. I’ve been seeing more log graphs around and some confusion about how to read them. I think we’ll see log graphs more often as we go forward, so it’s a good idea to understand what they mean. If you’re already comfortable with log graphs, there’s probably nothing new here. If you aren’t, I hope this helps.

Here are two graphs that display the same information. The first is a regular graph; the second is a log graph. Both are taken from Johns Hopkins’ excellent tracker.

[Image: regular graph of cases, from the Johns Hopkins tracker]

[Image: log graph of the same cases, from the Johns Hopkins tracker]

The difference between the two graphs is entirely in the vertical scale. In the first graph each mark on the vertical axis represents adding a certain number of cases, 50,000 in this graph. The steepness of the line at any given point, called the “slope”, shows how quickly cases are being added. Because the line gets steeper going left to right, it’s clear that cases are being added faster and faster.

What’s going on in the second graph? Notice that the marks on the vertical axis represent different numbers of cases. The first mark is for 10 cases, then 100, then 1000 and so on; each mark is 10 times the mark below it. When the line goes from one mark to the next, it may be adding 90 cases or 900,000, and that’s what confuses many people. But log graphs aren’t about how quickly cases are being added; they’re about how quickly cases are being multiplied. Each time you go up a tick, the cases are multiplied by 10. In a log graph, the slope of the line shows how quickly cases are being multiplied.
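If it helps to see the arithmetic, here is a small sketch of what happens between ticks. The tick values are just the powers of 10 described above; the variable names are my own.

```python
# Tick labels on the log graph's vertical axis: 10, 100, 1000, ...
ticks = [10 ** k for k in range(1, 7)]

# For each step from one tick to the next: cases added, and the multiplier.
added = [hi - lo for lo, hi in zip(ticks, ticks[1:])]
factor = [hi // lo for lo, hi in zip(ticks, ticks[1:])]

print(added)   # [90, 900, 9000, 90000, 900000]
print(factor)  # [10, 10, 10, 10, 10]
```

The amount added changes wildly from step to step, but the multiplier is always the same, and the multiplier is what the log graph spaces evenly.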

Why are they called “log” graphs? Well, 10, 100, 1000, 10000, 100000, etc. can be written as powers of 10 using exponents. That would be 10^1, 10^2, 10^3, 10^4, 10^5, etc. Notice that the exponent goes up by one with each mark. So we could label the vertical axis with the exponents and it would look like a regular graph, where each mark increases by 1. Now, the “log” of a number is just the exponent that 10 has to be raised to in order to equal the number. For example, 1700 = 10^3.23 approximately, so we can plot 1700 by putting the point 3.23 ticks up on the vertical axis. If that makes your head spin, just remember that slope on a log graph corresponds to growth rate as a percentage. So if cases are increasing 33% per day, it takes about eight days on the horizontal axis for the line to go up one tick on the vertical axis, because multiplying by 1.33 eight times in a row gives roughly 10.
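For anyone who wants to check these numbers, here is a minimal sketch using Python's standard math module. The function names ticks_up and days_per_tick are made up for this post, and the second one assumes the daily growth rate stays steady.

```python
import math

def ticks_up(cases):
    """Height on the log axis: the exponent 10 must be raised to."""
    return math.log10(cases)

def days_per_tick(daily_growth):
    """Days for cases to multiply by 10 at a steady daily growth rate.

    daily_growth is a fraction, so 0.33 means 33% per day.
    """
    return math.log(10) / math.log(1 + daily_growth)

print(round(ticks_up(1700), 2))       # 3.23
print(round(days_per_tick(0.33), 1))  # 8.1
```

You can try other growth rates to get a feel for how the slope changes. Doubling every day (a growth rate of 1.0) climbs one tick in a little over three days.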

In short, regular graphs track addition, log graphs track multiplication. Which one we use depends on what we’re interested in. (Caveat: Technically, the second graph is a “semi-log” graph because only one axis is logarithmic. However, it’s pretty common to see these graphs just labelled as log graphs. That’s what the Johns Hopkins tracker does, as well as several others. But if you do see someone calling it a semi-log graph, they’re just being precise.)

In looking at the two graphs, notice that while we’ve been adding cases at a faster and faster rate, we’ve been multiplying them at a pretty constant rate. The changing slope of the regular graph really shows how the cases are snowballing, but it makes it difficult to see where things are headed other than UP. The more constant slope of the logarithmic graph makes it easier to see how the line might be extended. It also clearly shows that while it may seem like the numbers are changing all over the place, there’s a fairly consistent underlying pattern to their growth. A log graph doesn’t change the numbers, but it can make it easier to see what to expect, at least in the short term.
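Here is a sketch of that idea with made-up numbers (not real case counts): 100 cases growing a steady 25% per day for a week. The daily additions keep getting bigger, but the daily steps on the log scale are all the same size, which is why the log graph comes out as a straight line.

```python
import math

# Hypothetical series: 100 cases growing 25% per day (not real data).
cases = [100 * 1.25 ** day for day in range(8)]

# Regular graph: how many cases are added each day.
added = [b - a for a, b in zip(cases, cases[1:])]

# Log graph: how far the line climbs each day, measured in ticks.
log_steps = [math.log10(b) - math.log10(a) for a, b in zip(cases, cases[1:])]

print([round(x) for x in added])         # keeps growing
print([round(s, 3) for s in log_steps])  # the same value every day
```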

Let’s look at another graph. This one is from the 1point3acres tracker. I’ve chosen to display lines for China, Italy, and the US. If you want to make different choices, click on “Trends” near the top of the page and it should take you to the right section. Here’s the graph.

[Image: log graph of cases in China, Italy, and the US by days since 100 cases, from the 1point3acres tracker]

This graph shows the growth of the virus in each country for each day after there were 100 cases in the country. So, “10” on the horizontal axis will be a different date for each country: whatever date is 10 days after the first 100 cases there. Notice this is a log graph, so the vertical axis is based on powers of 10. Now that we understand log graphs, what information can we see here?
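Lining countries up this way is simple to do yourself. This is only a sketch of the general idea, not how the tracker actually does it. The function name and the numbers are made up, and I'm assuming day 0 is the first day a country's total reaches 100.

```python
def days_since_threshold(daily_totals, threshold=100):
    """Return the totals starting from the first day at or above the threshold."""
    for i, total in enumerate(daily_totals):
        if total >= threshold:
            return daily_totals[i:]
    return []  # the threshold was never reached

# Made-up cumulative counts for two countries, starting on different dates.
country_a = [3, 20, 80, 150, 300, 620]
country_b = [60, 110, 190, 400]

aligned_a = days_since_threshold(country_a)
aligned_b = days_since_threshold(country_b)

print(aligned_a)  # [150, 300, 620]
print(aligned_b)  # [110, 190, 400]
```

After this step, position 0 in each list means the same thing for every country, so the lines can share one horizontal axis.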

The last point on the US line is day 19. We can see that on day 19, the US has more cases than Italy did, but not quite as many as China did. We could see that on a regular graph too. But here we can see more because the slope of each line represents the percentage daily growth rate in each country. For example, we can see that China had a steeper line than the US in days 5-9, but that by day 19, the line had become less steep. Now look at Italy’s line. It has that dip on day 19, but then goes back up, so just look at the overall slope of the line in the area around day 19. We can see that Italy’s line in this region is steeper than China’s but less steep than the US’. So on their 19th days, Italy’s cases were growing faster than China’s, but slower than the US’.
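If you have the numbers behind a line, you can turn its slope over a stretch of days into an average daily percentage. This is a rough sketch. The function name is mine and the example figures are invented, not read off the graph.

```python
def avg_daily_growth(cases_start, cases_end, days):
    """Average daily growth rate (as a fraction) between two points on a line."""
    return (cases_end / cases_start) ** (1 / days) - 1

# Invented example: 1,000 cases growing to 8,000 cases over 9 days.
rate = avg_daily_growth(1000, 8000, 9)
print(f"{rate:.0%}")  # 26%
```

A steeper stretch of a line gives a bigger number here, and a flattening line gives a smaller one.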

So, by using a log graph to represent this information, we can easily compare not only how many cases each country has on a day, but also how fast those cases were growing. We can also compare how the growth rates change over time. As China’s curve flattens, we can see that while its total cases continue to increase, they do so at ever slower rates. Italy’s curve also changes. Its growth rate is also slowing down, but not as quickly as China’s did. The US’ curve has stayed pretty straight thus far. You can see that the dashed line matches it pretty well. Percentage-wise, US cases have been growing about as fast over the last few days as they were a couple of weeks ago.

I hope this has been helpful. I’ll do my best to address questions and concerns in the comments.
