Deadly Swiss Avalanches, in Charts

Snow-covered mountains are one of the most beautiful sights in nature, but in the wrong circumstances they can kill you. Skiers and other mountain enthusiasts sometimes refer to avalanches as the “white death”, and for good reason. Hundreds die in avalanches every year, and a great deal of effort is spent on trying to understand the factors that cause avalanches in the hope of decreasing this toll.

Located in the Alps and a mecca for winter sports, Switzerland takes avalanches seriously. The Swiss Institute for Snow and Avalanche Research  (SLF) monitors snow conditions, issues warnings, and collects data on avalanches. Their web site is very interesting for those interested in winter sports in the Alps. I find the snow maps particularly useful. But for this post I will use their data on fatal Swiss avalanches in the last 20 years to experiment with different ways to visualize some patterns and relationships.

The dataset includes information on the date, location, elevation, and number of fatalities, in addition to the slope aspect, type of activity involved (e.g. off-piste skiing), and danger level at the time of the avalanche. Over the last 20 years there have been 361 fatal avalanches in Switzerland, for a total of 465 deaths. Most avalanches killed only one victim.

Because I wanted to experiment with radial plots, I’ll focus on the variable of slope aspect in this post. Aspect is the compass direction that a slope faces. In this case we’re looking at the slope where the avalanche occurred. In Switzerland, the majority of avalanches occur slopes facing NW – NE, as you can see from this plot:


The gaps at NNE and NNW are probably artifacts of how the aspect data was reported.

This pattern is common in the temperate latitudes of the northern hemisphere. Avalanches are more common on north-facing slopes because they are more shaded and therefore colder, which allows snowfall to remain unconsolidated for longer. When more snow falls, these unconsolidated layers can act as planes of weakness on which snow above can slide. It’s much more complicated that that, with factors like wind and frost layers coming into play. To learn more about how aspect and avalanches, see here. The pattern is unmistakable, but does it hold all year long? I separated the data by month to find out:


Fatal avalanches occurred in all months, but are much more common December – April

A few interesting insights emerge from this plot. First, February is clearly the most deadly month for avalanches.  In December there are actually quite a few avalanches on SE facing slopes, but by January the predominate direction is centered around NW. In February, and to some extent in March, it changes to N-NE. In April it’s NW again, but by then there are significantly few avalanches. So there are some monthly patterns, but I’m not exactly sure what the explanation is. Of course to really nail this down we’d want to do some statistics as well.

One pattern I expected, but did not see, was a decrease in the dominance of northern aspects later in the spring. I expected this because as the days get longer, the shading effect of north facing slopes decreases. It’s important to remember that these are fatal avalanches, and a dataset of all avalanches would look different. For example there are probably a lot of wet avalanches on southern slopes in the spring. But these are much less dangerous than the slab and dry powder avalanches, and therefore not reflected in the fatality data.

The rose style plots above are useful, but I wanted to try to illustrate more variables at once. So I tried a radial scatter plot:

Fatal Swiss avalanches 1995 - 2016: Slope aspect, elevation, and activity

Click on the image for the interactive version

This plot is similar to the previous ones in that the angular axis represent compass direction (e.g. 90 degrees means an east-facing slope). The radial axis (the distance form the center) represent the elevation where the avalanche occurred. And color represents the type of activity that resulted in the fatality or fatalities. Each point is one avalanche. The data are jittered (random variations in aspect) to minimize overplotting. This is necessary because the aspect data are recorded by compass direction (e.g. NE or ESE). The density of the points clearly illustrates the dominance of north-facing aspects. It’s also clear that most avalanches occur between 2000 and 3000 meters (in fact the mean is 2507 m). In terms of activity, backcountry touring and off-piste skiing and boarding dominate. And avalanches at very high altitudes are mostly associated with backcountry touring, which makes sense, as not many lifts go up above 3000m. Perhaps especially perceptive viewer can make out some other patterns in the relationships between variables, but I can’t. Any thoughts on the usefulness of this plot for the dataset?

Finally, I want to share a couple graphics from SLF (available here). Here is a timeline of avalanche fatalities in Switzerland since 1936:

The average number of deaths per year is 25, but this has decreased a bit in the 20 years. There were also more deaths in buildings and transportation routes prior to about 1985. Presumably improvements in avalanche control and warnings reduced fatalities in those areas. And what happened in the 1950/51 season. That was the infamous Winter of Terror. The next plot shows the distribution of fatalities by the warning level in place when the avalanche occurred:

Interestingly, the great majority of deaths happened when warning levels where moderate or considerable. There were significantly fewer deaths during high or very high warning periods. One reason must be that high/very high warnings don’t occur that frequently, but it’s also likely that skiers and mountaineers exercise greater caution or even stay off the mountain during these exceptionally dangerous times. There’s probably some risk compensation going on here. To really quantify risk, you have to know more than just the number of deaths at a given time or place. You also have to know how many people engaged in activities in avalanche country without dying. One clever approach is to use social media to estimate activity levels, as demonstrated in this paper.

Have fun in the mountains and stay safe!

Data and code from this post available here.

All data from WSL Institute for Snow and Avalanche Research SLF, 25 March 2016

The 1960 Chile Earthquake Released Almost a Third of All Global Seismic Energy in the Last 100 Years

I just saw a trailer for the movie San Andreas. It looks preposterous but I love geology disaster movies, so I’ll probably see it. In the film, a series of earthquakes destroy California, culminating with a giant magnitude 9.5 quake. Fortunately the Rock is on scene to help save the day.

The largest earthquake ever recorded in real life struck central Chile on May 22, 1960. With a magnitude of 9.6 (some estimates say 9.5) this was a truly massive quake, more than twice as powerful as the next largest (Alaska 1964), and 500 times more powerful than the April 2015 Nepal quake. The seismic energy released by the 1960 Chile quake was equal to about 20,000 Hiroshima atomic bombs. Thousands were killed. It also triggered a tsumami that traveled 17,000 km across the Pacific Ocean and killed hundreds in Japan.

But I think the most striking thing about this quake is that it accounts for about 30% of the total seismic energy released on earth during the last 100 years. To illustrate this, I calculated the seismic moment (a measure of the energy released by an earthquake) of all earthquakes greater than magnitude 6 and plotted the global cumulative seismic moment over the last 100 years.

Global Cumulative Seismic Moment 1915-2015

Click for interactive version

This plot clearly shows how the 1960 Chile quake (and to a lesser extent the 1964 Alaska event) dominates the last 100 years in terms of total energy released. This is not always obvious as the earthquake magnitude scale is logarithmic. So a magnitude 9.6 releases twice as much energy as a 9.4 and 250 times as much as an 8.0.

Technical notes: To make this plot I downloaded from the USGS archive data on all the earthquakes greater than magnitude 6 from 1915-2015. There are about 10,500 of them.

I calculated the seismic moment for each quake relative to a magnitude 6 (the smallest in the database) using

\Delta M_{0} = 10^{3/2(m_{1}-m_{2})}\

Where m1 is the magnitude of each quake and m2 = 6.

So a mag 9.6 is about 250,000 times more powerful than a mag 6.0. (Note that this refers to energy released, not necessarily ground shaking, which is influenced by many factors, such as earthquake depth).

Then I summed all the relative moments, normalized to 1, and plotted the cumulative seismic moment over the time period.

A few caveats. First, the quality of the magnitude measurements has improved over time, so that the data from the earlier part of the 20th century is not as reliable as the more current data.

Second, this analysis only looks at earthquakes larger than magnitude 6.0. Of course there are many, many smaller earthquakes. However, the cumulative amount of seismic energy released by these smaller quakes is very small compared to the larger ones (again, remember the logarithmic scale).

Third, the magnitudes listed in the USGS archive are calculated in different ways. The majority are moment magnitude or weighted moment magnitude. The equation above is meant for these types of magnitude. Other magnitude measurements, such as surface wave magnitude, have slightly different ways of calculating total energy release. This may introduce some inaccuracies, However, they will be small compared relative to total energy release.

If any seismologists would like to weigh in, I would be most grateful.

More information on calculating magnitude and seismic moment here and here.

Data and R code here. Graph made with

Weighted Density and City Size: Who Knew Milwaukee Is So Dense

You’re probably familiar with the concept of population density. It’s the total population divided by the area. When talking about cities, it’s commonly understood that high population density is a necessary if not sufficient condition for urban vibrancy and efficient mass transit. But it can be difficult to compare population densities of metropolitan areas because the administrative boundaries have an arbitrary effect on measurement. For example, if the LA metro area is defined at the county level and includes all of San Bernardino County, which is mostly empty desert, you get a pretty meaningless density measurement.

Now, you can look at smaller administrative areas to get a better handle on the population density of a city. In the U.S. the census tract is the highest resolution. With the areas and populations of each census tract, you can calculate an even more interesting metric: population-weighted density, which is the the average of each resident’s census tract density. That means that areas where more people live get more weight in the overall density calculation.

Another way to think about population-weighted density is the density at which the average person lives. The simple population density of the entire U.S. is 87 people per square mile. That really does not tell us much. But the population-weighted density is over 5,000 people per square mile. The average American lives in an urban area. (That example is from a U.S. Census report on metropolitan areas.)

An interesting (if not intuitive) insight from population-weighted density is the strong relationship between city size and density. Big cities are more dense. The plot below shows the population weighted densities and total populations of the 100 U.S. largest cities (well, technically core-based statistical areas). Click on the image for the interaction version if you want to mouse over the dots to identify individual cities.

Larger US Cities have Higher Population-Weighted Densities

Click for interactive version. Note log-log scale.

The cities are categorized by region, showing the general pattern that southern cities are the least dense and northeastern and western cities the most dense. This regional difference is emphasized in the linear fits shown for each region. I was surprised by how dense on average the western cities are. Honolulu is a real outlier in terms of having a high density for its size. Unsurprisingly, the sprawling giants of Atlanta, Dallas, and Houston are low-density outliers.

Incidentally, I got the idea for this graph after listening to a very interesting podcast on Streetsblog about the urban form of Milwaukee. It mentioned that Milwaukee is actually one most the most dense cities for its size, especially when looking in the Midwest. And sure enough, Milwaukee lies well above the blue trend line for Midwest cities. If you have 45 minutes and are interested in Milwaukee you should definitely listen to the podcast. Full disclosure: I was born and raised there.

Technical notes: Plot made with using data from U.S. Census. The color palette is inspired by the film Rushmore and is from Karthik Ram’s wesanderson R package. Yes, this was all an elaborate excuse to try out the Wes Anderson color palettes.

A nice in-depth look at urban density and implications for transit can be found here.

Finally, if you are interested in extreme urban density, check this out. I can’t vouch for the accuracy of the data, but the web site name suggests it’s probably pretty legit.

The Rio Declaration and the Decline of Multilateral Environmental Agreements

It’s been quite some time since my last post. I have been busy with a young child, new job, and an international move. But I’m hoping to get back into posting and making visualizations on a regular basis.

The reason for this post is that I came across an interesting resource called the International Environmental Agreements Database Project, hosted at the University of Oregon. The database contains information on about 1100 multilateral environmental agreements (MEAs) dating back to 1857. The data include the title, type (an original agreement or a protocol or amendment to an existing agreement), dates of signature and entry into force, and the parties. For some agreements there is even data on performance as well as coding to allow for comparison of the actual legal components.

As an initial exploration, I simply looked at how many agreements were concluded over time. The plot below shows the results for the last 100 years. Click for the interactive and shareable version.

100 Years of Multilateral Environmental Agreements

Click for interactive version

There is a pretty interesting pattern. From the early 20th century until the 1950s there are not that many MEAs. Then the pace picks up in mid-century, peaking in the early 1990s, and declining considerably after that.

What’s going on? Have all the easy agreements been reached and there is nothing more for countries to negotiate about? Maybe that’s part of it, but I think it has something to do with an event that coincided with the peak in MEAs – The 1992 Earth Summit and the resulting Rio Declaration on Environment and Development.

The Earth Summit was a huge event in the global environmental community, and occurred at a high point of optimism about multilateralism. There was a flurry of MEA activity around this time. But there was also a building movement to ensure that international environmental diplomacy was benefiting the poor, and in particular, developing countries.

The Rio Declaration enshrined the principle of common but differentiated responsibilities. This is the idea that while all nations have a responsibility to protect the global environment, rich nations should shoulder a greater share of the burden.

It is a noble sentiment, and one that in my view makes a lot of sense. But it had the effect of making it more difficult to reach agreements in international environmental negotiations. Developing countries started going into the negotiations expecting more support, in the form of funding, reduced obligations, or technology transfer, from the developed world. Common but differentiated responsibilities is at the root of a major sticking point in global climate talks. Should China, India, and other rapidly developing nations have the same stringent obligations as more mature economies?

I certainly don’t think this is the only cause of the decline in new MEAs in the last 20 years. And neither can I claim to be the first to think about the Rio Declaration’s impact on MEAs. There’s an entire literature on it. For example, Richard Benedick discussed this theme at length in reference to the Montreal Protocol and its aftermath in his book Ozone Diplomacy.

As a final disclaimer, for this analysis it would be best to filter the IEA database to exclude those MEAs that only have a few parties. That way you could really focus on the rate of global or large regional MEAs over time. Perhaps I’ll do that next.

But in any case, it’s an interesting dataset and an interesting pattern. And a good excuse to step back and think about the big picture in global environmental politics.

Graphics for Fitness Motivation using

This post is intended to illustrate the cool things you can do with’s API for R. is a web-based tool for making interactive graphs. It uses the D3.js visualization library, and lets you create very attractive plots that can be easily shared or embedded in a web page. With the R API you can manipulate data in R and then send it over to to create an interactive graph. There’s also a function that let’s you create a plot in R using ggplot2, and then shoot the result directly over to (summarized nicely here).

I have great little free app on my iPhone called Pedometer++ that keeps track of how many steps I take each day. I exported the data, plotted up a time series with ggplot2, and used the API to make the graph in It worked quite nicely. The only hiccup was that did not recognize the local regression curve, so I had to add that separately.

You can see from the plot that I’m not consistently meeting my 10,000 step goal. In fact, I averaged 7,002 steps over this period. That still comes out to a total of 1,470,463 steps. From October through February my step count was trending slightly downward, but since then it’s picked up. Maybe that had something to do with the cold winter. Hopefully as the weather (and my motivation) improves, I’ll hit my goal.


Click to see the interactive version

Any here’s a bonus box plot showing steps taken by day of the week (also using the R API):


Click to see the interactive version

If there are any pedometer users out there who are interested, let me know and I can post the code.