SXSW 2017: Looking back on 20 years of data

One of the best tips for a successful SXSW visit is to attend sessions that lie outside your field of expertise. You may learn less that you can apply directly, but you get to see the best of other disciplines. However, when about.com presents an analysis of 20 years of data, a data nerd is certainly not going to miss the chance to be there.

About.com? Perhaps not the most frequently used site in the Netherlands, but as a cross between Google and Wikipedia, it is no small fry in the United States. It was set up a year before Google and, with 20 years of history behind it, has 3.5 million content pages. In short: it is one of the dinosaurs of the internet.

The heartbeat of the internet

The most remarkable thing that Jon Roberts, Chief Innovation Officer at about.com, tells us is that during the past 20 years, nothing has actually radically changed in our search behaviour. Of course, there have been technical innovations, changing trends and historical events. And of course, the answers to some questions are now different. However, our questions, and therefore our need for information, are basically very predictable.

In a fascinating graph, Roberts shows the categorized growth in the content pages of about.com over the past 20 years. The first remarkable fact is the jagged edge. Every winter season, there is a peak with a chunk missing – the month of December. It shows a rhythm, the heartbeat of the internet.

Predictable capacity

Within this heartbeat, you basically see the same thing year after year. Roberts illustrates this with the search interest in the word ‘flu’. The number of searches is low in the summer months, increases as autumn approaches, dips slightly during the festive season and then gradually tails off. Year in, year out.

Patterns can therefore be derived from these enormous datasets. The trick to good data analysis is not so much pointing out these patterns as finding deviations from them. You predict what the trendline will look like, and the deviations from that trendline are precisely the most interesting part. Roberts uses the category of ‘health’ again as an example:

A pivotal event in the data is of course 9/11. An obvious effect was the increased number of searches about terrorism. However, an unusual second deviation in the data concerned weight loss. As if the whole of America wanted to be able to race down the stairs quicker.

At Christmas time, there is generally no traffic to health pages. However, after Trump's election victory on 9 November last year, the data showed a flatline that had never appeared at that time of year before. Conclusion? America had a headache which lasted a full three days.
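The trendline-and-deviation idea behind these examples can be sketched in a few lines of Python. This is only an illustration, not Roberts' actual method: the monthly numbers are invented index values, and the function names and the threshold of three standard deviations are assumptions made for this sketch.

```python
from statistics import mean, pstdev

def seasonal_baseline(history):
    """Average value per calendar month across all years in `history`.

    `history` maps a year to a list of 12 monthly values (January first).
    """
    years = list(history.values())
    return [mean([year[m] for year in years]) for m in range(12)]

def find_deviations(history, observed, threshold=3.0):
    """Return (month_index, z_score) for every month in `observed` that lies
    more than `threshold` standard deviations from the seasonal baseline."""
    years = list(history.values())
    baseline = seasonal_baseline(history)
    flagged = []
    for m in range(12):
        spread = pstdev([year[m] for year in years])
        if spread == 0:
            continue  # no variation in the history, so no z-score
        z = (observed[m] - baseline[m]) / spread
        if abs(z) > threshold:
            flagged.append((m, z))
    return flagged

# Invented (hypothetical) monthly traffic index for a health category.
history = {
    2013: [80, 70, 55, 40, 30, 20, 15, 20, 40, 60, 75, 65],
    2014: [82, 72, 53, 42, 28, 22, 17, 18, 42, 58, 77, 67],
    2015: [78, 68, 57, 38, 32, 18, 13, 22, 38, 62, 73, 63],
}
# A new year that follows the usual rhythm, except for a collapse in November.
observed = [81, 69, 56, 41, 29, 21, 16, 19, 41, 59, 30, 66]

flagged = find_deviations(history, observed)
for month_index, z in flagged:
    print(f"Month {month_index + 1} deviates from the baseline (z = {z:.1f})")
```

The baseline captures the yearly heartbeat, so the ordinary seasonal rise and fall is not flagged. Only the month that breaks the rhythm stands out.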

Demographic patterns are in fact the most interesting. For instance, Roberts shows the difference between age groups: young people search more often for eating disorders and STDs, while people in their thirties search for pregnancy and miscarriages.

Young women are more interested in a weekend in Paris than older women. Men of all ages, on the other hand, show an equally low interest.

Good advice from a data nerd: perhaps we as men can take a lesson from these patterns…

Most sought-after takeaway?

An enjoyable talk, but not the height of inspiration. Yet, during the Q&A, Roberts ends on an impressive note. He zooms in on segmentation: good predictions can be made from aggregated data, but the more you look in detail, the more erratic it becomes. You can recognize and help segments, but as for an individual? An individual is practically impossible to predict.

This forms a good bridge to personalization, because of course a one-to-one personal approach sounds super sexy. However, it is also exponentially more complicated and expensive. The advice is therefore to go back to the recurring trends and, from that starting point, anticipate deviations from these patterns. It is still the right message at the right time, but approached from the segment level, which keeps it efficient.
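Why a segment is so much easier to predict than an individual can be shown with a small simulation. This is a sketch under loud assumptions: the people are simulated, each one searches in a given week with the same 30% probability, independently of everyone else, and the function name and all numbers are made up for illustration.

```python
import random
from statistics import mean

def mean_abs_error(group_size, rate=0.3, weeks=52, seed=42):
    """Average weekly gap between the expected search rate and the share of
    a simulated group of `group_size` people who actually searched."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    errors = []
    for _ in range(weeks):
        # Count how many people in the group searched this week.
        searched = sum(rng.random() < rate for _ in range(group_size))
        errors.append(abs(searched / group_size - rate))
    return mean(errors)

individual_error = mean_abs_error(group_size=1)
segment_error = mean_abs_error(group_size=1000)

print(f"Prediction error for one person:        {individual_error:.3f}")
print(f"Prediction error for a segment of 1000: {segment_error:.3f}")
```

A single person either searches or does not, so a prediction of 30% is always well off in any given week. Across a thousand people the individual noise averages out and the same prediction lands close to what actually happened.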

And that is the great thing about SXSW – from an unexpected angle, you are sent in exactly the right direction!