
An Introduction to SuperCrunching

Super crunching has very little to do with all the other types of “crunching” that dominate headlines this year. Super crunching is a very clever way of looking at data and coming up with interesting conclusions. And when I say interesting, I am not suggesting any kind of mediocre, marginally self-obsessed or obscurely academic kind of interesting. No. I am talking predict-the-future kind of interesting: predictions that need only a handful of statistical parameters to prove themselves much more accurate than any panel of experts (see note 1 below), however carefully selected they may be. Trust me on this, I am an expert 🙂

One day, maybe, simple - and not so simple - mathematics could be driving your business

Although the term super crunching sounds like yet another feeble attempt by a bunch of statisticians to make their jobs sound more exciting, this is definitely not the case here. Super crunching is a (relatively) new way of using large volumes of data.

You see, the norm for any type of research and development today is to form a hypothesis, collect data and then test the hypothesis against it.

The super crunching way of doing things reverses this traditional hypothesis-then-data order and starts from the end. You take the data (the more the merrier) and you ask it to tell you what is going on. If there is a causal relationship within it, you can now find out about it.

Letting a database tell you whether there is a (not necessarily visible) relationship between the data that concern your business is effectively letting it tell you how the various parts of your business are affecting each other. The proper term for this, modelling of causal relationships, sounds a bit like far-fetched science fiction for a modest hotelier, but in reality it is a fascinating fact of life and rather feasible to consider for your business. Most importantly, it will make you genuinely more efficient, and it will make you money in ways you just couldn’t guess by yourself. I fully recommend the book on the subject by Ian Ayres, Super Crunchers: How Anything Can Be Predicted.
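To make the idea concrete, here is a minimal Python sketch of asking the data first: it computes the Pearson correlation for every pair of columns in a small table and lists the strongest relationships at the top. The column names and the numbers are invented purely for illustration, and a strong correlation is only a hint about where a causal relationship might be hiding, not proof of one.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column cannot co-vary with anything.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_relationships(columns):
    """Return (column_a, column_b, r) for every pair, strongest |r| first."""
    pairs = [
        (a, b, pearson(columns[a], columns[b]))
        for a, b in combinations(columns, 2)
    ]
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)


# Hypothetical weekly figures for one property (made up for this example).
hotel_data = {
    "newsletter_clicks": [10, 20, 30, 40, 50, 60],
    "direct_bookings": [3, 5, 8, 9, 12, 14],
    "rainy_days": [2, 5, 1, 4, 3, 2],
}

for a, b, r in rank_relationships(hotel_data):
    print(f"{a} vs {b}: r = {r:+.2f}")
```

With these made-up figures the newsletter clicks and the direct bookings come out on top, which is exactly the kind of pointer you would then want to investigate properly.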

Super crunching is used by many companies today. Do you have an iPod in your pocket? If yes, the chances are that you have noticed a little feature on it called Genius. Genius is Apple’s super crunching software that looks at what you have bought, what you have in your music library, and your own rating of your songs. It then figures out (based on other people who appear to have the same tastes) what else you are most likely to want to buy. Amazon, Walmart and Continental Airlines are just a few names you will recognise that are becoming increasingly efficient at handling large volumes of data. And they make very, very good money from it.
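This is not how Genius actually works internally (I have no inside knowledge of that); it is only a toy Python sketch of the general “people with similar tastes” idea. The listener names, the song titles and the choice of Jaccard similarity are all my own assumptions for the sake of illustration.

```python
def jaccard(a, b):
    """Share of songs two libraries have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, libraries, top_n=3):
    """Suggest songs the target does not own, weighted by how similar
    each other listener's library is to the target's library."""
    mine = libraries[target]
    scores = {}
    for user, songs in libraries.items():
        if user == target:
            continue
        similarity = jaccard(mine, songs)
        for song in songs - mine:
            scores[song] = scores.get(song, 0.0) + similarity
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [song for song, score in ranked if score > 0][:top_n]


# Hypothetical music libraries (invented data).
libraries = {
    "you": {"Song A", "Song B", "Song C"},
    "listener_1": {"Song A", "Song B", "Song D"},
    "listener_2": {"Song B", "Song C", "Song D", "Song E"},
    "listener_3": {"Song F"},
}

print(recommend("you", libraries))
```

Song F never gets suggested here, because the only person who owns it shares nothing with your library. Real systems work with millions of libraries and far richer signals, but the principle of letting other people's behaviour predict yours is the same.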

The real-life applications of today, and the execution of this kind of crunching, can be enormously interesting for the forward-thinking hotelier. In fact, the chances are you are already using a form of forward-looking software when setting prices for the future. IDEAS, or one of its competitors, is looking at your PMS data and telling you how to price your empty rooms. Hoteliers who use such software consider properties without it to be less sophisticated, and in a competitive environment the non-equipped property is seen as the one most likely to miss out. Although what IDEAS does shouldn’t quite qualify as super crunching, it is an example of how important it can be to look at data seriously.

What makes me think twice about this is that, despite super crunching (effectively a collection of statistical and data mining techniques) being perfectly suited to the world of electronic transactions, I am not aware of any hotel company that is using it today – at least not effectively.

Hotel companies use e-mail software to manage their e-mail campaigns, questionnaire software to collect data from actual and potential guests, and an on-line booking engine to take bookings on-line. Some also use CRM software to monitor their sales efforts, and a variety of software and techniques to monitor the progress and change the content of their website. On occasion, if they are lucky, they will be working with one of the better companies out there (I personally like Avvio from Ireland), which will give them the opportunity to combine some of these processes – even with external sources (so hotels can monitor and control their promotional activities with third party websites). But – given the cost and limitations of the “enterprise” solutions from larger companies – the only way to truly use all this data to find out how you can be more efficient is to look at it all together. Everything, dumped in a database, and then super crunched.

We are delighted to announce that eHotelworks will begin its first super crunching calculations in March. The project is extremely hush-hush at this stage, but we expect to be able to tell you formally what we will use it for, and the goal should impress. Even though we are not particularly seeking change, we expect that it will be enough not only to convince us to alter the way we do business, but also to affect the way you do business 🙂



When I was first introduced to Super Crunching, I knew I was falling in love. Super crunching is fun not only because it serves useful reminders of our inability to be as cool as we would like (which is otherwise very easy to forget); it also delivers shattering blows to two annoying constants of professional life:

First come “the followers”.

For a number of reasons (in my mind these range from the need to keep the boss happy to well-documented flaws in the human thinking process) people – and consequently businesses – are very likely to become followers. Following is the easy option, and it is therefore sometimes confused with “bad”, which in itself it is not. Following the example of others provides us with a safety net, and comparing ourselves with the leaders helps us assess what the results of our efforts will be (in a very crude way it helps us predict the future). All in all, there are good reasons why it is OK to follow. Nonetheless, there is such a thing as “early entrant benefits”, and becoming a permanent follower (almost a certainty amongst those who like the safety of having seen it done somewhere else first) eventually translates into long-term costs and opportunity costs. That is what is annoying about it.

In the internet world, these costs can be of monumental proportions. The example of having a website in a particular country (localisation) comes to mind. Allowing your competitors – who also want French visitors in their hotels – to enter the French market first, by introducing a locally hosted French website with a French domain, means that you will never, ever be the first entrant (from your competitive set) in that market. Your SEO professionals will be able to do a lot of things to improve your rankings, but they can never, ever make your website older (which is an advantage) than those of your competitors who have already done all that. If you don’t act early, the opportunity is lost forever.

Ian Ayres, in his supreme book Super Crunchers, summarises the issue with poetic beauty. Talking about the medical world (scary stuff actually) he says:

…once a consensus has developed about how to treat a particular disease, there is a huge urge in medicine to follow the herd. Yet blindly playing follow the leader can doom us to going the wrong way if the leaders are poorly informed.

Ian Ayres in Super Crunchers: How Anything Can Be Predicted. 2008

To make things worse, we – as a species – love to hold on to our errors. As new evidence arrives that contradicts our beliefs, we tend to discount it; we focus instead on evidence that supports our views.

Just the other day I was talking to a Director of Sales at a small group of hotels in Manhattan. She discounted all the evidence I provided that multilingual websites can be useful for converting a higher share of lookers into bookers. She also brushed away the strong evidence that early entrants enjoy longer-lasting benefits from multilingual search engine optimisation and PPC, all with a single statement that the hotels were doing just fine without them. Knowing the basic principle of higher demand = higher RevPAR (OK, I agree that some yielding has to happen somewhere in the process, but we are talking yielding 101 here), it practically hurt my ears listening to her. (Nevertheless, one absolutely has to respect the “we are fine as we are” argument… At the end of the day, maybe it was my manners…)

The second annoying constant is “the statistical lie”.

I used to have a boss at Hilton International (a long, long time ago, in a galaxy far, far away) who was just precious. During one of our arguments about salaries and payroll (no, I didn’t last long) he put his case forward by explaining that I was being given the same % pay increase as all my other colleagues who were doing the same job. Correct though he was, he was conveniently forgetting that my salary was significantly lower than the salaries of everyone else at my level. I immediately pointed out that 100% of nothing is still nothing, and that if you take the actual base salary figure into consideration, the amount of money that I was getting in the end was negligible.
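For anyone who wants to see the flaw in numbers, here is a small Python sketch of the arithmetic. The salaries and the 3% figure are hypothetical (I am not disclosing what Hilton actually paid anyone); the point is only that the same percentage applied to a lower base gives a smaller raise and leaves the gap wider than before.

```python
def raise_amount(base_salary, pct):
    """Money actually added by a percentage pay increase."""
    return base_salary * pct / 100


# Hypothetical annual salaries and a hypothetical 3% increase.
salaries = {"me": 20_000, "colleague": 35_000}
pct = 3

for name, base in salaries.items():
    extra = raise_amount(base, pct)
    print(f"{name}: {base} + {extra:.0f} = {base + extra:.0f}")

gap_before = salaries["colleague"] - salaries["me"]
gap_after = gap_before + raise_amount(gap_before, pct)
print(f"gap before: {gap_before}, gap after: {gap_after:.0f}")
```

Same percentage, very different amounts of money, and the distance between the two salaries grows by that same 3%.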

To this day I am not sure if it was all a tactic, and he just wanted to pretend that he was tired of me by the time I had explained the flaw in his logic, or if he genuinely didn’t get it. He was generally an intelligent man, so I would normally opt for the former, but on the other hand it is quite amazing what people will completely fail to understand if not understanding serves their interests…

In any case, the use of statistics to prove erroneous (and occasionally barking mad) points is an international phenomenon, with implications that can be very severe. The trouble is that acquiring the data is frequently fraught with problems. Sometimes the data is collected in the wrong place. Sometimes the collection process itself is flawed (maybe field researchers are trying to get questionnaires in, even if it means bending the rules), or the questions asked lack the insight necessary to assure the quality of the data. Either way, it is no secret that “in this world there are lies, there are damned lies, and then there are statistics”.

So, it is little wonder that the idea of super crunching the truth, with

a) its reminder that experts don’t know as much as we would like them to and therefore shouldn’t be blindly followed, and

b) its promise that flawed statistical analysis will have more difficulty in supporting flawed logic,

immediately found an avid supporter in me.



Ted Ruger, a law professor at the University of Pennsylvania, challenged political scientists Andrew Martin and Kevin Quinn (super crunchers) to predict the outcomes of the US Supreme Court’s decisions in 2002, and compared their results to the predictions of a panel of experts.

Ruger used some 83 crème de la crème legal experts (top-of-the-ladder, hand-picked and ultra-experienced professionals), each casting predictions in their field of expertise. Martin and Quinn used no more than six factors in predicting the results. The experts achieved 59.1% accuracy in their predictions, whilst the algorithms of Martin and Quinn got 75% of their predictions right.

Yannis Anastasakis



