On 25th March we addressed the Technology for Marketing and Advertising show (TFM&A) in Olympia, London on the subject of ‘Big Data for the smaller business’. The talk formed part of the show's education programme and was pitched at the ‘beginner’ level, so it aimed to introduce the audience to some of the core concepts as well as offering insights into some of the tools and techniques available. In our experience, only the biggest multinationals, plus innovative new entrants specifically set up for the purpose, are really doing much with ‘Big Data’ at this stage, so by ‘smaller business’ we include even businesses with very significant sales and a global presence.
After a brief discussion of what was meant by ‘Big Data’, we discussed the obvious question of whether it was even relevant to smaller organisations, given that the really huge datasets tend to be generated by scientific experiments and large-scale industrial processes with multiple sensors (the ‘Industrial Internet of Things’). The conclusion we arrived at was that it is of direct relevance, primarily because the ecosystem of tools, techniques and resources that has grown up alongside Big Data could be used, even by smaller businesses, to gain competitive advantage. Equally, any company that did not attempt to make use of these facilities risked being outsmarted by a more innovative competitor and left behind in the marketplace.
We identified four aspects of ‘Big Data’ that we felt were directly relevant to smaller organisations:
- New analytical capabilities. New tools like Hadoop and NoSQL databases open up analytical possibilities that were not available previously, such as constructing product recommendation engines.
- New sources of data. The move towards opening up and sharing data sources – particularly in the realms of social media and public data – allows automated access to data that can be used to understand customers better and optimise business processes.
- Real-time data/analysis. Many of the new data sources are available as streaming data, and enhancements in processing power open up the possibility of real-time analysis, allowing companies to tailor each customer interaction as it happens.
- Reduced cost of entry. The availability of cloud-based infrastructure from the likes of Amazon Web Services at low cost and with no up-front commitment permits even the smallest businesses to trial ideas requiring significant processing power without breaking the bank.
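To give a flavour of the product recommendation engines mentioned above, here is a minimal sketch in Python of the simplest approach we know of: counting which products are bought together. The baskets, product names and the function names `build_cooccurrence` and `recommend` are all invented for illustration; a real engine would run over far more data, typically on tools such as Hadoop, and would use more refined scoring.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical purchase baskets, invented purely for illustration.
baskets = [
    {"kettle", "teapot", "mugs"},
    {"kettle", "mugs"},
    {"teapot", "mugs", "tea towels"},
    {"kettle", "toaster"},
]


def build_cooccurrence(baskets):
    """Count, for each product, how often every other product shares a basket with it."""
    co = defaultdict(Counter)
    for basket in baskets:
        # Every ordered pair (a, b) of distinct products in the same basket.
        for a, b in permutations(basket, 2):
            co[a][b] += 1
    return co


def recommend(co, product, n=2):
    """Return up to n products most often bought alongside `product`."""
    counts = co.get(product, Counter())
    # Sort by descending count, then alphabetically so ties are stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]


co = build_cooccurrence(baskets)
print(recommend(co, "kettle"))
```

With these made-up baskets, a customer buying a kettle would be shown mugs first, because the two appear together most often. The same counting logic scales to millions of baskets when spread across a cluster, which is where the newer tools earn their keep.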
We made five simple recommendations for any smaller business wishing to make a start with Big Data and then rounded off the talk with a quick run-through of some resources that can help smaller businesses to hit the ground running. The five recommendations were:
- Start small – don’t try to do too much at first, but do make a start.
- Stand on the shoulders of giants – take advantage of online tools and resources, such as those provided by Google and Amazon, and learn from what other companies are doing.
- Use visualisation as much as possible – it is the best way to make sense of a large set of data.
- Join big with small – the value of information is increased many times when combined with other information. This is particularly relevant when accessing new, external, data sources.
- Think Big Data – above all, this is a business transformation to become more data-driven. Try to develop a culture whereby everyone in the organisation is encouraged to think about how smarter use of data could improve the business.
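As a small illustration of the ‘join big with small’ recommendation, the sketch below combines a company's own sales records with an external dataset, here population by region, to show where sales are strong relative to the size of the market. All of the figures, field names and the function name `spend_per_million_residents` are hypothetical; in practice the external data might come from a public data source and the join would probably be done in a tool such as Python's pandas library or R.

```python
# Hypothetical internal data: one record per order (figures invented).
orders = [
    {"customer": "A", "region": "London", "spend": 120.0},
    {"customer": "B", "region": "Leeds", "spend": 80.0},
    {"customer": "C", "region": "London", "spend": 60.0},
]

# Hypothetical external data: residents per region (round, made-up numbers).
population = {"London": 9_000_000, "Leeds": 800_000}


def spend_per_million_residents(orders, population):
    """Join internal spend to external population, keyed on region."""
    totals = {}
    for order in orders:
        region = order["region"]
        totals[region] = totals.get(region, 0.0) + order["spend"]
    # Keep only regions present in both datasets (an inner join).
    return {
        region: total * 1_000_000 / population[region]
        for region, total in totals.items()
        if region in population
    }


result = spend_per_million_residents(orders, population)
for region, value in sorted(result.items()):
    print(region, value)
```

On their own, the sales figures suggest London is the stronger region; once joined with the population data, the made-up numbers show Leeds generating more spend per resident, which is the kind of insight neither dataset offers in isolation.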
Overall, our conclusion was that whilst some of the things that companies like Google do with Big Data are hugely impressive and probably do require a significant investment in specialised staff, there is no reason why even the smallest business cannot sip from the firehose of Big Data and start to investigate new capabilities right now. Click here to download the slides.
Below are links to some of the online resources we mentioned in our talk, which we hope readers will find useful.
Platforms (computing infrastructure in the cloud)
Google Cloud Platform
Amazon Web Services
Python (Anaconda distribution)
R (and RStudio)
Google Apps Script (particularly for experimenting with purely Google services)
Google Fusion Tables
Public tools and data sources
Amazon Elastic MapReduce
Google API explorer
Google Public Data explorer
Google shared tables
Coursera online courses
Roshan data mining videos