Government agencies are increasingly collecting big data and performing analysis to improve existing processes and gain valuable insights by engaging in new types of analyses that weren’t possible before. A number of technologies are enabling businesses and governments to handle vast data sets without having to spend millions on deployment. But, as with any new technology, the CIO must build a business case for the adoption of a big data platform.
LIVE Singapore! is one example of a proposed big data platform research project that aims to produce “a system of systems” in which big data can be used to reflect urban activity. This project, funded by the National Research Foundation and developed by MIT’s Senseable City Lab, aims to transform Singapore into a knowledge-based growth industry, driven by data and analytics.
Since its development began in 2011, the project has produced six visualisations from several sets of data streams. These visualisations report on vehicular traffic, the effect of rainfall on taxi supply, the consequences of Formula One activities for daily routines, temperature rise and energy consumption, mobile phone penetration, and the global reach of the transport system.
When developing a big data platform similar to LIVE Singapore!, there are some important questions that need answering.
1. What can a big data technology do better than traditional database management technologies that you’ve already invested in? If you’ve identified opportunities for big data analytics, in what ways can the technology help you tap into this potential? The value delivered by a big data technology investment must be clear from the outset.
2. Cost is an important factor for any big data technology deployment. What kind of investment can you afford given the estimated returns from big data analytics? If you have a big budget, would you stick to a relational database management system like Microsoft SQL Server or Oracle that can cost you a six-figure sum or more? Or would you rather use open source software like Hadoop, which has zero licensing fees, can run on commodity hardware, and costs just a few thousand dollars to get started?
Then there are personnel costs. Data scientists are in huge demand but short supply. Should you keep expertise in-house and invest in training as needed, or set up a formalized process with a trusted third party to execute big data strategies? An estimate of overall costs – the expenses of running a big data platform and skilled labor or outsourcing costs – is critical.
3. What is the reality of a big data technology implementation? What kind of exploration and use cases are provided by various big data technologies? Additionally, what kind of resources and training will the implementation entail? While technologies and tools are making some emerging big data frameworks easier to understand, implementing them is still complex. A careful review of implementation challenges and solutions, as well as the common reasons for deployment failures, can help in making judicious choices.
4. Security is another important issue. What kind of security issues do various technologies pose? In what way do they meet compliance requirements? Additionally, what measures and processes are required to manage security risks?
5. As you continue dealing with vast data sets and adding information to your big data platform, it becomes necessary to have appropriate information life-cycle management strategies in place. You also need to have an idea about the requisite disaster recovery strategies.
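The overall cost estimate called for in question 2 can be roughed out in a few lines of code before any vendor conversation. The Python sketch below is one possible way to do it, not a prescribed method: the PlatformOption class, the net_return function, the three-year horizon and every dollar figure are hypothetical placeholders, and the calculation ignores discounting.

```python
from dataclasses import dataclass


@dataclass
class PlatformOption:
    """One candidate platform; every figure is a hypothetical placeholder."""
    name: str
    licensing: float        # up-front licence fees
    hardware: float         # servers and storage
    annual_running: float   # hosting, support and maintenance per year
    annual_staffing: float  # in-house salaries and training, or outsourcing fees, per year

    def total_cost(self, years: int) -> float:
        """Up-front costs plus recurring costs over the planning horizon."""
        upfront = self.licensing + self.hardware
        recurring = (self.annual_running + self.annual_staffing) * years
        return upfront + recurring


def net_return(option: PlatformOption, annual_benefit: float, years: int) -> float:
    """Estimated benefit over the horizon minus total cost (undiscounted)."""
    return annual_benefit * years - option.total_cost(years)


# Hypothetical inputs for illustration only; substitute your own estimates.
options = [
    PlatformOption("licensed RDBMS", 250_000, 100_000, 50_000, 150_000),
    PlatformOption("open source on commodity hardware", 0, 5_000, 20_000, 300_000),
]

for option in options:
    print(f"{option.name}: total cost {option.total_cost(3):,.0f}, "
          f"net return {net_return(option, 400_000, 3):,.0f}")
```

With these made-up inputs, staffing rather than licensing accounts for the largest share of the total in both options, which is why personnel costs deserve their own line in the business case.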
Big data conversations must start as early as possible. The sooner you invest in high-performance analytics, the better you’ll be able to capture the full potential of big data.