Ken Zimmerman reviews the IT roundtable event presented by the Fairfield/Westchester SIM (Society for Information Management) that took place on Thursday, June 21, where he facilitated discussions for over 60 CIOs and IT Executives.
The topic of Big Data was very well received, I think. Generally, folks have a good understanding of the boundaries of “small data”, but there seemed to be a wide range of perceptions regarding the definition of Big Data. Some folks thought that it has to do only (or primarily) with unstructured data, while others defined it as “humongous amounts of data”. In reality, it is any amount, or type, of data that, because of its quantity or its requirements for timeliness, cannot be managed by typical tools. We discussed technologies like Hadoop (which has been “harnessed” by IBM, Microsoft and Oracle), as well as SAP’s HANA, and how organizations like Facebook have approached the Big Data problem.
I think that folks see Big Data as something that is on the horizon and, while not necessarily here yet (for them), will be here soon, and therefore they need to start thinking about how to deal with it. Some folks are in the actual strategy stage, while others are contemplating creating a strategy in the next year or so.
At least one participant spoke about how their company works with Big Data on a daily basis since it is their raison d’être, and thus they have learned how to deal with the quantities and types of information inherent in the Big Data problem. We spoke about some of the issues that John Parkinson has written about:
- Data quality – the more data you accumulate, the harder it is to keep everything consistent and correct
- Data characterization (metadata) – How you deal with the “data about data” requires you to know how much data you will need to deal with and how fast it’s likely to change as it grows
- Interpretation – Technologists have had to design pattern recognizers that can sift through this huge amount of data quickly to find (potentially unanticipated) data sets that can be acted upon expeditiously
- Data visualization – Representing results in an easily consumable form is a big challenge with such large quantities and types of information, but there are ways to approach and solve this issue
- Real-time view or retrospective view – Organizations will need to choose between real-time and retrospective views in order to work with Big Data on a reasonable budget
- Retention – How long the data is relevant or valuable needs to be part of the Big Data planning and strategy
It was a very good session, and I think people generally came away with a better understanding of what Big Data means. But, of course, it’s a Big Problem to solve.