What data should we collect? Part 3: Big data vs small data

If you are, like us, fully convinced that data collection is worth the bother, and have decided what problem you need to solve, another question might arise. Is traditional small-scale data still good enough, or do you need to get your head around the emerging discipline of data science and collect Big Data? This is what we want to discuss in part 3 of our series ‘What data should we collect?’.

Big Data is one of those buzzwords that has risen to popularity over recent years. It sounds important and great. Not just data, no, BIG data. Adding to the myths and mysteries of Big Data, analysts describe it as the next big frontier and a way to improve innovation, competition and productivity. Because Big Data seems to promise a solution to every business problem, a sense of excitement surrounds the topic. The Big Data hype has reached most industries by now and construction is no different. In 2014, ARUP published a thought piece on the topic (though without explicitly labelling it as ‘big data’) arguing that data will be the new currency in construction. But what exactly is Big Data and is it really superior?

Many different definitions of Big Data exist. Some people describe it simply as larger data sets, others focus on its complexity. One popular definition comes from analysts at Gartner, who focus on the three Vs: volume, velocity and variety. According to this definition, Big Data is data that is high in volume, gathered and analysed quickly and continuously, and of greater variety than usual, often combining different sources and data sets. A brilliant and critical definition of Big Data comes from social media scholars Danah Boyd and Kate Crawford, who define it in their article ‘Critical Questions for Big Data’ as follows:

“We define Big Data as a cultural, technological, and scholarly phenomenon that rests on the interplay of:
(1) Technology: maximizing computation power and algorithmic accuracy to gather, analyze, link, and compare large data sets.
(2) Analysis: drawing on large data sets to identify patterns in order to make economic, social, technical, and legal claims.
(3) Mythology: the widespread belief that large data sets offer a higher form of intelligence and knowledge that can generate insights that were previously impossible, with the aura of truth, objectivity, and accuracy.”

Boyd and Crawford then go on to dispel many Big Data myths, among them the claim that bigger is better and that Big Data is more objective and accurate. They also criticize Big Data for creating new digital divides, taking data out of context and raising ethical issues.

How does this relate to the construction industry and data we might collect in projects around work and the workplace?

One thing to notice is that Big Data is only making a very slow and careful appearance in the realm of the workplace. Notable exceptions such as Humanyze aside, Big Data in the workplace field is still in its infancy. This is not to say that Big Data studies do not exist. Data sets such as the Leesman index or the Gensler workplace study are increasing in size, and comparative studies of workplaces are on the rise, for instance the one we were involved in some years ago on the ‘Generative Office’, which evaluated 61 office buildings in 2012, or the most recent analysis of social behaviours in the workplace by Koutsolampros et al. But on the whole, Big Data in the workplace is certainly not the norm.

Is this a problem? Definitely not. Big Data can undoubtedly lead to new insights into patterns that weren’t visible with traditional methods of data analysis. But so can carefully crafted studies collecting ‘small data’. There are so many things we do not yet know about the workplace that any type of rigorously collected data will add insights, for instance qualitative data gathered in interviews, ethnographic observations, or statistics derived from a clearly defined single-source data set. Big is not necessarily better. Yet again, the answer to whether you should collect Big Data is “It depends”. If the problem you need to solve requires real-time analytics, automatically collected data and the combination of diverse data sources, and you have the resources required, including data science capacity and skills, then by all means go for it. Most often though, a clearly defined small-scale study that collects exactly the data needed to answer a question can be much more valuable to businesses than jumping on the Big Data bandwagon.

Read the final part 4 of this series of blogs here
