The proper use of big data has only become possible in recent years, thanks to the internet and instant global communications as well as the increased computing power needed to store and analyze it all. But the concept has been around for much longer. The economies of yesteryear may not have depended on big data the way today’s do, but it’s almost common sense that important knowledge can be gained from understanding mankind based on data gathered from thousands, millions or even billions of sources.
John Graunt is credited as being the first person to use statistical data analysis. In 1663, his approach to studying the bubonic plague tearing Europe to pieces must have seemed very different, not to mention time-consuming. At a time when most people blamed the plague on magic, the devil or ethnic conspiracy theories, Graunt was painstakingly reviewing death statistics in the city of London in an attempt to create a science-based early warning system for the spread of the plague.
While that system never became a functioning reality, Graunt’s work led to many other insights about London’s population and produced the first statistical estimate of the number of the city’s inhabitants. The spark had been lit, though it would take some time to develop into a full-blown fire of big data advocates.
Gradually a precedent would be set, but there were many obstacles to the gathering of important public data, not to mention the analysis of what was collected. Technology was the main limiting factor; if you wanted data, someone had to physically go and collect it. This meant going door to door, poring over old paper documents or a combination of the two, and the results then had to be reconciled with one another. In short, the process was extraordinarily labor intensive. By the time enough data had been gathered, it would reflect statistical realities that were as much as a decade old.
By the late 1800s, this was becoming a problem. The Industrial Revolution had changed the pace of life and populations were changing more quickly than ever. Birth rates rose and infant mortality fell. Large numbers of people moved into cities in rapid waves and changed demographic realities in a few short years. Data collectors needed a new solution.
The U.S. Census Bureau estimated that it would take eight years to gather and process data from the 1880 census and thought that the 1890 census would take over a decade. But in 1881, a bureau employee named Herman Hollerith began work on a machine inspired by the looms of the Industrial Revolution to speed up the process. With it, the 1890 census was miraculously completed in three months.
With the discovery that machinery could open new possibilities in the gathering of data, new inventions started taking off. In 1928, Austrian-German engineer Fritz Pfleumer patented a way of storing gathered information on magnetic tape. In 1943, the British invented the world’s first data processor in order to break Nazi codes, scanning 5,000 characters a second. Computers evolved quickly and made possible the work of the NSA, which tracked, analyzed and broke Soviet codes.
Governments started large-scale efforts to store records such as tax returns on tape, and then personal computers hit the market. But the next proper breakthrough in big data technology came with internet access. With increasing storage capabilities, businesses, governments or even hackers could track and save online activity for their own purposes. The more data that could be stored, the more useful it became.
Now on-site physical storage matters far less, as cloud storage has taken over on the largest of scales. Bots gather it all automatically, and it is far more useful information than just your age or how many children you have. Data today can tell us your political beliefs, your buying habits, your interests and even some things you probably consider to be secrets.
Which brings us to data analysis. Why gather all that information? Analyzing census data used to be a job left to a few experts who would pore over documents, gather the numbers and try their best to use the human brain’s capacity for pattern recognition to draw meaningful conclusions from them. Today, you guessed it, computers can pretty much do that for us, and they’re only getting better at it.
The future appears to be a world in which big data will give other entities (people and algorithms alike) an unparalleled understanding of who you are. While there are some legitimate concerns over how this capability could be used, the positive implications are also irrefutable. Dramatically improved and personalized healthcare is one, alongside new economic opportunities. The issue isn’t the technology itself; it’s how we choose to administer it.