Wired recently ran an article called “The End of Theory.” I meant to check it out, but was too distracted to read it until I bumped into it again through a link on Digg. I’m glad I took the time. To summarize, the author describes how the information age lets us examine enormous amounts of data gathered from clusters of computers, whether they are machines in our homes or supercomputers in laboratories. The new age of technology has seen the beginnings of something like a neural network, a complex digital brain that already functions as a supercomputer. Because of the seemingly infinite amount of data, and the supercomputers that help crunch the numbers, we can easily spot correlations. The numbers speak for themselves. Until recently, gathering information was a slow process; these days we can’t get enough of it. In fact, some argue there is too much data to swim through. But this vast sea of information lets us see the bigger picture: tendencies and correlations first, theories second. As the article puts it:
There is now a better way. Petabytes allow us to say: “Correlation is enough.” We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.
The best practical example of this is the shotgun gene sequencing by J. Craig Venter. Enabled by high-speed sequencers and supercomputers that statistically analyze the data they produce, Venter went from sequencing individual organisms to sequencing entire ecosystems. In 2003, he started sequencing much of the ocean, retracing the voyage of Captain Cook. And in 2005 he started sequencing the air. In the process, he discovered thousands of previously unknown species of bacteria and other life-forms.
With this new technology, science may at last be aided by the fruits of its own labor. The digital brain, the supercomputer, may help us cope with the data overflow and point us toward more plausible causal explanations. The author seems to go out of his way to insist that the old method is now superfluous. I doubt he is correct on this one, for the simple reason that understanding causation deepens our understanding of correlation. It’s nice to see correlation acknowledged, though. For instance, if there is some form of psychic phenomenon, the traditional scientific answer would be: there’s no way to prove it, so it’s probably not real. However, if we take correlation into account, the numbers might at least show that it is a significant phenomenon, whether it turns out to be social or truly parapsychological.
Learning to use a “computer” of this scale may be challenging. But the opportunity is great: The new availability of huge amounts of data, along with the statistical tools to crunch these numbers, offers a whole new way of understanding the world. Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all.
On the value of correlation, though, I agree with Wired. Correlation has been left on the back burner too often. It may end up shining a light on areas that science traditionally takes ages to illuminate. With this new and rapidly developing tool, the possibilities for science, for technology, and for finally getting a handle on the information age will be even more fascinating. It makes me wonder: does this open us up to a future integral age? We’re beginning to see less traditional methods in science as it adapts to new technological possibilities. Will data summaries, connections, underlying themes and patterns start to gain significance? I suppose we will have to wait and see what the future brings.