DXPG

Sunday, April 7, 2013

The Potential and the Risks of Data Science

Columbia University held a daylong symposium on Friday as a kind of brainy coming-out party for its new Institute for Data Sciences and Engineering. The institute is a collection of interdisciplinary centers including ones for cybersecurity, financial analytics, health analytics, new media and smart cities. It points to the direction universities will have to take if the bundle of technologies called Big Data (new data and artificial intelligence tools) really is to transform industries, as its champions predict.

The symposium, “From Big Data To Big Ideas,” was mainly a celebration of the promise of the technology in fields from health care to transportation, with presentations from Columbia professors and computer scientists from companies like Google, Facebook, Microsoft and Bloomberg.

The privacy and surveillance perils of Big Data came up only in passing. But during a question-and-answer portion of one panel, Ben Fried, Google’s chief information officer, expressed a misgiving. “My concern is that the technology is way ahead of society,” Mr. Fried said. There is danger, he suggested, if only a technical elite understand Big Data and its implications, with the risk of a runaway technology or a public rejection.

I spoke with Mr. Fried briefly afterward. “I think it is a mistake if conversations about this technology leave out the humanities,” he said. Broader social concerns, he explained, should be a guide and will affect the spread and use of Big Data technology.

Mr. Fried works for a company that has at times tested the limits of Big Data technology, notably the privacy threat posed by overaggressive data collection. But he makes a good point, and he’s not the only one making it.

Alex Pentland, a computational social scientist at the MIT Media Lab, is leading a group at MIT and elsewhere in exploring the implications of what he calls “a data-driven society.”

At Columbia, Mark Hansen, a professor of journalism and director of the institute’s New Media Center, has his own plan for bringing the humanities into Big Data. He teaches students from Columbia’s Graduate School of Journalism how to do some data programming. The goal, Mr. Hansen explains, is not to make them professional programmers, but mainly to give journalists, whom he calls “society’s explainers of last resort,” a firmer understanding of computer technology. Software algorithms, he said, are not impartial. They are written by people, and can embody human values and biases.

Mr. Hansen’s small-scale educational program recalls the major initiative at Dartmouth College in the 1960s, when mainframe computers, the transforming technology of the day, came into widespread use in business, government and science. It was there that two professors, John Kemeny and Thomas Kurtz, developed Basic, a simplified programming language, initially for Dartmouth students. They were influenced by the concerns raised by C.P. Snow, the English scientist and novelist, in his 1959 lecture, “The Two Cultures,” which analyzed the difference between scientific and literary intellectuals, and pointed to the danger of the schism.

The Columbia data sciences institute is just getting under way. New centers could be added. One of the panelists on Friday suggested Columbia might want to add something like the Berkman Center for Internet and Society at Harvard University, which focuses on the impact of technology on society.

The Columbia institute is a science-led undertaking, and its director is a computer scientist, Kathleen R. McKeown. Incidentally, Ms. McKeown, who was the first female professor to receive tenure at Columbia, holds a bachelor’s degree in comparative literature.