When Assistant Professor of Economics Erik Nelson heard what was going on at the computer science department earlier this semester, he was psyched. “I was excited when she sent out an email about a month and a half ago looking for data that her students could work with.” The “she” Nelson refers to here is Clare Bates Congdon, visiting associate professor of computer science.
This semester her students have been taking the course Machine Learning, also known as data mining. In a nutshell, she explains, this is using artificial intelligence to sift through data and recognize patterns, with a view to predicting future behavior.
In Nelson’s case, he wants to use data mining to help in the analysis of crop yields. “I have this global data set of country-by-country yields from 1975 up to now,” he says. “I have information on factors like soil quality, weather conditions and the level of investment in agricultural technology.” His aim is to build a model of what determines crop yield in a country over time.
The difference between this machine learning approach and the technique economists traditionally use to analyze information, says Nelson, is that computers “let the data alone do the talking,” whereas economists look at data and make certain assumptions.
“The great thing about machine learning,” Nelson says, “is that it lets the machine discover for itself how data are shaped.”
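To make that contrast concrete, here is a minimal sketch in Python. It is not the method used in Congdon's class or in Nelson's research, since the article does not say which algorithms were applied. The rainfall and yield numbers are made up for illustration, and the helper names `fit_line` and `knn_predict` are hypothetical. The straight-line fit stands in for an analysis that assumes the shape of the relationship in advance, while the nearest-neighbor average makes no assumption about shape and simply follows nearby observations.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: assumes a straight-line relationship."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def knn_predict(xs, ys, query, k=3):
    """Average the outcomes of the k observations closest to the query point."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - query))[:k]
    return sum(y for _, y in nearest) / k


# Made-up data (an assumption, not Nelson's data set): yield rises with
# rainfall up to a point, then falls, tracing an inverted U that peaks at 0.5.
rainfall = [i / 10 for i in range(11)]
crop_yield = [1 - (2 * r - 1) ** 2 for r in rainfall]

slope, intercept = fit_line(rainfall, crop_yield)
line_at_peak = slope * 0.5 + intercept
knn_at_peak = knn_predict(rainfall, crop_yield, 0.5)

print("true yield at peak:      ", 1.0)
print("straight-line prediction:", round(line_at_peak, 3))
print("nearest-neighbor average:", round(knn_at_peak, 3))
```

On this toy data the straight line misses the peak because its assumed shape is wrong, while the neighbor average lands close to it. With real data the comparison could go either way, which is why running both can serve as a check on each other.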
Congdon’s students have not finished data mining Nelson’s crop yields yet. But when they do, Nelson says he intends to compare the results with his own conclusions, in what he calls a “robustness check.”
He sees machine learning not as a replacement for established analytical techniques, but as an important way for economists to verify that their methods are working. “A lot of times economists don’t get these robustness checks,” says Nelson.
Helping Nelson with the project is Clarissa Hunnewell ’17 – one of six students who have chosen to take the machine learning course.
One of the reasons she chose it, she says, was an interest in working with, and learning from, data.
“I’ve learned both how troublesome and how powerful working with data can be in helping us solve problems.” Hunnewell, a math and computer science major, says she wants to take more classes like this, and as for life beyond college: “I know that work with Big Data is becoming more regular and more popular, so I could definitely see myself doing something related to that.”
Congdon’s students are also helping two other professors crunch data, in completely unrelated areas.
“[Associate Professor of Education] Doris Santoro is working on a social media-based project, and my students are helping her look for patterns in the tweets of K through 12 teachers,” says Congdon.
They’re also helping Anja Forche, a research assistant professor in biology, analyze the behavior of yeast in the human body.
Congdon says machine learning as a form of computer science is particularly interdisciplinary, helping us to understand phenomena in a wide variety of fields.
Devising a machine learning program, she says, is “pretty complex. There’s a lot of math under the hood, figuring out where the patterns are.”
Data-mining computer programs have to be able to adapt as they run, she says, continually drawing new conclusions as new data pour in.
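As a rough illustration of a program that adapts as it runs, the Python sketch below updates a simple model one observation at a time instead of refitting from scratch. It is an assumption-laden toy, not software from Congdon's course: the class name `OnlineLinearModel`, the learning rate, and the made-up data stream are all hypothetical, and the update rule is plain stochastic gradient descent on squared error.

```python
class OnlineLinearModel:
    """Toy one-feature linear model that learns from one observation at a time."""

    def __init__(self, learning_rate=0.05):
        self.weight = 0.0
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.weight * x + self.bias

    def update(self, x, y):
        """Nudge the parameters toward the new observation (one gradient step)."""
        error = self.predict(x) - y
        self.weight -= self.learning_rate * error * x
        self.bias -= self.learning_rate * error
        return error


def stream(n):
    """Made-up data arriving one record at a time, following y = 2x + 1."""
    for i in range(n):
        x = (i % 10) / 10.0
        yield x, 2.0 * x + 1.0


model = OnlineLinearModel()
for x, y in stream(5000):
    model.update(x, y)  # the model's conclusions shift with each new record

print("learned weight:", round(model.weight, 3))
print("learned bias:  ", round(model.bias, 3))
```

Each call to `update` revises the model slightly, so its predictions keep tracking the data as more records arrive. Real data-mining systems are far more elaborate, but the one-record-at-a-time loop captures the basic idea.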
Congdon says her students need to learn programming skills, and in some cases build or modify software.
These skills, she says, make her students “tool-makers” rather than “tool-users.”
Congdon has been studying and teaching machine learning for the past quarter of a century, so she’s fascinated by the upsurge of interest in the subject over the past few years.
“Everyone wants to own data mining now and machine learning,” she says. “It’s because of the amount of data that are out there today,” she adds, noting how many of our everyday transactions are now electronically logged.
“It’s crazy how much about us has been recorded somewhere.”