This new system is designed to eliminate the human element in data analysis.
Researchers at the Massachusetts Institute of Technology's (MIT) Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed a new system that can outperform many human teams at data analysis.
The system, named 'Data Science Machine', is a breakthrough in artificial intelligence that aims to take the human element out of data analysis.
Big-data analysis generally requires human intuition, both to search for patterns and to choose which features of the data to analyze. The new system can do both without human involvement and, in some cases, better than humans themselves.
To test its ability, MIT researchers enrolled the system in three data science competitions, where its predictions were 94, 96 and 87 percent as accurate as the winning submissions, and it finished ahead of 615 of the 906 human teams. Where human teams took several months to generate prediction algorithms, Data Science Machine took just 2 to 12 hours to complete the task.
"We view the Data Science Machine as a natural complement to human intelligence," said Max Kanter, who is doing his MIT master's thesis in computer science on Data Science Machine. "There's so much data out there to be analyzed. And right now it's just sitting there not doing anything. So maybe we can come up with a solution that will at least get us started on it, at least get us moving."
Kanter and his thesis advisor, Kalyan Veeramachaneni, will present the system at the upcoming IEEE International Conference on Data Science and Advanced Analytics.
Data Science Machine is designed to exploit the structural relationships built into a database's design. Databases typically store different kinds of data in different tables and link them with numerical identifiers. The system tracks these links and uses them as cues for constructing features.
For instance, one table might list retail items and their costs; another might list individual customers' purchases. The machine would begin by importing costs from the first table into the second. Then, by grouping the items that share the same purchase number, it would come up with candidate features such as total cost per order, average cost per order, minimum cost per order, and so on.
Once it has prepared a set of candidates, it reduces their number by identifying those whose values seem to be correlated, then recombines the remaining features in different ways to make its predictions more accurate.
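The retail example above can be sketched in a few lines of Python. This is only an illustration and not the researchers' code: the table contents, the names `items` and `purchases`, and the helper `order_features` are all made up for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy tables; the names and values are illustrative only.
items = {  # item_id -> cost
    "A": 2.0,
    "B": 5.0,
    "C": 3.0,
}
purchases = [  # (order_id, item_id)
    (1, "A"),
    (1, "B"),
    (2, "C"),
    (2, "A"),
    (2, "B"),
]


def order_features(items, purchases):
    """Import each item's cost into the purchases table, then aggregate per order."""
    costs_by_order = defaultdict(list)
    for order_id, item_id in purchases:
        # The shared item_id is the identifier that links the two tables.
        costs_by_order[order_id].append(items[item_id])
    return {
        order_id: {
            "total_cost": sum(costs),
            "average_cost": mean(costs),
            "minimum_cost": min(costs),
        }
        for order_id, costs in costs_by_order.items()
    }


features = order_features(items, purchases)
```

Each entry in `features` holds the candidate features for one order. A real system would generate many more aggregates than the three shown here.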
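One simple way to thin out correlated candidates is sketched below. The article does not say how the system measures correlation or which features it discards, so the use of Pearson correlation, the 0.95 threshold, the keep-the-first-seen rule, and the names `pearson` and `prune_correlated` are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def prune_correlated(candidates, threshold=0.95):
    """Keep a feature only if it is not strongly correlated with one already kept.

    The threshold and the keep-first rule are assumptions, not the paper's method.
    """
    kept = {}
    for name, values in candidates.items():
        if all(abs(pearson(values, other)) < threshold for other in kept.values()):
            kept[name] = values
    return kept


# Hypothetical candidate features, one value per order.
candidates = {
    "total_cost": [7.0, 10.0, 4.0, 12.0],
    "total_cost_cents": [700.0, 1000.0, 400.0, 1200.0],  # redundant with total_cost
    "minimum_cost": [2.0, 2.0, 4.0, 3.0],
}
kept = prune_correlated(candidates)
```

Here `total_cost_cents` carries the same information as `total_cost`, so it is dropped, while `minimum_cost` survives. The recombination step that follows in the real system is not shown.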
"What we observed from our experience solving a number of data science problems for industry is that one of the very critical steps is called feature engineering," said Veeramachaneni. "The first thing you have to do is identify what variables to extract from the database or compose, and for that, you have to come up with a lot of ideas."