Big Data Solutions Through The Combination Of Tools

Feb 12 2014, 2:57pm CST


By Ben Lorica

As a user who tends to mix and match many different tools, I find that not having to configure and assemble a suite of tools is a big win. So I’m really liking the recent trend toward more integrated, packaged solutions. A recent example is the relaunch of Cloudera’s Enterprise Data Hub to include Spark(1) and Spark Streaming. Users benefit by gaining automatic access to the analytic engines that come with Spark(2). Besides simplifying things for data scientists and data engineers, easy access to analytic engines is critical for streamlining the creation of big data applications.

Another recent example is Dendrite(3), an interesting new graph analysis solution from Lab41. It combines Titan (a distributed graph database), GraphLab (for graph analytics), and a front-end built on AngularJS into a graph exploration and analysis tool for business analysts.

Users of Spark explore Spark Streaming because similar code for batch (Spark) can, with minor modification, be used for realtime (Spark Streaming) computations. Along these lines, Summingbird – an open source library from Twitter – offers something similar for Hadoop MapReduce and Storm. With Summingbird, programs that look like Scala collection transformations can be executed in batch (Scalding) or realtime (Storm).
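To make the "write once, run in batch or realtime" idea concrete, here is a minimal sketch in plain Python. It does not use the Spark, Storm, or Summingbird APIs; the function names (count_words, run_batch, run_streaming) and the sample data are hypothetical and chosen only for illustration. The core logic is written once, then applied either to the whole dataset or to a sequence of small batches whose partial results are merged.

```python
from collections import Counter
from typing import Iterable, Iterator, List


def count_words(lines: Iterable[str]) -> Counter:
    """Core logic, written once: split lines into words and count them."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def run_batch(dataset: List[str]) -> Counter:
    """Batch mode: apply the logic to the whole dataset at once."""
    return count_words(dataset)


def run_streaming(micro_batches: Iterable[List[str]]) -> Iterator[Counter]:
    """Streaming mode (simulated): apply the same logic to each small batch
    and merge the partial counts, yielding the running total each time."""
    running = Counter()
    for batch in micro_batches:
        running = running + count_words(batch)
        yield running


# Hypothetical sample data, split into two micro-batches for the streaming run.
lines = ["spark and storm", "spark streaming", "storm and scalding"]
batch_result = run_batch(lines)
stream_results = list(run_streaming([lines[:1], lines[1:]]))

# Once the stream has seen all the data, both modes agree.
print(stream_results[-1] == batch_result)
```

The design point is that partial word counts can be merged by addition, so the same function serves both modes. Real systems add what this sketch leaves out, such as distribution across machines, fault tolerance, and windowing.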

In some instances the underlying techniques from one set of tools make their way into others. The DeepDive team at Stanford recently revamped their information extraction and natural language understanding system. But techniques used in DeepDive have already found their way into many other systems, including MADlib, Cloudera Impala, “a product from Oracle,” and Google Brain.


(1) Full disclosure: I am an advisor to Databricks – a startup commercializing Spark.
(2) Some potential applications of Spark and Spark Streaming include stream processing and mining, interactive and iterative computing, machine-learning, and graph analytics.
(3) Hat tip to Danny Bickson.

This post originally appeared on O’Reilly Data (“Big Data solutions through the combination of tools”). It’s republished with permission.

Source: Forbes
