Strata, like any O’Reilly Media conference, will always celebrate the power of raw technology, the speeds and feeds. But this year, based on the keynotes, the vendors, and the attendees, it is clear that Strata is growing up and becoming far more about the business user, and buyer, than ever before. As usual, there was a lot to enjoy, both sublime and ridiculous. Read on for the roundup.
The Most Gloriously Nerdy Session Award goes to Ian Timourian for “Unlocking the Secrets of Gertrude Stein,” a session in which advanced methods of data science and visualization were used to analyze the poetry of Gertrude Stein. Timourian, @KIDDPHUNK on Twitter, is one of the eclectic people that must be part of any O’Reilly conference. The highlight of the talk was a transcription of the structure of Stein’s poetry to music. The next step for Timourian should be to find a way to create a videogame-style visualization tool for cleaning and distilling data. This is not so far-fetched since he works at Paxata, a startup that makes a product for that purpose.
Unlike at many past Stratas, the vendors were pleased with the number of business buyers. In past years, vendors who laid down serious cash for space at Strata whined, “Too many hoodies, not enough suits.” Not so this year. “The market has finally arrived,” said Monica Pal, VP of Marketing at Aerospike, a super-scale database used in advertising and other high-performance use cases. “The business users are here and are starting to do real work.” Dave Rich, CEO of Revolution Analytics, which offers a distribution of the R open source statistical package, said, “When household tech names show up, it means it is the end of the beginning.”
Many contenders vied for the Creative Company Name Award. Zettaset was far catchier than companies that combine “machine,” “miner,” or “data.” SpliceMachine could just as well be SpliceMiner or SpliceData. RapidMiner could be RapidData or RapidMachine and nobody would know the difference. Ataccama, named for an arid, clean desert in Chile, gets honorable mention. The name fits its mission of purifying data. But the prize goes to Gazzang, which is a comic book exclamation with an extra “z”. Who cares what the company does? (Hint: It offers data encryption.) Gazzang is so much fun to say.
The Get Your Own Slogan Award goes to Pivotal for its use of “data lake.” I love the idea, in fact so much that my company made this video about it in 2012. But I don’t take credit for the concept, which is an excellent way to explain the type of repository needed in the world of big data. James Dixon, CTO of Pentaho, is the person who introduced me to the data lake in 2011, and is, as far as I know, the person who coined it. Actually, Pivotal should keep the data lake slogan, but credit James Dixon for coming up with it.
The Most Insightful Presentation Award goes to Kurt Brown, Director of Data Platform at Netflix. His presentation had more ideas than most books. He explained the Netflix way with respect to architecture, operations, cloud, open source, and managing vendors. He included a brilliant section on philosophical principles. A choice hoodies-meet-suits moment occurred when someone asked what kind of “project management office” Netflix had. Brown smiled and replied that there is no such thing as a project management office at Netflix. At times, a developer leads important efforts that need coordination, but that’s about it.
The Best Quote Award is always a challenge. Farrah Bostic, CEO of The Difference Engine, a consulting firm that helps companies with product strategy and research, had a pithy statement, “People are data too.” Bill Franks, Chief Analytics Officer of Teradata, also offered a great insight when he said, “Big data must be an extension of an existing analytics strategy.” Too often discussions of big data ignore the fact that big data is just one input needed in a comprehensive analytics strategy. Franks feels so strongly about this issue that he’s working on a book called Making Analytics Operational. Using big data is easy compared with creating a culture of analytics.
Upon consideration, I declare Geoffrey Moore the winner of the Best Quote Award with a quote he coined last year: “Without big data analytics, companies are blind and deaf, wandering out onto the Web like deer on a freeway.” If you take big data out, the quote is less trendy and more useful. Moore also said, “There is a trillion dollars of work to be done soon,” which I liked as well.
The People-Driven Platform Award goes to the platform that supports as many different types of questions as people can ask. Ari Zilka, CTO of Hortonworks, points out that the power of Hadoop is that it serves many data models. You can have data in flat files, raw data in a filesystem, and NoSQL served up out of Hadoop. But if you are seeking to create an analytics culture, you ultimately want to allow people to ask any question and for the system to find a way to provide an answer. Teradata’s Bill Franks put it this way: “We don’t want many brains; we want one brain with many specialized subsystems.”
For BI to have the greatest impact, it has to solve the Top of the Funnel problem. This is not solved by any current platform, except perhaps IBM’s Watson for some applications. To solve it for the world of analytics will require natural language processing connected to some sort of semantic model of the data. DataRPM, ClearStory Data, and Microsoft Power Q&A all take important steps in this direction. But each of these platforms is limited to one or two data models. Companies like Facebook and Netflix don’t try to solve this problem; instead, they support different niche environments.
The best near-term solution has one platform that exposes data in different data models. For that reason, the Teradata Aster Discovery Platform wins the People-Driven Platform Award this year. With this platform it is possible to access data in Hadoop and in a scalable SQL repository, and to do graph analysis. MapReduce and graph analysis are supported through extensions to SQL rather than by using completely new query mechanisms. BI tools, SQL, and interactive development environments are supported, as well as an API for custom apps. Data ingestion, prep, analytics, and visualization are all integrated. To really create a discovery platform, all of these parts need to be in the same workbench. It will be interesting to see how people from all corners of the enterprise react to this collection of functionality.
I’ve attended Strata several times, written several O’Reilly books, and am a great admirer of Tim O’Reilly and O’Reilly Media. One of the core principles of the O’Reilly way is to “watch the alpha geeks.” When you go to OSCon, there is always an abundance of alpha geeks. If I were Gina Blaber or the other powers that be at O’Reilly, I would give serious thought to this question: Now that the business buyers are arriving at Strata in force, how long will the alpha geeks stay? It is certainly possible that Strata could become just like every other IT industry conference, which would be a sad thing. The alpha geeks come to meet each other and get genuinely new ideas from people like Ian Timourian. As the magicians and Star Wars figures multiply on the show floor, it will be interesting to see how Strata keeps its alpha geek mojo.
Follow Dan Woods on Twitter: