Azure HDInsight helps make Hadoop enterprise-ready in the cloud

Kareem Anderson

Before getting into the nuances of Microsoft’s recently announced security, performance, and ISV updates for Azure HDInsight, we feel it incumbent on us to discuss Hadoop in greater detail for a moment. Bear with us; it’ll all converge shortly.

Hadoop, created by Doug Cutting and Mike Cafarella in 2005 and named after Cutting’s son’s toy elephant, came about as a solution for managing web-related search data. Over the course of eleven years, it has evolved into an open-source, community-built project under the Apache Software Foundation.

What do Hadoop and today’s announcement of performance and security updates for Azure HDInsight have to do with one another? Well, they are closely tied: HDInsight is Microsoft’s managed Hadoop service in the cloud.

According to Tiffany Wissner, senior director of product marketing for Microsoft’s Data Platform:

“We are pleased to announce new capabilities in Azure HDInsight, Microsoft’s managed Hadoop and Spark cloud services, that build on our leadership to make Hadoop enterprise-ready in the cloud and easy for your users with the most security capabilities of any cloud Hadoop solution, big data query speeds that approach data warehousing performance, and new notebook experiences for data scientists all on the latest Hortonworks Data Platform 2.5 and Spark 2.0 platform.”

The specific list of improvements supporting the Hadoop solution includes:

  • Simplifying the authentication and identity management process
  • Authorization with central security policy administration and auditing
  • Encryption for data protection
  • Using LLAP (Live Long and Process) to boost query speed: Microsoft says it is the first cloud Hadoop solution to onboard LLAP from the Stinger.Next initiative, which promises sub-second querying on big data and which the company says is 25x faster than existing Hive.
  • Continued Spark support with the Spark 2.0 release: Spark 2.0 is a major release that overhauls the core query engine with “Project Tungsten,” which upgrades Spark with capabilities of a modern compiler to perform cache-efficient vectorized computations.
  • New data science experiences with Zeppelin notebook support
  • Addition of partners Cask and StreamSets to the Azure HDInsight ISV program

While much of the tech news regarding Microsoft has been focused on the company’s appearance and announcements at Ignite 2016, some attention should also be paid to its other cloud offerings, including Azure HDInsight and its expanded Hadoop capabilities. For anyone interested in reading up on this week’s announcements for Spark 2.0, improved ISV support, LLAP speeds, encryption and data protection, or other managed cloud Hadoop solutions, visit the Microsoft Azure blog.