One of the hottest topics and technology areas in the industry is Big Data. Azure HDInsight is one of the services within the Microsoft Azure platform that provides the ability to build scalable Big Data solutions in the cloud. In an effort to expand the breadth of certifications and exams offered around Microsoft Azure, the cloud, and Big Data, Microsoft has added the Perform Data Engineering on Microsoft Azure HDInsight (70-775) exam.
This exam is retired June 30, 2019.
Certification Target Audience
The focus of the Perform Data Engineering on Microsoft Azure HDInsight (70-775) exam is Microsoft Azure HDInsight. The exam targets Data Engineers, Data Architects, Data Scientists, and Data Developers who implement Big Data engineering workflows on HDInsight. It tests your experience and familiarity with the features and capabilities of batch data processing, real-time processing, and interactive processing.
Here is a high-level list of the skills and objectives measured on this exam:
- Administer and Provision HDInsight Clusters
  - Deploy HDInsight clusters
  - Deploy and secure multi-user HDInsight clusters
  - Ingest data for batch and interactive processing
  - Configure HDInsight clusters
  - Manage and debug HDInsight jobs
- Implement Big Data Batch Processing Solutions
  - Implement batch solutions with Hive and Apache Pig
  - Design batch ETL solutions for big data with Spark
  - Operationalize Hadoop and Spark
- Implement Big Data Interactive Processing Solutions
  - Implement interactive queries for big data with Spark SQL
  - Perform exploratory data analysis by using Spark SQL
  - Implement interactive queries for big data with Interactive Hive
  - Perform exploratory data analysis by using Hive
  - Perform interactive processing by using Apache Phoenix on HBase
- Implement Big Data Real-Time Processing Solutions
  - Create Spark streaming applications using DStream API
  - Create Spark structured streaming applications
  - Develop big data real-time processing solutions with Apache Storm
  - Build solutions that use Kafka
  - Build solutions that use HBase
When studying for this exam, you’ll definitely want to look at the official exam page from Microsoft for the full list of exam objectives, and be sure to study every one of them.
At the time of writing this summary of the 70-775 Perform Data Engineering on Microsoft Azure HDInsight exam, it is still in Beta, as it was only recently published. This means there aren’t any exam guide books or practice exams available yet. To study for this exam, you’ll need to rely mostly on the Azure HDInsight documentation, as well as the documentation for any other services and technologies listed in the exam objectives.
Although not specific to this exam, additional resources from various sources cover the technologies and skills it measures. Here’s a short list of a few that may help in studying for this exam:
- Free eBooks
  - Introducing Microsoft Azure HDInsight by Avkash Chauhan, Valentine Fontama, and 3 others
- Paid Books
  - Big Data Analytics with Microsoft HDInsight in 24 Hours by Manpreet Singh and Arshad Ali
  - HBase: The Definitive Guide: Random Access to Your Planet-Size Data by Lars George
- Video Courses from Opsgility
  - Real-Time Ingestion and Processing in Azure by Chris Pietschmann
- Video Courses from Pluralsight
  - HDInsight Deep Dive: Storm, HBase, and Hive by Elton Stoneman
  - Getting Started with Apache Kafka by Ryan Plant
  - Applying the Lambda Architecture with Spark, Kafka, and Cassandra by Ahmad Alkilani