AWS’s second specialty exam, Big Data, is designed to validate technical skills and experience in designing and implementing AWS services to derive value from data. The exam is aimed at individuals who perform complex Big Data analyses, and validates an individual’s ability to:
- Implement core AWS Big Data services according to basic architecture best practices
- Design and maintain Big Data
- Leverage tools to automate data analysis
So far, so good; but what does this mean in real terms? After all, one person’s Big Data is another person’s Jupyter notebooks on EMR. AWS, as always, breaks this down across domains (totalling 100%):
| Domain | Topic | Weighting |
| --- | --- | --- |
| Domain 1 | Collection | 17% |
| Domain 2 | Storage | 17% |
| Domain 3 | Processing | 17% |
| Domain 4 | Analysis | 17% |
| Domain 5 | Visualisation | 12% |
| Domain 6 | Data Security | 20% |
An obvious problem, and one that was certainly reflected in the exam, is the weight put on Domain 6: Data Security. Architects and Data Scientists probably aren’t going to have much real exposure to this domain, and the questions I was given focussed on logging and S3 bucket security controls, rather than on configuring encryption on EMR or a Redshift Data Warehouse (or “DWH”, as one question just throws in there).
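To give a flavour of the S3 security controls the exam leans on: a common pattern is a bucket policy that denies any upload not requesting server-side encryption. This is an illustrative sketch only; the bucket name is hypothetical, and you’d use `aws:kms` instead of `AES256` if you require KMS-managed keys:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnencryptedUploads",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::example-logs-bucket/*",
      "Condition": {
        "StringNotEquals": {
          "s3:x-amz-server-side-encryption": "AES256"
        }
      }
    }
  ]
}
```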
As with all exams above Associate level, there’s a prerequisite to hold at least one AWS Associate-level certification. The exam itself is 65 questions, with 170 minutes to complete.
Starting Point
Review the AWS Certified Big Data Specialty page: https://aws.amazon.com/certification/certified-big-data-specialty/
This page includes a PDF link to the Exam Guide, which covers the above domains in more detail.
Next Steps
If your focus has been infrastructure, development, or another non-Big-Data track, you’ll want to start with some basics to get a good feel for what the exam will focus on. Firstly, I’d strongly encourage watching “AWS re:Invent 2017: Big Data Architectural Patterns and Best Practices on AWS (ABD201)”. The session is presented by Siva Raghupathy, who has been at AWS for nearly 10 years, and it shows in the breadth and depth of his knowledge of AWS’s Big Data offerings.
FAQs
AWS’s FAQ sections are always worth reading as part of any pre-sales, architecting, or exam study. For this exam, I’d recommend the following FAQs to get started:
- EMR: https://aws.amazon.com/emr/faqs/
- Data Pipeline: https://aws.amazon.com/datapipeline/faqs/
- Kinesis: https://aws.amazon.com/kinesis/data-streams/faqs/
- Redshift: https://aws.amazon.com/redshift/faqs/
- DynamoDB: https://aws.amazon.com/dynamodb/faqs/
- IAM: https://aws.amazon.com/iam/faqs/
Online Study Resources
AWS have a list of resources here: https://aws.amazon.com/blogs/big-data/getting-started-training-resources-for-big-data-on-aws/
I followed the A Cloud Guru course for the Certified Big Data Specialty: https://acloud.guru/learn/aws-certified-big-data-specialty
There’s also a Linux Academy course available here: https://linuxacademy.com/amazon-web-services/training/course/name/aws-certified-big-data
And a Pluralsight course available here: https://www.pluralsight.com/courses/big-data-amazon-web-services
re:Invent Videos
I watched the following re:Invent videos on YouTube to supplement my online learning; to find each one, search for “reinvent” followed by the session code and year (e.g. “reinvent ABD304 2017”):
Redshift
BDM402 2016
ABD304 2017
GPSTEC315 2017
EMR
BDT305 2015
BDM401 2016
ABD305 2017
DynamoDB
DAT310-R 2017
Kinesis
ABD301 2017
Free Answers
I’m not here to give you free answers, but keep an eye out for the following phrases/learning points as you approach the exam:
1) Redshift Distribution Keys
2) Redshift Star Schemas
3) Redshift data loading: slices
4) Machine Learning models: what they’re called, what they’re used for, and (briefly) what their outputs are
5) Have a clear understanding of the difference between Kinesis Streams and Kinesis Firehose
6) EMR:
a. How/when a metastore is used
b. Why consistent views are used
7) DynamoDB architecture:
a. Performance considerations
b. GSI/LSI use
8) Security:
a. S3 bucket ACLs
b. Sending log data to CloudWatch/Elasticsearch
c. Encryption options on EMR
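On the Kinesis learning point above, the practical difference shows up in capacity planning: Firehose scales automatically, while with Streams you size the shard count yourself. A minimal sketch of that arithmetic, assuming the published per-shard limits (1 MB/s or 1,000 records/s in, 2 MB/s out); the function name and example figures are my own:

```python
import math

# Assumed Kinesis Data Streams per-shard limits (check current quotas):
# ingest: 1 MB/s and 1,000 records/s; egress: 2 MB/s.
WRITE_MB_PER_SHARD = 1.0
RECORDS_PER_SHARD = 1000
READ_MB_PER_SHARD = 2.0

def shards_needed(write_mb_s: float, records_s: int, read_mb_s: float) -> int:
    """Smallest shard count satisfying all three per-shard limits."""
    return max(
        math.ceil(write_mb_s / WRITE_MB_PER_SHARD),
        math.ceil(records_s / RECORDS_PER_SHARD),
        math.ceil(read_mb_s / READ_MB_PER_SHARD),
        1,  # a stream always has at least one shard
    )

# e.g. 5 MB/s inbound across 4,000 records/s, consumers reading 10 MB/s total
print(shards_needed(5.0, 4000, 10.0))  # → 5
```

Here the egress requirement (10 / 2 = 5 shards) is the binding constraint; with Firehose none of this sizing applies, which is exactly the sort of distinction the exam probes.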