Medium · 1 mark · Multiple Choice
Domain 3.5: Data Ingestion · Performance · Glue · ETL

AWS SAA-C03 · Question 52 · Domain 3.5: Data Ingestion

A company receives daily CSV files in an Amazon S3 bucket. They need to automatically transform these files into Apache Parquet format and catalog the metadata so the data can be queried using Amazon Athena. Which AWS service should be used for the transformation and cataloging?

Answer options:

A. Amazon Kinesis Data Analytics

B. AWS Glue

C. AWS Data Pipeline

D. Amazon EMR

How to approach this question

Look for the keywords 'transform', 'catalog the metadata', and 'Athena'. AWS Glue is the serverless ETL service, and its Data Catalog is the metadata store that Athena uses for table definitions.

Full Answer

B. AWS Glue ✓ Correct
AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data. Glue Crawlers populate the Data Catalog, and Glue Jobs can transform CSV to Parquet.
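As a rough illustration of how the crawler and the job fit together, the sketch below builds the request parameters for the boto3 Glue client's `create_crawler` and `create_job` calls. The crawler name, job name, database name, schedule, role ARN, and S3 paths are all hypothetical placeholders, and the Parquet conversion itself would live in a separate Glue job script that is not shown here.

```python
def crawler_request(name, role_arn, database, s3_path):
    """Build keyword arguments for the boto3 Glue client's create_crawler call."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,  # Data Catalog database that receives the tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        "Schedule": "cron(0 2 * * ? *)",  # assumed schedule: once a day at 02:00 UTC
    }


def job_request(name, role_arn, script_location):
    """Build keyword arguments for the boto3 Glue client's create_job call."""
    return {
        "Name": name,
        "Role": role_arn,
        # "glueetl" selects a Spark ETL job; the script holds the CSV-to-Parquet logic.
        "Command": {
            "Name": "glueetl",
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        "GlueVersion": "4.0",
    }


def register_pipeline(glue, role_arn, raw_path, script_location):
    """Register a crawler and an ETL job using a Glue client passed in by the caller."""
    # All names below are hypothetical and chosen only for this example.
    glue.create_crawler(
        **crawler_request("daily-csv-crawler", role_arn, "sales_db", raw_path)
    )
    glue.create_job(**job_request("csv-to-parquet", role_arn, script_location))
```

In practice the `glue` argument would be `boto3.client("glue")`, which needs AWS credentials and an IAM role that Glue can assume. Passing the client in keeps the request-building logic easy to check without touching an AWS account.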

Common mistakes

Choosing EMR, which is powerful but not the most operationally efficient choice for simple serverless ETL and cataloging.
