Exam Data-Engineer-Associate Topic 1 Question 86 Discussion
Actual exam question for Amazon's Data-Engineer-Associate exam
Question #: 86
Topic #: 1
A company stores server logs in an Amazon S3 bucket. The company needs to keep the logs for 1 year. The logs are not required after 1 year.
A data engineer needs a solution to automatically delete logs that are older than 1 year.
Which solution will meet these requirements with the LEAST operational overhead?
Suggested Answer: B
* Problem Analysis:
  * The company uses AWS Glue for ETL pipelines and requires automatic data quality checks during pipeline execution.
  * The solution must integrate with existing AWS Glue pipelines and evaluate data quality rules based on predefined thresholds.
* Key Considerations:
  * Ensure minimal implementation effort by leveraging built-in AWS Glue features.
  * Use a standardized approach for defining and evaluating data quality rules.
  * Avoid custom libraries or external frameworks unless absolutely necessary.
* Solution Analysis:
  * Option A: SQL Transform
    * Adding SQL transforms to define and evaluate data quality rules is possible but requires writing a complex query for each rule.
    * Increases operational overhead and deviates from Glue's declarative approach.
  * Option B: Evaluate Data Quality Transform with DQDL
    * AWS Glue provides a built-in Evaluate Data Quality transform.
    * Allows defining rules in Data Quality Definition Language (DQDL), a concise and declarative way to define quality checks.
    * Fully integrated with Glue Studio, making it the lowest-effort solution.
  * Option C: Custom Transform with PyDeequ
    * PyDeequ is a powerful library for data quality checks but requires custom code and integration.
    * Increases implementation effort compared to Glue's native capabilities.
  * Option D: Custom Transform with Great Expectations
    * Great Expectations is another powerful library for data quality but adds complexity and external dependencies.
* Final Recommendation:
  * Use the Evaluate Data Quality transform in AWS Glue.
  * Define rules in DQDL for checking thresholds, null values, or other quality criteria.
  * This approach minimizes development effort and ensures seamless integration with AWS Glue.
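
To make the DQDL recommendation above more concrete, here is a minimal Python sketch that assembles a DQDL ruleset string covering thresholds and null checks. The helper `build_dqdl_ruleset`, the column names (`order_id`, `customer_email`, `status`), and the threshold values are all hypothetical and chosen only for illustration. The rule types used (`RowCount`, `IsComplete`, `IsUnique`, `Completeness`, `ColumnValues`) are standard DQDL rule types, but the exact rules a real pipeline needs will differ.

```python
def build_dqdl_ruleset(rules):
    """Join individual DQDL rule strings into a single `Rules = [...]` block.

    Hypothetical helper for illustration; not part of any AWS library.
    """
    if not rules:
        raise ValueError("a DQDL ruleset needs at least one rule")
    body = ",\n".join(f"    {rule.strip()}" for rule in rules)
    return f"Rules = [\n{body}\n]"


# Assumed columns for an example orders dataset.
ruleset = build_dqdl_ruleset([
    "RowCount > 0",                                          # dataset must not be empty
    'IsComplete "order_id"',                                 # no null order IDs
    'IsUnique "order_id"',                                   # no duplicate order IDs
    'Completeness "customer_email" > 0.95',                  # threshold: >95% non-null
    'ColumnValues "status" in ["NEW", "SHIPPED", "CANCELLED"]',  # allowed values only
])

print(ruleset)
```

In a Glue job, a string like this would typically be supplied as the ruleset of the Evaluate Data Quality transform, either pasted into the Glue Studio editor or passed from a script. The sketch only builds the text and does not call AWS.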
References:
AWS Glue Data Quality Overview
DQDL Syntax and Examples
Glue Studio Transformations
by Ruby at Feb 12, 2026, 07:19 PM