Pass Your SnowPro Advanced DEA-C01 Exam on Dec 24, 2024 with 132 Questions
DEA-C01 Free Exam Study Guide! (Updated 132 Questions)
NEW QUESTION # 79
True or false: changing the retention period for your account or for individual objects changes the value for all lower-level objects that do not have a retention period explicitly set.
- A. TRUE
- B. FALSE
Answer: A
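The inheritance rule in this question can be pictured with a small toy model. This is only a sketch in plain Python, not Snowflake code: the `Node` class, the object names, and the fallback default of 1 day are assumptions made for illustration.

```python
from typing import Optional


class Node:
    """Toy object (account, database, schema, or table) in a hierarchy."""

    def __init__(self, name: str, parent: Optional["Node"] = None,
                 retention_days: Optional[int] = None):
        self.name = name
        self.parent = parent
        # None means "no retention period explicitly set on this object".
        self.retention_days = retention_days

    def effective_retention(self) -> int:
        # Walk upward until an object with an explicit value is found.
        node = self
        while node is not None:
            if node.retention_days is not None:
                return node.retention_days
            node = node.parent
        return 1  # assumed fallback default for this sketch


account = Node("account", retention_days=1)
database = Node("db", parent=account)
schema = Node("schema", parent=database)
inherited = Node("t_inherited", parent=schema)            # nothing set
pinned = Node("t_pinned", parent=schema, retention_days=7)  # explicit value

before = (inherited.effective_retention(), pinned.effective_retention())
account.retention_days = 30  # change the value at the account level
after = (inherited.effective_retention(), pinned.effective_retention())
print(before, after)
```

In this model the table without an explicit value follows the account-level change, while the table with its own value keeps it.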
NEW QUESTION # 80
A Data Engineer defines the following masking policy:
....
The policy must be applied to the full_name column in the customer table.
Which query will apply the masking policy on the full_name column?
- A. ALTER TABLE customer MODIFY COLUMN first_name ADD MASKING POLICY name_policy;
- B. ALTER TABLE customer MODIFY COLUMN full_name SET MASKING POLICY name_policy;
- C. ALTER TABLE customer MODIFY COLUMN full_name ADD MASKING POLICY name_policy;
- D. ALTER TABLE customer MODIFY COLUMN first_name SET MASKING POLICY name_policy; last_name SET MASKING POLICY name_policy;
Answer: B
Explanation:
The query that applies the masking policy to the full_name column is ALTER TABLE customer MODIFY COLUMN full_name SET MASKING POLICY name_policy;. It modifies the full_name column and associates it with the name_policy masking policy, so the policy's masking logic is applied whenever the column is queried. The other options are incorrect. Options A and C use ADD instead of SET, which is not valid syntax for attaching a masking policy to a column, and option A also targets first_name rather than full_name. Option D applies the policy to two other columns, first_name and last_name, instead of the required full_name column.
NEW QUESTION # 81
A company created an extract, transform, and load (ETL) data pipeline in AWS Glue. A data engineer must crawl a table that is in Microsoft SQL Server. The data engineer needs to extract, transform, and load the output of the crawl to an Amazon S3 bucket. The data engineer also must orchestrate the data pipeline.
Which AWS service or feature will meet these requirements MOST cost-effectively?
- A. Amazon Managed Workflows for Apache Airflow (Amazon MWAA)
- B. AWS Glue Studio
- C. AWS Glue workflows
- D. AWS Step Functions
Answer: C
Explanation:
https://aws.amazon.com/blogs/big-data/orchestrate-an-etl-pipeline-using-aws-glue-workflows-triggers-and-crawlers-with-custom-classifiers/
https://aws.amazon.com/blogs/big-data/extracting-multidimensional-data-from-microsoft-sql-server-analysis-services-using-aws-glue/
NEW QUESTION # 82
When created, a stream logically takes an initial snapshot of every row in the source object and the contents of a stream change as DML statements execute on the source table.
Sophie, a Data Engineer, created a view that queries the table and returns the CURRENT_USER and CURRENT_TIMESTAMP values for the query transaction. A stream has been created on the view to capture change data (CDC).
Tony, another user, inserted data, e.g.:
insert into <table> values (1),(2),(3);
Emily, another user, also inserted data, e.g.:
insert into <table> values (4),(5),(6);
What will happen when a different user queries the same stream after 1 hour?
- A. All six records would be displayed with the user 'Sophie', who is the owner of the view.
- B. All six records would be displayed with the CURRENT_USER and CURRENT_TIMESTAMP values at the time the stream is queried.
- C. All six records would be shown with METADATA$ACTION as 'INSERT', of which 3 records would be displayed with username 'Tony' and the other 3 with username 'Emily'.
- D. The user displayed would be the one who queried during the session, but the recorded timestamp would be from 1 hour earlier, i.e. the actual record insertion time.
Answer: B
Explanation:
When a user queries the stream, the stream returns the username of the user running the query. The stream also returns the current timestamp of the query transaction in each row, not the timestamp at which each row was inserted.
NEW QUESTION # 83
Streams cannot be created to query change data on which of the following objects? [Select All that Apply]
- A. Directory tables
- B. External tables
- C. Views, including secure views
- D. Query Log Tables
- E. Standard tables, including shared tables.
Answer: D
Explanation:
Streams support all of the listed objects except query log tables.
NEW QUESTION # 84
Ira, a Data Engineer with TESLA IT systems, is comparing traditional partitioning with Snowflake micro-partitions for a Snowflake project implementation. Which one of the following reflects an incorrect understanding of micro-partitioning on Ira's part?
- A. All DML operations (e.g. DELETE, UPDATE, MERGE) take advantage of the underlying micro-partition metadata to facilitate and simplify table maintenance.
- B. The micro-partition metadata maintained by Snowflake enables precise pruning of columns in micro-partitions at query run-time, including columns containing semi-structured data.
- C. In Snowflake, as data is inserted/loaded into a table, clustering metadata is collected and recorded for each micro-partition created during the process.
- D. All data in Snowflake tables is automatically divided into micro-partitions, which are contiguous units of storage, whereas traditional partitioning requires specialized DDL.
- E. Snowflake stores metadata about all rows stored in a micro-partition, including number of distinct columns.
Answer: E
Explanation:
What are Micro-partitions?
All data in Snowflake tables is automatically divided into micro-partitions, which are contiguous units of storage. Each micro-partition contains between 50 MB and 500 MB of uncompressed data (note that the actual size in Snowflake is smaller because data is always stored compressed). Groups of rows in tables are mapped into individual micro-partitions, organized in a columnar fashion. This size and structure allow for extremely granular pruning of very large tables, which can be comprised of millions, or even hundreds of millions, of micro-partitions.
Snowflake stores metadata about all rows stored in a micro-partition, including:
- The range of values for each of the columns in the micro-partition.
- The number of distinct values.
- Additional properties used for both optimization and efficient query processing.
It does not store the number of distinct columns as part of this metadata, which is why option E is the incorrect statement.
The rest of the statements are correct.
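To make the pruning idea concrete, here is a toy model of how per-partition value ranges let a query skip partitions. It is a sketch only: the `MicroPartition` class, the partition names, and the numbers are invented for illustration and say nothing about Snowflake's internal implementation.

```python
from dataclasses import dataclass


@dataclass
class MicroPartition:
    name: str
    min_value: int        # lowest value of the column in this partition
    max_value: int        # highest value of the column in this partition
    distinct_values: int  # number of distinct values (metadata only)


def prune(partitions, lo, hi):
    """Keep only partitions whose [min, max] range overlaps [lo, hi]."""
    return [p for p in partitions if p.max_value >= lo and p.min_value <= hi]


partitions = [
    MicroPartition("mp1", 1, 100, 100),
    MicroPartition("mp2", 101, 200, 80),
    MicroPartition("mp3", 150, 300, 120),
]

# A filter such as "WHERE col BETWEEN 180 AND 250" only needs the
# partitions whose value range can contain matching rows.
to_scan = [p.name for p in prune(partitions, 180, 250)]
print(to_scan)
```

Only the range metadata is consulted here; the partition contents are never read, which is the point of pruning.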
NEW QUESTION # 85
A data engineer notices that Amazon Athena queries are held in a queue before the queries run.
How can the data engineer prevent the queries from queueing?
- A. Add the users who run the Athena queries to an existing workgroup.
- B. Increase the query result limit.
- C. Configure provisioned capacity for an existing workgroup.
- D. Use federated queries.
Answer: C
Explanation:
https://aws.amazon.com/blogs/aws/introducing-athena-provisioned-capacity/
NEW QUESTION # 86
A company is designing a data lake on Amazon S3. To ensure high performance when accessing the data, which best practice should the company adopt in organizing its data in the S3 bucket?
- A. Partition data based on commonly accessed attributes and use a consistent naming scheme for prefixes.
- B. Use a flat structure by avoiding the creation of any prefix or "folder" hierarchy.
- C. Enable S3 Transfer Acceleration to ensure data is quickly accessible from any location.
- D. Store all data files as a single large file and use AWS Lambda to parse required data segments.
Answer: A
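As an illustration of option A, the sketch below builds object keys with a consistent, partition-style prefix. The dataset name, the `year=/month=/day=` layout, and the file name are assumptions chosen for the example; the right partition attributes depend on how the data is actually queried.

```python
from datetime import date


def partitioned_key(dataset: str, day: date, filename: str) -> str:
    """Build an S3 object key with a consistent date-partitioned prefix."""
    return (
        f"{dataset}/year={day.year:04d}/month={day.month:02d}/"
        f"day={day.day:02d}/{filename}"
    )


key = partitioned_key("sales", date(2024, 12, 24), "part-0000.parquet")
print(key)
```

A query engine that understands this layout can read only the prefixes matching a date filter instead of listing the whole bucket.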
NEW QUESTION # 87
A Data Engineer is working on a Snowflake deployment in AWS eu-west-1 (Ireland). The Engineer is planning to load data from staged files into target tables using the COPY INTO command. Which sources are valid? (Select THREE)
- A. SSO attached to an Amazon EC2 instance on AWS eu-west-1 (Ireland)
- B. Internal stage on AWS eu-central-1 (Frankfurt)
- C. External stage in an Amazon S3 bucket on AWS eu-central-1 (Frankfurt)
- D. Internal stage on GCP us-central1 (Iowa)
- E. External stage in an Amazon S3 bucket on AWS eu-west-1 (Ireland)
- F. External stage on GCP us-central1 (Iowa)
Answer: C,E,F
Explanation:
The valid sources for loading data from staged files into target tables using the COPY INTO command are:
- External stage on GCP us-central1 (Iowa): This is a valid source because Snowflake supports cross-cloud data loading from external stages on different cloud platforms and regions than the Snowflake deployment.
- External stage in an Amazon S3 bucket on AWS eu-west-1 (Ireland): This is a valid source because Snowflake supports data loading from external stages on the same cloud platform and region as the Snowflake deployment.
- External stage in an Amazon S3 bucket on AWS eu-central-1 (Frankfurt): This is a valid source because Snowflake supports cross-region data loading from external stages on different regions than the Snowflake deployment within the same cloud platform.
The invalid sources are:
- Internal stage on GCP us-central1 (Iowa): This is an invalid source because internal stages are always located on the same cloud platform and region as the Snowflake deployment. Therefore, an internal stage on GCP us-central1 (Iowa) cannot be used for a Snowflake deployment on AWS eu-west-1 (Ireland).
- Internal stage on AWS eu-central-1 (Frankfurt): This is an invalid source because internal stages are always located on the same region as the Snowflake deployment. Therefore, an internal stage on AWS eu-central-1 (Frankfurt) cannot be used for a Snowflake deployment on AWS eu-west-1 (Ireland).
- SSO attached to an Amazon EC2 instance on AWS eu-west-1 (Ireland): This is an invalid source because SSO stands for Single Sign-On, which is a security integration feature in Snowflake, not a data staging option.
NEW QUESTION # 88
A data engineer uses Amazon Redshift to run resource-intensive analytics processes once every month. Every month, the data engineer creates a new Redshift provisioned cluster. The data engineer deletes the Redshift provisioned cluster after the analytics processes are complete every month. Before the data engineer deletes the cluster each month, the data engineer unloads backup data from the cluster to an Amazon S3 bucket.
The data engineer needs a solution to run the monthly analytics processes that does not require the data engineer to manage the infrastructure manually.
Which solution will meet these requirements with the LEAST operational overhead?
- A. Use Amazon Redshift Serverless to automatically process the analytics workload.
- B. Use AWS Step Functions to pause the Redshift cluster when the analytics processes are complete and to resume the cluster to run new processes every month.
- C. Use the AWS CLI to automatically process the analytics workload.
- D. Use AWS CloudFormation templates to automatically process the analytics workload.
Answer: A
Explanation:
Use Amazon Redshift Serverless. This option allows the data engineer to focus on the analytics processes themselves without worrying about cluster provisioning, scaling, or management. It provides an on-demand, serverless solution that can handle variable workloads and is cost-effective for intermittent and irregular processing needs like those described.
NEW QUESTION # 89
Which Snowflake feature facilitates access to external API services such as geocoders, data transformation, machine learning models, and other custom code?
- A. External tables
- B. Security integration
- C. Java User-Defined Functions (UDFs)
- D. External functions
Answer: D
Explanation:
External functions are Snowflake functions that facilitate access to external API services such as geocoders, data transformation, machine learning models and other custom code. External functions allow users to invoke external services from within SQL queries and pass arguments and receive results as JSON values. External functions require creating an API integration object and an external function object in Snowflake, as well as deploying an external service endpoint that can communicate with Snowflake via HTTPS.
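The JSON exchange described above can be sketched from the remote service's side. This is a minimal, hedged example: it assumes the batch format in which each row is sent as `[row_number, argument, ...]` under a top-level `"data"` key and results are returned in the same shape. The `handle_request` function and the uppercase "transformation" are hypothetical stand-ins for a real service such as a geocoder.

```python
import json


def handle_request(body: str) -> str:
    """Process one batch of rows and return one result per row number."""
    rows = json.loads(body)["data"]
    # Each input row is [row_number, argument]; keep the row number so the
    # caller can match results back to the rows it sent.
    results = [[row[0], str(row[1]).upper()] for row in rows]
    return json.dumps({"data": results})


# Example batch as the service might receive it (values are made up).
request_body = json.dumps({"data": [[0, "dublin"], [1, "frankfurt"]]})
response = json.loads(handle_request(request_body))
print(response)
```

The API integration, authentication, and HTTPS endpoint deployment are deliberately left out of this sketch.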
NEW QUESTION # 90
A company has a production AWS account that runs company workloads. The company's security team created a security AWS account to store and analyze security logs from the production AWS account. The security logs in the production AWS account are stored in Amazon CloudWatch Logs.
The company needs to use Amazon Kinesis Data Streams to deliver the security logs to the security AWS account.
Which solution will meet these requirements?
- A. Create a destination data stream in the production AWS account. In the production AWS account, create an IAM role that has cross-account permissions to Kinesis Data Streams in the security AWS account.
- B. Create a destination data stream in the security AWS account. Create an IAM role and a trust policy to grant CloudWatch Logs the permission to put data into the stream. Create a subscription filter in the production AWS account.
- C. Create a destination data stream in the production AWS account. In the security AWS account, create an IAM role that has cross-account permissions to Kinesis Data Streams in the production AWS account.
- D. Create a destination data stream in the security AWS account. Create an IAM role and a trust policy to grant CloudWatch Logs the permission to put data into the stream. Create a subscription filter in the security AWS account.
Answer: B
NEW QUESTION # 91
Which are the two ways to access elements in a JSON object?
- A. Use dot notation to traverse a path in a JSON object: <column>:<level1_element>.<level2_element>.<level3_element>
- B. Use bracket notation to traverse the path in an object: <column>['<level1_element>']['<level2_element>']
- C. Use semicolon notation to traverse a path in a JSON object: <column>:<level1_element>;<level2_element>;<level3_element>
- D. Use curly bracket notation to traverse the path in an object: <column>{'<level1_element>'}{'<level2_element>'}
Answer: A,B
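The two valid notations address the same element, which the small helper below illustrates by rewriting a dot-notation path as a bracket-notation path. It is a sketch only: `dot_to_bracket` is a hypothetical helper, the column name `src` and the element names are made up, and it assumes simple element names that contain no dots or quotes.

```python
def dot_to_bracket(path: str) -> str:
    """Rewrite <column>:<a>.<b>.<c> as <column>['a']['b']['c']."""
    column, _, rest = path.partition(":")
    if not rest:
        return column  # no path after the column name
    return column + "".join(f"['{part}']" for part in rest.split("."))


converted = dot_to_bracket("src:customer.address.city")
print(converted)
```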
NEW QUESTION # 92
A financial services company stores financial data in Amazon Redshift. A data engineer wants to run real-time queries on the financial data to support a web-based trading application. The data engineer wants to run the queries from within the trading application.
Which solution will meet these requirements with the LEAST operational overhead?
- A. Set up Java Database Connectivity (JDBC) connections to Amazon Redshift.
- B. Establish WebSocket connections to Amazon Redshift.
- C. Use the Amazon Redshift Data API.
- D. Store frequently accessed data in Amazon S3. Use Amazon S3 Select to run the queries.
Answer: C
Explanation:
https://docs.aws.amazon.com/redshift/latest/mgmt/data-api.html
NEW QUESTION # 93
The Snowpipe API provides a REST endpoint for defining the list of files to ingest, which informs Snowflake about the files to be ingested into a table. A successful response from this endpoint means that Snowflake has recorded the list of files to add to the table. It does not necessarily mean the files have been ingested. What is the name of this endpoint?
- A. REST endpoints --> insertFiles
- B. REST endpoints --> ingestFiles
- C. REST endpoints --> insertReport
- D. REST endpoints --> loadHistoryScan
Answer: A
Explanation:
The Snowpipe API provides a REST endpoint for defining the list of files to ingest.
Endpoint: insertFiles
Informs Snowflake about the files to be ingested into a table. A successful response from this endpoint means that Snowflake has recorded the list of files to add to the table. It does not necessarily mean the files have been ingested.
In most cases, Snowflake inserts fresh data into the target table within a few minutes.
For more about the Snowflake REST API used for data file ingestion, see:
https://docs.snowflake.com/en/user-guide/data-load-snowpipe-rest-apis.html#data-file-ingestion
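As a rough sketch of what a call to insertFiles involves, the code below only assembles the request URL and JSON body and does not send anything. The account name, pipe name, and file paths are placeholders, `build_insert_files_request` is a hypothetical helper, and the required authentication header is omitted; check the linked documentation for the authoritative request format.

```python
import json
import uuid


def build_insert_files_request(account: str, pipe: str, paths: list) -> tuple:
    """Return (url, body) for a Snowpipe insertFiles call, without sending it."""
    # requestId is a caller-generated identifier used to track the request.
    url = (
        f"https://{account}.snowflakecomputing.com/v1/data/pipes/"
        f"{pipe}/insertFiles?requestId={uuid.uuid4()}"
    )
    # The body lists the staged files that Snowflake should queue for loading.
    body = json.dumps({"files": [{"path": p} for p in paths]})
    return url, body


url, body = build_insert_files_request(
    "myaccount",                      # placeholder account identifier
    "mydb.myschema.mypipe",           # placeholder fully qualified pipe name
    ["data/file1.csv", "data/file2.csv"],
)
print(url)
print(body)
```

A successful response to such a request only confirms that the file list was recorded; the load itself happens afterwards.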
NEW QUESTION # 94
A Data Engineer wants to create a new development database (DEV) as a clone of the permanent production database (PROD). There is a requirement to disable Fail-safe for all tables.
Which command will meet these requirements?
- A. CREATE DATABASE DEV CLONE PROD FAIL_SAFE = FALSE;
- B. CREATE DATABASE DEV CLONE PROD DATA_RETENTION_TIME_IN_DAYS = 0;
- C. CREATE DATABASE DEV CLONE PROD;
- D. CREATE TRANSIENT DATABASE DEV CLONE PROD;
Answer: D
Explanation:
This option will meet the requirements of creating a new development database (DEV) as a clone of the permanent production database (PROD) and disabling Fail-safe for all tables. By using the CREATE TRANSIENT DATABASE command, the Data Engineer can create a transient database that does not have Fail-safe enabled by default. Fail-safe is a feature in Snowflake that provides additional protection against data loss by retaining historical data for seven days beyond the time travel retention period. Transient databases do not have Fail-safe enabled, which means that they do not incur additional storage costs for historical data beyond their time travel retention period. By using the CLONE option, the Data Engineer can create an exact copy of the PROD database, including its schemas, tables, views, and other objects.
NEW QUESTION # 95
Which use case would be BEST suited for the search optimization service?
- A. Business users who need fast response times using highly selective filters
- B. Data Scientists who seek specific JOIN statements with large volumes of data
- C. Analysts who need to perform aggregates over high cardinality columns
- D. Data Engineers who create clustered tables with frequent reads against clustering keys
Answer: A
Explanation:
The use case best suited to the search optimization service is business users who need fast response times using highly selective filters. The search optimization service speeds up selective lookup queries on large tables by maintaining a persistent search access path that tracks which micro-partitions may contain given column values. It helps most on high cardinality columns, which have a large number of distinct values, such as customer IDs, product SKUs, or email addresses. Queries that use highly selective filters on such columns can locate the relevant rows without scanning the entire table. The other options are not as well suited. Option B is incorrect because data scientists who run specific JOIN statements over large volumes of data still need join operations that process large amounts of data. Option C is incorrect because analysts who need to perform aggregates over high cardinality columns still need to scan all the rows that match the filter criteria. Option D is incorrect because data engineers who create clustered tables with frequent reads against clustering keys already have an efficient way to organize and access data based on those keys.
NEW QUESTION # 96
A company stores logs in an Amazon S3 bucket. When a data engineer attempts to access several log files, the data engineer discovers that some files have been unintentionally deleted.
The data engineer needs a solution that will prevent unintentional file deletion in the future.
Which solution will meet this requirement with the LEAST operational overhead?
- A. Use an Amazon S3 Glacier storage class to archive the data that is in the S3 bucket.
- B. Enable S3 Versioning for the S3 bucket.
- C. Manually back up the S3 bucket on a regular basis.
- D. Configure replication for the S3 bucket.
Answer: B
NEW QUESTION # 97
......
DEA-C01 Dumps for SnowPro Advanced Certified Exam Questions and Answer: https://www.examdiscuss.com/Snowflake/exam/DEA-C01/
Realistic Verified DEA-C01 exam dumps Q&As - DEA-C01 Free Update: https://drive.google.com/open?id=1hNfQdGOM66YvLcYZScmR6vxQtoR3hnT7