DAS-C01 | All About High Quality DAS-C01 Real Exam

It is nearly impossible to pass the Amazon-Web-Services DAS-C01 exam in a short time without help. Come to Actualtests and find the most advanced, accurate, and guaranteed Amazon-Web-Services DAS-C01 practice questions. You will get surprising results from our AWS Certified Data Analytics - Specialty practice guides.

We also have free DAS-C01 dumps questions for you:

NEW QUESTION 1
A company analyzes its data in an Amazon Redshift data warehouse, which currently has a cluster of three dense storage nodes. Due to a recent business acquisition, the company needs to load an additional 4 TB of user data into Amazon Redshift. The engineering team will combine all the user data and apply complex calculations that require I/O intensive resources. The company needs to adjust the cluster's capacity to support the change in analytical and storage requirements.
Which solution meets these requirements?

  • A. Resize the cluster using elastic resize with dense compute nodes.
  • B. Resize the cluster using classic resize with dense compute nodes.
  • C. Resize the cluster using elastic resize with dense storage nodes.
  • D. Resize the cluster using classic resize with dense storage nodes.

Answer: C
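
For reference, a minimal boto3 sketch of triggering an elastic resize is shown below. The cluster identifier, node type, and node count are illustrative assumptions, not values from the question.

```python
import boto3

redshift = boto3.client("redshift")

# Elastic resize (Classic=False) changes capacity with minimal downtime.
# All values below are placeholders for the example.
redshift.resize_cluster(
    ClusterIdentifier="analytics-cluster",  # assumed cluster name
    ClusterType="multi-node",
    NodeType="ds2.xlarge",                  # keep the dense storage node type
    NumberOfNodes=6,                        # scale out for the extra 4 TB
    Classic=False,                          # False requests an elastic resize
)
```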

NEW QUESTION 2
A company has a data lake on AWS that ingests sources of data from multiple business units and uses Amazon Athena for queries. The storage layer is Amazon S3 using the AWS Glue Data Catalog. The company wants to make the data available to its data scientists and business analysts. However, the company first needs to manage data access for Athena based on user roles and responsibilities.
What should the company do to apply these access controls with the LEAST operational overhead?

  • A. Define security policy-based rules for the users and applications by role in AWS Lake Formation.
  • B. Define security policy-based rules for the users and applications by role in AWS Identity and Access Management (IAM).
  • C. Define security policy-based rules for the tables and columns by role in AWS Glue.
  • D. Define security policy-based rules for the tables and columns by role in AWS Identity and Access Management (IAM).

Answer: D
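
As a rough illustration of the IAM-based approach in the stated answer, the sketch below creates a policy scoped to an analyst role. The actions, ARNs, and names are assumptions chosen for the example only.

```python
import json
import boto3

iam = boto3.client("iam")

# Hypothetical policy limiting an analyst role to Athena queries plus
# read-only access to specific Glue Data Catalog tables and S3 data.
analyst_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "athena:StartQueryExecution",
                "athena:GetQueryExecution",
                "athena:GetQueryResults",
            ],
            "Resource": "*",
        },
        {
            "Effect": "Allow",
            "Action": ["glue:GetDatabase", "glue:GetTable", "glue:GetPartitions"],
            "Resource": [
                "arn:aws:glue:us-east-1:111122223333:catalog",
                "arn:aws:glue:us-east-1:111122223333:database/sales",
                "arn:aws:glue:us-east-1:111122223333:table/sales/*",
            ],
        },
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-data-lake",
                "arn:aws:s3:::example-data-lake/sales/*",
            ],
        },
    ],
}

response = iam.create_policy(
    PolicyName="AnalystAthenaAccess",
    PolicyDocument=json.dumps(analyst_policy),
)
iam.attach_role_policy(
    RoleName="DataAnalystRole",  # assumed role name
    PolicyArn=response["Policy"]["Arn"],
)
```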

NEW QUESTION 3
A company has an application that ingests streaming data. The company needs to analyze this stream over a 5-minute timeframe to evaluate the stream for anomalies with Random Cut Forest (RCF) and summarize the current count of status codes. The source and summarized data should be persisted for future use.
Which approach would enable the desired outcome while keeping data persistence costs low?

  • A. Ingest the data stream with Amazon Kinesis Data Streams. Have an AWS Lambda consumer evaluate the stream, collect the number of status codes, and evaluate the data against a previously trained RCF model. Persist the source and results as a time series to Amazon DynamoDB.
  • B. Ingest the data stream with Amazon Kinesis Data Streams. Have a Kinesis Data Analytics application evaluate the stream over a 5-minute window using the RCF function and summarize the count of status codes. Persist the source and results to Amazon S3 through output delivery to Kinesis Data Firehose.
  • C. Ingest the data stream with Amazon Kinesis Data Firehose with a delivery frequency of 1 minute or 1 MB in Amazon S3. Ensure Amazon S3 triggers an event to invoke an AWS Lambda consumer that evaluates the batch data, collects the number of status codes, and evaluates the data against a previously trained RCF model. Persist the source and results as a time series to Amazon DynamoDB.
  • D. Ingest the data stream with Amazon Kinesis Data Firehose with a delivery frequency of 5 minutes or 1 MB into Amazon S3. Have a Kinesis Data Analytics application evaluate the stream over a 1-minute window using the RCF function and summarize the count of status codes. Persist the results to Amazon S3 through a Kinesis Data Analytics output to an AWS Lambda integration.

Answer: B

NEW QUESTION 4
A company is hosting an enterprise reporting solution with Amazon Redshift. The application provides reporting capabilities to three main groups: an executive group to access financial reports, a data analyst group to run long-running ad-hoc queries, and a data engineering group to run stored procedures and ETL processes. The executive team requires queries to run with optimal performance. The data engineering team expects queries to take minutes.
Which Amazon Redshift feature meets the requirements for this task?

  • A. Concurrency scaling
  • B. Short query acceleration (SQA)
  • C. Workload management (WLM)
  • D. Materialized views

Answer: D
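
To illustrate the answer as stated, the sketch below creates a materialized view through the Amazon Redshift Data API. The cluster, database, user, and SQL are placeholder assumptions.

```python
import boto3

redshift_data = boto3.client("redshift-data")

# Precompute the executives' financial report so their dashboard queries
# read from the materialized view instead of re-running heavy joins.
redshift_data.execute_statement(
    ClusterIdentifier="reporting-cluster",  # assumed
    Database="analytics",                   # assumed
    DbUser="admin",                         # assumed
    Sql="""
        CREATE MATERIALIZED VIEW finance_summary AS
        SELECT fiscal_quarter, SUM(revenue) AS total_revenue
        FROM sales_facts
        GROUP BY fiscal_quarter;
    """,
)
```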

NEW QUESTION 5
An IoT company wants to release a new device that will collect data to track sleep overnight on an intelligent mattress. Sensors will send data that will be uploaded to an Amazon S3 bucket. About 2 MB of data is generated each night for each bed. Data must be processed and summarized for each user, and the results need to be available as soon as possible. Part of the process consists of time windowing and other functions. Based on tests with a Python script, every run will require about 1 GB of memory and will complete within a couple of minutes.
Which solution will run the script in the MOST cost-effective way?

  • A. AWS Lambda with a Python script
  • B. AWS Glue with a Scala job
  • C. Amazon EMR with an Apache Spark script
  • D. AWS Glue with a PySpark job

Answer: A
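
A minimal sketch of the Lambda side of this answer is shown below: a handler triggered by the nightly S3 upload that loads the object and runs the summarization. The bucket layout and the summarize() helper are hypothetical.

```python
import json
import boto3

s3 = boto3.client("s3")


def summarize(raw_bytes):
    # Placeholder for the existing Python logic (time windowing, per-user
    # aggregation); assumed to fit in ~1 GB of memory and a few minutes.
    return {"bytes_processed": len(raw_bytes)}


def lambda_handler(event, context):
    # S3 put events deliver the bucket and key of the uploaded sensor file.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        result = summarize(body)
        # Write the per-user summary next to the raw data (assumed prefix).
        s3.put_object(
            Bucket=bucket,
            Key=f"summaries/{key}.json",
            Body=json.dumps(result).encode("utf-8"),
        )
    return {"status": "ok"}
```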

NEW QUESTION 6
A company is sending historical datasets to Amazon S3 for storage. A data engineer at the company wants to make these datasets available for analysis using Amazon Athena. The engineer also wants to encrypt the Athena query results in an S3 results location by using AWS solutions for encryption. The requirements for encrypting the query results are as follows:
Use custom keys for encryption of the primary dataset query results.
Use generic encryption for all other query results.
Provide an audit trail for the primary dataset queries that shows when the keys were used and by whom.
Which solution meets these requirements?

  • A. Use server-side encryption with S3 managed encryption keys (SSE-S3) for the primary dataset. Use SSE-S3 for the other datasets.
  • B. Use server-side encryption with customer-provided encryption keys (SSE-C) for the primary dataset. Use server-side encryption with S3 managed encryption keys (SSE-S3) for the other datasets.
  • C. Use server-side encryption with AWS KMS managed customer master keys (SSE-KMS CMKs) for the primary dataset. Use server-side encryption with S3 managed encryption keys (SSE-S3) for the other datasets.
  • D. Use client-side encryption with AWS Key Management Service (AWS KMS) customer managed keys for the primary dataset. Use S3 client-side encryption with client-side keys for the other datasets.

Answer: A
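
Whichever option is chosen, the encryption of Athena query results is set on the result configuration (or on a workgroup). A hedged boto3 sketch follows; the query, database, output location, and KMS key are placeholders.

```python
import boto3

athena = boto3.client("athena")

# Generic encryption for query results: S3-managed keys (SSE_S3).
athena.start_query_execution(
    QueryString="SELECT COUNT(*) FROM historical_orders;",  # assumed query
    QueryExecutionContext={"Database": "history"},           # assumed database
    ResultConfiguration={
        "OutputLocation": "s3://example-athena-results/general/",
        "EncryptionConfiguration": {"EncryptionOption": "SSE_S3"},
    },
)

# For results that must use a customer-managed key with a CloudTrail audit
# trail, the same call accepts SSE_KMS plus a KMS key ARN, for example:
# "EncryptionConfiguration": {
#     "EncryptionOption": "SSE_KMS",
#     "KmsKey": "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
# }
```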

NEW QUESTION 7
A software company hosts an application on AWS, and new features are released weekly. As part of the application testing process, a solution must be developed that analyzes logs from each Amazon EC2 instance to ensure that the application is working as expected after each deployment. The collection and analysis solution should be highly available with the ability to display new information with minimal delays.
Which method should the company use to collect and analyze the logs?

  • A. Enable detailed monitoring on Amazon EC2, use Amazon CloudWatch agent to store logs in Amazon S3, and use Amazon Athena for fast, interactive log analytics.
  • B. Use the Amazon Kinesis Producer Library (KPL) agent on Amazon EC2 to collect and send data to Kinesis Data Streams to further push the data to Amazon Elasticsearch Service and visualize using Amazon QuickSight.
  • C. Use the Amazon Kinesis Producer Library (KPL) agent on Amazon EC2 to collect and send data to Kinesis Data Firehose to further push the data to Amazon Elasticsearch Service and Kibana.
  • D. Use Amazon CloudWatch subscriptions to get access to a real-time feed of logs and have the logs delivered to Amazon Kinesis Data Streams to further push the data to Amazon Elasticsearch Service and Kibana.

Answer: D
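
The answer hinges on CloudWatch Logs subscription filters streaming log events to Kinesis Data Streams. A minimal boto3 sketch is below; the log group, stream ARN, and IAM role are assumptions.

```python
import boto3

logs = boto3.client("logs")

# Stream every new log event from the application log group into a
# Kinesis data stream for downstream delivery to Elasticsearch/Kibana.
logs.put_subscription_filter(
    logGroupName="/app/web-tier",  # assumed log group
    filterName="app-logs-to-kinesis",
    filterPattern="",              # empty pattern forwards all events
    destinationArn="arn:aws:kinesis:us-east-1:111122223333:stream/app-logs",
    roleArn="arn:aws:iam::111122223333:role/CWLtoKinesisRole",  # assumed role
)
```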

NEW QUESTION 8
A company that monitors weather conditions from remote construction sites is setting up a solution to collect temperature data from the following two weather stations.
• Station A, which has 10 sensors
• Station B, which has five sensors
These weather stations were placed by onsite subject-matter experts.
Each sensor has a unique ID. The data collected from each sensor will be collected using Amazon Kinesis Data Streams.
Based on the total incoming and outgoing data throughput, a single Amazon Kinesis data stream with two shards is created. Two partition keys are created based on the station names. During testing, there is a bottleneck on data coming from Station A, but not from Station B. Upon review, it is confirmed that the total stream throughput is still less than the allocated Kinesis Data Streams throughput.
How can this bottleneck be resolved without increasing the overall cost and complexity of the solution, while retaining the data collection quality requirements?

  • A. Increase the number of shards in Kinesis Data Streams to increase the level of parallelism.
  • B. Create a separate Kinesis data stream for Station A with two shards, and stream Station A sensor data to the new stream.
  • C. Modify the partition key to use the sensor ID instead of the station name.
  • D. Reduce the number of sensors in Station A from 10 to 5 sensors.

Answer: C

Explanation:
https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-resharding.html
"Splitting increases the number of shards in your stream and therefore increases the data capacity of the stream. Because you are charged on a per-shard basis, splitting increases the cost of your stream"

NEW QUESTION 9
A company has 1 million scanned documents stored as image files in Amazon S3. The documents contain typewritten application forms with information including the applicant first name, applicant last name, application date, application type, and application text. The company has developed a machine learning algorithm to extract the metadata values from the scanned documents. The company wants to allow internal data analysts to analyze and find applications using the applicant name, application date, or application text. The original images should also be downloadable. Cost control is secondary to query performance.
Which solution organizes the images and metadata to drive insights while meeting the requirements?

  • A. For each image, use object tags to add the metadata. Use Amazon S3 Select to retrieve the files based on the applicant name and application date.
  • B. Index the metadata and the Amazon S3 location of the image file in Amazon Elasticsearch Service. Allow the data analysts to use Kibana to submit queries to the Elasticsearch cluster.
  • C. Store the metadata and the Amazon S3 location of the image file in an Amazon Redshift table. Allow the data analysts to run ad-hoc queries on the table.
  • D. Store the metadata and the Amazon S3 location of the image files in an Apache Parquet file in Amazon S3, and define a table in the AWS Glue Data Catalog. Allow data analysts to use Amazon Athena to submit custom queries.

Answer: B

Explanation:
https://aws.amazon.com/blogs/machine-learning/automatically-extract-text-and-structured-data-from-documents
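
As a sketch of the indexing side of this answer, the snippet below posts one document of extracted metadata (plus the S3 location of the image) to an Elasticsearch index over its REST API. The endpoint, index name, and fields are assumptions, and request signing/authentication is omitted.

```python
import requests

ES_ENDPOINT = "https://search-example-domain.us-east-1.es.amazonaws.com"  # assumed

document = {
    "applicant_first_name": "Jane",
    "applicant_last_name": "Doe",
    "application_date": "2020-03-15",
    "application_type": "permit",
    "application_text": "sample extracted form text",
    "s3_location": "s3://example-scans/forms/0001.png",
}

# Index into an "applications" index; analysts can then query it from Kibana.
response = requests.post(
    f"{ES_ENDPOINT}/applications/_doc",
    json=document,
    timeout=10,
)
response.raise_for_status()
```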

NEW QUESTION 10
A hospital uses wearable medical sensor devices to collect data from patients. The hospital is architecting a near-real-time solution that can ingest the data securely at scale. The solution should also be able to remove the patient’s protected health information (PHI) from the streaming data and store the data in durable storage.
Which solution meets these requirements with the least operational overhead?

  • A. Ingest the data using Amazon Kinesis Data Streams, which invokes an AWS Lambda function using the Kinesis Client Library (KCL) to remove all PHI. Write the data to Amazon S3.
  • B. Ingest the data using Amazon Kinesis Data Firehose to write the data to Amazon S3. Have Amazon S3 trigger an AWS Lambda function that parses the sensor data to remove all PHI in Amazon S3.
  • C. Ingest the data using Amazon Kinesis Data Streams to write the data to Amazon S3. Have the data stream launch an AWS Lambda function that parses the sensor data and removes all PHI in Amazon S3.
  • D. Ingest the data using Amazon Kinesis Data Firehose to write the data to Amazon S3. Implement a transformation AWS Lambda function that parses the sensor data to remove all PHI.

Answer: D

Explanation:
https://aws.amazon.com/blogs/big-data/persist-streaming-data-to-amazon-s3-using-amazon-kinesis-firehose-and
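
The answer relies on the Kinesis Data Firehose record transformation feature, which invokes a Lambda function on batches of records before delivery to S3. A hedged sketch of such a function is below; the PHI field names are assumptions.

```python
import base64
import json

PHI_FIELDS = {"patient_name", "date_of_birth", "address"}  # assumed PHI keys


def lambda_handler(event, context):
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))
        # Drop PHI attributes before the record is delivered to Amazon S3.
        cleaned = {k: v for k, v in payload.items() if k not in PHI_FIELDS}
        output.append(
            {
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(
                    json.dumps(cleaned).encode("utf-8")
                ).decode("utf-8"),
            }
        )
    return {"records": output}
```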

NEW QUESTION 11
A company hosts an on-premises PostgreSQL database that contains historical data. An internal legacy application uses the database for read-only activities. The company’s business team wants to move the data to a data lake in Amazon S3 as soon as possible and enrich the data for analytics.
The company has set up an AWS Direct Connect connection between its VPC and its on-premises network. A data analytics specialist must design a solution that achieves the business team’s goals with the least operational overhead.
Which solution meets these requirements?

  • A. Upload the data from the on-premises PostgreSQL database to Amazon S3 by using a customized batch upload process. Use the AWS Glue crawler to catalog the data in Amazon S3. Use an AWS Glue job to enrich and store the result in a separate S3 bucket in Apache Parquet format. Use Amazon Athena to query the data.
  • B. Create an Amazon RDS for PostgreSQL database and use AWS Database Migration Service (AWS DMS) to migrate the data into Amazon RDS. Use AWS Data Pipeline to copy and enrich the data from the Amazon RDS for PostgreSQL table and move the data to Amazon S3. Use Amazon Athena to query the data.
  • C. Configure an AWS Glue crawler to use a JDBC connection to catalog the data in the on-premises database. Use an AWS Glue job to enrich the data and save the result to Amazon S3 in Apache Parquet format. Create an Amazon Redshift cluster and use Amazon Redshift Spectrum to query the data.
  • D. Configure an AWS Glue crawler to use a JDBC connection to catalog the data in the on-premises database. Use an AWS Glue job to enrich the data and save the result to Amazon S3 in Apache Parquet format. Use Amazon Athena to query the data.

Answer: B

NEW QUESTION 12
A company analyzes historical data and needs to query data that is stored in Amazon S3. New data is generated daily as .csv files that are stored in Amazon S3. The company’s analysts are using Amazon Athena to perform SQL queries against a recent subset of the overall data. The amount of data that is ingested into Amazon S3 has increased substantially over time, and the query latency also has increased.
Which solutions could the company implement to improve query performance? (Choose two.)

  • A. Use MySQL Workbench on an Amazon EC2 instance, and connect to Athena by using a JDBC or ODBC connector. Run the query from MySQL Workbench instead of Athena directly.
  • B. Use Athena to extract the data and store it in Apache Parquet format on a daily basis. Query the extracted data.
  • C. Run a daily AWS Glue ETL job to convert the data files to Apache Parquet and to partition the converted files. Create a periodic AWS Glue crawler to automatically crawl the partitioned data on a daily basis.
  • D. Run a daily AWS Glue ETL job to compress the data files by using the .gzip format. Query the compressed data.
  • E. Run a daily AWS Glue ETL job to compress the data files by using the .lzo format. Query the compressed data.

Answer: BC
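
For the Glue-based option in the answer, a trimmed PySpark job script is sketched below; the database, table, output path, and partition columns are assumptions.

```python
from pyspark.context import SparkContext
from awsglue.context import GlueContext

glue_context = GlueContext(SparkContext.getOrCreate())

# Read the raw .csv data registered in the Data Catalog (assumed names).
raw = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db", table_name="raw_csv"
)

# Write it back as partitioned Parquet so Athena scans far less data.
glue_context.write_dynamic_frame.from_options(
    frame=raw,
    connection_type="s3",
    connection_options={
        "path": "s3://example-curated/sales/",
        "partitionKeys": ["year", "month", "day"],
    },
    format="parquet",
)
```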

NEW QUESTION 13
A hospital is building a research data lake to ingest data from electronic health records (EHR) systems from multiple hospitals and clinics. The EHR systems are independent of each other and do not have a common patient identifier. The data engineering team is not experienced in machine learning (ML) and has been asked to generate a unique patient identifier for the ingested records.
Which solution will accomplish this task?

  • A. An AWS Glue ETL job with the FindMatches transform
  • B. Amazon Kendra
  • C. Amazon SageMaker Ground Truth
  • D. An AWS Glue ETL job with the ResolveChoice transform

Answer: A

Explanation:
Matching Records with AWS Lake Formation FindMatches
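
Once a FindMatches ML transform has been created and trained, a Glue job applies it roughly as sketched below; the transform ID, database, and table names are placeholders.

```python
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglueml.transforms import FindMatches

glue_context = GlueContext(SparkContext.getOrCreate())

ehr_records = glue_context.create_dynamic_frame.from_catalog(
    database="research_lake", table_name="ehr_raw"  # assumed names
)

# FindMatches groups records that likely refer to the same patient even
# though the source EHR systems share no common identifier.
matched = FindMatches.apply(
    frame=ehr_records,
    transformId="tfm-0123456789abcdef",  # placeholder transform ID
)
```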

NEW QUESTION 14
A retail company wants to use Amazon QuickSight to generate dashboards for web and in-store sales. A group of 50 business intelligence professionals will develop and use the dashboards. Once ready, the dashboards will be shared with a group of 1,000 users.
The sales data comes from different stores and is uploaded to Amazon S3 every 24 hours. The data is partitioned by year and month, and is stored in Apache Parquet format. The company is using the AWS Glue Data Catalog as its main data catalog and Amazon Athena for querying. The total size of the uncompressed data that the dashboards query from at any point is 200 GB.
Which configuration will provide the MOST cost-effective solution that meets these requirements?

  • A. Load the data into an Amazon Redshift cluster by using the COPY command. Configure 50 author users and 1,000 reader users. Use QuickSight Enterprise edition. Configure an Amazon Redshift data source with a direct query option.
  • B. Use QuickSight Standard edition. Configure 50 author users and 1,000 reader users. Configure an Athena data source with a direct query option.
  • C. Use QuickSight Enterprise edition. Configure 50 author users and 1,000 reader users. Configure an Athena data source and import the data into SPICE. Automatically refresh every 24 hours.
  • D. Use QuickSight Enterprise edition. Configure 1 administrator and 1,000 reader users. Configure an S3 data source and import the data into SPICE. Automatically refresh every 24 hours.

Answer: C
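
The 24-hour SPICE refresh in the answer can be automated with the QuickSight API, for example from a scheduled job; the account ID, dataset ID, and region below are placeholders.

```python
import time
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

# Kick off a SPICE ingestion (refresh) for the dashboard dataset.
quicksight.create_ingestion(
    AwsAccountId="111122223333",                      # placeholder account
    DataSetId="sales-dashboard-dataset",              # placeholder dataset ID
    IngestionId=f"daily-refresh-{int(time.time())}",  # must be unique per run
)
```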

NEW QUESTION 15
An online retail company uses Amazon Redshift to store historical sales transactions. The company is required to encrypt data at rest in the clusters to comply with the Payment Card Industry Data Security Standard (PCI DSS). A corporate governance policy mandates management of encryption keys using an on-premises hardware security module (HSM).
Which solution meets these requirements?

  • A. Create and manage encryption keys using AWS CloudHSM Classic. Launch an Amazon Redshift cluster in a VPC with the option to use CloudHSM Classic for key management.
  • B. Create a VPC and establish a VPN connection between the VPC and the on-premises network. Create an HSM connection and client certificate for the on-premises HSM. Launch a cluster in the VPC with the option to use the on-premises HSM to store keys.
  • C. Create an HSM connection and client certificate for the on-premises HSM. Enable HSM encryption on the existing unencrypted cluster by modifying the cluster. Connect to the VPC where the Amazon Redshift cluster resides from the on-premises network using a VPN.
  • D. Create a replica of the on-premises HSM in AWS CloudHSM. Launch a cluster in a VPC with the option to use CloudHSM to store keys.

Answer: B
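
A hedged boto3 sketch of the chosen approach is below: register the on-premises HSM (reachable over the VPN) and a client certificate, then launch an encrypted cluster that uses them. Every identifier, address, and credential is a placeholder.

```python
import boto3

redshift = boto3.client("redshift")

redshift.create_hsm_client_certificate(
    HsmClientCertificateIdentifier="onprem-hsm-client-cert"
)
redshift.create_hsm_configuration(
    HsmConfigurationIdentifier="onprem-hsm-config",
    Description="On-premises HSM reached over site-to-site VPN",
    HsmIpAddress="10.0.50.10",                                # assumed address
    HsmPartitionName="redshift-partition",
    HsmPartitionPassword="example-password",
    HsmServerPublicCertificate="-----BEGIN CERTIFICATE-----...",
)
# Launch the cluster with encryption backed by the registered HSM.
redshift.create_cluster(
    ClusterIdentifier="pci-sales-cluster",
    NodeType="ds2.xlarge",
    MasterUsername="admin",
    MasterUserPassword="ExamplePassw0rd!",
    NumberOfNodes=3,
    Encrypted=True,
    HsmClientCertificateIdentifier="onprem-hsm-client-cert",
    HsmConfigurationIdentifier="onprem-hsm-config",
)
```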

NEW QUESTION 16
An operations team notices that a few AWS Glue jobs for a given ETL application are failing. The AWS Glue jobs read a large number of small JSON files from an Amazon S3 bucket and write the data to a different S3 bucket in Apache Parquet format with no major transformations. Upon initial investigation, a data engineer notices the following error message in the History tab on the AWS Glue console: “Command Failed with Exit Code 1.”
Upon further investigation, the data engineer notices that the driver memory profile of the failed jobs crosses the safe threshold of 50% usage quickly and reaches 90–95% soon after. The average memory usage across all executors continues to be less than 4%.
The data engineer also notices the following error while examining the related Amazon CloudWatch Logs. What should the data engineer do to solve the failure in the MOST cost-effective way?

  • A. Change the worker type from Standard to G.2X.
  • B. Modify the AWS Glue ETL code to use the ‘groupFiles’: ‘inPartition’ feature.
  • C. Increase the fetch size setting by using AWS Glue dynamic frames.
  • D. Modify maximum capacity to increase the total maximum data processing units (DPUs) used.

Answer: B

Explanation:
https://docs.aws.amazon.com/glue/latest/dg/monitor-profile-debug-oom-abnormalities.html#monitor-debug-oom
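
The groupFiles setting from the answer is applied where the job reads the small JSON files, roughly as sketched below; the S3 paths and group size are placeholders.

```python
from pyspark.context import SparkContext
from awsglue.context import GlueContext

glue_context = GlueContext(SparkContext.getOrCreate())

# Coalesce many small JSON files into larger in-memory groups so their
# metadata is not all tracked on the driver, which was running out of memory.
source = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={
        "paths": ["s3://example-raw-bucket/events/"],  # placeholder path
        "groupFiles": "inPartition",
        "groupSize": "10485760",  # ~10 MB per group (bytes, as a string)
    },
    format="json",
)

glue_context.write_dynamic_frame.from_options(
    frame=source,
    connection_type="s3",
    connection_options={"path": "s3://example-curated-bucket/events/"},
    format="parquet",
)
```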

NEW QUESTION 17
Three teams of data analysts use Apache Hive on an Amazon EMR cluster with the EMR File System (EMRFS) to query data stored within each team's Amazon S3 bucket. The EMR cluster has Kerberos enabled and is configured to authenticate users from the corporate Active Directory. The data is highly sensitive, so access must be limited to the members of each team.
Which steps will satisfy the security requirements?

  • A. For the EMR cluster Amazon EC2 instances, create a service role that grants no access to Amazon S3. Create three additional IAM roles, each granting access to each team's specific bucket. Add the additional IAM roles to the cluster's EMR role for the EC2 trust policy. Create a security configuration mapping for the additional IAM roles to Active Directory user groups for each team.
  • B. For the EMR cluster Amazon EC2 instances, create a service role that grants no access to Amazon S3. Create three additional IAM roles, each granting access to each team's specific bucket. Add the service role for the EMR cluster EC2 instances to the trust policies for the additional IAM roles. Create a security configuration mapping for the additional IAM roles to Active Directory user groups for each team.
  • C. For the EMR cluster Amazon EC2 instances, create a service role that grants full access to Amazon S3. Create three additional IAM roles, each granting access to each team's specific bucket. Add the service role for the EMR cluster EC2 instances to the trust policies for the additional IAM roles. Create a security configuration mapping for the additional IAM roles to Active Directory user groups for each team.
  • D. For the EMR cluster Amazon EC2 instances, create a service role that grants full access to Amazon S3. Create three additional IAM roles, each granting access to each team's specific bucket. Add the service role for the EMR cluster EC2 instances to the trust policies for the base IAM roles. Create a security configuration mapping for the additional IAM roles to Active Directory user groups for each team.

Answer: C
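
The "security configuration mapping" in these options refers to an EMR security configuration that maps IAM roles for EMRFS requests to Active Directory users or groups. A hedged sketch is below; the role ARNs and group names are placeholders, and the JSON shape should be checked against the EMRFS fine-grained access documentation.

```python
import json
import boto3

emr = boto3.client("emr")

# Map each team's Active Directory group to the IAM role that can read
# only that team's S3 bucket (ARNs and group names are placeholders).
security_configuration = {
    "AuthorizationConfiguration": {
        "EmrFsConfiguration": {
            "RoleMappings": [
                {
                    "Role": "arn:aws:iam::111122223333:role/TeamABucketAccess",
                    "IdentifierType": "Group",
                    "Identifiers": ["team-a-analysts"],
                },
                {
                    "Role": "arn:aws:iam::111122223333:role/TeamBBucketAccess",
                    "IdentifierType": "Group",
                    "Identifiers": ["team-b-analysts"],
                },
            ]
        }
    }
}

emr.create_security_configuration(
    Name="per-team-emrfs-roles",
    SecurityConfiguration=json.dumps(security_configuration),
)
```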

NEW QUESTION 18
......

Recommended! Get the full DAS-C01 dumps in VCE and PDF from Dumps-hub.com. Welcome to download: https://www.dumps-hub.com/DAS-C01-dumps.html (New 130 Q&As Version)