A company is reading data from various customer databases that run on Amazon RDS. The databases contain many inconsistent fields. For example, a customer record field that is named place_id in one database is named location_id in another database. The company wants to link customer records across different databases, even when many customer record fields do not match exactly.
Which solution will meet these requirements with the LEAST operational overhead?
Correct Answer:
B
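To illustrate what linking records across inconsistent schemas involves, here is a minimal Python sketch. It is not the AWS-managed approach the question asks about, only a toy version of the idea. It assumes a hand-written field alias map and uses the standard library's difflib for approximate string comparison; the sample records, the FIELD_ALIASES map, and the 0.8 threshold are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical alias map: different databases name the same field differently.
FIELD_ALIASES = {"location_id": "place_id"}


def normalize(record):
    """Rename aliased fields and lowercase/strip string values."""
    out = {}
    for key, value in record.items():
        key = FIELD_ALIASES.get(key, key)
        out[key] = value.strip().lower() if isinstance(value, str) else value
    return out


def similarity(a, b):
    """Average string similarity over the fields both records share."""
    a, b = normalize(a), normalize(b)
    shared = sorted(set(a) & set(b))
    if not shared:
        return 0.0
    scores = [SequenceMatcher(None, str(a[k]), str(b[k])).ratio() for k in shared]
    return sum(scores) / len(scores)


def is_match(a, b, threshold=0.8):
    """Treat two records as the same customer if they are similar enough."""
    return similarity(a, b) >= threshold


# Hypothetical sample records from three different databases.
rec_a = {"name": "Jane Doe", "place_id": "SEA-001"}
rec_b = {"name": "jane  doe", "location_id": "sea-001"}
rec_c = {"name": "Robert Smith", "location_id": "NYC-042"}

print(is_match(rec_a, rec_b), is_match(rec_a, rec_c))
```

Maintaining alias maps and thresholds by hand like this is the kind of operational overhead the question asks to minimize.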
An airline has been collecting metrics on flight activities for analytics. A recently completed proof of concept demonstrates how the company provides insights to data analysts to improve on-time departures. The proof of concept used objects in Amazon S3, which contained the metrics in .csv format, and used Amazon Athena for querying the data. As the amount of data increases, the data analyst wants to optimize the storage solution to improve query performance.
Which options should the data analyst use to improve performance as the data lake grows? (Choose three.)
Correct Answer:
CDF
https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-tips-for-amazon-athena/
A bank is using Amazon Managed Streaming for Apache Kafka (Amazon MSK) to populate real-time data into a data lake. The data lake is built on Amazon S3, and data must be accessible from the data lake within 24 hours. Different microservices produce messages to different topics in the cluster. The cluster is created with 8 TB of Amazon Elastic Block Store (Amazon EBS) storage and a retention period of 7 days.
The customer transaction volume has tripled recently, and disk monitoring has provided an alert that the cluster is almost out of storage capacity.
What should a data analytics specialist do to prevent the cluster from running out of disk space?
Correct Answer:
B
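The storage pressure in this scenario can be reasoned about with simple arithmetic: in steady state, the disk space a Kafka cluster uses is roughly the ingest rate multiplied by the retention period. The sketch below is a rough estimate under stated assumptions, not a sizing tool. The 8 TB capacity and 7-day retention come from the scenario; treating the disk as effectively full at the current volume, and the alternative retention periods tried, are assumptions made only for illustration. It ignores replication and per-topic differences.

```python
def steady_state_usage_tb(daily_ingest_tb, retention_days):
    """Approximate disk usage once the retention window is full."""
    return daily_ingest_tb * retention_days


CAPACITY_TB = 8.0    # from the scenario
RETENTION_DAYS = 7   # from the scenario

# Assumption: the disk is effectively full at the current (tripled) volume,
# so the current daily ingest is about capacity / retention.
daily_ingest_tb = CAPACITY_TB / RETENTION_DAYS

# Hypothetical retention periods, to show how usage scales.
for days in (7, 2, 1):
    usage = steady_state_usage_tb(daily_ingest_tb, days)
    print(f"{days}-day retention: ~{usage:.2f} TB "
          f"({usage / CAPACITY_TB:.0%} of capacity)")
```

Usage scales linearly with both volume and retention, which is why a tripling of traffic under a fixed 7-day window exhausts storage that was previously sufficient.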
A financial company hosts a data lake in Amazon S3 and a data warehouse on an Amazon Redshift cluster. The company uses Amazon QuickSight to build dashboards and wants to secure access from its on-premises Active Directory to Amazon QuickSight.
How should the data be secured?
Correct Answer:
A
https://docs.aws.amazon.com/quicksight/latest/user/directory-integration.html
A company is building a data lake and needs to ingest data from a relational database that has time-series data. The company wants to use managed services to accomplish this. The process needs to be scheduled daily and bring incremental data only from the source into Amazon S3.
What is the MOST cost-effective approach to meet these requirements?
Correct Answer:
A
https://docs.aws.amazon.com/glue/latest/dg/monitor-continuations.html
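The linked page documents AWS Glue job bookmarks, which track previously processed data so that a scheduled run picks up only new records. The underlying idea can be sketched with a high-water mark, shown here against an in-memory SQLite table rather than Glue itself. The metrics table, its columns, and the load_increment helper are hypothetical, and none of this is the Glue API.

```python
import sqlite3


def load_increment(conn, bookmark):
    """Return rows newer than the bookmark, plus the updated bookmark."""
    rows = conn.execute(
        "SELECT id, recorded_at, value FROM metrics "
        "WHERE recorded_at > ? ORDER BY recorded_at",
        (bookmark,),
    ).fetchall()
    # Advance the bookmark to the newest timestamp seen, if any.
    new_bookmark = rows[-1][1] if rows else bookmark
    return rows, new_bookmark


# Hypothetical source table holding time-series data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metrics (id INTEGER PRIMARY KEY, recorded_at TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO metrics (recorded_at, value) VALUES (?, ?)",
    [("2024-01-01T00:00:00", 1.0), ("2024-01-02T00:00:00", 2.0)],
)

# First scheduled run: an empty bookmark means everything is new.
first_rows, bookmark = load_increment(conn, "")

# New data arrives before the next run.
conn.execute(
    "INSERT INTO metrics (recorded_at, value) VALUES (?, ?)",
    ("2024-01-03T00:00:00", 3.0),
)

# Second run: only the row added since the bookmark is returned.
second_rows, bookmark = load_increment(conn, bookmark)
print(len(first_rows), len(second_rows), bookmark)
```

ISO 8601 timestamps compare correctly as strings, which is what lets the plain text comparison act as the bookmark check in this sketch.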