Free Professional-Data-Engineer Exam Braindumps

Pass your Google Professional Data Engineer exam with these free Questions and Answers

QUESTION 36

- (Exam Topic 2)
Flowlogistic wants to use Google BigQuery as their primary analysis system, but they still have Apache Hadoop and Spark workloads that they cannot move to BigQuery. Flowlogistic does not know how to store the data that is common to both workloads. What should they do?

  1. A. Store the common data in BigQuery as partitioned tables.
  2. B. Store the common data in BigQuery and expose authorized views.
  3. C. Store the common data encoded as Avro in Google Cloud Storage.
  4. D. Store the common data in the HDFS storage for a Google Cloud Dataproc cluster.

Correct Answer: B
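
For reference, here is a minimal sketch of what the authorized-view approach in answer B could look like with the google-cloud-bigquery Python client; the project, dataset, and table names are hypothetical placeholders, and this follows the general pattern from Google's authorized views documentation rather than Flowlogistic's actual setup.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Create a view over the common data (all names below are placeholders).
view = bigquery.Table("my-project.shared_views.common_data_view")
view.view_query = "SELECT * FROM `my-project.analytics.common_data`"
view = client.create_table(view)

# Authorize the view on the source dataset so consumers can query the view
# without being granted direct access to the underlying tables.
source_dataset = client.get_dataset("my-project.analytics")
entries = list(source_dataset.access_entries)
entries.append(bigquery.AccessEntry(None, "view", view.reference.to_api_repr()))
source_dataset.access_entries = entries
client.update_dataset(source_dataset, ["access_entries"])
```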

QUESTION 37

- (Exam Topic 5)
Which of these rules apply when you add preemptible workers to a Dataproc cluster (select 2 answers)?

  1. A. Preemptible workers cannot use persistent disk.
  2. B. Preemptible workers cannot store data.
  3. C. If a preemptible worker is reclaimed, then a replacement worker must be added manually.
  4. D. A Dataproc cluster cannot have only preemptible workers.

Correct Answer: BD
The following rules apply when you use preemptible workers with a Cloud Dataproc cluster:
- Processing only: because preemptible VMs can be reclaimed at any time, preemptible workers do not store data. They function only as processing nodes.
- No preemptible-only clusters: to ensure clusters do not lose all of their workers, Cloud Dataproc cannot create preemptible-only clusters.
- Persistent disk size: by default, each preemptible worker is created with the smaller of 100 GB or the primary worker boot disk size. This disk space is used for local caching of data and is not available through HDFS.
- Automatic replacement: the managed instance group automatically re-adds workers lost to reclamation as capacity permits.
Reference: https://cloud.google.com/dataproc/docs/concepts/preemptible-vms
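
As an illustration, a hedged sketch of adding preemptible (secondary) workers to an existing cluster with the google-cloud-dataproc Python client; the project, region, and cluster names are placeholders, and the update-mask path follows the Dataproc UpdateCluster documentation.

```python
from google.cloud import dataproc_v1

# Use the regional endpoint for the cluster's region (placeholder values throughout).
client = dataproc_v1.ClusterControllerClient(
    client_options={"api_endpoint": "us-central1-dataproc.googleapis.com:443"}
)

# Only the field named in the update mask changes: the number of secondary
# (preemptible) workers, which act as processing-only nodes with no HDFS storage.
cluster = dataproc_v1.Cluster(
    cluster_name="my-cluster",
    config=dataproc_v1.ClusterConfig(
        secondary_worker_config=dataproc_v1.InstanceGroupConfig(num_instances=2)
    ),
)

operation = client.update_cluster(
    project_id="my-project",
    region="us-central1",
    cluster_name="my-cluster",
    cluster=cluster,
    update_mask={"paths": ["config.secondary_worker_config.num_instances"]},
)
operation.result()  # block until the cluster update completes
```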

QUESTION 38

- (Exam Topic 5)
Which SQL keyword can be used to reduce the number of columns processed by BigQuery?

  1. A. BETWEEN
  2. B. WHERE
  3. C. SELECT
  4. D. LIMIT

Correct Answer: C
SELECT lets you name only the specific columns you need rather than querying the whole table. Because BigQuery uses columnar storage, it reads only the columns referenced in the query, so narrowing the SELECT list directly reduces the data processed.
LIMIT, BETWEEN, and WHERE clauses do not reduce the number of columns processed by BigQuery.
Reference:
https://cloud.google.com/bigquery/launch-checklist#architecture_design_and_development_checklist
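
To see the effect, here is a small sketch using the google-cloud-bigquery Python client and a dry run against a public dataset: selecting only the columns you need reports fewer bytes processed than SELECT *.

```python
from google.cloud import bigquery

client = bigquery.Client()
# A dry run estimates bytes processed without running the query or incurring cost.
job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)

narrow = client.query(
    "SELECT name, number FROM `bigquery-public-data.usa_names.usa_1910_2013`",
    job_config=job_config,
)
wide = client.query(
    "SELECT * FROM `bigquery-public-data.usa_names.usa_1910_2013`",
    job_config=job_config,
)

print(f"Selected columns: {narrow.total_bytes_processed:,} bytes")
print(f"SELECT *:         {wide.total_bytes_processed:,} bytes")
```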

QUESTION 39

- (Exam Topic 2)
Flowlogistic’s management has determined that the current Apache Kafka servers cannot handle the data volume for their real-time inventory tracking system. You need to build a new system on Google Cloud Platform (GCP) that will feed the proprietary tracking software. The system must be able to ingest data from a variety of global sources, process and query in real-time, and store the data reliably. Which combination of GCP products should you choose?

  1. A. Cloud Pub/Sub, Cloud Dataflow, and Cloud Storage
  2. B. Cloud Pub/Sub, Cloud Dataflow, and Local SSD
  3. C. Cloud Pub/Sub, Cloud SQL, and Cloud Storage
  4. D. Cloud Load Balancing, Cloud Dataflow, and Cloud Storage

Correct Answer: A
Cloud Pub/Sub ingests streaming data from a variety of global sources, Cloud Dataflow processes and analyzes it in real time, and Cloud Storage stores the data reliably. Cloud SQL is not designed for high-throughput streaming ingestion or real-time processing, and Local SSD is not durable storage.
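
As a small, hedged illustration of the ingestion piece of this combination, here is how a tracking event might be published to Cloud Pub/Sub with the google-cloud-pubsub Python client; the project, topic, and payload are hypothetical placeholders.

```python
import json
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "inventory-tracking")

# Messages are raw bytes; attributes (here, "source") can carry metadata
# such as the originating region for downstream processing in Dataflow.
event = json.dumps({"item_id": "SKU-123", "location": "warehouse-7", "qty": 42})
future = publisher.publish(topic_path, event.encode("utf-8"), source="eu-west")
print("Published message ID:", future.result())
```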

QUESTION 40

- (Exam Topic 6)
You operate an IoT pipeline built around Apache Kafka that normally receives around 5000 messages per second. You want to use Google Cloud Platform to create an alert as soon as the moving average over 1 hour drops below 4000 messages per second. What should you do?

  1. A. Consume the stream of data in Cloud Dataflow using Kafka IO. Set a sliding time window of 1 hour every 5 minutes. Compute the average when the window closes, and send an alert if the average is less than 4000 messages.
  2. B. Consume the stream of data in Cloud Dataflow using Kafka IO. Set a fixed time window of 1 hour. Compute the average when the window closes, and send an alert if the average is less than 4000 messages.
  3. C. Use Kafka Connect to link your Kafka message queue to Cloud Pub/Sub. Use a Cloud Dataflow template to write your messages from Cloud Pub/Sub to Cloud Bigtable. Use Cloud Scheduler to run a script every hour that counts the number of rows created in Cloud Bigtable in the last hour. If that number falls below 4000, send an alert.
  4. D. Use Kafka Connect to link your Kafka message queue to Cloud Pub/Sub. Use a Cloud Dataflow template to write your messages from Cloud Pub/Sub to BigQuery. Use Cloud Scheduler to run a script every five minutes that counts the number of rows created in BigQuery in the last hour. If that number falls below 4000, send an alert.

Correct Answer: A
A sliding window of 1 hour that advances every 5 minutes recomputes the average as new data arrives, which is exactly a moving average. A fixed 1-hour window only evaluates once per hour, and the Cloud Scheduler options count rows on a schedule rather than computing a moving average over the stream.
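
For context, a minimal Apache Beam (Python SDK) sketch of the sliding-window approach in option A; the Kafka broker, topic, and alerting hook are hypothetical placeholders, and in practice the alert would be routed to Cloud Monitoring or Pub/Sub rather than printed.

```python
import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms import window


def alert_if_low(avg_per_second, threshold=4000):
    # Placeholder alerting hook: wire this to Cloud Monitoring, Pub/Sub, etc.
    if avg_per_second < threshold:
        print(f"ALERT: moving average fell to {avg_per_second:.0f} msg/s")


options = PipelineOptions(streaming=True)
with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadKafka" >> ReadFromKafka(
            consumer_config={"bootstrap.servers": "broker:9092"},  # placeholder broker
            topics=["iot-events"],                                 # placeholder topic
        )
        # 1-hour windows that advance every 5 minutes produce a moving average.
        | "SlidingWindow" >> beam.WindowInto(window.SlidingWindows(3600, 300))
        | "Ones" >> beam.Map(lambda record: 1)
        | "CountPerWindow" >> beam.CombineGlobally(sum).without_defaults()
        | "ToMsgPerSecond" >> beam.Map(lambda count: count / 3600.0)
        | "Alert" >> beam.Map(alert_if_low)
    )
```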

