Free Professional-Data-Engineer Exam Braindumps

Pass your Google Professional Data Engineer exam with these free questions and answers.

QUESTION 6

- (Exam Topic 5)
Which of the following are feature engineering techniques? (Select 2 answers)

  - A. Hidden feature layers
  - B. Feature prioritization
  - C. Crossed feature columns
  - D. Bucketization of a continuous feature

Correct Answer: CD
Selecting and crafting the right set of feature columns is key to learning an effective model. Bucketization is the process of dividing the entire range of a continuous feature into a set of consecutive bins/buckets and then converting each original numerical value into a bucket ID (a categorical feature) based on which bucket it falls into.
Using each base feature column separately may not be enough to explain the data. To capture interactions between different feature combinations, we can add crossed feature columns to the model.
Reference: https://www.tensorflow.org/tutorials/wide#selecting_and_engineering_features_for_the_model
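For illustration (not part of the original question), here is a minimal sketch of both techniques using TensorFlow's legacy `tf.feature_column` API; the column names, vocabulary, and bucket boundaries are hypothetical:

```python
import tensorflow as tf

# Continuous base feature (the column name "age" is hypothetical).
age = tf.feature_column.numeric_column("age")

# Bucketization: map the continuous value into one of six consecutive
# ranges, yielding a categorical bucket ID.
age_buckets = tf.feature_column.bucketized_column(
    age, boundaries=[18, 25, 35, 50, 65])

# A categorical base feature to cross with.
education = tf.feature_column.categorical_column_with_vocabulary_list(
    "education", ["HS", "BS", "MS", "PhD"])

# Crossed feature column: learns separate weights for each
# (age bucket, education) combination instead of the features alone.
age_x_education = tf.feature_column.crossed_column(
    [age_buckets, education], hash_bucket_size=1000)
```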

QUESTION 7

- (Exam Topic 1)
You have spent a few days loading data from comma-separated values (CSV) files into the Google BigQuery table CLICK_STREAM. The column DT stores the epoch time of click events. For convenience, you chose a simple schema where every field is treated as the STRING type. Now, you want to compute web session durations of users who visit your site, and you want to change the column's data type to TIMESTAMP. You want to minimize the migration effort without making future queries computationally expensive. What should you do?

  - A. Delete the table CLICK_STREAM, and then re-create it such that the column DT is of the TIMESTAMP type. Reload the data.
  - B. Add a column TS of the TIMESTAMP type to the table CLICK_STREAM, and populate the numeric values from the column DT for each row. Reference the column TS instead of the column DT from now on.
  - C. Create a view CLICK_STREAM_V, where strings from the column DT are cast into TIMESTAMP values. Reference the view CLICK_STREAM_V instead of the table CLICK_STREAM from now on.
  - D. Add two columns to the table CLICK_STREAM: TS of the TIMESTAMP type and IS_NEW of the BOOLEAN type. Reload all data in append mode. For each appended row, set the value of IS_NEW to true. For future queries, reference the column TS instead of the column DT, with the WHERE clause ensuring that the value of IS_NEW must be true.
  - E. Construct a query to return every row of the table CLICK_STREAM, while using the built-in function to cast strings from the column DT into TIMESTAMP values. Run the query into a destination table NEW_CLICK_STREAM, in which the column TS is the TIMESTAMP type. Reference the table NEW_CLICK_STREAM instead of the table CLICK_STREAM from now on. In the future, new data is loaded into the table NEW_CLICK_STREAM.

Correct Answer: D
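Whichever option is chosen, the underlying conversion is a single cast. As an illustrative sketch (not part of the original question), here is how the google-cloud-bigquery Python client could run that cast once into a destination table, assuming DT holds epoch seconds and using hypothetical project/dataset names:

```python
from google.cloud import bigquery

client = bigquery.Client()  # assumes application-default credentials

# Cast the STRING epoch values in DT (assumed to be epoch seconds) to
# TIMESTAMP. "my_project.my_dataset" is a hypothetical home for CLICK_STREAM.
sql = """
SELECT
  * EXCEPT (DT),
  TIMESTAMP_SECONDS(CAST(DT AS INT64)) AS TS
FROM `my_project.my_dataset.CLICK_STREAM`
"""

# Materialize the result once so future queries pay no casting cost.
job_config = bigquery.QueryJobConfig(
    destination="my_project.my_dataset.NEW_CLICK_STREAM")
client.query(sql, job_config=job_config).result()  # block until the job finishes
```

If DT stored epoch milliseconds instead, TIMESTAMP_MILLIS would replace TIMESTAMP_SECONDS.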

QUESTION 8

- (Exam Topic 5)
To give a user read permission for only the first three columns of a table, which access control method would you use?

  - A. Primitive role
  - B. Predefined role
  - C. Authorized view
  - D. It's not possible to give access to only the first three columns of a table.

Correct Answer: C
An authorized view allows you to share query results with particular users and groups without giving them
read access to the underlying tables. Authorized views can only be created in a dataset that does not contain the tables queried by the view.
When you create an authorized view, you use the view's SQL query to restrict access to only the rows and columns you want the users to see.
Reference: https://cloud.google.com/bigquery/docs/views#authorized-views
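As a hedged sketch of the mechanics (all project, dataset, table, and column names below are hypothetical), the google-cloud-bigquery Python client can create a three-column view in a separate dataset and then authorize it against the source dataset:

```python
from google.cloud import bigquery

client = bigquery.Client()  # assumes application-default credentials

# 1. Create a view in a SEPARATE dataset that exposes only three columns.
view = bigquery.Table("my_project.shared_views.ORDERS_V")
view.view_query = """
SELECT col_a, col_b, col_c
FROM `my_project.private_data.ORDERS`
"""
view = client.create_table(view)

# 2. Authorize the view against the source dataset so the view itself
#    (not its users) can read the underlying table.
source = client.get_dataset("my_project.private_data")
entries = list(source.access_entries)
entries.append(bigquery.AccessEntry(None, "view", view.reference.to_api_repr()))
source.access_entries = entries
client.update_dataset(source, ["access_entries"])
```

Users are then granted READER access on the dataset holding the view only; they can query the view but never the underlying table.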

QUESTION 9

- (Exam Topic 3)
MJTelco’s Google Cloud Dataflow pipeline is now ready to start receiving data from the 50,000 installations. You want to allow Cloud Dataflow to scale its compute power up as required. Which Cloud Dataflow pipeline configuration setting should you update?

  - A. The zone
  - B. The number of workers
  - C. The disk size per worker
  - D. The maximum number of workers

Correct Answer: D
Dataflow autoscaling adds workers on demand only up to the configured maximum, so raising the maximum number of workers is the setting that lets the pipeline scale its compute power up as the 50,000 installations come online. The zone, a fixed worker count, or disk size do not enable scaling.
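For reference, a minimal Apache Beam (Python SDK) sketch that caps autoscaling; the project, region, bucket, and the cap of 64 are all hypothetical:

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Autoscaling grows the worker pool on demand, up to max_num_workers.
options = PipelineOptions(
    runner="DataflowRunner",
    project="my-project",                      # hypothetical project ID
    region="us-central1",                      # hypothetical region
    temp_location="gs://my-bucket/tmp",        # hypothetical staging bucket
    autoscaling_algorithm="THROUGHPUT_BASED",  # throughput-based autoscaling
    max_num_workers=64,                        # the setting this question asks about
)

with beam.Pipeline(options=options) as p:
    p | beam.Create([1, 2, 3]) | beam.Map(print)  # placeholder pipeline
```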

QUESTION 10

- (Exam Topic 6)
You are training a spam classifier. You notice that you are overfitting the training data. Which three actions can you take to resolve this problem? (Choose three.)

  - A. Get more training examples
  - B. Reduce the number of training examples
  - C. Use a smaller set of features
  - D. Use a larger set of features
  - E. Increase the regularization parameters
  - F. Decrease the regularization parameters

Correct Answer: ACE
More training examples, a smaller feature set, and stronger regularization all reduce overfitting; enlarging the feature set or weakening regularization would make it worse.
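To make the fixes concrete (an illustrative sketch, not from the exam; the toy data and the choice of scikit-learn are assumptions), here is a tiny spam classifier where a capped vocabulary implements option C and a lower C value implements option E, since scikit-learn's C is the *inverse* regularization strength:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: 1 = spam, 0 = not spam.
texts = ["win a free prize now", "claim your reward",
         "meeting at 10am", "see attached report"]
labels = [1, 1, 0, 0]

# max_features caps the vocabulary ("use a smaller set of features");
# a lower C means stronger L2 regularization ("increase regularization").
model = make_pipeline(
    TfidfVectorizer(max_features=500),
    LogisticRegression(C=0.1),
)
model.fit(texts, labels)
print(model.predict(["free prize meeting"]))
```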

