Free DP-203 Exam Braindumps

Pass your Data Engineering on Microsoft Azure exam with these free Questions and Answers

Page 9 of 61
QUESTION 36

- (Exam Topic 3)
You build an Azure Data Factory pipeline to move data from an Azure Data Lake Storage Gen2 container to a database in an Azure Synapse Analytics dedicated SQL pool.
Data in the container is stored in the following folder structure.
/in/{YYYY}/{MM}/{DD}/{HH}/{mm}
The earliest folder is /in/2021/01/01/00/00. The latest folder is /in/2021/01/15/01/45. You need to configure a pipeline trigger to meet the following requirements:
- Existing data must be loaded.
- Data must be loaded every 30 minutes.
- Late-arriving data of up to two minutes must be included in the load for the time at which the data should have arrived.
How should you configure the pipeline trigger? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
[Exhibit]
Solution:
Box 1: Tumbling window
To be able to use the Delay parameter, select a tumbling window trigger.
Box 2: Recurrence: 30 minutes, not 32 minutes
Delay: 2 minutes.
The delay is the amount of time to delay the start of data processing for the window. The pipeline run starts after the expected execution time plus the amount of delay. The delay defines how long the trigger waits past the due time before triggering a new run; it does not alter the window startTime.
Reference:
https://docs.microsoft.com/en-us/azure/data-factory/how-to-create-tumbling-window-trigger
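For reference, a trigger that meets these requirements can be expressed as a tumbling window trigger definition along the following lines. This is a minimal sketch written as a Python dictionary mirroring the ADF trigger JSON from the documentation above; the trigger name, pipeline name, and parameter names are illustrative assumptions, not values taken from the question.

```python
import json

# Minimal sketch of a tumbling window trigger definition. The pipeline name
# "LoadToSqlPool" and the trigger name are assumed for illustration only.
trigger_definition = {
    "name": "Load30MinTrigger",
    "properties": {
        "type": "TumblingWindowTrigger",
        "typeProperties": {
            "frequency": "Minute",
            "interval": 30,                       # load every 30 minutes
            "startTime": "2021-01-01T00:00:00Z",  # backfills the existing folders
            "delay": "00:02:00",                  # wait 2 minutes for late-arriving data
            "maxConcurrency": 1,
        },
        "pipeline": {
            "pipelineReference": {
                "referenceName": "LoadToSqlPool",  # assumed pipeline name
                "type": "PipelineReference",
            },
            "parameters": {
                "windowStart": "@trigger().outputs.windowStartTime",
                "windowEnd": "@trigger().outputs.windowEndTime",
            },
        },
    },
}

if __name__ == "__main__":
    print(json.dumps(trigger_definition, indent=2))
```

Because the startTime is set to the earliest folder, the tumbling window trigger backfills all existing data, while the 30-minute interval and 2-minute delay cover the ongoing loads and the late-arrival requirement.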

Does this meet the goal?

  1. A. Yes
  2. B. No

Correct Answer: A

QUESTION 37

- (Exam Topic 3)
You have an Azure Data Factory pipeline that is triggered hourly. The pipeline has had 100% success for the past seven days.
The pipeline execution fails, and two retries that occur 15 minutes apart also fail. The third failure returns the following error.
[Exhibit]
What is a possible cause of the error?

  1. A. The parameter used to generate year=2021/month=01/day=10/hour=06 was incorrect.
  2. B. From 06:00 to 07:00 on January 10, 2021, there was no data in wwi/BIKES/CARBON.
  3. C. From 06:00 to 07:00 on January 10, 2021, the file format of data in wwi/BIKES/CARBON was incorrect.
  4. D. The pipeline was triggered too early.

Correct Answer: C

QUESTION 38

- (Exam Topic 3)
You are creating an Azure Data Factory data flow that will ingest data from a CSV file, cast columns to specified types of data, and insert the data into a table in an Azure Synapse Analytics dedicated SQL pool. The CSV file contains three columns named username, comment, and date.
The data flow already contains the following:
- A source transformation.
- A Derived Column transformation to set the appropriate types of data.
- A sink transformation to land the data in the pool.
You need to ensure that the data flow meets the following requirements:
- All valid rows must be written to the destination table.
- Truncation errors in the comment column must be avoided proactively.
- Any rows containing comment values that will cause truncation errors upon insert must be written to a file in blob storage.
Which two actions should you perform? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

  1. A. To the data flow, add a sink transformation to write the rows to a file in blob storage.
  2. B. To the data flow, add a Conditional Split transformation to separate the rows that will cause truncation errors.
  3. C. To the data flow, add a filter transformation to filter out rows that will cause truncation errors.
  4. D. Add a select transformation to select only the rows that will cause truncation errors.

Correct Answer: AB
B: Example:
1. This conditional split transformation defines the maximum length of "title" to be five. Any row that is less than or equal to five will go into the GoodRows stream. Any row that is larger than five will go into the BadRows stream.
[Exhibit]
A:
2. Now we need to log the rows that failed. Add a sink transformation to the BadRows stream for logging. Here, we'll "auto-map" all of the fields so that we have logging of the complete transaction record. This is a text-delimited CSV file output to a single file in Blob Storage. We'll call the log file "badrows.csv".
[Exhibit]
3. The completed data flow is shown below. We are now able to split off error rows to avoid the SQL truncation errors and put those entries into a log file. Meanwhile, successful rows can continue to write to our target database.
[Exhibit]
Reference:
https://docs.microsoft.com/en-us/azure/data-factory/how-to-data-flow-error-rows
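To make the split condition concrete, the following is a minimal local sketch in Python of the same idea: rows whose comment would exceed the destination column length (assumed here to be 100 characters, purely for illustration) are diverted to a badrows.csv file, while valid rows continue to the main load. This simulates the Conditional Split and the error-row sink locally; it is not the data flow itself.

```python
import csv

# Assumed maximum length of the destination "comment" column; illustrative only.
MAX_COMMENT_LENGTH = 100

def split_rows(rows):
    """Mimic the Conditional Split: valid rows vs. rows that would truncate."""
    good, bad = [], []
    for row in rows:
        (good if len(row["comment"]) <= MAX_COMMENT_LENGTH else bad).append(row)
    return good, bad

rows = [
    {"username": "alice", "comment": "short comment", "date": "2021-01-10"},
    {"username": "bob", "comment": "x" * 500, "date": "2021-01-10"},
]

good_rows, bad_rows = split_rows(rows)

# The BadRows stream is landed as a delimited file in blob storage; locally we
# just write "badrows.csv" to show the shape of that sink.
with open("badrows.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["username", "comment", "date"])
    writer.writeheader()
    writer.writerows(bad_rows)

print(f"{len(good_rows)} rows to the SQL pool, {len(bad_rows)} rows to badrows.csv")
```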

QUESTION 39

- (Exam Topic 3)
You have an Azure Data Factory instance named ADF1 and two Azure Synapse Analytics workspaces named WS1 and WS2.
ADF1 contains the following pipelines:
- P1: Uses a copy activity to copy data from a nonpartitioned table in a dedicated SQL pool of WS1 to an Azure Data Lake Storage Gen2 account
- P2: Uses a copy activity to copy data from text-delimited files in an Azure Data Lake Storage Gen2 account to a nonpartitioned table in a dedicated SQL pool of WS2
You need to configure P1 and P2 to maximize parallelism and performance.
Which dataset settings should you configure for the copy activity of each pipeline? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
[Exhibit]
Solution:
Box 1: Set the Copy method to PolyBase
While SQL pool supports many loading methods including non-Polybase options such as BCP and SQL BulkCopy API, the fastest and most scalable way to load data is through PolyBase. PolyBase is a technology that accesses external data stored in Azure Blob storage or Azure Data Lake Store via the T-SQL language.
Box 2: Set the Copy method to Bulk insert
PolyBase is not possible for the text files, so Bulk insert must be used.
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql/load-data-overview
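For orientation, the copy activity sink settings behind these two choices look roughly as follows. This is a sketch of the copy sink JSON expressed as Python dictionaries; the reject and batch values are illustrative assumptions, and the property names follow the Azure Synapse Analytics connector documentation for the copy activity.

```python
# Sketch of the copy activity sink settings for each choice; the reject and
# batch values are assumptions used only to show the shape of the settings.
polybase_sink = {
    "type": "SqlDWSink",
    "allowPolyBase": True,           # Box 1: load via PolyBase
    "polyBaseSettings": {
        "rejectType": "percentage",
        "rejectValue": 10.0,
        "rejectSampleValue": 100,
        "useTypeDefault": True,
    },
}

bulk_insert_sink = {
    "type": "SqlDWSink",
    "allowPolyBase": False,          # Box 2: fall back to bulk insert
    "writeBatchSize": 10000,         # illustrative batch size
}
```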

Does this meet the goal?

  1. A. Yes
  2. B. No

Correct Answer: A

QUESTION 40

- (Exam Topic 1)
You need to design the partitions for the product sales transactions. The solution must meet the sales transaction dataset requirements.
What should you include in the solution? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
[Exhibit]
Solution:
Box 1: Sales date
Scenario: Contoso requirements for data integration include:
- Partition data that contains sales transaction records. Partitions must be designed to provide efficient loads by month. Boundary values must belong to the partition on the right.
Box 2: An Azure Synapse Analytics dedicated SQL pool
Scenario: Contoso requirements for data integration include:
- Ensure that data storage costs and performance are predictable.
The size of a dedicated SQL pool (formerly SQL DW) is determined by Data Warehousing Units (DWU). A dedicated SQL pool stores data in relational tables with columnar storage. This format significantly reduces data storage costs and improves query performance.
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-overview-wha
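As a small illustration of the requirement that boundary values belong to the partition on the right (RANGE RIGHT semantics), the sketch below assigns sales dates to monthly partitions. The boundary dates are assumptions chosen only to demonstrate the behavior.

```python
from bisect import bisect_right
from datetime import date

# Assumed monthly boundary values, chosen only to illustrate RANGE RIGHT
# semantics: a boundary value belongs to the partition on its right.
boundaries = [date(2021, 2, 1), date(2021, 3, 1), date(2021, 4, 1)]

def partition_number(sales_date: date) -> int:
    """Return the 1-based partition a sales date falls into (RANGE RIGHT)."""
    # bisect_right counts the boundaries <= sales_date, so a date equal to a
    # boundary lands in the partition to the right of that boundary.
    return bisect_right(boundaries, sales_date) + 1

print(partition_number(date(2021, 1, 31)))  # 1: before the first boundary
print(partition_number(date(2021, 2, 1)))   # 2: the boundary value goes right
```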

Does this meet the goal?

  1. A. Yes
  2. B. No

Correct Answer: A

