DP-700 Practice Questions and Answers Part 1: Microsoft Fabric Exam Prep
If you are preparing for the Microsoft DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric exam, this practice set will help you understand the types of questions you may face in the exam. This post includes 10 DP-700 practice questions with correct answers and detailed explanations.
These questions focus on important Microsoft Fabric topics such as lakehouse security, eventhouses, data pipelines, Direct Lake performance, dimensional modeling, Spark notebook sessions, workspace permissions, and row-level security.
DP-700 Practice Questions and Answers – Part 1
Question 1: Granting Data Analysts Access to a Gold Layer Lakehouse
You are implementing a Microsoft Fabric medallion architecture. Data engineers must have read and write access to all lakehouses, including the underlying files. Data analysts must only have read access to the Delta tables in the gold layer and must not access the bronze or silver layers.
You need to ensure that the data analysts can access the gold layer lakehouse. What should you do?
- Add the DataAnalyst group to the Viewer role for WorkspaceA.
- Share the lakehouse with the DataAnalysts group and grant the Build reports on the default semantic model permission.
- Share the lakehouse with the DataAnalysts group and grant the Read all SQL Endpoint data permission.
- Share the lakehouse with the DataAnalysts group and grant the Read all Apache Spark permission.
Correct Answer: C. Share the lakehouse with the DataAnalysts group and grant the Read all SQL Endpoint data permission.
Explanation
The best option is to share the lakehouse directly with the DataAnalysts group and grant Read all SQL Endpoint data. This allows analysts to query the Delta tables through the SQL endpoint without giving them broader workspace access or Spark-level access to the underlying files.
Granting the Viewer role at the workspace level could expose more items than required. Granting Apache Spark permissions would also be too broad because it could allow access to lakehouse files. The SQL endpoint permission follows the principle of least privilege while still allowing analysts to query the gold layer.
Question 2: Choosing a Storage Item for T-SQL, KQL, and Spark Access
You have a Fabric workspace and semi-structured data. You need to read the data by using T-SQL, KQL, and Apache Spark. The data will only be written by using Spark.
What should you use to store the data?
- A lakehouse
- An eventhouse
- A datamart
- A warehouse
Correct Answer: B. An eventhouse
Explanation
An eventhouse is designed for real-time intelligence and event-based analytical workloads in Microsoft Fabric. It supports KQL-based querying and can be used in scenarios that involve high-volume or semi-structured event data.
In this scenario, the requirement includes reading data by using KQL, which makes an eventhouse the best fit. A lakehouse is excellent for Spark and SQL analytics, but KQL is strongly associated with Real-Time Intelligence and eventhouse workloads.
Question 3: Copying Data from On-Premises SQL Server to a Fabric Warehouse
You have a Fabric workspace that contains a warehouse named Warehouse1. You also have an on-premises Microsoft SQL Server database named Database1 that is accessed by using an on-premises data gateway.
You need to copy data from Database1 to Warehouse1. Which item should you use?
- A Dataflow Gen1 dataflow
- A data pipeline
- A KQL queryset
- A notebook
Correct Answer: B. A data pipeline
Explanation
A data pipeline is the recommended Fabric item for orchestrating and moving data between source and destination systems. It can use a gateway connection to access on-premises SQL Server and copy data into a Fabric warehouse.
Data pipelines are designed for data movement, scheduling, orchestration, and activity management. A KQL queryset is used for querying KQL databases, while a notebook is more suitable for code-based transformations. In this case, the requirement is specifically to copy data, so a data pipeline is the correct choice.
Question 4: Selecting the Best Fabric Item for On-Premises Data Movement
You have a Fabric workspace that contains a warehouse named Warehouse1. You have an on-premises Microsoft SQL Server database named Database1 that is accessed by using an on-premises data gateway.
You need to copy data from Database1 to Warehouse1. Which item should you use?
- An Apache Spark job definition
- A data pipeline
- A Dataflow Gen1 dataflow
- An eventstream
Correct Answer: B. A data pipeline
Explanation
A data pipeline is the most suitable Fabric item for copying data from an on-premises SQL Server database to a Fabric warehouse. It supports source and destination configuration, gateway connectivity, scheduling, and retry behavior.
An Apache Spark job definition is mainly used for Spark-based workloads, and an eventstream is used for streaming data. Since this requirement is a batch copy operation from SQL Server to a warehouse, a data pipeline is the correct answer.
Question 5: Improving Direct Lake Report Performance
You have a Fabric F32 capacity that contains a workspace. The workspace contains a warehouse named DW1 that is modeled by using MD5 hash surrogate keys.
DW1 contains a fact table that has grown from 200 million rows to 500 million rows during the past year. Power BI reports based on Direct Lake show year-over-year values. Users report degraded performance and some visuals show errors.
You need to resolve the performance issues. The solution must provide the best query performance and minimize operational costs. What should you do?
- Change the MD5 hash to SHA256.
- Increase the capacity.
- Enable V-Order.
- Modify the surrogate keys to use a different data type.
- Create views.
Correct Answer: C. Enable V-Order
Explanation
V-Order is an optimization feature in Microsoft Fabric that improves query performance by optimizing how data is stored and read. It is especially useful for analytical workloads and Direct Lake scenarios where large tables are queried frequently.
Increasing capacity could improve performance, but it also increases operational cost. Changing the hash algorithm does not directly address the query performance problem. Enabling V-Order provides a cost-efficient way to improve query speed for large analytical datasets.
Question 6: Summarizing Order Quantity by Year and Product
You have a Fabric warehouse named DW1. DW1 contains sales order details and product information. You need to create an output that summarizes order quantities by year and product. The result must also include a summary of order quantities at the year level for all products.
Which SQL grouping approach should you use?
Correct Answer: Use SELECT YEAR(...) and ROLLUP(YEAR(...), ProductName).
Explanation
The correct solution uses the YEAR function to extract the year from the order date and uses ROLLUP to generate subtotal rows. The ROLLUP operator is useful when you need grouped totals at multiple levels.
In this case, the query must return both:
- Order quantity by year and product
- Total order quantity by year across all products
A simplified version of the query structure is:
SELECT
YEAR(SO.ModifiedDate) AS OrderDate,
P.Name AS ProductName,
SUM(SO.OrderQty) AS OrderQty
FROM dbo.SalesOrderDetail AS SO
INNER JOIN dbo.Product AS P
ON P.ProductID = SO.ProductID
GROUP BY ROLLUP(YEAR(SO.ModifiedDate), P.Name)
ORDER BY OrderDate;
This approach meets the requirement because ROLLUP automatically adds summary rows for higher-level groupings.
Question 7: Designing a Product Dimension Table
You have a Fabric lakehouse named Lakehouse1. Data is ingested into Lakehouse1 as one flat table with the following columns:
- TransactionID
- Date
- ProductID
- ProductColor
- ProductName
- SalesAmount
You plan to load the data into a dimensional model and implement a star schema. You create two tables named FactSales and DimProduct. You will track changes in DimProduct.
Which three columns should you include in the DimProduct table?
- Date
- ProductName
- ProductColor
- TransactionID
- SalesAmount
- ProductID
Correct Answer: B, C, and F — ProductName, ProductColor, and ProductID
Explanation
A product dimension should contain descriptive product attributes and a unique product identifier. Therefore, ProductID, ProductName, and ProductColor belong in the DimProduct table.
The FactSales table should contain transactional and measurable data such as TransactionID, Date, and SalesAmount. Product attributes should be placed in the dimension table so that they can be reused for reporting, filtering, grouping, and historical tracking.
Question 8: Sharing the Same Apache Spark Session Between Notebooks
You have a Fabric workspace named Workspace1 that contains a notebook named Notebook1. You create a new notebook named Notebook2.
You need to ensure that Notebook2 can attach to the same Apache Spark session as Notebook1. What should you do?
- Enable high concurrency for notebooks.
- Enable dynamic allocation for the Spark pool.
- Change the runtime version.
- Increase the number of executors.
Correct Answer: A. Enable high concurrency for notebooks.
Explanation
High concurrency allows multiple notebooks to share the same Apache Spark session. This is useful when several notebooks need to run interactively in the same Spark environment and share session-level context.
Dynamic allocation and executor count affect resource usage, but they do not allow multiple notebooks to attach to the same Spark session. Changing the runtime version also does not meet the requirement.
Question 9: Allowing a Data Engineer to Write to One Table Without Reading PII
You have a Fabric workspace named Workspace1 that contains a lakehouse named Lakehouse1. Lakehouse1 contains the following tables:
- Orders
- Customer
- Employee
The Employee table contains Personally Identifiable Information (PII). A data engineer must write data to the Customer table but must not view the contents of the Employee table.
Which three actions should you perform?
- Share Lakehouse1 with the data engineer.
- Assign the data engineer the Contributor role for Workspace2.
- Assign the data engineer the Viewer role for Workspace2.
- Assign the data engineer the Contributor role for Workspace1.
- Migrate the Employee table from Lakehouse1 to Lakehouse2.
- Create a new workspace named Workspace2 that contains a new lakehouse named Lakehouse2.
- Assign the data engineer the Viewer role for Workspace1.
Correct Answer: D, E, and F
Explanation
The data engineer needs write access to the Customer table, which requires appropriate permissions in Workspace1. However, because the Employee table contains PII, it should be isolated from the operational workspace used by the data engineer.
The correct approach is to:
- Assign the data engineer the Contributor role for Workspace1.
- Move the Employee table to a separate lakehouse.
- Create a separate workspace and lakehouse for the sensitive Employee table.
This design separates sensitive PII data from the workspace where the data engineer performs operational work.
Question 10: Implementing Row-Level Security in a Fabric Warehouse
You have a Fabric warehouse named DW1. DW1 contains a table that stores sales data and is used by multiple sales representatives.
You plan to implement row-level security. You need to ensure that sales representatives can see only their respective data.
Which warehouse object do you require to implement row-level security?
- Stored procedure
- Constraint
- Schema
- Function
Correct Answer: D. Function
Explanation
To implement row-level security in a Fabric warehouse, you need a function that defines the filter logic. This function is used by a security policy to determine which rows a user is allowed to access.
A stored procedure does not automatically filter table access for users. A constraint is used for data rules, not user-based filtering. A schema organizes database objects but does not define RLS logic. Therefore, a function is required.
Summary of DP-700 Practice Questions Part 1
| Question | Main Topic | Correct Answer |
|---|---|---|
| Question 1 | Lakehouse SQL endpoint permissions | C |
| Question 2 | Eventhouse storage selection | B |
| Question 3 | Data pipeline for on-premises SQL Server | B |
| Question 4 | Fabric data movement | B |
| Question 5 | Direct Lake performance optimization | C |
| Question 6 | SQL ROLLUP aggregation | YEAR + ROLLUP |
| Question 7 | Dimensional modeling | B, C, F |
| Question 8 | Spark notebook high concurrency | A |
| Question 9 | Workspace and PII isolation | D, E, F |
| Question 10 | Row-level security | D |
Frequently Asked Questions
Is DP-700 focused only on Microsoft Fabric?
Yes. The DP-700 exam focuses on implementing data engineering solutions by using Microsoft Fabric, including lakehouses, warehouses, data pipelines, notebooks, eventstreams, and real-time analytics.
What is the best way to prepare for DP-700?
The best approach is to study Microsoft Fabric concepts, practice scenario-based questions, understand why each answer is correct, and gain hands-on experience with Fabric workloads.
Are data pipelines important for the DP-700 exam?
Yes. Data pipelines are an important part of DP-700 because they are used for orchestration, ingestion, scheduling, copying data, and managing dependencies between Fabric activities.
What is row-level security in Fabric Warehouse?
Row-level security restricts access to rows in a table based on the identity or role of the user. In Fabric Warehouse, it is typically implemented by using a filter function and a security policy.
Final Thoughts
This was Part 1 of the DP-700 practice question series. These first 10 questions covered several high-value Microsoft Fabric exam topics, including lakehouse access control, eventhouses, data pipelines, Direct Lake performance, dimensional modeling, Spark sessions, and row-level security.
Continue practicing with the next set of questions to strengthen your understanding of real exam scenarios and improve your readiness for the Microsoft DP-700 certification exam.