Exam DP-750 Topic 1 Question 10 Discussion

Actual exam question for Microsoft's DP-750 exam
Question #: 10
Topic #: 1

You have an Azure Databricks workspace that is enabled for Unity Catalog and contains a Delta table named Orders.
You load the Orders table into an Apache Spark DataFrame named df.
You need to create a DataFrame that excludes rows where the order amount is null.
Solution: You run the following expression.
df.filter(df.order_amount.isNotNull())
Does this meet the goal?

A. Yes B. No

Suggested Answer: A

The correct answer is A - Yes.
df.filter(df.order_amount.isNotNull()) is the correct PySpark pattern for excluding null rows. isNotNull() is a Column method that builds a Boolean expression, which evaluates to true for every row where order_amount has a value and false for rows where it is null. Spark's filter keeps only the rows where the condition evaluates to true, producing a new DataFrame with all null order_amount rows removed.
This works correctly because isNotNull() is explicitly null-aware - unlike the != None comparison in Q52, it doesn't rely on Python equality semantics. Under the hood it maps to the SQL expression order_amount IS NOT NULL, which is unambiguous in both SQL and Spark.
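The SQL side of this can be illustrated without a cluster. The sketch below uses Python's built-in sqlite3 module as a stand-in for Spark SQL (an assumption made for portability, not how Databricks runs the query), with a made-up orders table. It shows why IS NOT NULL is the safe form: an ordinary comparison against NULL evaluates to unknown rather than true, so it matches no rows at all.

```python
import sqlite3

# In-memory database with a hypothetical orders table; row 2 has no amount.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, order_amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, None), (3, 25.5)],
)


def order_ids(where_clause):
    """Return the order_id values matching the given WHERE clause."""
    query = f"SELECT order_id FROM orders WHERE {where_clause} ORDER BY order_id"
    return [row[0] for row in conn.execute(query)]


# Null-aware predicate: keeps every row that has an amount.
with_is_not_null = order_ids("order_amount IS NOT NULL")

# Ordinary comparison with NULL: the result is unknown for every row,
# so nothing passes the filter.
with_not_equal = order_ids("order_amount != NULL")

print(with_is_not_null)  # [1, 3]
print(with_not_equal)    # []
```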
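For this goal, df.filter(df.order_amount.isNotNull()) and df.dropna(subset=['order_amount']) both remove the null rows. One caveat: on float or double columns, dropna also removes NaN values, which isNotNull() keeps, so the results are identical only when the column contains no NaN.
Apart from that, the choice between them is largely stylistic - isNotNull() reads more explicitly as a filter condition, while dropna is more compact when handling multiple columns.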
Reference: https://learn.microsoft.com/en-us/azure/databricks/pyspark/basics

by Upton at Jun 05, 2026, 05:42 AM
