NULL handling in SQL is a crucial part of working with relational databases because it represents the absence of a value in a database column. Understanding how to handle NULLs is vital because NULLs can affect the logic and results of queries. Here’s a comprehensive explanation covering the key aspects of NULL handling in SQL, including both functions and the logic of NULLs with examples.
### IS NULL and IS NOT NULL
– **IS NULL**: This predicate is used to check if a column contains a NULL value.
**Example**:
“`sql
SELECT * FROM employees WHERE manager_id IS NULL;
“`
In this example, we’re retrieving all employees who do not have a manager (i.e., `manager_id` is NULL).
– **IS NOT NULL**: This predicate checks if a column does not contain a NULL value.
**Example**:
“`sql
SELECT * FROM employees WHERE manager_id IS NOT NULL;
“`
This will return employees who have a manager associated with them.
### COALESCE
– **COALESCE**: This function returns the first non-NULL expression among its arguments. It is useful for providing default values for NULLs.
**Example**:
“`sql
SELECT employee_id, COALESCE(phone_number, ‘N/A’) as contact_number FROM employees;
“`
This example will return ‘N/A’ for `phone_number` if it is NULL for any employee.
### NULLIF
– **NULLIF**: This function compares two expressions and returns NULL if they are equal; otherwise, it returns the first expression.
**Example**:
“`sql
SELECT NULLIF(salary, 0) as valid_salary FROM employees;
“`
Here, `nullif` is used to replace `0` salaries with NULL.
### NVL (Oracle-specific)
– **NVL**: This function is specific to Oracle Database and replaces NULL with a specified value.
**Example**:
“`sql
SELECT NVL(commission_pct, 0) as commission FROM sales;
“`
Here, the `commission_pct` is replaced with `0` if it is NULL.
### Three-Valued Logic (3VL)
In SQL, logical expressions can evaluate to three values: TRUE, FALSE, and UNKNOWN. UNKNOWN is the result of any comparison involving a NULL that isn’t handled using some specific function or predicate.
– **TRUE**: The logical condition holds.
– **FALSE**: The logical condition does not hold.
– **UNKNOWN**: The logical condition involves NULL without a specific instruction for handling it, resulting in uncertainty.
**Example**:
“`sql
SELECT * FROM employees WHERE salary > 50000 AND bonus > 1000;
“`
If `bonus` is NULL for an employee, the condition (`salary > 50000 AND bonus > 1000`) evaluates to UNKNOWN for that employee, hence, they are not selected.
### Common Pitfalls
1. **Misunderstanding NULL Comparisons**:
– Using equality (`=`) or inequality (`!=`) operators with NULL will not work as expected. Instead, use `IS NULL` or `IS NOT NULL`.
“`sql
— Incorrect
SELECT * FROM employees WHERE bonus = NULL; — Never true
— Correct
SELECT * FROM employees WHERE bonus IS NULL;
“`
2. **Arithmetic with NULL**:
– Any arithmetic operation with NULL results in NULL.
“`sql
SELECT salary + bonus FROM employees; — If bonus is NULL, result will be NULL
“`
3. **Aggregations**:
– Most aggregate functions (e.g., SUM, AVG) ignore NULL values.
“`sql
SELECT AVG(salary) FROM employees; — NULLs are ignored in the computation
“`
To effectively work with NULLs, it’s important to apply appropriate NULL-handling techniques and recognize the impact of three-valued logic on SQL query results. Each SQL function and predicate involving NULL must be used with an understanding of how they interact with regular data to avoid unexpected outcomes.