Oracle SQL Tutorial: Your Guide to Advanced Query Techniques

1. Overview
Oracle SQL offers many advanced features that help you write efficient and powerful queries. This guide covers the most useful ones, beginning with analytic functions and common table expressions.
2. Analytic Functions
Analytic functions allow you to perform complex calculations across rows related to the current row without needing self-joins or subqueries. Examples include ROW_NUMBER, RANK, DENSE_RANK, LEAD, and LAG.
For the remainder of this section, we will use data from the following table:
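CREATE TABLE sales_data (
order_id NUMBER,
order_date DATE,
product_name VARCHAR2(50),
sales_amount NUMBER(10, 2)
);
-- INSERT ALL loads several rows in one statement; ANSI DATE literals
-- avoid any dependence on the session's NLS_DATE_FORMAT setting.
INSERT ALL
INTO sales_data VALUES (1, DATE '2023-01-10', 'Product A', 100.00)
INTO sales_data VALUES (2, DATE '2023-01-12', 'Product B', 150.00)
INTO sales_data VALUES (3, DATE '2023-01-15', 'Product A', 75.00)
INTO sales_data VALUES (4, DATE '2023-01-18', 'Product C', 200.00)
INTO sales_data VALUES (5, DATE '2023-01-20', 'Product B', 125.00)
INTO sales_data VALUES (6, DATE '2023-01-22', 'Product A', 90.00)
INTO sales_data VALUES (7, DATE '2023-01-25', 'Product C', 175.00)
SELECT 1 FROM dual;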
2.1 ROW_NUMBER
ROW_NUMBER() assigns a unique integer value to each row within the result set based on the order specified. It’s often used for pagination or identifying specific rows.
SELECT order_id, order_date, product_name, sales_amount,
ROW_NUMBER() OVER (ORDER BY order_date) AS row_num
FROM sales_data;
Result:
order_id | order_date | product_name | sales_amount | row_num |
---|---|---|---|---|
1 | 2023-01-10 | Product A | 100.00 | 1 |
2 | 2023-01-12 | Product B | 150.00 | 2 |
3 | 2023-01-15 | Product A | 75.00 | 3 |
4 | 2023-01-18 | Product C | 200.00 | 4 |
5 | 2023-01-20 | Product B | 125.00 | 5 |
6 | 2023-01-22 | Product A | 90.00 | 6 |
7 | 2023-01-25 | Product C | 175.00 | 7 |
2.2 RANK and DENSE_RANK
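RANK() and DENSE_RANK(): Both assign a rank to each row based on the specified order, and both give tied rows the same rank. They differ in what follows a tie: RANK() skips the ranks that the tied rows occupied, leaving gaps, while DENSE_RANK() continues with the next consecutive rank. The sample data contains no tied sales amounts, so the two columns are identical in the result.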
SELECT product_name, sales_amount,
RANK() OVER (ORDER BY sales_amount DESC) AS rank,
DENSE_RANK() OVER (ORDER BY sales_amount DESC) AS dense_rank
FROM sales_data;
Result:
product_name | sales_amount | rank | dense_rank |
---|---|---|---|
Product C | 200.00 | 1 | 1 |
Product C | 175.00 | 2 | 2 |
Product B | 150.00 | 3 | 3 |
Product B | 125.00 | 4 | 4 |
Product A | 100.00 | 5 | 5 |
Product A | 90.00 | 6 | 6 |
Product A | 75.00 | 7 | 7 |
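Because the sample table has no ties, the difference between the two functions only shows up with different data. The following is a minimal sketch that uses a hypothetical inline data set (tied_sales and its four rows are made up for illustration, not part of the tutorial's table) in which two products share the top sales amount.

```sql
-- Hypothetical inline data: Product A and Product B tie at 200
WITH tied_sales AS (
  SELECT 'Product A' AS product_name, 200 AS sales_amount FROM dual UNION ALL
  SELECT 'Product B', 200 FROM dual UNION ALL
  SELECT 'Product C', 150 FROM dual UNION ALL
  SELECT 'Product D', 100 FROM dual
)
SELECT product_name, sales_amount,
       RANK()       OVER (ORDER BY sales_amount DESC) AS rank_with_gaps,
       DENSE_RANK() OVER (ORDER BY sales_amount DESC) AS rank_without_gaps
FROM tied_sales;
```

Both tied rows receive rank 1 from each function. Product C then gets 3 from RANK() (rank 2 is skipped) but 2 from DENSE_RANK(), and Product D gets 4 and 3 respectively.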
2.3 LEAD and LAG
LEAD() and LAG(): These functions allow you to access the value of a column in the next row (LEAD()) or the previous row (LAG()) within a specified window. They are useful for calculating differences or trends.
SELECT order_id, order_date, product_name, sales_amount,
LAG(order_date) OVER (ORDER BY order_date) AS prev_order_date,
LEAD(order_date) OVER (ORDER BY order_date) AS next_order_date
FROM sales_data;
Result:
order_id | order_date | product_name | sales_amount | prev_order_date | next_order_date |
---|---|---|---|---|---|
1 | 2023-01-10 | Product A | 100.00 | null | 2023-01-12 |
2 | 2023-01-12 | Product B | 150.00 | 2023-01-10 | 2023-01-15 |
3 | 2023-01-15 | Product A | 75.00 | 2023-01-12 | 2023-01-18 |
4 | 2023-01-18 | Product C | 200.00 | 2023-01-15 | 2023-01-20 |
5 | 2023-01-20 | Product B | 125.00 | 2023-01-18 | 2023-01-22 |
6 | 2023-01-22 | Product A | 90.00 | 2023-01-20 | 2023-01-25 |
7 | 2023-01-25 | Product C | 175.00 | 2023-01-22 | null |
2.4 SUM, AVG, MIN, and MAX with the OVER clause
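SUM(), AVG(), MIN(), and MAX() with the OVER clause: These functions can perform aggregate calculations across a window of rows specified by the OVER clause, such as a running total or a rolling average. When the OVER clause has an ORDER BY but no explicit frame, the window extends from the first row to the current row, so SUM() produces a cumulative total: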
SELECT order_id, order_date, product_name, sales_amount,
SUM(sales_amount) OVER (ORDER BY order_date) AS cumulative_sales
FROM sales_data;
Result:
order_id | order_date | product_name | sales_amount | cumulative_sales |
---|---|---|---|---|
1 | 2023-01-10 | Product A | 100.00 | 100.00 |
2 | 2023-01-12 | Product B | 150.00 | 250.00 |
3 | 2023-01-15 | Product A | 75.00 | 325.00 |
4 | 2023-01-18 | Product C | 200.00 | 525.00 |
5 | 2023-01-20 | Product B | 125.00 | 650.00 |
6 | 2023-01-22 | Product A | 90.00 | 740.00 |
7 | 2023-01-25 | Product C | 175.00 | 915.00 |
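A rolling average needs an explicit window frame rather than the default one. The sketch below assumes that "rolling" means the current order plus the two orders before it; the three-row frame size and the rolling_avg_3_orders alias are illustrative choices, not something the tutorial prescribes.

```sql
SELECT order_id, order_date, sales_amount,
       -- Average over the current row and the two preceding rows by date
       ROUND(AVG(sales_amount) OVER (
               ORDER BY order_date
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
             ), 2) AS rolling_avg_3_orders
FROM sales_data
ORDER BY order_date;
```

On the sample data the first three values are 100, 125, and 108.33: the frame simply contains fewer rows until three orders are available.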
2.5 NTILE
NTILE(): Divides the ordered result into the specified number of roughly equal-sized groups (buckets) and assigns a bucket number to each row. This is useful for creating percentiles or quartiles. The related WIDTH_BUCKET() function instead assigns each value to one of several equal-width intervals across a value range, which is useful for creating histograms.
SELECT order_id, order_date, product_name, sales_amount,
NTILE(4) OVER (ORDER BY sales_amount) AS quartile
FROM sales_data;
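For comparison, here is a sketch of WIDTH_BUCKET() on the same table. The range of 0 to 250 and the count of 5 buckets are arbitrary values chosen for illustration, giving intervals that are each 50 wide.

```sql
SELECT order_id, sales_amount,
       -- Arguments: expression, lower bound, upper bound, number of buckets
       WIDTH_BUCKET(sales_amount, 0, 250, 5) AS sales_bucket
FROM sales_data
ORDER BY sales_amount;
```

With these bounds, 75.00 and 90.00 fall in bucket 2, 100.00 and 125.00 in bucket 3, 150.00 and 175.00 in bucket 4, and 200.00 in bucket 5. Unlike NTILE(), the buckets are defined by value ranges, so they need not contain equal numbers of rows.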
2.6 FIRST_VALUE and LAST_VALUE
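FIRST_VALUE() and LAST_VALUE(): These functions return the first and last values in a window, respectively. They are often used to find the earliest and latest values within a time period. Note that with an ORDER BY and no explicit frame, the window ends at the current row, so LAST_VALUE() needs a frame that extends to the end of the result set.
SELECT product_name, sales,
FIRST_VALUE(product_name) OVER (ORDER BY sales DESC) AS best_selling_product,
-- Without this frame, LAST_VALUE would stop at the current row
LAST_VALUE(product_name) OVER (ORDER BY sales DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS worst_selling_product
FROM products;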
2.7 PERCENTILE_CONT and PERCENTILE_DISC
PERCENTILE_CONT() and PERCENTILE_DISC(): These functions calculate a specified percentile of an ordered set of values. PERCENTILE_CONT() assumes a continuous distribution and returns a value interpolated between adjacent values, while PERCENTILE_DISC() returns an actual value from the data set.
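As a brief illustration, the following sketch computes the 25th percentile of sales_amount in the sales_data table both ways. The column aliases are made up for this example.

```sql
SELECT PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY sales_amount) AS p25_interpolated,
       PERCENTILE_DISC(0.25) WITHIN GROUP (ORDER BY sales_amount) AS p25_actual
FROM sales_data;
```

For the seven sample rows, PERCENTILE_CONT() returns 95, halfway between the second and third smallest amounts (90 and 100), whereas PERCENTILE_DISC() returns 90, a value that actually occurs in the table.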
2.8 CUME_DIST
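CUME_DIST(): Calculates the cumulative distribution of a value within a window: the fraction of rows whose value is less than or equal to the current row's value in the specified order. The result is greater than 0 and at most 1, and indicates the relative position of a value compared to the others.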
SELECT exam_score, CUME_DIST() OVER (ORDER BY exam_score) AS cumulative_distribution
FROM exam_scores;
2.9 LISTAGG
LISTAGG(): Concatenates values from multiple rows into a single string with an optional separator. This is useful for creating comma-separated lists or other concatenated results.
SELECT LISTAGG(first_name, ', ') WITHIN GROUP (ORDER BY employee_id) AS concatenated_names
FROM employees;
3. Common Table Expressions (CTEs)
Common Table Expressions (CTEs) are temporary result sets that can be defined within a SQL query. They are particularly useful for breaking down complex queries into more manageable parts and improving query readability. Here are some examples of how CTEs can be used.
3.1 Recursive Queries
Recursive Queries: CTEs are often used for recursive queries. For example, you can use a CTE to query hierarchical data like an organizational chart or a bill of materials. The CTE can reference itself to navigate through the hierarchy.
-- Oracle requires a column alias list on a recursive WITH clause
WITH RecursiveOrgChart (employee_id, manager_id) AS (
SELECT employee_id, manager_id
FROM employees
WHERE manager_id IS NULL -- Starting point
UNION ALL
SELECT e.employee_id, e.manager_id
FROM employees e
INNER JOIN RecursiveOrgChart roc ON e.manager_id = roc.employee_id
)
SELECT * FROM RecursiveOrgChart;
3.2 Data Transformation
Data Transformation: CTEs can be used to transform data in a way that makes it easier to work with in subsequent queries. For example, you can unpivot a wide table so that each product column becomes its own row.
WITH NormalizedData AS (
SELECT customer_id, order_id, product
FROM orders
UNPIVOT (product FOR product_type IN (product_A, product_B, product_C))
)
SELECT * FROM NormalizedData;
3.3 Subquery Replacement
Subquery Replacement: Instead of using subqueries in the main query, you can use CTEs to make your code more readable. This is especially helpful for complex subqueries.
WITH TopCustomers AS (
SELECT customer_id, SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id
ORDER BY total_spent DESC
FETCH FIRST 10 ROWS ONLY -- Oracle 12c and later; Oracle has no LIMIT clause
)
SELECT c.customer_name, tc.total_spent
FROM customers c
JOIN TopCustomers tc ON c.customer_id = tc.customer_id;
3.4 Pagination
Pagination: CTEs are useful for implementing pagination, where you can calculate row numbers and then retrieve a specific range of rows.
WITH NumberedRows AS (
-- Oracle requires * to be qualified when other expressions are selected
SELECT o.*,
ROW_NUMBER() OVER (ORDER BY order_date) AS row_num
FROM orders o
)
SELECT *
FROM NumberedRows
WHERE row_num BETWEEN 11 AND 20;
3.5 Complex Aggregations
Complex Aggregations: CTEs can simplify complex aggregation queries, making them easier to understand and maintain.
WITH MonthlySales AS (
SELECT EXTRACT(MONTH FROM order_date) AS month,
SUM(total_amount) AS total_sales
FROM orders
GROUP BY EXTRACT(MONTH FROM order_date)
)
SELECT * FROM MonthlySales;
3.6 Recursive Data Generation
Recursive Data Generation: CTEs can be used to generate a series of data, such as dates for a given time period, which can be used for reporting or analysis.
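-- Anchor row is the first date; each recursive step adds one day
WITH DateSeries (series_date) AS (
SELECT DATE '2023-01-01' FROM dual
UNION ALL
SELECT series_date + 1
FROM DateSeries
WHERE series_date < DATE '2023-12-31'
)
SELECT * FROM DateSeries;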
4. Window Functions
Window functions is the standard SQL term for what Oracle calls analytic functions: they perform calculations across a set of table rows related to the current row. The OVER clause defines the window through optional PARTITION BY, ORDER BY, and frame (ROWS or RANGE) clauses, allowing for operations like calculating running totals or moving averages.
5. Hierarchical Queries
Oracle supports hierarchical queries using the CONNECT BY clause. These are useful when dealing with data organized in a hierarchical or tree-like structure, such as organizational charts or bills of materials.
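A minimal sketch of such a query follows, assuming the same employees table with employee_id and manager_id columns used in the recursive CTE example; the depth and reporting_path aliases are illustrative.

```sql
SELECT employee_id,
       manager_id,
       LEVEL AS depth,                                    -- 1 for the root row
       SYS_CONNECT_BY_PATH(employee_id, '/') AS reporting_path
FROM employees
START WITH manager_id IS NULL                             -- root of the tree
CONNECT BY PRIOR employee_id = manager_id                 -- parent-to-child link
ORDER SIBLINGS BY employee_id;
```

PRIOR marks the side of the condition that belongs to the parent row, so this query walks from each manager down to their direct and indirect reports.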
6. Regular Expressions
Oracle SQL supports powerful regular expression functions for pattern matching and manipulation of strings. Functions like REGEXP_LIKE, REGEXP_SUBSTR, and REGEXP_REPLACE enable sophisticated text processing.
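The three functions can be combined in one query. This sketch runs against the sales_data table from section 2; the patterns and the SKU- prefix are hypothetical and chosen only to show each function's role.

```sql
SELECT product_name,
       -- Extract the trailing capital letter, e.g. 'A' from 'Product A'
       REGEXP_SUBSTR(product_name, '[A-Z]$') AS product_code,
       -- Replace the leading 'Product ' with 'SKU-', e.g. 'SKU-A'
       REGEXP_REPLACE(product_name, '^Product ', 'SKU-') AS sku_label
FROM sales_data
-- Keep only rows named exactly 'Product A' or 'Product B'
WHERE REGEXP_LIKE(product_name, '^Product [AB]$');
```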
7. Model Clause
The MODEL clause allows you to perform calculations involving multiple dimensions and measures. It’s useful for forecasting, financial modeling, and other scenarios where you need to simulate changes across various dimensions.
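As a rough sketch of a forecasting rule, the query below projects 2024 sales as 10% above 2023. The yearly_sales data, its column names, and the growth rate are all hypothetical values made up for this example.

```sql
-- Hypothetical yearly totals for one product
WITH yearly_sales AS (
  SELECT 'Product A' AS product_name, 2022 AS sales_year, 900 AS sales FROM dual UNION ALL
  SELECT 'Product A', 2023, 1000 FROM dual
)
SELECT product_name, sales_year, sales
FROM yearly_sales
MODEL
  PARTITION BY (product_name)    -- rules are applied per product
  DIMENSION BY (sales_year)      -- cells are addressed by year
  MEASURES (sales)               -- the value stored in each cell
  RULES (
    sales[2024] = sales[2023] * 1.10
  )
ORDER BY product_name, sales_year;
```

Under these assumptions the result should contain the two original rows plus a new 2024 row with sales of 1100, since a rule that assigns to a cell that does not yet exist creates it by default.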
8. Materialized Views
Materialized views are precomputed result sets stored in the database, which can significantly improve query performance for complex queries by caching the results.
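One possible shape for this, sketched against the sales_data table from section 2, is shown below. The view name and the choice of a complete, on-demand refresh are assumptions for illustration; other refresh strategies exist.

```sql
CREATE MATERIALIZED VIEW monthly_product_sales_mv
  BUILD IMMEDIATE                -- populate the view when it is created
  REFRESH COMPLETE ON DEMAND     -- recompute only when explicitly requested
AS
SELECT product_name,
       TRUNC(order_date, 'MM') AS sales_month,
       SUM(sales_amount)       AS total_sales
FROM sales_data
GROUP BY product_name, TRUNC(order_date, 'MM');

-- Recompute the stored results after the base table changes
BEGIN
  DBMS_MVIEW.REFRESH('MONTHLY_PRODUCT_SALES_MV');
END;
/
```

Queries can then read from monthly_product_sales_mv like an ordinary table instead of re-aggregating sales_data each time.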
9. Partitioning
Oracle supports table and index partitioning, allowing you to divide large tables or indexes into smaller, more manageable segments. This can lead to improved performance, maintenance, and query optimization.
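To make this concrete, here is a sketch of a range-partitioned variant of the sales table. The table name, partition names, and quarterly boundaries are hypothetical, and the Partitioning option must be available in your Oracle edition.

```sql
CREATE TABLE sales_data_by_quarter (
  order_id     NUMBER,
  order_date   DATE,
  product_name VARCHAR2(50),
  sales_amount NUMBER(10, 2)
)
PARTITION BY RANGE (order_date) (
  PARTITION p2023_q1 VALUES LESS THAN (DATE '2023-04-01'),
  PARTITION p2023_q2 VALUES LESS THAN (DATE '2023-07-01'),
  PARTITION p_future VALUES LESS THAN (MAXVALUE)   -- catch-all for later dates
);

-- A filter on the partition key lets the optimizer skip irrelevant partitions
SELECT SUM(sales_amount) AS q1_sales
FROM sales_data_by_quarter
WHERE order_date >= DATE '2023-01-01'
  AND order_date <  DATE '2023-04-01';
```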
10. Virtual Columns
Virtual columns are columns whose values are derived from expressions or functions based on other columns in the same table. They can be indexed and queried like regular columns.
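A short sketch on the sales_data table from section 2 follows. The sales_with_tax column, the index name, and the 10% rate are invented for the example.

```sql
-- The value is computed from sales_amount on read; nothing is stored in the row
ALTER TABLE sales_data ADD (
  sales_with_tax NUMBER(10, 2)
    GENERATED ALWAYS AS (ROUND(sales_amount * 1.10, 2)) VIRTUAL
);

-- A virtual column can be indexed like a regular column
CREATE INDEX sales_with_tax_idx ON sales_data (sales_with_tax);

SELECT order_id, sales_amount, sales_with_tax
FROM sales_data
WHERE sales_with_tax > 150;
```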
11. Flashback Queries
Oracle provides the ability to query data as it existed at a previous point in time using flashback queries. This can be valuable for auditing, historical analysis, and data recovery.
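For instance, the following sketch reads the sales_data rows as they were 15 minutes ago. The interval is an arbitrary choice, and the query only succeeds if the database still retains enough undo data to reconstruct that point in time.

```sql
SELECT order_id, product_name, sales_amount
FROM sales_data
  AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '15' MINUTE)  -- state 15 minutes ago
WHERE product_name = 'Product A';
```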
12. Advanced Indexing
Oracle supports various indexing techniques such as bitmap indexes, function-based indexes, and domain indexes, which can help optimize query performance for specific use cases.
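Two of these index types can be sketched on the sales_data table from section 2. The index names are hypothetical, and bitmap indexes are generally suited to low-cardinality columns in read-mostly workloads and may not be available in every Oracle edition.

```sql
-- Bitmap index: compact for columns with few distinct values
CREATE BITMAP INDEX sales_product_bix ON sales_data (product_name);

-- Function-based index: stores the result of an expression
CREATE INDEX sales_product_upper_idx ON sales_data (UPPER(product_name));

-- This predicate matches the indexed expression, so the index can be used
SELECT order_id, sales_amount
FROM sales_data
WHERE UPPER(product_name) = 'PRODUCT A';
```

In practice you would pick one of the two for a given column depending on the workload; they are shown together only for comparison.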
13. Fine-Grained Access Control
Oracle allows you to define fine-grained access control using features like Virtual Private Database (VPD), which enables you to control data access at a row or column level based on user roles and attributes.
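A row-level VPD policy pairs a function that returns a predicate with a call to DBMS_RLS.ADD_POLICY. The sketch below is illustrative only: the SALES_APP schema, the policy and function names, and a sales_rep column on the table are all assumptions that do not appear elsewhere in this tutorial.

```sql
-- Policy function: returns the predicate Oracle appends to each query.
-- sales_rep is a hypothetical column holding the owning user's name.
CREATE OR REPLACE FUNCTION sales_rep_predicate (
  p_schema IN VARCHAR2,
  p_object IN VARCHAR2
) RETURN VARCHAR2
AS
BEGIN
  RETURN 'sales_rep = SYS_CONTEXT(''USERENV'', ''SESSION_USER'')';
END;
/

-- Attach the function to the table for SELECT statements
BEGIN
  DBMS_RLS.ADD_POLICY(
    object_schema   => 'SALES_APP',
    object_name     => 'SALES_DATA',
    policy_name     => 'SALES_REP_ROWS',
    function_schema => 'SALES_APP',
    policy_function => 'SALES_REP_PREDICATE',
    statement_types => 'SELECT'
  );
END;
/
```

With a policy like this in place, each user querying the table would see only the rows whose sales_rep value matches their own session user.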
14. Advanced Aggregation
Oracle provides extensions to the standard SQL GROUP BY clause, such as GROUPING SETS, CUBE, and ROLLUP, which allow you to generate multiple levels of aggregated results in a single query.
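As a small example, ROLLUP on the sales_data table from section 2 returns per-product totals and a grand total in one pass. The 'All products' label is an illustrative choice.

```sql
SELECT CASE GROUPING(product_name)
         WHEN 1 THEN 'All products'   -- the super-aggregate row added by ROLLUP
         ELSE product_name
       END AS product_name,
       SUM(sales_amount) AS total_sales
FROM sales_data
GROUP BY ROLLUP (product_name);
```

On the sample data this yields 265.00 for Product A, 275.00 for Product B, 375.00 for Product C, and 915.00 for the grand-total row. GROUPING() distinguishes the total row from a genuine NULL in product_name.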