Grouping Data in SQL

When working with large datasets, it is often more useful to analyze data in groups rather than viewing individual records. SQL provides the GROUP BY clause to group rows that share common values and apply aggregate functions such as COUNT, SUM, AVG, MIN, and MAX to each group. Grouping is a core concept in SQL reporting, analytics, and business intelligence.

GROUP BY Clause

The GROUP BY clause is used to group rows based on one or more columns. After grouping, aggregate functions operate on each group separately and return one summary row per group.

Example: Count Employees per Department

SELECT 
    department,
    COUNT(*) AS employee_count
FROM employees
GROUP BY department;

Explanation

The employees table is grouped by the department column.
All employees in the same department form one group.
COUNT(*) counts how many employees exist in each department.
The result shows one row per department, not individual employees.

Important GROUP BY Rule (Very Important for Interviews)

👉 Every column in the SELECT list must either:

Be included in the GROUP BY clause
OR
Be used inside an aggregate function

Incorrect example:

SELECT department, salary
FROM employees
GROUP BY department;  -- ❌ ERROR

Correct example:

SELECT department, AVG(salary)
FROM employees
GROUP BY department;

GROUP BY with Multiple Columns

GROUP BY can use more than one column. In such cases, SQL creates groups based on unique combinations of the specified columns.

Example: Average Salary by Department and Job Title

SELECT 
    department,
    job_title,
    AVG(salary) AS avg_salary
FROM employees
GROUP BY department, job_title;

Explanation

Employees are grouped first by department
Then further grouped by job title within each department
Each department–job title combination produces one row
This gives more detailed insights than single-column grouping

HAVING Clause

The HAVING clause is used to filter groups, not individual rows. It is always used after GROUP BY and works with aggregate functions.

Example: Departments with More Than 5 Employees

SELECT 
    department,
    COUNT(*) AS employee_count
FROM employees
GROUP BY department
HAVING COUNT(*) > 5;

Explanation

GROUP BY creates department-wise groups
COUNT(*) calculates employees in each group
HAVING removes groups with 5 or fewer employees
Final output shows only departments with more than 5 employees

WHERE vs HAVING (Critical Concept)

Although WHERE and HAVING both filter data, they operate at different stages of query execution.

Example: Combined WHERE and HAVING

SELECT 
    department,
    AVG(salary) AS avg_salary
FROM employees
WHERE hire_date > '2019-01-01'
GROUP BY department
HAVING AVG(salary) > 55000;

Explanation

WHERE filters rows first (only employees hired after 2019)
GROUP BY groups the remaining rows by department
HAVING filters grouped results based on average salary

Key Differences: WHERE vs HAVING

Feature	WHERE	HAVING
Filters	Rows	Groups
Used with	Individual columns	Aggregate functions
Execution time	Before GROUP BY	After GROUP BY
Can use aggregates	❌ No	✔ Yes

Logical Execution Order of SQL Query (Important)

SQL does not execute queries in the order written.

Actual execution order:

FROM
WHERE
GROUP BY
HAVING
SELECT
ORDER BY
LIMIT

Understanding this helps avoid logical mistakes.

GROUP BY with ORDER BY

You can sort grouped results using ORDER BY.

Example:

SELECT department, COUNT(*) AS emp_count
FROM employees
GROUP BY department
ORDER BY emp_count DESC;

GROUP BY with DISTINCT

DISTINCT is usually unnecessary with GROUP BY
GROUP BY already creates unique groups

Real-World Use Cases

Department-wise employee count
Average salary by role
Sales analysis by region and product
Monthly revenue reports
Job portal analytics
HR and payroll dashboards

Common Beginner Mistakes

Forgetting GROUP BY for non-aggregated columns
Using WHERE instead of HAVING for aggregates
Using column aliases in HAVING (DB-dependent)
Grouping too many columns unnecessarily
Expecting row-level output from grouped queries

Best Practices

Use clear and meaningful aliases
Filter rows early using WHERE
Use HAVING only for aggregated conditions
Avoid unnecessary columns in GROUP BY
Format queries for readability

Log In

Sign Up

Grouping Data in SQL

GROUP BY Clause

Example: Count Employees per Department

Explanation

Important GROUP BY Rule (Very Important for Interviews)

GROUP BY with Multiple Columns

Example: Average Salary by Department and Job Title

Explanation

HAVING Clause

Example: Departments with More Than 5 Employees

Explanation

WHERE vs HAVING (Critical Concept)

Example: Combined WHERE and HAVING

Explanation

Key Differences: WHERE vs HAVING

Logical Execution Order of SQL Query (Important)

GROUP BY with ORDER BY

GROUP BY with DISTINCT

Real-World Use Cases

Common Beginner Mistakes

Best Practices

Leave a Comment