... or "How to solve the same problem in 10 different ways".
One of the common problems to solve in SQL is "Get the row with the group-wise maximum". Getting just the maximum for each group is simple; getting the full row that belongs to that maximum is the interesting step.
SELECT MAX(population), continent
FROM Country
GROUP BY continent;
+-----------------+---------------+
| MAX(population) | continent     |
+-----------------+---------------+
|      1277558000 | Asia          |
|       146934000 | Europe        |
|       278357000 | North America |
|       111506000 | Africa        |
|        18886000 | Oceania       |
|               0 | Antarctica    |
|       170115000 | South America |
+-----------------+---------------+
We use the 'world' database from the MySQL manual for the examples.
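If the 'world' database is not at hand, a small stand-in table is enough to try the queries in this article. The sketch below is only an assumption about what such a stand-in could look like: the database name groupmax_demo, the codes, and the country names are made up, and only the column names (name, continent, population) and the large population values mirror what the article shows. Two rows share the maximum of 0 for Antarctica so that ties show up in the results.

```sql
-- Hypothetical scratch schema, not the real 'world' database.
CREATE DATABASE IF NOT EXISTS groupmax_demo;
USE groupmax_demo;

DROP TABLE IF EXISTS Country;
CREATE TABLE Country (
  code       CHAR(3)     NOT NULL PRIMARY KEY,
  name       VARCHAR(52) NOT NULL,
  continent  VARCHAR(20) NOT NULL,
  population INT         NOT NULL
);

-- Placeholder rows: the names and the small populations are invented.
INSERT INTO Country (code, name, continent, population) VALUES
  ('D01', 'Demo Asia Large',     'Asia',       1277558000),
  ('D02', 'Demo Asia Small',     'Asia',       1000),
  ('D03', 'Demo Europe Large',   'Europe',     146934000),
  ('D04', 'Demo Europe Small',   'Europe',     2000),
  ('D05', 'Demo Antarctica One', 'Antarctica', 0),
  ('D06', 'Demo Antarctica Two', 'Antarctica', 0);

-- Same query as at the top of the article, now against the stand-in table.
SELECT MAX(population), continent
FROM Country
GROUP BY continent;
```

Because the table is named Country, the other queries in this article can be pasted in unchanged while groupmax_demo is the current database.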
The next step is to find the countries that match both the population and the continent we just gathered.
SELECT continent, name, population
FROM Country
WHERE population = 1277558000
AND continent = 'Asia';
+-----------+-------+------------+
| continent | name  | population |
+-----------+-------+------------+
| Asia      | China | 1277558000 |
+-----------+-------+------------+
Instead of doing this row by row, we JOIN the two result sets by using a temporary table:
CREATE TEMPORARY TABLE co2
SELECT continent, MAX(population) AS maxpop
FROM Country
GROUP BY continent;
SELECT co1.continent, co1.name, co1.population
FROM Country AS co1, co2
WHERE co2.continent = co1.continent
AND co1.population = co2.maxpop;
+---------------+----------------------------------------------+------------+
| continent     | name                                         | population |
+---------------+----------------------------------------------+------------+
| Oceania       | Australia                                    |   18886000 |
| South America | Brazil                                       |  170115000 |
| Asia          | China                                        | 1277558000 |
| Africa        | Nigeria                                      |  111506000 |
| Europe        | Russian Federation                           |  146934000 |
| North America | United States                                |  278357000 |
| Antarctica    | Antarctica                                   |          0 |
| Antarctica    | Bouvet Island                                |          0 |
| Antarctica    | South Georgia and the South Sandwich Islands |          0 |
| Antarctica    | Heard Island and McDonald Islands            |          0 |
| Antarctica    | French Southern territories                  |          0 |
+---------------+----------------------------------------------+------------+
DROP TEMPORARY TABLE co2;
Instead of creating the temporary table by hand as an intermediate step, we can write the same thing as a simple sub-query in the FROM clause, which creates a temporary table internally.
SELECT co1.continent, co1.name, co1.population
FROM Country AS co1,
(SELECT continent, MAX(population) AS maxpop
FROM Country
GROUP BY continent) AS co2
WHERE co2.continent = co1.continent
AND co1.population = co2.maxpop;
+---------------+----------------------------------------------+------------+
| continent     | name                                         | population |
+---------------+----------------------------------------------+------------+
| Oceania       | Australia                                    |   18886000 |
| South America | Brazil                                       |  170115000 |
| Asia          | China                                        | 1277558000 |
| Africa        | Nigeria                                      |  111506000 |
| Europe        | Russian Federation                           |  146934000 |
| North America | United States                                |  278357000 |
| Antarctica    | Antarctica                                   |          0 |
| Antarctica    | Bouvet Island                                |          0 |
| Antarctica    | South Georgia and the South Sandwich Islands |          0 |
| Antarctica    | Heard Island and McDonald Islands            |          0 |
| Antarctica    | French Southern territories                  |          0 |
+---------------+----------------------------------------------+------------+
The sub-query is executed in the exact same way as the temporary table we created by hand. Instead of JOINing against the temporary table we JOIN against the result of the sub-query.
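One way to check how the server handles the sub-query, assuming a MySQL server with the Country table available, is to prefix the statement with EXPLAIN. For a grouped sub-query in the FROM clause the select_type column typically reports DERIVED, which indicates that the result is materialized into an internal temporary table. The exact plan depends on the server version and optimizer settings, so treat this as a sketch rather than a guaranteed result.

```sql
-- Ask the optimizer for its plan instead of running the query.
-- Look at the select_type column of the output for the sub-query's row.
EXPLAIN
SELECT co1.continent, co1.name, co1.population
FROM Country AS co1,
     (SELECT continent, MAX(population) AS maxpop
      FROM Country
      GROUP BY continent) AS co2
WHERE co2.continent = co1.continent
  AND co1.population = co2.maxpop;
```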
Hmm, was this too simple? Let's take a look at the alternatives:
SELECT co1.continent, co1.name, co1.population
FROM Country AS co1
WHERE co1.population =
(SELECT MAX(population) AS maxpop
FROM Country AS co2
WHERE co2.continent = co1.continent);
To be read as: 'Get the countries whose population equals the maximum population of the current country's continent'. Using such a sub-query results in more readable queries. BUT ... they are 'DEPENDENT', as the inner query refers to a field of the outer query. This means that the inner query is executed for each row of the outer query.
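The dependency can be made visible with EXPLAIN, again assuming a MySQL server with the Country table available. For a correlated sub-query like this one, the select_type column usually reports DEPENDENT SUBQUERY for the inner SELECT. Newer servers may rewrite such queries depending on optimizer settings, so the plan you see may differ.

```sql
-- The inner SELECT references co1.continent from the outer query,
-- which is what makes it a correlated (dependent) sub-query.
EXPLAIN
SELECT co1.continent, co1.name, co1.population
FROM Country AS co1
WHERE co1.population =
      (SELECT MAX(co2.population)
       FROM Country AS co2
       WHERE co2.continent = co1.continent);
```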
The same query can be written in two other ways:
SELECT continent, name, population
FROM Country
WHERE ROW(population, continent) IN (
SELECT MAX(population), continent
FROM Country
GROUP BY continent);
SELECT co1.continent, co1.name, co1.population
FROM Country AS co1
WHERE co1.population >= ALL
(SELECT co2.population
FROM Country AS co2
WHERE co2.continent = co1.continent);
If you don't want to use sub-queries and prefer pure JOINs, perhaps these are for you:
SELECT co1.continent, co1.name, co1.population
FROM Country AS co1 LEFT JOIN Country AS co2
ON co1.population < co2.population AND
co1.continent = co2.continent
WHERE co2.population IS NULL;
SELECT co1.Continent, co1.Name
FROM Country AS co1 JOIN Country AS co2
ON co2.Continent = co1.Continent AND
co1.Population <= co2.Population
GROUP BY co1.Continent, co1.Name
HAVING COUNT(*) = 1;
## added 2005-05-28 as no. 11, sent in by rudy@r937.com
SELECT co1.continent, co1.name
FROM Country AS co1 JOIN Country AS co2
ON co1.continent = co2.continent
GROUP BY co1.continent, co1.name, co1.population
HAVING co1.population = MAX(co2.population);
Now you already know 8 ways. The last two are only meant to give you some more ideas. First of all, a way that doesn't work (yet):
SELECT co2.continent, MAX(co2.population) AS maxpop,
(SELECT name
FROM Country
WHERE population = maxpop AND
continent = co2.continent)
FROM Country AS co2
GROUP BY co2.continent;
ERROR 1247 (42S22): Reference 'maxpop' not supported (reference to group function)
And as the last one, the modified max-concat example from the manual:
SELECT continent,
SUBSTRING( MAX( CONCAT(LPAD(population,10,'0'),name) ), 10+1) AS name,
MAX( population ) AS population
FROM Country
GROUP BY continent;
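The result is slightly different: it returns exactly one row per continent, so tied rows such as the five Antarctica entries collapse into one. It is more about the idea.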
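To see why the padding matters, the packing and unpacking can be tried on literal values without any table. This is only an illustrative sketch; the value 18886000 and the name 'Australia' are taken from the results earlier in the article, and the column aliases are made up.

```sql
SELECT
  -- Strings compare character by character, so '9' sorts after '10'.
  '9' > '10'                           AS unpadded_compare,  -- 1
  -- Zero-padding to a fixed width makes string order match numeric order.
  LPAD(9, 10, '0') > LPAD(10, 10, '0') AS padded_compare,    -- 0
  -- Pack: a fixed 10-character population prefix followed by the name.
  CONCAT(LPAD(18886000, 10, '0'), 'Australia') AS packed,    -- '0018886000Australia'
  -- Unpack: skip the 10-character prefix, keep the rest.
  SUBSTRING(CONCAT(LPAD(18886000, 10, '0'), 'Australia'), 10 + 1) AS unpacked;  -- 'Australia'
```

MAX() over the packed strings therefore picks the row with the largest population, and SUBSTRING() recovers the name that travelled along with it. The width of 10 works here because the largest population in the table has 10 digits.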