Thursday, May 27, 2010

Going Meta

Lately I've been using some of the same tricks over and over again in different contexts. Essentially I've just been doing a lot of complex queries, where I select a set of records first, and then select other data out of the first set:

select my_set.X, my_set.Y, my_set.Z
from (
select * from whatever_it_is
where my_item = 'something'
) my_set


Why would I need to do this? Well, for one thing it breaks a problem down into smaller bits, so it is easier to work on. I can write and validate a query that gets all the records I might want to work with, and then write two or three queries that use it to produce friendlier output. I can also use it to do things like counts:

select my_set.X Item,
count(case when my_set.Y = 'Chicago' then 1 else null end) Num_Chicago,
count(case when my_set.Y = 'Phoenix' then 1 else null end) Num_Phoenix,
count(my_set.Y) Total
from (
select * from whatever_it_is
where my_item = 'something'
) my_set
group by my_set.X

This will produce something that looks like this:



ItemNum_ChicagoNum_PhoenixTotal
Item 1 4 1 5
Item 2 2 2 4

You can do grand totals using a UNION ALL with another query:

select my_set.X Item,
count(case when my_set.Y = 'Chicago' then 1 else null end) Num_Chicago,
count(case when my_set.Y = 'Phoenix' then 1 else null end) Num_Phoenix
from (
select * from whatever_it_is
where my_item = 'something'
) my_set
group by my_set.X
UNION ALL
select 'Totals',
count(case when my_set.Y = 'Chicago' then 1 else null end) Num_Chicago,
count(case when my_set.Y = 'Phoenix' then 1 else null end) Num_Phoenix,
count(my_set.Y) Total
from (
select * from whatever_it_is
where my_item = 'something'
) my_set

The key to doing a UNION is you need your columns to match up. So note that I have the same number of columns (3) and I used the same names (Item, Num_Chicago, Num_Phoenix, Total) to describe them. By not including my_set.X in the second query I end up with total counts of all the rows, not simply of one item. The output from this should look like:




ItemNum_ChicagoNum_PhoenixTotal
Item 1 4 1 5
Item 2 2 2 4
Totals 6 3 9

I've used the same sort of complex queries to pull the earliest or latest item by category from a set of records; it is a fairly simple concept but can be very powerful.

No comments:

Post a Comment