|
We've all had to do it. When I've had to do it, it's usually been a database that I've created.
In one of my early attempts at database design, I made a table called
ListOfLists:
| ListID |
List |
Day |
| 1 |
sausages,
carrots, lettuce |
Monday |
| 2 |
pineapple, lemon,
apple |
Tuesday |
| 3 |
apple, cinnamon |
Wednesday |
Keeping several items within a single field seemed like a good idea at the time, but it is
an example of really bad design.
For starters, what if I want to have a list of 20 or 30 items. My "List" field would need to
be large enough to accommodate the longest list.
Another common mistake is to have one field for each item:
| ListID |
Item1 |
Item2 |
Item3 |
Day |
| 1 |
sausages |
carrots |
lettuce |
Monday |
| 2 |
pineapple |
lemon |
apple |
Tuesday |
| 3 |
apple |
cinnamon |
|
Wednesday |
This is really no improvement on my original design, because I will run into the same problem
with lists that contain more than 3 items.
Also, try writing a query to find all the lists which include "apple".
OK, so how best to fix this? I have thousands of records that need to be fixed!
What I need to do, is put my item data into a separate table:
| ItemID |
ListID |
Item |
| 1 |
1 |
lettuce |
| 2 |
1 |
carrots |
| 3 |
1 |
sausages |
| 4 |
2 |
apple |
| 5 |
2 |
lemon |
| 6 |
2 |
pineapple |
| 7 |
3 |
cinnamon |
| 8 |
3 |
apple |
My "ListOfLists" table will contain the same data as before, but without the "List" field:
| ListID |
Day |
| 1 |
Monday |
| 2 |
Tuesday |
| 3 |
Wednesday |
The SQL to generate my Item table, is as follows:
SELECT ListOfLists.ListID,
SUBSTRING(List,t1.Id,t2.Id) AS Item
INTO Item
FROM ListOfLists, Numbers AS t1, Numbers AS t2
WHERE (t1.Id+t2.Id=LEN(List)+2 OR
SUBSTRING(List,t1.Id+t2.Id,1)=",")
AND (t1.Id=1 OR SUBSTRING(List,t1.Id-1,1)=",") AND CHARINDEX(",",SUBSTRING(List,t1.Id+1,t2.Id-2))=0 AND
t2.Id>=2
This query relies on the following table, called "Numbers":
| Id |
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
| ... |
"Numbers" must contain at least as many records as there are characters in the longest list.
To understand how this query works, consider the following variation, in which two extra fields
are selected, and the WHERE clause is dropped:
SELECT t1.Id, t2.Id, ListOfLists.ListID,
SUBSTRING(List,t1.Id,t2.Id) AS Item
FROM ListOfLists, Numbers AS t1, Numbers AS t2
Using the sample data I have presented, this query will output all of the possible substrings
of all the lists.
A sample follows:
| t1.Id |
t2.Id |
ListID |
Expr1 |
| 1 |
1 |
1 |
s |
| 1 |
2 |
1 |
sa |
| 1 |
3 |
1 |
sau |
| 1 |
4 |
1 |
saus |
| 1 |
5 |
1 |
sausa |
| 1 |
6 |
1 |
sausag |
| 1 |
7 |
1 |
sausage |
| 1 |
8 |
1 |
sausages |
| 1 |
9 |
1 |
sausages, |
| 1 |
10 |
1 |
sausages,c |
| 2 |
1 |
1 |
a |
| 2 |
2 |
1 |
au |
| 2 |
3 |
1 |
aus |
| 2 |
4 |
1 |
ausa |
| 2 |
5 |
1 |
ausag |
| 2 |
6 |
1 |
ausage |
| 2 |
7 |
1 |
ausages |
| 2 |
8 |
1 |
ausages, |
| 2 |
9 |
1 |
ausages,c |
| 2 |
10 |
1 |
ausages,ca |
The WHERE clause in my original query picks out those substrings which
(1) Start at the beginning of a list, or start 1 character to the right of a comma.
(2) Finish at the end of a list, or finish 1 character to the left of a comma.
Both queries make use of an implicit CROSS JOIN between "Numbers" and itself. CROSS JOINS are
discussed in more detail in a previous tip
Adelle.
|