|
Hierarchies Part 3 |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
This article is a follow on from Hierarchies Part 2
In my previous tip, I listed the hypothetical employees of my company and their positions: Adelle : Manager Since last week, three new positions have opened up: 2 Account Managers in Australia. Each of these positions will report to an Area Manager. As far as my database is concerned, it doesn't matter that these positions have not yet been filled, because the positions are stored in a separate table from the employees. Previously, I hinted that there might be a better way to represent the hierarchy of my organization. "Better" is a relative term, as we shall see. But there is certainly another design pattern which can be applied to represent tree structures, which has many advantages over the design pattern I presented in "Self Joins Part 1". In most of my examples to date, I have not used my usual field naming convention because this is an area In the following example, I have started the field name of each numeric field with the letter "l", and each text field with an s. (l = Longint, s = string). As well as being good practice generally, it solves a problem which would have otherwise arisen in this example: "Left" and "Right" are reserved words in SQL. I have modified the structure of my "Position" table, so that each node has an "lLeft" column and an "lRight" column:
Ignore the lPositionID column for the moment: it is not related to lLeft or lRight. Each node is counted twice. Once on its left side, and once on its right. The numeric order for each node-side is as follows:
A useful property this system, is that given a particular starting node, the numeric value of the left and right sides of each node below the starting node, are between the left and right sides of the starting node.
So, to find all the subordinates of the Area Manager Query 1
SELECT t1.* During execution of this query, t2 will pick up just one record: the Area Manager, with lPositionID=3. t1 will pick up all the records with lLeft > 3 and lLeft < 8. The query could also be written as: Query 2
SELECT t1.*
because it doesn't matter which side of the subordinate nodes we look at - they will both fall within the range
given by the left and right sides of the starting node. Query 3
SELECT t1.* OR, using BETWEEN: Query 4
SELECT t1.* The output of Queries 3 and Query 4 is shown below:
If the left and right sides of each node below the starting node, are between the left and right sides of the starting node, it follows that the left and right sides of the starting node are between the left and right sides of the nodes above it. Query 5
SELECT t2.* Compare Query 5 with Query 4 (hint: only the first and last lines differ). The output of Query 5 follows:
Now that we can identify the nodes above a given starting node, we can count them to determine the depth of our starting node: Query 6
SELECT COUNT(t2.lPositionID) Query 6 will output the number "3". That is, the Area Manager (Australia) [lPositionID=3] is 3 levels deep. It is also possible to output the depths of every node in the tree: Query 7
SELECT COUNT(t2.lPositionID) AS lDepth, t1.lLeft, t1.lPositionID, The output of Query 7 follows:
When I first heard about this technique for representing hierarchical tree structures, I thought it was pretty cool. But there is a catch. To insert a new node into the tree (or to delete a node) potentially requires updating every record in the table. This is because the numbering system MUST be maintained in order for this technique to work.
Details for how to do this can be found in the following article by Joe Celko:
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||