Buddies

I have the following table:

Table:  Employee
Field # Name Type
Field1: EmployeeID autonumber, primary key
Field2: Name Text

As part of our new fire safety plan, each employee has a "Buddy". In the event of a fire, everyone is supposed to check that their buddy has escaped the building.

I can think of a couple of ways of recording this information in my database:

(i) Add the following field to the Employee table:

Field4: BuddyID number (EmployeeID of another record in the same table).

If I do this, then I need to maintain the buddy relationship by ensuring that when each employee is assigned a Buddy, that employee is their own buddy's buddy.

That is, if John is Mark's buddy, then Mark must also be John's buddy.

(ii) Create a new table called Buddy with two fields:

Field1: Buddy1 int
Field2: Buddy2 int

(The two fields combine to make a composite primary key).

If I use this method, then I need to ensure that no employee is listed in the table twice.

Whichever method I use, I need to be able to display a list of Employees and their buddies.

Starting with method (i)...

The following SQL is incorrect:

SELECT t1.Name, t2.Name
FROM Employee AS t1 INNER JOIN Employee AS t2
ON t1.BuddyID=t2.EmployeeID

Syntactically, this query is OK, but it will display each employee twice:

t1.Name t2.Name
Jane Amos
Amos Jane
Wilma Mark
Mark Wilma

Let's run that query again, but this time select the EmployeeID's as well as the names:

SELECT t1.EmployeeID, t1.Name, t2.EmployeeID, t2.Name
FROM Employee AS t1 INNER JOIN Employee AS t2
ON t1.BuddyID=t2.EmployeeID

The results of this query follow.

t1.EmployeeID t1.Name t2.EmployeeID t2.Name
2 Jane 1 Amos
1 Amos 2 Jane
4 Wilma 3 Mark
3 Mark 4 Wilma

Compare the first and second rows. Jane (ID=2) appears on the left side in the first row, and on
the right side in the second row.

What if we were to only select rows where the employee with the greater EmployeeID appears on the left side:

SELECT t1.Name, t2.Name
FROM Employee AS t1 INNER JOIN Employee AS t2
ON t1.BuddyID=t2.EmployeeID
WHERE t1.EmployeeID>t2.EmployeeID

It turns out that this will do what I want, and each pair of buddies will be listed only once.

OK, so what about method (ii)...

Create a new table called Buddy with two fields:

Field1: Buddy1 int
Field2: Buddy2 int

I can set up the same "buddy" relationships by entering the following rows into the Buddy table:

Buddy1 Buddy2
2 1
4 3

The query to return a list of all the employees and their buddies is

SELECT t1.Name, t2.Name
FROM (Buddy LEFT JOIN Employee AS t1 ON t1.EmployeeID=Buddy.Buddy1)
LEFT JOIN Employee AS t2 ON t2.EmployeeID=Buddy.Buddy2

In some RDBMS's, it is possible to prevent multiple instances of the same pairing from being entered into the Buddy table, by following these steps:

1. Make Buddy1 and Buddy2 a composite primary key.
2. Add a constraint to the table, so that Buddy1 > Buddy2
3. Add UNIQUE constraints to both columns.

However, these steps are not sufficient to prevent the following scenario:

Buddy1 Buddy2
4 3
3 2
2 1

EmployeeID's 3 and 2 have been listed twice each.

One way to check for this scenario is to count the number of employees, and then compare that with the number of records in the Buddy table.

If the number of employees is even, then the number of records in the Buddy table should be exactly half that number.

If the number of employees is odd, then there will be one employee who does not have a buddy, and the number of records in the Buddy table should be half the remaining number of employees.

"So which is preferable. Method (i) or (ii)?"

My preference is for method (ii). It is easier to keep the data consistent using this method and by following the steps outlined above.

Also, having a table to represent the Buddy relationship means that I have somewhere to store additional information about that relationship.

For example, I could modify the structure of the Buddy table to include a location where the buddies should meet after a fire:

Table:  Buddy
Field # Name Type
Field1 Buddy1 int
Field2 Buddy2 int
Field3 MeetingLocation text

MeetingLocation is a property of the relationship, and not of the employees themselves.

It should be noted, that this type of structure is a special case and it is rarely appropriate. Only use it if the number of items in the relationship will always be exactly two.

This would not be a good structure to store a list of synonyms for a thesaurus.

It seems obvious that you could have a table called "Synonym" with fields "Word1" and "Word2" to represent the relationship between the words "chair" and "seat" or between "big" and "large".

However, this theory breaks down when you add the word "huge".

A better model in this instance is to have a table for each word group, and another table to represent
the membership of each group:

Table:  Word
Field # Name Type
Field1 WordID autonumber, primary key
Field2 Wrd Text
Field3 Meaning Text

Table:  WordGroup
Field # Name Type
Field1 GroupID autonumber, primary key
Field2 GroupHeading Text

 

Table:  GroupMembership
Field # Name Type
Field1 GroupID number
Field2 WordID number

It is not absolutely necessary to add an autonumber field to the GroupMembership table, as the GroupID and WordID can be combined to make a composite primary key.

My rule of thumb, is that the GroupMembership table should have an autonumber primary key, if it is
going to store any additional information about the group membership.

For example, the GroupMembership table could be extended to include antonyms as well as synonyms:

Table:  GroupMembership
Field # Name Type
Field1 GroupID number
Field2 WordID number
Field3 Type number (1=synonym, 2=antonym)
Field4 GroupMembershipID autonumber, primary key


Adelle.
 

Home About the Author Copyright © 2001 Adelle Hartley