First, sorry: I haven't looked at data base books for years, and have no idea what's current.
Here's my take on indexing:
Basically, most data base systems permit you to define multiple indexes on a table, and, again, most of them will return a value from the
index if the query can be satisfied using data in the index, without looking at the actual table.
For example, in the query I suggested in answer to your question:
Code:
SELECT * FROM File
WHERE commiterID IN
(SELECT commiterID FROM Commiter WHERE commiterName="NEO");
you are first looking for the "commiterID" for a specific "commiterName" and then looking for all the information stored in "File" for that "commiterID". Thus the first table ("Commiter" should be indexed jointly on
both "commiterName" and "commiterID"
in that order so a hit on "commiterName" in the table will return "commiterID" at the same time. (Note that, if these are all the fields in the table, the whole table will be physically stored in the index by most data base systems.)
Obviously that second table ("File") needs to be indexed on "commiterID." Depending on the data in that table, that should probably be the primary key on the table.
So the bottom line on indexing is just use common sense, and consider carefully the use to which the data will be put.
[aside]
You should note that the query as I wrote it will return all the data for all entries where the "commiterName" in the "Commiter" table is matched. Thus, unless the "commiterName" is restricted to be unique in your data base design, you may get multiple records returned. If "commiterName"
is unique, then "commiterID" is
redundant, and you would be better off just using "commiterName" as your primary key on the "file" table.
[/asside]