Beginning SQL
Beginning SQL Beginning SQL
Date of Date Meeting Did Member Name Birth Address Email Joined Date Location Attend? Martin Feb 27, 1 The Avenue, martin@ Jan 10, Mar 30, Lower West Side, Y 1972 NY some.com 2005 2005 NY Martin Feb 27, 1 The Avenue, martin@ Jan 10, April 28, Lower North Side, Y 1972 NY some.com 2005 2005 NY Jane Dec 12, 33 Some Road, Jane@ Jan 12, Mar 30, Lower West Side, N 1967 Washington server.net 2005 2005 NY Jane Dec 12, 33 Some Road, Jane@ Jan 12, April 28, Upper North Side, Y 1967 Washington server.net 2005 2005 NY Kim May 22, 19 The Road, kim@mail. Jan 23, Mar 30, Lower West Side, Y 1980 New Townsville com 2005 2005 NY Kim May 22, 19 The Road, kim@mail. Jan 23, April 28, Upper North Side, Y 1980 New Townsville com 2005 2005 NY There are a number of problems with this approach. First, it wastes a lot of storage space. You only need to store the members’ details once, so repeating the data is wasteful. What you have is repeating groups, something that the second rule of first normal form tells you to remove. You can remove duplicated data by splitting the data into tables: one for member details, another to hold location details, and a final table detailing meetings that took place. Another advantage of splitting data into tables is that it avoids something called the deletion anomaly, where deleting a record results in also deleting the data you want to keep. For example, say you want to delete details of meetings held more than a year ago. If your data isn’t split into tables and you delete records of meetings, then you also delete members’ details, which you want to keep. By separating member details and meeting data, you can delete one and not both. After you split data into tables, you need to link the tables by some unique value. The final rule of the first normal form — create a primary key for each table — comes into play. For example, you added a new column, MemberId, to the MemberDetails table and made it the primary key. You don’t have to create a new column, however; you could just use one or more of the existing columns, as long as, when taken together, the data from each column you combine makes for a unique key. However, having an ID column is more efficient in terms of data retrieval. Second Normal Form First normal form requires every table to have a primary key. This primary key can consist of one or more columns. The primary key is the unique identifier for that record, and second normal form states that there must be no partial dependences of any of the columns on the primary key. For example, imagine that you decide to store a list of films and people who starred in them. The data being stored is film ID, film name, actor ID, actor name, and date of birth. You could create a table called ActorFilms using those columns: Advanced Database Design 119
- Page 228: Chapter 3 94 ❑ Category ❑ FavCa
- Page 232: Chapter 3 96 You’re one step furt
- Page 236: Chapter 3 98 FirstName LastName Cat
- Page 240: Chapter 3 This time, you achieve mo
- Page 244: Chapter 3 SQL Is Set-Based 102 You
- Page 248: Chapter 3 104 Category MemberId Thr
- Page 252: Chapter 3 106 You should see the fo
- Page 256: Chapter 3 The overlapping portions
- Page 260: Chapter 3 110 More than one conditi
- Page 264: Chapter 3 112 Next, link the Member
- Page 268: Chapter 3 114 FirstName LastName Da
- Page 272: Chapter 3 ❑ The slightly tricky t
- Page 276: Chapter 4 normalization can make da
- Page 282: Now the tables are in second normal
- Page 286: After changing the database so that
- Page 290: IBM’s DB2 doesn’t allow you to
- Page 294: The following code creates a new ta
- Page 298: If you’re using DB2, the code is
- Page 302: The NOT NULL constraint prevents th
- Page 306: In the preceding code, CustomerId i
- Page 310: Foreign Key Foreign keys are column
- Page 314: MemberId MeetingDate Street City St
- Page 318: The third step had you create the p
- Page 322: The results are ordered because, by
- Page 326: The index is no longer needed, so d
Date of Date Meeting Did Member<br />
Name Birth Address Email Joined Date Location Attend?<br />
Martin Feb 27, 1 The Avenue, martin@ Jan 10, Mar 30, Lower West Side, Y<br />
1972 NY some.com 2005 2005 NY<br />
Martin Feb 27, 1 The Avenue, martin@ Jan 10, April 28, Lower North Side, Y<br />
1972 NY some.com 2005 2005 NY<br />
Jane Dec 12, 33 Some Road, Jane@ Jan 12, Mar 30, Lower West Side, N<br />
1967 Washington server.net 2005 2005 NY<br />
Jane Dec 12, 33 Some Road, Jane@ Jan 12, April 28, Upper North Side, Y<br />
1967 Washington server.net 2005 2005 NY<br />
Kim May 22, 19 The Road, kim@mail. Jan 23, Mar 30, Lower West Side, Y<br />
1980 New Townsville com 2005 2005 NY<br />
Kim May 22, 19 The Road, kim@mail. Jan 23, April 28, Upper North Side, Y<br />
1980 New Townsville com 2005 2005 NY<br />
There are a number of problems with this approach. First, it wastes a lot of storage space. You only need<br />
to store the members’ details once, so repeating the data is wasteful. What you have is repeating groups,<br />
something that the second rule of first normal form tells you to remove. You can remove duplicated data<br />
by splitting the data into tables: one for member details, another to hold location details, and a final table<br />
detailing meetings that took place. Another advantage of splitting data into tables is that it avoids something<br />
called the deletion anomaly, where deleting a record results in also deleting the data you want to<br />
keep. For example, say you want to delete details of meetings held more than a year ago. If your data<br />
isn’t split into tables and you delete records of meetings, then you also delete members’ details, which<br />
you want to keep. By separating member details and meeting data, you can delete one and not both.<br />
After you split data into tables, you need to link the tables by some unique value. The final rule of the<br />
first normal form — create a primary key for each table — comes into play. For example, you added a<br />
new column, MemberId, to the MemberDetails table and made it the primary key. You don’t have to<br />
create a new column, however; you could just use one or more of the existing columns, as long as, when<br />
taken together, the data from each column you combine makes for a unique key. However, having an ID<br />
column is more efficient in terms of data retrieval.<br />
Second Normal Form<br />
First normal form requires every table to have a primary key. This primary key can consist of one or<br />
more columns. The primary key is the unique identifier for that record, and second normal form states<br />
that there must be no partial dependences of any of the columns on the primary key. For example, imagine<br />
that you decide to store a list of films and people who starred in them. The data being stored is film<br />
ID, film name, actor ID, actor name, and date of birth.<br />
You could create a table called ActorFilms using those columns:<br />
Advanced Database Design<br />
119