Database Design

Designing your database properly is critical to your application performing well. Just like putting the printer all the way across your office, placing data in poor relationships makes work less efficient in that it can cause your database server to waste time looking for data.

When thinking about you database, think about what kinds of questions will be asked when your database is used. For example, is this a valid username and password? Or, what are the details about a product for sale?

MySQL is a relational database. An important feature of relational systems is that a single database can be spread across several tables as opposed to our flat-file phone book example. Related data is stored in separate tables and allows you to put them together by using a key common to both tables.

The key is the relation between the tables. The selection of a primary key is one of the most critical decisions you'll make in designing a new database. The most important concept that you need to understand is that you must ensure the selected key is unique.

If it's possible that two records (past, present, or future) share the same value for an attribute, don't use them as a primary key. Including key fields from another table to form a link between tables is called a foreign key relationship, like a boss to employees or a user to a purchase. The relational model is very useful because data is retrieved easier and faster.

Now that you have separate tables that store related data, you need to think about the number of items in each table that relate to the number of items in another table. This is all about relationships and the type of relationships data falls into. Think of the relationship as a repository or bucket, and each bucket of data has a specific relationship.

Databases relationships are quantified with the following categories:

  • One-to-one relationships. In a one-to-one relationship, each item is related to one and only one other item. Within the example of a bookstore. A one-to-one relationship exists between users and their shipping addresses. Each user must have exactly one shipping address.
  • One-to-many relationships. A one-to-many relationship, has keys from one table that appear multiple times in another table. This is the most common type of relationship. For example, the categories for books such as hardcover, soft cover, and audio books. Each book is in one of those three categories. However, they're never in more than one category.
  • Many-to-many relationships. A many-to-many relationship means that two tables can each have multiple keys from the other table in them. For example, shoppers that use an online bookstore can purchase multiple books. Likewise, multiple users can purchase the same book title. The many-to-many relationship is converted to a mapping table with two one-to-many relationships in order for the database to represent the data.

Thinking about how your data is related and the most efficient way to organize it is called normalization. Normalization of data is breaking it apart based on the logical relationships to minimize the duplication of data. Generally, duplicated data wastes space and makes maintenance a problem.

Should you change information that is duplicated, there's the risk that you miss a portion and you risk inconsistencies in you database. It's possible to have too much of a good thing though: databases placing each piece of data in their own tables would take too much processing time and queries would be convoluted.

Finding a balance in between is the goal. To normalize a database, start with the most basic rules of normalization and move forward step by step. The steps of normalization are in three stages, called forms.

The first step, called First Normal Form (FNF), must be done before the second normal form. Likewise, the third normal form cannot be completed before the second. The normalization process involves getting your data into conformity with the three progressive normal forms.

The First Normal Form involves removal of redundant data from horizontal rows. You want to ensure that there is no duplication of data in a given row, and that every column stores the least amount of information possible.

Put simply, in order for your database to be in first normal form, it must satisfy two requirements. Every table must not have repeating columns that contain the same kind of data, and all columns must contain only one value.

As we stated above, the First Normal Form deals with redundancy of data across a horizontal row. The Second Normal Form (2NF) deals with redundancy of data in vertical columns. Normal forms are progressive. To achieve Second Normal Form, your tables must already be in First Normal Form.

For a database table to be in Second Normal Form, you must identify any columns that repeat their values across multiple rows. Those columns need to be placed in their own table and referenced by a key value in the original table.

If you've followed the First and Second Normal Form process, you may not need to do anything with your database to satisfy the Third Normal Form (3NF) rules. In Third Normal Form, you're looking for data in your tables that's not fully dependent on the primary key, but dependent on another value in the table. Where this applies to your tables isn't immediately clear.

Although databases store the same information that you collect and process in PHP, databases require fields to be set to specific types of data when they're created. A data type is classification of a particular type of information. When you read, you're used to conventions such as symbols, letters, and numbers.

Therefore, it's easy to distinguish between different types of data because you use symbols along with numbers and letters. You can tell at a glance whether a number is a percentage, a time, or an amount of money. The symbols help you understand a percentage, time, or amount of money are that data's type.

A database uses internal codes to keep track of the different types of data it processes. Most programming languages require the programmer to declare the data type of every data object, and most database systems require the user to specify the type of each data field.

The available data types vary from one programming language to another, and from one database application to another. But, the three main types of datanumbers, dates/times, and stringsexist in one form or another.