Identifying entities in ER modeling

Having identified the general requirements of the system, the first phase in conceptual modeling and creating an ER model is to identify the entities in the system.

Entities are objects or things that can be described by their characteristics. As we identify entities, we list the attributes that describe the entity. For example, a customer is an entity that has a name, an address, a phone, and other details.

Be careful when choosing entities. A customer or a wine is an entity. Reducing the stock in the inventory and adding it to a shopping cart is a function or process, not an entity. The basic rule is that an entity is an object or thing.

Five entities and their attributes have already been identified earlier in this appendix. Four are easy to determine from our requirements:

  • The wine entity has the attributes type, name, year, and description.

  • The customer entity has the attributes surname, firstname, initial, title, addressline1, addressline2, addressline3, city, state, zipcode, country, phone, fax, salary, birthdate, email address, and discount.

  • The winery entity has the attributes name, description, phone, and fax.

  • The region entity has the attributes name, description, and map.

We add a users entity to this list in order to maintain user account details at the winestore:

  • The users entity has the attributes user_name and password. The user_name is the same as the customer email address.

The remaining entities-and, in two cases, the distinction between the entities-are harder to identify.

We have earlier identified the order entity in our introduction to ER modeling, but an order is hard to precisely define. One description might be:

An order is an object created by a customer when they agree to purchase one or more (possibly different) bottles of wine.

We can then say that an order is created on a date, and the system requirements in Chapter 1 identify that an order has a discount, a delivery cost, and a delivery note.

We can also say that this model of an order consists of one or more different wines and, for each different wine, a quantity of that wine is purchased. The subparts in each order-the different kinds of wines-are the items that make up the order. But is the wine itself part of an item? The distinction is hard, but the correct answer is probably no: this is a relationship, the items that make up an order are related to wines.

There are now two more entities-orders and items-and two relationships, which illustrates how difficult it is to reason about entities without considering how they are related. Determining entities isn't always easy, and many different drafts of an ER model are often required before a final, correct model is achieved. The ER model for the winestore took several attempts to get right.

Here are the item and order entities:

  • The item entity-which is related to an order-has the attributes quantity and price.

  • The order entity has attributes date, discount percentage, delivery cost, and delivery note.

The system requirements in Chapter 1 showed that wines are delivered in shipments. Each shipment is on a date and consists of a number of bottles, at a per-bottle and per-case price. How is this incorporated into the model? Perhaps the most obvious solution is to add quantity and price attributes to the wine entity. This doesn't work well: it is difficult to maintain the possibly different prices for different shipments and to maintain the correct shipment dates.

A good solution to the inventory problem is an inventory entity. This entity is related to the wine, and maintains different sets of data for each shipment of each wine:

  • The inventory entity has an on-hand quantity, an item cost, a dateadded, and a case cost (for a dozen bottles).

The final entity is somewhat of an oddity. If a wine is a Cabernet Merlot, you can simply store the string Cabernet Merlot in an attribute in the wine entity. Another approach is to have a grape_variety entity, where each different grape variety is described individually. So, Cabernet is one instance of a grape_variety entity, and Merlot is another. The grape_variety entity is then related to the wine entity. This approach does seem overly complicated, but let's opt for it anyway because it introduces an instructive twist to our modeling, a many-to-many relationship discussed in the next section.

Let's add two attributes to the grape_variety entity, variety (the description) and ID (a counter used to, for example, record that Cabernet is the first word in Cabernet Merlot, and Merlot is the second word.

  • The grape_variety entity has two attributes, ID and variety.

There are other possible entities. For example, the shopping basket could be an entity: the shopping cart is an object that contains items that will be ordered. However, a shopping cart is an incomplete order and, hence, it's omitted from the entity list. Including it is perhaps valid, and depends on how the entities are interpreted from the requirements.

There are also other entities that are outside the scope of our requirements. For example, a country might contain many regions, but there is no requirement for countries to be modeled in our system. Also, the winestore itself is an entity, but we are actually interested in the entities that make up the winestore, not really the whole concept itself. Selecting entities is all about getting the granularity and scope of choice right.

We have hinted at but not explicitly identified the relationships between the entities. For example, a winery is part of a region, a wine is made by a winery, and an item is related to a wine. The first step is to identify the entities and their attributes; the second step is to identify how the entities are related.