Data Modeling & ERDs

Resolving Many-to-Many

18 min Lesson 6 of 10

Resolving Many-to-Many

One of the most common mistakes junior analysts make is leaving a many-to-many relationship "as is" in a physical data model. Databases cannot directly implement a many-to-many line — they need a concrete table to sit between the two entities. The technique for doing this is called resolving the relationship, and the table you introduce is called a junction entity (also known as an associative entity, bridge entity, or linking table).

Why Many-to-Many Cannot Stay Unresolved

Consider an online bookstore. A single Customer can place many Orders, and an Order can contain many Books. If you draw a direct line between Order and Book you cannot answer simple questions like "how many copies of this book were in that specific order?" or "what was the discount applied per line?" There is no column to hold that data because there is no physical row to put it in.

Key principle: Any attribute that belongs to the relationship itself (not to either entity alone) is your clearest signal that a junction entity is needed. If you cannot decide which table an attribute belongs to, it almost certainly belongs in the junction.

Before Resolution: The Raw Many-to-Many

The diagram below shows a simplified clinic scenario before resolution. A Patient can book many Doctor appointments, and a Doctor can see many patients. The double crow-foot notation on both ends marks this as a many-to-many (M:N).

Unresolved many-to-many between Patient and Doctor Patient PK patientID fullName dateOfBirth phone Doctor PK doctorID fullName specialty phone books M N
Before resolution: a direct many-to-many between Patient and Doctor — valid in a logical model but cannot be implemented in a relational database.

After Resolution: Introducing the Junction Entity

To resolve the relationship, you remove the M:N line and introduce a new entity — Appointment — that sits between Patient and Doctor. The junction entity holds:

  • A foreign key to Patient (patientID) — one end of the original pair.
  • A foreign key to Doctor (doctorID) — the other end.
  • A composite primary key made from both FKs (or a surrogate PK plus a unique constraint on both FKs).
  • Any attributes of the relationship itself — in this case appointmentDate, startTime, status, and notes. These could not live in either Patient or Doctor without causing redundancy.
Resolved many-to-many: Patient — Appointment — Doctor junction entity Patient PK patientID fullName dateOfBirth phone Appointment PK appointmentID FK patientID FK doctorID appointmentDate startTime status notes Doctor PK doctorID fullName specialty phone 1 M M 1 Junction Entity
After resolution: the Appointment junction entity decomposes the M:N into two one-to-many relationships, and gains its own relationship-level attributes.
Naming the junction entity: Give it a meaningful business name, not a mechanical concatenation like PatientDoctor. In this clinic, the relationship is an "Appointment" — a real concept in the domain. A library might call it "Loan". A store might call it "OrderLine" or "OrderItem". A meaningful name signals the entity deserves to exist in its own right.

A Second Example: Online Store Order Lines

In an online store, a raw M:N exists between Order and Product. Resolving it creates an OrderItem junction entity that carries the attributes that belong to the specific combination — quantity and unitPrice (the price at the time of purchase, which may differ from today's product price).

Order (orderID PK, customerID FK, orderDate, totalAmount) OrderItem (orderID FK, productID FK, <-- composite PK quantity, unitPrice) <-- relationship attributes Product (productID PK, name, description, currentPrice, stockQty)

Without OrderItem, where would you store quantity? It cannot go on Order (one order covers many products) and it cannot go on Product (one product appears in many orders). The junction entity is the only logical home.

Identifying the Need for a Junction Entity — Checklist

  1. Both cardinality markers are "many" — a crow-foot on both ends of a relationship line is the immediate trigger.
  2. Attributes describe the pair, not either entity alonequantity, booking date, role in the project are classic examples.
  3. You need to record timestamps or ordering of the relationship — when did the patient first see that doctor? When was the book borrowed?
  4. The relationship itself has a lifecycle — an appointment can be confirmed, cancelled, or completed; a loan can be active, returned, or overdue.
Do not over-resolve. A pure one-to-many relationship does not need a junction entity. Adding one where it is not required creates unnecessary complexity, extra joins, and more code to maintain. Only introduce a junction when the M:N genuinely exists or when the relationship carries its own data.

Library Example: Student Borrows Book

A university library system illustrates a junction entity with a clear lifecycle. A Student can borrow many Books, and a Book (copy) can be borrowed by many students over time. The junction entity Loan captures borrowedDate, dueDate, returnedDate (nullable), and fineAmount. Each of these attributes is meaningless without both the student and the book in context.

Composite vs surrogate primary key: Some teams use a composite PK of both FKs (e.g., studentID + bookCopyID + borrowedDate if a student can borrow the same book twice). Others prefer a surrogate loanID as PK with a unique constraint on the natural combination. Both are valid — document your choice in the data dictionary.

Summary

Resolving many-to-many relationships is one of the most impactful steps in transforming a logical ERD into a physical one. Every M:N line becomes a junction entity with two foreign keys and any attributes that belong to the association. The junction entity often earns a meaningful business name — Appointment, OrderItem, Loan, Enrollment — because it represents a real concept in the domain. Mastering this transformation is essential before you move on to normalization.