overview of topics in the texbook :
Given data per user that lists what they like (say movie ratings), how should we pick suggested movies for a given individual?
How to organize in files complicated data collections?
Consider books, their authors, and their publishers.
This doesn't work well.
book
----------
id,name,phone,publisher,auther1,auther2,
problems:
Instead of trying to put an author list into a book as we would do in a language like python
book = {name='This book', authors=['George', 'Alice']}
we user reference pointers (ids) to refer to other "things" (objects) in our database.
Here's an example.
Publishers
------------
id, name, address, phone, ...
1, Bob's Books, ...
2, Mary's Magazines, ...
Authors
-------
id, name, address, phone, ...
1, John Smith, ...
2, Jane Doe, ....
Books
-----
id, name, ISBN, ...
1, A Field Guide to Benches, 1112223334
2, Rocket Science Made Easy, 9879879876
So far so good : no replication of information, no ever-expanding columns.
But now we need to make the relations. There are several types
Since each book only has a single publisher, we can put a single pointer in the book database to that publisher's id.
Books
-----
id, name, ISBN, publisher_id
1, A Field Guide to Benches, 1112223334, 2
2, Rocket Science Made Easy, 9879879876, 1
Many to many is the tricky one. We make a new table with an entry for each connection. Sometimes there's an obvious name for this table; sometimes not.
AuthorBook
-----------
id, author_id, book_id
1, 1, 1
2, 2, 1
3, 1, 2
The "id" in that last table is the unique identity of the connection ... which we may never need. (Unless we start making changes.)
Quick quiz: who are the authors of which books?
And here's favorite explanation of this stuff : the gaytabase (That version is not currently online, only this shorter version.)