Relational databases – ER diagram Part 1

Relational databases – ER diagram Part 1

In this blog post you will get to know about ERD. ERD stands for Entity Relationship Diagram.
It’s a method to describe tables and its relationships. Entity and table are used interchangeably.
I will use a casus to explain this in a more concrete way.

In part 2 the remaining questions from the casus will be explained.

Casus
A municipality wants a database with information about its inhabitants.
The information will be used for planning of schools, health care and child care.


The database design needs to be able to answer these following questions:

1. How many boys and girls start school during year X?
2. How many become old-aged pensioners in year X?
3. How many households have more than X people?
4. How many people are single parents?
5. How many households have at least one member unemployed?
6. How many households have a total income less than the norm for receiving social benefits?

The first step is to identify the entities.

Entity human being

Since it is about inhabitants, we need a table to keep track of person information.
Just for the fun, we call it “humanBeing”, other names could be “person” or “inhabitant”.
Every table needs a primary key to identify a unique record. Here it is id_humanBeing.
Other useful colums would be address information, birthday, sex.
So we can have the following table:
humanBeing (id_humanBeing, id_address, birthdate, sex, inactive, died)
Birthday is of date type, e.g. 1984-05-25. Sex is F or M.
Inactive indicates the person is living outside this municipality.

Entity address

Since there are many types of information to store about an address, we use a separate table “address”.
The table can have the following columns:
address (id_address, streetName, doorNumber, zipcode)
The primary key id_address is used by the humanBeing table to retrieve the unique address record.
In the table humanBeing the id_address column is called a foreign key.

The relationship between these two tables are like this:

 

A human being has zero (minimum) or one (maximum) address. Think of a refugee as zero address.
When a human being moves to a different address within the municipality, the human being record will be changed.
So still there is only one record.
An address can be used by many human beings, the crow’s foot means many (zero or more).
Think of a new building with apartments that is not yet inhabited as zero.

 

 

 

 

 

 

Answers question 2 and 3

With these two tables we can already answer two questions:
2. How many become old-aged pensioners in year X?
3. How many households have more than X people?

 

 

 

 

 

 

 

For question 2 we only need the birthday, year X and the age of pension.
Let’s decide the age of pension is 65 and year X is 2015.
This X variable can be filled by a user selecting a year from a dropdown menu.
A SQL query that would get the answer is for example the following:

SELECT COUNT(*) AS 'NoOfPensioners' 
FROM humanBeing
WHERE 2015 - YEAR(birthdate) = 65

A query is used to communicate with the database. A written command to retrieve some data, to modify or to delete the data.

 

 

 

 

 

For question 3, we need the address table, a link to human being table and X people.
Let’s say X is 6 people.
A household can be determined by an address.
A SQL query that would get this answer would be:

SELECT COUNT(id_humanBeing) AS 'NoOfHouseholds'
FROM address
JOIN humanBeing ON humanBeing.id_address = address.id_address
GROUP BY humanBeing.id_address
HAVING COUNT(id_humanBeing) > 6

To answer the question about boys and girls going to school, we need some more tables.

Entity school

To keep the table simple, we can have like the following minimal columns:
school (id_school, schoolName, creationDate)
Creation date is the date of entering the information in the database.

Entity schoolYear

To link a humanBeing to school, we need at least one more table.
We cannot just connect a school to humanBeing entity by adding id_school in humanBeing table.
A human being can attend more than one school.
It is therefore not efficient to have more than one column to store schools in.
The new table, we call it schoolYear and can have the following columns:
schoolYear (id_schoolYear, id_school, startDate, endDate)
The start and end date is of a school year, e.g. 1 September 2018 and end date 05 July 2019.

Entity student

We could have put id_humanBeing in the schoolYear table, that works fine.
Then this new entity “student” is also not needed.
To introduce this new table, is my decision to prevent redundant information.
Since more than one person is attending a school year by a given school, three columns will be duplicated: id_school, startDate and endDate.
A link to student entity can make the combination of those three columns unique (appearing once in the table).
The student entity can have the following columns:
student (id_student, id_humanBeing, id_schoolYear)

Here is the relationship between the various tables:

A human being links to zero or more student (entity).
This simply means a human being can be a student (the zero annotation as minimum).
Not all human beings are or have been a student.
They could have completed their studies in a different municipality or even a different country.
The human being, as a student, in a specific school year is indicated by the 1 annotation.
So a human being can be studying a few years, meaning being student in specific years.
A school year has one or more students, one is the minimum.
A school has one or more school years and a specific school year belongs to a school.

 

 

 

 

 

Answer question 1

1. How many boys and girls start school during year X?

 

 

 

 

 

 

 

 

I read how many people start school without taking the sex into consideration.
To make it more fun, we will make a distinction on how many boys and how many girls as well.
Let’s say school year is 2017.
Here we need the sex from humanBeing table and the startDate of the schoolYear table.

An example of a query would be:

SELECT sex, COUNT(sex) as amount
FROM student
JOIN humanBeing on humanBeing.id_humanBeing = student.id_humanBeing
JOIN schoolYear ON schoolYear.id_schoolYear = student.id_schoolYear
WHERE year(startDate) = '2017'
GROUP BY sex
UNION
SELECT 'Both', COUNT(sex) as amount
FROM student
JOIN humanBeing on humanBeing.id_humanBeing = student.id_humanBeing
JOIN schoolYear ON schoolYear.id_schoolYear = student.id_schoolYear
WHERE year(startDate) = '2017'

The data it gives back looks like this:

Sex Amount
M 3000
F 2800
Both 5800

Comments

comments

Leave a Reply