Protocol for PKs (Primary Keys) with Null Entries?

WWGD · Dec 20, 2015

Hi all :I have the table Emp (EmpNum PK, EmpLastNM, EmpEducation, EmpDependent, EmpEducation,...EMpHireDate) which I want to normalize and then I want to do an ERD for it.

I must break this into tables one for which , it seems, I cannot have a non-null PK:

Now EmpDependent is both multi-valued and has some nulls. My question is, what can I do about setting up a PK for empdependent given the null?
I have an idea of using the triple (Emp_Num, Emp_DepName, Emp_DepRelation) as the PK; an employee may have more than one dependent by the same name, but (given dependents are nuclear family) , not likely two dependents with both same name and relation.

STILL, there is an employee without dependents, meaning this triple will contain some nulls. How does one usually deal with this?

PAllen · Dec 21, 2015

You wouldn't enter rows in this table for employees with no dependents. There is no point in doing so; then your PK is valid.

Samy_A · Dec 21, 2015

Employee John Smith has a son Peter Smith. He divorces, gets full custody of Peter, and marries again with Carole. Alas, Carole was married before to Brian Smith, and they also had a son Peter Smith, for whom Carole has full custody. Now the nuclear family of John and Carole has two Peter Smith's in it.

Quite unlikely, sure, but I wouldn't structure a database with a PK that could conceivably become non-unique.
Why not add a PK (autonumber) to the EmpDependent table?

Another potential problem with the structure are two employees married to each other. Are their children dependents of both? Maybe you need three tables: employees, dependents, employees_dependents_relation.

WWGD · Dec 21, 2015

Samy_A said:

Employee John Smith has a son Peter Smith. He divorces, gets full custody of Peter, and marries again with Carole. Alas, Carole was married before to Brian Smith, and they also had a son Peter Smith, for whom Carole has full custody. Now the nuclear family of John and Carole has two Peter Smith's in it.

Quite unlikely, sure, but I wouldn't structure a database with a PK that could conceivably become non-unique.
Why not add a PK (autonumber) to the EmpDependent table?

Another potential problem with the structure are two employees married to each other. Are their children dependents of both? Maybe you need three tables: employees, dependents, employees_dependents_relation.

Thanks, that is precisely the key I had in mind: {emp_num, emp_dependentname, emp_relation}. But you're right to consider guarding against very common names: john smith, bob jones, etc. Still, the may be a way around it: we can use Pete Smith for one, Peter Smith for the other and so on. This I think can be made to work for most English names , who have informal as well as formal names :Charles, Charlie, Chuck, Pete, Peter, Bob, Robert. I think something similar may be the case for names in other languages. A problem from this may be the one may want to enter the person's actual legal name. And you don't want to introduce the SSN, since you want to keep it private. But I think the issue of wanting the person's legal name may not really matter that much, as this is kind of secondary and could be obtained in a different way.

WWGD · Dec 21, 2015

PAllen said:

You wouldn't enter rows in this table for employees with no dependents. There is no point in doing so; then your PK is valid.

I wonder if using the employee's name itself as a dependent would also work for these cases, understood (maybe written some way in the data dictionary) to mean the employee has no dependents. But this may slow the search and take up a lot of memory . So maybe your advice is best and we could rename the table: EmployeesWithDependents to make your point clear.

Samy_A · Dec 21, 2015

WWGD said:

Thanks, that is precisely the key I had in mind: {emp_num, emp_dependentname, emp_relation}.
But you're right to consider guarding against very common names: john smith, bob jones, etc. Still, the may be a way around it: we can use Pete Smith for one, Peter Smith for the other and so on. This I think can be made to work for most English names , who have informal as well as formal names :Charles, Charlie, Chuck, Pete, Peter, Bob, Robert. I think something similar may be the case for names in other languages. A problem from this may be the one may want to enter the person's actual legal name. And you don't want to introduce the SSN, since you want to keep it private. But I think the issue of wanting the person's legal name may not really matter that much, as this is kind of secondary and could be obtained in a different way.

I think we are talking about different things here.
The combination of fields {emp_num, emp_dependentname, emp_relation} as primary key doesn't look like a good idea. Once you have to start with tricks (enter Charles as Charlie if need be, ...) it will get messy sooner rather than later.
If we ignore the problem of dependents of two employees I mentioned at the end of my previous post, why not simply add a numeric primary key to your EmpDependent table? That way you don't have to do any tricks with common names: every dependent will have his own key.

WWGD said:

I wonder if using the employee's name itself as a dependent would also work for these cases, understood (maybe written some way in the data dictionary) to mean the employee has no dependents. But this may slow the search and take up a lot of memory . So maybe your advice is best and we could rename the table: EmployeesWithDependents to make your point clear.

I don't understand this. I thought the second table was a table with dependents. The name EmployeesWithDependents suggests that this is a table of Employees. @PAllen advice is correct: employees with no dependent simply will have no related records in the EmpDependent table. No need for any special infrastucture. If you need a list of employees without dependents, an outer join between the two tables, with a NULL criterium on the EmpDependent PK will provide that.

Borg · Dec 21, 2015

Simply put, this is all that you need to do.

Samy_A said:

...simply add a numeric primary key to your EmpDependent table

PAllen said:

You wouldn't enter rows in this table for employees with no dependents.

WWGD · Dec 21, 2015

OK, thanks to all; as it must be clear to you, I am pretty new at this, so I lack a context in which to make reasonable/sound decisions at this point.

pbuk · Dec 23, 2015

Just picked up on this and in relation to your other recent posts I have a couple of questions:
how are you learning this stuff - are you following a course or book?
what is your aim and what tools are you using - SqlServer is great if you are developing an enterprise application and have access to all the tools in a full .NET development environment, but if you just want to learn about RDBMSs there may be better ways

As a guide, your learning should lead you to designing a schema something like this:

There is an employee entity so we need an employees table
[*]Whilst it looks like we could use EmployeeNumber as the Primary Key for this table there are advantages to having a key that is independent of any property value so we use an auto-increment key which is by convention called id.
[*]An Employee can have more than one Dependent, so we need a dependents table to represent this entity
[*]Again the Dependents table has an auto-increment key called id.
[*]The Dependents table has a column fk_employee which creates a many-to-one relationship to the Employees table.

WWGD · Dec 23, 2015

MrAnchovy said:

Just picked up on this and in relation to your other recent posts I have a couple of questions:
how are you learning this stuff - are you following a course or book?
what is your aim and what tools are you using - SqlServer is great if you are developing an enterprise application and have access to all the tools in a full .NET development environment, but if you just want to learn about RDBMSs there may be better ways

As a guide, your learning should lead you to designing a schema something like this:

There is an employee entity so we need an employees table
[*]Whilst it looks like we could use EmployeeNumber as the Primary Key for this table there are advantages to having a key that is independent of any property value so we use an auto-increment key which is by convention called id.
[*]An Employee can have more than one Dependent, so we need a dependents table to represent this entity
[*]Again the Dependents table has an auto-increment key called id.
[*]The Dependents table has a column fk_employee which creates a many-to-one relationship to the Employees table.

I am taking a class actually. We have done some setting up of ERDs given different types of data tables. Schools did not provide us with software, only class was based on MSSQL2014 , so I used the free version offered by minisoft (which I had trouble installing, BTW).

So I am trying to learn how to work with it, but I also find it interesting and I want to learn it for its own sake.

Protocol for PKs (Primary Keys) with Null Entries?

Similar threads

Hot Threads

Hackathon ideas?

Touch-typing for programmers

How to calculate Tension for a series of connected points?

Trying To Debug A Python File

Python Complaining About Python

Recent Insights

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Insights What Exactly is Dirac’s Delta Function? - Insight

Insights Relativator (Circular Slide-Rule): Simulated with Desmos - Insight

Insights Fixing Things Which Can Go Wrong With Complex Numbers

Insights Fermat's Last Theorem

Insights Why Vector Spaces Explain The World: A Historical Perspective