Protocol for PKs (Primary Keys) with Null Entries?

Click For Summary

Discussion Overview

The discussion revolves around the challenges of setting up primary keys (PKs) in a database table for employee dependents, particularly in the context of handling null entries and multi-valued relationships. Participants explore normalization, entity-relationship diagrams (ERDs), and the implications of naming conventions in database design.

Discussion Character

  • Technical explanation
  • Conceptual clarification
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant suggests using a composite PK of (Emp_Num, Emp_DepName, Emp_DepRelation) but raises concerns about null entries for employees without dependents.
  • Another participant argues that rows for employees without dependents should not be entered, thus validating the PK.
  • A scenario is presented where two employees may have dependents with the same name, questioning the uniqueness of the proposed PK.
  • Some participants propose adding an auto-increment numeric PK to the EmpDependent table to avoid issues with name collisions and null entries.
  • There is discussion about the implications of using employee names as dependents to indicate no dependents, with concerns about memory usage and search efficiency.
  • One participant suggests renaming the table to clarify that it only includes employees with dependents, while another emphasizes that employees without dependents should simply have no related records.
  • Several participants discuss the importance of designing a schema that includes separate tables for employees and dependents, with appropriate foreign key relationships.

Areas of Agreement / Disagreement

Participants express differing views on how to handle primary keys and null entries, with no consensus on the best approach. Some advocate for using a numeric PK, while others support the composite key approach despite its potential issues.

Contextual Notes

Participants highlight various assumptions, such as the uniqueness of names and the handling of employees with no dependents. The discussion also reflects uncertainty regarding the implications of naming conventions and database design choices.

Who May Find This Useful

This discussion may be useful for database designers, students learning about relational database management systems (RDBMS), and professionals seeking to understand the complexities of normalization and primary key design.

WWGD
Science Advisor
Homework Helper
Messages
7,802
Reaction score
13,106
Hi all :I have the table Emp (EmpNum PK, EmpLastNM, EmpEducation, EmpDependent, EmpEducation,...EMpHireDate) which I want to normalize and then I want to do an ERD for it.

I must break this into tables one for which , it seems, I cannot have a non-null PK:

Now EmpDependent is both multi-valued and has some nulls. My question is, what can I do about setting up a PK for empdependent given the null?
I have an idea of using the triple (Emp_Num, Emp_DepName, Emp_DepRelation) as the PK; an employee may have more than one dependent by the same name, but (given dependents are nuclear family) , not likely two dependents with both same name and relation.

STILL, there is an employee without dependents, meaning this triple will contain some nulls. How does one usually deal with this?
 
Technology news on Phys.org
You wouldn't enter rows in this table for employees with no dependents. There is no point in doing so; then your PK is valid.
 
Employee John Smith has a son Peter Smith. He divorces, gets full custody of Peter, and marries again with Carole. Alas, Carole was married before to Brian Smith, and they also had a son Peter Smith, for whom Carole has full custody. Now the nuclear family of John and Carole has two Peter Smith's in it.

Quite unlikely, sure, but I wouldn't structure a database with a PK that could conceivably become non-unique.
Why not add a PK (autonumber) to the EmpDependent table?

Another potential problem with the structure are two employees married to each other. Are their children dependents of both? Maybe you need three tables: employees, dependents, employees_dependents_relation.
 
Last edited:
Samy_A said:
Employee John Smith has a son Peter Smith. He divorces, gets full custody of Peter, and marries again with Carole. Alas, Carole was married before to Brian Smith, and they also had a son Peter Smith, for whom Carole has full custody. Now the nuclear family of John and Carole has two Peter Smith's in it.

Quite unlikely, sure, but I wouldn't structure a database with a PK that could conceivably become non-unique.
Why not add a PK (autonumber) to the EmpDependent table?

Another potential problem with the structure are two employees married to each other. Are their children dependents of both? Maybe you need three tables: employees, dependents, employees_dependents_relation.

Thanks, that is precisely the key I had in mind: {emp_num, emp_dependentname, emp_relation}. But you're right to consider guarding against very common names: john smith, bob jones, etc. Still, the may be a way around it: we can use Pete Smith for one, Peter Smith for the other and so on. This I think can be made to work for most English names , who have informal as well as formal names :Charles, Charlie, Chuck, Pete, Peter, Bob, Robert. I think something similar may be the case for names in other languages. A problem from this may be the one may want to enter the person's actual legal name. And you don't want to introduce the SSN, since you want to keep it private. But I think the issue of wanting the person's legal name may not really matter that much, as this is kind of secondary and could be obtained in a different way.
 
PAllen said:
You wouldn't enter rows in this table for employees with no dependents. There is no point in doing so; then your PK is valid.
I wonder if using the employee's name itself as a dependent would also work for these cases, understood (maybe written some way in the data dictionary) to mean the employee has no dependents. But this may slow the search and take up a lot of memory . So maybe your advice is best and we could rename the table: EmployeesWithDependents to make your point clear.
 
WWGD said:
Thanks, that is precisely the key I had in mind: {emp_num, emp_dependentname, emp_relation}.
But you're right to consider guarding against very common names: john smith, bob jones, etc. Still, the may be a way around it: we can use Pete Smith for one, Peter Smith for the other and so on. This I think can be made to work for most English names , who have informal as well as formal names :Charles, Charlie, Chuck, Pete, Peter, Bob, Robert. I think something similar may be the case for names in other languages. A problem from this may be the one may want to enter the person's actual legal name. And you don't want to introduce the SSN, since you want to keep it private. But I think the issue of wanting the person's legal name may not really matter that much, as this is kind of secondary and could be obtained in a different way.
I think we are talking about different things here.
The combination of fields {emp_num, emp_dependentname, emp_relation} as primary key doesn't look like a good idea. Once you have to start with tricks (enter Charles as Charlie if need be, ...) it will get messy sooner rather than later.
If we ignore the problem of dependents of two employees I mentioned at the end of my previous post, why not simply add a numeric primary key to your EmpDependent table? That way you don't have to do any tricks with common names: every dependent will have his own key.

WWGD said:
I wonder if using the employee's name itself as a dependent would also work for these cases, understood (maybe written some way in the data dictionary) to mean the employee has no dependents. But this may slow the search and take up a lot of memory . So maybe your advice is best and we could rename the table: EmployeesWithDependents to make your point clear.
I don't understand this. I thought the second table was a table with dependents. The name EmployeesWithDependents suggests that this is a table of Employees. @PAllen advice is correct: employees with no dependent simply will have no related records in the EmpDependent table. No need for any special infrastucture. If you need a list of employees without dependents, an outer join between the two tables, with a NULL criterium on the EmpDependent PK will provide that.
 
Simply put, this is all that you need to do.
Samy_A said:
...simply add a numeric primary key to your EmpDependent table
PAllen said:
You wouldn't enter rows in this table for employees with no dependents.
 
OK, thanks to all; as it must be clear to you, I am pretty new at this, so I lack a context in which to make reasonable/sound decisions at this point.
 
Just picked up on this and in relation to your other recent posts I have a couple of questions:
how are you learning this stuff - are you following a course or book?
what is your aim and what tools are you using - SqlServer is great if you are developing an enterprise application and have access to all the tools in a full .NET development environment, but if you just want to learn about RDBMSs there may be better ways

As a guide, your learning should lead you to designing a schema something like this:
  • There is an employee entity so we need an employees table
    [*]Whilst it looks like we could use EmployeeNumber as the Primary Key for this table there are advantages to having a key that is independent of any property value so we use an auto-increment key which is by convention called id.
    [*]An Employee can have more than one Dependent, so we need a dependents table to represent this entity
    [*]Again the Dependents table has an auto-increment key called id.
    [*]The Dependents table has a column fk_employee which creates a many-to-one relationship to the Employees table.
 
  • #10
MrAnchovy said:
Just picked up on this and in relation to your other recent posts I have a couple of questions:
how are you learning this stuff - are you following a course or book?
what is your aim and what tools are you using - SqlServer is great if you are developing an enterprise application and have access to all the tools in a full .NET development environment, but if you just want to learn about RDBMSs there may be better ways

As a guide, your learning should lead you to designing a schema something like this:
  • There is an employee entity so we need an employees table
    [*]Whilst it looks like we could use EmployeeNumber as the Primary Key for this table there are advantages to having a key that is independent of any property value so we use an auto-increment key which is by convention called id.
    [*]An Employee can have more than one Dependent, so we need a dependents table to represent this entity
    [*]Again the Dependents table has an auto-increment key called id.
    [*]The Dependents table has a column fk_employee which creates a many-to-one relationship to the Employees table.


I am taking a class actually. We have done some setting up of ERDs given different types of data tables. Schools did not provide us with software, only class was based on MSSQL2014 , so I used the free version offered by minisoft (which I had trouble installing, BTW).

So I am trying to learn how to work with it, but I also find it interesting and I want to learn it for its own sake.
 
Last edited:

Similar threads

  • · Replies 1 ·
Replies
1
Views
2K
Replies
5
Views
2K
  • · Replies 50 ·
2
Replies
50
Views
9K
Replies
2
Views
2K
  • · Replies 6 ·
Replies
6
Views
2K
  • · Replies 5 ·
Replies
5
Views
3K
  • · Replies 6 ·
Replies
6
Views
2K
  • · Replies 7 ·
Replies
7
Views
1K
  • · Replies 5 ·
Replies
5
Views
4K
  • · Replies 2 ·
Replies
2
Views
4K