Protocol for PKs (Primary Keys) with Null Entries?

In summary, you would use the combination of fields {emp_num, emp_dependentname, emp_relation} as the primary PK for the EmpDependent table. You would guard against common names by using the combination of fields as well as by naming the table "EmployeesWithDependents."
  • #1
WWGD
Science Advisor
Gold Member
7,003
10,421
Hi all :I have the table Emp (EmpNum PK, EmpLastNM, EmpEducation, EmpDependent, EmpEducation,...EMpHireDate) which I want to normalize and then I want to do an ERD for it.

I must break this into tables one for which , it seems, I cannot have a non-null PK:

Now EmpDependent is both multi-valued and has some nulls. My question is, what can I do about setting up a PK for empdependent given the null?
I have an idea of using the triple (Emp_Num, Emp_DepName, Emp_DepRelation) as the PK; an employee may have more than one dependent by the same name, but (given dependents are nuclear family) , not likely two dependents with both same name and relation.

STILL, there is an employee without dependents, meaning this triple will contain some nulls. How does one usually deal with this?
 
Technology news on Phys.org
  • #2
You wouldn't enter rows in this table for employees with no dependents. There is no point in doing so; then your PK is valid.
 
  • #3
Employee John Smith has a son Peter Smith. He divorces, gets full custody of Peter, and marries again with Carole. Alas, Carole was married before to Brian Smith, and they also had a son Peter Smith, for whom Carole has full custody. Now the nuclear family of John and Carole has two Peter Smith's in it.

Quite unlikely, sure, but I wouldn't structure a database with a PK that could conceivably become non-unique.
Why not add a PK (autonumber) to the EmpDependent table?

Another potential problem with the structure are two employees married to each other. Are their children dependents of both? Maybe you need three tables: employees, dependents, employees_dependents_relation.
 
Last edited:
  • #4
Samy_A said:
Employee John Smith has a son Peter Smith. He divorces, gets full custody of Peter, and marries again with Carole. Alas, Carole was married before to Brian Smith, and they also had a son Peter Smith, for whom Carole has full custody. Now the nuclear family of John and Carole has two Peter Smith's in it.

Quite unlikely, sure, but I wouldn't structure a database with a PK that could conceivably become non-unique.
Why not add a PK (autonumber) to the EmpDependent table?

Another potential problem with the structure are two employees married to each other. Are their children dependents of both? Maybe you need three tables: employees, dependents, employees_dependents_relation.

Thanks, that is precisely the key I had in mind: {emp_num, emp_dependentname, emp_relation}. But you're right to consider guarding against very common names: john smith, bob jones, etc. Still, the may be a way around it: we can use Pete Smith for one, Peter Smith for the other and so on. This I think can be made to work for most English names , who have informal as well as formal names :Charles, Charlie, Chuck, Pete, Peter, Bob, Robert. I think something similar may be the case for names in other languages. A problem from this may be the one may want to enter the person's actual legal name. And you don't want to introduce the SSN, since you want to keep it private. But I think the issue of wanting the person's legal name may not really matter that much, as this is kind of secondary and could be obtained in a different way.
 
  • #5
PAllen said:
You wouldn't enter rows in this table for employees with no dependents. There is no point in doing so; then your PK is valid.
I wonder if using the employee's name itself as a dependent would also work for these cases, understood (maybe written some way in the data dictionary) to mean the employee has no dependents. But this may slow the search and take up a lot of memory . So maybe your advice is best and we could rename the table: EmployeesWithDependents to make your point clear.
 
  • #6
WWGD said:
Thanks, that is precisely the key I had in mind: {emp_num, emp_dependentname, emp_relation}.
But you're right to consider guarding against very common names: john smith, bob jones, etc. Still, the may be a way around it: we can use Pete Smith for one, Peter Smith for the other and so on. This I think can be made to work for most English names , who have informal as well as formal names :Charles, Charlie, Chuck, Pete, Peter, Bob, Robert. I think something similar may be the case for names in other languages. A problem from this may be the one may want to enter the person's actual legal name. And you don't want to introduce the SSN, since you want to keep it private. But I think the issue of wanting the person's legal name may not really matter that much, as this is kind of secondary and could be obtained in a different way.
I think we are talking about different things here.
The combination of fields {emp_num, emp_dependentname, emp_relation} as primary key doesn't look like a good idea. Once you have to start with tricks (enter Charles as Charlie if need be, ...) it will get messy sooner rather than later.
If we ignore the problem of dependents of two employees I mentioned at the end of my previous post, why not simply add a numeric primary key to your EmpDependent table? That way you don't have to do any tricks with common names: every dependent will have his own key.

WWGD said:
I wonder if using the employee's name itself as a dependent would also work for these cases, understood (maybe written some way in the data dictionary) to mean the employee has no dependents. But this may slow the search and take up a lot of memory . So maybe your advice is best and we could rename the table: EmployeesWithDependents to make your point clear.
I don't understand this. I thought the second table was a table with dependents. The name EmployeesWithDependents suggests that this is a table of Employees. @PAllen advice is correct: employees with no dependent simply will have no related records in the EmpDependent table. No need for any special infrastucture. If you need a list of employees without dependents, an outer join between the two tables, with a NULL criterium on the EmpDependent PK will provide that.
 
  • #7
Simply put, this is all that you need to do.
Samy_A said:
...simply add a numeric primary key to your EmpDependent table
PAllen said:
You wouldn't enter rows in this table for employees with no dependents.
 
  • #8
OK, thanks to all; as it must be clear to you, I am pretty new at this, so I lack a context in which to make reasonable/sound decisions at this point.
 
  • #9
Just picked up on this and in relation to your other recent posts I have a couple of questions:
how are you learning this stuff - are you following a course or book?
what is your aim and what tools are you using - SqlServer is great if you are developing an enterprise application and have access to all the tools in a full .NET development environment, but if you just want to learn about RDBMSs there may be better ways

As a guide, your learning should lead you to designing a schema something like this:
  • There is an employee entity so we need an employees table
  • Whilst it looks like we could use EmployeeNumber as the Primary Key for this table there are advantages to having a key that is independent of any property value so we use an auto-increment key which is by convention called id.
  • An Employee can have more than one Dependent, so we need a dependents table to represent this entity
  • Again the Dependents table has an auto-increment key called id.
  • The Dependents table has a column fk_employee which creates a many-to-one relationship to the Employees table.
 
  • #10
MrAnchovy said:
Just picked up on this and in relation to your other recent posts I have a couple of questions:
how are you learning this stuff - are you following a course or book?
what is your aim and what tools are you using - SqlServer is great if you are developing an enterprise application and have access to all the tools in a full .NET development environment, but if you just want to learn about RDBMSs there may be better ways

As a guide, your learning should lead you to designing a schema something like this:
  • There is an employee entity so we need an employees table
  • Whilst it looks like we could use EmployeeNumber as the Primary Key for this table there are advantages to having a key that is independent of any property value so we use an auto-increment key which is by convention called id.
  • An Employee can have more than one Dependent, so we need a dependents table to represent this entity
  • Again the Dependents table has an auto-increment key called id.
  • The Dependents table has a column fk_employee which creates a many-to-one relationship to the Employees table.
I am taking a class actually. We have done some setting up of ERDs given different types of data tables. Schools did not provide us with software, only class was based on MSSQL2014 , so I used the free version offered by minisoft (which I had trouble installing, BTW).

So I am trying to learn how to work with it, but I also find it interesting and I want to learn it for its own sake.
 
Last edited:

1. What is a primary key (PK)?

A primary key is a unique identifier for each record in a database table. It is used to distinguish one record from another and is typically a numeric or alphanumeric value.

2. What is the purpose of a primary key?

The primary key serves as a reference for other tables and ensures that each record in a table can be uniquely identified and accessed.

3. Can a primary key have null entries?

Yes, a primary key can have null entries, but it is not recommended. A null value in a primary key means that the record does not have a unique identifier, which can cause issues with data integrity and querying.

4. How can null entries in primary keys be avoided?

To avoid null entries in primary keys, the database should be designed with proper data validation and constraints. This can include setting the primary key as a required field, enforcing uniqueness, and using auto-incrementing values.

5. What is the best practice for handling null entries in primary keys?

The best practice is to avoid null entries in primary keys whenever possible. However, if null entries are necessary, they should be used sparingly and with caution. It is important to regularly review the database design and ensure that the null entries do not cause any issues with data integrity.

Similar threads

  • Programming and Computer Science
Replies
1
Views
1K
  • Programming and Computer Science
2
Replies
50
Views
4K
  • Programming and Computer Science
Replies
5
Views
1K
  • Programming and Computer Science
Replies
2
Views
1K
  • Programming and Computer Science
Replies
6
Views
1K
  • Programming and Computer Science
Replies
5
Views
2K
  • Set Theory, Logic, Probability, Statistics
2
Replies
35
Views
2K
  • Set Theory, Logic, Probability, Statistics
Replies
7
Views
639
  • Programming and Computer Science
Replies
2
Views
1K
  • Programming and Computer Science
Replies
6
Views
1K
Back
Top