Database Table Design for Canadian Electoral Data

In summary, my team lead asked me to create two tables for a database and to outline the rationale behind my table design choices. Table 1 should contain the following columns and be named vote_share:● Riding Number● Riding Name in English● Riding Name in French● Total votes● Turnout (votes / voters)● Conservative Vote Share● Liberal Vote Share● NDP Vote Share● Green Vote Share● Bloc Quebecois Vote Share● Peoples’ Party Vote ShareTable 2 should contain the following columns and be named candidates:● Riding Number● Liberal candidate● Conservative candidate● NDP candidate● Green candidate● Bloc Quebecois candidate
  • #1
highflyyer
28
1
I have been asked by my team lead to create two tables for a database and to outline the rationale behind my table design choices.Table 1 should contain the following columns and be named vote_share:
● Riding Number
● Riding Name in English
● Riding Name in French
● Total votes
● Turnout (votes / voters)
● Conservative Vote Share
● Liberal Vote Share
● NDP Vote Share
● Green Vote Share
● Bloc Quebecois Vote Share
● Peoples’ Party Vote ShareTable 2 should contain the following columns and be named candidates:
● Riding Number
● Liberal candidate
● Conservative candidate
● NDP candidate
● Green candidate
● Bloc Quebecois candidate
● Peoples’ Party candidateIf I normalize the two tables, will I end up with more than two tables?
 
Technology news on Phys.org
  • #2
highflyyer said:
If I normalize the two tables, will I end up with more than two tables?

Why would you normalize the two tables if you've already been told that there are two tables and exactly which columns should appear in each of them?
 
  • #3
This here is the updated question:

Consider the following in a database:

Table 1 should contain the following columns and be named vote_share:

● Riding Number

● Riding Name in English

● Riding Name in French

● Total votes

● Turnout (votes / voters)

● Conservative Vote Share

● Liberal Vote Share

● NDP Vote Share

● Green Vote Share

● Bloc Quebecois Vote Share

● Peoples’ Party Vote Share

Table 2 should contain the following columns and be named candidates:

● Riding Number

● Liberal candidate

● Conservative candidate

● NDP candidate

● Green candidate

● Bloc Quebecois candidate

● Peoples’ Party candidate

What would be

  1. the SQL statements one would use to create both tables
  2. the rationale behind the table design choices

The fact that my supervisor asks me to explain the rationale behind my table design choices leads me to wonder if I would have to normalize the tables. I am just a bit confused.
 
  • #4
highflyyer said:
The fact that my supervisor asks me to explain the rationale behind my table design choices

Are they your design choices, or someone else's?

In other words, is the description you give of the two tables and the columns in each something you came up with in response to the question, or was it given to you as part of the question?
 
  • #5
PeterDonis said:
In other words, is the description you give of the two tables and the columns in each something you came up with in response to the question, or was it given to you as part of the question?

It was given to me as part of the question.

My supervisor also mentioned that she created a schema in the database and that should I have write access to it, but I am not sure how that relates to the creation and population of the tables.
 
  • #6
highflyyer said:
It was given to me as part of the question.

Then it doesn't seem like you are being asked to make any changes to the table structure at all, which would mean you are not being asked to normalize the tables.

Of course, I don't know your supervisor; I'm just making a reasonable guess based on what you've said. If you are unable to understand why your supervisor is asking you these questions or what the ground rules are for what changes, if any, you are allowed to make in the table structure in your answers to them, the person you need to clarify that with is your supervisor; we certainly aren't going to be able to answer that here.
 
  • #7
highflyyer said:
  1. the SQL statements one would use to create both tables
  2. the rationale behind the table design choices

The fact that my supervisor asks me to explain the rationale behind my table design choices

From what you've said, they aren't your design choices; you didn't make them, your supervisor did (I assume your supervisor is the one who made up the question). So it seems like you're not being asked to explain why you made particular design choices; it seems like you're being asked if you understand why someone else (your supervisor) made those design choices, and whether you can explain that rationale.

Again, I don't know your supervisor and there are a lot of details you haven't given (such as, what is this database for?). I'm just making reasonable guesses based on what you've said.
 
  • #8
Okay
 
  • #9
Given that you’re designing two tables and given access to the database schema then your manager wants you to write the sql to create the tables in one of the database schemas.

You must decide by schema inspection which schema if there are many, best fits the type of data that the tables will hold.

show schemas — H2 sql

Next you need to decide the names for your tables and they must not collide with table names in the schema.

show tables from (schema) — H2 sql

next you must decide the names and datatypes for each of the fields in your tables And you must decide what will be the primary keys that bind the tables together which looks like the riding number.

show columns from (table) from (schema) — H2 sql

Having decided that some fields will be textual data like a name, address or phone number then you must decide the maximal number of characters. Some of this information you might find in other tables in the chosen schema or other schemas.

From there you will be able to write the sql to create the tables. This type of SQL is known as Data Definition Langauge or DDL for short. It covers database, schema and table actions like creating, altering, describing and deleting these DDL objects.

Other SQL statements pertain to adding, updating and dropping data from tables or in constructing index tables and views of tables and that only covers some of what SQL can do.

I imagine after you get past this task your manager will want you to write sql to load the tables, to select subsets of data from the tables and later to create index tables to optimize the select queries.

https://websitesetup.org/sql-cheat-sheet/

This cheat sheet shows many of the more common sql statements, use it in conjunction with google to find what you need. Sometimes sql syntax varies from vendor to vendor though mostly due to vendor specific extensions to the base language.

As a further reference here’s the H2 Database SQL guide. Please be aware that you should locate your database vendors SQL reference for the more obscure SQLstatements.

http://www.h2database.com/html/commands.html

http://www.h2database.com/html/grammar.html

The H2 database utility is a simple single jar java-based application. It’s very easy to use and even has a web based console for direct inspection of database schemas, tables and views. I’ve used it on a few projects years ago and it is still in use storing, retrieving and deleting large data records with nary a glitch. It is also a great learning tool.

Database design is an interesting pastime that you should enjoy.
 
  • Like
Likes sysprog
  • #10
If you want to do it properly, you want to build input (what form does the information come in) and output (what is the information being used for) schemas and smoosh them together, etc.

What you've been given could be technically right or wrong depending on many factors, most importantly "why" and "how" ? But, what counts is "Does it work" ?

Honestly, it looks like your supervisor just wants you to sign off on (ie: screwups are your responsibility), andor deskcheck (maybe they took a db101 course somewhere along the line), their design.

Or, it's a homework assignment, in which case there's a relevant other subforum.
 
Last edited:
  • Like
Likes sysprog
  • #11
I think it’s a homework assignment but not in the sense of HW on PF to acclimate you to their team environment and maybe you’ll become the keeper of the schema.
 
  • #12
hmmm27 said:
Honestly, it looks like your supervisor just wants you to sign off on (ie: screwups are your responsibility), andor deskcheck (maybe they took a db101 course somewhere along the line), their design.
All too frequently, someone in authority doesn't want to cede any of the authority, and consequently imposes 'pointy-haired boss' design and implementation decisions, and then offloads almost all of the accountability to someone who has the required ability ##-## in such a scenario the boss can take credit if things go well, and have someone else to take blame if they don't.

Depending partly on how technically astute the person is, I usually try more or less to get the recipient of my work product to specify requirements in terms of functionality, as distinguished from in terms of methods of implementation.

In the case instanter, it seems to me that the schema design is naive ##-## perhaps studying some existing databases, or some open-source models might be in order, e.g. https://www.idea.int/sites/default/files/publications/electoral-system-design-database-codebook.pdf, or https://github.com/alphagov/open-standards/issues/42.
 
  • Like
Likes hmmm27
  • #13
sysprog said:
In the case instanter, it seems to me that the schema design is naive
It isn't horrible - assuming the unspecified context is benign to it - just... yes, "naive" is a good choice of words.

--------

Don't mind the arrogance, but...

If'n it was me, I'd just make a flat file(s), simply mirroring input schema(e), and add a couple of audit fields ; leaving query building to runtime.

On the other hand, there's only a few hundred ridings in the country : you could literally do everything on a section of the office wall, achieving the same result as a computerized database.

Better actually, since the format allows input from non-tech inclined personnel.

But.

My rationale for lack-of-design is "Because, I've been doing this (on and off) for four decades, and I'm really lazy", which may not be considered the best response from a junior team member (even notwithstanding I mean "lazy" in the sense of "achieving optimum efficiency").

Given that, you might try "Looks good, boss, I'll get right on it". In that regards, it would be prudent to add a comprehensive input journalling file to the schema.

Go Green !
 
Last edited:
  • Like
Likes pbuk
  • #14
I feel like the type of thing you should be thinking about here look like: how is data input into this table? What are the most common potential input errors, and can the table design catch them? How is the data in the table going to be used, and how can you make it as easy as possible to use it?

As an example, if people are entering data by hand, and someone accidentally writes that there are 1222 votes instead of 12222 votes that's going to be tough to catch in the current system. If they were writing down quantities for everything instead of ratios it would easier to catch - if you write down total votes, number of voters, and votes for each party, you can validate with table constraints (or even just after the fact if you really don't want to prevent data entry) that the number of votes is a reasonable fraction of the number of voters, and the votes for each party add up to the number of votes.

On the output side, for example if you think 90% of queries people construct will begin by joining the two tables together, maybe you only want one table for easier use (or have a view that joins them).
 
  • #15
I'm with @hmmm27 here.

There are times and places for a normalized SQL database, but I'd just store this as JSON then it can be imported into any analytical software (or a NoSQL database) on demand. The Canadian government probably publish it like this anyway - oh, they are using CSV. There is an electoral district called Avalon, cool :cool:

We are talking about historical data that is published online so data entry errors are not relevant. I can see no benefits to applying any normalization - in fact I don't see anything to be gained by splitting into two tables; I would merge into one. The only normalization I would apply is to store numbers of votes rather than percentages.
 

What is database table design?

Database table design is the process of organizing and structuring data in a relational database. It involves making decisions about data types, relationships between tables, and how to store and retrieve data efficiently.

What are the key considerations when making database table design choices?

When making database table design choices, it is important to consider the purpose and goals of the database, the type and amount of data, and the potential for future scalability and performance.

What are the different types of database table design?

There are several types of database table design, including flat tables, star schemas, snowflake schemas, and fact and dimension tables. Each type has its own advantages and is suitable for different data needs.

How do I choose the right primary key for my database table?

When choosing a primary key for a database table, it is important to select a unique identifier for each record. This can be a single column or a combination of columns, and it should not change over time.

What are some common mistakes to avoid in database table design?

Some common mistakes to avoid in database table design include using inconsistent or redundant data, not considering future scalability, and not properly defining relationships and constraints between tables.

Similar threads

  • Programming and Computer Science
Replies
1
Views
1K
  • General Discussion
2
Replies
36
Views
5K
  • General Math
Replies
2
Views
3K
  • Special and General Relativity
Replies
1
Views
2K
  • MATLAB, Maple, Mathematica, LaTeX
Replies
5
Views
3K
Back
Top