Comp Sci SQL WHERE Clause: Repetitions & Order Explained

Crystal037 · Jan 8, 2023

I expect the output to be 30, 20, 3, 2, 1 in a row, but I am getting repetitions of these: 20 and 30 appear 3 times, 3 and 2 appear 18 times, and 1 appears 16 times. I don't understand how the combination of conditions in the WHERE clause leads to so many duplicates. If, instead of using OR, I divide the query into two and execute them one by one, I get only distinct values. Can you also explain the order in which the query traverses the tables, selects tuples, and displays them, so that I can understand why I am getting those multiple values and why in that specific order?

DrJohn · Jan 8, 2023

Repetition like that is usually due to a Cartesian join, where some part of the join is not defined, so each row of one table is joined to EVERY row of another table. I think the OR is one of the things doing that, so read up on Cartesian joins to see where it is going wrong.
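A minimal sketch of how an OR can leave part of a join undefined, using Python's built-in sqlite3 module. The original query and data are not shown in the thread, so the tables, columns, and rows below are hypothetical. Each branch of the OR constrains only two of the three tables, so every match is repeated once per row of the table that branch ignores.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows, assumed for illustration only.
conn.executescript("""
    CREATE TABLE project (pnumber INTEGER, dnum INTEGER);
    CREATE TABLE department (dnumber INTEGER, mgr TEXT);
    CREATE TABLE works_on (essn TEXT, pno INTEGER);
    INSERT INTO project VALUES (1, 5), (2, 5), (3, 4);
    INSERT INTO department VALUES (5, 'smith'), (4, 'wong');
    INSERT INTO works_on VALUES ('smith', 3), ('wong', 1), ('wong', 2);
""")

# One query, two OR branches. The first branch never mentions works_on and
# the second never mentions department, so that table is cross joined.
combined = conn.execute("""
    SELECT pnumber
    FROM project, department, works_on
    WHERE (dnum = dnumber AND mgr = 'smith')
       OR (pnumber = pno AND essn = 'smith')
""").fetchall()
or_counts = Counter(p for (p,) in combined)

# The same two conditions as separate queries, each over only the tables it
# needs. UNION also removes duplicates between the two halves.
split = conn.execute("""
    SELECT pnumber FROM project, department
    WHERE dnum = dnumber AND mgr = 'smith'
    UNION
    SELECT pnumber FROM project, works_on
    WHERE pnumber = pno AND essn = 'smith'
""").fetchall()
split_values = sorted(p for (p,) in split)

print(dict(or_counts))  # 1 and 2 three times each (once per works_on row), 3 twice
print(split_values)     # each pnumber once
```

In this toy data, projects 1 and 2 are repeated once per works_on row and project 3 once per department row, while the split version returns each value a single time.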

If your select was on more than just Pnumber, say by including Pno from works_on, or Pno from works_on and Dnumber from department, you should see groupings of data showing how the Cartesian join is repeatedly collecting data on a table-by-table basis.

When retrieving data, it always helps to specify the tables in the select's join rather than just a column name. You have not mentioned which table's dnum equals which table's dnumber, for example.
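As a rough illustration of qualifying every column with its table, here is a sketch with Python's sqlite3 module. The table layout and rows are assumptions, since the real schema is not shown in the thread. The aliases p and d make it explicit which table's dnum is compared with which table's dnumber.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables, assumed for illustration only.
conn.executescript("""
    CREATE TABLE project (pnumber INTEGER PRIMARY KEY, dnum INTEGER);
    CREATE TABLE department (dnumber INTEGER PRIMARY KEY, dname TEXT);
    INSERT INTO project VALUES (1, 5), (2, 5), (10, 4);
    INSERT INTO department VALUES (5, 'Research'), (4, 'Administration');
""")

# Every column carries its table alias, and the join condition sits in ON.
rows = conn.execute("""
    SELECT p.pnumber, d.dname
    FROM project AS p
    JOIN department AS d ON p.dnum = d.dnumber
    ORDER BY p.pnumber
""").fetchall()

print(rows)  # one row per project, because dnumber is unique in department
```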

DrJohn · Jan 8, 2023

The query optimiser will look at the number of rows it gets from lots of different joins and use that to work out which join to do first, then collect data from that and choose which one to do next. This depends on which indexes exist as well, if I remember correctly.
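Because the plan is chosen by the optimiser, SQL does not promise any particular row order unless the query has an ORDER BY. A small sketch with Python's sqlite3 module, using a hypothetical project table whose values are only assumed to resemble the ones mentioned in the question. SQLite's EXPLAIN QUERY PLAN shows the plan it picked, and other databases have their own equivalents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows, assumed for illustration only.
conn.executescript("""
    CREATE TABLE project (pnumber INTEGER, dnum INTEGER);
    INSERT INTO project VALUES (1, 5), (2, 5), (3, 5), (20, 1), (30, 4);
""")

# An explicit ORDER BY is the only way to make the output order deterministic.
ordered = [p for (p,) in conn.execute(
    "SELECT pnumber FROM project ORDER BY pnumber DESC")]
print(ordered)

# Ask SQLite which plan it chose. The exact text varies between versions.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT pnumber FROM project ORDER BY pnumber DESC"
).fetchall()
for step in plan:
    print(step)
```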

jack action · Jan 8, 2023

You could probably replace your first line with "select distinct pnumber" and get the result you expect. But that would be a bad fix.
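To see why DISTINCT only hides the problem, here is a sketch with Python's sqlite3 module on hypothetical tables (names and rows are assumptions, not the poster's data). The join condition is left out on purpose: DISTINCT returns tidy-looking values, but the database still builds the full Cartesian product underneath.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables, assumed for illustration only.
conn.executescript("""
    CREATE TABLE project (pnumber INTEGER, dnum INTEGER);
    CREATE TABLE department (dnumber INTEGER, dname TEXT);
    INSERT INTO project VALUES (1, 5), (2, 5), (10, 4);
    INSERT INTO department VALUES (5, 'A'), (4, 'B'), (1, 'C'), (7, 'D');
""")

# Missing join condition: 3 project rows x 4 department rows.
all_rows = conn.execute(
    "SELECT pnumber FROM project, department").fetchall()

# DISTINCT collapses the output, but the underlying join is still wrong.
distinct_rows = conn.execute(
    "SELECT DISTINCT pnumber FROM project, department").fetchall()

print(len(all_rows))       # 12 rows are generated
print(len(distinct_rows))  # 3 rows are shown
```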

The real fix you need is to rewrite your query with an appropriate JOIN syntax to eliminate the problem explained by @DrJohn. Here are links that explain how and why you should:

Maarten Havinga · Jan 8, 2023

If you join on dnum = dnumber you get the number 5 nine times, since dnumber is 5 three times and dnum is as well. This all goes for the same person, so the join isn't uniquely linked.

Maarten Havinga · Jan 8, 2023

Joining with commas, or with the equivalent cross join, takes every combination of rows once. If you then specify dnum = dnumber, the number of rows where both equal 5 becomes 9, since each table has 3 such rows. But you also join on pnum = pnumber, and these match on 2 and 3, where again dnum = dnumber = 5. Therefore these occurrences (2 and 3) are multiplied by 2 again, yielding 18 rows each for 2 and 3.

BTW, this also occurs with an inner join on the same conditions; you need a join condition that uniquely identifies one correct row.
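The 3 x 3 = 9 arithmetic can be checked with a small sketch in Python's sqlite3 module. The tables and rows are hypothetical, chosen only so that dnum and dnumber each equal 5 in three rows. The comma join and the inner join give the same count, and only a join against a unique key brings it back to one row per project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables, assumed for illustration only.
conn.executescript("""
    CREATE TABLE project (pnumber INTEGER, dnum INTEGER);
    CREATE TABLE dept_locations (dnumber INTEGER, dlocation TEXT);
    CREATE TABLE department (dnumber INTEGER PRIMARY KEY, dname TEXT);
    INSERT INTO project VALUES (1, 5), (2, 5), (3, 5);
    INSERT INTO dept_locations VALUES (5, 'Site A'), (5, 'Site B'), (5, 'Site C');
    INSERT INTO department VALUES (5, 'Research');
""")

def count(sql):
    """Run a COUNT(*) query and return the single number it produces."""
    return conn.execute(sql).fetchone()[0]

# Comma join with the condition in WHERE: 3 matching rows on each side.
comma = count(
    "SELECT COUNT(*) FROM project, dept_locations WHERE dnum = dnumber")

# Inner join on the same non-unique condition: same result.
inner = count(
    "SELECT COUNT(*) FROM project JOIN dept_locations ON dnum = dnumber")

# Join against a table where dnumber is unique: one row per project.
unique = count(
    "SELECT COUNT(*) FROM project AS p "
    "JOIN department AS d ON p.dnum = d.dnumber")

print(comma, inner, unique)  # 9 9 3
```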
