Boolean operators in SQL - Correct Syntax?

  • Thread starter Thread starter polystethylene
  • Start date Start date
  • Tags Tags
    Operators Sql
AI Thread Summary
The discussion revolves around constructing an SQL query for the Sloan Digital Sky Survey to exclude a specific triangular region in a 4-dimensional color space while maintaining a condition for blue objects. The user initially struggles with the NOT operator in their WHERE clause, which results in selecting all blue items instead of excluding the intended triangle. Key advice includes adopting an iterative approach to debugging the query by commenting out lines and testing smaller segments, as well as separating the NOT clauses for clarity. The user learns that the fields in question represent star magnitudes across five color bands (u, g, r, i, z) and that the definition of "blue" is based on the difference between u and g magnitudes. Ultimately, the user finds a solution by restructuring the NOT clauses to address each 2-D color space independently, leading to a successful query. However, they express frustration over not fully understanding why the final query works, highlighting a common challenge in SQL query construction and debugging.
polystethylene
Messages
17
Reaction score
0
Hi all,

Trying to write an SQL query for the Sloan Digital Sky Survey that uses the NOT operator, and failing miserably.

Basically, I'm making photometric cuts in 4-d colour space, and I currently have a selection of inequalities that select enclosed regions of colour space.

However, I now want to NOT that triangle, and select everything but that triangle, whilst still specifying a simple 'must be bluer than:...' colour cut.

The problem is that I can't get the query to return the data with the missing triangle, it simply selects all blue items.

This is what I've tried so far:

(This is in the WHERE clause)

and (P.psfmag_u-P.psfmag_g) < 0.55
and NOT ( (P.psfmag_u-P.psfmag_g) >= (1.27*(P.psfmag_g- P.psfmag_r) + 0.25)
and (P.psfmag_u-P.psfmag_g) <= ((3.9*(P.psfmag_g-P.psfmag_r)) + 1.8)


and (P.psfmag_g- P.psfmag_r) > (1.6*(P.psfmag_r- P.psfmag_i) - 0.05)
and (P.psfmag_g- P.psfmag_r) < (1.28*(P.psfmag_r- P.psfmag_i) + 0.25)
and (P.psfmag_g- P.psfmag_r) < (-1.45*(P.psfmag_r- P.psfmag_i) + 0.2)
and (P.psfmag_r- P.psfmag_i) > (3.73*(P.psfmag_i- P.psfmag_z) - 0.24)
and (P.psfmag_r- P.psfmag_i) < (0.725*(P.psfmag_i- P.psfmag_z) + 0.18)


and (P.psfmag_r-P.psfmag_i) between -0.45 and 0.15
and (P.psfmag_i-P.psfmag_z) between -0.6 and 0.1)

Any help would be appreciated, I'm relatively new to SQL and I'm struggling to find a workaround.

The syntax checks out as being 'ok', but clearly the results aren't.
 
Technology news on Phys.org
The problem is:
1. we do not know what any of those fields are - ie. what they represent
so other than seeing that you don't appear to have mutual exclusion
we can't tell what is going on.
2. what makes "blue"
3. the syntax is correct

I would take an iteratvie approach. Comment out all but the top 2 lines. See what you get for your resultset. Keep adding back a line at a time. Another approach is to take the NOT clause and make it a SQL state (lose the NOT) and see what the resultset is.

If the tabls is gigantic, create a view that is a subset, and run the test queries against the view.
 
jim mcnamara said:
The problem is:
1. we do not know what any of those fields are - ie. what they represent
so other than seeing that you don't appear to have mutual exclusion
we can't tell what is going on.
2. what makes "blue"
3. the syntax is correct

I would take an iteratvie approach. Comment out all but the top 2 lines. See what you get for your resultset. Keep adding back a line at a time. Another approach is to take the NOT clause and make it a SQL state (lose the NOT) and see what the resultset is.

If the tabls is gigantic, create a view that is a subset, and run the test queries against the view.
Thanks for the quick reply, the problem is I now don't understand your questions in able to answer them :smile:. This is the problem with only having used SQL in a limited capacity.

If by fields you mean the psfmag's, these are just the magnitudes of the stars in the SDSS 5 colour bands (ugriz).
The line that defines 'blue' is the first line, " and (P.psfmag_u-P.psfmag_g) < 0.55 ". I'll just knock something up in Photoshop to show you what I mean pictorially.
http://img246.imageshack.us/img246/9741/wdselectionlinests5.jpg This is the cut in u-g vs. g-r space, basically the top three lines. I'm trying to exclude the triangle in the middle, and take everything else, that is also less than u-g = 0.55.
As for taking the NOT clause and making it an SQL state - that's the bit I don't understand, what's an sql state?

Very grateful for the response :D
 
Last edited by a moderator:
My bad - I mean a SQL statement - not a SQL state. I'm thinking that you are unintentionally excluding data. The easiest way to deal with it is to make small changes.

You can also use a construct like
Code:
select stuff from table
minus 
select other_stuff from table ;
which is logically equivalent to the NOT, but allows you to take each sub-query and run it independently to see what you get as a resultset.

Plus, this may seem like a stupid question, but: are you postive that you loaded the table(s) correctly?
 
My first question would be - if you take out the condition that filters out non-blue items, does it select all the items except the ones in the triangle?
 
Thanks for the replies people.

Well, we accidentally stumbled upon the solution, basically through a little educated guess work and trial & error.

After stripping back the NOT clause so that it just dealt with u-g vs. g-r colour space, it worked perfectly.

Going with this, we then separated the NOT clause so that there was a separate clause for each 2-D colour space, and this worked perfectly.

So the final query looked like this:
and (P.psfmag_u-P.psfmag_g) < 0.55
and NOT ((P.psfmag_u-P.psfmag_g) >= (1.27*(P.psfmag_g- P.psfmag_r) + 0.25)
and (P.psfmag_u-P.psfmag_g) <= ((3.9*(P.psfmag_g-P.psfmag_r)) + 1.8)) and NOT ((P.psfmag_g- P.psfmag_r) > (1.6*(P.psfmag_r- P.psfmag_i) - 0.05)
and (P.psfmag_g- P.psfmag_r) < (1.28*(P.psfmag_r- P.psfmag_i) + 0.25)
and (P.psfmag_g- P.psfmag_r) < (-1.45*(P.psfmag_r- P.psfmag_i) + 0.2))

and NOT ((P.psfmag_r- P.psfmag_i) > (3.73*(P.psfmag_i- P.psfmag_z) - 0.24)
and (P.psfmag_r- P.psfmag_i) < (0.725*(P.psfmag_i- P.psfmag_z) + 0.18)
and (P.psfmag_r-P.psfmag_i) between -0.45 and 0.15)

So, this is fine, as it works, but I'll be damned if I can work out why it works, and that's quite unsatisfying in its own way.
 
Dear Peeps I have posted a few questions about programing on this sectio of the PF forum. I want to ask you veterans how you folks learn program in assembly and about computer architecture for the x86 family. In addition to finish learning C, I am also reading the book From bits to Gates to C and Beyond. In the book, it uses the mini LC3 assembly language. I also have books on assembly programming and computer architecture. The few famous ones i have are Computer Organization and...
I have a quick questions. I am going through a book on C programming on my own. Afterwards, I plan to go through something call data structures and algorithms on my own also in C. I also need to learn C++, Matlab and for personal interest Haskell. For the two topic of data structures and algorithms, I understand there are standard ones across all programming languages. After learning it through C, what would be the biggest issue when trying to implement the same data...
Back
Top