Homework Help: Is Google Search a database management system?

  1. May 5, 2012 #1
    I wasn't sure where else to post this, so I posted it here:

    I got into an argument with my friend about whether Google Search is a DBMS. I disagreed because it doesn't fit the definition we were given, or the one on Wikipedia and a few other websites. As far as I know, all Google Search does is query a database and return the requested results. He, on the other hand, says that it manages data and gives the user what the user is looking for. I replied that by his logic the find function in a word processor is also a DBMS, because it does almost the same thing as Google Search, and he said yes, it is. He also said that Facebook is a DBMS.

    To settle this argument we decided to post this question here.

    Thanks in advance
  3. Jul 18, 2012 #2
    Nice question. If you ever find the answer, please let me know.

    I guess it depends on how you look at it. If you look at Google as a DBMS, then what is the database? I suppose the database is formed by all the pages written by us and spread over countless servers all over the world. On its own servers Google only keeps indexes that make it easy and fast for us to find something again. Google even caches lots of pages; in my line of work as an Oracle DBA, I would say that Google keeps a materialized view of the many things it finds on the web, although Google of course does not use Oracle.

    There is one important difference between Google and other database management systems: Oracle, SQL Server, MySQL and the rest control access to the data and govern changes to it, while you don't need Google's approval to change your website or to access any other website, at least not yet :-) And when you change your data/website, the change is not directly reflected in Google's search engine either; it shows up only when you are lucky, and then with a considerable delay. Those are huge differences from other database management systems, and they make me think that if Google is a DBMS, it is a very crappy one. It is an excellent indexing system, though.

    Kind regards,
    Paul Karman
  4. Jul 18, 2012 #3


    Staff: Mentor

    Google Search uses a database to store keywords and the web pages where they were used. Search stats are also recorded in the database to drive future searches with similar keywords. Since Google is a business, it maintains these stats in order to set the rates for companies that want to tie their website to keywords selected by the end user.
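
    The "keywords and the pages where they were used" structure is usually called an inverted index. Below is a minimal sketch of the idea, not Google's actual implementation; the page URLs and texts and the function names `build_index` and `search` are invented for the example.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase keyword to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every keyword in the query, sorted."""
    words = query.lower().split()
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)


# Hypothetical pages, invented for this example.
PAGES = {
    "a.example/db": "a database stores organized data",
    "b.example/search": "a search engine builds an index of web data",
    "c.example/cats": "cats sleep all day",
}

INDEX = build_index(PAGES)
print(search(INDEX, "data"))
print(search(INDEX, "index data"))
```

    Note that the query never touches the page texts themselves, only the index built from them, which is the distinction the earlier posts are arguing about.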

    For Facebook, a database is used to store your profile information.

  5. Jul 18, 2012 #4


    Staff: Mentor

    With respect to Google and other people's websites, Google runs a spider program that crawls through web pages to see whether they have changed, recording the changes in its cache and updating its keyword index. I'm not sure how often it runs, but I'm pretty sure it scans the more heavily accessed sites more often and follows any URL references in those pages to scan further and further into the web's structure.
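
    The crawl-and-compare loop described here can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the "web" is a hypothetical in-memory dictionary rather than real HTTP requests, every page is visited on every pass (no per-site scheduling), and the names `WEB` and `crawl` are made up for the example.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
WEB = {
    "home": ("welcome", ["news", "about"]),
    "news": ("old headline", ["home", "archive"]),
    "about": ("about us", []),
    "archive": ("old stories", ["news"]),
}


def crawl(web, start, cache):
    """Breadth-first crawl from start; refresh the cache and return changed URLs."""
    changed = []
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url not in web:  # dead link: nothing to fetch
            continue
        text, links = web[url]
        if cache.get(url) != text:  # new page, or content differs from the cached copy
            cache[url] = text
            changed.append(url)
        for link in links:  # follow URL references deeper into the site
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return changed


cache = {}
first = crawl(WEB, "home", cache)   # first pass: every reachable page is new
WEB["news"] = ("new headline", ["home", "archive"])  # a site owner edits a page
second = crawl(WEB, "home", cache)  # second pass: only the edited page is reported
print(first)
print(second)
```

    The edit to "news" only reaches the cache when the next crawl runs, which is the delay mentioned earlier in the thread.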
  6. Jul 19, 2012 #5
    For Facebook I would agree that it has a database and a database management system. Every message, story and picture is stored on Facebook’s own servers and what you can see or enter in the database is governed by some logic that starts with checking your credentials.

    For Google, however, it is a different matter, depending on how you look at it. If you see Google as an independent system into which information somehow gets entered and can then be queried, and into which you can log in to find out even more, then yes, it could be considered a database.

    But it seems that Google's main reason for existence is collecting data about data, which is called metadata. And although metadata is also data, it is a bit harder to recognize it as a database; it simply would have no reason to exist if web pages did not exist. And when there is no database, there is no need for a database management system?!

    Of course you can use the same arguments for Facebook where the information is within the person and Facebook just indexes information that is in the user. (Are brain bots the next step?) Anyway, this argument would draw Facebook towards an indexing service like Google where the data is stored in biological organisms. I think I draw the line between man and machine and agree that Facebook contains the data.

    You might want to go the other way and decide that any information that is not actual real life is in fact metadata so any database is in principle a meta database that can use a database management system. So then I would agree that meta databases are databases too and hence so is Google.

    Facebook could be an indexing service, Google could be a database, everybody is right.

    Paul Karman
  7. Jul 19, 2012 #6


    Staff: Mentor

    Yes, metadata is data about data, and because it's data it needs to be stored and be searchable itself, so it's placed in a database.

    A database is simply an organized way of storing data, with an API to access it. Some of the earliest database systems were ISAM databases, in which data was organized into records and records were inserted into pages. Each record had a key for simple access. See http://en.wikipedia.org/wiki/Indexed_Sequential_Access_Method.
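
    To make "each record had a key for simple access" concrete, here is a toy keyed file in the spirit of ISAM. It is a sketch only: real ISAM files use on-disk pages and a multi-level index, whereas this version keeps everything in memory with one sorted key list, and the class name `KeyedFile` and its methods are invented for the example.

```python
import bisect


class KeyedFile:
    """Records kept in key order, with a sorted key list serving as the index."""

    def __init__(self):
        self._keys = []     # sorted keys (the index)
        self._records = []  # records, stored in the same order as the keys

    def insert(self, key, record):
        """Insert a record, or overwrite the record stored under an existing key."""
        pos = bisect.bisect_left(self._keys, key)
        if pos < len(self._keys) and self._keys[pos] == key:
            self._records[pos] = record
        else:
            self._keys.insert(pos, key)
            self._records.insert(pos, record)

    def get(self, key):
        """Indexed access: binary-search the key list; return None if absent."""
        pos = bisect.bisect_left(self._keys, key)
        if pos < len(self._keys) and self._keys[pos] == key:
            return self._records[pos]
        return None

    def scan(self, low, high):
        """Sequential access: (key, record) pairs with low <= key <= high."""
        start = bisect.bisect_left(self._keys, low)
        end = bisect.bisect_right(self._keys, high)
        return list(zip(self._keys[start:end], self._records[start:end]))


customers = KeyedFile()
customers.insert(30, "Carol")
customers.insert(10, "Alice")
customers.insert(20, "Bob")
print(customers.get(20))
print(customers.scan(10, 20))
```

    The two access paths, lookup by key and an in-order scan over a key range, are the "indexed" and "sequential" halves of the name.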

    From the same era came IDS (Integrated Data Store) databases, originally conceived by Charles Bachman. These databases used linked lists of data with cross links to other rings. Programmers had to be familiar with how the data was organized in order to traverse it and retrieve information. It was very fast, but saving a database to tape storage and reloading it for load balancing required a specialized program to traverse your database.

    Later, IBM developed relational databases and the SQL language. Here data was organized into tables with associated index tables of keys. This organization wasn't as fast as IDS, but it was easier to back up and reload as needed, using a program provided by the vendor. In a sense it was a cross between ISAM and IDS database technology, and much easier for non-programmers to use.

    Google uses a programming model called MapReduce to process its data in parallel across many machines, which in effect gives it a form of massive distributed data store. Their 2008 stats were described in the article below so you can imagine where they are now:

    Last edited: Jul 19, 2012
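
    For readers who have not met MapReduce, the idea mentioned in the last post can be sketched on a single machine as below. This is a toy word count under stated assumptions, not Google's implementation: a real system runs the map and reduce steps in parallel on many servers, and the function names and sample documents here are invented for the example.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(docs):
    """Run map, shuffle and reduce in sequence over a dict of documents."""
    pairs = []
    for doc_id, text in docs.items():
        pairs.extend(map_phase(doc_id, text))
    groups = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


# Hypothetical documents, invented for this example.
DOCS = {"d1": "data about data", "d2": "metadata is data"}
COUNTS = word_count(DOCS)
print(COUNTS)
```

    Because each map call sees only one document and each reduce call sees only one key, the work can be split across machines without the pieces needing to talk to each other.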