Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

Infinite Databases?

  1. Aug 25, 2015 #1

    WWGD

    User Avatar
    Science Advisor
    Gold Member

    Hi, I was going over a paper on infinite databases when I dropped by PF, and realized that the paper was hosted by Springer, with a reputation for being somewhat loose in its rigor. Does anyone know if this is a legitimate area of research, and, if so, some intro papers on it ?
    While we're at it, how about the area of Rough Relational databases : http://www.wseas.us/e-library/conferences/2012/CambridgeUK/AIKED/AIKED-37.pdf
    Thanks.
     
    Last edited: Aug 25, 2015
  2. jcsd
  3. Aug 25, 2015 #2
  4. Aug 25, 2015 #3

    WWGD

    User Avatar
    Science Advisor
    Gold Member

    Thanks, Dr. Courtney, just curious: do you know if Google uses infinite databases ? Maybe the use OODBS's (Object-Oriented) ?

    EDIT: I think my question is poorly -posed since I doubt Google searches are done over databases, but instead over websites. I doubt that the collection of all websites can be made into a database and then do a search over it. And then, of course, the collection of websites, or even the collection of all strings in the web is obviously finite.

    I case anyone is interested, here is something I think is related:

    http://www.google.com/support/enter...doc_set/admin_crawl/introduction.html#1076111
     
    Last edited: Aug 25, 2015
  5. Aug 28, 2015 #4

    WWGD

    User Avatar
    Science Advisor
    Gold Member

    As usual almost impossible to find free articles.
     
  6. Aug 31, 2015 #5
    I've been watching this thread hoping to really pick up on what an infinite database is, but I'm still not totally sure.

    Bringing what I know to the potluck, Google is renowned for its a BigTable technology, a really simple Map that can scale easily into the Petabyte range, surely meant to run on distributed systems. (See Apache's HBase for Hadoop integration). https://en.wikipedia.org/wiki/BigTable

    When I've done search engine type stuff (at a much lower level) for academic projects, it was a datastructure based on word occurences mapped to the word, and an intelligent system of joins to answer joint searches for many words. Basically variations of the "fulltext" index from database systems. Its kinda interesting but not that complicated.

    Most databases don't have problems with potentially infinite records. Though a certain key may be a finite datatype, it can be replaced with something bigger as the database grows. I believe Oracle supports number sizes up to 10^125, which is more than the atoms in the known universe.

    If you're not talking about that but a database that has an infinite amount of records in place and they are queryable, that's a different thing. I mean like a db table representing the integers, that returns a program to infinitely enumerate them when queried, or takes a query like (SELECT value FROM integers WHERE value%2=0) and returns a new iterator to enumerate all the infinite even numbers... I know they looked at this stuff early in computer science, and I think it inspired some of the work with the lambda calculus, where you have functions operating on functions... But its a tall order, and clearly takes a re imagining of databases, to include things like columns that can't be queried to get other columns, (irreversible functions) and so on. To my knowledge that isn't in use anywhere at this point, but it would be cool to see more research in it.
     
  7. Aug 31, 2015 #6

    WWGD

    User Avatar
    Science Advisor
    Gold Member

    I wonder, if it were the second type you cited, of an infinite amount of records/variables, if one could adapt "finite-data" techniques like PCA or Factor Analysis to them. I wonder if it would be too far-off to think of Functional -Analytic-related techniques, dealing with infinite-dimensional operators.
     
  8. Aug 31, 2015 #7
    I'm sure you could do a lot of it. In my brief look into the philosophy of math, I saw that most great mathematicians tended to be philosophical mathematical realists, while most computer science actually seemed more in line with the constructivist philosophy. The former lets you do things like reference a number that satisfies an equation without knowing it, as if it had an innate reality all its own in its unknown state, while in computer science you more often have the latter: a thing is not treated as "real" until its constructed in some way, manifest in memory. So when one gets into this idea of infinite databases, one is talking about making computer science more philosophically in line with the most popular and powerful mathematical philosophy, instead of the somewhat unpopular and dysfunctional one it most closely mirrors now. Just about any math you wanted to do would be more intuitive in such a system.

    But getting there may not be easy. One thing I've learned is that whenever we're doing something in a goofy way, there's usually a compelling reason why, often in terms of it being a lot easier...
     
  9. Sep 8, 2015 #8

    WWGD

    User Avatar
    Science Advisor
    Gold Member

    Hi again, Fooality, all:

    This is, I think, a way of thinking of Infinite Relational(for now, the ones I am most familiar with) Databases, RDBs. Please comment:

    RDB theory is built on 1st order logic , aka, Predicate Logic. A successful query as output is a model , and the collection of attributes is put together with some connectives into wffs , I think usually the standard set-theoretical connectives : And/Intersection , Or/Union are the main ones. The attributes are predicates that are mapped into truths if the value is within the allowed constraints.

    Example: If we want to know, e.g., customers living in Texas with credit lines in the range $5,000 to $10,000 , then the results (if any) of our query are a model for

    Lives in State ##L_x## , has Credit , say ##C_x ## , and the domain constraints are : States are any of 50 US states, credit line is less than , say $100,000 and larger than $0.

    The query results are a model for ##L_x ## And ## C_x ## . We may form more complicated sentences using different connectives.

    An infinite database would then be, I think, a model for a 1st order system with infinitely-many predicates. I am thinking we may explore the case of Baseball , for which there is good data. A query may be: all hitters with ...... there are infinitely-many attributes: handedness, average with hitters on 1st and third and two outs , etc. we can refine these conditions as we want.

    Does this seem correct, and, if so, helpful to our understanding of infinite databases?
     
  10. Sep 8, 2015 #9
    I think we have a shared intuitive understanding of what we are talking about, but I haven't found a formal definition for an infinite database. If there is one I should know, tell me. Otherwise I''m winging it.

    I think good example might be an infinite table containing the productions of a context free grammar. A simple example of that might be mathematical expressions, governed by simple grammar rules, which allow mathematical terms like "((8*54)+32)*(4+3+2+1)" (productions are infinite, but those not obeying grammar rules are excluded from set)

    So an infinite table of this form would need to respond to queries. The simplest is to decide if a given expression is in the infinite table, (which in case of a context free grammar it can do) and another is to enumerate the productions, which it can also do. This enumerator for the table must also be modifiable by conditions. For instance "select all expressions containing (2+2)" would be an enumerator for the endless mathematical expressions containing that one term.

    I think it would be a cool programming construct, but it takes a new set of restraints that don't exist in normal databases: Some queries are not computable, decidable, and some are one way: For instance, the infinite table that contains the products of all pairs of primes isn't reversible in reasonable time: You can quickly get the product of two primes, but not the two primes given the product . But I don't think these restraints would hurt it too much. In fact, I could see one way tables being a good security construct... Users should be able to get customer info based on SSN, but getting SSN based on customer info is usually nefarious.
     
  11. Sep 8, 2015 #10

    WWGD

    User Avatar
    Science Advisor
    Gold Member

    This http://link.springer.com/chapter/10.1007/978-3-540-75987-4_3#page-1 is short, but reasonably good , this http://www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-get.cgi/2008/MSC/MSC-2008-15.pdf, is longer, and, after many days, only reasonable finds . I emailed one of the authors asking for background refs , hope s/he will get back to me. Maybe enough to get something of a foot in the door for now. Sorry I have not been able to find anything better for free. Wish that every single #$%^ did not feel like throwing around terms like "infinite/infinity" around as a game, to make searches more productive.

    EDIT4: A sample/example of infinite databases :"..... In the course of implementing a database system for querying Java software, we found that the universe of Java code can be effectively modeled as an infinite database...."
     
    Last edited: Sep 8, 2015
  12. Dec 13, 2016 #11
    Sounds like the tool I use for analyzing security data called Splunk, which approaches machine data as "schema-on-read". This allows you to weave the raw data together any way you can think of. For security analytics, I mostly use the tool to identify unique/uncommon/common and new/infrequent/regular event data, and data frequency pattern recognition (think a single zombie system sending an infrequent heartbeat signal out among billions of packets).
    Here's some info on how the data storage and retrieval works: http://docs.splunk.com/Splexicon:Tsidxfile
     
Know someone interested in this topic? Share this thread via Reddit, Google+, Twitter, or Facebook

Have something to add?
Draft saved Draft deleted



Similar Discussions: Infinite Databases?
  1. Distributed database (Replies: 6)

Loading...