Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

How do you use Java's Hashtable?

  1. Sep 26, 2006 #1

    0rthodontist

    User Avatar
    Science Advisor

    I need a hashtable, but I can't seem to figure out how Java's Hashtable or HashMap classes are supposed to work. The keys in these classes are objects, not numbers. How does that even work? The only thing I can think is that the addresses of the keys (which are objects) are mapped to integer hash keys, which is not what I want at all. The problem is that I have a load of Comparable objects and I want to check to see if I've already used one of them, based on the compareTo method for the objects. What I optimally want is something like this:
    --I associate a number to each of my objects, which are equal iff one object equals() the other. I do not worry about the suitability of such numbers for hash keys--in particular I don't want to worry about the range of keys.
    --something in the Java library produces good hash keys based on these numbers (not on the addresses of Integers containing the numbers) and uses those keys to index my objects in a hash table

    I would like to just feed the hash table Integers containing the essential information about the objects and have it do the rest for me, but since the hash table accepts general objects, I'm worried that it won't index the Integers by their contents.
     
  2. jcsd
  3. Sep 26, 2006 #2

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    The java API documentation is actually very good -- you should get a copy.

    All objects have a hashCode method -- that's how it obtains a hash value for your object. If you don't like the default implementation (which does hash the address, as you guessed), then provide your own implementation!
     
  4. Sep 26, 2006 #3
    Hashtables in java are more generic than what one normally thinks of hashtables as. So instead of key being just numbers, you can use any object. Btw, integers are also objects so you can use them normally as you would (though you would need the "Integer" wrapper class instead of the normal "int").

    -- AI

    P.S -> hash.put(new Integer(5), new Blah());
     
    Last edited: Sep 26, 2006
  5. Sep 26, 2006 #4

    0rthodontist

    User Avatar
    Science Advisor

    I have been reading the API. Ok, so is this how it works?
    My object -> Integer "key"
    Integer -> hashCode int (equal to the value stored in the Integer)
    hashCode -> index in Hashtable
     
    Last edited: Sep 26, 2006
  6. Sep 26, 2006 #5

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    Right; Integer.hashCode() is based on the value contained in the integer, and not the address.
     
  7. Sep 26, 2006 #6

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    Oh, one more thing...

    Anytime you override the equals, you're really supposed to also override hashCode to agree with it -- that is, if two things are equal, then they have equal hash codes. Most (all?) classes in the standard packages should satisfy that.
     
  8. Sep 26, 2006 #7

    0rthodontist

    User Avatar
    Science Advisor

    OK--but I won't be using my objects directly as keys so I don't need to worry about that. The one thing though is that the hashCode of the Integer will then be scrambled and scaled so as to be suitable for an actual hash key, right?
     
  9. Sep 26, 2006 #8

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    Right. I think you can look up the actual function it uses in the documentation. (what I presume you mean by "Scaling" is done in the hash table itself)
     
  10. Sep 26, 2006 #9
    There are fast MD5 implementations available. Combine that with toString function to MyObject and i think you can get something what you are looking for.

    -- AI
     
  11. Sep 26, 2006 #10

    0rthodontist

    User Avatar
    Science Advisor

    Actually, if this is all true, then probably the simplest thing is to implement hashCode for my objects and then use them as keys for themselves. Thanks
     
    Last edited: Sep 26, 2006
  12. Sep 29, 2006 #11

    0rthodontist

    User Avatar
    Science Advisor

    It doesn't seem to be working. I have a class CubeState, and defined the hashCode for that as hashCode(this.toString()); Then I do
    Code (Text):
    HashMap<CubeState, CubeState> hm = new HashMap<CubeState, CubeState>(100000);
    ...
    if(!hm.containsValue(ctmp))
    {
        ...
        hm.put(ctmp, ctmp);
    }
     
    (ellipses indicate missing code, and ctmp is a CubeState)
    The problem is that this is incredibly slow. I have two versions of the program, one that's like above and another that is exactly the same except I've commented out those three uses of hm and the curly braces. The one that does not use the hash map takes a few milliseconds to do what the one that does use the hash map takes a few minutes to do. Each time I do that thing, it requires less than 10000 iterations of the code that checks for membership in the hash map. In theory the hash map should make it more efficient; in practice it's ridiculously slow. I've tried using a faster-calculated hashCode but that doesn't change anything.
     
    Last edited: Sep 29, 2006
  13. Sep 29, 2006 #12

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    (1) I think it's generally better to let a hash table start off at its default size, rather than initializing it to something large. (though I suppose you can try it both ways)

    (2) I think you want HashSet, not HashMap.

    (3) Your description of hashCode makes no sense -- did you mean to say that you did something like

    class CubeState {
    // ...
    public int hashCode() {
    return toString().hashCode();
    }
    // ...
    }

    ? (And have presumably overridden the toString method?)

    (4) Judging from the timing behavior... is all of this code inside of a function that you call 10000 times? If so, then it's a total waste -- each time you call the function, it will create a new hash map, do all the work, store the result in the hash map, and then promptly discard the hash map! The hash map should probably be a static data member of your class.
     
    Last edited: Sep 29, 2006
  14. Sep 29, 2006 #13

    0rthodontist

    User Avatar
    Science Advisor

    I did try it at its default size of 16, but since I'm going to be storing a few hundred thousand things in it (or would if it were running fast enough to get them there!) I figured that increasing the size right off might save time wasted in increasing the table size dynamically.
    I guess I could have used that. The API says HashSet uses an instance of HashMap anyway.

    Yes. (Actually I happened to call it cubeString instead of overriding toString but same idea) I needed a string representation of certain essential features of the CubeState anyway, and the string happened to contain exactly the same features that I wanted to use to differentiate between different CubeStates. So it was convenient.
    No, the lines except for the first are within a loop, and the first line is outside the loop. The function this stuff is in is called once.
     
    Last edited: Sep 29, 2006
  15. Sep 29, 2006 #14

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    Then I don't think I could guess what's going on unless I saw the code. (Since I doubt your cubeString() method is -that- inefficient...)

    You didn't accidentally comment out the part of the program that was doing the work, did you? :smile: I've done that before. :frown:
     
  16. Sep 29, 2006 #15

    0rthodontist

    User Avatar
    Science Advisor

    Well, thanks--when I replace the HashMap with a HashSet it's fast again. I still don't know how to use the HashMap though.
     
  17. Oct 9, 2006 #16
    HashMap and HashSet are two different things in terms of Data Structures.

    A Set is similar to that of a Mathematical Set namely you have a list of objects but there are no duplicates and the order in which the objects were inserted is generally not perserved.

    On the other hand a Map could be consider as a "set of mappings between keys and values". Namely a key has an associated object.

    Sorry for a fruit, toaster, color example but as Map goes this came to my mind. Consider that you wan't to have stored a set of Persons and their favorite colors, one way to do that is ofcorse create a Person class with a field favorite Color but you can might aswell create a Map that is you Map a Key (Person) with a specific Value (e.g. Color). Put into code

    notice in documentation HashMap<K,V> - > Key, Value pair

    HashMap<String,Color> hm = new HashMap<String,Color>();
    hm.put("John, Doe", Color.WHITE);
    hm.put("George W Bush", Color.RED);

    There you can access the color as such

    Color color = hm.get("John, Doe");

    notice this is a Hash Map, Hash just means that we will use HashCode as a means of organizing and accessing the mapping, you can read more about that in almost every tutorial on Data Structures.

    Now a Set is a different story, you have no Key-Value pairs you just have a "list" that has no duplicates and the order doesn't matter, now you would ask why use HashSet and not ArrayList? Well generally speaking HashSet is more performant that a plain ArrayList due to the internal organisation of data.

    I would recommend you a book called BigJava if you can find it in your local library(other-wize it is too expensive) it is worft it to read it, there is also a superb chapter on Collections in the Core Java II book, here
    http://www.phptr.com/articles/article.asp?p=368648&rl=1 but might a be a bit to heavy for the first read.
     
  18. Oct 14, 2006 #17

    0rthodontist

    User Avatar
    Science Advisor

    Incidentally, I looked up a Java HashSet implementation. It seems that a difference between the way I was using HashMap and the way HashSet uses HashMap, is that I was doing a containsValue() check and HashSet does a containsKey() check on its internal HashMap. This may account for the performance difference, though it doesn't make total sense to me why it should.
     
  19. Oct 14, 2006 #18

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    Ye gads, that would do it. "containsValue" has to do a brute force search through the entire HashMap until it finds the desired value or reaches the end. And since you initialized it to have a very large capacity...
     
    Last edited: Oct 14, 2006
  20. Oct 14, 2006 #19

    0rthodontist

    User Avatar
    Science Advisor

    Yeah, I guess. It's kind of annoying that they have a containsValue() method in the first place when there's no efficient implementation behind it! It should be marked "deprecated."
     
    Last edited: Oct 14, 2006
  21. Oct 14, 2006 #20

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    It makes sense from both a consistency and convenience viewpoint: every other container class (even ArrayList, for example) implements a contains(Object) method. And searching for a value is something one would sometimes want to do. Map.values().contains(Object) is awkward syntax, and probably execlutes more slowly for most Map types. Some Map classes might even have efficient value lookups, and it would be a shame to force the user to break the container abstraction to access such a useful method.
     
Know someone interested in this topic? Share this thread via Reddit, Google+, Twitter, or Facebook

Have something to add?