Reviewing the WeakHashMap
The WeakHashMap is a Java collection that uses the WeakReference class to hold its keys. As we've already seen, weak references to an object are cleared by the garbage collector as soon as there are no strong or soft references to the same object. The result is that entries only remain in the WeakHashMap as long as there are references to the map keys lying around your JVM.
Here is an example:
import java.util.WeakHashMap;

public class WeakHashMapSample1 {
    public static void main(String[] args) {
        WeakHashMap<String, String> weakHashMap = new WeakHashMap<String, String>();
        // Create a key for the map, but keep the strong reference.
        // new String("key") makes the key a distinct object from the "key" literal.
        String keyStrongReference = new String("key");
        weakHashMap.put(keyStrongReference, "value");
        // Run the GC and check if the key is still there.
        System.gc();
        System.out.println(weakHashMap.get("key"));
        // Now, null-out the strong reference and try again.
        keyStrongReference = null;
        System.gc();
        System.out.println(weakHashMap.get("key"));
    }
}

The code above prints:
value
null
What happened here? The first time we called System.gc(), there was still a strong reference to the key in the keyStrongReference variable, so the garbage collector did not clear the map key. We then got rid of the strong reference to the key and tried again. This time, the call to weakHashMap.get("key") returned null. Keep in mind that System.gc() is only a request to the JVM, so the exact output can vary between runs and JVMs.
Implementing a Weak Object Pool with a WeakHashMap
If you make heavy use of small immutable classes, like String and the primitive wrapper classes like java.lang.Integer, you can take advantage of the WeakHashMap behavior to share instances and reduce memory usage, while at the same time not having to worry about objects lingering in memory after they are no longer used. The real benefit depends on how many instances you create and how the values are distributed, but you can potentially reduce the number of objects created by several orders of magnitude.
Here is an example of a weak object pool:
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

/**
 * Oversimplified implementation of an object pool.
 * This version is not thread safe.
 */
public class WeakObjectPool {
    // Map where the key is an object, and the value is a weak reference
    // to the same object. We use the key to do the lookup, and the value
    // to actually return the object when it is found.
    private final Map<Object, WeakReference<Object>> map =
            new WeakHashMap<Object, WeakReference<Object>>();

    public Object replace(Object object) {
        WeakReference<Object> reference = map.get(object);
        if (reference != null) {
            Object result = reference.get();
            // Another null check, since the GC may have kicked in between the
            // two lines above.
            if (result != null) {
                return result;
            }
        }
        // If we got here, the map has no live entry for the key, so add it.
        map.put(object, new WeakReference<Object>(object));
        return object;
    }
}

Now, this class can be used like this:
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;

class ObjectPoolClient {
    private static WeakObjectPool objectPool = new WeakObjectPool();

    public static void main(String[] args) throws Exception {
        BufferedReader reader = new BufferedReader(new FileReader("input.csv"));
        List<String[]> parsedLines = new ArrayList<String[]>();
        String line;
        while ((line = reader.readLine()) != null) {
            String[] elements = line.split(",");
            for (int i = 0; i < elements.length; i++) {
                // replace the string read from the file with the pool instance
                elements[i] = (String) objectPool.replace(elements[i]);
            }
            parsedLines.add(elements);
        }
        reader.close();
        // Cool, we saved a lot of memory by reusing the repeated strings!
        doSomethingInteresting(parsedLines);
        // Now, we get rid of the references and soon the garbage collector
        // will reclaim the memory
        parsedLines = null;
        doMoreInterestingStuff();
    }

    // Placeholders standing in for the application's own processing.
    private static void doSomethingInteresting(List<String[]> lines) { }

    private static void doMoreInterestingStuff() { }
}

Assuming the input file contains lots of repeated values, we've been able to save a lot of heap space by not having the same string repeated over and over again in memory. Also, we get the added benefit of releasing the memory when the strings are no longer needed.
Weak Object Pool in conjunction with the Flyweight pattern
The Flyweight pattern is a perfect match for the Weak Object Pool. The assumption behind this pattern is that the flyweight instances are shared to reduce memory consumption. The flyweight factory could use a Weak Object Pool to store the flyweights. This way, once a given flyweight is no longer in use, it will be released from the factory's storage and its memory reclaimed.
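As a rough illustration, here is a minimal sketch of such a factory. The FontStyle and FontStyleFactory names are hypothetical and exist only for this example. The flyweight is assumed to be immutable and to implement equals() and hashCode(), otherwise equal values would never hit the same map entry.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical immutable flyweight used only for this illustration.
final class FontStyle {
    private final String family;
    private final int size;

    FontStyle(String family, int size) {
        this.family = family;
        this.size = size;
    }

    String family() { return family; }
    int size() { return size; }

    // Value-based equality, so that equal styles map to the same pool entry.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof FontStyle)) return false;
        FontStyle that = (FontStyle) other;
        return size == that.size && family.equals(that.family);
    }

    @Override
    public int hashCode() {
        return 31 * family.hashCode() + size;
    }
}

// Hypothetical flyweight factory that holds its flyweights only weakly.
final class FontStyleFactory {
    private final Map<FontStyle, WeakReference<FontStyle>> pool =
            new WeakHashMap<FontStyle, WeakReference<FontStyle>>();

    synchronized FontStyle get(String family, int size) {
        FontStyle candidate = new FontStyle(family, size);
        WeakReference<FontStyle> reference = pool.get(candidate);
        FontStyle shared = (reference == null) ? null : reference.get();
        if (shared != null) {
            return shared; // an equal instance is still in use somewhere, reuse it
        }
        pool.put(candidate, new WeakReference<FontStyle>(candidate));
        return candidate;
    }
}

class FlyweightDemo {
    public static void main(String[] args) {
        FontStyleFactory factory = new FontStyleFactory();
        FontStyle first = factory.get("Serif", 12);
        FontStyle second = factory.get("Serif", 12);
        // 'first' is still strongly reachable here, so the factory returns it again.
        System.out.println(first == second); // true
    }
}
```

Note that the factory still allocates a short-lived candidate object on every call in order to do the lookup. The saving comes from the long-lived instances being shared, not from avoiding the allocation itself.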
Words of caution
Don't go out using Weak Object Pools everywhere you instantiate immutable classes. The pattern only pays off when the values repeat a lot. If the values are mostly distinct, you will be better off without it, because it incurs a small memory overhead for the internal map structures. If you apply the Weak Object Pool pattern in the wrong situation, you may end up with worse performance!
Also, it is very important to use this pattern only to store immutable classes. If you use this pattern for a mutable class, you can find yourself with bugs that are very difficult to reproduce and fix. Those bugs may happen if an instance of an object stored in the Weak Object Pool is shared by two completely unrelated clients, and one client modifies the instance. Then the other client will see the modified value and it will be very difficult to trace the original modification of the object.
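The sharing problem can be reproduced with a small sketch. MutableLabel and MutablePoolPitfall are hypothetical names made up for this example, and the pool logic is a condensed copy of the replace() idea shown earlier.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.Objects;
import java.util.WeakHashMap;

// Hypothetical MUTABLE value class: exactly what should NOT be pooled.
final class MutableLabel {
    String text;

    MutableLabel(String text) { this.text = text; }

    @Override
    public boolean equals(Object other) {
        return other instanceof MutableLabel
                && Objects.equals(text, ((MutableLabel) other).text);
    }

    @Override
    public int hashCode() { return Objects.hashCode(text); }
}

class MutablePoolPitfall {
    private static final Map<MutableLabel, WeakReference<MutableLabel>> POOL =
            new WeakHashMap<MutableLabel, WeakReference<MutableLabel>>();

    static MutableLabel replace(MutableLabel label) {
        WeakReference<MutableLabel> reference = POOL.get(label);
        MutableLabel pooled = (reference == null) ? null : reference.get();
        if (pooled != null) {
            return pooled;
        }
        POOL.put(label, new WeakReference<MutableLabel>(label));
        return label;
    }

    public static void main(String[] args) {
        // Two unrelated clients ask for equal labels and receive the same instance.
        MutableLabel clientA = replace(new MutableLabel("draft"));
        MutableLabel clientB = replace(new MutableLabel("draft"));

        clientA.text = "final";           // client A changes "its" label...
        System.out.println(clientB.text); // ...and client B sees it too: final
    }
}
```

Neither client did anything obviously wrong on its own, which is what makes this kind of bug hard to trace.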
More information:
- Wikipedia entry about the flyweight pattern
- WeakHashMap is not a cache!: My first article about the WeakHashMap class and the most common misconception about its applicability.
- Immutable Classes: an introductory article about immutable classes.


13 comments:
Sorry if this sounds stupid, but what would be the difference, on the CSV example, of having a Weak Object Pool over a HashSet, for instance?
I understand that the key benefit would be if you use all the values from the CSV while processing and, later on, you no longer reference them all, allowing the weak references to be collected.
Am I right or missing something :) ?
Hi Felipe, it surely doesn't sound stupid. In that example, it is true that there wouldn't be any difference between using a HashSet or a WeakObjectPool. I wrote the example to be as simple as possible on purpose in order to improve readability and it turns out I simplified it so much that the benefit wasn't clear.
To really take advantage of the instance pool you would have to make it a singleton so that you can reuse instances between unrelated method invocations. It would be even better to make it thread safe and then you can share the same instance pool among threads.
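A minimal sketch of that suggestion could look like the code below. SharedWeakPool is a hypothetical name, not part of the article's code. It assumes the pooled classes are immutable and only compare equal to instances of their own class (as String and Integer do), because the cast back to T relies on that.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical JVM-wide pool: a single instance, safe to call from any thread.
final class SharedWeakPool {
    private static final SharedWeakPool INSTANCE = new SharedWeakPool();

    private final Map<Object, WeakReference<Object>> map =
            new WeakHashMap<Object, WeakReference<Object>>();

    private SharedWeakPool() { }

    static SharedWeakPool getInstance() {
        return INSTANCE;
    }

    // synchronized makes the lookup-then-insert sequence one atomic step.
    @SuppressWarnings("unchecked")
    synchronized <T> T replace(T object) {
        WeakReference<Object> reference = map.get(object);
        Object pooled = (reference == null) ? null : reference.get();
        if (pooled != null) {
            // Assumption: equal objects have the same class, so the cast is safe.
            return (T) pooled;
        }
        map.put(object, new WeakReference<Object>(object));
        return object;
    }
}

class SharedWeakPoolDemo {
    public static void main(String[] args) {
        String first = SharedWeakPool.getInstance().replace(new String("EUR"));
        String second = SharedWeakPool.getInstance().replace(new String("EUR"));
        // 'first' is still strongly referenced, so the pool hands it out again.
        System.out.println(first == second); // true
    }
}
```

A single lock is the simplest way to get thread safety here. Under heavy contention it may become a bottleneck, in which case a different design would be needed.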
It's very good, but I have a doubt: is the WeakHashMap object itself garbage collected, or does it end up taking more memory? Sorry if I am being foolish.
Nice, I liked the auto cleanup that you earn here. I'm sure I'll use it someday.
Just a small comment about the need to use this for Integer caching: the Integer class already implements a small cache for the frequently used values from -128 to 127. I've written about it here in my blog
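The built-in cache is easy to observe with Integer.valueOf(), whose documentation states that values from -128 to 127 are always cached. The IntegerCacheDemo class name below is just for illustration.

```java
class IntegerCacheDemo {
    public static void main(String[] args) {
        Integer a = Integer.valueOf(100);
        Integer b = Integer.valueOf(100);
        // Both calls return the same cached instance for values in -128..127.
        System.out.println(a == b); // true

        // Outside that range the JVM may or may not cache, so always compare
        // boxed values with equals() rather than ==.
        Integer c = Integer.valueOf(100000);
        Integer d = Integer.valueOf(100000);
        System.out.println(c.equals(d)); // true
    }
}
```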
I know this is an old post, but I think it's worth noting for anyone else that comes across it that the WeakHashMap class uses WeakReference objects internally. By wrapping the key in a WeakReference object yourself you are just adding unnecessary overhead. Each key you set will end up being a WeakReference object pointing to a WeakReference object pointing to the actual key.
The contents of the replace function should be more like:
Object current = map.get(object);
if(current != null)
return current;
map.put(object, object);
return object;
Looks like you misread the code, Scott.
The *values* are being wrapped in a WeakReference, not the keys.
If you just put the object itself in the map with map.put(object, object) then there would be a strong reference to object as the WeakHashMap class only wraps keys in WeakReferences
Hi there. Just one question. Why are you wrapping the objects in a WeakReference? Wouldn't the whole map entry get discarded given the keys are weak?
I ask particularly because on Oracle's website they seem to imply that this is only needed if your values refer to your keys (and hence hold a strong reference to them, and the entry would never be discarded)?
http://docs.oracle.com/javase/6/docs/api/java/util/WeakHashMap.html
Hi KieronW,
We have to wrap the object in a WeakReference because if we didn't, then there would be a strong reference to that object and it would never be garbage collected.
Remember the WeakHashMap uses WeakReferences for the keys, but strong references for the values. The garbage collector decides whether to garbage collect an object based on the strongest type of reference to that object. In this case, every object in the map would have a strong reference pointing at it, so nothing would be garbage collected.
Thanks for the quick answer! I was quite confused by the behaviour, so I had a look in the WeakHashMap class. It looks like it does get rid of map entries - done in the expungeStaleEntries() method - but it seems it only ever does it when there are operations performed on the map (and after keys have been GC'ed). However, if your map is in constant use, it does look like the values are also cleared out, and it uses the ReferenceQueue for this.
I could be wrong here, so I'll have a look at the weekend and see if I can verify it in code.
Hi KieronW,
You can run this simple test to see what I'm talking about:
WeakHashMap<String, String> map = new WeakHashMap<String, String>();
String value1 = new String("value1");
map.put(value1, value1);
System.gc();
System.out.println(map.get("value1"));
value1 = null;
System.gc();
System.out.println(map.get("value1"));
This will print:
value1
value1
Which means the second call to System.gc() did not clear the map even though you cleared the strong reference in the previous line.
Now if you do this:
WeakHashMap<String, WeakReference<String>> map = new WeakHashMap<String, WeakReference<String>>();
String value1 = new String("value1");
map.put(value1, new WeakReference<String>(value1));
System.gc();
System.out.println(map.get("value1").get());
value1 = null;
System.gc();
WeakReference<String> wref = map.get("value1");
String ref = null;
if (wref != null) {
ref = wref.get();
}
System.out.println(ref);
Then you will get:
value1
null
Ah, sorry. I have totally confused the situation by not reading your second article properly.
What I was trying to do is create basically a short-term cache, not an object pool like you
are doing here.
Basically this:
String[] expressionStrs = createExpressions(); // a large number, with many duplicates
for (String expressionStr : expressionStrs) {
Expression expression = expensiveParse(expressionStr);
result.add(expression);
}
So what I was trying to do is use a cheap WeakHashMap[String,Expression] cache that only parses
expressions if it is genuinely different to previous ones. Once the expressionStrs falls out of
scope, and GC runs, my WeakHashMap becomes empty. Then, once I have finished with my expressions,
I get all my memory back.
Cheap (I don't want stuff to stick around like SoftReferences do) and easy to manage.
But of course, totally different to what you are doing here. So I guess this was more appropriate
for your first article - WeakHashMap can be used for a very specific type of cache. :)
Anyway thanks, I found your article very informative.
Hi KieronW,
This is exactly what the WeakHashMap is for!
Yep, and it works nicely. Hence why I got confused when I thought this article was the same as my usecase. I thought I must be missing something, and indeed I was, some glasses ;)