classpath
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [really patch] Re: HashMap putAll/putAllInternal bug


From: Bryce McKinlay
Subject: Re: [really patch] Re: HashMap putAll/putAllInternal bug
Date: Wed, 1 Oct 2003 11:27:29 +1200

On Tuesday, Sep 30, 2003, at 02:57 Pacific/Auckland, Stuart Ballard wrote:

Furthermore, I'd argue that the very existence of a hasNext() method suggests that Sun didn't intend people to make this "optimization". If they expected that calling size() and maintaining your own counter would always be more efficient, why didn't they just leave hasNext() out and recommend coding that way?

"while (list.hasNext())" is there because its a nicer style for many situations, it makes your code simpler and less error-prone compared to using a for loop based on size. However, its existence doesn't mean that size() shouldn't work! IMO you should implement size() correctly in your code, it is not specified how putAll() etc work and there is no guarantee that Sun won't change their implementation in a future release of the JRE!

Note too that our current implementations leave the collections in an invalid state if next() causes a ConcurrentModificationException (or other RuntimeException) part way through, because they set the size variable in advance without waiting to ensure that all those elements could in fact be added. It would be possible to fix this problem without using hasNext(), and even if my patch isn't accepted I still think we should at least fix that.

Yeah, those should probably be cleaned up. In general it should be assumed that everything is invalid if you get ConcurrentModificationException, but its good to be defensive.

Of course, if there are real applications out there that rely on the way Sun implements it, then we may have to change to using hasNext(). But we should consider this carefully. If we must change it, then the addAll/putAll implementations should change throughout the collections classes for consistency - not just HashMap/Hashtable. As you noticed, in some cases this could make things significantly less efficient.

I think it's perfectly possible to implement all our collections to give the correct behavior without making things less efficient in the case where size() is correct - at least, the only difference should be the cost of repeated calls to hasNext() versus a single call to size(), and the cost of hasNext() should *always* be small enough that that difference falls into the realm of micro-optimization.

The optimization in using size comes simply from the reduction in method calls. Its true that some very smart JITs may be able to compile special copies of the method with the virtual calls to the iterator inlined where they see, for example, many putAll calls with iterator arguments of a certain class, but in general, if you add an extra method call into a loop, its going to make the code slower. But, you're quite right, we are talking about micro-optimization here.

The example I gave was ArrayList: the optimization is to pre-allocate space for size() worth of elements using ensureCapacity(), and then copy elements directly into the backing array. It would be perfectly possible to implement it something like this, which optimizes for when size() is correct but is robust if it's not:

I don't think is quite as simple as that. Consider the case of an addAll() with an index < size. The current implementation moves existing elements out of the way based on size(). If size() doesn't equal the actual number of elements returned, your going to have to make sure not to overrun existing elements and/or do fixups afterwards. Does Sun's implementation of addAll() really not call size()?

However, having said all this, the argument for compatibility with Sun's implementation is somewhat compelling (even though only poorly written applications would suffer from this problem, IMO!) . I don't really object to this change, at least if we can come up with good solutions for things like the addAll() problem.

Regards

Bryce.






reply via email to

[Prev in Thread] Current Thread [Next in Thread]