HashSet: class Javadoc translation, fail-fast, and source code analysis

If you have not read the HashMap source yet, read that first: http://blog.csdn.net/disiwei1012/article/details/73530598

1. Class Javadoc translation

This class implements the Set interface, backed by a hash table
(actually a HashMap instance).  It makes no guarantees as to the
iteration order of the set; in particular, it does not guarantee that the
order will remain constant over time.  This class permits the null
element.

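The HashSet class implements the Set interface and is backed by a hash table (in practice, a HashMap instance). It makes no guarantee about the iteration order of the set; in particular, it does not guarantee that the order stays the same over time.
HashSet permits the null element.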
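
The two points above can be checked with a small sketch. The class name NullElementDemo and the helper build() are made up for this illustration; only HashSet and Set come from the JDK.

```java
import java.util.HashSet;
import java.util.Set;

public class NullElementDemo {
    // Builds a small set to show that null is accepted like any other element.
    static Set<String> build() {
        Set<String> set = new HashSet<>();
        set.add("b");
        set.add(null);   // permitted: HashSet accepts the null element
        set.add("a");
        set.add(null);   // a second null is a duplicate and is ignored
        return set;
    }

    public static void main(String[] args) {
        Set<String> set = build();
        System.out.println(set.size());          // 3
        System.out.println(set.contains(null));  // true
        // Iteration order is unspecified, so no particular order is claimed here.
        for (String s : set) {
            System.out.println(s);
        }
    }
}
```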

This class offers constant time performance for the basic operations
(add, remove, contains and size),
assuming the hash function disperses the elements properly among the
buckets.  Iterating over this set requires time proportional to the sum of
the HashSet instance's size (the number of elements) plus the
"capacity" of the backing HashMap instance (the number of
buckets).  Thus, it's very important not to set the initial capacity too
high (or the load factor too low) if iteration performance is important.

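Provided the hash function spreads the elements evenly across the buckets, HashSet performs its basic operations (add, remove, contains, size) in constant time.
Iterating over a HashSet takes time proportional to the number of elements plus the number of buckets in the backing HashMap. So if iteration performance matters, do not set the initial capacity too high (or the load factor too low).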
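
As a rough illustration of the capacity advice, the sketch below builds two sets with the same ten elements but very different initial capacities. The names CapacityDemo, fill and sum are hypothetical, and the capacity 1 << 20 is an arbitrary value chosen for the example. The sets are equal, but iterating the oversized one has to walk far more buckets; no timings are shown because they depend on the machine and the JVM.

```java
import java.util.HashSet;
import java.util.Set;

public class CapacityDemo {
    // Fills a set created with the given initial capacity with the integers 0..n-1.
    static Set<Integer> fill(int initialCapacity, int n) {
        Set<Integer> set = new HashSet<>(initialCapacity);
        for (int i = 0; i < n; i++) {
            set.add(i);
        }
        return set;
    }

    // Iterates over the whole set; the cost grows with elements plus buckets.
    static long sum(Set<Integer> set) {
        long total = 0;
        for (int value : set) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        Set<Integer> small = fill(16, 10);
        Set<Integer> oversized = fill(1 << 20, 10);
        System.out.println(small.equals(oversized));  // true: same contents
        System.out.println(sum(small));               // 45
        System.out.println(sum(oversized));           // 45, after scanning many empty buckets
    }
}
```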

Note that this implementation is not synchronized.
If multiple threads access a hash set concurrently, and at least one of
the threads modifies the set, it must be synchronized externally.
This is typically accomplished by synchronizing on some object that
naturally encapsulates the set.
If no such object exists, the set should be "wrapped" using the
Collections.synchronizedSet
method.  This is best done at creation time, to prevent accidental
unsynchronized access to the set:
     Set s = Collections.synchronizedSet(new HashSet(...));

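HashSet is not synchronized. If multiple threads access the set concurrently and at least one of them modifies it, access must be synchronized externally.
The typical approach is to synchronize on an object that naturally encapsulates the set (for example, the object that holds the set as a field).
If no such object exists and the set still needs to be synchronized, wrap it at creation time:

Collections.synchronizedSet(new HashSet(...))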
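
A minimal sketch of the wrapper in use, assuming Java 8 or later for the lambdas. SynchronizedSetDemo and addFromTwoThreads are names invented for this example. Individual calls such as add are synchronized by the wrapper, but the Collections.synchronizedSet documentation requires callers to synchronize on the returned set themselves while iterating.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class SynchronizedSetDemo {
    // Two threads add disjoint ranges to one wrapped set; returns the element count.
    static int addFromTwoThreads(int perThread) {
        Set<Integer> set = Collections.synchronizedSet(new HashSet<Integer>());
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < perThread; i++) set.add(i);
        });
        Thread t2 = new Thread(() -> {
            for (int i = perThread; i < 2 * perThread; i++) set.add(i);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Iteration is not covered by the wrapper: lock on the set manually.
        int count = 0;
        synchronized (set) {
            for (Integer ignored : set) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(addFromTwoThreads(1000));  // 2000
    }
}
```
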
The iterators returned by this class's iterator method are
fail-fast: if the set is modified at any time after the iterator is
created, in any way except through the iterator's own remove
method, the Iterator throws a ConcurrentModificationException.
Thus, in the face of concurrent modification, the iterator fails quickly
and cleanly, rather than risking arbitrary, non-deterministic behavior at
an undetermined time in the future.

Note that the fail-fast behavior of an iterator cannot be guaranteed
as it is, generally speaking, impossible to make any hard guarantees in the
presence of unsynchronized concurrent modification.  Fail-fast iterators
throw ConcurrentModificationException on a best-effort basis.
Therefore, it would be wrong to write a program that depended on this
exception for its correctness: the fail-fast behavior of iterators
should be used only to detect bugs.

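The set's iterator method returns an iterator that is fail-fast: if the set is modified after the iterator is created, by any means other than the iterator's own remove method, the iterator throws ConcurrentModificationException.
So in the face of concurrent modification the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at some later time.
Note that fail-fast behavior cannot be guaranteed: with unsynchronized concurrent modification no hard guarantee is possible, so the iterator throws the exception only on a best-effort basis.
Therefore a program should not depend on ConcurrentModificationException for its correctness. The fail-fast behavior of iterators should be used only to detect bugs.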
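
The one modification the Javadoc allows during iteration is the iterator's own remove method. A small sketch follows; the class name IteratorRemoveDemo and the helper removeViaIterator are made up for the example.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class IteratorRemoveDemo {
    // Removes "a" while iterating, going through the iterator instead of the set.
    static Set<String> removeViaIterator() {
        Set<String> set = new HashSet<>();
        set.add("a");
        set.add("b");
        set.add("c");
        Iterator<String> it = set.iterator();
        while (it.hasNext()) {
            if (it.next().equals("a")) {
                it.remove();  // allowed: the iterator stays consistent with the set
            }
        }
        return set;
    }

    public static void main(String[] args) {
        Set<String> set = removeViaIterator();
        System.out.println(set.size());         // 2
        System.out.println(set.contains("a"));  // false
    }
}
```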

2. A small fail-fast example

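Fail-fast is usually described in terms of threads: while one thread iterates over a container (such as ArrayList or HashMap) with an iterator, another thread structurally modifies it, and the iterating thread throws ConcurrentModificationException.
A structural modification is an operation that changes the container's size, such as remove, add, or clear.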

package com.demo3;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class Student {
    public static void main(String[] args) {
        List<String> stringList = new ArrayList<String>();
        stringList.add("a");
        stringList.add("b");
        stringList.add("c");
        Iterator<String> iterator = stringList.iterator();
        while (iterator.hasNext()) {
            if (iterator.next().equals("a")) {
                // Structural modification through the list, not the iterator:
                // the following call to iterator.next() throws.
                stringList.remove("a");
            }
        }
    }
}

Exception in thread "main" java.util.ConcurrentModificationException
     at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:819)
     at java.util.ArrayList$Itr.next(ArrayList.java:791)
     at com.demo3.Student.main(Student.java:23)

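The example above does not use multiple threads. The mechanism is simple: ArrayList keeps a field (modCount) that counts how many times the list has been modified, and each iterator stores the value that counter had when the iterator was created (expectedModCount).
The iterator's next and remove methods first check whether the stored count still equals the list's current count, and throw ConcurrentModificationException if it does not.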
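
The counter check can be sketched with a toy container. This is not the JDK's ArrayList code: TinyList, removeLast and modificationDetected are hypothetical names, the capacity is fixed at 16 for brevity, and Java 8 or later is assumed so that Iterator.remove does not have to be implemented.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class ModCountSketch {
    /** Hypothetical fixed-capacity list, only for illustrating the counter check. */
    static class TinyList implements Iterable<String> {
        private final String[] items = new String[16];
        private int size;
        private int modCount;  // bumped on every structural modification

        void add(String s) {
            items[size++] = s;
            modCount++;
        }

        void removeLast() {
            items[--size] = null;
            modCount++;
        }

        @Override
        public Iterator<String> iterator() {
            return new Iterator<String>() {
                private int cursor;
                // Snapshot of the list's counter at the time the iterator is created.
                private final int expectedModCount = modCount;

                @Override
                public boolean hasNext() {
                    return cursor != size;
                }

                @Override
                public String next() {
                    if (modCount != expectedModCount) {
                        throw new ConcurrentModificationException();
                    }
                    if (cursor >= size) {
                        throw new NoSuchElementException();
                    }
                    return items[cursor++];
                }
            };
        }
    }

    // Returns true if the iterator notices a modification made behind its back.
    static boolean modificationDetected() {
        TinyList list = new TinyList();
        list.add("a");
        list.add("b");
        list.add("c");
        Iterator<String> it = list.iterator();
        it.next();
        list.removeLast();  // changes modCount, not expectedModCount
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(modificationDetected());  // true
    }
}
```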

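Do not use fail-fast to decide whether some expected outcome has occurred, because whether it triggers is not deterministic.
As for why it is not deterministic, one likely reason is that the order in which threads execute is itself not deterministic.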
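
Even without threads the check can be missed. In the sketch below, which assumes the OpenJDK ArrayList iterator whose hasNext() compares the cursor with the current size, removing the second-to-last element ends the loop before next() gets a chance to run the check. MissedDetectionDemo and removeSecondToLast are names invented for this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class MissedDetectionDemo {
    // Removes "b" through the list while iterating; no exception is thrown.
    static List<String> removeSecondToLast() {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        list.add("c");
        Iterator<String> it = list.iterator();
        while (it.hasNext()) {
            if (it.next().equals("b")) {
                // The size drops to 2 and the cursor is already 2, so hasNext()
                // returns false and next() is never called again.
                list.remove("b");
            }
        }
        return list;  // "c" was never visited by the loop
    }

    public static void main(String[] args) {
        System.out.println(removeSecondToLast());  // [a, c]
    }
}
```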

3. Source code

public class HashSet<E>
    extends AbstractSet<E>
    implements Set<E>, Cloneable, java.io.Serializable
{
    static final long serialVersionUID = -5024744406713321676L;

    // The backing HashMap; every element of the set is stored as a key.
    private transient HashMap<E,Object> map;

    // Dummy value associated with every key in the backing map; shared, hence static final.
    private static final Object PRESENT = new Object();

    public HashSet() {
        map = new HashMap<>();
    }

    public HashSet(Collection<? extends E> c) {
        map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
        addAll(c);
    }

    public HashSet(int initialCapacity, float loadFactor) {
        map = new HashMap<>(initialCapacity, loadFactor);
    }

    public HashSet(int initialCapacity) {
        map = new HashMap<>(initialCapacity);
    }

    // Package-private, used only by LinkedHashSet; dummy just distinguishes the signature.
    HashSet(int initialCapacity, float loadFactor, boolean dummy) {
        map = new LinkedHashMap<>(initialCapacity, loadFactor);
    }

    public Iterator<E> iterator() {
        return map.keySet().iterator();
    }

    public int size() {
        return map.size();
    }

    public boolean isEmpty() {
        return map.isEmpty();
    }

    public boolean contains(Object o) {
        return map.containsKey(o);
    }

    // put returns the previous value, so null means the element was not present before.
    public boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    // remove returns the old value, which is PRESENT only if the element was in the set.
    public boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    public void clear() {
        map.clear();
    }

    public Object clone() {
        try {
            HashSet<E> newSet = (HashSet<E>) super.clone();
            newSet.map = (HashMap<E, Object>) map.clone();
            return newSet;
        } catch (CloneNotSupportedException e) {
            throw new InternalError();
        }
    }
}
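
The return values of add and remove follow directly from the return values of HashMap.put and HashMap.remove. The sketch below replays that logic on a plain HashMap; PresentMarkerDemo and its static add and remove helpers are stand-ins written for this illustration, not JDK code.

```java
import java.util.HashMap;
import java.util.Map;

public class PresentMarkerDemo {
    // Shared dummy value, playing the role of HashSet's PRESENT field.
    private static final Object PRESENT = new Object();

    // Mirrors HashSet.add: put returns null only when the key was absent.
    static boolean add(Map<String, Object> map, String e) {
        return map.put(e, PRESENT) == null;
    }

    // Mirrors HashSet.remove: remove returns PRESENT only when the key existed.
    static boolean remove(Map<String, Object> map, Object o) {
        return map.remove(o) == PRESENT;
    }

    public static void main(String[] args) {
        Map<String, Object> map = new HashMap<>();
        System.out.println(add(map, "a"));     // true: newly added
        System.out.println(add(map, "a"));     // false: already present
        System.out.println(remove(map, "a"));  // true: was present
        System.out.println(remove(map, "a"));  // false: nothing to remove
    }
}
```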