JDK Collections Source Code: ArrayList

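We use the collections that the JDK provides all the time, such as ArrayList, LinkedList and HashMap, and we have all learned how they differ: ArrayList is fast for lookups and slow for insertion and removal, while LinkedList is the opposite. Without reading the underlying source code, though, it is hard to really understand why. ArrayList is backed by an array, which occupies a contiguous block of memory, so access by index is fast; inserting or removing an element, however, means copying data. LinkedList is backed by a linked list (a doubly linked list, to be precise); once the position is known, inserting or removing only rewires a few pointers, which is cheap, but finding an element requires traversing the list. The two structures therefore suit different situations.

The iterator design in the JDK collections package is particularly elegant: it hides the differences between the underlying data structures and gives application code one uniform way to traverse elements. Below we walk through the JDK source to see how ArrayList is implemented.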

ArrayList extends AbstractList. AbstractList mainly defines the common operations for adding, removing, updating and querying elements; note that it also supplies default iterator implementations, both one-directional and bidirectional.

private transient Object[] elementData;

The elementData array above is the container that stores our elements; it is an array of Object.

Let's start with the constructors:

/**
 * Constructs an empty list with the specified initial capacity.
 *
 * @param   initialCapacity   the initial capacity of the list
 * @exception IllegalArgumentException if the specified initial capacity
 *            is negative
 */
public ArrayList(int initialCapacity) {
    super();
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
    this.elementData = new Object[initialCapacity];
}

/**
 * Constructs an empty list with an initial capacity of ten.
 */
public ArrayList() {
    this(10);
}

/**
 * Constructs a list containing the elements of the specified
 * collection, in the order they are returned by the collection's
 * iterator.
 *
 * @param c the collection whose elements are to be placed into this list
 * @throws NullPointerException if the specified collection is null
 */
public ArrayList(Collection<? extends E> c) {
    elementData = c.toArray();
    size = elementData.length;
    // c.toArray might (incorrectly) not return Object[] (see 6260652)
    if (elementData.getClass() != Object[].class)
        elementData = Arrays.copyOf(elementData, size, Object[].class);
}

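The most commonly used one is the second, the no-argument constructor. In the version of the source quoted here it simply calls the first constructor with 10. That 10 is the capacity of the backing array, not the number of elements actually stored (size), so the result is an empty list backed by an array of capacity 10. These two constructors are straightforward. The one worth a closer look is the last, which builds a new ArrayList from the collection passed in. It contains this check: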

elementData.getClass() != Object[].class

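This check works around a JDK bug, number 6260652. In affected JDK versions, calling toArray() on the List returned by Arrays.asList does not necessarily return an Object[]; it can return an array of the concrete element type. For example, Arrays.asList("a").toArray() returns a String[]. If an ArrayList were built from that list and kept the String[] as its backing array, adding a value of another type, such as an Integer, would throw ArrayStoreException. So the constructor copies the elements into an array whose runtime type really is Object[].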
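The effect of the runtime array type can be reproduced without depending on how any particular JDK implements Arrays.asList. The following is a minimal sketch (ArrayStoreSketch and its methods are made-up names for illustration): it stores an Integer through an Object[] reference that actually points at a String[], then repeats the store after the same Arrays.copyOf call the constructor uses.

```java
import java.util.Arrays;

// Illustrative sketch; ArrayStoreSketch and its methods are hypothetical names.
class ArrayStoreSketch {

    // Returns true if storing an Integer into the given array throws ArrayStoreException.
    static boolean storeFails(Object[] array) {
        try {
            array[0] = Integer.valueOf(1);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // Same call as in the ArrayList(Collection) constructor: copy into a real Object[].
    static Object[] toObjectArray(Object[] array) {
        return Arrays.copyOf(array, array.length, Object[].class);
    }

    public static void main(String[] args) {
        Object[] strings = new String[] {"a"};  // static type Object[], runtime type String[]
        System.out.println(storeFails(strings));                 // the store is rejected
        System.out.println(storeFails(toObjectArray(strings)));  // the store is accepted
    }
}
```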

Next, the add method:

public boolean add(E e) {
    ensureCapacity(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

ensureCapacity is a key method. Its first job is to make sure the array still has room for the new element; it also increments a very important variable:

public void ensureCapacity(int minCapacity) {
    modCount++;
    int oldCapacity = elementData.length;
    if (minCapacity > oldCapacity) {
        Object oldData[] = elementData;
        int newCapacity = (oldCapacity * 3)/2 + 1;
        if (newCapacity < minCapacity)
            newCapacity = minCapacity;
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }
}

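modCount is a field inherited from the parent class. It is incremented each time the structure of the collection changes, so it records how many structural modifications have been made. It is only consulted during iteration, where it serves to detect modifications that happen while iterating, whether from another thread or from code that changes the structure through methods other than the iterator's own (add, remove and so on). The mechanism is simple: the iterator captures modCount before iterating and, on every step, checks whether the value has changed. Iterators are covered later.

Back to ensureCapacity: if the current capacity is insufficient, the array is grown. In this version of the source the new capacity is (oldCapacity * 3)/2 + 1, roughly 1.5 times the old capacity plus one, or minCapacity if that is still too small, and the elements are copied into the new array with Arrays.copyOf. add then stores the new element at index size and increments size.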
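To make the growth rule concrete, here is a small sketch that applies the formula from the ensureCapacity excerpt repeatedly, starting from the default capacity of 10. GrowthSketch and newCapacity are illustrative names, not JDK API, and the method only models the branch where growth is actually needed.

```java
// Illustrative sketch; GrowthSketch is a hypothetical name, not part of the JDK.
class GrowthSketch {

    // The growth rule from the ensureCapacity excerpt, for the case minCapacity > oldCapacity.
    static int newCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = (oldCapacity * 3) / 2 + 1;
        return Math.max(newCapacity, minCapacity);
    }

    public static void main(String[] args) {
        int capacity = 10;
        for (int i = 0; i < 4; i++) {
            // Ask for one more slot than is available, as add() does on a full list.
            capacity = newCapacity(capacity, capacity + 1);
            System.out.println(capacity);  // prints 16, 25, 38, 58 in turn
        }
    }
}
```

Because of the integer division, the factor is only approximately 1.5: 25 grows to 38, not 38.5.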

Now the insertion at an arbitrary position:

public void add(int index, E element) {
    if (index > size || index < 0)
        throw new IndexOutOfBoundsException(
            "Index: "+index+", Size: "+size);

    ensureCapacity(size+1);  // Increments modCount!!
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}

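Suppose the list already holds some elements and d has to be inserted at position 1. This takes two steps: shift everything from position 1 onward (position 1 included) back by one slot, then store d at position 1. The shift is done with System.arraycopy: it copies from index 1 (inclusive) of the array into the same array starting at index + 1 = 2, copying size - index elements. In other words, the element at index and everything after it move back by (index + 1) - index = 1 position.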
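The two steps can be tried on a plain array. This is a sketch under assumptions: the contents a, b, c and the names InsertSketch and insert are chosen for illustration, and the array is assumed to have a free slot already (the job ensureCapacity does in the real class).

```java
import java.util.Arrays;

// Illustrative sketch; InsertSketch is a hypothetical name, not part of the JDK.
class InsertSketch {

    // Inserts element at index among the first `size` slots of data.
    // Assumes data.length > size, i.e. capacity has already been ensured.
    static void insert(Object[] data, int size, int index, Object element) {
        // Step 1: shift data[index .. size-1] one slot toward the end.
        System.arraycopy(data, index, data, index + 1, size - index);
        // Step 2: store the new element in the freed slot.
        data[index] = element;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "c", null};  // size 3, capacity 4
        insert(data, 3, 1, "d");
        System.out.println(Arrays.toString(data));  // [a, d, b, c]
    }
}
```

System.arraycopy is specified to behave as if the source range were first copied to a temporary array, so the overlapping copy within one array is safe.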

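With single-element insertion covered, let's look at the more complex case of inserting several elements at an arbitrary position.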

public boolean addAll(int index, Collection<? extends E> c) {
    if (index > size || index < 0)
        throw new IndexOutOfBoundsException(
            "Index: " + index + ", Size: " + size);

    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacity(size + numNew);  // Increments modCount

    int numMoved = size - index;
    if (numMoved > 0)
        System.arraycopy(elementData, index, elementData, index + numNew,
                         numMoved);

    System.arraycopy(a, 0, elementData, index, numNew);
    size += numNew;
    return numNew != 0;
}

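Suppose the list initially holds three elements (size = 3) and the collection to insert contains d and e.

In addAll, numNew is the length of the collection being inserted, which is also the number of positions that the element at index and everything after it must move. If the insertion is at the end of the ArrayList, a single arraycopy is enough: inserting at position 3, for instance, copies the elements of a, starting from 0, into elementData starting at index = 3, numNew = 2 elements in total.

If the insertion point is not at the end, things are slightly more involved because existing elements must be moved first. With an insertion position of 1, the size - index = 2 elements from index 1 onward have to move back by numNew = 2 positions.

That is System.arraycopy(elementData, 1, elementData, 3, 2). A second arraycopy then copies the new elements into the gap, which completes the insertion.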
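The same two arraycopy calls can be run on a plain array to check the arithmetic. In this sketch the existing contents a, b, c and the names InsertAllSketch and insertAll are assumptions made for illustration; the array is assumed to be large enough already.

```java
import java.util.Arrays;

// Illustrative sketch; InsertAllSketch is a hypothetical name, not part of the JDK.
class InsertAllSketch {

    // Inserts all of `a` at index among the first `size` slots of data.
    // Assumes data.length >= size + a.length.
    static void insertAll(Object[] data, int size, int index, Object[] a) {
        int numNew = a.length;
        int numMoved = size - index;
        if (numMoved > 0) {
            // Open a gap of numNew slots by moving the tail toward the end.
            System.arraycopy(data, index, data, index + numNew, numMoved);
        }
        // Fill the gap with the new elements.
        System.arraycopy(a, 0, data, index, numNew);
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "c", null, null};  // size 3, capacity 5
        insertAll(data, 3, 1, new Object[] {"d", "e"});
        System.out.println(Arrays.toString(data));  // [a, d, e, b, c]
    }
}
```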

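That covers all the add operations; they are built mainly on System.arraycopy. Now let's look at remove. As you might expect, it also relies on System.arraycopy, except that elements are moved forward.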

public E remove(int index) {
    RangeCheck(index);

    modCount++;
    E oldValue = (E) elementData[index];

    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // Let gc do its work

    return oldValue;
}

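Suppose the list holds five elements and the one to remove is e, at index 2. The size - index - 1 = 2 elements starting at index + 1 = 3 are moved forward by one position. After the move, the last slot still holds a stale reference to what used to be the final element, so the element at position --size is set to null, which lets the garbage collector do its work. With adding and removing covered, let's turn to lookups.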
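As a recap of the removal steps before moving on, here is a sketch on a plain array. Only the position of e (index 2 of five elements) comes from the description; the other contents and the names RemoveSketch and removeAt are assumptions for illustration.

```java
import java.util.Arrays;

// Illustrative sketch; RemoveSketch is a hypothetical name, not part of the JDK.
class RemoveSketch {

    // Removes the element at index from the first `size` slots of data and returns it.
    static Object removeAt(Object[] data, int size, int index) {
        Object oldValue = data[index];
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            // Close the gap by moving the tail one slot toward the front.
            System.arraycopy(data, index + 1, data, index, numMoved);
        }
        // The last slot still holds a duplicate reference; clear it.
        data[size - 1] = null;
        return oldValue;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "e", "c", "d"};
        Object removed = removeAt(data, 5, 2);
        System.out.println(removed);                // e
        System.out.println(Arrays.toString(data));  // [a, b, c, d, null]
    }
}
```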

public int indexOf(Object o) {
    if (o == null) {
        for (int i = 0; i < size; i++)
            if (elementData[i]==null)
                return i;
    } else {
        for (int i = 0; i < size; i++)
            if (o.equals(elementData[i]))
                return i;
    }
    return -1;
}

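ArrayList searches from the front and returns the index of the first matching element. Because an ArrayList can store null, the search has to handle the null and non-null cases separately. If nothing matches, it returns -1.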
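The three outcomes described above (first match, null handling, not found) can be observed directly with java.util.ArrayList. The sample contents and the class name IndexOfSketch are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; IndexOfSketch is a hypothetical name.
class IndexOfSketch {

    // Builds a small list that contains null twice.
    static List<String> sample() {
        List<String> list = new ArrayList<String>();
        list.add("a");
        list.add(null);
        list.add("b");
        list.add(null);
        return list;
    }

    public static void main(String[] args) {
        List<String> list = sample();
        System.out.println(list.indexOf(null));  // 1, the first of the two nulls
        System.out.println(list.indexOf("b"));   // 2
        System.out.println(list.indexOf("x"));   // -1, not present
    }
}
```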

public void trimToSize() {
    modCount++;
    int oldCapacity = elementData.length;
    if (size < oldCapacity) {
        elementData = Arrays.copyOf(elementData, size);
    }
}

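The name trimToSize already says what the method does, in the spirit of the advice in Clean Code that method names should reveal intent. It discards the unused part of the backing array so that the capacity equals size exactly, using Arrays.copyOf. Since that involves copying the array, the method should not be called casually; the copy is not free.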

/**
 * Returns a shallow copy of this <tt>ArrayList</tt> instance.  (The
 * elements themselves are not copied.)
 *
 * @return a clone of this <tt>ArrayList</tt> instance
 */
public Object clone() {
    try {
        ArrayList<E> v = (ArrayList<E>) super.clone();
        v.elementData = Arrays.copyOf(elementData, size);
        v.modCount = 0;
        return v;
    } catch (CloneNotSupportedException e) {
        // this shouldn't happen, since we are Cloneable
        throw new InternalError();
    }
}

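As the comment says, clone returns a shallow copy of the list: the backing array is copied, but when the list stores objects only the references are copied, not the objects themselves.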
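What "shallow" means in practice is easy to check with mutable elements. In this sketch (CloneSketch and copyOf are illustrative names), a change made to an element through the clone is visible through the original, while adding to the clone leaves the original untouched because the backing arrays are separate.

```java
import java.util.ArrayList;

// Illustrative sketch; CloneSketch is a hypothetical name.
class CloneSketch {

    // clone() returns Object, so a cast is needed.
    @SuppressWarnings("unchecked")
    static ArrayList<StringBuilder> copyOf(ArrayList<StringBuilder> list) {
        return (ArrayList<StringBuilder>) list.clone();
    }

    public static void main(String[] args) {
        ArrayList<StringBuilder> original = new ArrayList<StringBuilder>();
        original.add(new StringBuilder("x"));

        ArrayList<StringBuilder> copy = copyOf(original);
        copy.get(0).append("y");            // same StringBuilder object in both lists
        copy.add(new StringBuilder("z"));   // only the clone's array changes

        System.out.println(original);  // [xy]
        System.out.println(copy);      // [xy, z]
    }
}
```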

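ArrayList extends AbstractList and implements four interfaces: List<E>, RandomAccess, Cloneable and java.io.Serializable. The last three are marker interfaces with no methods; they indicate that ArrayList supports random access, cloning and serialization respectively, mainly so that code can check for them with instanceof. If a class does not implement Cloneable, calling Object's clone method on it throws CloneNotSupportedException.

Now for the iterators that AbstractList provides to its subclasses. First the one-directional iterator, which can only move forward. Itr is an inner class of AbstractList with three fields: cursor, initially 0, the index of the element that the next call to next will return; lastRet, initially -1, the index of the element most recently returned; and expectedModCount, initialized to modCount and used to detect illegal structural modification of the collection during iteration.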

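hasNext returns true if cursor != size and false otherwise. The cursor is advanced in next: each call fetches the element at cursor, assigns the current cursor value to lastRet, and then increments cursor. So lastRet, when it is not -1, always holds the index of the element that was just returned. What lastRet is for is not obvious yet; it becomes clear in the remove method (removing through the iterator is safe during iteration).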

public void remove() {
    if (lastRet == -1)
        throw new IllegalStateException();
    checkForComodification();

    try {
        AbstractList.this.remove(lastRet);
        if (lastRet < cursor)
            cursor--;
        lastRet = -1;
        expectedModCount = modCount;
    } catch (IndexOutOfBoundsException e) {
        throw new ConcurrentModificationException();
    }
}

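If lastRet == -1 an exception is thrown. When is lastRet -1? First, right after the iterator is created, when remove is called before any call to next. Second, when remove has already been called for the current element and is called again. AbstractList.this.remove(lastRet) removes the element most recently returned; if that element was before the cursor, cursor is moved back by one so that no element is skipped. Because the removal changes modCount, expectedModCount has to be reassigned afterwards.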
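Both behaviors, safe removal during iteration and the IllegalStateException on a second remove, can be checked against java.util.ArrayList. This is a small sketch; IteratorRemoveSketch and its method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Illustrative sketch; IteratorRemoveSketch is a hypothetical name.
class IteratorRemoveSketch {

    // Removes every element equal to target, using only the iterator's remove().
    static void removeAll(List<String> list, String target) {
        for (Iterator<String> it = list.iterator(); it.hasNext();) {
            if (target.equals(it.next())) {
                it.remove();
            }
        }
    }

    // Returns true if a second remove() without an intervening next() is rejected.
    // Note: removes the first element of the list as a side effect.
    static boolean secondRemoveRejected(List<String> list) {
        Iterator<String> it = list.iterator();
        it.next();
        it.remove();
        try {
            it.remove();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>(Arrays.asList("a", "b", "b", "c"));
        removeAll(list, "b");
        System.out.println(list);                        // [a, c]
        System.out.println(secondRemoveRejected(list));  // true
    }
}
```

The two adjacent "b" entries are both removed, which is the case the cursor-- adjustment exists for.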

The plain one-directional iterator offers only remove. The bidirectional iterator extends it, adds set and add, and supports moving the cursor backward.

public boolean hasPrevious() {
    return cursor != 0;
}

public E previous() {
    checkForComodification();
    try {
        int i = cursor - 1;
        E previous = get(i);
        lastRet = cursor = i;
        return previous;
    } catch (IndexOutOfBoundsException e) {
        checkForComodification();
        throw new NoSuchElementException();
    }
}

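cursor != 0 means there are still elements before the cursor. previous returns the element just before the cursor: if cursor is currently 5, it returns the element at index 4, and both lastRet and cursor are set to cursor - 1, so both point at the element just returned. Next, the update operation: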

public void set(E e) {
    if (lastRet == -1)
        throw new IllegalStateException();
    checkForComodification();

    try {
        AbstractList.this.set(lastRet, e);
        expectedModCount = modCount;
    } catch (IndexOutOfBoundsException ex) {
        throw new ConcurrentModificationException();
    }
}

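Because an element may be updated several times, lastRet is not reset to -1 here. Note that set only works after a call to next or previous, since lastRet must refer to an element that has been returned.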
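A short sketch of backward traversal combined with set, run against java.util.ArrayList. ListIteratorSetSketch and its methods are illustrative names; the upper-casing is just an arbitrary update chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

// Illustrative sketch; ListIteratorSetSketch is a hypothetical name.
class ListIteratorSetSketch {

    // Walks from the end to the front and replaces each element in place.
    static void upperCaseBackward(List<String> list) {
        ListIterator<String> it = list.listIterator(list.size());
        while (it.hasPrevious()) {
            String s = it.previous();
            it.set(s.toUpperCase());  // lastRet points at the element just returned
        }
    }

    // Returns true if set() on a fresh iterator (lastRet still -1) is rejected.
    static boolean setBeforeMoveRejected(List<String> list) {
        try {
            list.listIterator().set("x");
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>(Arrays.asList("a", "b", "c"));
        upperCaseBackward(list);
        System.out.println(list);                         // [A, B, C]
        System.out.println(setBeforeMoveRejected(list));  // true
    }
}
```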

public void add(E e) {
    checkForComodification();

    try {
        AbstractList.this.add(cursor++, e);
        lastRet = -1;
        expectedModCount = modCount;
    } catch (IndexOutOfBoundsException ex) {
        throw new ConcurrentModificationException();
    }
}

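add inserts the new element at the cursor position, that is, right after the element just returned by next and immediately before the element that next would return. It then advances cursor past the new element and sets lastRet to -1. Consequently remove and set cannot be called immediately after add.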
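The position at which add inserts, and the rule that set cannot follow it directly, can be seen in a small sketch against java.util.ArrayList. ListIteratorAddSketch and its methods are made-up names for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

// Illustrative sketch; ListIteratorAddSketch is a hypothetical name.
class ListIteratorAddSketch {

    // Inserts `extra` right after the first occurrence of `after`, if present.
    static void insertAfter(List<String> list, String after, String extra) {
        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            if (after.equals(it.next())) {
                it.add(extra);  // goes in at the cursor, just behind the returned element
                return;
            }
        }
    }

    // Returns true if set() directly after add() is rejected.
    // Note: inserts "x" at the front of the list as a side effect.
    static boolean setAfterAddRejected(List<String> list) {
        ListIterator<String> it = list.listIterator();
        it.add("x");
        try {
            it.set("y");
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>(Arrays.asList("a", "c"));
        insertAfter(list, "a", "b");
        System.out.println(list);                       // [a, b, c]
        System.out.println(setAfterAddRejected(list));  // true
    }
}
```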

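Adding, removing and updating through the iterator's own methods during iteration therefore does not disturb the iteration.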
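For contrast, here is a sketch of what happens when the structure is changed through the list itself while an iterator is active: the modCount comparison trips and a ConcurrentModificationException is thrown. FailFastSketch and directRemoveDetected are illustrative names, and the three-element list is an assumption chosen so that the check is reached.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Illustrative sketch; FailFastSketch is a hypothetical name.
class FailFastSketch {

    // Returns true if removing through the list during a for-each loop is detected.
    static boolean directRemoveDetected(List<String> list) {
        try {
            for (String s : list) {
                list.remove(s);  // structural change that bypasses the iterator
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>(Arrays.asList("a", "b", "c"));
        System.out.println(directRemoveDetected(list));  // true
    }
}
```

The detection is best effort only. With a two-element list, for example, removing the first element makes cursor equal to size, so hasNext returns false and the loop ends before next ever gets the chance to compare modCount.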
