Collections Framework: ArrayList Source Code Analysis

I. Inheritance Structure

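ArrayList is declared as public class ArrayList<E> extends AbstractList<E> implements List<E>, RandomAccess, Cloneable, java.io.Serializable. Of these, RandomAccess, Serializable, and Cloneable are marker interfaces: they declare no methods.

RandomAccess: a marker interface used by List implementations to indicate that they support fast (generally constant-time) random access. Code that cares about this can test for it with instanceof and take a specialized path, as Collections.shuffle does: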

public static void shuffle(List<?> list, Random rnd) {
    int size = list.size();
    // The two branches differ in efficiency: RandomAccess is used to pick
    // the faster strategy, much like checking for an annotation
    if (size < SHUFFLE_THRESHOLD || list instanceof RandomAccess) {
        for (int i=size; i>1; i--)
            swap(list, i-1, rnd.nextInt(i));
    } else {
        Object arr[] = list.toArray();

        // Shuffle array
        for (int i=size; i>1; i--)
            swap(arr, i-1, rnd.nextInt(i));

        // Dump array back into list
        ListIterator it = list.listIterator();
        for (int i=0; i<arr.length; i++) {
            it.next();
            it.set(arr[i]);
        }
    }
}
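
The same instanceof RandomAccess check can be used in application code. The following is only a minimal sketch: the class RandomAccessDemo and its methods are made-up names for illustration and are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

// Hypothetical helper class, used only to illustrate the marker-interface check.
class RandomAccessDemo {

    // True when the list advertises fast indexed access.
    static boolean isRandomAccess(List<?> list) {
        return list instanceof RandomAccess;
    }

    // Sums a list, choosing the traversal style based on the marker interface.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            // Indexed loop: cheap for array-backed lists such as ArrayList
            for (int i = 0; i < list.size(); i++) {
                total += list.get(i);
            }
        } else {
            // Iterator loop: avoids repeated get(i) on linked lists
            for (int value : list) {
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> linkedList = new LinkedList<>(Arrays.asList(1, 2, 3));
        System.out.println(isRandomAccess(arrayList) + " " + sum(arrayList));
        System.out.println(isRandomAccess(linkedList) + " " + sum(linkedList));
    }
}
```

Both lists produce the same sum; only the traversal strategy differs.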

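Serializable: a marker interface. Implementing it flags the class as serializable; without it, attempting to serialize an instance fails.
Cloneable: a marker interface. A class must implement it and override Object's clone method before its instances can be cloned through a call to clone. If the class does not implement Cloneable, Object.clone() throws CloneNotSupportedException. In other words, Cloneable is only a marker that makes calling clone() legal.

These marker interfaces can be thought of as working like annotations. An annotation declared with @interface is itself an interface, and, loosely speaking (this is imprecise and only an aid to understanding), an annotation is syntactic sugar for the same kind of tagging that a marker interface provides.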
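
To make the Cloneable rule concrete, here is a small sketch. The classes NotMarked and Marked are hypothetical and exist only for this illustration; the behavior shown (Object.clone() checking for the marker) is standard Java.

```java
import java.util.ArrayList;

// Hypothetical demo class; the nested types are invented for illustration.
class CloneableDemo {

    // Overrides clone() but does NOT implement the Cloneable marker.
    static class NotMarked {
        @Override
        public Object clone() throws CloneNotSupportedException {
            return super.clone(); // Object.clone() checks for Cloneable here
        }
    }

    // Overrides clone() and implements the Cloneable marker.
    static class Marked implements Cloneable {
        @Override
        public Object clone() throws CloneNotSupportedException {
            return super.clone();
        }
    }

    // Returns true if cloning the unmarked class is rejected.
    static boolean unmarkedCloneFails() {
        try {
            new NotMarked().clone();
            return false;
        } catch (CloneNotSupportedException e) {
            return true;
        }
    }

    // Returns true if cloning the marked class succeeds.
    static boolean markedCloneWorks() {
        try {
            return new Marked().clone() instanceof Marked;
        } catch (CloneNotSupportedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("unmarked clone fails: " + unmarkedCloneFails());
        System.out.println("marked clone works: " + markedCloneWorks());
        // ArrayList carries the marker, so its clone() is legal
        Object copy = new ArrayList<String>().clone();
        System.out.println(copy instanceof Cloneable);
    }
}
```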

II. Analysis of Key Methods

3.1 Constants and fields

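/** Default initial capacity: 10 */
private static final int DEFAULT_CAPACITY = 10;

/** Shared empty array used by empty instances (for example, new ArrayList(0)) */
private static final Object[] EMPTY_ELEMENTDATA = {};

/**
 * Shared empty array used by default-sized empty instances (the no-argument constructor).
 * Kept distinct from EMPTY_ELEMENTDATA so the first add knows to grow to DEFAULT_CAPACITY.
 */
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

/** The array that actually stores the elements; transient, so default serialization skips it */
transient Object[] elementData;

/** The size of the ArrayList (the number of elements it contains) */
private int size;

/** Maximum array size to allocate */
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

/** Number of structural modifications (this field is inherited from AbstractList) */
protected transient int modCount = 0;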

3.2. Constructors

Constructor with an initial capacity

/**
 * Constructs an empty list with the specified initial capacity
 *
 * @param  initialCapacity  the initial capacity of the list
 * @throws IllegalArgumentException if the specified initial capacity is negative
 */
public ArrayList(int initialCapacity) {
    // Check whether initialCapacity is greater than 0
    if (initialCapacity > 0) {
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {  // the initial capacity is 0
        // Point elementData at the shared empty array EMPTY_ELEMENTDATA
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        // A negative initialCapacity is an illegal argument
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
    }
}

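No-argument constructor

/**
 * Constructs an empty list with a nominal initial capacity of 10. The backing array
 * actually starts with length 0; only the first add grows it to 10. This is a
 * lazy-allocation strategy, since a list is often created well before it is used.
 */
public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}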

Constructing an ArrayList from a Collection

/**
 * Constructs a list from the elements of a collection
 *
 * @param c the collection whose elements are placed into this list
 * @throws NullPointerException if the specified collection is null
 */
public ArrayList(Collection<? extends E> c) {
    // Convert the collection to an array and assign it to elementData
    elementData = c.toArray();
    // Check whether the collection is non-empty
    if ((size = elementData.length) != 0) {
        // If the runtime type of elementData is not Object[], convert it
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    } else {
        // An empty collection: use the shared empty array, equivalent to new ArrayList(0)
        this.elementData = EMPTY_ELEMENTDATA;
    }
}

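3.3. The `add` method

The add method may trigger a resize, so let's walk through every method on its execution path.

First, the add method itself:

/**
 * Appends the specified element to the end of the list
 *
 * @param e element to be appended to the list
 * @return <tt>true</tt> (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    // Make sure the backing array can hold size + 1 elements, growing it if needed
    // (size itself is not changed by this call)
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    // Store the element and increment size
    elementData[size++] = e;
    // Report success
    return true;
}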

private void ensureCapacityInternal(int minCapacity) {
    ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}

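/**
 * Computes the capacity the list needs
 *
 * @param elementData the array currently storing the data
 * @param minCapacity the minimum capacity required
 */
private static int calculateCapacity(Object[] elementData, int minCapacity) {
    // True for a list created by the no-argument constructor that has not grown yet;
    // the result is then at least the default capacity of 10.
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        // On the first add: DEFAULT_CAPACITY = 10, minCapacity = 1, so 10 is returned
        return Math.max(DEFAULT_CAPACITY, minCapacity);
    }
    return minCapacity;
}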

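/**
 * Ensures the capacity is sufficient, growing the array if it is not
 * @param  minCapacity the minimum capacity required
 */
private void ensureExplicitCapacity(int minCapacity) {
    // Increment modCount, the structural modification counter
    modCount++;

    // minCapacity is the number of slots needed; if it exceeds the current array length, the array must grow
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}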

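/**
 * Increases the capacity to ensure it can hold at least the number of elements given by the minimum capacity argument
 *
 * @param minCapacity the minimum capacity required
 */
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    // A right shift by one halves the value, so the new capacity is 1.5 times the old capacity
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    // If that is still less than the required minimum, use the required minimum instead
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    // If the new capacity exceeds the maximum array size
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        // hugeCapacity returns Integer.MAX_VALUE when minCapacity > MAX_ARRAY_SIZE, otherwise MAX_ARRAY_SIZE
        newCapacity = hugeCapacity(minCapacity);
    // Copy the existing elements into a new array of the new capacity
    elementData = Arrays.copyOf(elementData, newCapacity);
}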
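
The growth rule can be replayed outside the JDK to see which capacities a default-constructed list passes through. This is a simplified model written for this article, not the JDK code itself: the class GrowthDemo and its methods are hypothetical, the MAX_ARRAY_SIZE handling is left out, and a capacity of 0 stands in for the DEFAULTCAPACITY_EMPTY_ELEMENTDATA check.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the resize arithmetic; not the real ArrayList.
class GrowthDemo {

    // Mirrors the core of grow(): 1.5x the old capacity, or minCapacity if that is larger.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        return newCapacity;
    }

    // Simulates adding `elements` items one by one to a list created with
    // the no-argument constructor and records every capacity it grows to.
    static List<Integer> capacitiesFor(int elements) {
        List<Integer> seen = new ArrayList<>();
        int capacity = 0; // models the empty default array
        for (int size = 0; size < elements; size++) {
            int minCapacity = size + 1;
            if (capacity == 0) {
                // Simplification: stands in for the DEFAULTCAPACITY_EMPTY_ELEMENTDATA test
                minCapacity = Math.max(10, minCapacity);
            }
            if (minCapacity - capacity > 0) {
                capacity = nextCapacity(capacity, minCapacity);
                seen.add(capacity);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // prints [10, 15, 22, 33, 49, 73, 109]
        System.out.println(capacitiesFor(100));
    }
}
```

Adding 100 elements therefore costs seven array allocations in this model; passing an initial capacity to the constructor avoids the intermediate copies.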

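3.4. Adding an element at a given index

/**
 * Inserts the specified element at the specified position in this list.
 * Shifts the element currently at that position (if any) and any subsequent
 * elements to the right (adds one to their indices).
 *
 * @param index index at which the specified element is to be inserted
 * @param element element to be inserted
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public void add(int index, E element) {
    // Validate the insertion index: throws if index > size or index < 0
    rangeCheckForAdd(index);
    // Grow if necessary and increment the list's modCount; see 3.3 above for the details
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    // Shift the element at index and everything after it one slot to the right
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    // Store the element at the requested position
    elementData[index] = element;
    // Increment the element count
    size++;
}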
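
The right shift performed by System.arraycopy can be tried on a plain array. The sketch below is an illustration only: InsertShiftDemo and insertAt are invented names, and the caller is assumed to pass an array that already has a free slot (the real add grows the array first).

```java
import java.util.Arrays;

// Hypothetical helper showing the shift that add(int, E) performs.
class InsertShiftDemo {

    // Inserts element at index in an array holding `size` live elements.
    // Assumption: data.length > size, i.e. there is room for one more element.
    static Object[] insertAt(Object[] data, int size, int index, Object element) {
        if (index < 0 || index > size || size >= data.length) {
            throw new IndexOutOfBoundsException("index=" + index + ", size=" + size);
        }
        // Move elements [index, size) one slot to the right
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = element;
        return data;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "d", null}; // three live elements, one free slot
        insertAt(data, 3, 2, "c");
        System.out.println(Arrays.toString(data)); // prints [a, b, c, d]
    }
}
```

The number of elements moved is size - index, which is why inserting near the front of a large list is the expensive case.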

[Figure: the insertion process]

3.5. The `remove` method

[Figure: how removal works]

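/**
 * Removes the element at the specified position in the list
 *
 * @param index index of the element to remove
 * @return the element that was removed
 * @throws IndexOutOfBoundsException if the index is out of range
 */
public E remove(int index) {
    // Validate the index of the element to remove
    rangeCheck(index);

    // Increment the modification count
    modCount++;
    // Save the element at the given index
    E oldValue = elementData(index);

    // Number of elements that must be shifted
    int numMoved = size - index - 1;
    if (numMoved > 0)
        // Shift the elements after index one position toward the front
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    // Null out the last slot and decrement size
    elementData[--size] = null; // clear to let GC do its work
    // Return the removed element
    return oldValue;
}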

Next, let's look at the other remove method:

/**
 * Removes the first occurrence of the specified element from this list,
 * if it is present.  If the list does not contain the element, it is
 * unchanged.  More formally, removes the element with the lowest index
 * <tt>i</tt> such that
 * <tt>(o==null&nbsp;?&nbsp;get(i)==null&nbsp;:&nbsp;o.equals(get(i)))</tt>
 * (if such an element exists).  Returns <tt>true</tt> if this list
 * contained the specified element (or equivalently, if this list
 * changed as a result of the call).
 *
 * @param o element to be removed from this list, if present
 * @return <tt>true</tt> if this list contained the specified element
 */
public boolean remove(Object o) {
    // If the specified element is null, remove the first null element in the list
    if (o == null) {
        // Scan for the first null, then remove it by index with fastRemove
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}

/*
 * Private remove method that skips bounds checking and does not
 * return the value removed.
 */
private void fastRemove(int index) {
    modCount++;
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work
}
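
A quick way to see the behavior of remove(Object) described above (first occurrence only, with null handled separately) is a small experiment. RemoveDemo and removeFirstOccurrence are hypothetical names used only for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo of remove(Object) semantics on an ArrayList.
class RemoveDemo {

    // Copies the source into an ArrayList and removes the first match of target.
    static List<String> removeFirstOccurrence(List<String> source, String target) {
        List<String> copy = new ArrayList<>(source);
        copy.remove(target); // resolves to remove(Object), not remove(int)
        return copy;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("a", "b", "a", null);
        System.out.println(removeFirstOccurrence(source, "a"));  // prints [b, a, null]
        System.out.println(removeFirstOccurrence(source, null)); // prints [a, b, a]
        System.out.println(removeFirstOccurrence(source, "z"));  // prints [a, b, a, null]
    }
}
```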

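3.6. Search methods

indexOf: index of the first occurrence

/**
 * Returns the index of the first occurrence of the specified element in this list,
 * or -1 if this list does not contain the element.
 * More formally, returns the lowest (first found) index i such that
 * (o == null ? get(i) == null : o.equals(get(i)))
 */
public int indexOf(Object o) {
    // Check whether the element being searched for is null
    if (o == null) {
        // If it is null, return the index of the first null element
        for (int i = 0; i < size; i++)
            if (elementData[i]==null)
                return i;
    } else {
        // Otherwise scan forward and return the index of the first equal element
        for (int i = 0; i < size; i++)
            if (o.equals(elementData[i]))
                return i;
    }
    // Not found: return -1
    return -1;
}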

lastIndexOf: index of the last occurrence

/**
 * Returns the index of the last occurrence of the specified element
 * in this list, or -1 if this list does not contain the element.
 * More formally, returns the highest index <tt>i</tt> such that
 * <tt>(o==null&nbsp;?&nbsp;get(i)==null&nbsp;:&nbsp;o.equals(get(i)))</tt>,
 * or -1 if there is no such index.
 */
public int lastIndexOf(Object o) {
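    // The element being searched for is null
    if (o == null) {
        // Scan backward and return the index of the first null encountered
        for (int i = size-1; i >= 0; i--)
            if (elementData[i]==null)
                return i;
    } else {
        // Scan backward and return the index of the first equal element
        for (int i = size-1; i >= 0; i--)
            if (o.equals(elementData[i]))
                return i;
    }
    // Not found: return -1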
    return -1;
}
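
The two search methods can be compared side by side on a list that contains duplicates and nulls. IndexOfDemo and positions are made-up names for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo comparing indexOf and lastIndexOf.
class IndexOfDemo {

    // Returns {first index, last index} of target in an ArrayList copy of source.
    static int[] positions(List<String> source, String target) {
        List<String> list = new ArrayList<>(source);
        return new int[] { list.indexOf(target), list.lastIndexOf(target) };
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("x", null, "y", "x", null);
        System.out.println(Arrays.toString(positions(source, "x")));  // prints [0, 3]
        System.out.println(Arrays.toString(positions(source, null))); // prints [1, 4]
        System.out.println(Arrays.toString(positions(source, "z")));  // prints [-1, -1]
    }
}
```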

3.7. Iterator

First, how the iterator is used to remove elements while looping:

Iterator<String> iterator = list.iterator();
while(iterator.hasNext()) {
    iterator.next();
    iterator.remove();
}

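If you instead remove elements through the list itself inside a for-each loop, you get a fail-fast java.util.ConcurrentModificationException. The reason is explained below; first, look at the source of ArrayList's iterator.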

public Iterator<E> iterator() {return new Itr();}

private class Itr implements Iterator<E> {
    int cursor;       // index of next element to return
    int lastRet = -1; // index of last element returned; -1 if no such
    int expectedModCount = modCount;

    Itr() {}

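    /** Whether there is a next element */
    public boolean hasNext() {
        return cursor != size;
    }

    /** Returns the current element and advances the cursor to the next one */
    @SuppressWarnings("unchecked")
    public E next() {
        // Check for concurrent modification; removing inside a for-each loop fails right here
        checkForComodification();
        // Remember the current cursor value
        int i = cursor;
        // If i is at or past the number of elements, throw
        if (i >= size)
            throw new NoSuchElementException();
        Object[] elementData = ArrayList.this.elementData;
        // If i is at or past the backing array's length, the list was modified concurrently
        if (i >= elementData.length)
            throw new ConcurrentModificationException();
        // Advance the cursor to the next element
        cursor = i + 1;
        // Return the element at index i and record i as lastRet
        return (E) elementData[lastRet = i];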
    }

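    /** Removes the element last returned by next() */
    public void remove() {
        if (lastRet < 0)
            throw new IllegalStateException();
        checkForComodification();

        try {
            // Delegate to ArrayList's own remove(int)
            ArrayList.this.remove(lastRet);
            // Move the cursor back to the removed position
            cursor = lastRet;
            // Reset lastRet to -1
            lastRet = -1;
            // Resynchronize the expected count. This is also why calling the list's remove
            // inside a for-each loop fails: list.remove changes modCount but not
            // expectedModCount, and since for-each is driven by this iterator, the next
            // call to next() detects the mismatch and throws
            expectedModCount = modCount;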
        } catch (IndexOutOfBoundsException ex) {
            throw new ConcurrentModificationException();
        }
    }

    @Override
    @SuppressWarnings("unchecked")
    public void forEachRemaining(Consumer<? super E> consumer) {
        Objects.requireNonNull(consumer);
        final int size = ArrayList.this.size;
        int i = cursor;
        if (i >= size) {
            return;
        }
        final Object[] elementData = ArrayList.this.elementData;
        if (i >= elementData.length) {
            throw new ConcurrentModificationException();
        }
        while (i != size && modCount == expectedModCount) {
            consumer.accept((E) elementData[i++]);
        }
        // update once at end of iteration to reduce heap write traffic
        cursor = i;
        lastRet = i - 1;
        checkForComodification();
    }

    final void checkForComodification() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }
}
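
The difference between removing through the list and removing through the iterator can be reproduced in a few lines. This is a sketch under simple assumptions (a four-element list, removing the first element); FailFastDemo and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo of the fail-fast check in ArrayList's iterator.
class FailFastDemo {

    // Removes via list.remove inside a for-each loop; returns true if it failed fast.
    static boolean forEachRemoveThrows() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        try {
            for (String s : list) {
                if (s.equals("a")) {
                    list.remove(s); // bumps modCount, expectedModCount is now stale
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // thrown by the next call to next()
        }
    }

    // Removes via the iterator, which keeps expectedModCount in sync.
    static List<String> iteratorRemove() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        Iterator<String> it = list.iterator();
        while (it.hasNext()) {
            if (it.next().equals("a")) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(forEachRemoveThrows()); // prints true
        System.out.println(iteratorRemove());      // prints [b, c, d]
    }
}
```

Note that the check runs in next(), not in hasNext(), so the exception surfaces on the iteration after the offending removal.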

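III. Related Questions

3.1. Differences between `ArrayList` and `Vector`

ArrayList is not thread-safe; Vector is thread-safe because its methods are synchronized.
When it grows, ArrayList increases its capacity by 50% (to 1.5 times the old capacity), while Vector by default doubles its capacity.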
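
Unlike ArrayList, Vector exposes its current capacity through the public capacity() method, so the doubling can be observed directly. The sketch below assumes a Vector created without a capacityIncrement; VectorGrowthDemo and capacityAfter are invented names.

```java
import java.util.Vector;

// Hypothetical demo that observes Vector's default doubling.
class VectorGrowthDemo {

    // Creates a Vector with the given initial capacity (no capacityIncrement),
    // adds `elements` items, and reports the resulting capacity.
    static int capacityAfter(int initialCapacity, int elements) {
        Vector<Integer> vector = new Vector<>(initialCapacity);
        for (int i = 0; i < elements; i++) {
            vector.add(i);
        }
        return vector.capacity();
    }

    public static void main(String[] args) {
        System.out.println(capacityAfter(10, 10)); // prints 10 (no growth needed)
        System.out.println(capacityAfter(10, 11)); // prints 20 (doubled once)
        System.out.println(capacityAfter(10, 21)); // prints 40 (doubled twice)
    }
}
```

Under the same sequence of adds, an ArrayList starting at capacity 10 would move to 15 and then 22 instead.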

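3.2. How can `ArrayList` be made thread-safe?

The Collections utility class provides a synchronizedList method that wraps a list so that every call is synchronized. Its benefit is limited, though: only individual calls are atomic, so iteration and compound operations still need external synchronization on the returned list.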
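
A minimal sketch of Collections.synchronizedList in use follows. The class SynchronizedListDemo, the method concurrentAdds, and the thread counts are assumptions chosen for the example; the manual synchronized block around iteration is what the Collections documentation requires.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo of a synchronized wrapper around ArrayList.
class SynchronizedListDemo {

    // Several threads append concurrently; returns how many elements ended up in the list.
    static int concurrentAdds(int threads, int perThread) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i); // each add is individually synchronized
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // Iteration is a compound operation, so the caller must hold the list's lock
        int count = 0;
        synchronized (list) {
            for (Integer ignored : list) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(concurrentAdds(4, 1000)); // prints 4000
    }
}
```

With a plain ArrayList in place of the wrapper, the same test could lose updates or throw, since add is not atomic.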

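3.3. Why `ArrayList`'s backing array is declared `transient`

A field marked transient is skipped by default serialization, so when an ArrayList is serialized, elementData is not written out automatically.

Because of resizing, the backing array is usually not full, and serializing the whole array would waste space. That does not mean the data is left unserialized: only the occupied part is written. This is visible in ArrayList's writeObject: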

/**
 * Saves the state of the ArrayList instance to a stream (that is, serializes it).
 *
 * @serialData The length of the array backing the <tt>ArrayList</tt>
 *             instance is emitted (int), followed by all of its elements
 *             (each an <tt>Object</tt>) in the proper order.
 */
private void writeObject(java.io.ObjectOutputStream s)
    throws java.io.IOException{
    // Write out element count, and any hidden stuff
    int expectedModCount = modCount;
    s.defaultWriteObject();

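    // Write out size as capacity for behavioural compatibility with clone();
    // deserialization reads this value first
    s.writeInt(size);

    // Write out all elements in the proper order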
    for (int i=0; i<size; i++) {
        s.writeObject(elementData[i]);
    }

    if (modCount != expectedModCount) {
        throw new ConcurrentModificationException();
    }
}

Writing only the size elements actually in use, rather than the whole backing array, saves both space and time.
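
One way to check this is to serialize two lists with the same contents but very different capacities and compare the byte counts. This is only a sketch: SerializedSizeDemo, withCapacity, and serializedLength are hypothetical names, and the capacities are arbitrary.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;

// Hypothetical demo: serialized size depends on size, not on capacity.
class SerializedSizeDemo {

    // Builds a three-element list whose backing array has the given capacity.
    static ArrayList<String> withCapacity(int capacity) {
        ArrayList<String> list = new ArrayList<>(capacity);
        list.add("a");
        list.add("b");
        list.add("c");
        return list;
    }

    // Serializes the list in memory and returns the number of bytes written.
    static int serializedLength(ArrayList<String> list) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(list);
            out.flush();
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int small = serializedLength(withCapacity(3));
        int large = serializedLength(withCapacity(100_000));
        System.out.println(small == large); // prints true
    }
}
```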

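3.5. Choosing between `ArrayList` and `LinkedList`

Use ArrayList when element access is more frequent than insertion or removal. Adding or removing an element in the middle of an ArrayList usually calls System.arraycopy to shift the elements after it; when insertions and removals are frequent, LinkedList can perform better.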

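3.6. Ways to copy an `ArrayList`

Shallow copy

clone() / addAll()  // both are shallow copies

// clone() returns Object, so a cast is required (shallow copy)
ArrayList<String> newArray = (ArrayList<String>) oldArray.clone();

// Copy constructor (shallow copy)
ArrayList<String> myObject = new ArrayList<>(myTempObject);

// Collections.copy requires the destination to be at least as long as the source (shallow copy)
List<String> newList = new ArrayList<>(Arrays.asList(new String[list.size()]));
Collections.copy(newList, list);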

Deep copy

// Via serialization (the elements must be Serializable)
try {
    ByteArrayOutputStream byteOut = new ByteArrayOutputStream();
    ObjectOutputStream out = new ObjectOutputStream(byteOut);
    out.writeObject(list);

    ByteArrayInputStream byteIn = new ByteArrayInputStream(byteOut.toByteArray());
    ObjectInputStream in = new ObjectInputStream(byteIn);
    List<String> newList = (List<String>) in.readObject();
    System.out.println(newList);
}catch (IOException | ClassNotFoundException e) {
    e.printStackTrace();
}

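Another approach is to copy the stored elements one by one (for example by serializing each of them) and add the copies to a new List. Note that copyProperties in BeanUtils also only performs a shallow copy, so the most reliable route is still to have the elements implement Serializable and copy through serialization.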
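
The practical difference only shows up with mutable elements, so the sketch below uses StringBuilder (which is Serializable) instead of String. CopyDemo and its methods are hypothetical names; deepCopy is a generic wrapper around the same serialization round trip shown above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

// Hypothetical demo contrasting shallow and deep copies of an ArrayList.
class CopyDemo {

    // Deep copy through an in-memory serialization round trip.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> ArrayList<T> deepCopy(ArrayList<T> source) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(source);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ArrayList<T>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // A shallow copy shares element objects with the original list.
    static boolean shallowCopySharesElements() {
        ArrayList<StringBuilder> original = new ArrayList<>();
        original.add(new StringBuilder("a"));
        ArrayList<StringBuilder> shallow = new ArrayList<>(original);
        original.get(0).append("!"); // mutate the shared element
        return shallow.get(0).toString().equals("a!");
    }

    // A deep copy owns independent element objects.
    static boolean deepCopyIsIndependent() {
        ArrayList<StringBuilder> original = new ArrayList<>();
        original.add(new StringBuilder("a"));
        ArrayList<StringBuilder> deep = deepCopy(original);
        original.get(0).append("!"); // does not affect the deep copy
        return deep.get(0).toString().equals("a");
    }

    public static void main(String[] args) {
        System.out.println(shallowCopySharesElements()); // prints true
        System.out.println(deepCopyIsIndependent());     // prints true
    }
}
```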

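IV. Summary

ArrayList is a dynamic array; it is not thread-safe and permits null elements.
Adding an element increments modCount and may trigger a resize.
A resize copies the whole array, and inserting or removing in the middle shifts elements, so these operations are comparatively slow; reading and updating by index, on the other hand, is what arrays are good at and is fast.
Compared with Vector, none of ArrayList's APIs take a lock, so it is not thread-safe, whereas Vector's methods are synchronized and therefore thread-safe. When growing, Vector doubles its capacity and ArrayList grows by 50%.
Although elementData is declared transient, ArrayList implements its own writeObject and readObject methods. They do not serialize the list's full capacity, only the elements actually stored, which avoids wasting space.
ArrayList's clone and addAll are shallow copies, so take care when the elements are mutable.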
