A Brief Look at Sorting Algorithms: Merge Sort (4)

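In the previous three posts I covered bubble sort, selection sort and insertion sort. These three are fairly basic algorithms, and neither their principles nor their implementations are particularly difficult. In this post I would like to talk about merge sort (Merge-Sort).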

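Introduction to Algorithms (《算法導論》) introduces merge sort as its first application of the divide-and-conquer (Divide-and-Conquer) idea. With divide-and-conquer, a given problem is broken into several sub-problems, each of which can be broken down further, until the resulting sub-problems are easy to solve directly. The solutions of the sub-problems are then combined into the final solution of the original problem. Merge sort proceeds as follows: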

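  • Split the array into a left and a right sub-array of roughly equal size, covering the index ranges [start, mid] and [mid + 1, end].
  • If the current sub-array holds a single element, that is end == start, the recursion returns to the previous level (the "return" half of recursion), where the two sub-arrays are merged.
  • To merge, copy the elements of the two ranges into two temporary arrays and compare their heads: the smaller head element is written back into the original array, and the cursors of both the source array and the original array advance by one (see the code for details). If one of the temporary arrays runs out during this process, the remainder of the other one is copied directly into the following positions of the original array.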

Merge sort runs in O(n*log(n)) time, and in its usual form it is also a stable sorting algorithm.
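
The O(n*log(n)) bound can be motivated with the standard recurrence for merge sort. This is only a sketch under simplifying assumptions that are not part of the original post: n is a power of two, and merging n elements costs cn for some constant c (T and c are notation introduced here).

```latex
T(n) =
\begin{cases}
  c, & n = 1 \\
  2\,T(n/2) + cn, & n > 1
\end{cases}
\qquad\Longrightarrow\qquad
T(n) = cn\log_2 n + cn = O(n \log n)
```

The recursion tree has log_2(n) + 1 levels and every level costs cn in total, which gives the sum on the right.
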
Suppose we have the following array:

[7, 3, 14, 10, 9, 14, 7, 14]

When the splitting reaches the bottom level, there are 8 sub-arrays:

[7], [3], [14], [10], [9], [14], [7], [14]

Next, [7] and [3] are merged into [3, 7], then [14] and [10] are merged into [10, 14], then [3, 7] and [10, 14] are merged into [3, 7, 10, 14], and so on. The final sorted result is:

[3, 7, 7, 9, 10, 14, 14, 14]
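
The merges walked through above can be replayed with a small stand-alone helper. This is a minimal sketch, not the post's own implementation: the class name MergeWalkthrough and the version of merge that returns a new array are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of the original post's code.
final class MergeWalkthrough {

    /** Merges two already-sorted arrays into a new sorted array. */
    static int[] merge(int[] left, int[] right) {
        int[] out = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            // "<=" takes the left element on ties.
            out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        // At most one of these two loops copies anything.
        while (i < left.length) out[k++] = left[i++];
        while (j < right.length) out[k++] = right[j++];
        return out;
    }

    public static void main(String[] args) {
        int[] a = merge(new int[]{7}, new int[]{3});     // [3, 7]
        int[] b = merge(new int[]{14}, new int[]{10});   // [10, 14]
        int[] leftHalf = merge(a, b);                     // [3, 7, 10, 14]
        int[] c = merge(new int[]{9}, new int[]{14});    // [9, 14]
        int[] d = merge(new int[]{7}, new int[]{14});    // [7, 14]
        int[] rightHalf = merge(c, d);                    // [7, 9, 14, 14]
        System.out.println(Arrays.toString(merge(leftHalf, rightHalf)));
        // prints [3, 7, 7, 9, 10, 14, 14, 14]
    }
}
```

Each call corresponds to one merge of the example array, from the single-element level up to the final merge of the two halves.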

Sample code for merge sort follows:

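// Note: com.sun.istack.internal is a JDK-internal package and may not be available on
// newer JDKs. The @NotNull annotations below serve as documentation only.
import com.sun.istack.internal.NotNull;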

import java.util.Arrays;
import java.util.Random;

/**
 * A demo of {@code MergeSort}.
 *
 * @author Mr.K
 */
public class MergeSort {

    public static void main(String[] args) {
        int N = 20;
        int[] numbers = new int[N];
        Random random = new Random();
        for (int i = 0; i < N; i++) {
            numbers[i] = random.nextInt(2 * N);
        }
        System.out.println("待排序數組: " + Arrays.toString(numbers) + "\n");
        mergeSort(numbers, 0, numbers.length - 1);
        System.out.println("\n已排序數組: " + Arrays.toString(numbers));
    }

    /**
     * Accepts an array, a start index(inclusive) and an end index(inclusive), and sorts
     * that range of the array by {@code MergeSort}.
     * <ul>
     *     <li>If <em>start</em> is less than <em>end</em>, there are at least two
     *     numbers to sort. In that case, finds the middle index and sorts the ranges
     *     from <em>start</em>(inclusive) to <em>mid</em>(inclusive) and from
     *     <em>mid + 1</em>(inclusive) to <em>end</em>(inclusive) by invoking
     *     {@code mergeSort} itself. This is what we call <em>Recursion</em>.</li>
     *     <li>After the two recursive calls, <em>merge</em> is invoked to merge the two
     *     sub-arrays, ensuring that the numbers from <em>start</em>(inclusive) to
     *     <em>end</em>(inclusive) are in ascending order.</li>
     * </ul>
     * To sort all elements of an array, <em>start</em> and <em>end</em> should be the
     * indices of its first and last elements, respectively. In other words, invoke this
     * method like
     * <blockquote>
     *     mergeSort(numbers, 0, numbers.length - 1);
     * </blockquote>
     * where <em>numbers</em> is an array of integers.<br><br>
     * <p>
     * {@code MergeSort} runs in O(n * lg n) time, which for large inputs is much better
     * than the O(n^2) worst-case time of
     * {@link org.vimist.pro.sort.InsertionSort#insertionSort(int[])},
     * {@link org.vimist.pro.sort.BubbleSort#bubbleSort(int[])} and
     * {@link org.vimist.pro.sort.SelectionSort#selectionSort(int[])}.<br><br>
     * <p>
     * In short, {@code MergeSort} is an application of <em>Divide-and-Conquer</em>: a
     * problem is divided into sub-problems, which are divided again until they are easy
     * to solve, and the partial results are then merged into the solution of the whole
     * problem. <em>Divide-and-Conquer</em> is usually implemented with <em>Recursion</em>,
     * which can make it harder to follow at first.
     * <p>Finally, {@code MergeSort} is stable because the condition used when merging is
     * <blockquote>
     *     {@code if (L[i] <= R[j])}<br>
     * </blockquote>
     * Array <em>L</em> lies to the left of array <em>R</em>, so on a tie the element from
     * <em>L</em> is taken first and equal elements keep their original order.
     * </p>
     *
     * @param numbers specified array to be sorted
     * @param start   index of the beginning of the range to sort (inclusive)
     * @param end     index of the end of the range to sort (inclusive)
     */
    public static void mergeSort(@NotNull int[] numbers, @NotNull int start, @NotNull int end) {
        // A range with fewer than two elements is already sorted, so nothing is done then.
        if (start < end) {
            int mid = start + (end - start) / 2;
            mergeSort(numbers, start, mid);
            mergeSort(numbers, mid + 1, end);
            merge(numbers, start, mid, end);
        }
    }

    /**
     * Accepts an array and merges two sub-arrays, one of which starts from <em>start</em>
     * (inclusive) to <em>mid</em>(inclusive) and the other one starts from <em>mid + 1</em>
     * (inclusive) to <em>end</em>(inclusive).
     * <ul>
     *     <li>The first step is to copy the numbers from index <em>start</em>
     *     (inclusive) to index <em>mid</em>(inclusive) and the numbers from index
     *     <em>mid + 1</em>(inclusive) to index <em>end</em>(inclusive) into two arrays,
     *     named <em>L</em> and <em>R</em>, respectively.</li>
     *     <li>Then, compares the heads of both sub-arrays, writes the smaller one to the
     *     current position of the original array (beginning at <em>start</em>), and
     *     advances two cursors by one: the cursor of the sub-array the number came from
     *     and the cursor of the original array.</li>
     *     <li>If one of the two sub-arrays is exhausted, the remainder of the other one is
     *     copied directly to the following positions of the original array.</li>
     * </ul>
     * Merging <em>n</em> numbers takes O(n) time.
     *
     * @param number specified array to be merged
     * @param start  start index of the range to merge
     * @param mid    middle index
     * @param end    end index of the range to merge
     */
    public static void merge(@NotNull int[] number, @NotNull int start, @NotNull int mid, @NotNull int end) {
        // Copy number[start..mid] and number[mid + 1..end] into two temporary arrays.
        int[] L = Arrays.copyOfRange(number, start, mid + 1);
        int[] R = Arrays.copyOfRange(number, mid + 1, end + 1);
        System.out.println("待合併數組: " + Arrays.toString(L) + ", " + Arrays.toString(R));

        // Repeatedly move the smaller head element back into number[start..end].
        int i = 0, j = 0, k = start;
        while (i < L.length && j < R.length) {
            // "<=" prefers the left element on ties, which keeps MergeSort stable.
            if (L[i] <= R[j]) {
                number[k++] = L[i++];
            } else {
                number[k++] = R[j++];
            }
        }

        // One side is exhausted: copy the remainder of the other side without comparing.
        while (i < L.length) {
            number[k++] = L[i++];
        }
        while (j < R.length) {
            number[k++] = R[j++];
        }
    }

}

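Running it produces output like the following (the labels 待排序數組, 待合併數組 and 已排序數組 mean "array to sort", "arrays to merge" and "sorted array"):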

待排序數組: [10, 9, 26, 16, 6, 34, 21, 30, 21, 5, 33, 13, 2, 1, 2, 0, 18, 11, 25, 11]

待合併數組: [10], [9]
待合併數組: [9, 10], [26]
待合併數組: [16], [6]
待合併數組: [9, 10, 26], [6, 16]
待合併數組: [34], [21]
待合併數組: [21, 34], [30]
待合併數組: [21], [5]
待合併數組: [21, 30, 34], [5, 21]
待合併數組: [6, 9, 10, 16, 26], [5, 21, 21, 30, 34]
待合併數組: [33], [13]
待合併數組: [13, 33], [2]
待合併數組: [1], [2]
待合併數組: [2, 13, 33], [1, 2]
待合併數組: [0], [18]
待合併數組: [0, 18], [11]
待合併數組: [25], [11]
待合併數組: [0, 11, 18], [11, 25]
待合併數組: [1, 2, 2, 13, 33], [0, 11, 11, 18, 25]
待合併數組: [5, 6, 9, 10, 16, 21, 21, 26, 30, 34], [0, 1, 2, 2, 11, 11, 13, 18, 25, 33]

已排序數組: [0, 1, 2, 2, 5, 6, 9, 10, 11, 11, 13, 16, 18, 21, 21, 25, 26, 30, 33, 34]
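
The effect of the `L[i] <= R[j]` comparison on stability is easier to see when every key carries a tag. The following is a minimal sketch under assumptions of my own: the class name StabilityDemo and the {key, original position} rows are illustrative and do not appear in the code above.

```java
import java.util.Arrays;

// Hypothetical demo class; the {key, tag} rows are illustrative only.
final class StabilityDemo {

    /** Sorts rows[start..end] by rows[i][0] using a top-down merge sort. */
    static void mergeSort(int[][] rows, int start, int end) {
        if (start >= end) {
            return;
        }
        int mid = start + (end - start) / 2;
        mergeSort(rows, start, mid);
        mergeSort(rows, mid + 1, end);
        int[][] left = Arrays.copyOfRange(rows, start, mid + 1);
        int[][] right = Arrays.copyOfRange(rows, mid + 1, end + 1);
        int i = 0, j = 0, k = start;
        while (i < left.length && j < right.length) {
            // Ties go to the left half, so earlier rows stay ahead of later ones.
            rows[k++] = (left[i][0] <= right[j][0]) ? left[i++] : right[j++];
        }
        while (i < left.length) rows[k++] = left[i++];
        while (j < right.length) rows[k++] = right[j++];
    }

    public static void main(String[] args) {
        // The second column records the original position of each key.
        int[][] rows = {{14, 0}, {7, 1}, {14, 2}, {7, 3}, {14, 4}};
        mergeSort(rows, 0, rows.length - 1);
        System.out.println(Arrays.deepToString(rows));
        // prints [[7, 1], [7, 3], [14, 0], [14, 2], [14, 4]]
    }
}
```

Rows with equal keys come out in increasing order of their original position. Replacing `<=` with `<` would take the right-hand element on ties and lose that guarantee.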

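Later, in one of the exercises further on in Introduction to Algorithms, I came across a task that asks for insertion sort to be used inside merge sort. The gist of the problem is: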

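In merge sort, when the sub-problems become small enough, consider using insertion sort: n/k sub-lists of length k are sorted with insertion sort and then merged with the standard merging mechanism. Here k is some specific value.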

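One part of the problem asks for the largest asymptotic value of k. As my own knowledge is limited, I do not prove that here and simply apply insertion sort inside merge sort. In the code below, when end - start >= k the sub-array has more than k elements and is split further. When end - start < k, that is when the sub-array has at most k elements, it is sorted with insertion sort. Once both halves are sorted they are merged, and control returns to the previous level of recursion to continue with the next step.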
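
For reference, the usual argument for the largest k can be sketched as follows. This is the standard textbook reasoning and not a derivation from this post. It assumes insertion sort costs Θ(k^2) in the worst case on each of the n/k sub-lists, and that the merges form lg(n/k) levels of Θ(n) work each.

```latex
\underbrace{\frac{n}{k}\cdot\Theta(k^2)}_{\text{insertion sort}}
+ \underbrace{\Theta\!\left(n \lg \frac{n}{k}\right)}_{\text{merging}}
= \Theta\!\left(nk + n \lg \frac{n}{k}\right)

k = \Theta(\lg n):\quad
nk + n\lg\frac{n}{k} = \Theta(n\lg n) + n\lg n - n\lg\lg n = \Theta(n \lg n)
```

If k grows faster than lg n, the nk term alone exceeds n lg n, so under these assumptions the largest k that keeps the running time of standard merge sort is Θ(lg n).
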

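// Note: com.sun.istack.internal is a JDK-internal package and may not be available on
// newer JDKs. The @NotNull annotations below serve as documentation only.
import com.sun.istack.internal.NotNull;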

import java.util.Arrays;
import java.util.Random;

/**
 * A demo of {@code MergeSort} combining with {@code InsertionSort}.
 *
 * @author Mr.k
 */
public class CombineMergeAndInsertion {

    public static void main(String[] args) {
        int N = 40;
        int[] numbers = new int[N];
        Random random = new Random();
        for (int i = 0; i < N; i++) {
            numbers[i] = random.nextInt(2 * N);
        }
        System.out.println("待排序數組: " + Arrays.toString(numbers) + "\n");
        mergeSort(numbers, 0, numbers.length - 1, 4);
        System.out.println("\n已排序數組: " + Arrays.toString(numbers));
    }

    /**
     * Sorts the range from <em>start</em>(inclusive) to <em>end</em>(inclusive) of the
     * given array with a combination of {@code MergeSort} and {@code InsertionSort}.
     * <ul>
     *     <li>If <em>end - start</em> is greater than or equal to <em>k</em>, the range
     *     is split into [start, mid] and [mid + 1, end], where
     *     <blockquote>
     *         mid = start + (end - start) / 2
     *     </blockquote>
     *     Both halves are sorted by invoking this method itself and are then merged, so
     *     that the whole range from <em>start</em> to <em>end</em> is sorted.</li>
     *     <li>Otherwise, that is when the range holds at most <em>k</em> elements, it is
     *     sorted in place with insertion sort, as in
     *     {@link org.vimist.pro.sort.InsertionSort#insertionSort(int[])}.</li>
     * </ul>
     * This is a combination of {@code MergeSort} and {@code InsertionSort}. In the worst
     * case, its running time is O(nk + n * lg(n / k)). Note that <em>k</em> must be at
     * least 1, otherwise the recursion never terminates.
     *
     * @param arr   specified array
     * @param start start index (inclusive)
     * @param end   end index (inclusive)
     * @param k     largest sub-array length that is sorted with {@code InsertionSort}
     */
    public static void mergeSort(@NotNull int[] arr, @NotNull int start, @NotNull int end, @NotNull int k) {
        if (end - start >= k) {
            int mid = start + (end - start) / 2;
            mergeSort(arr, start, mid, k);
            mergeSort(arr, mid + 1, end, k);
            merge(arr, start, mid, end);
        } else {
            int[] array = new int[end - start + 1];
            System.arraycopy(arr, start, array, 0, end - start + 1);
            System.out.println("待排序部分數組: " + Arrays.toString(array));
            for (int i = start + 1; i <= end; i++) {
                int key = arr[i];
                int j = i - 1;
                System.out.println("插入排序, 待排序數字: " + key);
                while (j >= start && arr[j] > key) {
                    arr[j + 1] = arr[j--];
                }
                arr[j + 1] = key;
            }
            System.arraycopy(arr, start, array, 0, end - start + 1);
            System.out.println("已排序部分數組: " + Arrays.toString(array));
        }
    }

    /**
     * Accepts an array and merges two sub-arrays, one of which starts from <em>start</em>
     * (inclusive) to <em>mid</em>(inclusive) and the other one starts from <em>mid + 1</em>
     * (inclusive) to <em>end</em>(inclusive).
     * <ul>
     *     <li>The first step is to copy the numbers from index <em>start</em>
     *     (inclusive) to index <em>mid</em>(inclusive) and the numbers from index
     *     <em>mid + 1</em>(inclusive) to index <em>end</em>(inclusive) into two arrays,
     *     named <em>L</em> and <em>R</em>, respectively.</li>
     *     <li>Then, compares the heads of both sub-arrays, writes the smaller one to the
     *     current position of the original array (beginning at <em>start</em>), and
     *     advances two cursors by one: the cursor of the sub-array the number came from
     *     and the cursor of the original array.</li>
     *     <li>If one of the two sub-arrays is exhausted, the remainder of the other one is
     *     copied directly to the following positions of the original array.</li>
     * </ul>
     * Merging <em>n</em> numbers takes O(n) time.
     *
     * @param number specified array to be merged
     * @param start  start index of the range to merge
     * @param mid    middle index
     * @param end    end index of the range to merge
     */
    public static void merge(@NotNull int[] number, @NotNull int start, @NotNull int mid, @NotNull int end) {
        // Copy number[start..mid] and number[mid + 1..end] into two temporary arrays.
        int[] L = Arrays.copyOfRange(number, start, mid + 1);
        int[] R = Arrays.copyOfRange(number, mid + 1, end + 1);
        System.out.println("待合併數組: " + Arrays.toString(L) + ", " + Arrays.toString(R));

        // Repeatedly move the smaller head element back into number[start..end].
        int i = 0, j = 0, k = start;
        while (i < L.length && j < R.length) {
            // "<=" prefers the left element on ties, which keeps MergeSort stable.
            if (L[i] <= R[j]) {
                number[k++] = L[i++];
            } else {
                number[k++] = R[j++];
            }
        }

        // One side is exhausted: copy the remainder of the other side without comparing.
        while (i < L.length) {
            number[k++] = L[i++];
        }
        while (j < R.length) {
            number[k++] = R[j++];
        }
    }

}

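Running it produces output like the following (待排序部分數組 means "sub-array to sort", 插入排序, 待排序數字 means "insertion sort, number being inserted", 已排序部分數組 means "sorted sub-array", and 待合併數組 means "arrays to merge"):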

待排序數組: [37, 24, 40, 74, 72, 51, 35, 60, 0, 5, 70, 58, 12, 79, 26, 53, 11, 8, 67, 71, 54, 49, 38, 76, 26, 70, 69, 62, 48, 48, 41, 56, 27, 71, 43, 44, 62, 37, 79, 0]

待排序部分數組: [37, 24, 40]
插入排序, 待排序數字: 24
插入排序, 待排序數字: 40
已排序部分數組: [24, 37, 40]
待排序部分數組: [74, 72]
插入排序, 待排序數字: 72
已排序部分數組: [72, 74]
待合併數組: [24, 37, 40], [72, 74]
待排序部分數組: [51, 35, 60]
插入排序, 待排序數字: 35
插入排序, 待排序數字: 60
已排序部分數組: [35, 51, 60]
待排序部分數組: [0, 5]
插入排序, 待排序數字: 5
已排序部分數組: [0, 5]
待合併數組: [35, 51, 60], [0, 5]
待合併數組: [24, 37, 40, 72, 74], [0, 5, 35, 51, 60]
待排序部分數組: [70, 58, 12]
插入排序, 待排序數字: 58
插入排序, 待排序數字: 12
已排序部分數組: [12, 58, 70]
待排序部分數組: [79, 26]
插入排序, 待排序數字: 26
已排序部分數組: [26, 79]
待合併數組: [12, 58, 70], [26, 79]
待排序部分數組: [53, 11, 8]
插入排序, 待排序數字: 11
插入排序, 待排序數字: 8
已排序部分數組: [8, 11, 53]
待排序部分數組: [67, 71]
插入排序, 待排序數字: 71
已排序部分數組: [67, 71]
待合併數組: [8, 11, 53], [67, 71]
待合併數組: [12, 26, 58, 70, 79], [8, 11, 53, 67, 71]
待合併數組: [0, 5, 24, 35, 37, 40, 51, 60, 72, 74], [8, 11, 12, 26, 53, 58, 67, 70, 71, 79]
待排序部分數組: [54, 49, 38]
插入排序, 待排序數字: 49
插入排序, 待排序數字: 38
已排序部分數組: [38, 49, 54]
待排序部分數組: [76, 26]
插入排序, 待排序數字: 26
已排序部分數組: [26, 76]
待合併數組: [38, 49, 54], [26, 76]
待排序部分數組: [70, 69, 62]
插入排序, 待排序數字: 69
插入排序, 待排序數字: 62
已排序部分數組: [62, 69, 70]
待排序部分數組: [48, 48]
插入排序, 待排序數字: 48
已排序部分數組: [48, 48]
待合併數組: [62, 69, 70], [48, 48]
待合併數組: [26, 38, 49, 54, 76], [48, 48, 62, 69, 70]
待排序部分數組: [41, 56, 27]
插入排序, 待排序數字: 56
插入排序, 待排序數字: 27
已排序部分數組: [27, 41, 56]
待排序部分數組: [71, 43]
插入排序, 待排序數字: 43
已排序部分數組: [43, 71]
待合併數組: [27, 41, 56], [43, 71]
待排序部分數組: [44, 62, 37]
插入排序, 待排序數字: 62
插入排序, 待排序數字: 37
已排序部分數組: [37, 44, 62]
待排序部分數組: [79, 0]
插入排序, 待排序數字: 0
已排序部分數組: [0, 79]
待合併數組: [37, 44, 62], [0, 79]
待合併數組: [27, 41, 43, 56, 71], [0, 37, 44, 62, 79]
待合併數組: [26, 38, 48, 48, 49, 54, 62, 69, 70, 76], [0, 27, 37, 41, 43, 44, 56, 62, 71, 79]
待合併數組: [0, 5, 8, 11, 12, 24, 26, 35, 37, 40, 51, 53, 58, 60, 67, 70, 71, 72, 74, 79], [0, 26, 27, 37, 38, 41, 43, 44, 48, 48, 49, 54, 56, 62, 62, 69, 70, 71, 76, 79]

已排序數組: [0, 0, 5, 8, 11, 12, 24, 26, 26, 27, 35, 37, 37, 38, 40, 41, 43, 44, 48, 48, 49, 51, 53, 54, 56, 58, 60, 62, 62, 67, 69, 70, 70, 71, 71, 72, 74, 76, 79, 79]
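
A single run on random data is a weak check, so it may be worth comparing the combined sort against the library sort on many inputs. The following is a minimal sketch with hypothetical names of my own (HybridSortCheck, hybridSort, agreesWithLibrarySort) and an arbitrary seed. It re-implements the same splitting rule, end - start >= k, without the logging.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical test harness; names and seed are not from the original post.
final class HybridSortCheck {

    /** Merge sort that switches to insertion sort once end - start < k (k must be >= 1). */
    static void hybridSort(int[] a, int start, int end, int k) {
        if (end - start >= k) {
            int mid = start + (end - start) / 2;
            hybridSort(a, start, mid, k);
            hybridSort(a, mid + 1, end, k);
            merge(a, start, mid, end);
        } else {
            // Insertion sort restricted to a[start..end].
            for (int i = start + 1; i <= end; i++) {
                int key = a[i];
                int j = i - 1;
                while (j >= start && a[j] > key) {
                    a[j + 1] = a[j--];
                }
                a[j + 1] = key;
            }
        }
    }

    /** Merges the sorted ranges a[start..mid] and a[mid + 1..end] in place. */
    static void merge(int[] a, int start, int mid, int end) {
        int[] left = Arrays.copyOfRange(a, start, mid + 1);
        int[] right = Arrays.copyOfRange(a, mid + 1, end + 1);
        int i = 0, j = 0, k = start;
        while (i < left.length && j < right.length) {
            a[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) a[k++] = left[i++];
        while (j < right.length) a[k++] = right[j++];
    }

    /** Returns true if hybridSort agrees with Arrays.sort on every random input tried. */
    static boolean agreesWithLibrarySort(long seed, int rounds) {
        Random random = new Random(seed);
        for (int r = 0; r < rounds; r++) {
            int n = random.nextInt(50);      // array length 0..49
            int k = 1 + random.nextInt(8);   // threshold 1..8
            int[] data = new int[n];
            for (int i = 0; i < n; i++) {
                data[i] = random.nextInt(100);
            }
            int[] expected = data.clone();
            Arrays.sort(expected);
            hybridSort(data, 0, n - 1, k);
            if (!Arrays.equals(data, expected)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreesWithLibrarySort(42L, 1000));
    }
}
```

The threshold k is drawn from 1 upwards on purpose: with k = 0 a single-element range would keep splitting into itself and the recursion would not terminate.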