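Recently I found some spare time to study heap sort (Heap-Sort). In terms of time complexity, heap sort matches merge sort (Merge-Sort): both run in O(n * log(n)). In terms of how it sorts, heap sort is like insertion sort (Insertion-Sort): both sort in place, so only a constant number of elements are ever stored outside the input array.
First, the notion of a heap. A heap, or binary heap, can be viewed as a nearly complete binary tree: every level except possibly the lowest is completely filled, and the lowest level is filled from left to right (I drew a few examples by hand here).
A binary heap stored in an array usually carries two attributes: length, the length of the array, and heap_size, the number of array elements that currently belong to the heap. Binary heaps come in two forms, max-heaps and min-heaps. In a max-heap, every node i other than the root satisfies
A[parent(i)] >= A[i]
that is, the value of a node is at most the value of its parent, so the largest element of a max-heap is stored at the root. A min-heap instead satisfies
A[parent(i)] <= A[i]
that is, the value of a node is at least the value of its parent, so its smallest element is stored at the root.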
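The two heap properties can be checked directly on an array. The following is a minimal sketch, assuming 0-based array indices (so parent(i) = (i - 1) / 2); the class name HeapProperty and the methods isMaxHeap and isMinHeap are hypothetical helpers introduced only for illustration.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration only.
final class HeapProperty {
    // Index arithmetic for a binary heap stored in a 0-based array.
    static int parent(int i) { return (i - 1) / 2; }
    static int left(int i)   { return 2 * i + 1; }
    static int right(int i)  { return 2 * i + 2; }

    /** True if a[parent(i)] >= a[i] holds for every non-root node in a[0..heapSize-1]. */
    static boolean isMaxHeap(int[] a, int heapSize) {
        for (int i = 1; i < heapSize; i++) {
            if (a[parent(i)] < a[i]) {
                return false;
            }
        }
        return true;
    }

    /** True if a[parent(i)] <= a[i] holds for every non-root node in a[0..heapSize-1]. */
    static boolean isMinHeap(int[] a, int heapSize) {
        for (int i = 1; i < heapSize; i++) {
            if (a[parent(i)] > a[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] maxHeap = {16, 14, 10, 8, 7, 9, 3, 2, 4, 1};
        int[] minHeap = {1, 2, 3, 4, 7, 9, 10, 14, 8, 16};
        System.out.println(Arrays.toString(maxHeap) + " max-heap? " + isMaxHeap(maxHeap, maxHeap.length));
        System.out.println(Arrays.toString(minHeap) + " min-heap? " + isMinHeap(minHeap, minHeap.length));
    }
}
```

Passing heapSize separately from the array length mirrors the length / heap_size distinction above: only the first heapSize elements are treated as part of the heap.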
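Heap sort is normally built on a max-heap. Taking the example of sorting a random array into ascending order, the algorithm runs roughly as follows:
- First, build a max-heap from the random array.
- Iterate from the last position of the array down to the second element. In each iteration, swap the current maximum of the heap (the first element of the array) into the current position, so that the maxima fill the array from the back. Then decrement heap_size by one and call maxHeapify to restore the max-heap property on the remaining heap.
The maxHeapify procedure takes an array arr and an index i. When it is called, the binary trees rooted at left(i) and right(i) are assumed to be max-heaps, but arr[i] may be smaller than its children. maxHeapify lets the value at arr[i] "float down" in the heap until the subtree rooted at index i satisfies the max-heap property. Here I use the example from Introduction to Algorithms and trace the execution of maxHeapify(arr, 2); note that the textbook numbers nodes from 1.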
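The float-down step can be traced with a small program. This is a sketch under stated assumptions: the input array is the one I recall from the textbook figure, the heap size is passed as a parameter, the loop is iterative rather than recursive, and because Java arrays are 0-based the textbook's node 2 becomes index 1. The class name MaxHeapifyTrace is made up for this illustration.

```java
import java.util.Arrays;

// Hypothetical class for illustration only.
final class MaxHeapifyTrace {
    /** Sifts a[i] down within a[0..heapSize-1], printing the array after each swap. */
    static void maxHeapify(int[] a, int heapSize, int i) {
        while (true) {
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            int largest = i;
            if (left < heapSize && a[left] > a[largest]) {
                largest = left;
            }
            if (right < heapSize && a[right] > a[largest]) {
                largest = right;
            }
            if (largest == i) {
                return; // a[i] is at least as large as both children: done
            }
            int tmp = a[i];
            a[i] = a[largest];
            a[largest] = tmp;
            System.out.println("swap index " + i + " with " + largest + ": " + Arrays.toString(a));
            i = largest; // continue in the subtree that received the smaller value
        }
    }

    public static void main(String[] args) {
        // Assumed example input; index 1 (value 4) violates the max-heap property.
        int[] a = {16, 4, 10, 14, 7, 9, 3, 2, 8, 1};
        maxHeapify(a, a.length, 1);
        System.out.println("result: " + Arrays.toString(a));
    }
}
```

With this input the value 4 is swapped first with 14 and then with 8, after which it is a leaf and the loop stops.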
The code is as follows:
package org.vimist.pro.Algorithm.Sort;
import org.jetbrains.annotations.NotNull;
import java.util.Arrays;
import java.util.Random;
/**
* A demo to illustrate how {@code Heap-Sort} works.
*
* @author Mr.K
*/
public class HeapSort {
// length of the array
private static int length = 12;
// size of binary heap
private static int heapSize = length;
// array to be sorted
private static int[] arr = new int[length];
// random number generator
private static Random random = new Random();
public static void main(String[] args) {
for (int i = 0; i < arr.length; i++) {
arr[i] = random.nextInt(3 * length);
}
System.out.println("待排序數組: " + Arrays.toString(arr) + "\n");
new HeapSort().heapSort(arr);
System.out.println("\n排序後數組: " + Arrays.toString(arr));
}
/**
* Accepts an array, builds a <em>max-heap</em> from it and then sorts the
* array in place using that heap.
* <ul>
* <li>In a <em>max-heap</em> the largest number is at index 0, so the
* first element is exchanged with the last element of the heap range
* [0, heapSize), working from the end of the array down to its second
* element.</li>
* <li>After each exchange the heap shrinks by one and the range
* [0, heapSize) may no longer be a <em>max-heap</em>, so
* {@code maxHeapify} is invoked on the root to restore the property.</li>
* </ul>
* {@code heapSort} runs in O(n * log(n)) time, including in the worst
* case.
*
* @param arr specified array to be sorted
*/
public void heapSort(@NotNull int[] arr) {
buildMaxHeap(arr);
System.out.println("最大二叉堆: " + Arrays.toString(arr) + "\n");
for (int i = arr.length - 1; i > 0; i--) {
exchange(arr, 0, i);
heapSize--;
maxHeapify(arr, 0);
System.out.println("第" + String.format("%2d", arr.length - i) + "步: " + Arrays.toString(arr));
}
}
/**
* Accepts an array and builds a <em>max-heap</em> from it in place. The
* method runs in O(n) time, that is, the heap is built in linear time.
*
* @param arr specified array to construct a max-heap
*/
public void buildMaxHeap(@NotNull int[] arr) {
// the whole array belongs to the heap when building starts
heapSize = arr.length;
// nodes after index arr.length / 2 - 1 are leaves and need no work
for (int i = arr.length / 2 - 1; i >= 0; i--) {
maxHeapify(arr, i);
}
}
/**
* Accepts an array and an index whose left and right subtrees are already
* max-heaps, and sifts the element at that index down until the subtree
* rooted there is a <em>max-heap</em>. The running time is O(h), where h is
* the height of the node at that index.
*
* @param arr specified array, representing the binary heap
* @param index index of the node to sift down
*/
public void maxHeapify(@NotNull int[] arr, @NotNull int index) {
int left = left(index), right = right(index), largest;
if (left < heapSize && arr[left] > arr[index]) {
largest = left;
} else {
largest = index;
}
if (right < heapSize && arr[right] > arr[largest]) {
largest = right;
}
if (largest != index) {
exchange(arr, index, largest);
maxHeapify(arr, largest);
}
}
/**
* Accepts an array and two numbers, which are the indexes of two elements,
* and exchanges these two elements.
*
* @param arr specified array
* @param i one of the indexes
* @param j the other one of the indexes
*/
public void exchange(@NotNull int[] arr, @NotNull int i, @NotNull int j) {
// plain temporary-variable swap; same effect as the XOR trick, easier to read
int temp = arr[i];
arr[i] = arr[j];
arr[j] = temp;
}
/**
* Gets an index and returns the index of the left child of the node at
* the specified index.
*
* @param index index of current node
* @return index of the left child node of current node
*/
public int left(@NotNull int index) {
return (index + 1) * 2 - 1;
}
/**
* Gets an index and returns the index of the right child of the node at
* the specified index.
*
* @param index index of current node
* @return index of the right child node of current node
*/
public int right(@NotNull int index) {
return (index + 1) * 2;
}
}
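The output is as follows (labels: 待排序數組 = array to be sorted, 最大二叉堆 = max-heap, 第 k 步 = step k, 排序後數組 = sorted array):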
待排序數組: [23, 15, 8, 12, 26, 25, 28, 6, 6, 7, 26, 20]
最大二叉堆: [28, 26, 25, 12, 26, 23, 8, 6, 6, 7, 15, 20]
第 1步: [26, 26, 25, 12, 20, 23, 8, 6, 6, 7, 15, 28]
第 2步: [26, 20, 25, 12, 15, 23, 8, 6, 6, 7, 26, 28]
第 3步: [25, 20, 23, 12, 15, 7, 8, 6, 6, 26, 26, 28]
第 4步: [23, 20, 8, 12, 15, 7, 6, 6, 25, 26, 26, 28]
第 5步: [20, 15, 8, 12, 6, 7, 6, 23, 25, 26, 26, 28]
第 6步: [15, 12, 8, 6, 6, 7, 20, 23, 25, 26, 26, 28]
第 7步: [12, 7, 8, 6, 6, 15, 20, 23, 25, 26, 26, 28]
第 8步: [8, 7, 6, 6, 12, 15, 20, 23, 25, 26, 26, 28]
第 9步: [7, 6, 6, 8, 12, 15, 20, 23, 25, 26, 26, 28]
第10步: [6, 6, 7, 8, 12, 15, 20, 23, 25, 26, 26, 28]
第11步: [6, 6, 7, 8, 12, 15, 20, 23, 25, 26, 26, 28]
排序後數組: [6, 6, 7, 8, 12, 15, 20, 23, 25, 26, 26, 28]
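A single run like this one shows the idea but does not say much about correctness in general. One way to gain more confidence is to compare a heap sort against Arrays.sort on many random arrays. The sketch below is a variant, not the listing above: it passes the heap size as a parameter instead of keeping it in a static field, so the same method can sort arrays of any length. The class name HeapSortCheck and the method siftDown are hypothetical names chosen for this illustration.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical class for illustration only.
final class HeapSortCheck {
    /** Sorts a in ascending order in place using a max-heap. */
    static void heapSort(int[] a) {
        int n = a.length;
        // Build the max-heap bottom-up, starting at the last internal node.
        for (int i = n / 2 - 1; i >= 0; i--) {
            siftDown(a, n, i);
        }
        // Repeatedly move the current maximum behind the shrinking heap.
        for (int end = n - 1; end > 0; end--) {
            int tmp = a[0];
            a[0] = a[end];
            a[end] = tmp;
            siftDown(a, end, 0); // the heap now covers a[0..end-1]
        }
    }

    /** Restores the max-heap property for the subtree rooted at i within a[0..heapSize-1]. */
    static void siftDown(int[] a, int heapSize, int i) {
        while (true) {
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            int largest = i;
            if (left < heapSize && a[left] > a[largest]) {
                largest = left;
            }
            if (right < heapSize && a[right] > a[largest]) {
                largest = right;
            }
            if (largest == i) {
                return;
            }
            int tmp = a[i];
            a[i] = a[largest];
            a[largest] = tmp;
            i = largest;
        }
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed so runs are repeatable
        for (int trial = 0; trial < 1000; trial++) {
            int[] a = new int[random.nextInt(50)];
            for (int i = 0; i < a.length; i++) {
                a[i] = random.nextInt(100);
            }
            int[] expected = a.clone();
            Arrays.sort(expected);
            heapSort(a);
            if (!Arrays.equals(a, expected)) {
                throw new AssertionError("mismatch: " + Arrays.toString(a));
            }
        }
        System.out.println("all trials matched Arrays.sort");
    }
}
```

Keeping the heap size local is a design choice rather than a requirement of the algorithm; it simply avoids shared state between calls.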