Java Search Algorithms

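A few points need to be settled before looking at the algorithms:

For comparison-based searches such as binary search, the data must be sorted; if it is unsorted, the only option is a full scan.
Search algorithms are tied to data structures: different data structures suit different search algorithms.
Search algorithms are also related to disk I/O. For example, when a database searches a sorted index, reading one node from disk for every comparison makes the disk reads the dominant cost.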

Arrays

If the index is known, lookup is easy: the complexity is O(1).
If the lookup is by value, a sequential scan is O(n).
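
To make the difference between the two cases concrete, here is a minimal sketch. The class name `LinearSearch` and the method `indexOf` are made up for this illustration and are not part of the original post.

```java
// Hypothetical helper class, used only to illustrate the two access patterns.
class LinearSearch {
    /** Returns the index of the first element equal to target, or -1 if absent. */
    static int indexOf(int[] values, int target) {
        for (int i = 0; i < values.length; i++) {
            if (values[i] == target) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] values = {7, 3, 9, 1};
        System.out.println(values[2]);          // access by index: one step
        System.out.println(indexOf(values, 9)); // lookup by value: scans the array
    }
}
```

The scan makes no assumption about ordering, which is why it is the fallback for unsorted data.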

Binary search

Binary search reduces the time complexity to O(log n).

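/**
 * Binary search, also called half-interval search, is an efficient search method.
 * Requirements: 1. the data must be stored sequentially (e.g., in an array);
 * 2. the data must be sorted by key.
 * @author wzj
 *
 */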
public class BinarySearch { 
    public static void main(String[] args) {
        int[] src = new int[] {1, 3, 5, 7, 8, 9}; 
        System.out.println(binarySearch(src, 3));
        System.out.println(binarySearch(src,3,0,src.length-1));
    }

    /**
     * Iterative binary search.
     *
     * @param srcArray
     *            sorted array
     * @param des
     *            element to search for
     * @return the index of des in the array, or -1 if it is not found
     */
   public static int binarySearch(int[] srcArray, int des){ 

        int low = 0; 
        int high = srcArray.length-1; 
        while(low <= high) { 
            // low + (high - low) / 2 avoids int overflow when low + high is very large
            int middle = low + (high - low) / 2;
            if(des == srcArray[middle]) { 
                return middle; 
            }else if(des <srcArray[middle]) { 
                high = middle - 1; 
            }else { 
                low = middle + 1; 
            }
        }
        return -1;
   }

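    /**
     * Finds the position of an integer in a sorted int array using recursive binary search.
     * @param dataset the sorted array to search
     * @param data the value to look for
     * @param beginIndex first index of the search range (inclusive)
     * @param endIndex last index of the search range (inclusive)
     * @return the index of data, or -1 if it is not found
     */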
    public static int binarySearch(int[] dataset,int data,int beginIndex,int endIndex){  
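       // Check the range first so an empty range never indexes into the array
       if(beginIndex>endIndex||data <dataset[beginIndex]||data>dataset[endIndex]){
           return -1;
       }
       // Written this way to avoid int overflow for very large indices
       int midIndex = beginIndex + (endIndex-beginIndex)/2;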
       if(data <dataset[midIndex]){  
           return binarySearch(dataset,data,beginIndex,midIndex-1);  
       }else if(data>dataset[midIndex]){  
           return binarySearch(dataset,data,midIndex+1,endIndex);  
       }else {  
           return midIndex;  
       }  
   } 
}
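
For comparison, the JDK already ships a binary search in `java.util.Arrays`. The sketch below wraps it so that a miss is reported as -1, matching the convention of the hand-written methods. The wrapper class `LibraryBinarySearch` and its `find` method are names invented here.

```java
import java.util.Arrays;

// Hypothetical wrapper around the JDK's Arrays.binarySearch.
class LibraryBinarySearch {
    /** Returns the index of key in the sorted array, or -1 if it is absent. */
    static int find(int[] sorted, int key) {
        // Arrays.binarySearch returns a negative number when the key is missing
        int result = Arrays.binarySearch(sorted, key);
        return result >= 0 ? result : -1;
    }

    public static void main(String[] args) {
        int[] src = {1, 3, 5, 7, 8, 9};
        System.out.println(find(src, 3));
        System.out.println(find(src, 4));
    }
}
```

When the key is missing, `Arrays.binarySearch` itself returns `-(insertion point) - 1`, which is handy when the next step is inserting the value in order.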

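Insertion into a sorted array, however, has to shift every value after the insertion point. Its time complexity is therefore O(log n) to find the position plus O(n) to shift, which is O(n) overall.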
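
The two costs can be seen separately in a small sketch. It assumes a fixed-size `int[]` that is copied on every insert; `SortedInsert` and `insert` are illustrative names, not from the original post.

```java
import java.util.Arrays;

// Hypothetical example: insert into a sorted array while keeping it sorted.
class SortedInsert {
    static int[] insert(int[] sorted, int value) {
        // O(log n): locate the insertion point with binary search
        int pos = Arrays.binarySearch(sorted, value);
        if (pos < 0) {
            pos = -(pos + 1); // decode the insertion point for a missing value
        }
        // O(n): copy the prefix, place the value, then copy the shifted suffix
        int[] result = new int[sorted.length + 1];
        System.arraycopy(sorted, 0, result, 0, pos);
        result[pos] = value;
        System.arraycopy(sorted, pos, result, pos + 1, sorted.length - pos);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(insert(new int[]{1, 3, 5, 7}, 4)));
    }
}
```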

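Linked lists

A linked list can only be traversed from the head node, so lookup is O(n).
Insertion or deletion only has to relink pointers, which is O(1), but locating the position first costs O(n).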
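
A minimal singly linked list shows where each cost comes from. All names below (`SinglyLinkedList`, `find`, `insertAfter`, `remove`) are chosen for this sketch only.

```java
// Hypothetical singly linked list of ints.
class SinglyLinkedList {
    static final class Node {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    Node head;

    /** O(1): the new node becomes the head. */
    void addFirst(int value) {
        Node node = new Node(value);
        node.next = head;
        head = node;
    }

    /** O(n): walk from the head until the value is found. */
    Node find(int target) {
        for (Node cur = head; cur != null; cur = cur.next) {
            if (cur.value == target) {
                return cur;
            }
        }
        return null;
    }

    /** O(1) once the predecessor is known: only two pointers change. */
    void insertAfter(Node node, int value) {
        Node added = new Node(value);
        added.next = node.next;
        node.next = added;
    }

    /** O(n) to find the predecessor, then O(1) to unlink. */
    boolean remove(int target) {
        if (head == null) {
            return false;
        }
        if (head.value == target) {
            head = head.next;
            return true;
        }
        for (Node prev = head; prev.next != null; prev = prev.next) {
            if (prev.next.value == target) {
                prev.next = prev.next.next;
                return true;
            }
        }
        return false;
    }
}
```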

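Trees

A tree is searched mainly by traversing it, for example in pre-order or in-order.
Insertion and deletion are also relatively fast.
The commonly used variants are:

Binary trees

Building a binary search tree: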

class BinaryNode{
        int value;
        BinaryNode left;
        BinaryNode right;
        public BinaryNode(int value){
            this.value = value;
            this.left = null;
            this.right = null;
        }

        public void add(int value){
            if(value > this.value){
                if(this.right != null){
                    this.right.add(value);
                }else{
                    this.right = new BinaryNode(value);
                }
            }else{
                if(this.left != null){
                    this.left.add(value);
                }else{
                    this.left = new BinaryNode(value);
                }
            }
        }

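        // Binary search tree lookup: descend left or right by comparing values
        public BinaryNode get(int value){
            if(this.value == value){
                return this;
            }
            if(this.value > value){
                // A missing child means the value is not in the tree
                return this.left == null ? null : this.left.get(value);
            }
            return this.right == null ? null : this.right.get(value);
        }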
    }

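The insertion itself is cheap: it just attaches a new node. Finding the insertion point, however, costs time proportional to the height of the tree. That is O(log n) when the tree is reasonably balanced, but in the worst case (for example, values inserted in sorted order) the tree degenerates into a chain and the search approaches a linear scan.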
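
The worst case is easy to reproduce. The sketch below is self-contained and uses its own small node type rather than the `BinaryNode` class; `BstHeightDemo` and its methods are hypothetical names, and the insert is written as a loop so a deep chain does not depend on recursion.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical demo: compare tree height for sorted vs. shuffled insertion order.
class BstHeightDemo {
    static final class Node {
        final int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    /** Same rule as the post's add(): larger values go right, others go left. */
    static Node insert(Node root, int value) {
        Node node = new Node(value);
        if (root == null) {
            return node;
        }
        Node cur = root;
        while (true) {
            if (value > cur.value) {
                if (cur.right == null) { cur.right = node; return root; }
                cur = cur.right;
            } else {
                if (cur.left == null) { cur.left = node; return root; }
                cur = cur.left;
            }
        }
    }

    /** Number of nodes on the longest path from this node down to a leaf. */
    static int height(Node node) {
        return node == null ? 0 : 1 + Math.max(height(node.left), height(node.right));
    }

    static int heightAfterInserting(List<Integer> keys) {
        Node root = null;
        for (int key : keys) {
            root = insert(root, key);
        }
        return height(root);
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            keys.add(i);
        }
        // Sorted input: every node hangs off the right, so the height equals the key count
        System.out.println("sorted:   " + heightAfterInserting(keys));
        Collections.shuffle(keys, new Random(42));
        // Shuffled input: the height is typically far smaller than the key count
        System.out.println("shuffled: " + heightAfterInserting(keys));
    }
}
```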

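Balanced binary trees

A balanced binary tree is a binary tree that keeps its height as small as possible; its algorithm adds left-rotation and right-rotation operations. Insertion costs somewhat more, but lookups perform well in return.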
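
The rotations themselves are small pointer rearrangements. The following is only a sketch of the two rotations on a bare node type; it is not a complete AVL or red-black tree, and `RotationDemo` and its members are invented for the example.

```java
// Hypothetical demo of the left and right rotations used by balanced trees.
class RotationDemo {
    static final class Node {
        final int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    /** Rotates the subtree rooted at y to the right; y's left child becomes the new root. */
    static Node rotateRight(Node y) {
        Node x = y.left;
        y.left = x.right; // x's right subtree sits between x and y, so it moves under y
        x.right = y;
        return x;
    }

    /** Rotates the subtree rooted at x to the left; x's right child becomes the new root. */
    static Node rotateLeft(Node x) {
        Node y = x.right;
        x.right = y.left; // y's left subtree sits between x and y, so it moves under x
        y.left = x;
        return y;
    }

    static int height(Node node) {
        return node == null ? 0 : 1 + Math.max(height(node.left), height(node.right));
    }

    public static void main(String[] args) {
        // A left-leaning chain 3 -> 2 -> 1 with height 3
        Node root = new Node(3);
        root.left = new Node(2);
        root.left.left = new Node(1);
        System.out.println("before: " + height(root));
        root = rotateRight(root);
        // After one right rotation, 2 is the root with children 1 and 3
        System.out.println("after:  " + height(root));
    }
}
```

Both rotations preserve the in-order sequence of values, which is why a balanced tree can apply them freely without breaking lookups.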

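B+Tree

Learned from here
This ties back to the disk I/O point made at the start. To reduce disk I/O, we can exploit the disk's read-ahead behavior and load a node roughly the size of one page into memory in a single read.
First, a word about the B-Tree.
A B-Tree is a balanced m-way search tree that satisfies the following conditions: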

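Each node holds fewer than m values (at most m - 1).
Each node has at most m children.
The values in a subtree lie entirely below, entirely above, or strictly between the adjacent values of its parent; they never cross the two parent values that bound them.
Every node other than the root holds at least ⌈m/2⌉ - 1 values.
A non-leaf node holds exactly one fewer value than it has children.
As shown in the figure below:

As the figure shows, a node with three children holds two values, and the three children hold the data that is less than, between, and greater than those two values respectively.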

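B+Tree
A B+Tree differs from the B-Tree above in the following ways:

All keys are stored in the leaf nodes.
Parent nodes store only what is needed to route to their children: separator keys and child pointers.
There are two entry points: the root node, and a pointer to the smallest leaf node, from which the leaves can be scanned in order.

Searching is similar to searching a binary tree: because insertion already places every key by comparison, much like binary search, a lookup only needs to descend recursively until it reaches the right leaf.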
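
The lookup and the second entry point can be sketched on a small hand-built tree. This is a read-only illustration under several assumptions: the tree is built manually rather than by insertion, the separator convention (child i holds keys below separator i) is one common choice, and every name (`BPlusTreeSketch`, `search`, `scanAllKeys`, `sampleTree`) is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical read-only B+Tree sketch: routing in internal nodes, data in linked leaves.
class BPlusTreeSketch {
    static abstract class Node {
        final int[] keys;
        Node(int[] keys) { this.keys = keys; }
    }

    static final class Internal extends Node {
        final Node[] children; // always keys.length + 1 children
        Internal(int[] keys, Node[] children) { super(keys); this.children = children; }
    }

    static final class Leaf extends Node {
        final String[] values;
        Leaf next; // link to the next leaf for in-order scans
        Leaf(int[] keys, String[] values) { super(keys); this.values = values; }
    }

    /** Entry point 1: descend from the root to the leaf that could hold the key. */
    static String search(Node root, int key) {
        Node node = root;
        while (node instanceof Internal) {
            Internal internal = (Internal) node;
            int i = 0;
            // Skip every separator that is <= key; child i holds keys below separator i
            while (i < internal.keys.length && key >= internal.keys[i]) {
                i++;
            }
            node = internal.children[i];
        }
        Leaf leaf = (Leaf) node;
        int pos = Arrays.binarySearch(leaf.keys, key);
        return pos >= 0 ? leaf.values[pos] : null;
    }

    /** Entry point 2: start at the smallest leaf and follow the leaf links. */
    static List<Integer> scanAllKeys(Node root) {
        Node node = root;
        while (node instanceof Internal) {
            node = ((Internal) node).children[0];
        }
        List<Integer> result = new ArrayList<>();
        for (Leaf leaf = (Leaf) node; leaf != null; leaf = leaf.next) {
            for (int key : leaf.keys) {
                result.add(key);
            }
        }
        return result;
    }

    /** Builds a two-level tree by hand: separators [5, 9] over three linked leaves. */
    static Node sampleTree() {
        Leaf a = new Leaf(new int[]{1, 3}, new String[]{"v1", "v3"});
        Leaf b = new Leaf(new int[]{5, 7}, new String[]{"v5", "v7"});
        Leaf c = new Leaf(new int[]{9, 11}, new String[]{"v9", "v11"});
        a.next = b;
        b.next = c;
        return new Internal(new int[]{5, 9}, new Node[]{a, b, c});
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        System.out.println(search(root, 7));
        System.out.println(search(root, 4));
        System.out.println(scanAllKeys(root));
    }
}
```

In a disk-backed index each node would correspond to roughly one page, so the number of loop iterations in `search` approximates the number of page reads.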

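Hash tables

Hash tables address objects that are hard to sort or to search by comparison, such as images and videos.
Java's HashMap uses one.
It is an array of linked lists.
- Apply the hash function to the key to get its hash value, and use that to find the corresponding linked list.
- What remains is the problem of hash collisions.
- Collisions can be resolved by comparing the entries one by one after reaching the linked list.
- A note on consistent hashing while we are here: pre-place many nodes and store each item on the nearest one. This reduces the data that has to move when a node is added.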
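
The array-of-linked-lists layout and the sequential comparison on collision can be sketched as follows. This is a simplified illustration, not how `java.util.HashMap` is actually implemented; there is no resizing, and `ChainedHashTable` with its `put` and `get` methods is a hypothetical class.

```java
import java.util.LinkedList;

// Hypothetical fixed-capacity hash table that resolves collisions by chaining.
class ChainedHashTable {
    private static final class Entry {
        final String key;
        int value;
        Entry(String key, int value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedHashTable(int capacity) {
        buckets = new LinkedList[capacity];
    }

    /** Maps the key's hash value to a bucket index; floorMod keeps it non-negative. */
    private int indexFor(String key) {
        return Math.floorMod(key.hashCode(), buckets.length);
    }

    void put(String key, int value) {
        int index = indexFor(key);
        if (buckets[index] == null) {
            buckets[index] = new LinkedList<>();
        }
        // Collision handling: compare the entries in the chain one by one
        for (Entry entry : buckets[index]) {
            if (entry.key.equals(key)) {
                entry.value = value; // existing key: overwrite
                return;
            }
        }
        buckets[index].add(new Entry(key, value));
    }

    /** Returns the stored value, or null if the key is absent. */
    Integer get(String key) {
        LinkedList<Entry> chain = buckets[indexFor(key)];
        if (chain == null) {
            return null;
        }
        for (Entry entry : chain) {
            if (entry.key.equals(key)) {
                return entry.value;
            }
        }
        return null;
    }
}
```

Creating the table with a capacity of 1 forces every key into the same chain, which is a quick way to watch the sequential comparison at work.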
