Learn Lucene with Me Step by Step (7): Lucene Search, Building an IndexSearcher

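I have been writing the Learn Lucene with Me Step by Step series (http://blog.csdn.net/wuyinggui10000/article/category/3173543) lately, and the blog has received a lot of visits. Thank you all for following along; it is real encouragement for me, O(∩_∩)O haha~. I feel I have learned a great deal while writing these posts, so I will keep going, and I will write more series like this one as my work goes on, as a way of building up my own knowledge.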

IndexSearcher

Building a search engine has two major parts: indexing content and querying the index. This post covers how Lucene's index searcher, IndexSearcher, is built.

First, some background on IndexSearcher:

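IndexSearcher implements search over a single IndexReader.
To query the index, call search(Query, n) or search(Query, Filter, n).
When the index rarely changes, sharing a single IndexSearcher across all searches improves performance.
If the index has changed, call DirectoryReader.openIfChanged(DirectoryReader) to obtain a new reader, then create a new IndexSearcher from it.
For low query latency, near-real-time search is the better choice; in that case build the DirectoryReader with DirectoryReader.open(IndexWriter, boolean).
IndexSearcher instances are completely thread safe, so multiple threads can call any of their methods concurrently. If your application needs external synchronization, do not synchronize on the IndexSearcher instance; use your own objects instead.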
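The share-then-swap pattern described above can be sketched without Lucene on the classpath. This is a minimal illustration, not Lucene's API: the class `SharedHolderSketch` and its `reopenIfChanged` function are hypothetical names. The function follows the same contract as DirectoryReader.openIfChanged, returning null when nothing changed and a new instance otherwise. In real code the held object would be an IndexSearcher, and the old reader would have to be closed once no search is still using it.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** Hypothetical helper: shares one instance and swaps it only when a reopen reports a change. */
class SharedHolderSketch<T> {
    private final AtomicReference<T> current;
    // Assumed contract: returns null if nothing changed, else a NEW instance.
    private final UnaryOperator<T> reopenIfChanged;

    SharedHolderSketch(T initial, UnaryOperator<T> reopenIfChanged) {
        this.current = new AtomicReference<>(initial);
        this.reopenIfChanged = reopenIfChanged;
    }

    /** Every query thread uses whatever instance is current. */
    T get() {
        return current.get();
    }

    /** Tries to reopen; returns true only if a new instance replaced the old one. */
    synchronized boolean maybeRefresh() {
        T fresh = reopenIfChanged.apply(current.get());
        if (fresh == null) {
            return false; // unchanged: keep sharing the existing instance
        }
        current.set(fresh);
        return true;
    }

    public static void main(String[] args) {
        int[] indexVersion = {1}; // stands in for the state of the index on disk
        SharedHolderSketch<String> holder = new SharedHolderSketch<>(
                "searcher-v1",
                old -> old.equals("searcher-v" + indexVersion[0])
                        ? null
                        : "searcher-v" + indexVersion[0]);
        System.out.println(holder.maybeRefresh() + " " + holder.get());
        indexVersion[0] = 2; // simulate a commit that changes the index
        System.out.println(holder.maybeRefresh() + " " + holder.get());
    }
}
```

The point of the null check is that an unchanged index costs nothing: callers keep the searcher they already share, and a new one is built only when there is something new to see.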

How an IndexSearcher is created

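First, create an FSDirectory instance from the index path. The concrete class returned depends on the operating system and runtime: on Linux, MacOSX, Solaris, and Windows 64-bit JREs it is an MMapDirectory; on other non-Windows JREs it is an NIOFSDirectory; and on other Windows JREs it is a SimpleFSDirectory. Their performance decreases in that order.
Next, DirectoryReader reads the index files through that FSDirectory and returns a DirectoryReader object. Internally, DirectoryReader.open reads the segments file in the index directory, iterates over the SegmentInfos in reverse order to fill an array of SegmentReader (one implementation of IndexReader), and builds a StandardDirectoryReader instance from it.
With an IndexReader in hand, creating the IndexSearcher is straightforward: new IndexSearcher(DirectoryReader) returns an instance. To make searches faster, use new IndexSearcher(DirectoryReader, ExecutorService) instead, which divides the search across segments on the executor. Note that IndexSearcher does not manage the ExecutorService's lifecycle; we still have to call shutdown/awaitTermination on it ourselves.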
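To make the executor's role concrete, here is a small sketch that uses only the JDK. It is a simulation under stated assumptions, not Lucene code: `SegmentFanOutSketch` and `countMatches` are hypothetical names, and each "segment" is just a list of strings. It shows the two ideas from the last step: the work is split per segment, and the component that borrows the pool never shuts it down, so the caller does that after the last search.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a searcher that fans one query out over index segments. */
class SegmentFanOutSketch {

    /** Counts documents containing term, one task per segment; the pool is only borrowed. */
    static int countMatches(List<List<String>> segments, String term, ExecutorService pool) {
        List<Future<Integer>> partials = new ArrayList<>();
        for (List<String> segment : segments) {
            partials.add(pool.submit(() -> {
                int hits = 0;
                for (String doc : segment) {
                    if (doc.contains(term)) {
                        hits++;
                    }
                }
                return hits;
            }));
        }
        int total = 0;
        try {
            // merge the per-segment results into one answer
            for (Future<Integer> partial : partials) {
                total += partial.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while searching", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("segment search failed", e);
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        List<List<String>> segments = Arrays.asList(
                Arrays.asList("lucene search", "index writer"),
                Arrays.asList("lucene index", "mysql"));
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // the same pool serves many searches
            System.out.println(countMatches(segments, "lucene", pool));
            System.out.println(countMatches(segments, "index", pool));
        } finally {
            // the caller owns the pool, so the caller shuts it down after the last search
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        }
    }
}
```

Shutting the pool down inside `countMatches` would make the second call fail with a rejected task, which is why the lifecycle stays with the caller.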

In practice

Below is a search utility class written around the IndexSearcher construction process and the characteristics described above.

package com.lucene.search;

import java.io.File;
import java.io.IOException;
import java.nio.file.Paths;
import java.util.concurrent.ExecutorService;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.SortField.Type;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.TopFieldCollector;
import org.apache.lucene.store.FSDirectory;

import com.lucene.index.IndexUtil;

public class SearchUtil {
	public static final Analyzer analyzer = new StandardAnalyzer();
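	/** Gets an IndexSearcher (for searching a single index directory).
	 * @param indexPath index directory
	 * @param service executor for per-segment search, may be null; the caller shuts it down once the searcher is no longer used
	 * @param realtime passed to DirectoryReader.open(IndexWriter, boolean) as its applyAllDeletes argument
	 * @return
	 * @throws IOException
	 * @throws InterruptedException 
	 */
	public static IndexSearcher getIndexSearcher(String indexPath,ExecutorService service,boolean realtime) throws IOException, InterruptedException{
		DirectoryReader reader = DirectoryReader.open(IndexUtil.getIndexWriter(indexPath, true), realtime);
		// Do not shut the executor down here: the returned searcher still needs it to run searches.
		IndexSearcher searcher = new IndexSearcher(reader,service);
		return searcher;
	}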
	
	/** Multi-directory, multi-threaded search.
	 * @param parentPath parent directory that contains the index directories
	 * @param service executor for multi-threaded search, may be null; the caller shuts it down once the searcher is no longer used
	 * @param realtime if true, open each reader from its IndexWriter (near-real-time); otherwise open it from the directory
	 * @return
	 * @throws IOException
	 * @throws InterruptedException 
	 */
	public static IndexSearcher getMultiSearcher(String parentPath,ExecutorService service,boolean realtime) throws IOException, InterruptedException{
		MultiReader multiReader;
		File file = new File(parentPath);
		File[] files = file.listFiles();
		IndexReader[] readers = new IndexReader[files.length];
		if(!realtime){
			for (int i = 0 ; i < files.length ; i ++) {
				readers[i] = DirectoryReader.open(FSDirectory.open(Paths.get(files[i].getPath(), new String[0])));
			}
		}else{
			for (int i = 0 ; i < files.length ; i ++) {
				readers[i] = DirectoryReader.open(IndexUtil.getIndexWriter(files[i].getPath(), true), true);
			}
		}
	
		multiReader = new MultiReader(readers);
		// Do not shut the executor down here: the returned searcher still needs it to run searches.
		IndexSearcher searcher = new IndexSearcher(multiReader,service);
		return searcher;
	}
	
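	/** Builds a query from the given settings.
	 * @param field field name
	 * @param fieldType field type: "int", "double", "float" or "long"
	 * @param queryStr query value; for a range query, "min|max"
	 * @param range whether this is a range query
	 * @return
	 */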
	public static Query getQuery(String field,String fieldType,String queryStr,boolean range){
		Query q = null;
		if(queryStr != null && !"".equals(queryStr)){
			if(range){
				String[] strs = queryStr.split("\\|");
				if("int".equals(fieldType)){
					int min = new Integer(strs[0]);
					int max = new Integer(strs[1]);
					q = NumericRangeQuery.newIntRange(field, min, max, true, true);
				}else if("double".equals(fieldType)){
					Double min = new Double(strs[0]);
					Double max = new Double(strs[1]);
					q = NumericRangeQuery.newDoubleRange(field, min, max, true, true);
				}else if("float".equals(fieldType)){
					Float min = new Float(strs[0]);
					Float max = new Float(strs[1]);
					q = NumericRangeQuery.newFloatRange(field, min, max, true, true);
				}else if("long".equals(fieldType)){
					Long min = new Long(strs[0]);
					Long max = new Long(strs[1]);
					q = NumericRangeQuery.newLongRange(field, min, max, true, true);
				}
			}else{
				if("int".equals(fieldType)){
					q = NumericRangeQuery.newIntRange(field, new Integer(queryStr), new Integer(queryStr), true, true);
				}else if("double".equals(fieldType)){
					q = NumericRangeQuery.newDoubleRange(field, new Double(queryStr), new Double(queryStr), true, true);
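				}else if("float".equals(fieldType)){
					q = NumericRangeQuery.newFloatRange(field, new Float(queryStr), new Float(queryStr), true, true);
				}else if("long".equals(fieldType)){
					// exact match on a long field, mirroring the "long" case of the range branch
					q = NumericRangeQuery.newLongRange(field, new Long(queryStr), new Long(queryStr), true, true);
				}else{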
					Term term = new Term(field, queryStr);
					q = new TermQuery(term);
				}
			}
		}else{
			q= new MatchAllDocsQuery();
		}
		
		System.out.println(q);
		return q;
	}
	/** Combines several queries, similar to SQL IN (a document may match any of them).
	 * @param querys
	 * @return
	 */
	public static Query getMultiQueryLikeSqlIn(Query ... querys){
		BooleanQuery query = new BooleanQuery();
		for (Query subQuery : querys) {
			query.add(subQuery,Occur.SHOULD);
		}
		return query;
	}
	
	/** Combines several queries, similar to SQL AND (a document must match all of them).
	 * @param querys
	 * @return
	 */
	public static Query getMultiQueryLikeSqlAnd(Query ... querys){
		BooleanQuery query = new BooleanQuery();
		for (Query subQuery : querys) {
			query.add(subQuery,Occur.MUST);
		}
		return query;
	}
	/** Builds a Sort from several sort conditions.
	 * @param fields
	 * @param types
	 * @param reverses
	 * @return the Sort, or null when the three arrays differ in length
	 */
	public static Sort getSortInfo(String[] fields,Type[] types,boolean[] reverses){
		SortField[] sortFields = null;
		int fieldLength = fields.length;
		int typeLength = types.length;
		int reverLength = reverses.length;
		if(!(fieldLength == typeLength) || !(fieldLength == reverLength)){
			return null;
		}else{
			sortFields = new SortField[fields.length];
			for (int i = 0; i < fields.length; i++) {
				sortFields[i] = new SortField(fields[i], types[i], reverses[i]);
			}
		}
		return new Sort(sortFields);
	}
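	/** Runs a paged, sorted search with the given searcher.
	 * @param searcher searcher to use
	 * @param query query condition
	 * @param first offset of the first hit to return
	 * @param max maximum number of hits to return
	 * @param sort sort condition
	 * @return
	 */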
	public static TopDocs getScoreDocsByPerPageAndSortField(IndexSearcher searcher,Query query, int first,int max, Sort sort){
		try {
			if(query == null){
				System.out.println(" Query is null return null ");
				return null;
			}
			TopFieldCollector collector = null;
			if(sort != null){
				collector = TopFieldCollector.create(sort, first+max, false, false, false);
			}else{
				sort = new Sort(new SortField[]{new SortField("modified", SortField.Type.LONG)});
				collector = TopFieldCollector.create(sort, first+max, false, false, false);
			}
			searcher.search(query, collector);
			return collector.topDocs(first, max);
		} catch (IOException e) {
			// report the failure instead of swallowing it silently; the method then returns null
			e.printStackTrace();
		}
		return null;
	}
	
	/** Gets the id of the last indexed item; used for incremental updates.
	 * @return
	 */
	public static Integer getLastIndexBeanID(IndexReader multiReader){
		Query query = new MatchAllDocsQuery();
		IndexSearcher searcher = null;
		searcher = new IndexSearcher(multiReader);
		SortField sortField = new SortField("id", SortField.Type.INT,true);
		Sort sort = new Sort(new SortField[]{sortField});
		TopDocs docs = getScoreDocsByPerPageAndSortField(searcher,query, 0, 1, sort);
		if(docs == null){
			// the search failed, so there is no last id to report
			return 0;
		}
		ScoreDoc[] scoreDocs = docs.scoreDocs;
		int total = scoreDocs.length;
		if(total > 0){
			ScoreDoc scoreDoc = scoreDocs[0];
			Document doc = null;
			try {
				doc = searcher.doc(scoreDoc.doc);
			} catch (IOException e) {
				e.printStackTrace();
				// without the stored document there is no id to read
				return 0;
			}
			return Integer.valueOf(doc.get("id"));
		}
		return 0;
	}
}
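The paging method in SearchUtil creates its collector with first+max and then calls topDocs(first, max). The reason is that the hits in front of the requested page have to be ranked as well before they can be skipped. The sketch below shows the same arithmetic on a plain list of scores. It uses only the JDK, and `PagingSketch` and `page` are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of "collect first+max, then slice from first". */
class PagingSketch {

    /** Returns up to max scores starting at offset first, in descending order. */
    static List<Integer> page(List<Integer> scores, int first, int max) {
        List<Integer> sorted = new ArrayList<>(scores);
        sorted.sort(Collections.reverseOrder());
        // keep the best (first + max) entries: everything before `first` must be ranked too
        int collected = Math.min(first + max, sorted.size());
        if (first >= collected) {
            return new ArrayList<>(); // the requested page lies past the last hit
        }
        // drop the leading `first` entries and return the rest as the page
        return new ArrayList<>(sorted.subList(first, collected));
    }

    public static void main(String[] args) {
        List<Integer> scores = Arrays.asList(5, 9, 1, 7, 3);
        System.out.println(page(scores, 0, 2)); // first page
        System.out.println(page(scores, 2, 2)); // second page
        System.out.println(page(scores, 4, 2)); // last, partial page
    }
}
```

One consequence of this approach is that deep pages get more expensive, since the collector has to hold first+max entries even though only max of them are returned.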

That covers building an IndexSearcher for Lucene search.
