Solr 6.0 Study Notes (13): Solr Caching

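Solr is a full-text search server built on Lucene. All of its index data is stored as files on disk, so once the data grows large enough, disk I/O starts to limit search performance. Caching is therefore an unavoidable part of tuning. Stores such as Redis, MongoDB, and Memcached are well known and often used for caching, but they are outside the scope of this article, which looks only at the caching that Solr itself already implements.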

1. HTTP caching (httpCaching)

  • XML configuration

    The relevant configuration in solrconfig.xml looks like this:

<httpCaching never304="true" />
    <!-- If you include a <cacheControl> directive, it will be used to
         generate a Cache-Control header (as well as an Expires header
         if the value contains "max-age=")

         By default, no Cache-Control header is generated.

         You can use the <cacheControl> option even if you have set
         never304="true"
      -->
    <!--
       <httpCaching never304="true" >
         <cacheControl>max-age=30, public</cacheControl> 
       </httpCaching>
      -->
    <!-- To enable Solr to respond with automatically generated HTTP
         Caching headers, and to respond to Cache Validation requests
         correctly, set the value of never304="false"
      -->

To enable HTTP caching, configure it as follows:

<httpCaching never304="false" >
         <cacheControl>max-age=30, public</cacheControl> 
       </httpCaching>

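max-age: how long the response may be cached, in seconds
public: the response may be stored by any cache, including shared caches such as proxies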
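The comment in solrconfig.xml notes that an Expires header is also generated when the value contains "max-age=". A minimal sketch of that relationship follows. The class and method names are hypothetical and not part of Solr; the sketch only assumes that Expires is the current time plus max-age.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper (not part of Solr): derives an Expires time from a Cache-Control value. */
final class CacheControlSketch {
    private static final Pattern MAX_AGE = Pattern.compile("\\bmax-age=(\\d+)");

    /** Returns the max-age in seconds, or empty if the directive is absent. */
    static OptionalLong maxAgeSeconds(String cacheControl) {
        Matcher m = MAX_AGE.matcher(cacheControl);
        return m.find() ? OptionalLong.of(Long.parseLong(m.group(1))) : OptionalLong.empty();
    }

    /** Expires = now + max-age, in epoch milliseconds; -1 when no max-age is present. */
    static long expiresAtMillis(String cacheControl, long nowMillis) {
        OptionalLong maxAge = maxAgeSeconds(cacheControl);
        return maxAge.isPresent() ? nowMillis + maxAge.getAsLong() * 1000L : -1L;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        // Milliseconds until the response expires for the value used in this article.
        System.out.println(expiresAtMillis("max-age=30, public", now) - now);
    }
}
```

With the value "max-age=30, public" a client or proxy may reuse the response for 30 seconds without contacting Solr again.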

  • Solr internals
    An excerpt from the HttpSolrCall source:
HttpCacheHeaderUtil.setCacheControlHeader(this.config, resp, reqMethod);
// Execute the query when never304 is set, or when cache validation did not already answer the request
        if ((this.config.getHttpCachingConfig().isNever304()) || 
          (!HttpCacheHeaderUtil.doCacheHeaderValidation(this.solrReq, this.req, reqMethod, resp)))
        {
          solrRsp = new SolrQueryResponse();

          SolrRequestInfo.setRequestInfo(new SolrRequestInfo(this.solrReq, solrRsp));
          execute(solrRsp);
          HttpCacheHeaderUtil.checkHttpCachingVeto(solrRsp, resp, reqMethod);
          Iterator headers = solrRsp.httpHeaders();
          while (headers.hasNext()) {
            Map.Entry entry = (Map.Entry)headers.next();
            resp.addHeader((String)entry.getKey(), (String)entry.getValue());
          }
          QueryResponseWriter responseWriter = this.core.getQueryResponseWriter(this.solrReq);
          if (this.invalidStates != null) this.solrReq.getContext().put("_stateVer_", this.invalidStates);
          writeResponse(solrRsp, responseWriter, reqMethod);
        }

Next, look at the HttpCacheHeaderUtil helper class:

    // Set the cache-related response headers
  public static void setCacheControlHeader(SolrConfig conf, HttpServletResponse resp, Method method)
  {
    if ((Method.POST == method) || (Method.OTHER == method)) {
      return;
    }
    // Read the value configured in solrconfig.xml
    String cc = conf.getHttpCachingConfig().getCacheControlHeader();
    if (null != cc) {
      resp.setHeader("Cache-Control", cc);
    }
    Long maxAge = conf.getHttpCachingConfig().getMaxAge();
    if (null != maxAge)
      resp.setDateHeader("Expires", timeNowForHeader() + maxAge.longValue() * 1000L);
  }

The code to focus on is the following:

HttpCacheHeaderUtil.doCacheHeaderValidation(this.solrReq, this.req, reqMethod, resp)

public static boolean doCacheHeaderValidation(SolrQueryRequest solrReq, HttpServletRequest req, Method reqMethod, HttpServletResponse resp)
  {
    if ((Method.POST == reqMethod) || (Method.OTHER == reqMethod)) {
      return false;
    }

    long lastMod = calcLastModified(solrReq);
    String etag = calcEtag(solrReq);

    resp.setDateHeader("Last-Modified", lastMod);
    resp.setHeader("ETag", etag);

    if (checkETagValidators(req, resp, reqMethod, etag)) {
      return true;
    }

    if (checkLastModValidators(req, resp, lastMod)) {
      return true;
    }

    return false;
  }

  public static boolean checkETagValidators(HttpServletRequest req, HttpServletResponse resp, Method reqMethod, String etag)
  {
    List ifNoneMatchList = Collections.list(req
      .getHeaders("If-None-Match"));

    if ((ifNoneMatchList.size() > 0) && (isMatchingEtag(ifNoneMatchList, etag))) {
      if ((reqMethod == Method.GET) || (reqMethod == Method.HEAD))
        sendNotModified(resp);
      else {
        sendPreconditionFailed(resp);
      }
      return true;
    }

    List ifMatchList = Collections.list(req
      .getHeaders("If-Match"));

    if ((ifMatchList.size() > 0) && (!isMatchingEtag(ifMatchList, etag))) {
      sendPreconditionFailed(resp);
      return true;
    }

    return false;
  }

  public static boolean checkLastModValidators(HttpServletRequest req, HttpServletResponse resp, long lastMod)
  {
    try
    {
      long modifiedSince = req.getDateHeader("If-Modified-Since");
      if ((modifiedSince != -1L) && (lastMod <= modifiedSince))
      {
        sendNotModified(resp);
        return true;
      }

      long unmodifiedSince = req.getDateHeader("If-Unmodified-Since");
      if ((unmodifiedSince != -1L) && (lastMod > unmodifiedSince))
      {
        sendPreconditionFailed(resp);
        return true;
      }
    }
    catch (IllegalArgumentException localIllegalArgumentException)
    {
      // A date header could not be parsed; ignore it and treat the request as unconditional
    }
    return false;
  }

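These methods check the conditional headers of the incoming search request: mainly If-None-Match and If-Modified-Since, plus If-Match and If-Unmodified-Since. When a validator matches, Solr answers with 304 Not Modified (or 412 Precondition Failed) and the query is not executed.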
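The validator ordering shown in the source above can be sketched without the servlet API. This is a simplified illustration for GET requests only: the class name is hypothetical, ETag comparison is reduced to an exact match or "*", and the If-Match / If-Unmodified-Since (412) paths are left out.

```java
import java.util.List;

/** Illustrative only: mirrors the validator ordering for GET requests, without the servlet API. */
final class ConditionalRequestSketch {
    static final int OK = 200;
    static final int NOT_MODIFIED = 304;

    /**
     * @param etag            current ETag of the index-dependent response
     * @param lastMod         current Last-Modified time in epoch milliseconds
     * @param ifNoneMatch     values of the If-None-Match header (empty if absent)
     * @param ifModifiedSince value of If-Modified-Since in epoch milliseconds, or -1 if absent
     */
    static int validate(String etag, long lastMod, List<String> ifNoneMatch, long ifModifiedSince) {
        // ETag validators are checked first.
        if (!ifNoneMatch.isEmpty() && (ifNoneMatch.contains("*") || ifNoneMatch.contains(etag))) {
            return NOT_MODIFIED;
        }
        // Then the Last-Modified validator.
        if (ifModifiedSince != -1L && lastMod <= ifModifiedSince) {
            return NOT_MODIFIED;
        }
        // No validator matched: the query has to be executed.
        return OK;
    }
}
```

A client that replays the ETag it received earlier gets 304 as long as the index has not changed, so Solr skips query execution and sends no response body.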

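2. Other caches

  • filterCache

    Filter cache: caches the results of filter queries (the fq request parameter) as unordered document sets. It is also used by enum-based faceting (facet.method=enum).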

<!-- Filter Cache

         Cache used by SolrIndexSearcher for filters (DocSets),
         unordered sets of *all* documents that match a query.  When a
         new searcher is opened, its caches may be prepopulated or
         "autowarmed" using data from caches in the old searcher.
         autowarmCount is the number of items to prepopulate.  For
         LRUCache, the autowarmed items will be the most recently
         accessed items.

         Parameters:
           class - the SolrCache implementation
               (LRUCache or FastLRUCache)
           size - the maximum number of entries in the cache
           initialSize - the initial capacity (number of entries) of
               the cache.  (see java.util.HashMap)
           autowarmCount - the number of entries to prepopulate from
               an old cache.  
      -->
    <filterCache class="solr.FastLRUCache"
                 size="512"
                 initialSize="512"
                 autowarmCount="0"/>
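To illustrate why caching per-filter document sets helps, here is a minimal sketch in which each fq string maps to a cached set of document ids and the sets are intersected per request. It is not Solr's implementation: the class, the evaluator function, and the miss counter are all hypothetical, and BitSet stands in for a DocSet.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative only: a per-filter cache of document-id sets, not Solr's implementation. */
final class FilterCacheSketch {
    private final Map<String, BitSet> cache = new HashMap<>();
    // Hypothetical callback that computes the matching document ids for one filter query.
    private final Function<String, BitSet> evaluator;
    int misses = 0;

    FilterCacheSketch(Function<String, BitSet> evaluator) {
        this.evaluator = evaluator;
    }

    /** Returns the ids matching every filter, reusing the cached set of each individual filter. */
    BitSet matchAll(int maxDoc, String... filters) {
        BitSet result = new BitSet(maxDoc);
        result.set(0, maxDoc); // start with all documents
        for (String fq : filters) {
            BitSet docs = cache.get(fq);
            if (docs == null) {
                misses++;
                docs = evaluator.apply(fq);
                cache.put(fq, docs);
            }
            result.and(docs); // modifies only the result, never the cached set
        }
        return result;
    }
}
```

Because each filter is cached on its own, a filter that appears in many different requests is evaluated once and then reused in every combination.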
  • queryResultCache

    Query result cache: caches the ordered list of document ids returned for a query.

<!-- Query Result Cache

         Caches results of searches - ordered lists of document ids
         (DocList) based on a query, a sort, and the range of documents requested.  
      -->
    <queryResultCache class="solr.LRUCache"
                     size="512"
                     initialSize="512"
                     autowarmCount="0"/>
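  • documentCache

Document cache: caches Lucene Document objects, that is, the stored fields of each document.
Note: Lucene's internal document ids are transient, so this cache is not autowarmed.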

<!-- Document Cache

         Caches Lucene Document objects (the stored fields for each
         document).  Since Lucene internal document ids are transient,
         this cache will not be autowarmed.  
      -->
    <documentCache class="solr.LRUCache"
                   size="512"
                   initialSize="512"
                   autowarmCount="0"/>

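Configuration parameters:
1. class: which SolrCache implementation to use.
As the three configurations above show, there are two main implementations: solr.LRUCache and solr.FastLRUCache.

LRUCache: based on a synchronized (thread-safe) LinkedHashMap.

FastLRUCache: based on a ConcurrentHashMap.

2. size: the maximum number of entries the cache may hold.

3. initialSize: the initial capacity of the cache, in entries.

4. autowarmCount: the number of entries to prepopulate from the old cache when a new searcher is opened.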
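The LinkedHashMap approach mentioned for LRUCache can be sketched as follows. This is a minimal illustration of the general technique, not Solr's code; the class name is hypothetical, and the two arguments play the roles of initialSize and size.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: a small LRU cache built on an access-ordered LinkedHashMap. */
final class LruSketch {
    /** Builds a synchronized map that evicts the least recently used entry once maxSize is exceeded. */
    static <K, V> Map<K, V> create(int initialSize, int maxSize) {
        // accessOrder=true moves an entry to the end of the iteration order on every get/put.
        return Collections.synchronizedMap(new LinkedHashMap<K, V>(initialSize, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        });
    }
}
```

Every operation on this map takes a single lock, which is the usual trade-off against a ConcurrentHashMap-based design that allows concurrent reads.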

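  • queryResultWindowSize

queryResultWindowSize: used together with queryResultCache.
Put simply, it helps paged queries. With a value of 50, a request for results 10 through 19 makes Solr collect and cache results 0 through 49, so later pages within that range are served directly from the cache.
The configuration looks like this (the value here is 20):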

<!-- Result Window Size

        An optimization for use with the queryResultCache.  When a search
        is requested, a superset of the requested number of document ids
        are collected.  For example, if a search for a particular query
        requests matching documents 10 through 19, and queryResultWindowSize is 50,
        then documents 0 through 49 will be collected and cached.  Any further
        requests in that range can be satisfied via the cache.  
     -->
   <queryResultWindowSize>20</queryResultWindowSize>
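The arithmetic behind the window can be sketched as below. The sketch assumes that the upper bound of the request is rounded up to the next multiple of the window size, which matches the example in the comment above; the class and method names are hypothetical, and the exact rule inside Solr may differ.

```java
/** Illustrative only: how many leading document ids would be collected for one page request. */
final class ResultWindowSketch {
    /**
     * @param start      offset of the first requested document
     * @param rows       number of requested documents
     * @param windowSize value of queryResultWindowSize (assumed to be positive)
     */
    static int supersetSize(int start, int rows, int windowSize) {
        int maxDocRequested = start + rows;
        if (maxDocRequested <= windowSize) {
            return windowSize;
        }
        // Round up to the next multiple of the window size.
        return ((maxDocRequested - 1) / windowSize + 1) * windowSize;
    }
}
```

For the example in the comment, start=10 and rows=10 with a window of 50 collects the first 50 ids (documents 0 through 49).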

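How to size these caches in practice depends on query frequency and on the number of entries worth caching, and the settings should be confirmed by observing the caches under real traffic.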
