ZooKeeper Server Startup Process and Server Roles

I have been reading about ZooKeeper lately and want to record what I learn. This post is mainly a high-level overview of ZooKeeper.

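I. What is ZooKeeper:

ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. It coordinates multiple tasks in a distributed system. The main difficulties in developing distributed systems are message delays, processor speed, and clock drift; the latter two indirectly cause the first. When a network error occurs, it is hard to tell whether the request timed out on the network or the remote system crashed. ZooKeeper provides a powerful API to address these problems. Its main features are:

Strong consistency, ordering, and durability guarantees.
The ability to implement common synchronization primitives.
A simple mechanism for handling concurrency.

On the server side, ZooKeeper consists of four main parts:

ZKDatabase: a data structure similar to a file system; ZooKeeper stores its data in this object model.
ServerCnxn: represents a client's connection on the server; it receives client requests and forwards them to the appropriate server.
ZooKeeperServer: the role server. There are three roles: Leader, Follower, and Observer. Depending on its current role, ZooKeeper hands requests to the corresponding role server.
RequestProcessor: the request processor, which handles each request.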

II. ZooKeeper server startup process:

1. Initialize the config

The zoo.cfg file passed in as an argument is parsed, and a configuration object is built from it.
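To make this step concrete, here is a minimal sketch of turning zoo.cfg-style text into a config object. It is not ZooKeeper's QuorumPeerConfig: the class ZooCfgSketch, its fields, and the fallback tickTime value are assumptions made for illustration, and only a handful of keys are handled.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical holder class for illustration; this is not ZooKeeper's QuorumPeerConfig.
class ZooCfgSketch {
    final int tickTime;
    final int clientPort;
    final String dataDir;
    // server id -> raw "host:port:port" value taken from the server.N lines
    final Map<Long, String> servers = new TreeMap<>();

    private ZooCfgSketch(Properties props) {
        // Fallback value chosen for this sketch when tickTime is absent
        tickTime = Integer.parseInt(props.getProperty("tickTime", "3000"));
        // clientPort is treated as mandatory here; a missing key fails fast
        clientPort = Integer.parseInt(props.getProperty("clientPort"));
        dataDir = props.getProperty("dataDir");
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith("server.")) {
                long sid = Long.parseLong(key.substring("server.".length()));
                servers.put(sid, props.getProperty(key).trim());
            }
        }
    }

    // Parses key=value text in java.util.Properties format.
    static ZooCfgSketch parse(String cfgText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(cfgText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new ZooCfgSketch(props);
    }
}
```

Calling ZooCfgSketch.parse on the text of a config file gives an object whose servers map plays the role that config.getServers() plays in runFromConfig below.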

public void runFromConfig(QuorumPeerConfig config) throws IOException {
      try {
          ManagedUtil.registerLog4jMBeans();
      } catch (JMException e) {
          LOG.warn("Unable to register log4j JMX control", e);
      }
  
      LOG.info("Starting quorum peer");
      try {
          ServerCnxnFactory cnxnFactory = ServerCnxnFactory.createFactory();
          cnxnFactory.configure(config.getClientPortAddress(),
                                config.getMaxClientCnxns());
          
          // Create the quorum peer
          quorumPeer = new QuorumPeer();
          // Set the client connection address
          quorumPeer.setClientPortAddress(config.getClientPortAddress());
          // Set the transaction log directory and the snapshot directory
          quorumPeer.setTxnFactory(new FileTxnSnapLog(
                      new File(config.getDataLogDir()),
                      new File(config.getDataDir())));
          // Set the servers that make up the ensemble
          quorumPeer.setQuorumPeers(config.getServers());
          // Set the election algorithm
          quorumPeer.setElectionType(config.getElectionAlg());
          // Set this server's id
          quorumPeer.setMyid(config.getServerId());
          // Set the tick time (the basic time unit)
          quorumPeer.setTickTime(config.getTickTime());
          // Set the minimum session timeout
          quorumPeer.setMinSessionTimeout(config.getMinSessionTimeout());
          // Set the maximum session timeout
          quorumPeer.setMaxSessionTimeout(config.getMaxSessionTimeout());
          // Set the initialization limit, measured in ticks
          quorumPeer.setInitLimit(config.getInitLimit());
          // Set the limit for a request/response sync round trip, measured in ticks
          quorumPeer.setSyncLimit(config.getSyncLimit());
          quorumPeer.setQuorumVerifier(config.getQuorumVerifier());
          quorumPeer.setCnxnFactory(cnxnFactory);
          quorumPeer.setZKDatabase(new ZKDatabase(quorumPeer.getTxnFactory()));
          quorumPeer.setLearnerType(config.getPeerType());
  
          quorumPeer.start();
          quorumPeer.join();
      } catch (InterruptedException e) {
          // warn, but generally this is ok
          LOG.warn("Quorum Peer interrupted", e);
      }
    }

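2. Load the ZKDatabase data

The ZK server first loads data from the snapshot file.
It then corrects the snapshot data using the transaction log (ZooKeeper takes the last transaction processed when the snapshot began and replays the later transactions from the transaction log files).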
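The two steps above can be sketched in a few lines. This is a simplified model, not ZooKeeper's code: the tree is a plain Map, and the Txn class and the restore signature are hypothetical names invented for the example. The snapshot is assumed to be already loaded into the map, with snapshotZxid being the last transaction it reflects.

```java
import java.util.List;
import java.util.Map;

class RestoreSketch {
    // Hypothetical transaction: set `path` to `value` at transaction id `zxid`.
    static final class Txn {
        final long zxid;
        final String path;
        final String value;

        Txn(long zxid, String path, String value) {
            this.zxid = zxid;
            this.path = path;
            this.value = value;
        }
    }

    // Replays onto `tree` every logged transaction newer than the snapshot
    // and returns the highest zxid now reflected in the tree.
    static long restore(Map<String, String> tree, long snapshotZxid, List<Txn> log) {
        long highestZxid = snapshotZxid;
        for (Txn txn : log) {
            if (txn.zxid <= snapshotZxid) {
                continue; // already reflected in the snapshot
            }
            tree.put(txn.path, txn.value);
            highestZxid = Math.max(highestZxid, txn.zxid);
        }
        return highestZxid;
    }
}
```

The returned value corresponds to what the real restore method below returns as highestZxid.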

public long restore(DataTree dt, Map<Long, Integer> sessions, 
            PlayBackListener listener) throws IOException {
        // Deserialize the snapshot contents into the ZKDatabase
        snapLog.deserialize(dt, sessions);
        FileTxnLog txnLog = new FileTxnLog(dataDir);
        // Read from the transaction log every transaction after the last zxid covered by the snapshot
        TxnIterator itr = txnLog.read(dt.lastProcessedZxid+1);
        long highestZxid = dt.lastProcessedZxid;
        TxnHeader hdr;
        while (true) {
            // iterator points to 
            // the first valid txn when initialized
            hdr = itr.getHeader();
            if (hdr == null) {
                //empty logs 
                return dt.lastProcessedZxid;
            }
            if (hdr.getZxid() < highestZxid && highestZxid != 0) {
                LOG.error(highestZxid + "(higestZxid) > "
                        + hdr.getZxid() + "(next log) for type "
                        + hdr.getType());
            } else {
                highestZxid = hdr.getZxid();
            }
            try {
                // Replay the transaction
                processTransaction(hdr,dt,sessions, itr.getTxn());
            } catch(KeeperException.NoNodeException e) {
               throw new IOException("Failed to process transaction type: " +
                     hdr.getType() + " error: " + e.getMessage(), e);
            }
            listener.onTxnLoaded(hdr, itr.getTxn());
            if (!itr.next()) 
                break;
        }
        return highestZxid;
    }

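3. Start the ServerCnxn service

A ServerCnxn represents the socket connection of one client. The default ServerCnxnFactory in ZooKeeper is NIOServerCnxnFactory, a network connection service written with the Java NIO API using the Reactor pattern. ServerCnxn is mainly responsible for receiving client requests and forwarding them to the concrete ZooKeeperServer for execution.

4. Leader election

When a server enters the LOOKING state, it sends a vote (Vote) to every server in the ensemble. The vote contains a server identifier (sid) and the most recent transaction ID (zxid). When a server receives a vote, it updates its own vote according to the following rules:

Let VoteId and VoteZxid be the sid and zxid in the received vote, and let mysid and myzxid be the receiver's own current values.
If (VoteZxid > myzxid) || (VoteZxid == myzxid && VoteId > mysid), the receiver changes its own vote to the received one.
Otherwise, it keeps its own vote.

When a server has received identical votes from a quorum (more than half of the voting members), a leader has been elected.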
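The comparison rule above can be written as a single predicate. This is a sketch of the rule exactly as stated in this post; the class and method names are hypothetical, and it compares only zxid and sid.

```java
class VoteRuleSketch {
    // Returns true when the received vote (voteZxid, voteSid) should replace
    // the receiver's current vote (myZxid, mySid).
    static boolean shouldAdopt(long voteZxid, long voteSid, long myZxid, long mySid) {
        return voteZxid > myZxid
                || (voteZxid == myZxid && voteSid > mySid);
    }
}
```

A higher zxid always wins, so the server with the most recent data is preferred; the sid only breaks ties between servers with the same zxid.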
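The majority check itself is small. The sketch below assumes a simple majority quorum over a fixed ensemble size; QuorumSketch and hasQuorum are hypothetical names, and the collection holds the leader id that each received vote names.

```java
import java.util.Collection;

class QuorumSketch {
    // Returns true when strictly more than half of the ensemble
    // has voted for `candidate`.
    static boolean hasQuorum(Collection<Long> votedLeaderIds, long candidate, int ensembleSize) {
        long count = votedLeaderIds.stream()
                .filter(id -> id == candidate)
                .count();
        return count > ensembleSize / 2;
    }
}
```

With five servers, three matching votes are enough; with four servers, three are still required, because two is not more than half.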

// If the proposed leader has votes from more than half of the voting members, treat it as the elected leader
if (termPredicate(recvset, proposedLeader,
                            proposedZxid)) {
                            // Otherwise, wait for a fixed amount of time
                            LOG.info("Passed predicate");
                            Thread.sleep(finalizeWait);
    
                            // Notification probe = recvqueue.peek();
    
                            // Verify if there is any change in the proposed leader
                            // Drain queued notifications that do not supersede the current proposal
                            while ((!recvqueue.isEmpty())
                                    && !totalOrderPredicate(
                                            recvqueue.peek().leader, recvqueue
                                                    .peek().zxid)) {
                                recvqueue.poll();
                            }
                            // An empty queue means no better vote arrived, so the leader is elected; set this peer's state accordingly
                            if (recvqueue.isEmpty()) {
                                // LOG.warn("Proposed leader: " +
                                // proposedLeader);
                                self.setPeerState(
                                        (proposedLeader == self.getId()) ? 
                                         ServerState.LEADING :
                                         ServerState.FOLLOWING);
    
                                leaveInstance();
                                return new Vote(proposedLeader, proposedZxid);
                            }
                        }

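5. Start the role server

ZooKeeper runs the server that corresponds to its current role to handle requests. There are three role servers:

LeaderZooKeeperServer
FollowerZooKeeperServer
ObserverZooKeeperServer

In ZooKeeper, each type of server registers a different set of RequestProcessors to do the actual processing.

LeaderZooKeeperServer: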

protected void setupRequestProcessors() {
        RequestProcessor finalProcessor = new FinalRequestProcessor(this);
        RequestProcessor toBeAppliedProcessor = new Leader.ToBeAppliedRequestProcessor(
                finalProcessor, getLeader().toBeApplied);
        commitProcessor = new CommitProcessor(toBeAppliedProcessor,
                Long.toString(getServerId()), false);
        commitProcessor.start();
        ProposalRequestProcessor proposalProcessor = new ProposalRequestProcessor(this,
                commitProcessor);
        proposalProcessor.initialize();
        firstProcessor = new PrepRequestProcessor(this, proposalProcessor);
        ((PrepRequestProcessor)firstProcessor).start();
    }

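PrepRequestProcessor: preprocesses the request and converts it into a request of a specific type.

ProposalRequestProcessor: prepares a proposal and sends it to the followers. It forwards every request to CommitProcessor and, for write operations, also forwards the request to SyncRequestProcessor.

SyncRequestProcessor: persists operations to the transaction log and generates snapshots.

AckRequestProcessor: generates an acknowledgment that the leader returns to itself.

CommitProcessor: commits transactions that have received enough acknowledgments.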

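ToBeAppliedRequestProcessor: manages the to-be-applied list, which holds requests that a quorum has acknowledged and that are waiting to be applied. It removes a request from that list once FinalRequestProcessor has processed it.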

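FinalRequestProcessor: if the request carries transaction data, this processor applies the change to the ZooKeeper tree; otherwise, it reads the data and returns it to the client.

FollowerZooKeeperServer: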
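The processors above are linked into a chain in which each stage holds a reference to the next one. Here is a stripped-down sketch of that wiring. The Processor interface, the Stage class, and the stage names are hypothetical stand-ins, not ZooKeeper's classes, and each stage only records that it saw the request.

```java
import java.util.List;

class ProcessorChainSketch {
    interface Processor {
        void process(String request);
    }

    // Records its name in the trace, then hands the request to the next stage, if any.
    static final class Stage implements Processor {
        private final String name;
        private final Processor next;
        private final List<String> trace;

        Stage(String name, Processor next, List<String> trace) {
            this.name = name;
            this.next = next;
            this.trace = trace;
        }

        @Override
        public void process(String request) {
            trace.add(name + ":" + request);
            if (next != null) {
                next.process(request);
            }
        }
    }

    // Builds prep -> commit -> final; the last stage is created first
    // because every stage needs a reference to its successor.
    static Processor buildChain(List<String> trace) {
        Processor finalStage = new Stage("final", null, trace);
        Processor commitStage = new Stage("commit", finalStage, trace);
        return new Stage("prep", commitStage, trace);
    }
}
```

This back-to-front construction is the same reason setupRequestProcessors creates finalProcessor first and firstProcessor last.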

protected void setupRequestProcessors() {
        RequestProcessor finalProcessor = new FinalRequestProcessor(this);
        commitProcessor = new CommitProcessor(finalProcessor,
                Long.toString(getServerId()), true);
        commitProcessor.start();
        firstProcessor = new FollowerRequestProcessor(this, commitProcessor);
        ((FollowerRequestProcessor) firstProcessor).start();
        syncProcessor = new SyncRequestProcessor(this,
                new SendAckRequestProcessor((Learner)getFollower()));
        syncProcessor.start();
    }

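FollowerRequestProcessor: handles client requests. It forwards each request to CommitProcessor and also forwards write requests to the leader.

CommitProcessor: read requests are forwarded directly to FinalRequestProcessor; write requests wait until the transaction is committed before being passed to FinalRequestProcessor. When the leader receives a new write request, it generates a proposal and sends it to the followers. When a follower receives a proposal, it hands it to SyncRequestProcessor and sends an acknowledgment through SendAckRequestProcessor. Once the leader has received enough acknowledgments, it commits the proposal and sends a commit message to the followers. When a follower receives the commit message, it processes it through CommitProcessor.

ObserverZooKeeperServer:

ObserverZooKeeperServer is similar to FollowerZooKeeperServer, but it does not take part in acknowledging proposals.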
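The leader's side of this exchange comes down to counting acknowledgments per proposal and committing once a majority is reached. The sketch below models only that bookkeeping; AckTrackerSketch and its ack method are hypothetical names, and networking, ordering, and persistence are left out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class AckTrackerSketch {
    private final int ensembleSize;
    // zxid of a pending proposal -> ids of the servers that have acknowledged it
    private final Map<Long, Set<Long>> pendingAcks = new HashMap<>();
    private final Set<Long> committed = new HashSet<>();

    AckTrackerSketch(int ensembleSize) {
        this.ensembleSize = ensembleSize;
    }

    // Records an ack from server `sid` for proposal `zxid`. Returns true exactly
    // once per proposal: when the ack count first exceeds half of the ensemble.
    boolean ack(long zxid, long sid) {
        if (committed.contains(zxid)) {
            return false; // late ack for an already committed proposal
        }
        Set<Long> voters = pendingAcks.computeIfAbsent(zxid, k -> new HashSet<>());
        voters.add(sid); // a Set ignores duplicate acks from the same server
        if (voters.size() > ensembleSize / 2) {
            committed.add(zxid);
            pendingAcks.remove(zxid);
            return true;
        }
        return false;
    }
}
```

In a three-server ensemble, the leader's own ack plus one follower's ack is enough for ack to return true, which is the point at which the leader would send the commit message.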

protected void setupRequestProcessors() {      
        // We might consider changing the processor behaviour of 
        // Observers to, for example, remove the disk sync requirements.
        // Currently, they behave almost exactly the same as followers.
        RequestProcessor finalProcessor = new FinalRequestProcessor(this);
        commitProcessor = new CommitProcessor(finalProcessor,
                Long.toString(getServerId()), true);
        commitProcessor.start();
        firstProcessor = new ObserverRequestProcessor(this, commitProcessor);
        ((ObserverRequestProcessor) firstProcessor).start();
        syncProcessor = new SyncRequestProcessor(this,
                new SendAckRequestProcessor(getObserver()));
        syncProcessor.start();
    }

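That roughly covers ZooKeeper's startup process and how it handles requests. This post is fairly long, and toward the end I started to run out of steam. I will keep revising and improving it; corrections are welcome.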
