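Hadoop RPC Analysis (1) -- Client

[Hadoop RPC Call Entry Points]

In the basic Hadoop RPC framework, a client obtains a proxy object through getProxy and uses that object to send RPC requests to the server.

getProxy has several overloads; all of them end up calling the following function, which does the actual work.

(From org.apache.hadoop.ipc.RPC)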

public static <T> ProtocolProxy<T> getProtocolProxy(Class<T> protocol,
        long clientVersion,
        InetSocketAddress addr,
        UserGroupInformation ticket,
        Configuration conf,
        SocketFactory factory,
        int rpcTimeout,
        RetryPolicy connectionRetryPolicy) throws IOException {
    if (UserGroupInformation.isSecurityEnabled()) {
        SaslRpcServer.init(conf);
    }
    return getProtocolEngine(protocol, conf).getProxy(protocol, clientVersion,
            addr, ticket, conf, factory, rpcTimeout, connectionRetryPolicy);
}

On the server side, a Server object is constructed through the build method.

(From org.apache.hadoop.ipc.RPC.Builder)

/**
 * Build the RPC Server.
 * @throws IOException on error
 * @throws HadoopIllegalArgumentException when mandatory fields are not set
 */
public Server build() throws IOException, HadoopIllegalArgumentException {
    if (this.conf == null) {
        throw new HadoopIllegalArgumentException("conf is not set");
    }
    if (this.protocol == null) {
        throw new HadoopIllegalArgumentException("protocol is not set");
    }
    if (this.instance == null) {
        throw new HadoopIllegalArgumentException("instance is not set");
    }
    return getProtocolEngine(this.protocol, this.conf).getServer(
            this.protocol, this.instance, this.bindAddress, this.port,
            this.numHandlers, this.numReaders, this.queueSizePerHandler,
            this.verbose, this.conf, this.secretManager, this.portRangeConfig);
}
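The validate-then-construct shape of build() can be illustrated without any Hadoop classes. The sketch below is only an illustration: SketchServer, SketchServerBuilder, and their setters are hypothetical names, and IllegalArgumentException stands in for HadoopIllegalArgumentException.

```java
// Hypothetical stand-in for the server object produced by the builder.
class SketchServer {
    final String protocol;
    final Object instance;
    final int port;

    SketchServer(String protocol, Object instance, int port) {
        this.protocol = protocol;
        this.instance = instance;
        this.port = port;
    }
}

// Hypothetical builder: mandatory fields are checked in build(), as in RPC.Builder.
class SketchServerBuilder {
    private String protocol;   // mandatory
    private Object instance;   // mandatory
    private int port = 0;      // optional, has a default

    SketchServerBuilder setProtocol(String protocol) { this.protocol = protocol; return this; }
    SketchServerBuilder setInstance(Object instance) { this.instance = instance; return this; }
    SketchServerBuilder setPort(int port) { this.port = port; return this; }

    SketchServer build() {
        if (protocol == null) {
            throw new IllegalArgumentException("protocol is not set");
        }
        if (instance == null) {
            throw new IllegalArgumentException("instance is not set");
        }
        return new SketchServer(protocol, instance, port);
    }
}
```

Checking the mandatory fields in build() rather than in each setter means the setters can be called in any order.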

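Through these two entry points, the client and the server each create the object they need for remote calls.

The getProtocolEngine call above obtains an RPC engine. The default is WritableRpcEngine (newer versions appear to have switched to ProtobufRpcEngine?); this article traces the source code through WritableRpcEngine.

The trace paths, in brief:

Client: WritableRpcEngine.getProxy() ---> Invoker ---> Client

This uses the JDK dynamic proxy. Invoker implements the InvocationHandler interface, and its invoke method is implemented by calling Client's call method. The code is as follows: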

@Override
public Object invoke(Object proxy, Method method, Object[] args)
        throws Throwable {
    long startTime = 0;
    if (LOG.isDebugEnabled()) {
        startTime = Time.now();
    }
    ObjectWritable value = (ObjectWritable)
            client.call(RPC.RpcKind.RPC_WRITABLE, new Invocation(method, args), remoteId);
    if (LOG.isDebugEnabled()) {
        long callTime = Time.now() - startTime;
        LOG.debug("Call: " + method.getName() + " " + callTime);
    }
    return value.get(); // unwrap the result carried by the ObjectWritable
}

So, to understand the client side, we mainly need to understand the Client class.
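To make the dynamic-proxy mechanism concrete, here is a minimal JDK-only sketch. EchoProtocol, SketchInvoker, and ProxySketch are hypothetical names, and the handler returns a description of the call instead of sending it through a real Client.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Arrays;

// Hypothetical protocol interface; stands in for a Hadoop RPC protocol.
interface EchoProtocol {
    String echo(String message);
}

// Plays the role of Invoker: every call on the proxy is funneled into invoke().
class SketchInvoker implements InvocationHandler {
    @Override
    public Object invoke(Object proxy, Method method, Object[] args) {
        // A real Invoker would wrap (method, args) in an Invocation and hand it
        // to Client.call(); this sketch only describes the call it would send.
        return "remote:" + method.getName() + Arrays.toString(args);
    }
}

class ProxySketch {
    // Roughly what getProxy() hands back: an object implementing the protocol.
    static EchoProtocol newProxy() {
        return (EchoProtocol) Proxy.newProxyInstance(
                EchoProtocol.class.getClassLoader(),
                new Class<?>[] {EchoProtocol.class},
                new SketchInvoker());
    }

    public static void main(String[] args) {
        System.out.println(newProxy().echo("hi")); // prints remote:echo[hi]
    }
}
```

The caller only ever sees EchoProtocol, which is what lets a remote call look like a local one.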

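Server: WritableRpcEngine.getServer() ---> Server

A new operation creates a Server object, so to understand the server side we mainly need to understand the Server class.

[Hadoop RPC Client: Client]

The client's approach can be summarized as: send the information about the invoked method over the network to the server, then wait for the server's reply. In essence, RPC wraps ordinary network access so that it looks like a local call.

Here we focus on what a single RPC looks like on the client, and try to locate the code that implements each part.

The classes most closely related to Client are the following (both are inner classes of Client):

Client.Connection -------- A Connection object represents one connection channel to the server. It provides the low-level channel information, independent of any particular call, and serves as basic infrastructure.

Client.Call -------- A Call represents one remote procedure call. It holds the request information for that call, the returned result, and so on; it is the unit of RPC work.

Because the low-level channel is independent of the calls made over it, multiple calls can share the same channel; a Connection internally keeps track of the calls currently in flight.

The channel itself is call-agnostic, and several channels can exist in parallel between the client and servers; Client internally keeps a pool of Connection objects (the connections table in the fields below).

First, the fields of Client: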

(From org.apache.hadoop.ipc.Client)

/** A counter for generating call IDs. */
private static final AtomicInteger callIdCounter = new AtomicInteger();
private static final ThreadLocal<Integer> callId = new ThreadLocal<Integer>();
private static final ThreadLocal<Integer> retryCount = new ThreadLocal<Integer>();
private Hashtable<ConnectionId, Connection> connections =
        new Hashtable<ConnectionId, Connection>();
private Class<? extends Writable> valueClass;             // class of call values
private AtomicBoolean running = new AtomicBoolean(true);  // if client runs
final private Configuration conf;
private SocketFactory socketFactory;                      // how to create sockets
private int refCount = 1;
private final int connectionTimeout;
private final boolean fallbackAllowed;
private final byte[] clientId;
final static int CONNECTION_CONTEXT_CALL_ID = -3;

As you can see, a Client can hold multiple connection objects to servers.

Next, the fields of Connection:

(From org.apache.hadoop.ipc.Client.Connection)

private InetSocketAddress server;             // server ip:port
private final ConnectionId remoteId;          // connection id
private AuthMethod authMethod;                // authentication method
private AuthProtocol authProtocol;
private int serviceClass;
private SaslRpcClient saslRpcClient;
private Socket socket = null;                 // connected socket
private DataInputStream in;
private DataOutputStream out;
private int rpcTimeout;
private int maxIdleTime;                      // connections will be culled if it was idle for
                                              // maxIdleTime msecs
private final RetryPolicy connectionRetryPolicy;
private int maxRetriesOnSocketTimeouts;
private boolean tcpNoDelay;                   // if T then disable Nagle's Algorithm
private boolean doPing;                       // do we need to send ping message
private int pingInterval;                     // how often sends ping to the server in msecs
private ByteArrayOutputStream pingRequest;    // ping message

// currently active calls
private Hashtable<Integer, Call> calls = new Hashtable<Integer, Call>();
private AtomicLong lastActivity = new AtomicLong();                 // last I/O activity time
private AtomicBoolean shouldCloseConnection = new AtomicBoolean();  // indicate if the connection is closed
private IOException closeException;           // close reason
private final Object sendRpcRequestLock = new Object();

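Most of these are the basic configuration needed to establish a connection to the server. There is also a calls field, which stores the request objects submitted to this connection.

A Call object represents one remote procedure call, so it carries the parameters that call needs. Here are the fields of Call:

(From org.apache.hadoop.ipc.Client.Call)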

/**
 * Class that represents an RPC call
 */
static class Call {
    final int id;                // call id
    final int retry;             // retry count
    final Writable rpcRequest;   // the serialized rpc request
    Writable rpcResponse;        // null if rpc has error
    IOException error;           // exception, null if success
    final RPC.RpcKind rpcKind;   // Rpc EngineKind
    boolean done;                // true when call is done
    // ... (constructor and methods omitted)
}

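OK, now that we understand structurally what each class is for, we can look at the flow of one remote call on the client. We only need to study Client.call; the code is as follows: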

public Writable call(RPC.RpcKind rpcKind, Writable rpcRequest,
        ConnectionId remoteId, int serviceClass) throws IOException {
    final Call call = createCall(rpcKind, rpcRequest);
    Connection connection = getConnection(remoteId, call, serviceClass);
    try {
        connection.sendRpcRequest(call);              // send the rpc request
    } catch (RejectedExecutionException e) {
        throw new IOException("connection has been closed", e);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        LOG.warn("interrupted waiting to send rpc request to server", e);
        throw new IOException(e);
    }

    boolean interrupted = false;
    synchronized (call) {
        while (!call.done) {
            try {
                call.wait();                          // wait for the result
            } catch (InterruptedException ie) {
                // save the fact that we were interrupted
                interrupted = true;
            }
        }

        if (interrupted) {
            // set the interrupt flag now that we are done waiting
            Thread.currentThread().interrupt();
        }

        if (call.error != null) {
            if (call.error instanceof RemoteException) {
                call.error.fillInStackTrace();
                throw call.error;
            } else { // local exception
                InetSocketAddress address = connection.getRemoteAddress();
                throw NetUtils.wrapException(address.getHostName(),
                        address.getPort(),
                        NetUtils.getHostname(),
                        0,
                        call.error);
            }
        } else {
            return call.getRpcResponse();
        }
    }
}

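The execution steps are:

1. createCall creates the object that represents one remote call.

2. getConnection obtains a usable connection object. This is where the connection is initialized and a socket connection to the server is established; the submitted call is also stored in the connection object.

3. connection.sendRpcRequest performs the remote procedure call.

4. call.wait waits for the result to come back.

5. call.getRpcResponse returns the result of the remote call.

The overall flow follows the steps above. Steps 2 and 3 rely on Connection, so let's analyze Connection further.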
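The caller-side half of this flow, blocking on the call object until another thread marks it done, can be sketched with plain wait/notify. SketchCall and CallFlowSketch are hypothetical names; a short-lived thread fakes the connection and the server, and the "response" is simply the upper-cased request.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Simplified stand-in for Client.Call: a request plus a slot for the result.
class SketchCall {
    static final AtomicInteger ID_COUNTER = new AtomicInteger();

    final int id = ID_COUNTER.getAndIncrement();
    final String request;
    String response;
    boolean done;

    SketchCall(String request) {
        this.request = request;
    }

    // Called by the receiving thread once the response has arrived.
    synchronized void complete(String value) {
        response = value;
        done = true;
        notify(); // wake the caller blocked in awaitResponse()
    }

    // Called by the calling thread: block until complete() has run.
    synchronized String awaitResponse() {
        boolean interrupted = false;
        while (!done) {
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true; // remember the interrupt, keep waiting
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt(); // restore the flag afterwards
        }
        return response;
    }
}

class CallFlowSketch {
    static String call(String request) {
        SketchCall call = new SketchCall(request);   // step 1: create the call
        // Steps 2-3 (get a connection, send the request) are faked by a thread
        // that answers immediately.
        Thread fakeServer = new Thread(() -> call.complete(request.toUpperCase()));
        fakeServer.start();
        return call.awaitResponse();                 // steps 4-5: wait, return result
    }

    public static void main(String[] args) {
        System.out.println(call("ping")); // prints PING
    }
}
```

The while (!done) loop matters: the caller re-checks the flag after every wake-up, so it returns correctly even if complete() ran before awaitResponse() was entered.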

[Connection]

First, let's look at what getConnection does. Below is the main path, with some branches omitted.

/** Get a connection from the pool, or create a new one and add it to the
 * pool. Connections to a given ConnectionId are reused. */
private Connection getConnection(ConnectionId remoteId,
        Call call, int serviceClass) throws IOException {
    Connection connection;
    do {
        synchronized (connections) {
            connection = connections.get(remoteId);
            if (connection == null) {
                connection = new Connection(remoteId, serviceClass);
                connections.put(remoteId, connection);
            }
        }
    } while (!connection.addCall(call));
    connection.setupIOstreams();
    return connection;
}

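From the code, it obtains a Connection object, adds the current call to that Connection's calls map, and establishes the network connection. No request data has been sent over the network at this point.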

private synchronized boolean addCall(Call call) {
    if (shouldCloseConnection.get())
        return false;
    calls.put(call.id, call);
    notify();
    return true;
}

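Note the notify() call inside addCall. Remember that Connection extends Thread and runs as an independent thread, but this notify is not invoked from the Connection's own thread; it is triggered by Client.call, on the caller's thread. We will see the matching wait call later. connection.setupIOstreams initializes the input and output streams and is not examined in detail here.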
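The get-or-create loop and its interaction with addCall can be sketched with plain collections. PooledConn and ConnPoolSketch are hypothetical names, a String stands in for ConnectionId, and the close method is an assumption about how a closing connection leaves the pool; without some such removal the retry loop would never see a fresh connection.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Client.Connection, reduced to call bookkeeping.
class PooledConn {
    final String remoteId;
    private boolean closing;
    private final Map<Integer, String> calls = new HashMap<>();

    PooledConn(String remoteId) {
        this.remoteId = remoteId;
    }

    // Mirrors addCall(): refuse new work once the connection is closing.
    synchronized boolean addCall(int callId, String request) {
        if (closing) {
            return false;
        }
        calls.put(callId, request);
        notify(); // would wake an idle connection thread; harmless if none waits
        return true;
    }

    synchronized void markClosing() {
        closing = true;
    }

    synchronized int pendingCalls() {
        return calls.size();
    }
}

class ConnPoolSketch {
    private final Map<String, PooledConn> connections = new HashMap<>();

    // Reuse the connection for remoteId, or create one; retry if the one we
    // found started closing before the call could be registered on it.
    PooledConn getConnection(String remoteId, int callId, String request) {
        PooledConn conn;
        do {
            synchronized (connections) {
                conn = connections.get(remoteId);
                if (conn == null) {
                    conn = new PooledConn(remoteId);
                    connections.put(remoteId, conn);
                }
            }
        } while (!conn.addCall(callId, request));
        return conn;
    }

    // Assumed shutdown path: mark the connection closing and drop it from the
    // pool so the next caller creates a fresh one.
    void close(PooledConn conn) {
        conn.markClosing();
        synchronized (connections) {
            connections.remove(conn.remoteId, conn);
        }
    }
}
```

Two calls to the same remoteId share one PooledConn; after close, the next call gets a new instance.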

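Step 3 above, sendRpcRequest, sends the remote call request; this is where the network request actually goes out. Because its implementation matters for the discussion that follows, its main code is listed here: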

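public void sendRpcRequest(final Call call)
        throws InterruptedException, IOException {
    // d (its construction is elided in this excerpt) is the buffer holding
    // the serialized RpcRequestHeader and RpcRequest for this call
    synchronized (sendRpcRequestLock) {
        Future<?> senderFuture = SEND_PARAMS_EXECUTOR.submit(new Runnable() {
            @Override
            public void run() {
                // exception handling is elided in this excerpt
                synchronized (Connection.this.out) {
                    byte[] data = d.getData();
                    int totalLength = d.getLength();
                    out.writeInt(totalLength);        // Total Length
                    out.write(data, 0, totalLength);  // RpcRequestHeader + RpcRequest
                    out.flush();
                }
            }
        });
        senderFuture.get(); // wait for the write to finish (ExecutionException handling elided)
    }
}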

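Here, once the thread pool has sent the data over the network, the code does not synchronously wait for the response. Client.call, however, keeps waiting until the call completes. So there must be some place that receives the network response and then marks the call as done; only then can Client.call return.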
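The write itself is a length-prefixed frame pushed through an executor. The sketch below reproduces only that part with JDK classes. FramedSendSketch is a hypothetical name, a ByteArrayOutputStream stands in for the socket, and the payload is an arbitrary byte array rather than a serialized RpcRequestHeader plus request.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FramedSendSketch {
    // Writes one frame (4-byte big-endian length, then the payload) on a
    // sender thread and returns the bytes that reached the "socket".
    static byte[] sendFrame(byte[] payload) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for the socket
        DataOutputStream out = new DataOutputStream(sink);
        ExecutorService sender = Executors.newSingleThreadExecutor();
        try {
            Future<?> senderFuture = sender.submit(() -> {
                try {
                    synchronized (out) {
                        out.writeInt(payload.length);           // total length
                        out.write(payload, 0, payload.length);  // frame body
                        out.flush();
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            senderFuture.get(); // waits for the write to finish, not for a response
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            sender.shutdown();
        }
        return sink.toByteArray();
    }
}
```

Note that senderFuture.get() returns as soon as the bytes are written; nothing in this path reads a reply, which is why a separate receiving thread is needed.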

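As mentioned earlier, Connection itself can run as a thread, so we need to look at Connection's run method. With the exception branches removed, the code is as follows: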

@Override
public void run() {
    while (waitForWork()) { // wait here for work - read or close connection
        receiveRpcResponse();
    }
    close();
}

Now the implementation of waitForWork:

private synchronized boolean waitForWork() {
    if (calls.isEmpty() && !shouldCloseConnection.get() && running.get()) {
        long timeout = maxIdleTime - (Time.now() - lastActivity.get());
        if (timeout > 0) {
            try {
                wait(timeout);
            } catch (InterruptedException e) {}
        }
    }

    if (!calls.isEmpty() && !shouldCloseConnection.get() && running.get()) {
        return true;
    } else if (shouldCloseConnection.get()) {
        return false;
    } else if (calls.isEmpty()) { // idle connection closed or stopped
        markClosed(null);
        return false;
    } else { // get stopped but there are still pending requests
        markClosed((IOException) new IOException().initCause(
                new InterruptedException()));
        return false;
    }
}

Note that the wait call here pairs exactly with the notify call in addCall shown earlier.
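The wait/notify pairing with an idle timeout can be isolated in a few lines. IdleWaitSketch is a hypothetical name, and the sketch collapses the several exit branches of waitForWork into a single closed flag.

```java
import java.util.ArrayDeque;
import java.util.Queue;

class IdleWaitSketch {
    private final Queue<String> pending = new ArrayDeque<>();
    private boolean closed;

    // Producer side, like addCall(): enqueue work and wake the waiting thread.
    synchronized void add(String work) {
        pending.add(work);
        notify();
    }

    // Consumer side, like waitForWork(): returns true if there is work to
    // process, false if the idle period elapsed with nothing to do.
    synchronized boolean waitForWork(long maxIdleMillis) {
        if (pending.isEmpty() && !closed && maxIdleMillis > 0) {
            try {
                wait(maxIdleMillis); // releases the lock so add() can run
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        if (!pending.isEmpty() && !closed) {
            return true;
        }
        closed = true; // idle for too long: give up, as an idle connection would
        return false;
    }
}
```

Both methods synchronize on the same object, which is what makes the notify in add() reach the thread parked in wait(). Like the original, the sketch waits only once, so any early wake-up with no work queued is treated as idle.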

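As for receiveRpcResponse, its job is to read the result of the remote procedure call from the network, extract the callId it carries, look up the matching Call object, and set its state.

To summarize the design of Connection: the Connection thread sits in wait until an outside request arrives and triggers notify; at the same time the mapping between callId and Call objects kept inside the Connection is updated, and the network request is sent.

While it is running, the Connection thread reads data from the network, and each response it receives carries a callId. Because the data is read asynchronously, the callId is used to find the corresponding Call object and mark it as done, so that the matching Client.call can return normally; otherwise it would wait forever (ignoring timeouts).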
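Matching a response to its pending call by id can be sketched as follows. ResponseDispatchSketch is a hypothetical name, and the sketch swaps in CompletableFuture for the wait/notify on Call that the Hadoop code uses, purely to keep the example short.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

class ResponseDispatchSketch {
    // Pending calls keyed by call id, like the calls table in Connection.
    private final Map<Integer, CompletableFuture<String>> calls = new ConcurrentHashMap<>();

    // Sender side: remember the call before its request goes out.
    CompletableFuture<String> register(int callId) {
        CompletableFuture<String> pending = new CompletableFuture<>();
        calls.put(callId, pending);
        return pending;
    }

    // Receiver side: a response carries its call id; look the call up,
    // remove it from the table, and mark it as done.
    boolean onResponse(int callId, String value) {
        CompletableFuture<String> pending = calls.remove(callId);
        if (pending == null) {
            return false; // unknown id, or the call was already completed
        }
        pending.complete(value);
        return true;
    }
}
```

Responses may arrive in any order: completing call 2 leaves call 1 pending, which is what lets many calls share one connection.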
