ZooKeeper Study Notes: Client Program Analysis, Part 1

The ZooKeeper client is available in more than one language, namely C and Java. Since most of my day-to-day project work is in C, these notes concentrate on the C client.


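The C client source lives in the src/c directory of the unpacked ZooKeeper archive. From that directory it can be built and installed with ./configure, make && sudo make install. The install produces two executables, cli_mt and cli_st, which link against libzookeeper_mt.so.2 and libzookeeper_st.so.2 respectively, so you may need to export LD_LIBRARY_PATH to point at the library directory before running them.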


We start with the client's logging mechanism. It is implemented mainly in src/c/include/zookeeper_log.h and src/c/src/zk_log.c, and the available log levels are defined in zookeeper.h:

	typedef enum {ZOO_LOG_LEVEL_ERROR=1,ZOO_LOG_LEVEL_WARN=2,ZOO_LOG_LEVEL_INFO=3,ZOO_LOG_LEVEL_DEBUG=4} ZooLogLevel;

The macro definitions in zookeeper_log.h show that the actual logging work is done by the log_message function:

#define LOG_ERROR(x) if(logLevel>=ZOO_LOG_LEVEL_ERROR) \
    log_message(ZOO_LOG_LEVEL_ERROR,__LINE__,__func__,format_log_message x)
#define LOG_WARN(x) if(logLevel>=ZOO_LOG_LEVEL_WARN) \
    log_message(ZOO_LOG_LEVEL_WARN,__LINE__,__func__,format_log_message x)
#define LOG_INFO(x) if(logLevel>=ZOO_LOG_LEVEL_INFO) \
    log_message(ZOO_LOG_LEVEL_INFO,__LINE__,__func__,format_log_message x)
#define LOG_DEBUG(x) if(logLevel==ZOO_LOG_LEVEL_DEBUG) \
    log_message(ZOO_LOG_LEVEL_DEBUG,__LINE__,__func__,format_log_message x)
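
These definitions explain why call sites need double parentheses, as in LOG_WARN(("retry %d", n)): the macro argument x is a complete parenthesised argument list that is pasted directly after format_log_message. The following is a minimal sketch of that pattern, not the ZooKeeper code itself. All demo_* and DEMO_* names are hypothetical stand-ins, and it assumes format_log_message is a printf-style function that returns the formatted string, which the macros' use of it suggests. Unlike the originals, the sketch wraps each macro in do { } while (0), because a bare if inside a macro can capture an else that follows the call site.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

enum { DEMO_ERROR = 1, DEMO_WARN = 2, DEMO_INFO = 3, DEMO_DEBUG = 4 };

static int demo_level = DEMO_WARN;   /* hypothetical stand-in for logLevel */
static char demo_last[256];          /* last emitted line, kept for inspection */

/* Formats printf-style arguments into a static buffer (single-threaded sketch). */
static const char *demo_format(const char *fmt, ...)
{
    static char buf[128];
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);
    return buf;
}

/* Hypothetical stand-in for log_message: records and prints one line. */
static void demo_log(int level, int line, const char *func, const char *msg)
{
    snprintf(demo_last, sizeof demo_last, "%d@%s@%d: %s", level, func, line, msg);
    fprintf(stderr, "%s\n", demo_last);
}

/* x must be a parenthesised argument list: DEMO_LOG_WARN(("n=%d", 3)) */
#define DEMO_LOG_WARN(x) do { if (demo_level >= DEMO_WARN) \
    demo_log(DEMO_WARN, __LINE__, __func__, demo_format x); } while (0)
#define DEMO_LOG_DEBUG(x) do { if (demo_level >= DEMO_DEBUG) \
    demo_log(DEMO_DEBUG, __LINE__, __func__, demo_format x); } while (0)

static void demo_run(void)
{
    demo_last[0] = '\0';
    DEMO_LOG_DEBUG(("dropped %d", 1));        /* filtered out: level is WARN */
    DEMO_LOG_WARN(("retry %d of %d", 2, 5));  /* emitted */
}
```

Calling demo_run() emits only the warning. The debug call expands to a guarded statement whose condition is false, so its demo_format call is never evaluated.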

log_message itself is implemented in zk_log.c, so that is the file we look at next.

__attribute__((constructor)) void prepareTSDKeys() {
    pthread_key_create (&time_now_buffer, freeBuffer);
    pthread_key_create (&format_log_msg_buffer, freeBuffer);
}
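Here pthread_key_create is used to create the keys time_now_buffer and format_log_msg_buffer, whose associated values are private to each thread. Why is this needed? In a multi-threaded program, several threads may call the log functions concurrently, and every log line carries a timestamp that the program formats into a string buffer: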

static const char* time_now(char* now_str){
    struct timeval tv;
    struct tm lt;
    time_t now = 0;
    size_t len = 0;

    gettimeofday(&tv,0);

    now = tv.tv_sec;
    localtime_r(&now, &lt);

    // clone the format used by log4j ISO8601DateFormat
    // specifically: "yyyy-MM-dd HH:mm:ss,SSS"

    len = strftime(now_str, TIME_NOW_BUF_SIZE,
                          "%Y-%m-%d %H:%M:%S",
                          &lt);

    len += snprintf(now_str + len,
                    TIME_NOW_BUF_SIZE - len,
                    ",%03d",
                    (int)(tv.tv_usec/1000));

    return now_str;
}

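Allocating a buffer with malloc for the formatted text on every log call would be needlessly slow. Instead, the ZooKeeper client's logging code formats the timestamp into a fixed block of memory that is allocated once and then reused: for any given thread, now_str in the code above always points to the start of the same buffer. The implementation of log_message shows this: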

void log_message(ZooLogLevel curLevel,int line,const char* funcName,
    const char* message)
{
    static const char* dbgLevelStr[]={"ZOO_INVALID","ZOO_ERROR","ZOO_WARN",
            "ZOO_INFO","ZOO_DEBUG"};
    static pid_t pid=0;
#ifdef WIN32
    char timebuf [TIME_NOW_BUF_SIZE];
#endif
    if(pid==0)pid=getpid();
#ifndef THREADED
    fprintf(LOGSTREAM, "%s:%d:%s@%s@%d: %s\n", time_now(get_time_buffer()),pid,
            dbgLevelStr[curLevel],funcName,line,message);
#else
#ifdef WIN32
    fprintf(LOGSTREAM, "%s:%d(0x%lx):%s@%s@%d: %s\n", time_now(timebuf),pid,
            (unsigned long int)(pthread_self().thread_id),
            dbgLevelStr[curLevel],funcName,line,message);
#else
    fprintf(LOGSTREAM, "%s:%d(0x%lx):%s@%s@%d: %s\n", time_now(get_time_buffer()),pid,
            (unsigned long int)pthread_self(),
            dbgLevelStr[curLevel],funcName,line,message);
#endif
#endif
    fflush(LOGSTREAM);
}

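The implementation of get_time_buffer shows the two cases. A single-threaded build uses a static buffer, while a multi-threaded build calls pthread_getspecific() with the keys created above to fetch the calling thread's own buffer: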

#ifdef THREADED
char* getTSData(pthread_key_t key,int size){
    char* p=pthread_getspecific(key);
    if(p==0){
        int res;
        p=calloc(1,size);
        res=pthread_setspecific(key,p);
        if(res!=0){
            fprintf(stderr,"Failed to set TSD key: %d",res);
        }
    }
    return p;
}

char* get_time_buffer(){
    return getTSData(time_now_buffer,TIME_NOW_BUF_SIZE);
}

char* get_format_log_buffer(){
    return getTSData(format_log_msg_buffer,FORMAT_LOG_BUF_SIZE);
}
#else
char* get_time_buffer(){
    static char buf[TIME_NOW_BUF_SIZE];
    return buf;
}

char* get_format_log_buffer(){
    static char buf[FORMAT_LOG_BUF_SIZE];
    return buf;
}
#endif
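
To see the per-thread behaviour in isolation, here is a minimal sketch of the same lazy allocation pattern. It is not the ZooKeeper code: the demo_* names and DEMO_BUF_SIZE are hypothetical, and the key is created with pthread_once rather than the __attribute__((constructor)) function used in zk_log.c, purely to keep the example portable.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define DEMO_BUF_SIZE 64   /* hypothetical size, not a ZooKeeper constant */

static pthread_key_t demo_key;
static pthread_once_t demo_once = PTHREAD_ONCE_INIT;

static void demo_make_key(void)
{
    /* free() runs at thread exit for each thread whose value is non-NULL */
    pthread_key_create(&demo_key, free);
}

/* Returns the calling thread's private buffer, allocating it on first use. */
static char *demo_get_buffer(void)
{
    char *p;

    pthread_once(&demo_once, demo_make_key);
    p = pthread_getspecific(demo_key);
    if (p == NULL) {
        p = calloc(1, DEMO_BUF_SIZE);
        if (p == NULL || pthread_setspecific(demo_key, p) != 0) {
            free(p);
            return NULL;
        }
    }
    return p;
}

struct demo_arg {
    const char *main_buf;  /* buffer owned by the creating thread */
    int distinct;          /* set to 1 if the worker got a different buffer */
};

/* Thread body: compares its own buffer with the creator's while both exist. */
static void *demo_worker(void *arg)
{
    struct demo_arg *a = arg;
    char *buf = demo_get_buffer();

    if (buf != NULL)
        snprintf(buf, DEMO_BUF_SIZE, "thread-local text");
    a->distinct = (buf != NULL && buf != a->main_buf);
    return NULL;
}

/* Returns 1 if a second thread gets its own buffer and repeat calls are stable. */
static int demo_buffers_are_private(void)
{
    struct demo_arg a = { demo_get_buffer(), 0 };
    pthread_t t;

    if (a.main_buf == NULL)
        return 0;
    if (pthread_create(&t, NULL, demo_worker, &a) != 0)
        return 0;
    pthread_join(t, NULL);
    return a.distinct && a.main_buf == demo_get_buffer();
}
```

The worker's buffer is released by the key's destructor when the worker exits, which is why the comparison is done inside the worker rather than after pthread_join. On older toolchains the program may need to be built with -pthread.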


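At this point the logging mechanism of the ZooKeeper client should be fairly clear. The overall design is simple and makes no attempt at high performance, which is reasonable for a client library. What we want to look at here are a few side issues that this design raises: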

1. int pthread_key_create(pthread_key_t *key, void (*destr_function) (void*));

    The number of keys this function can create is limited:

    pthread_key_create allocates a new TSD key. The key is  stored  in  the location pointed to by key. There is a limit of PTHREAD_KEYS_MAX on the number of keys allocated at a given time. The value  initially  associated with the returned key is NULL in all currently executing threads.

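    In addition, destr_function is called only when a thread exits, either through pthread_exit (returning from the thread's start routine counts as an implicit pthread_exit) or through cancellation. If the value associated with the key is NULL, destr_function is not called for it. The order in which the destructors of different keys are called is unspecified.

    If, while the destructors are running, a key whose destructor has already been called is given a non-NULL value again, the destruction pass is repeated. The repetition is bounded: at most PTHREAD_DESTRUCTOR_ITERATIONS passes are made.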
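
The NULL rule described above can be checked with a small experiment. This is a sketch under stated assumptions: all demo_* names are hypothetical, and the helper is meant to be called from one thread at a time because it keeps its state in plain static variables.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t demo_dtor_key;
static int demo_dtor_calls;   /* written by exiting threads, read only after join */

/* Destructor registered for the key: counts calls and releases the value. */
static void demo_dtor(void *value)
{
    demo_dtor_calls++;
    free(value);
}

/* arg != NULL: store a heap value under the key; arg == NULL: leave it NULL. */
static void *demo_dtor_worker(void *arg)
{
    if (arg != NULL) {
        void *v = malloc(16);
        if (v != NULL && pthread_setspecific(demo_dtor_key, v) != 0)
            free(v);
    }
    return NULL;   /* returning is an implicit pthread_exit() */
}

/* Runs one thread to completion and returns how many destructor calls
 * it caused, or -1 if the key or the thread could not be created. */
static int demo_count_dtor_calls(int set_value)
{
    static int key_ready;
    static int dummy;
    pthread_t t;
    int before;

    if (!key_ready) {
        if (pthread_key_create(&demo_dtor_key, demo_dtor) != 0)
            return -1;
        key_ready = 1;
    }
    before = demo_dtor_calls;
    if (pthread_create(&t, NULL, demo_dtor_worker,
                       set_value ? (void *)&dummy : NULL) != 0)
        return -1;
    pthread_join(t, NULL);
    return demo_dtor_calls - before;
}
```

demo_count_dtor_calls(1) reports one destructor call for a thread that stored a value, and demo_count_dtor_calls(0) reports none for a thread that left the key at its initial NULL.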


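2. The logging code above calls fprintf to write to a file or the terminal. That raises a question: in a multi-threaded program, no explicit locking is done around each fprintf call, so is fprintf thread-safe?

    We can answer this from a Stack Overflow answer (http://stackoverflow.com/questions/11664434/how-fprintf-behavior-when-multi-threaded-and-multi-processed) and by reading the source ourselves: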

If you're using a single FILE object to perform output on an open file, then whole fprintf calls on that FILE will be atomic, i.e. lock is held on the FILE for the duration of the fprintf call. Since a FILE is local to a single process's address space, this setup is only possible in multi-threaded applications; it does not apply to multi-process setups where several different processes are accessing separate FILE objects referring to the same underlying open file. Even though you're using fprintf here, each process has its own FILE it can lock and unlock without the others seeing the changes, so writes can end up interleaved. There are several ways to prevent this from happening:

a. Allocate a synchronization object (e.g. a process-shared semaphore or mutex) in shared memory and make each process obtain the lock before writing to the file (so only one process can write at a time); OR

b. Use filesystem-level advisory locking, e.g. fcntl locks or the (non-POSIX) BSD flock interface; OR

c. Instead of writing directly to the log file, write to a pipe that another process will feed into the log file. Writes to a pipe are guaranteed (by POSIX) to be atomic as long as they are smaller than PIPE_BUF bytes long. You cannot use fprintf in this case (since it might perform multiple underlying write operations), but you could use snprintf to a PIPE_BUF-sized buffer followed by write.