Scribe, an open-source syslog-style logging system

Scribe project page

https://github.com/facebookarchive/scribe


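Introduction

Scribe is a log collection system open-sourced by Facebook. It was used heavily inside Facebook and has since been adopted by many other large internet companies.

It collects logs from a variety of sources and stores them in a central storage system (for example NFS or a distributed file system) so that they can be aggregated and analyzed in one place. It offers a scalable, highly fault-tolerant approach to "distributed collection, unified processing" of logs.

Its most important property is fault tolerance. When the network or the machines of the central storage system fail, Scribe spills logs to local disk or to another location; once the central storage system recovers, Scribe forwards the spilled logs to it.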
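
The spill-and-replay behaviour described above can be illustrated with a small in-memory sketch. This is not Scribe's implementation: the class names `FlakyCentralStore` and `BufferedWriter` are hypothetical, and a Python list and deque stand in for the central store and the local spill location.

```python
from collections import deque


class FlakyCentralStore:
    """Stand-in for the central store; `available` simulates an outage."""

    def __init__(self):
        self.available = True
        self.messages = []

    def write(self, message):
        if not self.available:
            raise ConnectionError("central store unreachable")
        self.messages.append(message)


class BufferedWriter:
    """Writes to the primary store and spills to a local queue on failure."""

    def __init__(self, primary):
        self.primary = primary
        self.spilled = deque()  # stands in for the local or secondary location

    def log(self, message):
        # While older messages are still spilled, keep spilling so that
        # ordering is preserved when they are replayed.
        if self.spilled:
            self.spilled.append(message)
            return
        try:
            self.primary.write(message)
        except ConnectionError:
            self.spilled.append(message)

    def retry(self):
        """Replay spilled messages in order; stop at the first failure."""
        while self.spilled:
            try:
                self.primary.write(self.spilled[0])
            except ConnectionError:
                return False
            self.spilled.popleft()
        return True
```

A caller would invoke `log()` for every message and call `retry()` periodically; once the central store is reachable again, the spilled messages arrive there in their original order.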

Scribe is commonly paired with Hadoop: Scribe pushes logs into HDFS, and Hadoop processes them periodically with MapReduce jobs.

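Architecture

Scribe's architecture is fairly simple and has three main parts: the scribe agent, the scribe server, and the storage system.

(1) scribe agent

The scribe agent is in practice a Thrift client. A Thrift client is the only way to send data to Scribe: Scribe defines a Thrift interface, and users call that interface to send data to the server.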
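
As a rough sketch of such a client in Python: the code below assumes the Python packages `scribe` and `thrift` produced by the Scribe and Thrift builds are importable, and that a server is listening on port 1463 (a port commonly seen in sample configurations, used here as an assumption). The helper names `connect` and `send_lines` are made up for this example. The imports are placed inside `connect` so the module still loads on machines without those packages.

```python
def connect(host="localhost", port=1463):
    """Open a framed Thrift connection to a Scribe server.

    Returns the transport (to close later), the client, and the LogEntry
    class generated from Scribe's Thrift interface.
    """
    # Local imports: these packages only exist after Thrift and Scribe
    # have been built and installed.
    from scribe import scribe
    from thrift.protocol import TBinaryProtocol
    from thrift.transport import TSocket, TTransport

    socket = TSocket.TSocket(host=host, port=port)
    transport = TTransport.TFramedTransport(socket)
    protocol = TBinaryProtocol.TBinaryProtocol(
        trans=transport, strictRead=False, strictWrite=False)
    client = scribe.Client(iprot=protocol, oprot=protocol)
    transport.open()
    return transport, client, scribe.LogEntry


def send_lines(client, entry_cls, category, lines):
    """Send each line as one log entry in a single Log() call."""
    entries = [entry_cls(category=category, message=line) for line in lines]
    # The server answers with a result code (OK or TRY_LATER).
    return client.Log(messages=entries)
```

Typical use would be `transport, client, LogEntry = connect()`, then `send_lines(client, LogEntry, "app_log", ["hello\n"])`, and finally `transport.close()`. The category name `app_log` is only a placeholder.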

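(2) scribe

The scribe server receives the data sent by Thrift clients and, according to its configuration file, routes data of different topics (categories) to different targets. Scribe provides a variety of stores, such as file and HDFS, into which it can load the data.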
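
The routing step can be pictured with a small sketch. It assumes three kinds of rule: an exact category name, a prefix pattern ending in `*`, and a `default` fallback, tried in that order. The precedence, the `Router` class, and the use of plain lists as stores are assumptions made for illustration, not a description of Scribe's source.

```python
class Router:
    """Maps a message category to a store: exact, then prefix, then default."""

    def __init__(self):
        self.exact = {}       # category name -> store
        self.prefixes = []    # (prefix, store) pairs, checked in insertion order
        self.default = None   # fallback store, if any

    def add(self, pattern, store):
        if pattern == "default":
            self.default = store
        elif pattern.endswith("*"):
            self.prefixes.append((pattern[:-1], store))
        else:
            self.exact[pattern] = store

    def route(self, category):
        if category in self.exact:
            return self.exact[category]
        for prefix, store in self.prefixes:
            if category.startswith(prefix):
                return store
        return self.default

    def handle(self, category, message):
        """Append the message to the matching store; False if none matches."""
        store = self.route(category)
        if store is None:
            return False
        store.append(message)
        return True
```

With rules for `payments`, `web*`, and `default`, a message in category `web_access` would land in the `web*` store, and a message in an unknown category would land in the default store.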

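(3) Storage system

The storage system is simply the set of stores in Scribe. Scribe currently supports many store types:

- file: writes to a file
- buffer: two-level storage with a primary store and a secondary store
- network: forwards to another Scribe server
- bucket: contains multiple stores and uses a hash to spread data across them
- null: discards the data
- thriftfile: writes to a Thrift TFileTransport file
- multi: writes the same data to several stores at once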
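
To make the difference between the bucket and multi stores concrete, here is a minimal in-memory sketch. All class names are hypothetical. The bucket key is assumed to be the text before the first colon, and CRC32 is used only because it is deterministic; neither is claimed to match what Scribe does internally.

```python
import zlib


class ListStore:
    """Simplest possible store: keeps messages in memory."""

    def __init__(self):
        self.messages = []

    def write(self, message):
        self.messages.append(message)


class NullStore:
    """Discards everything, like the null store type."""

    def write(self, message):
        pass


class MultiStore:
    """Writes every message to all child stores."""

    def __init__(self, stores):
        self.stores = list(stores)

    def write(self, message):
        for store in self.stores:
            store.write(message)


class BucketStore:
    """Hashes a key from each message to pick exactly one child store."""

    def __init__(self, stores):
        self.stores = list(stores)
        if not self.stores:
            raise ValueError("BucketStore needs at least one child store")

    def bucket_for(self, message):
        # Assumption for this sketch: the key is the text before the first ':'.
        key = message.split(":", 1)[0]
        return zlib.crc32(key.encode("utf-8")) % len(self.stores)

    def write(self, message):
        self.stores[self.bucket_for(message)].write(message)
```

The design difference is that a multi store duplicates data (every child sees every message), while a bucket store partitions it (each message goes to one child, and messages with the same key always go to the same child).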


【CentOS-7】

 

Install build dependencies

sudo yum install git make bison libtool automake openssl-devel gcc-c++ python-devel

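# libevent is a lightweight, open-source, high-performance event notification library written in C.
# Install libevent and libevent-devel
yum install libevent libevent-devel

# flex is a lexer generator: it turns regular-expression rules into C code that matches the corresponding strings.
# Install flex
yum install flex

# Install byacc
yum install byacc

# Install OpenJDK
yum install java-1.7.0-openjdk

# Ant is a build tool that automates compiling code, running tests, and packaging and redeploying the results.
# Install ant
yum install ant

# Autoconf generates shell scripts that configure source packages so they adapt to many Unix-like systems.
# Install autoconf
yum install autoconf

# Boost is a collection of C++ libraries that extend the C++ standard library.
# Install boost
yum install boost boost-devel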

# Install libicu-devel
yum install libicu-devel

# Install thrift
wget http://rpmfind.net/linux/epel/7/x86_64/Packages/t/thrift-0.9.1-15.el7.x86_64.rpm
yum install thrift-0.9.1-15.el7.x86_64.rpm

# Install fb303
wget http://rpmfind.net/linux/epel/7/x86_64/Packages/f/fb303-0.9.1-15.el7.x86_64.rpm
yum install fb303-0.9.1-15.el7.x86_64.rpm

Refresh the dynamic linker cache

/sbin/ldconfig

Download Scribe

git clone https://github.com/facebookarchive/scribe.git

Readme 

Archived Repo
=============

This is an archived project and is no longer supported or updated by Facebook. 
Please do not file issues or pull-requests against this repo. If you wish to 
continue to develop this code yourself, we recommend you fork it.

-------------

Introduction
============

Scribe is a server for aggregating log data that's streamed in real
time from clients. It is designed to be scalable and reliable.

See the Scribe Wiki for documentation:
http://wiki.github.com/facebook/scribe

Keep up to date on Scribe development by joining the Scribe Discussion Group:
http://groups.google.com/group/scribe-server/


License (See LICENSE file for full license)
===========================================
Copyright 2007-2008 Facebook

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


Hierarchy
=========

scribe/

  aclocal/
    Contains scripts for building/linking with Boost

  examples/
    Contains simple examples of using Scribe

  if/
    Contains Thrift interface for Scribe

  lib/
    Contains Python package for Scribe

  src/
    Contains Scribe source

  test/
    Contain php scripts for testing scribe


Requirements
============

[libevent] Event Notification library
[boost] Boost C++ library (version 1.36 or later)
[thrift] Thrift framework (version 0.5.0 or later)
[fb303] Facebook Bassline (included in thrift/contrib/fb303/)
   fb303 r697294 or later is required.
[hadoop] optional. version 0.19.1 or higher (http://hadoop.apache.org)

These libraries are open source and may be freely obtained, but they are not
provided as a part of this distribution.


Helpful tips:
-Thrift, fb303, and scribe installation expects python to be installed
 under /usr.  See PY_PREFIX option in 'configure --help' to change this path.
-Some python installs do not include python site-packages in the default
 python include path.  If python cannot find the installed packages for
 scribe or fb303, try setting the environment variable PYTHONPATH to the
 location of the installed packages.  This path gets output during
 'make install'. (Eg: PYTHONPATH='/usr/lib/python2.5/site-packages').


To build
========

./bootstrap.sh <configure options>
make

(If you have multiple versions of Boost installed, see Boost configure options below.)

Subsequent builds
=================

./bootstrap <configure options>
make

OR

./configure <configure options>
make

NOTE: After the first run with bootstrap.sh you can use "[ ./bootstrap | ./configure ] <options>" followed by "make"
to create builds with different configurations. "bootstrap" can be passed the same arguments as "configure".

Make sure that if you change configure.ac and|or add macros run "bootstrap.sh".
to regenerate configure. In short whenever in doubt run "bootstrap.sh".


Configure options
=================

To find all available configure options run
./configure --help

Use *only* the listed options.

Examples:
# To disable optimized builds and turn on debug. [ default has been set to optimized]
./configure --disable-opt

# To disable static libraries and enable shared libraries. [ default has been set to static]
./configure --disable-static

# To build scribe with Hadoop support
./configure --enable-hdfs

# If the build process cannot find your Hadoop/Jvm installs, you may need to specify them manually:
./configure --with-hadooppath=/usr/local/hadoop --enable-hdfs CPPFLAGS="-I/usr/local/java/include -I/usr/local/java/include/linux" LDFLAGS="-ljvm -lhdfs"

# To set thrift home to a non-default location
./configure --with-thriftpath=/myhome/local/thrift

# If Boost is installed in a non-default location or there are multiple Boost versions
# installed, you will need to specify the Boost path and library names
./configure --with-boost=/usr/local --with-boost-system=boost_system-gcc40-mt-1_36 --with-boost-filesystem=boost_filesystem-gcc40-mt-1_36


Install
=======

as root:
make install


Run
===

See the examples directory to learn how to use Scribe.


Acknowledgements
================
The build process for Scribe uses autoconf macros to compile/link with Boost.
These macros were written by Thomas Porschberg, Michael Tindal, and
Daniel Casimiro.  See the m4 files in the aclocal subdirectory for more
information.

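Run

# Enter the source tree
cd scribe

# Inspect the script
cat bootstrap.sh

# Run the script
./bootstrap.sh --prefix=/usr/local/scribe --with-thriftpath=/usr/local/thrift/ --with-fb303path=/usr/local/fb303/ --with-boost=/usr/local/boost/

# Build and install, as described in the README above
make
sudo make install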

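The Scribe configuration file has two parts: global configuration and store configuration.

Global configuration

port: the port the Scribe server listens on. The default is 0. It can be set with the -P command-line option or in the configuration file. In the source code the value is assigned to the variable port.

max_msg_per_second: the default is 0, and a value of 0 means the parameter is ignored. After recent changes this parameter is rarely relevant; max_queue_size is what effectively limits the incoming message rate. Used in scribeHandler::throttleDeny.

max_queue_size (bytes): the maximum size, in bytes, of the queue of incoming messages. The default is 5,000,000 bytes. Used in scribeHandler::Log.

check_interval (seconds): how often the stores are checked. The default is 5.

new_thread_per_category (yes/no): if yes, a new thread is created for every category seen; otherwise a single thread is created for each store defined in the configuration file. For prefix stores or the default store, setting this parameter to "no" causes all messages that match the category to be handled by a single store; otherwise a new store is created for each unique category name. The default is "yes".

num_thrift_server_threads: the number of threads that listen for incoming messages. The default is 3.

max_conn: the maximum number of connections.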
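
As an illustration of how these global settings and their defaults fit together, the sketch below parses the global part of a configuration file into a dictionary. It assumes the `key=value` syntax, `#` comments, and `<store>` sections used in Scribe's sample configuration files; the function name `parse_global_config` is hypothetical, and the defaults are the ones listed above (no default is given above for max_conn, so none is set here).

```python
# Defaults as listed in the parameter descriptions above.
DEFAULTS = {
    "port": 0,
    "max_msg_per_second": 0,
    "max_queue_size": 5000000,
    "check_interval": 5,
    "new_thread_per_category": True,
    "num_thrift_server_threads": 3,
}


def parse_global_config(text):
    """Return global settings from config text, falling back to DEFAULTS."""
    config = dict(DEFAULTS)
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if line.startswith("<"):
            break  # assumption: store sections follow the global settings
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "new_thread_per_category":
            config[key] = value.lower() == "yes"
        elif value.lstrip("-").isdigit():
            config[key] = int(value)
        else:
            config[key] = value  # keep non-numeric values as strings
    return config
```

For example, a file that only sets `port=1463` and `check_interval=3` would yield those two values plus the defaults for everything else.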

Other open-source log systems

Scribe home page: https://github.com/facebook/scribe

Chukwa home page: http://incubator.apache.org/chukwa/

Kafka home page: http://sna-projects.com/kafka/

Flume home page: https://github.com/cloudera/flume/


