A Brief Look at Service Instrumentation (2): Metrics

Same topic as before: why instrument a service at all?

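  Think of the resource monitor in an operating system: being able to see, in real time or near real time, how much CPU, memory, and other resources the whole system is consuming matters a great deal when we need to react quickly or tune the system. The same goes for monitoring a WebService that exposes interfaces or services to the outside: where and on which machine the work happened, how much CPU and memory it used, the response time of each service, how often errors occur, and so on. Once this information is recorded, we can see how the service behaves at runtime, which makes it much easier to find errors or pinpoint what to optimize.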


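  The simplest approach is to instrument the key points of the application, or the entry and exit of every routine, and keep sending the collected samples to a message queue or an in-memory DB, where other systems read, analyze, and display them.
  I talked about AOP earlier, and it does solve part of this problem. But trust that the world has plenty of wheels, and good ones too: there is almost always a ready-made library you can use.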
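The manual approach described above (time a call at its entry and exit, then hand the sample to a queue) could be sketched roughly as follows. This is only an illustration using the JDK: the names Sample, ManualProbe, and measure are made up for this example, and the in-memory BlockingQueue stands in for whatever message queue or in-memory DB you would really use.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Supplier;

/** One timing sample captured at a call's exit point (hypothetical type). */
final class Sample {
    final String name;
    final long elapsedNanos;
    final boolean failed;

    Sample(String name, long elapsedNanos, boolean failed) {
        this.name = name;
        this.elapsedNanos = elapsedNanos;
        this.failed = failed;
    }
}

/** Wraps a call, times it, and pushes the sample to an in-memory queue. */
final class ManualProbe {
    // Stand-in for the message queue; a separate consumer would drain it.
    static final BlockingQueue<Sample> SAMPLES = new LinkedBlockingQueue<>();

    static <T> T measure(String name, Supplier<T> body) {
        long start = System.nanoTime();   // entry point
        boolean failed = true;
        try {
            T result = body.get();
            failed = false;
            return result;
        } finally {                       // exit point, also reached on exceptions
            SAMPLES.offer(new Sample(name, System.nanoTime() - start, failed));
        }
    }
}
```

A call site would then look like `String user = ManualProbe.measure("fetchUser", () -> "alice");`. Writing this wrapper around every interesting method by hand gets tedious quickly, which is the gap a metrics library fills.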

---------------- divider ----------------

What is Metrics?

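  Metrics is a library for measuring and monitoring. It gives you insight into what your code is doing at runtime by capturing performance data at both the JVM and application level. It also ships modules that collect statistics for third-party libraries and applications such as Jetty, Logback, Log4j, Apache HttpClient, Ehcache, JDBI, and Jersey, and it can send measurements to Ganglia and Graphite for graphical monitoring.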
  

How to use Metrics

1. Add metrics-core to your Maven pom.xml:

<dependencies>
    <dependency>
        <groupId>com.codahale.metrics</groupId>
        <artifactId>metrics-core</artifactId>
        <version>${metrics.version}</version>
    </dependency>
</dependencies>

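2. Core features of the core package

  • A MetricRegistry is a container for metrics. It maintains a Map of names to metrics, and a service typically uses one instance.
  • Five metric types are supported: Gauges, Counters, Meters, Histograms, and Timers.
  • Metric values can be published through JMX, the console, CSV files, and SLF4J loggers.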

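I. Gauge

  A Gauge represents the instantaneous value of a measurement. When you drive a car, the current speed is a gauge value; when you take your temperature, the thermometer reading is a gauge value. While your program runs, memory usage and CPU utilization can both be measured with gauges. In other words, a gauge reports a snapshot of some state at a given moment, for example the current size of a queue.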

package com.netease.test.metrics;

import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.Gauge;
import com.codahale.metrics.JmxReporter;
import com.codahale.metrics.MetricRegistry;

import java.util.Queue;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

/**
 * Gauge test: report the number of pending jobs in real time
 */
public class TestGauges {
    /**
     * Create a registry. It is the core component: the container for an application's metrics, backed by a Map
     */
    private static final MetricRegistry metrics = new MetricRegistry();

    private static Queue<String> queue = new LinkedBlockingDeque<String>();

    /**
     * Print reports to the console
     */
    private static ConsoleReporter reporter = ConsoleReporter.forRegistry(metrics).build();

    public static void main(String[] args) throws InterruptedException {
        reporter.start(3, TimeUnit.SECONDS);

        // Create a Gauge
        Gauge<Integer> gauge = new Gauge<Integer>() {
            @Override
            public Integer getValue() {
                return queue.size();
            }
        };

        // Register it with the registry
        metrics.register(MetricRegistry.name(TestGauges.class, "pending-job", "size"), gauge);

        // Expose the metrics over JMX
        JmxReporter jmxReporter = JmxReporter.forRegistry(metrics).build();
        jmxReporter.start();

        // Simulate incoming jobs
        for (int i=0; i<20; i++){
            queue.add("a");
            Thread.sleep(1000);
        }

    }
}

/*
console output:
14-2-17 15:29:35 ===============================================================

-- Gauges ----------------------------------------------------------------------
com.netease.test.metrics.TestGauges.pending-job.size
             value = 4


14-2-17 15:29:38 ===============================================================

-- Gauges ----------------------------------------------------------------------
com.netease.test.metrics.TestGauges.pending-job.size
             value = 6


14-2-17 15:29:41 ===============================================================

-- Gauges ----------------------------------------------------------------------
com.netease.test.metrics.TestGauges.pending-job.size
             value = 9
 */

Every metric in a registry has a unique name. MetricRegistry provides a static helper method for building that name:

MetricRegistry.name(TestGauges.class, "pending-job", "size")

The generated name is com.netease.test.metrics.TestGauges.pending-job.size.
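To make the naming rule concrete, here is a small stand-alone imitation of that helper. It is not the library's code: the class MetricNames is invented for this example, and it assumes the rule is simply "fully qualified class name plus the remaining parts, joined with dots, skipping empty parts".

```java
/** Stand-in for the dotted-name helper; illustrative only, not the library's code. */
final class MetricNames {
    static String name(Class<?> owner, String... parts) {
        StringBuilder sb = new StringBuilder(owner.getName());
        for (String part : parts) {
            // Assumption: null or empty segments are skipped
            if (part != null && !part.isEmpty()) {
                sb.append('.').append(part);
            }
        }
        return sb.toString();
    }
}
```

Called as `MetricNames.name(TestGauges.class, "pending-job", "size")`, it produces the same kind of dotted name as shown above, so metrics that belong to one class sort together in reports.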
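In addition, the core package provides several specialized kinds of Gauge:

  • JMX Gauges: expose values that third-party libraries publish only through JMX.
  • Ratio Gauges: compute the ratio between two numbers.
  • Cached Gauges: cache the value of a metric that is expensive to compute.
  • Derivative Gauges: derive their value from the value of another Gauge.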
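Of these, the cached variant is the easiest to picture. The sketch below shows only the idea, using nothing but the JDK; CachedReading is a made-up class, not the library's CachedGauge, and the clock is passed in so the behaviour is easy to test.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Caches an expensive reading for a fixed time window (illustrative only). */
final class CachedReading<T> {
    private final Supplier<T> loader;   // the expensive computation
    private final long ttlNanos;        // how long a loaded value stays valid
    private final LongSupplier clock;   // e.g. System::nanoTime
    private T value;
    private long loadedAt;
    private boolean loaded;

    CachedReading(Supplier<T> loader, long ttlNanos, LongSupplier clock) {
        this.loader = loader;
        this.ttlNanos = ttlNanos;
        this.clock = clock;
    }

    /** Returns the cached value, reloading it once the window has elapsed. */
    synchronized T getValue() {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAt >= ttlNanos) {
            value = loader.get();
            loadedAt = now;
            loaded = true;
        }
        return value;
    }
}
```

A reporter that polls every few seconds would then trigger the expensive computation at most once per window, however often the value is read.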

Once jmxReporter.start() has been called, as in the code above, every metric registered in the registry can be viewed in JConsole or VisualVM.

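---------------- a small divider: introducing VisualVM ----------------

This was my first time using VisualVM, so here is a quick introduction to it and to viewing metric values with it.
First, configure IDEA as follows: for the class you want to test, set the VM options under Edit Configurations: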

-Dcom.sun.management.jmxremote.port=8088
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false


Next, set up VisualVM. Open bin/jvisualvm.exe under the JDK installation directory and install the MBeans plugin: Tools → Plugins → Available Plugins → MBeans → Install.


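Then add a JMX connection using the same port number configured in IDEA. (Note: do this only after the program is running, otherwise the connection will fail.)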


Now you can browse the values of your metrics through the MBeans tab.


Unfortunately, I could not find a way to chart MBean values. If anyone knows how, I would appreciate a pointer.

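II. Counter

A Counter is, in effect, a gauge over an AtomicLong: it maintains a count that you change with inc() and dec(). Usage is much the same as for a Gauge, and MetricRegistry provides the counter() method, which creates and registers a Counter in one call.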

package com.netease.test.metrics;

import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.Counter;
import com.codahale.metrics.MetricRegistry;

import java.util.LinkedList;
import java.util.Queue;
import java.util.concurrent.TimeUnit;
import static com.codahale.metrics.MetricRegistry.*;
/**
 * User: hzwangxx
 * Date: 14-2-14
 * Time: 14:02
 * Counter test
 */
public class TestCounter {

    /**
     * Create a registry. It is the core component: the container for an application's metrics, backed by a Map
     */
    private static final MetricRegistry metrics = new MetricRegistry();

    /**
     * Print reports to the console
     */
    private static ConsoleReporter reporter = ConsoleReporter.forRegistry(metrics).build();

    /**
     * Create a counter. It can also be created first and then registered, like this:
     * pendingJobs = new Counter();
     * metrics.register(MetricRegistry.name(TestCounter.class, "pending-jobs"), pendingJobs);
     */
    private static Counter pendingJobs = metrics.counter(name(TestCounter.class, "pedding-jobs"));
//    private static Counter pendingJobs = metrics.counter(MetricRegistry.name(TestCounter.class, "pedding-jobs"));



    private static Queue<String> queue = new LinkedList<String>();

    public static void add(String str) {
        pendingJobs.inc();
        queue.offer(str);
    }

    // Static like add(), so it can be called from main() without an instance
    public static String take() {
        pendingJobs.dec();
        return queue.poll();
    }

    public static void main(String[]args) throws InterruptedException {
        reporter.start(3, TimeUnit.SECONDS);
        while(true){
            add("1");
            Thread.sleep(1000);
        }

    }
}

/*
console output:
14-2-17 17:52:34 ===============================================================

-- Counters --------------------------------------------------------------------
com.netease.test.metrics.TestCounter.pedding-jobs
             count = 4


14-2-17 17:52:37 ===============================================================

-- Counters --------------------------------------------------------------------
com.netease.test.metrics.TestCounter.pedding-jobs
             count = 6


14-2-17 17:52:40 ===============================================================

-- Counters --------------------------------------------------------------------
com.netease.test.metrics.TestCounter.pedding-jobs
             count = 9

 */

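III. Meter

  A Meter measures the rate at which events occur over time (for example, requests per second), including the TPS over the last 1, 5, and 15 minutes. To track the requests to a service, create a Meter with metrics.meter() and call meter.mark() to record each request. The report contains the total number of requests, the mean rate per second, and the exponentially weighted moving average rates for the last 1, 5, and 15 minutes.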

package com.netease.test.metrics;

import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricRegistry;

import java.util.concurrent.TimeUnit;

import static com.codahale.metrics.MetricRegistry.*;

/**
 * Date: 14-2-17
 * Time: 18:34
 * Meter test
 */
public class TestMeters {
    /**
     * Create a registry. It is the core component: the container for an application's metrics, backed by a Map
     */
    private static final MetricRegistry metrics = new MetricRegistry();

    /**
     * Print reports to the console
     */
    private static ConsoleReporter reporter = ConsoleReporter.forRegistry(metrics).build();

    /**
     * Create a Meter
     */
    private static final Meter requests = metrics.meter(name(TestMeters.class, "request"));

    public static void handleRequest() {
        requests.mark();
    }

    public static void main(String[] args) throws InterruptedException {
        reporter.start(3, TimeUnit.SECONDS);
        while(true){
            handleRequest();
            Thread.sleep(100);
        }
    }

}

/*
14-2-17 18:43:08 ===============================================================

-- Meters ----------------------------------------------------------------------
com.netease.test.metrics.TestMeters.request
             count = 30
         mean rate = 9.95 events/second
     1-minute rate = 0.00 events/second
     5-minute rate = 0.00 events/second
    15-minute rate = 0.00 events/second


14-2-17 18:43:11 ===============================================================

-- Meters ----------------------------------------------------------------------
com.netease.test.metrics.TestMeters.request
             count = 60
         mean rate = 9.99 events/second
     1-minute rate = 10.00 events/second
     5-minute rate = 10.00 events/second
    15-minute rate = 10.00 events/second


14-2-17 18:43:14 ===============================================================

-- Meters ----------------------------------------------------------------------
com.netease.test.metrics.TestMeters.request
             count = 90
         mean rate = 9.99 events/second
     1-minute rate = 10.00 events/second
     5-minute rate = 10.00 events/second
    15-minute rate = 10.00 events/second
*/
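The first report above shows 0.00 for the 1-, 5-, and 15-minute rates even though 30 requests had already been counted. A likely explanation is that the moving averages are only updated on a periodic tick, and the first report was printed before the first tick. The sketch below is not the library's code; it is a minimal stand-alone illustration of an exponentially weighted moving average, assuming a 5-second tick interval. The class name MovingRate is made up for this example.

```java
/** Exponentially weighted moving average of an event rate (sketch, not thread-safe). */
final class MovingRate {
    private static final double TICK_SECONDS = 5.0;   // assumed tick interval
    private final double alpha;                       // weight given to the newest interval
    private long uncounted;                           // events since the last tick
    private double ratePerSecond;                     // stays 0.0 until the first tick
    private boolean initialized;

    MovingRate(double windowMinutes) {
        this.alpha = 1 - Math.exp(-TICK_SECONDS / (60.0 * windowMinutes));
    }

    void mark() {
        uncounted++;
    }

    /** Must be called once per tick interval. */
    void tick() {
        double instantRate = uncounted / TICK_SECONDS;
        uncounted = 0;
        if (initialized) {
            // Move the average a fraction alpha towards the latest interval's rate
            ratePerSecond += alpha * (instantRate - ratePerSecond);
        } else {
            ratePerSecond = instantRate;   // the first tick seeds the average
            initialized = true;
        }
    }

    double getRate() {
        return ratePerSecond;
    }
}
```

Under these assumptions, a report printed after 3 seconds sees a rate of 0.00 because no tick has happened yet. After the first tick, 50 events in 5 seconds seed the average at 10 events/second, which is consistent with the second report above.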

Still to come:

Histogram, Timer, and Health Checks

Moving on:

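IV. Histogram

A Histogram measures the distribution of a stream of values: minimum, maximum, mean, standard deviation, median, and the 75th, 95th, 98th, 99th, and 99.9th percentiles. For example, to see how the values of some request parameter are distributed, record them in a Histogram. Sample code: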

package com.netease.test.metrics;

import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.Histogram;
import com.codahale.metrics.MetricRegistry;

import java.util.Random;
import java.util.concurrent.TimeUnit;

import static com.codahale.metrics.MetricRegistry.name;

/**
 * Date: 14-2-17
 * Time: 18:34
 * Histogram test
 */
public class TestHistograms {
    /**
     * Create a registry. It is the core component: the container for an application's metrics, backed by a Map
     */
    private static final MetricRegistry metrics = new MetricRegistry();

    /**
     * Print reports to the console
     */
    private static ConsoleReporter reporter = ConsoleReporter.forRegistry(metrics).build();

    /**
     * Create a Histogram
     */
    private static final Histogram randomNums = metrics.histogram(name(TestHistograms.class, "random"));

    public static void handleRequest(double random) {
        randomNums.update((int) (random*100));
    }

    public static void main(String[] args) throws InterruptedException {
        reporter.start(3, TimeUnit.SECONDS);
        Random rand = new Random();
        while(true){
            handleRequest(rand.nextDouble());
            Thread.sleep(100);
        }
    }

}

/*
14-2-17 19:39:11 ===============================================================

-- Histograms ------------------------------------------------------------------
com.netease.test.metrics.TestHistograms.random
             count = 30
               min = 1
               max = 97
              mean = 45.93
            stddev = 29.12
            median = 39.50
              75% <= 71.00
              95% <= 95.90
              98% <= 97.00
              99% <= 97.00
            99.9% <= 97.00


14-2-17 19:39:14 ===============================================================

-- Histograms ------------------------------------------------------------------
com.netease.test.metrics.TestHistograms.random
             count = 60
               min = 0
               max = 97
              mean = 41.17
            stddev = 28.60
            median = 34.50
              75% <= 69.75
              95% <= 92.90
              98% <= 96.56
              99% <= 97.00
            99.9% <= 97.00


14-2-17 19:39:17 ===============================================================

-- Histograms ------------------------------------------------------------------
com.netease.test.metrics.TestHistograms.random
             count = 90
               min = 0
               max = 97
              mean = 44.67
            stddev = 28.47
            median = 43.00
              75% <= 71.00
              95% <= 91.90
              98% <= 96.18
              99% <= 97.00
            99.9% <= 97.00
*/

This one is really quite handy.
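In the reports above, the 98%, 99%, and 99.9% values all equal the maximum (97). With only a few dozen samples that is expected: the high percentiles fall beyond the last sorted value. The sketch below shows one common way to estimate a quantile, linear interpolation between neighbouring sorted values. It is a stand-alone illustration with a made-up class name, not the library's code, although the 95.90 in the first report is consistent with this kind of interpolation.

```java
import java.util.Arrays;

/** Quantile estimate over a sample using linear interpolation (sketch). */
final class Quantiles {
    static double quantile(long[] sample, double q) {
        if (q < 0.0 || q > 1.0) {
            throw new IllegalArgumentException("q must be in [0, 1]");
        }
        if (sample.length == 0) {
            return 0.0;
        }
        long[] sorted = sample.clone();
        Arrays.sort(sorted);
        double pos = q * (sorted.length + 1);   // 1-based position in the sorted sample
        if (pos < 1) {
            return sorted[0];                   // below the first value: clamp to the minimum
        }
        if (pos >= sorted.length) {
            return sorted[sorted.length - 1];   // beyond the last value: clamp to the maximum
        }
        long lower = sorted[(int) pos - 1];
        long upper = sorted[(int) pos];
        return lower + (pos - Math.floor(pos)) * (upper - lower);
    }
}
```

For the sample {1, 2, 3, 4}, the median works out to 2.5 and the 99th percentile is clamped to 4, the maximum, which is the same effect seen in the console output.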

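V. Timer

This is probably the one we want most often: a Timer measures both how often a piece of code is called and how long each call takes. A Histogram alone could record the durations; a Timer is in fact built on a Histogram and a Meter.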

package com.netease.test.metrics;

import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;

import java.util.Random;
import java.util.concurrent.TimeUnit;

import static com.codahale.metrics.MetricRegistry.name;

/**
 * Date: 14-2-17
 * Time: 18:34
 * Timer test
 */
public class TestTimers {
    /**
     * Create a registry. It is the core component: the container for an application's metrics, backed by a Map
     */
    private static final MetricRegistry metrics = new MetricRegistry();

    /**
     * Print reports to the console
     */
    private static ConsoleReporter reporter = ConsoleReporter.forRegistry(metrics).build();

    /**
     * Create a Timer
     */
    private static final Timer requests = metrics.timer(name(TestTimers.class, "request"));

    public static void handleRequest(int sleep) {
        Timer.Context context = requests.time();
        try {
            // the operation being timed
            Thread.sleep(sleep);
        } catch (InterruptedException e) {
            e.printStackTrace();
        } finally {
            context.stop();
        }

    }

    public static void main(String[] args) throws InterruptedException {
        reporter.start(3, TimeUnit.SECONDS);
        Random random = new Random();
        while(true){
            handleRequest(random.nextInt(1000));
        }
    }

}

/*
14-2-18 9:31:54 ================================================================

-- Timers ----------------------------------------------------------------------
com.netease.test.metrics.TestTimers.request
             count = 4
         mean rate = 1.33 calls/second
     1-minute rate = 0.00 calls/second
     5-minute rate = 0.00 calls/second
    15-minute rate = 0.00 calls/second
               min = 483.07 milliseconds
               max = 901.92 milliseconds
              mean = 612.64 milliseconds
            stddev = 196.32 milliseconds
            median = 532.79 milliseconds
              75% <= 818.31 milliseconds
              95% <= 901.92 milliseconds
              98% <= 901.92 milliseconds
              99% <= 901.92 milliseconds
            99.9% <= 901.92 milliseconds


14-2-18 9:31:57 ================================================================

-- Timers ----------------------------------------------------------------------
com.netease.test.metrics.TestTimers.request
             count = 8
         mean rate = 1.33 calls/second
     1-minute rate = 1.40 calls/second
     5-minute rate = 1.40 calls/second
    15-minute rate = 1.40 calls/second
               min = 41.07 milliseconds
               max = 968.19 milliseconds
              mean = 639.50 milliseconds
            stddev = 306.12 milliseconds
            median = 692.77 milliseconds
              75% <= 885.96 milliseconds
              95% <= 968.19 milliseconds
              98% <= 968.19 milliseconds
              99% <= 968.19 milliseconds
            99.9% <= 968.19 milliseconds


14-2-18 9:32:00 ================================================================

-- Timers ----------------------------------------------------------------------
com.netease.test.metrics.TestTimers.request
             count = 15
         mean rate = 1.67 calls/second
     1-minute rate = 1.40 calls/second
     5-minute rate = 1.40 calls/second
    15-minute rate = 1.40 calls/second
               min = 41.07 milliseconds
               max = 968.19 milliseconds
              mean = 591.35 milliseconds
            stddev = 302.96 milliseconds
            median = 650.56 milliseconds
              75% <= 838.07 milliseconds
              95% <= 968.19 milliseconds
              98% <= 968.19 milliseconds
              99% <= 968.19 milliseconds
            99.9% <= 968.19 milliseconds

*/
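The start/stop pattern in handleRequest() can be pictured with a tiny stand-alone timer. This is only a sketch of the idea using the JDK: SimpleTimer and its Context are made-up names, the list of durations plays the role of the histogram, and the call count plays the role of the meter.

```java
import java.util.ArrayList;
import java.util.List;

/** Records how often and how long a block of code runs (illustrative stand-in). */
final class SimpleTimer {
    private final List<Long> durationsNanos = new ArrayList<>(); // histogram side
    private long count;                                          // meter side

    /** Starts timing; closing the returned context records one sample. */
    Context time() {
        return new Context(System.nanoTime());
    }

    synchronized long getCount() {
        return count;
    }

    synchronized List<Long> getDurationsNanos() {
        return new ArrayList<>(durationsNanos);   // defensive copy
    }

    private synchronized void record(long nanos) {
        count++;
        durationsNanos.add(nanos);
    }

    final class Context implements AutoCloseable {
        private final long start;

        private Context(long start) {
            this.start = start;
        }

        @Override
        public void close() {
            record(System.nanoTime() - start);
        }
    }
}
```

Because the context is AutoCloseable, a call site can use try-with-resources, `try (SimpleTimer.Context ignored = timer.time()) { ... }`, so the sample is recorded even when the timed code throws, just as the finally block does in the example above.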

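VI. Health Checks

Metrics also provides a separate module, Health Checks, for checking whether an application, its submodules, or the modules it depends on are working properly. The module is independent of metrics-core; to use it, add the metrics-healthchecks dependency.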

<dependency>                                    
  <groupId>com.codahale.metrics</groupId>       
  <artifactId>metrics-healthchecks</artifactId> 
  <version>3.0.1</version>         
</dependency>

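Usage is similar to the metric types above, but you need a separate container, HealthCheckRegistry. The component to be checked extends the abstract class HealthCheck and implements check(), and is then registered with the HealthCheckRegistry. To evaluate the outcome, call isHealthy() on each result. The following example checks the status of two databases: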

package com.netease.test.metrics;

import com.codahale.metrics.health.HealthCheck;
import com.codahale.metrics.health.HealthCheckRegistry;

import java.util.Map;
import java.util.Random;

/**
 * Date: 14-2-18
 * Time: 9:57
 */
public class DatabaseHealthCheck extends HealthCheck{
    private final Database database;

    public DatabaseHealthCheck(Database database) {
        this.database = database;
    }

    @Override
    protected Result check() throws Exception {
        if (database.ping()) {
            return Result.healthy();
        }
        return Result.unhealthy("Can't ping database.");
    }

    /**
     * Mock Database object
     */
    static class Database {
        /**
         * Mock ping method
         * @return a random boolean
         */
        public boolean ping() {
            Random random = new Random();
            return random.nextBoolean();
        }
    }

    public static void main(String[] args) {
//        MetricRegistry metrics = new MetricRegistry();
//        ConsoleReporter reporter = ConsoleReporter.forRegistry(metrics).build();
        HealthCheckRegistry registry = new HealthCheckRegistry();
        registry.register("database1", new DatabaseHealthCheck(new Database()));
        registry.register("database2", new DatabaseHealthCheck(new Database()));
        while (true) {
            for (Map.Entry<String, Result> entry : registry.runHealthChecks().entrySet()) {
                if (entry.getValue().isHealthy()) {
                    System.out.println(entry.getKey() + ": OK");
                } else {
                    System.err.println(entry.getKey() + ": FAIL, error message: " + entry.getValue().getMessage());
                    final Throwable e = entry.getValue().getError();
                    if (e != null) {
                        e.printStackTrace();
                    }
                }
            }
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop polling
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}

/*
console output:
database1: OK
database2: FAIL, error message: Can't ping database.
database1: FAIL, error message: Can't ping database.
database2: OK
database1: OK
database2: FAIL, error message: Can't ping database.
database1: FAIL, error message: Can't ping database.
database2: OK
database1: FAIL, error message: Can't ping database.
database2: FAIL, error message: Can't ping database.
database1: FAIL, error message: Can't ping database.
database2: FAIL, error message: Can't ping database.
database1: OK
database2: OK
database1: OK
database2: FAIL, error message: Can't ping database.
database1: FAIL, error message: Can't ping database.
database2: OK
database1: OK
database2: OK
database1: FAIL, error message: Can't ping database.
database2: OK
database1: OK
database2: OK
database1: OK
database2: OK
database1: OK
database2: FAIL, error message: Can't ping database.
database1: FAIL, error message: Can't ping database.
database2: FAIL, error message: Can't ping database.

 */

Integrating Metrics with Spring

The metrics-spring library integrates Metrics with Spring and can be configured through XML or annotations.
1. Add the dependency:

<dependency>
    <groupId>com.ryantenney.metrics</groupId>
    <artifactId>metrics-spring</artifactId>
    <version>3.0.1</version>
</dependency>

2. XML configuration

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:metrics="http://www.ryantenney.com/schema/metrics"
       xsi:schemaLocation="
           http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans-3.2.xsd
           http://www.ryantenney.com/schema/metrics
           http://www.ryantenney.com/schema/metrics/metrics-3.0.xsd">
    <!-- Registry should be defined in only one context XML file -->
    <metrics:metric-registry id="metrics" />
    <!-- annotation-driven must be included in all context files -->
    <metrics:annotation-driven metric-registry="metrics" />
    <!-- (Optional) Reporter should be defined in only one context XML file -->
    <metrics:reporter type="console" metric-registry="metrics" period="1m" />
    <!-- (Optional) The metrics in this example require the metrics-jvm jar-->
    <metrics:register metric-registry="metrics">
        <bean metrics:name="jvm.gc" class="com.codahale.metrics.jvm.GarbageCollectorMetricSet" />
        <bean metrics:name="jvm.memory" class="com.codahale.metrics.jvm.MemoryUsageGaugeSet" />
        <bean metrics:name="jvm.thread-states" class="com.codahale.metrics.jvm.ThreadStatesGaugeSet" />
        <bean metrics:name="jvm.fd.usage" class="com.codahale.metrics.jvm.FileDescriptorRatioGauge" />
    </metrics:register>
    <!-- Beans and other Spring config -->
</beans>

3. Java configuration with annotations

import java.util.concurrent.TimeUnit;
import org.springframework.context.annotation.Configuration;
import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.SharedMetricRegistries;
import com.ryantenney.metrics.spring.config.annotation.EnableMetrics;
import com.ryantenney.metrics.spring.config.annotation.MetricsConfigurerAdapter;
@Configuration
@EnableMetrics
public class SpringConfiguringClass extends MetricsConfigurerAdapter {
    @Override
    public void configureReporters(MetricRegistry metricRegistry) {
        ConsoleReporter
            .forRegistry(metricRegistry)
            .build()
            .start(1, TimeUnit.MINUTES);
    }
}