zap: Logger's Design Approach

Preface

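The previous articles in this series mostly traced the execution logic and the way the code handles things, in order to get familiar with the code and understand how to use it. But why is zap fast, and how is it designed? Now that the code is familiar, this article looks back at zap's design approach to see whether it answers some of the earlier questions.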

Design Approach

The Performance section of the zap README says:

For applications that log in the hot path, reflection-based serialization and 
string formatting are prohibitively expensive — they're CPU-intensive and 
make many small allocations. Put differently, using encoding/json and 
fmt.Fprintf to log tons of interface{}s makes your application slow.

Zap takes a different approach. It includes a reflection-free, zero-allocation 
JSON encoder, and the base Logger strives to avoid serialization overhead 
and allocations wherever possible. By building the high-level SugaredLogger 
on that foundation, zap lets users choose when they need to count every 
allocation and when they'd prefer a more familiar, loosely typed API.

As measured by its own benchmarking suite, not only is zap more performant 
than comparable structured logging packages — it's also faster than the 
standard library. Like all benchmarks, take these with a grain of salt.

The key points:

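Reflection-based serialization and string formatting are expensive: they consume a lot of CPU and make many small allocations. Logging large numbers of interface{} values through encoding/json and fmt.Fprintf therefore slows an application down (both rely on reflection). zap's approach is to avoid that serialization overhead and those allocations wherever possible by building an encoder that does not depend on reflection. Fields of type ReflectType are the exception, and ReflectType is best reserved as an extension point.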

Encoder

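The encoder is the core of zap. It is responsible for serializing and formatting each log entry, and its performance determines the performance of the whole library.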

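The main idea of the encoder implementation is to give up direct support for interface{} and handle concrete types directly, which avoids the reflection used in the standard library and improves performance.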
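To make the contrast concrete, here is a minimal sketch using only the standard library. It is not zap's code: appendIntField, typedEncode, and reflectEncode are hypothetical names, and the typed path assumes the key needs no JSON escaping. The typed path appends a known int64 with strconv, while the other path hands an interface{} value to encoding/json, which has to inspect it at run time.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// appendIntField appends `"key":value` to buf using strconv, so no
// reflection is involved. It assumes key needs no JSON escaping.
func appendIntField(buf []byte, key string, v int64) []byte {
	buf = append(buf, '"')
	buf = append(buf, key...)
	buf = append(buf, '"', ':')
	return strconv.AppendInt(buf, v, 10)
}

// typedEncode builds a one-field JSON object by hand.
func typedEncode(key string, v int64) string {
	buf := make([]byte, 0, 64)
	buf = append(buf, '{')
	buf = appendIntField(buf, key, v)
	buf = append(buf, '}')
	return string(buf)
}

// reflectEncode produces the same object through encoding/json, which has
// to examine the map and its interface{} values via reflection.
func reflectEncode(key string, v int64) (string, error) {
	b, err := json.Marshal(map[string]interface{}{key: v})
	return string(b), err
}

func main() {
	typed := typedEncode("count", 42)
	reflected, err := reflectEncode("count", 42)
	if err != nil {
		panic(err)
	}
	fmt.Println(typed)
	fmt.Println(reflected)
}
```

Both paths produce the same bytes for this input. The difference is in how much work happens per call, which is what matters on a hot logging path.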

JSON encoder

func (enc *jsonEncoder) EncodeEntry(ent Entry, fields []Field) (*buffer.Buffer, error) {
    final := enc.clone()
    final.buf.AppendByte('{')

    if final.LevelKey != "" {
        final.addKey(final.LevelKey)
        cur := final.buf.Len()
        final.EncodeLevel(ent.Level, final)
        if cur == final.buf.Len() {
            // User-supplied EncodeLevel was a no-op. Fall back to strings to keep
            // output JSON valid.
            final.AppendString(ent.Level.String())
        }
    }
    if final.TimeKey != "" {
        final.AddTime(final.TimeKey, ent.Time)
    }
    if ent.LoggerName != "" && final.NameKey != "" {
        final.addKey(final.NameKey)
        cur := final.buf.Len()
        nameEncoder := final.EncodeName

        // if no name encoder provided, fall back to FullNameEncoder for backwards
        // compatibility
        if nameEncoder == nil {
            nameEncoder = FullNameEncoder
        }

        nameEncoder(ent.LoggerName, final)
        if cur == final.buf.Len() {
            // User-supplied EncodeName was a no-op. Fall back to strings to
            // keep output JSON valid.
            final.AppendString(ent.LoggerName)
        }
    }
    if ent.Caller.Defined && final.CallerKey != "" {
        final.addKey(final.CallerKey)
        cur := final.buf.Len()
        final.EncodeCaller(ent.Caller, final)
        if cur == final.buf.Len() {
            // User-supplied EncodeCaller was a no-op. Fall back to strings to
            // keep output JSON valid.
            final.AppendString(ent.Caller.String())
        }
    }
    if final.MessageKey != "" {
        final.addKey(enc.MessageKey)
        final.AppendString(ent.Message)
    }
    if enc.buf.Len() > 0 {
        final.addElementSeparator()
        final.buf.Write(enc.buf.Bytes())
    }
    addFields(final, fields)
    final.closeOpenNamespaces()
    if ent.Stack != "" && final.StacktraceKey != "" {
        final.AddString(final.StacktraceKey, ent.Stack)
    }
    final.buf.AppendByte('}')
    if final.LineEnding != "" {
        final.buf.AppendString(final.LineEnding)
    } else {
        final.buf.AppendString(DefaultLineEnding)
    }

    ret := final.buf
    putJSONEncoder(final)
    return ret, nil
}

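The code above is the entry point of the JSON encoder. It shows the following:

1. Parameters and data have fixed types, so nothing has to be resolved through reflection before it is processed, which is faster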

func (enc *jsonEncoder) addKey(key string) {
    enc.addElementSeparator()
    enc.buf.AppendByte('"')
    enc.safeAddString(key)
    enc.buf.AppendByte('"')
    enc.buf.AppendByte(':')
    if enc.spaced {
        enc.buf.AppendByte(' ')
    }
}

func (enc *jsonEncoder) AppendString(val string) {
    enc.addElementSeparator()
    enc.buf.AppendByte('"')
    enc.safeAddString(val)
    enc.buf.AppendByte('"')
}

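Both functions take a string and append it straight to the buffer. Nothing has to be inspected at run time, so the work is cheaper.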

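2. Serialization follows a fixed order

According to the source, the order is:

LevelKey->TimeKey->NameKey->CallerKey->MessageKey->[]Field->StacktraceKey->LineEnding

Note: when a key is empty, the corresponding portion is not serialized. LineEnding is the exception: an empty value falls back to DefaultLineEnding.

Apart from the message and []Field, which the caller supplies on each log call, the other values are carried in Entry, and their keys are set when the logger is created.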
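The fixed order and the "empty key means omitted" rule can be sketched as follows. This is a simplified illustration rather than zap's implementation: the keys struct is a hypothetical, cut-down stand-in for EncoderConfig, and strconv.AppendQuote is used in place of real JSON string escaping.

```go
package main

import (
	"fmt"
	"strconv"
)

// keys is a hypothetical, cut-down stand-in for zap's EncoderConfig.
type keys struct {
	LevelKey   string
	TimeKey    string
	MessageKey string
}

// addString appends `"key":"val"`, inserting a comma when the buffer
// already holds a previous element. strconv.AppendQuote stands in for
// real JSON escaping and is only equivalent for simple ASCII strings.
func addString(buf []byte, key, val string) []byte {
	if len(buf) > 0 && buf[len(buf)-1] != '{' {
		buf = append(buf, ',')
	}
	buf = strconv.AppendQuote(buf, key)
	buf = append(buf, ':')
	return strconv.AppendQuote(buf, val)
}

// encode writes level, time, and message in a fixed order and skips any
// portion whose key is empty.
func encode(k keys, level, ts, msg string) string {
	buf := []byte{'{'}
	if k.LevelKey != "" {
		buf = addString(buf, k.LevelKey, level)
	}
	if k.TimeKey != "" {
		buf = addString(buf, k.TimeKey, ts)
	}
	if k.MessageKey != "" {
		buf = addString(buf, k.MessageKey, msg)
	}
	buf = append(buf, '}', '\n')
	return string(buf)
}

func main() {
	full := keys{LevelKey: "level", TimeKey: "ts", MessageKey: "msg"}
	noTime := keys{LevelKey: "level", MessageKey: "msg"}
	fmt.Print(encode(full, "info", "2024-01-01T00:00:00Z", "started"))
	fmt.Print(encode(noTime, "info", "2024-01-01T00:00:00Z", "started"))
}
```

The second call drops the timestamp entirely because TimeKey is empty, while the relative order of the remaining keys stays the same.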

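3. fields are serialized at the top level of the JSON object. Field has fast, type-specific handling, but the reflection type ReflectType should be avoided where possible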

func (f Field) AddTo(enc ObjectEncoder) {
    var err error
    switch f.Type {
    case ArrayMarshalerType:
        err = enc.AddArray(f.Key, f.Interface.(ArrayMarshaler))
    case ObjectMarshalerType:
        err = enc.AddObject(f.Key, f.Interface.(ObjectMarshaler))
    case BinaryType:
        enc.AddBinary(f.Key, f.Interface.([]byte))
    case BoolType:
        enc.AddBool(f.Key, f.Integer == 1)
    case ByteStringType:
        enc.AddByteString(f.Key, f.Interface.([]byte))
    case Complex128Type:
        enc.AddComplex128(f.Key, f.Interface.(complex128))
    case Complex64Type:
        enc.AddComplex64(f.Key, f.Interface.(complex64))
    case DurationType:
        enc.AddDuration(f.Key, time.Duration(f.Integer))
    case Float64Type:
        enc.AddFloat64(f.Key, math.Float64frombits(uint64(f.Integer)))
    case Float32Type:
        enc.AddFloat32(f.Key, math.Float32frombits(uint32(f.Integer)))
    case Int64Type:
        enc.AddInt64(f.Key, f.Integer)
    case Int32Type:
        enc.AddInt32(f.Key, int32(f.Integer))
    case Int16Type:
        enc.AddInt16(f.Key, int16(f.Integer))
    case Int8Type:
        enc.AddInt8(f.Key, int8(f.Integer))
    case StringType:
        enc.AddString(f.Key, f.String)
    case TimeType:
        if f.Interface != nil {
            enc.AddTime(f.Key, time.Unix(0, f.Integer).In(f.Interface.(*time.Location)))
        } else {
            // Fall back to UTC if location is nil.
            enc.AddTime(f.Key, time.Unix(0, f.Integer))
        }
    case Uint64Type:
        enc.AddUint64(f.Key, uint64(f.Integer))
    case Uint32Type:
        enc.AddUint32(f.Key, uint32(f.Integer))
    case Uint16Type:
        enc.AddUint16(f.Key, uint16(f.Integer))
    case Uint8Type:
        enc.AddUint8(f.Key, uint8(f.Integer))
    case UintptrType:
        enc.AddUintptr(f.Key, uintptr(f.Integer))
    case ReflectType:
        err = enc.AddReflected(f.Key, f.Interface)
    case NamespaceType:
        enc.OpenNamespace(f.Key)
    case StringerType:
        err = encodeStringer(f.Key, f.Interface, enc)
    case ErrorType:
        encodeError(f.Key, f.Interface.(error), enc)
    case SkipType:
        break
    default:
        panic(fmt.Sprintf("unknown field type: %v", f))
    }

    if err != nil {
        enc.AddString(fmt.Sprintf("%sError", f.Key), err.Error())
    }
}

type jsonEncoder struct {
    *EncoderConfig
    buf            *buffer.Buffer
    spaced         bool // include spaces after colons and commas
    openNamespaces int

    // for encoding generic values by reflection
    reflectBuf *buffer.Buffer
    reflectEnc *json.Encoder
}

type ObjectEncoder interface {
    ...
    // AddReflected uses reflection to serialize arbitrary objects, so it's slow
    // and allocation-heavy.
    AddReflected(key string, value interface{}) error
    ...
}

func (enc *jsonEncoder) AddReflected(key string, obj interface{}) error {
    enc.resetReflectBuf()
    err := enc.reflectEnc.Encode(obj)
    if err != nil {
        return err
    }
    enc.reflectBuf.TrimNewline()
    enc.addKey(key)
    _, err = enc.buf.Write(enc.reflectBuf.Bytes())
    return err
}

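zap has dedicated handling for each Field type, and except for ReflectType the value can be appended directly to the buffer. ReflectType goes through the standard library's serialization (the reflectEnc field is a *json.Encoder), which relies on reflection. When using zap, avoid ReflectType fields unless they are really necessary: they give up zap's advantage, and the extra processing step costs further performance.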
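The AddTo switch works because Field is a tagged union: a type tag plus a few fixed slots. The sketch below shows that idea in reduced form. The field type, its constructors, and appendTo are hypothetical names for illustration only; the one detail taken from the source above is that floats and bools are packed into the integer slot (math.Float64frombits and f.Integer == 1).

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

type fieldType uint8

const (
	int64Type fieldType = iota
	float64Type
	boolType
	stringType
)

// field is a hypothetical, reduced tagged union: one type tag plus fixed
// slots, so simple values never need to be boxed in an interface{}.
type field struct {
	Key     string
	Type    fieldType
	Integer int64
	String  string
}

func intField(key string, v int64) field {
	return field{Key: key, Type: int64Type, Integer: v}
}

func floatField(key string, v float64) field {
	// Store the IEEE 754 bit pattern in the integer slot.
	return field{Key: key, Type: float64Type, Integer: int64(math.Float64bits(v))}
}

func boolField(key string, v bool) field {
	f := field{Key: key, Type: boolType}
	if v {
		f.Integer = 1
	}
	return f
}

func strField(key, v string) field {
	return field{Key: key, Type: stringType, String: v}
}

// appendTo dispatches on the type tag and appends `"key":value` to buf.
func (f field) appendTo(buf []byte) []byte {
	buf = strconv.AppendQuote(buf, f.Key)
	buf = append(buf, ':')
	switch f.Type {
	case int64Type:
		return strconv.AppendInt(buf, f.Integer, 10)
	case float64Type:
		v := math.Float64frombits(uint64(f.Integer))
		return strconv.AppendFloat(buf, v, 'f', -1, 64)
	case boolType:
		return strconv.AppendBool(buf, f.Integer == 1)
	case stringType:
		return strconv.AppendQuote(buf, f.String)
	default:
		panic("unknown field type")
	}
}

func main() {
	fields := []field{
		intField("n", 7),
		floatField("ratio", 0.25),
		boolField("ok", true),
		strField("who", "zap"),
	}
	for _, f := range fields {
		fmt.Println(string(f.appendTo(nil)))
	}
}
```

Every branch of the switch knows the concrete type it is writing, so the only case that would need reflection is one that accepts an arbitrary interface{} value, which is the role ReflectType plays in zap.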

4. When a user-supplied sub-encoder writes nothing, a default is used so that the JSON stays valid

if final.LevelKey != "" {
    final.addKey(final.LevelKey)
    cur := final.buf.Len()
    final.EncodeLevel(ent.Level, final)
    if cur == final.buf.Len() {
        // User-supplied EncodeLevel was a no-op. Fall back to strings to keep
        // output JSON valid.
        final.AppendString(ent.Level.String())
    }
}
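The same "compare the buffer length before and after the hook" check can be reproduced in isolation. The following is a minimal sketch with hypothetical names (levelEncoder, writeLevel, render, upperLevel, noopLevel) and a strings.Builder standing in for zap's buffer. It is not zap's API.

```go
package main

import (
	"fmt"
	"strings"
)

// levelEncoder is a hypothetical user-supplied hook that may or may not
// write anything to the buffer.
type levelEncoder func(level string, buf *strings.Builder)

// upperLevel writes the level name in upper case, quoted.
func upperLevel(level string, buf *strings.Builder) {
	buf.WriteString(`"` + strings.ToUpper(level) + `"`)
}

// noopLevel writes nothing, like a hook that forgot to emit a value.
func noopLevel(string, *strings.Builder) {}

// writeLevel writes `"level":` followed by whatever the hook produces. If
// the hook leaves the buffer length unchanged, a default value is written
// so the key is never left without a value.
func writeLevel(buf *strings.Builder, level string, enc levelEncoder) {
	buf.WriteString(`"level":`)
	before := buf.Len()
	enc(level, buf)
	if buf.Len() == before {
		// The hook was a no-op: fall back to the plain level name.
		buf.WriteString(`"` + level + `"`)
	}
}

// render wraps the level in a one-key JSON object.
func render(level string, enc levelEncoder) string {
	var buf strings.Builder
	buf.WriteByte('{')
	writeLevel(&buf, level, enc)
	buf.WriteByte('}')
	return buf.String()
}

func main() {
	fmt.Println(render("warn", upperLevel))
	fmt.Println(render("warn", noopLevel))
}
```

Without the fallback, the no-op hook would leave a key followed directly by the closing brace, which is not valid JSON. Checking the length is cheap and needs no cooperation from the hook.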

CONSOLE encoder

func (c consoleEncoder) EncodeEntry(ent Entry, fields []Field) (*buffer.Buffer, error) {
    line := bufferpool.Get()

    // We don't want the entry's metadata to be quoted and escaped (if it's
    // encoded as strings), which means that we can't use the JSON encoder. The
    // simplest option is to use the memory encoder and fmt.Fprint.
    //
    // If this ever becomes a performance bottleneck, we can implement
    // ArrayEncoder for our plain-text format.
    arr := getSliceEncoder()
    if c.TimeKey != "" && c.EncodeTime != nil {
        c.EncodeTime(ent.Time, arr)
    }
    if c.LevelKey != "" && c.EncodeLevel != nil {
        c.EncodeLevel(ent.Level, arr)
    }
    if ent.LoggerName != "" && c.NameKey != "" {
        nameEncoder := c.EncodeName

        if nameEncoder == nil {
            // Fall back to FullNameEncoder for backward compatibility.
            nameEncoder = FullNameEncoder
        }

        nameEncoder(ent.LoggerName, arr)
    }
    if ent.Caller.Defined && c.CallerKey != "" && c.EncodeCaller != nil {
        c.EncodeCaller(ent.Caller, arr)
    }
    for i := range arr.elems {
        if i > 0 {
            line.AppendByte('\t')
        }
        fmt.Fprint(line, arr.elems[i])
    }
    putSliceEncoder(arr)

    // Add the message itself.
    if c.MessageKey != "" {
        c.addTabIfNecessary(line)
        line.AppendString(ent.Message)
    }

    // Add any structured context.
    c.writeContext(line, fields)

    // If there's no stacktrace key, honor that; this allows users to force
    // single-line output.
    if ent.Stack != "" && c.StacktraceKey != "" {
        line.AppendByte('\n')
        line.AppendString(ent.Stack)
    }

    if c.LineEnding != "" {
        line.AppendString(c.LineEnding)
    } else {
        line.AppendString(DefaultLineEnding)
    }
    return line, nil
}

func (c consoleEncoder) writeContext(line *buffer.Buffer, extra []Field) {
    context := c.jsonEncoder.Clone().(*jsonEncoder)
    defer context.buf.Free()

    addFields(context, extra)
    context.closeOpenNamespaces()
    if context.buf.Len() == 0 {
        return
    }

    c.addTabIfNecessary(line)
    line.AppendByte('{')
    line.Write(context.buf.Bytes())
    line.AppendByte('}')
}

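The console encoder handles the keys in almost the same order as the JSON encoder (it writes the time before the level), and it also builds each line by appending to a buffer. The metadata, however, goes through a slice encoder and fmt.Fprint, which the source comment describes as the simplest option rather than the fastest one. A few points are worth noting:

1. LevelKey and the other keys are not written to the output; they only decide whether the corresponding value is written

2. A user-supplied sub-encoder may write nothing, in which case nothing appears in the output

3. Elements are separated by \t

4. fields are still serialized as JSON and then appended to the line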
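The resulting line layout can be sketched like this. consoleLine is a hypothetical helper, not part of zap. It assumes the metadata elements are already formatted as strings and that jsonContext holds the comma-separated "key":value pairs a JSON encoder produced for the fields. Unlike the real encoder, it decides whether to print the message by checking the message itself rather than MessageKey.

```go
package main

import (
	"fmt"
	"strings"
)

// consoleLine joins the already-formatted metadata elements with tabs,
// then appends the message and, if present, the JSON-encoded context
// wrapped in braces. jsonContext is assumed to be pre-encoded.
func consoleLine(meta []string, msg, jsonContext string) string {
	var line strings.Builder
	for i, m := range meta {
		if i > 0 {
			line.WriteByte('\t')
		}
		line.WriteString(m)
	}
	if msg != "" {
		if line.Len() > 0 {
			line.WriteByte('\t')
		}
		line.WriteString(msg)
	}
	if jsonContext != "" {
		if line.Len() > 0 {
			line.WriteByte('\t')
		}
		line.WriteString("{" + jsonContext + "}")
	}
	line.WriteByte('\n')
	return line.String()
}

func main() {
	meta := []string{"2024-01-01T00:00:00Z", "INFO", "main.go:10"}
	// %q makes the tab separators visible in the printed result.
	fmt.Printf("%q\n", consoleLine(meta, "started", `"port":8080`))
}
```

The keys never appear in the metadata part of the line; only the structured context keeps its key names, because it is still JSON.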

With the encoders covered, it is worth going back to the design of EncoderConfig inside Config.

EncoderConfig

type EncoderConfig struct {
    // Set the keys used for each log entry. If any key is empty, that portion
    // of the entry is omitted.
    MessageKey    string `json:"messageKey" yaml:"messageKey"`
    LevelKey      string `json:"levelKey" yaml:"levelKey"`
    TimeKey       string `json:"timeKey" yaml:"timeKey"`
    NameKey       string `json:"nameKey" yaml:"nameKey"`
    CallerKey     string `json:"callerKey" yaml:"callerKey"`
    StacktraceKey string `json:"stacktraceKey" yaml:"stacktraceKey"`
    LineEnding    string `json:"lineEnding" yaml:"lineEnding"`
    // Configure the primitive representations of common complex types. For
    // example, some users may want all time.Times serialized as floating-point
    // seconds since epoch, while others may prefer ISO8601 strings.
    EncodeLevel    LevelEncoder    `json:"levelEncoder" yaml:"levelEncoder"`
    EncodeTime     TimeEncoder     `json:"timeEncoder" yaml:"timeEncoder"`
    EncodeDuration DurationEncoder `json:"durationEncoder" yaml:"durationEncoder"`
    EncodeCaller   CallerEncoder   `json:"callerEncoder" yaml:"callerEncoder"`
    // Unlike the other primitive type encoders, EncodeName is optional. The
    // zero value falls back to FullNameEncoder.
    EncodeName NameEncoder `json:"nameEncoder" yaml:"nameEncoder"`
}

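EncoderConfig fixes the keys for the information commonly found in a log entry, such as Level, Time, Msg, and Caller, and lets each one be given a custom name and encoder. Because the keys and data types of a log entry are fixed, the encoder can handle each value according to its type at encoding time, without reflection, which is quick and efficient. It also sets a convention and cuts down on misuse of parameters.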

Summary

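Logger uses Config to fix the types and keys of the common log parameters, and with those keys it can serialize JSON quickly. Overall, Logger appears to trade some freedom for high performance, but given the configurable formatting and the extensibility of Field, it covers the needs of the vast majority of cases, and the performance gain is a real advantage. For more flexibility zap also provides SugaredLogger, which is slightly slower than Logger; it will be discussed in a later chapter.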
