MyDLNote-Enhancement: Kindling the Darkness: a Practical Low-light Image Enhancer (with KinD++ appended)

Kindling the Darkness: a Practical Low-light Image Enhancer

In addition, the newer follow-up work KinD++ is recommended; it is covered at the end of this note.

[paper] : https://arxiv.org/pdf/1905.04161v1.pdf

[Tensorflow] : https://github.com/zhangyhuaee/KinD

[KinD++ Tensorflow] : https://github.com/zhangyhuaee/KinD_plus

 

Table of Contents

Kindling the Darkness: a Practical Low-light Image Enhancer

Abstract 

Introduction

Problem Statement & Motivation

Previous Arts

Our Contributions

Methodology

Consideration & Motivation

KinD Network

Experimental Validation

Implementation Details

Performance Evaluation

KinD++

Network Architecture Comparison

NIQE code



Abstract 

Images captured under low-light conditions often suffer from (partially) poor visibility. Besides unsatisfactory lightings, multiple types of degradations, such as noise and color distortion due to the limited quality of cameras, hide in the dark. In other words, solely turning up the brightness of dark regions will inevitably amplify hidden artifacts.

Background: low-light images are not merely dark; they also suffer from noise and color distortion.

This work builds a simple yet effective network for Kindling the Darkness (denoted as KinD), which, inspired by Retinex theory, decomposes images into two components. One component (illumination) is responsible for light adjustment, while the other (reflectance) for degradation removal. In such a way, the original space is decoupled into two smaller subspaces, expecting to be better regularized/learned. It is worth noting that our network is trained with paired images shot under different exposure conditions, instead of using any ground-truth reflectance and illumination information.

Highlights of this paper:

1. Image decomposition: the low-light image is decomposed into illumination and reflectance; the former is responsible for light adjustment, the latter for removing degradations (noise, color distortion). The benefit of the decomposition is that each component can be better regularized/learned.

2. Training input: two images shot under different exposure conditions, rather than a low-light image paired with a ground-truth image (the advantage being that it is hard to define how bright an image must be to count as ground truth).

Extensive experiments are conducted to demonstrate the efficacy of our design and its superiority over state-of-the-art alternatives. Our KinD is robust against severe visual defects, and user-friendly to arbitrarily adjust light levels. In addition, our model spends less than 50ms to process an image in VGA resolution on a 2080Ti GPU. All the above merits make our KinD attractive for practical use.

Experimental results: good quality, user-friendly interaction, and fast runtime.

 


Introduction

The introduction has three parts:

1. Problem statement, leading to the motivation;

2. Review of previous methods;

3. Contributions of this paper.

Problem Statement & Motivation

Very often, capturing high-quality images in dim light conditions is challenging. Though a few operations, such as setting high ISO, long exposure, and flash, can be applied under the circumstances, they suffer from different drawbacks. For instance, high ISO increases the sensitivity of an image sensor to light, but the noise is also amplified, thus leading to a low signal-to-noise ratio (SNR). Long exposure is limited to shooting static scenes, otherwise it is very likely to produce blurry results. Using flash can somehow brighten the environment, which however frequently introduces unexpected highlights and unbalanced lighting into photos, making them visually unpleasant.

In practice, typical users may not even have the above options with limited photographing tools, e.g. cameras embedded in portable devices. Although low-light image enhancement has been a long-standing problem in the community with great progress made over the past years, developing a practical low-light image enhancer remains challenging, since flexibly lightening the darkness, effectively removing the degradations, and being efficient should all be concerned.

Problem statement:

Three (hardware) techniques can be applied when shooting in low light, but each introduces new problems:

High ISO: increases the image sensor's sensitivity to light, but the noise is amplified as well, resulting in a low SNR.

Long exposure: limited to static scenes; an unsteady camera or moving objects easily produce blurry results.

Flash: can brighten the environment to some extent, but frequently introduces unexpected highlights and unbalanced lighting into the photo, making it visually unpleasant.

Although low-light image enhancement is a long-standing problem in the community and great progress has been made over the past years, developing a practical low-light image enhancer remains challenging: flexibly lightening dark regions, effectively removing degradations, and being efficient all have to be addressed.

 

Figure 1 provides three natural images captured under challenging light conditions.

Concretely, the first case is with extremely low light. Severe noise and color distortion are hidden in the dark. By simply amplifying the intensity of the image, the degradations show up as given on the top-right corner.

The second image is photographed at sunset (weak ambient light), and most objects in it suffer from backlighting. Imaging at noon facing the light source (the sun) can hardly get rid of the same issue the second case exhibits, although the ambient light is stronger and the scene is more visible. Note that the relatively bright regions of the last two photos will be saturated by direct amplification.

Problem statement:

A (software) approach in low light is to directly boost brightness, but it likewise introduces new problems:

1. Simply amplifying the intensity of the image makes the degradations (noise and color distortion) show up, as in the top-right corner;

2. For backlit images, amplifying the intensity over-exposes the relatively bright regions.

With these two paragraphs, the authors establish that low-light image enhancement is a genuinely hard problem worth studying.

 

Deep learning-based methods have revealed their superior performance in numerous low-level vision tasks, such as denoising and super-resolution, most of which need training data with ground truth. For the target problem, say low-light image enhancement, no ground-truth real data exists, although the order of light intensity can be determined. This is because, from the viewpoint of users, the preferred light levels for different people/requirements can be quite diverse. In other words, one cannot say what light condition is the best/ground-truth. Therefore, it is not so felicitous to map an image only to a version with a specific level of light.

Problem statement: issues with deep learning-based methods

Deep learning-based methods have shown superior performance in low-level vision tasks such as denoising and super-resolution, which come with ground-truth training data. For the target problem, low-light image enhancement, the order of light intensity can be determined, but no ground-truth real data exists: from the users' point of view, the preferred light level varies across people and requirements, so no single light condition can be declared the best, and the ground truth depends on the viewer. Hence, a method that maps an image to a single specific brightness level is not appropriate.

 

Based on the above analysis, we summarize challenges in low-light image enhancement as follows:

• How to effectively estimate the illumination component from a single image, and flexibly adjust light levels?

• How to remove the degradations like noise and color distortion previously hidden in the darkness after lightening up dark regions?

• How to train a model without well-defined ground-truth light conditions for low-light image enhancement by only looking at two/several different examples?

In this paper, we propose a deep neural network to take the above concerns into account simultaneously.

Summary of the problems above:

• How to effectively estimate the illumination component from a single image and flexibly adjust light levels?

• How to remove degradations such as noise and color distortion previously hidden in dark regions?

• How to train a model without well-defined ground-truth light conditions for low-light image enhancement, by only looking at two/several different examples?

Solving these problems is the motivation of this paper.

 

Previous Arts

 


Methodology

A desired low-light image enhancer should be capable of effectively removing the degradations hidden in the darkness, and flexibly adjusting light/exposure conditions. We build a deep network, denoted as KinD, to achieve the goal. As schematically illustrated in Figure 2, the network is composed of two branches for handling the reflectance and illumination components, respectively. From the perspective of functionality, it also can be divided into three modules, including layer decomposition, reflectance restoration, and illumination adjustment. In the next subsections, we shall explain the details about the network.

An ideal low-light image enhancement algorithm should effectively remove the degradations hidden in the darkness and flexibly adjust light/exposure conditions.

Overall architecture:

The network consists of two branches that handle the reflectance and illumination components, respectively.

Functionally, it can also be divided into three modules: layer decomposition, reflectance restoration, and illumination adjustment.

 

Consideration & Motivation

  • Layer Decomposition.

The main drawback of plain methods comes from the blindness of illumination. Thus, it is key to obtain the illumination information. If the illumination is well-extracted from the input, the rest hosts the details and possible degradations, on which the restoration (or degradation removal) can be executed. In Retinex theory, an image I can be viewed as a composition of two components, i.e. reflectance R and illumination L, in the fashion of I = R \circ L, where \circ designates the element-wise product. Further, decomposing images in the Retinex manner consequently decouples the space of mapping a degraded low-light image to a desired one into two smaller subspaces, expecting to be better and easier regularized/learned. Moreover, the illumination map is core to flexibly adjusting light/exposure conditions. Based on the above, the Retinex-based layer decomposition is suitable and necessary for the target task.

The paper argues that once the illumination is well extracted from the input, the remainder carries the details and the possible degradations, and restoration (degradation removal) can be performed on it.

In Retinex theory, an image is viewed as the composition of two components, reflectance R and illumination L, as I = R \circ L, where \circ is the element-wise product. Decomposing images in the Retinex manner decouples the space of mapping a degraded low-light image to a desired one into two smaller subspaces, which are expected to be better and more easily regularized/learned.

The decomposed illumination map is also the core of flexibly adjusting brightness/exposure conditions.

This is why low-light enhancement proceeds in two steps, and why Retinex-based layer decomposition is suitable and necessary for the target task. A small numeric sketch of the model follows.
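To make the decomposition concrete, here is a minimal NumPy sketch (my own, not from the released code; all values are made up) of the composition I = R \circ L and of recovering R once L is known:

import numpy as np

# Retinex composition I = R ∘ L (element-wise). R in [0, 1] is shared
# across exposures; L sets the light level.
rng = np.random.default_rng(0)
R = rng.uniform(0.2, 0.9, size=(4, 4))   # hypothetical scene reflectance
L_low, L_high = 0.1, 0.8                 # two global light levels

I_low = R * L_low                        # dark shot of the scene
I_high = R * L_high                      # brighter shot of the same scene

# With a (perfect) illumination estimate, reflectance follows by division.
R_rec = I_low / np.maximum(L_low, 1e-4)  # guard against a zero denominator
assert np.allclose(R_rec, R)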

 

  • Data Usage & Priors.

There is no well-defined ground-truth for light conditions. Furthermore, no/few ground-truth reflectance and illumination maps for real images are available. The layer decomposition problem is in nature under-determined, thus additional priors/regularizers matter.

Suppose that the images are degradation-free; then different shots of a certain scene should share the same reflectance, while the illumination maps, though possibly varying intensely, are of simple and mutually consistent structure. In real situations, the degradations embodied in low-light images are often worse than those in brighter ones, and they will be diverted into the reflectance component.

This inspires us that the reflectance from the image in bright light can perform as the reference (ground-truth) for that from the degraded low-light one to learn restorers.

One may ask why not use synthetic data. Because it is hard to synthesize: the degradations are not in a simple form, and change with respect to different sensors. Please notice that the usage of reflectance (well-defined) totally differs from using images in (relatively) bright light as the reference of low-light ones.

This part explains the data issue.

  • Illumination Guided Reflectance Restoration.

In the decomposed reflectance, the pollution of regions corresponding to darker illumination is heavier than that to brighter one. Mathematically, a degraded low-light image can be naturally modeled as I=R\circ L+E, where E designates the pollution component. By taking simple algebra steps, we have:

I=R\circ L+E=\widetilde{R}\circ L=(R+\widetilde{E})\circ L=R\circ L + \widetilde{E}\circ L,

where \widetilde{R} stands for the polluted reflectance, and \widetilde{E} is the degradation having the illumination decoupled. The relationship E = \widetilde{E} \circ L holds. Taking the additive white Gaussian noise E \sim \mathcal{N} (0, \sigma^2 ) for an example, the distribution of \widetilde{E} becomes much more complex and strongly relates to L, i.e. \sigma^2/L_i for each position i. This is to say, the reflectance restoration cannot be uniformly processed over an entire image, and the illumination map can be a good guider.

One may wonder what if directly removing E from the input I? For one thing, the unbalance issue still remains. By viewing from another point, the intrinsic details will be unequally confounded with the noise. For another thing, different from the reflectance, we no longer have proper references for degradation removal in this manner, since L varies. Analogous analysis serves other types of degradation, like color-distortion.

This part gives the mathematical model of the decomposition with degradation, and points out that reflectance restoration cannot be processed uniformly over the entire image; the illumination map can serve as a good guide (see the illustration below).

Why not directly remove the noise E from the low-light input I? For one thing, the unbalance issue remains and the intrinsic details get unequally confounded with the noise. For another, there is no longer a proper reference image for degradation removal in this manner, since L varies.
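A quick numerical illustration (my own, under the additive Gaussian assumption above) of why restoration cannot be uniform: dividing the same noise E by different illumination levels shows the decoupled degradation \widetilde{E} = E / L growing as the light gets dimmer.

import numpy as np

# E ~ N(0, sigma^2); after decoupling the illumination, E~ = E / L, so the
# noise diverted into the reflectance is strongest where the scene was darkest.
rng = np.random.default_rng(0)
sigma = 0.02
E = rng.normal(0.0, sigma, size=100_000)

for L in (0.8, 0.4, 0.1):                # bright -> dark regions
    E_tilde = E / L                      # degradation left in the reflectance
    print(f"L={L:.1f}  std={E_tilde.std():.4f}  (sigma/L = {sigma / L:.4f})")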

 

  • Arbitrary Illumination Manipulation.

The favorite illumination strengths of different persons/applications may be pretty diverse. Therefore, a practical system needs to provide an interface for arbitrary illumination manipulation. In the literature, three main ways for enhancing light conditions are fusion, light level appointment, and gamma correction. The fusion-based methods, due to the fixed fusion mode, lack the functionality of light adjustment. If adopting the second option, the training dataset has to contain images with target levels, limiting its flexibility. For gamma correction, although it can achieve the goal by setting different γ values, it may be unable to reflect the relationship between different light (exposure) levels. This paper advocates learning a flexible mapping function from real data, which allows users to appoint arbitrary levels of light/exposure.

In short: of the three mainstream options, fusion-based methods have a fixed fusion mode and thus no light-adjustment control; light-level appointment requires the training set to contain images at every target level, limiting flexibility; gamma correction can reach a target brightness by setting different γ values, but may fail to reflect the real relationship between exposure levels. This paper instead learns a flexible mapping function from real data, letting users appoint arbitrary light/exposure levels.

 

KinD Network

Inspired by the consideration and motivation, we build a deep neural network, denoted as KinD, for kindling the darkness. Below, we describe the three subnets in detail from the functional perspective.

The three subnets are introduced in turn: the layer decomposition net, the reflectance restoration net, and the illumination adjustment net.

  • Layer Decomposition Net.

This part first defines the loss function, then describes the network architecture.

The loss function:

Recovering two components from one image is a highly ill-posed problem. Having no ground-truth information guided, a loss with well-designed constraints is important. Fortunately, we have paired images with different light/exposure configurations [I_l , I_h]. Recall that the reflectance of a certain scene should be shared across different images, we regularize the decomposed reflectance pair [R_l , R_h] to be close (ideally the same if degradation-free). Furthermore, the illumination maps [L_l , L_h] should be piece-wise smooth and mutually consistent.

The following terms are adopted. We simply use \mathcal{L}^{LD}_{rs} := \left\| R_l-R_h \right \|_1 to regularize the reflectance similarity, where \| \cdot \|_1 means the \ell^1 norm.

The illumination smoothness is constrained by \mathcal{L}^{LD}_{is} := \left\| \dfrac{\nabla L_l}{\max(|\nabla I_l|, \epsilon )} \right \|_1 + \left\| \dfrac{\nabla L_h}{\max(|\nabla I_h|, \epsilon )} \right \|_1, where \nabla stands for the first-order derivative operator containing \nabla x (horizontal) and \nabla y (vertical) directions. In addition, \epsilon is a small positive constant (0.01 in this work) for avoiding a zero denominator, and |\cdot | means the absolute value operator. This smoothness term measures the relative structure of the illumination with respect to the input. For a location on an edge in I, the penalty on L is small; while for a location in a flat region of I, the penalty turns out to be large.

As for the mutual consistency, we employ \mathcal{L}^{LD}_{mc}:= \| \mathcal{M}\circ \exp(-c\cdot \mathcal{M})\|_1 with \mathcal{M} := |\nabla L_l | + |\nabla L_h |. Figure 4 depicts the function behavior of u \cdot \exp(-c \cdot u), where c is the parameter controlling the shape of function. As can be seen from Figure 4, the penalty first goes up but then drops towards 0 as u increases. This characteristic well fits the mutual consistency, i.e. strong mutual edges should be preserved while weak ones depressed. We notice that setting c = 0 leads to a simple \ell^1 loss on \mathcal{M}.

Besides, the decomposed two layers should reproduce the input, which is constrained by the reconstruction error, say \mathcal{L}^{LD}_{rec} := \| I_l-R_l \circ L_l \|_1 + \| I_h-R_h \circ L_h \|_1.

As a result, the loss function of layer decomposition net is as follows:

\mathcal{L}^{LD}:=\mathcal{L}^{LD}_{rec}+0.01\mathcal{L}^{LD}_{rs}+0.15\mathcal{L}^{LD}_{is}+0.2\mathcal{L}^{LD}_{mc}.

Figure 4: The behavior of function v=u \cdot \exp(-c \cdot u). The parameter c controls the shape of function.

The loss comprises four terms, each defined from a prior (a code sketch of all four terms follows this list):

1. Reflectance similarity: as analyzed earlier, a bright image and a low-light image of the same scene share approximately the same reflectance (they differ only in illumination), so the loss is defined as \mathcal{L}^{LD}_{rs} := \left\| R_l-R_h \right \|_1.

2. Illumination smoothness: as analyzed earlier, the illumination map can be guided by the input image: across strong edges of the input the illumination may change sharply, while in weak-edge regions it can be assumed smooth. Hence the loss \mathcal{L}^{LD}_{is} := \left\| \dfrac{\nabla L_l}{\max(|\nabla I_l|, \epsilon )} \right \|_1 + \left\| \dfrac{\nabla L_h}{\max(|\nabla I_h|, \epsilon )} \right \|_1. Note that when \nabla I is large (an edge), the loss value is small and the constraint on \nabla L is light; when \nabla I is small (a flat region), the loss value grows, so \nabla L must be small to keep the loss down. In this way the illumination map L inherits the structure of the input I.

3. Mutual consistency: this loss says the structures of [L_l , L_h] should agree. Why define it as \mathcal{L}^{LD}_{mc}:= \| \mathcal{M}\circ \exp(-c\cdot \mathcal{M})\|_1? Figure 4 gives a 1D example: when u is close to 0 or fairly large, the loss value is small. In 2D, when \mathcal{M} := |\nabla L_l | + |\nabla L_h | is near 0 or large (the gradients of [L_l , L_h] are both small or both large), the constraint is weak; when u lies in between (one gradient small and the other large), the loss value is large, forcing one of [L_l , L_h] toward the other.

4. Reconstruction error: the images recomposed from [R_l , R_h] and [L_l , L_h] should match [I_l , I_h] respectively, i.e. \mathcal{L}^{LD}_{rec} := \| I_l-R_l \circ L_l \|_1 + \| I_h-R_h \circ L_h \|_1.

Finally, the loss of the layer decomposition net is the weighted sum of the four terms.
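A hedged TensorFlow sketch of the four terms. I assume NHWC tensors throughout; the shape parameter c of the mutual-consistency term is my guess since the excerpt does not state its value, and the released code may compute the input gradient on a grayscale version of I rather than per channel as here.

import tensorflow as tf

def grad(t):
    # Finite-difference stand-in for the first-order derivative operator.
    dy, dx = tf.image.image_gradients(t)
    return tf.abs(dy) + tf.abs(dx)

def decomposition_loss(I_l, I_h, R_l, R_h, L_l, L_h, c=10.0, eps=0.01):
    l_rec = (tf.reduce_mean(tf.abs(I_l - R_l * L_l)) +
             tf.reduce_mean(tf.abs(I_h - R_h * L_h)))   # reconstruction
    l_rs = tf.reduce_mean(tf.abs(R_l - R_h))            # reflectance similarity
    l_is = (tf.reduce_mean(grad(L_l) / tf.maximum(grad(I_l), eps)) +
            tf.reduce_mean(grad(L_h) / tf.maximum(grad(I_h), eps)))  # smoothness
    M = grad(L_l) + grad(L_h)
    l_mc = tf.reduce_mean(M * tf.exp(-c * M))           # mutual consistency
    return l_rec + 0.01 * l_rs + 0.15 * l_is + 0.2 * l_mc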

Next, the layer decomposition network architecture:

The layer decomposition network contains two branches corresponding to the reflectance and illumination, respectively. The reflectance branch adopts a typical 5-layer U-Net [25], followed by a convolutional (conv) layer and a Sigmoid layer. While the illumination branch is composed of two conv+ReLU layers and a conv layer on concatenated feature maps from the reflectance branch (for possibly excluding textures from the illumination), finally followed by a Sigmoid layer.

The layer decomposition network has two branches (a rough sketch follows this list):

1. Reflectance branch: a 5-layer U-Net + a conv layer + Sigmoid.

2. Illumination branch: two (conv+ReLU) layers + a conv layer over feature maps concatenated from the reflectance branch (to exclude textures from the illumination) + Sigmoid.
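A rough Keras sketch of this two-branch layout. The layer widths and the abbreviated 2-level U-Net are placeholders of mine; the exact configuration is in the released code.

import tensorflow as tf
from tensorflow.keras import layers

def decomposition_net(x_in):
    # Reflectance branch: 2-level U-Net stand-in + conv + Sigmoid.
    x1 = layers.Conv2D(32, 3, padding="same", activation="relu")(x_in)
    x2 = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x1)
    x3 = layers.Conv2DTranspose(32, 3, strides=2, padding="same",
                                activation="relu")(x2)
    x3 = layers.Concatenate()([x3, x1])                  # skip connection
    R = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(x3)

    # Illumination branch: 2x(conv+ReLU), then a conv over its features
    # concatenated with reflectance features, then Sigmoid.
    y = layers.Conv2D(32, 3, padding="same", activation="relu")(x_in)
    y = layers.Conv2D(32, 3, padding="same", activation="relu")(y)
    y = layers.Concatenate()([y, x3])                    # keep textures out of L
    L = layers.Conv2D(1, 3, padding="same", activation="sigmoid")(y)
    return R, L

x_in = tf.keras.Input(shape=(48, 48, 3))                 # the 48x48 training patch
model = tf.keras.Model(x_in, decomposition_net(x_in))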

 

  • Reflectance Restoration Net.

The reflectance maps from low-light images, as shown in Figures 3 and 5, are more interfered by degradations than those from bright-light ones. Employing the clearer reflectance to act as the reference (informal ground-truth) for the messy one is our principle.

For seeking a restoration function, the objective turns to be simple as follows:

\mathcal{L}^{RR}:=\|\hat{R}-R_h\|^2_2-SSIM(\hat{R},R_h)+\|\nabla \hat{R}-\nabla R_h\|^2_2,

where SSIM(\cdot, \cdot) is the structural similarity measurement, \hat{R} corresponds to the restored reflectance, and \| \cdot \|_2 means the \ell^2 norm (MSE). The third term concentrates on the closeness in terms of textures.

This subnet is similar to the reflectance branch in the layer decomposition subnet, but deeper. The schematic configuration is given in Figure 2.

We recall that the degradation distributes in the reflectance complexly, which strongly depends on the illumination distribution. Thus, we bring the illumination information into the restoration net together with the degraded reflectance.

The effectiveness of this operation can be observed in Figure 5. In the two reflectance maps with different degradation (light) levels, the results by BM3D can fairly remove noise (without regarding the color distortion in nature), but the blur effect exists almost everywhere. In our results, the textures (the dust/water-based stains for example) of the window region, which is originally bright and barely polluted, keep clear and sharp, while the degradations in the dark region get largely removed with details (e.g. the characters on the bottles) very well maintained. Besides, the color distortion is also cured by our method.

Figure 3: Left column: lower-light input and its decomposed illumination and (degraded) reflectance maps. Right column: brighter input and its corresponding maps. The three rows respectively correspond to inputs, illumination maps, and reflectance maps. These are testing images.

Figure 5: The polluted reflectance maps (top), and their results by BM3D (middle) and our reflectance restoration net (bottom). The right column corresponds to a heavier degradation (lower light) level than the left. These are testing images.

Reflectance restoration net: this is one long paragraph in the paper; broken into five parts here:

1. Principle: use the clearer reflectance as the reference (informal ground truth) for the messy one.

2. Loss function: \mathcal{L}^{RR}:=\|\hat{R}-R_h\|^2_2-SSIM(\hat{R},R_h)+\|\nabla \hat{R}-\nabla R_h\|^2_2 (see the sketch after this list).

3. Architecture: a U-Net similar to the reflectance branch of the decomposition subnet, but deeper.

4. Note the explanation of why the illumination maps [L_l , L_h] are also fed into the restoration net (visible as a connection in Figure 2): as discussed earlier, noise and color distortion appear mainly in weakly lit regions, i.e. the degradation distribution depends on the illumination distribution. Hence the illumination information is brought into the restoration net together with the degraded reflectance.

5. Finally, Figure 5 illustrates the effect: the classical BM3D blurs the image, whereas the proposed method keeps the image clear and sharp.
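A minimal TensorFlow version of this objective, assuming reflectance maps live in [0, 1] (hence max_val=1.0 for SSIM); this is a sketch of the formula above, not the released training code.

import tensorflow as tf

def restoration_loss(R_hat, R_h):
    # L^RR = ||R_hat - R_h||_2^2 - SSIM(R_hat, R_h) + gradient-closeness term,
    # on NHWC reflectance maps in [0, 1].
    mse = tf.reduce_mean(tf.square(R_hat - R_h))
    ssim = tf.reduce_mean(tf.image.ssim(R_hat, R_h, max_val=1.0))
    dy1, dx1 = tf.image.image_gradients(R_hat)
    dy2, dx2 = tf.image.image_gradients(R_h)
    grad_term = tf.reduce_mean(tf.square(dy1 - dy2) + tf.square(dx1 - dx2))
    return mse - ssim + grad_term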

 

  • Illumination Adjustment Net.

There does not exist a ground-truth light level for images. Therefore, for fulfilling diverse requirements, we need a mechanism to flexibly convert one light condition to another. We have paired illumination maps. Even without knowing the exact relationship between the paired illuminations, we can roughly calculate their ratio of strength, i.e. \alpha by mean(L_t /L_s ), where the division is element-wise. This ratio can be used as an indicator to train an adjustment function from a source light L_s to a target one L_t. If adjusting a lower level of light to a higher one, \alpha > 1, otherwise \alpha \leq 1. In the testing phase, \alpha can be specified by users.

The network is lightweight, containing 3 conv layers (two conv+ReLU, and one conv) and 1 Sigmoid layer. Note that the indicator \alpha is expanded to a feature map, acting as part of the input to the net.

The following is the loss for illumination adjustment net:

\mathcal{L}^{IA}:=\|\hat{L}-L_t\|^2_2 + \| |\nabla \hat{L}|-|\nabla L_t| \|^2_2,

where L_t can be L_h or L_l , and \hat{L} is the adjusted illumination map from the source light (L_h or L_l) towards the target one.

Figure 6 shows the difference between our learned adjustment function and gamma correction. For comparison fairness, we tune the parameter \gamma for gamma correction to reach a similar overall light strength with ours via \gamma = \| log(\hat{L}) \|_1/\| log(L_s ) \|_1. We consider two adjustments without loss of generality, including one light-down and one light-up. Figure 6 (a) depicts the source illumination, (b) and (d) are the adjusted results by gamma correction, while (c) and (e) are ours. To more clearly show the difference, we plot the 1D intensity curves at x = 100, 200, 400. As for the light-down case, our learned manner decreases intensity more than gamma correction on relatively bright regions, while less or about the same on dark regions. Regarding the light-up case, the opposite trend appears. In other words, our method increases the light less on relatively dark regions, while more or about the same on bright regions. The learned manner is more consistent with actual situations.

Furthermore, the \alpha fashion is more convenient than the \gamma way for users to manipulate. For instance, setting \alpha to 2 means turning the light 2X up.

Figure 6: (a) source illumination; (b)(d) results adjusted by gamma correction; (c)(e) ours; (f)-(h) 1D intensity curves at x = 100, 200, 400 for the light-down case; (i)-(k) the same columns for the light-up case.

 

Illumination adjustment net: also a single long paragraph; broken into five parts here (a sketch of the subnet follows this list):

1. The ratio \alpha: the two training images are only relatively dark/bright, so which one should the output target? To brighten a low-light image, the brighter image is the target; conversely, the darker one is. The user controls this through the parameter \alpha = mean(L_t /L_s ), where L_t denotes the target illumination and L_s the source (for example, to brighten a low-light image, L_t=L_h, ~L_s=L_l).

2. Architecture: two (conv+ReLU) + one conv + Sigmoid. Note that \alpha is expanded into a feature map that forms part of the network input.

3. Loss: \mathcal{L}^{IA}:=\|\hat{L}-L_t\|^2_2 + \| |\nabla \hat{L}|-|\nabla L_t| \|^2_2, i.e. the output \hat{L} should resemble the target illumination map, and so should its edges.

4. Comparison with \gamma correction: Figure 6 compares the learned adjustment with \gamma correction, for both the light-down (darker target) and light-up (brighter target) cases. For clarity, panels (f)-(k) plot the intensity curves of the pixel columns x = 100, 200, 400 of each map.

From (f)-(h): in the light-down case, the learned adjustment decreases intensity more than \gamma correction on relatively bright regions, and less or about the same on dark regions.

From (i)-(k): in the light-up case, the learned method increases the light less than \gamma correction on relatively dark regions, and more or about the same on bright regions.

In short, for light adjustment, KinD yields a higher brightness contrast than \gamma correction.

5. Finally, the authors point out that light adjustment is realized by tuning \alpha, which participates in training as an expanded feature map forming part of the network input. For example, with L_t=L_h, ~L_s=L_l, setting \alpha = 2 means the brightness is turned up 2X.
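A hedged Keras sketch of the adjustment subnet and the \alpha mechanism. Only the 2x(conv+ReLU) + conv + Sigmoid layout and the expansion of \alpha into a feature map follow the paper; the channel widths are my guesses.

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

def adjustment_net():
    L_s = tf.keras.Input(shape=(None, None, 1))   # source illumination map
    alpha = tf.keras.Input(shape=(1,))            # strength ratio, one per image
    # Expand the scalar alpha into a constant feature map stacked onto L_s.
    a_map = layers.Lambda(
        lambda t: tf.ones_like(t[0]) * tf.reshape(t[1], [-1, 1, 1, 1])
    )([L_s, alpha])
    x = layers.Concatenate()([L_s, a_map])
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(x)
    L_hat = layers.Conv2D(1, 3, padding="same", activation="sigmoid")(x)
    return tf.keras.Model([L_s, alpha], L_hat)

def strength_ratio(L_t, L_s, eps=1e-4):
    # Training-time alpha = mean(L_t / L_s); at test time the user sets it,
    # e.g. alpha = 2.0 to turn the light 2X up.
    return float(np.mean(L_t / np.maximum(L_s, eps)))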

 

 


Experimental Validation

Implementation Details

We use the LOL dataset as the training dataset, which includes 500 low/normal-light image pairs. In training, we employ only 450 image pairs, and no synthetic images are used.

For the layer decomposition net, batch size is set to be 10 and patch-size to be 48x48.

While for the reflectance restoration net and illumination adjustment net, batch size is set to be 4 and patch-size to be 384x384.

We use the stochastic gradient descent (SGD) technique for optimization. The entire network is trained on an Nvidia RTX 2080Ti GPU and an Intel Core i7-8700 3.20GHz CPU using the Tensorflow framework.

 

Performance Evaluation

We evaluate our method on widely-adopted datasets, including LOL [30], LIME [16], NPE [28], and MEF [7]. Four metrics are adopted for quantitative comparison, which are PSNR, SSIM, LOE [28], and NIQE [23]. A higher value in terms of PSNR and SSIM indicates better quality, while, in LOE and NIQE, the lower the better. The state-of-the-art methods of BIMEF [33], SRIE [12], CRM [34], Dong [11], LIME [16], MF [14], RRM [21], Retinex-Net [30], GLAD [29], MSR [18] and NPE [28] are involved as the competitors.

Figure 7: Visual comparison with state-of-the-art low-light image enhancement methods.

Figure 8: Visual comparison with state-of-the-art low-light image enhancement methods.

Table 2: Quantitative comparison on LIME, NPE, and MEF datasets in terms of NIQE. The best results are highlighted in bold.

 

Datasets:

LOL [30] : Deep Retinex Decomposition for Low-Light Enhancement. 2018 British Machine Vision Conference. 

LIME [16] : LIME: Low-light Image Enhancement via Illumination Map Estimation. IEEE TIP (2017).

NPE [28] : Naturalness preserved enhancement algorithm for non-uniform illumination images. IEEE TIP (2013).

MEF [7] : Power-constrained contrast enhancement for emissive displays based on histogram equalization. IEEE TIP (2012).

Metrics:

LOE [28] : Naturalness preserved enhancement algorithm for non-uniform illumination images. IEEE TIP (2013).

NIQE [23] : Making a completely blind image quality analyzer. IEEE Signal Processing Letters (2013).

Compared methods:

BIMEF [33] : A Bio-Inspired Multi-Exposure Fusion Framework for Low-light Image Enhancement. arXiv (2017). (code)

SRIE [12] :A Weighted Variational Model for Simultaneous Reflectance and Illumination Estimation. 2016 CVPR. (code)

CRM [34] : A New Low-Light Image Enhancement Algorithm Using Camera Response Model. 2018 ICCVW.  (code)

Dong [11] : Fast efficient algorithm for enhancement of low lighting video. 2011 ICME. (code)

LIME [16] : LIME: Low-light Image Enhancement via Illumination Map Estimation. IEEE TIP (2017). (code)

MF [14] : A fusion-based enhancing method for weakly illuminated images. Signal Processing (2016). (code)

RRM [21] : Structure-Revealing Low-Light Image Enhancement Via Robust Retinex Model. IEEE TIP (2018). (code)

Retinex-Net [30] : Deep Retinex Decomposition for Low-Light Enhancement. 2018 British Machine Vision Conference. (code)

GLAD [29] : GLADNet: Low-Light Enhancement Network with Global Awareness. 2018 In IEEE International Conference on Automatic Face & Gesture Recognition. (code)

MSR [18] : A multiscale retinex for bridging the gap between color images and the human observation of scenes. IEEE TIP (2012). (code)

NPE [28] : Naturalness preserved enhancement algorithm for non-uniform illumination images. IEEE TIP (2013). (code)

——————————————————————————

Supplementary methods:

DeepUPE : Underexposed photo enhancement using deep illumination estimation. 2019 CVPR. (code)

Zero-reference deep curve estimation for low-light image enhancement. arXiv (2020). (code) (project)

——————————————————————————


KinD++

Network Architecture Comparison

Overall, KinD++ is basically the same as KinD. Note that the ratio \alpha is drawn in this diagram (the authors omitted this input in Figure 2 earlier).

KinD++ differs from KinD in the reflectance restoration network, shown in the figure below.

Panels (a) and (b) above compare KinD++ with KinD.

In KinD++, the reflectance restoration network no longer uses a U-Net: the spatial resolution of the image stays unchanged throughout the network, and a multi-scale illumination attention (MSIA) module is introduced, as shown in panel (c).

 

NIQE code

The non-reference metric NIQE is adopted for quantitative comparison. The original code for computing NIQE is here. To improve robustness, we follow the author's code and retrain the model parameters, extending the training set with 100 high-resolution natural images from the PIRM dataset. Put the original 125 images and the additional 100 images (dir: PIRM_dataset\Validation\Original) into one folder 'data', then run

% Re-estimate the pristine MVG model parameters on the enlarged image set:
% 96x96 blocks, zero block overlap, sharpness threshold 0.75.
[mu_prisparam, cov_prisparam] = estimatemodelparam('data', 96, 96, 0, 0, 0.75);

After retraining, the file 'modelparameters_new.mat' is generated. We use this model to evaluate all results.

 
