TensorLayer Official Chinese Documentation 1.7.4: API - Data Pre-processing


API - Data Pre-processing

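We provide a large number of data augmentation and processing methods, built on Numpy, Scipy, Threading and Queue.
However, we recommend using the operators that TensorFlow provides directly, such as tf.image.central_crop. For more on using TensorFlow, see
tutorial_cifar10_tfrecord.py.
Part of the code in this package comes from Keras.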

threading_data([data, fn, thread_count]) : Return a batch of results for the given data.
rotation(x[, rg, is_random, row_index, ...]) : Rotate an image randomly or non-randomly.
rotation_multi(x[, rg, is_random, ...]) : Rotate multiple images with the same arguments, randomly or non-randomly.
crop(x, wrg, hrg[, is_random, row_index, ...]) : Randomly or centrally crop an image.
crop_multi(x, wrg, hrg[, is_random, ...]) : Randomly or centrally crop multiple images.
flip_axis(x[, axis, is_random]) : Flip the axis of an image, such as flip left and right, up and down, randomly or non-randomly.
flip_axis_multi(x, axis[, is_random]) : Flip the axes of multiple images together, such as flip left and right, up and down, randomly or non-randomly.
shift(x[, wrg, hrg, is_random, row_index, ...]) : Shift an image randomly or non-randomly.
shift_multi(x[, wrg, hrg, is_random, ...]) : Shift images with the same arguments, randomly or non-randomly.
shear(x[, intensity, is_random, row_index, ...]) : Shear an image randomly or non-randomly.
shear_multi(x[, intensity, is_random, ...]) : Shear images with the same arguments, randomly or non-randomly.
shear2(x[, shear, is_random, row_index, ...]) : Shear an image randomly or non-randomly.
shear_multi2(x[, shear, is_random, ...]) : Shear images with the same arguments, randomly or non-randomly.
swirl(x[, center, strength, radius, ...]) : Swirl an image randomly or non-randomly, see scikit-image swirl API and example.
swirl_multi(x[, center, strength, radius, ...]) : Swirl multiple images with the same arguments, randomly or non-randomly.
elastic_transform(x, alpha, sigma[, mode, ...]) : Elastic deformation of images as described in [Simard2003].
elastic_transform_multi(x, alpha, sigma[, ...]) : Elastic deformation of images as described in [Simard2003].
zoom(x[, zoom_range, is_random, row_index, ...]) : Zoom in and out of a single image, randomly or non-randomly.
zoom_multi(x[, zoom_range, is_random, ...]) : Zoom in and out of images with the same arguments, randomly or non-randomly.
brightness(x[, gamma, gain, is_random]) : Change the brightness of a single image, randomly or non-randomly.
brightness_multi(x[, gamma, gain, is_random]) : Change the brightness of multiple images, randomly or non-randomly.
illumination(x[, gamma, contrast, ...]) : Perform illumination augmentation for a single image, randomly or non-randomly.
rgb_to_hsv(rgb) : Input RGB image [0~255], return HSV image [0~1].
hsv_to_rgb(hsv) : Input HSV image [0~1], return RGB image [0~255].
adjust_hue(im[, hout, is_offset, is_clip, ...]) : Adjust hue of an RGB image.
imresize(x[, size, interp, mode]) : Resize an image by given output size and method.
pixel_value_scale(im[, val, clip, is_random]) : Scales each value in the pixels of the image.
samplewise_norm(x[, rescale, ...]) : Normalize an image by rescale, samplewise centering and samplewise std normalization, in order.
featurewise_norm(x[, mean, std, epsilon]) : Normalize every pixel by the same given mean and std, which are usually computed from all examples.
channel_shift(x, intensity[, is_random, ...]) : Shift the channels of an image, randomly or non-randomly, see numpy.rollaxis.
channel_shift_multi(x, intensity[, ...]) : Shift the channels of images with the same arguments, randomly or non-randomly, see numpy.rollaxis.
drop(x[, keep]) : Randomly set some pixels to zero by a given keeping probability.
transform_matrix_offset_center(matrix, x, y) : Return transform matrix offset center.
apply_transform(x, transform_matrix[, ...]) : Return transformed images by given transform_matrix from transform_matrix_offset_center.
projective_transform_by_points(x, src, dst) : Projective transform by given coordinates, usually 4 coordinates.
array_to_img(x[, dim_ordering, scale]) : Converts a numpy array to PIL image object (uint8 format).
find_contours(x[, level, fully_connected, ...]) : Find iso-valued contours in a 2D array for a given level value, returns list of (n, 2)-ndarrays; see skimage.measure.find_contours.
pt2map([list_points, size, val]) : Inputs a list of points, return a 2D image.
binary_dilation(x[, radius]) : Return fast binary morphological dilation of an image.
dilation(x[, radius]) : Return greyscale morphological dilation of an image, see skimage.morphology.dilation.
binary_erosion(x[, radius]) : Return binary morphological erosion of an image, see skimage.morphology.binary_erosion.
erosion(x[, radius]) : Return greyscale morphological erosion of an image, see skimage.morphology.erosion.
obj_box_coord_rescale([coord, shape]) : Scale down one coordinates from pixel unit to the ratio of image size i.e.
obj_box_coords_rescale([coords, shape]) : Scale down a list of coordinates from pixel unit to the ratio of image size i.e.
obj_box_coord_scale_to_pixelunit(coord[, shape]) : Convert one coordinate [x, y, w (or x2), h (or y2)] in ratio format to image coordinate format.
obj_box_coord_centroid_to_upleft_butright(coord) : Convert one coordinate [x_center, y_center, w, h] to [x1, y1, x2, y2] in up-left and bottom-right format.
obj_box_coord_upleft_butright_to_centroid(coord) : Convert one coordinate [x1, y1, x2, y2] to [x_center, y_center, w, h].
obj_box_coord_centroid_to_upleft(coord) : Convert one coordinate [x_center, y_center, w, h] to [x, y, w, h].
obj_box_coord_upleft_to_centroid(coord) : Convert one coordinate [x, y, w, h] to [x_center, y_center, w, h].
parse_darknet_ann_str_to_list(annotation) : Input string format of class, x, y, w, h, return list of list format.
parse_darknet_ann_list_to_cls_box(annotation) : Input list of [[class, x, y, w, h], ...], return two lists of [class ...] and [[x, y, w, h], ...].
obj_box_left_right_flip(im[, coords, ...]) : Left-right flip the image and coordinates for object detection.
obj_box_imresize(im[, coords, size, interp, ...]) : Resize an image, and compute the new bounding box coordinates.
obj_box_crop(im[, classes, coords, wrg, ...]) : Randomly or centrally crop an image, and compute the new bounding box coordinates.
obj_box_shift(im[, classes, coords, wrg, ...]) : Shift an image randomly or non-randomly, and compute the new bounding box coordinates.
obj_box_zoom(im[, classes, coords, ...]) : Zoom in and out of a single image, randomly or non-randomly, and compute the new bounding box coordinates.
pad_sequences(sequences[, maxlen, dtype, ...]) : Pads each sequence to the same length: the length of the longest sequence.
remove_pad_sequences(sequences[, pad_id]) : Remove padding.
process_sequences(sequences[, end_id, ...]) : Set all tokens (ids) after the END token to the padding value, and then optionally shorten it to the maximum sequence length in this batch.
sequences_add_start_id(sequences[, ...]) : Add special start token (id) at the beginning of each sequence.
sequences_add_end_id(sequences[, end_id]) : Add special end token (id) at the end of each sequence.
sequences_add_end_id_after_pad(sequences[, ...]) : Add special end token (id) at the end of each sequence.
sequences_get_mask(sequences[, pad_val]) : Return mask for sequences.

Parallel Threading

tensorlayer.prepro.threading_data(data=None, fn=None, thread_count=None, **kwargs)[source]

Return a batch of results for the given data.
Usually used for data augmentation.

Parameters:

data : numpy array, list of file names, etc.; see Examples below.

thread_count : the number of threads to use.

fn : the function for data processing.

more args : the arguments for fn; see Examples below.

Examples

  • Single array
>>> # X: [batch_size, row, col, 1], greyscale
>>> results = threading_data(X, zoom, zoom_range=[0.5, 1], is_random=True)
>>> # results: [batch_size, row, col, channel]
>>> tl.visualize.images2d(images=np.asarray(results), second=0.01, saveable=True, name='after', dtype=None)
>>> tl.visualize.images2d(images=np.asarray(X), second=0.01, saveable=True, name='before', dtype=None)
  • List of arrays (e.g. functions with multi)
>>> # X, Y: [batch_size, row, col, 1], greyscale
>>> data = threading_data([_ for _ in zip(X, Y)], zoom_multi, zoom_range=[0.5, 1], is_random=True)
>>> # data: [batch_size, 2, row, col, 1]
>>> X_, Y_ = data.transpose((1,0,2,3,4))
>>> # X_, Y_: [batch_size, row, col, 1]
>>> tl.visualize.images2d(images=np.asarray(X_), second=0.01, saveable=True, name='after', dtype=None)
>>> tl.visualize.images2d(images=np.asarray(Y_), second=0.01, saveable=True, name='before', dtype=None)
  • Single array split across thread_count threads (e.g. functions with multi)
>>> # X, Y: [batch_size, row, col, 1], greyscale
>>> data = threading_data(X, zoom_multi, 8, zoom_range=[0.5, 1], is_random=True)
>>> # data: [batch_size, 2, row, col, 1]
>>> X_, Y_ = data.transpose((1,0,2,3,4))
>>> # X_, Y_: [batch_size, row, col, 1]
>>> tl.visualize.images2d(images=np.asarray(X_), second=0.01, saveable=True, name='after', dtype=None)
>>> tl.visualize.images2d(images=np.asarray(Y_), second=0.01, saveable=True, name='before', dtype=None)
  • Customized function for image segmentation
>>> def distort_img(data):
...     x, y = data
...     x, y = flip_axis_multi([x, y], axis=0, is_random=True)
...     x, y = flip_axis_multi([x, y], axis=1, is_random=True)
...     x, y = crop_multi([x, y], 100, 100, is_random=True)
...     return x, y
>>> # X, Y: [batch_size, row, col, channel]
>>> data = threading_data([_ for _ in zip(X, Y)], distort_img)
>>> X_, Y_ = data.transpose((1,0,2,3,4))
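
The pattern behind threading_data can be sketched with the standard library alone. The helper name threaded_map below is hypothetical and not part of TensorLayer, and it assumes one thread per item (the real function also accepts thread_count). It only illustrates how the extra keyword arguments are forwarded to fn and how the results keep the order of the input.

```python
import threading


def threaded_map(data, fn, **kwargs):
    """Apply fn(item, **kwargs) to every item, one thread per item (assumption)."""
    results = [None] * len(data)

    def worker(index, item):
        # Each thread writes to its own slot, so the input order is preserved.
        results[index] = fn(item, **kwargs)

    threads = [
        threading.Thread(target=worker, args=(i, item))
        for i, item in enumerate(data)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def scale(x, factor=1):
    """Toy stand-in for an augmentation function such as zoom."""
    return [v * factor for v in x]


batch = [[1, 2], [3, 4], [5, 6]]
out = threaded_map(batch, scale, factor=10)
print(out)
```

Writing into a pre-sized list keeps the example free of locks, since no two threads touch the same slot.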

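Images

  • These functions process a single image only. Use threading_data for multi-threaded processing; see tutorial_image_preprocess.py.
  • Every function has an is_random argument.
  • Functions whose names end in multi are usually used for image segmentation, because the input and output images must stay matched.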

Rotation

tensorlayer.prepro.rotation(x, rg=20, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Rotate an image randomly or non-randomly.

Parameters:

x : numpy array

An image with dimension of [row, col, channel] (default).

rg : int or float

Degree to rotate, usually 0 ~ 180.

is_random : boolean, default False

If True, randomly rotate.

row_index, col_index, channel_index : int

Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).

fill_mode : string

Method to fill missing pixel, default ‘nearest’, more options ‘constant’, ‘reflect’ or ‘wrap’

cval : scalar, optional

Value used for points outside the boundaries of the input if mode='constant'. Default is 0.0

order : int, optional

The order of interpolation. The order has to be in the range 0-5. See apply_transform.

Examples

>>> # x: [row, col, 1], greyscale
>>> x = rotation(x, rg=40, is_random=False)
>>> tl.visualize.frame(x[:,:,0], second=0.01, saveable=True, name='temp',cmap='gray')

tensorlayer.prepro.rotation_multi(x, rg=20, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Rotate multiple images with the same arguments, randomly or non-randomly.
Usually used for image segmentation, where x=[X, Y] and X and Y must stay matched.

Parameters:

x : list of numpy array

List of images with dimension of [n_images, row, col, channel] (default).

others : see rotation.

Examples

>>> # x, y: [row, col, 1], greyscale
>>> x, y = rotation_multi([x, y], rg=90, is_random=False)
>>> tl.visualize.frame(x[:,:,0], second=0.01, saveable=True, name='x',cmap='gray')
>>> tl.visualize.frame(y[:,:,0], second=0.01, saveable=True, name='y',cmap='gray')

Crop

tensorlayer.prepro.crop(x, wrg, hrg, is_random=False, row_index=0, col_index=1, channel_index=2)[source]

Randomly or centrally crop an image.

Parameters:

x : numpy array

An image with dimension of [row, col, channel] (default).

wrg : int

Size of width.

hrg : int

Size of height.

is_random : boolean, default False

If True, randomly crop, else central crop.

row_index, col_index, channel_index : int

Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).

tensorlayer.prepro.crop_multi(x, wrg, hrg, is_random=False, row_index=0, col_index=1, channel_index=2)[source]

Randomly or centrally crop multiple images.

Parameters:

x : list of numpy array

List of images with dimension of [n_images, row, col, channel] (default).

others : see crop.
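
The difference between the central and the random crop can be sketched in NumPy. The function crop_sketch and its rng argument are illustrative assumptions, not the library code. It assumes the default [row, col, channel] layout, with hrg rows and wrg columns in the output.

```python
import numpy as np


def crop_sketch(x, wrg, hrg, is_random=False, rng=None):
    """Crop x ([row, col, channel]) to hrg rows by wrg columns."""
    h, w = x.shape[0], x.shape[1]
    if hrg > h or wrg > w:
        raise ValueError("crop size must not exceed the image size")
    if is_random:
        if rng is None:
            rng = np.random.default_rng()
        # Any top-left corner that keeps the window inside the image.
        h_off = int(rng.integers(0, h - hrg + 1))
        w_off = int(rng.integers(0, w - wrg + 1))
    else:
        # Central crop: equal margins on both sides (rounded down).
        h_off = (h - hrg) // 2
        w_off = (w - wrg) // 2
    return x[h_off:h_off + hrg, w_off:w_off + wrg]


image = np.arange(6 * 8).reshape(6, 8, 1)
center = crop_sketch(image, wrg=4, hrg=2)
random_crop = crop_sketch(image, wrg=4, hrg=2, is_random=True,
                          rng=np.random.default_rng(0))
print(center.shape, random_crop.shape)
```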

Flip

tensorlayer.prepro.flip_axis(x, axis=1, is_random=False)[source]

Flip the axis of an image, such as flip left and right, up and down, randomly or non-randomly.

Parameters:

x : numpy array

An image with dimension of [row, col, channel] (default).

axis : int

  • 0, flip up and down
  • 1, flip left and right
  • 2, flip channel

is_random : boolean, default False

If True, randomly flip.

tensorlayer.prepro.flip_axis_multi(x, axis, is_random=False)[source]

Flip the axes of multiple images together, such as flip left and right, up and down, randomly or non-randomly.

Parameters:

x : list of numpy array

List of images with dimension of [n_images, row, col, channel] (default).

others : see flip_axis.
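
Why the multi variants exist can be shown with a small NumPy sketch: the random decision is drawn once and applied to every image, so an image and its segmentation mask stay aligned. The name flip_axis_pair_sketch and the 50% flip probability are assumptions made for illustration.

```python
import random

import numpy as np


def flip_axis_pair_sketch(images, axis=1, is_random=False, rng=random):
    """Flip all images along `axis`, using a single shared decision."""
    # One draw for the whole list, never one draw per image.
    do_flip = (rng.random() < 0.5) if is_random else True
    if not do_flip:
        return [np.asarray(im) for im in images]
    return [np.flip(np.asarray(im), axis=axis) for im in images]


image = np.arange(6).reshape(2, 3, 1)
mask = (image > 2).astype(np.uint8)

flipped_image, flipped_mask = flip_axis_pair_sketch([image, mask], axis=1)
rand_image, rand_mask = flip_axis_pair_sketch(
    [image, mask], axis=1, is_random=True, rng=random.Random(0)
)
```

Calling a single-image function twice with is_random=True would draw two independent decisions, which could flip the image but not its mask.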

Shift

tensorlayer.prepro.shift(x, wrg=0.1, hrg=0.1, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Shift an image randomly or non-randomly.

Parameters:

x : numpy array

An image with dimension of [row, col, channel] (default).

wrg : float

Percentage of shift in axis x, usually -0.25 ~ 0.25.

hrg : float

Percentage of shift in axis y, usually -0.25 ~ 0.25.

is_random : boolean, default False

If True, randomly shift.

row_index, col_index, channel_index : int

Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).

fill_mode : string

Method to fill missing pixel, default ‘nearest’, more options ‘constant’, ‘reflect’ or ‘wrap’.

cval : scalar, optional

Value used for points outside the boundaries of the input if mode='constant'. Default is 0.0.

order : int, optional

The order of interpolation. The order has to be in the range 0-5. See apply_transform.

tensorlayer.prepro.shift_multi(x, wrg=0.1, hrg=0.1, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Shift images with the same arguments, randomly or non-randomly.
Usually used for image segmentation, where x=[X, Y] and X and Y must stay matched.

Parameters:

x : list of numpy array

List of images with dimension of [n_images, row, col, channel] (default).

others : see shift.

Shear

tensorlayer.prepro.shear(x, intensity=0.1, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Shear an image randomly or non-randomly.

Parameters:

x : numpy array

An image with dimension of [row, col, channel] (default).

intensity : float

Percentage of shear, usually -0.5 ~ 0.5 (is_random==True) or 0 ~ 0.5 (is_random==False).
For a quick try, call shear(X, 1).

is_random : boolean, default False

If True, randomly shear.

row_index, col_index, channel_index : int

Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).

fill_mode : string

Method to fill missing pixel, default ‘nearest’, more options ‘constant’, ‘reflect’ or ‘wrap’.

cval : scalar, optional

Value used for points outside the boundaries of the input if mode='constant'. Default is 0.0.

order : int, optional

The order of interpolation. The order has to be in the range 0-5. See apply_transform.

tensorlayer.prepro.shear_multi(x, intensity=0.1, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Shear images with the same arguments, randomly or non-randomly.
Usually used for image segmentation, where x=[X, Y] and X and Y must stay matched.

Parameters:

x : list of numpy array

List of images with dimension of [n_images, row, col, channel] (default).

others : see tl.prepro.shear.

Shear V2

tensorlayer.prepro.shear2(x, shear=(0.1, 0.1), is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Shear an image randomly or non-randomly.

Parameters:

x : numpy array

An image with dimension of [row, col, channel] (default).

shear : tuple of two floats

Percentage of shear for height and width direction (0, 1).

is_random : boolean, default False

If True, randomly shear.

row_index, col_index, channel_index : int

Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).

fill_mode : string

Method to fill missing pixel, default ‘nearest’, more options ‘constant’, ‘reflect’ or ‘wrap’.

cval : scalar, optional

Value used for points outside the boundaries of the input if mode='constant'. Default is 0.0.

order : int, optional

The order of interpolation. The order has to be in the range 0-5. See apply_transform.

tensorlayer.prepro.shear_multi2(x, shear=(0.1, 0.1), is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Shear images with the same arguments, randomly or non-randomly.
Usually used for image segmentation, where x=[X, Y] and X and Y must stay matched.

Parameters:

x : list of numpy array

List of images with dimension of [n_images, row, col, channel] (default).

others : see tl.prepro.shear2.

Swirl

tensorlayer.prepro.swirl(x, center=None, strength=1, radius=100, rotation=0, output_shape=None, order=1, mode='constant', cval=0, clip=True, preserve_range=False, is_random=False)[source]

Swirl an image randomly or non-randomly, see scikit-image swirl API
and example.

Parameters:

x : numpy array

An image with dimension of [row, col, channel] (default).

center : (row, column) tuple or (2,) ndarray, optional

Center coordinate of transformation.

strength : float, optional

The amount of swirling applied.

radius : float, optional

The extent of the swirl in pixels. The effect dies out rapidly beyond radius.

rotation : float, (degree) optional

Additional rotation applied to the image, usually [0, 360], relates to center.

output_shape : tuple (rows, cols), optional

Shape of the output image generated. By default the shape of the input image is preserved.

order : int, optional

The order of the spline interpolation, default is 1. The order has to be in the range 0-5. See skimage.transform.warp for detail.

mode : {‘constant’, ‘edge’, ‘symmetric’, ‘reflect’, ‘wrap’}, optional

Points outside the boundaries of the input are filled according to the given mode, with ‘constant’ used as the default. Modes match the behaviour of numpy.pad.

cval : float, optional

Used in conjunction with mode ‘constant’, the value outside the image boundaries.

clip : bool, optional

Whether to clip the output to the range of values of the input image. This is enabled by default, since higher order interpolation may produce values outside the given input range.

preserve_range : bool, optional

Whether to keep the original range of values. Otherwise, the input image is converted according to the conventions of img_as_float.

is_random : boolean, default False

If True, random swirl.
  • random center = [(0 ~ x.shape[0]), (0 ~ x.shape[1])]
  • random strength = [0, strength]
  • random radius = [1e-10, radius]
  • random rotation = [-rotation, rotation]

Examples

>>> # x: [row, col, 1], greyscale
>>> x = swirl(x, strength=4, radius=100)

tensorlayer.prepro.swirl_multi(x, center=None, strength=1, radius=100, rotation=0, output_shape=None, order=1, mode='constant', cval=0, clip=True, preserve_range=False, is_random=False)[source]

Swirl multiple images with the same arguments, randomly or non-randomly.
Usually used for image segmentation, where x=[X, Y] and X and Y must stay matched.

Parameters:

x : list of numpy array

List of images with dimension of [n_images, row, col, channel] (default).

others : see swirl.

Elastic transform

tensorlayer.prepro.elastic_transform(x, alpha, sigma, mode='constant', cval=0, is_random=False)[source]

Elastic deformation of images as described in [Simard2003] .

Parameters:

x : numpy array, a greyscale image.

alpha : scalar factor.

sigma : scalar or sequence of scalars, the smaller the sigma, the more transformation.

Standard deviation for Gaussian kernel. The standard deviations of the Gaussian filter are given for each axis as a sequence, or as a single number, in which case it is equal for all axes.

mode : default constant, see scipy.ndimage.filters.gaussian_filter.

cval : float, optional. Used in conjunction with mode ‘constant’, the value outside the image boundaries.

is_random : boolean, default False

Examples

>>> x = elastic_transform(x, alpha = x.shape[1] * 3, sigma = x.shape[1] * 0.07)

tensorlayer.prepro.elastic_transform_multi(x, alpha, sigma, mode='constant', cval=0, is_random=False)[source]

Elastic deformation of images as described in [Simard2003].

Parameters:

x : list of numpy array

others : see elastic_transform.

Zoom

tensorlayer.prepro.zoom(x, zoom_range=(0.9, 1.1), is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Zoom in and out of a single image, randomly or non-randomly.

Parameters:

x : numpy array

An image with dimension of [row, col, channel] (default).

zoom_range : list or tuple

  • If is_random=False, (h, w) are the fixed zoom factors for the row and column axes; a factor smaller than one zooms in.
  • If is_random=True, it is the (min, max) range from which separate random zoom factors for x and y are drawn.

e.g. (0.5, 1) zooms in by 1 to 2 times.

is_random : boolean, default False

If True, randomly zoom.

row_index, col_index, channel_index : int

Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).

fill_mode : string

Method to fill missing pixel, default ‘nearest’, more options ‘constant’, ‘reflect’ or ‘wrap’.

cval : scalar, optional

Value used for points outside the boundaries of the input if mode='constant'. Default is 0.0.

order : int, optional

The order of interpolation. The order has to be in the range 0-5. See apply_transform.

tensorlayer.prepro.zoom_multi(x, zoom_range=(0.9, 1.1), is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Zoom in and out of images with the same arguments, randomly or non-randomly.
Usually used for image segmentation, where x=[X, Y] and X and Y must stay matched.

Parameters:

x : list of numpy array

List of images with dimension of [n_images, row, col, channel] (default).

others : see zoom.

Brightness

tensorlayer.prepro.brightness(x, gamma=1, gain=1, is_random=False)[source]

Change the brightness of a single image, randomly or non-randomly.

Parameters:

x : numpy array

An image with dimension of [row, col, channel] (default).

gamma : float, smaller than 1 means brighter.

Non negative real number. Default value is 1, smaller means brighter.

  • If is_random is True, gamma in a range of (1-gamma, 1+gamma).

gain : float

The constant multiplier. Default value is 1.

is_random : boolean, default False

  • If True, randomly change brightness.
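
The roles of gamma and gain can be sketched as a standard gamma correction, out = gain * in**gamma, applied to an image already scaled to [0, 1]. This formula and the name gamma_adjust_sketch are assumptions for illustration; the routine that TensorLayer actually calls is not shown in this document.

```python
import numpy as np


def gamma_adjust_sketch(x, gamma=1.0, gain=1.0):
    """Apply out = gain * x**gamma to an image with values in [0, 1]."""
    x = np.asarray(x, dtype=np.float64)
    return np.clip(gain * np.power(x, gamma), 0.0, 1.0)


image = np.array([[0.0, 0.25], [0.5, 1.0]])
brighter = gamma_adjust_sketch(image, gamma=0.5)  # gamma < 1 lifts mid-tones
darker = gamma_adjust_sketch(image, gamma=2.0)    # gamma > 1 lowers them
print(brighter)
print(darker)
```

Under this formula, values of 0 and 1 are fixed points, so only the mid-tones move, which matches the note that a smaller gamma means a brighter image.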

tensorlayer.prepro.brightness_multi(x, gamma=1, gain=1, is_random=False)[source]

Change the brightness of multiple images, randomly or non-randomly.
Usually used for image segmentation, where x=[X, Y] and X and Y must stay matched.

Parameters:

x : list of numpy array

List of images with dimension of [n_images, row, col, channel] (default).

others : see brightness.

Brightness, saturation, contrast

tensorlayer.prepro.illumination(x, gamma=1.0, contrast=1.0, saturation=1.0, is_random=False)[source]

Perform illumination augmentation for a single image, randomly or non-randomly.

Parameters:

x : numpy array

an image with dimension of [row, col, channel] (default).

gamma : change brightness (the same with tl.prepro.brightness)

  • if is_random=False, one float number; smaller than one means brighter, greater than one means darker.
  • if is_random=True, tuple of two float numbers, (min, max).

contrast : change contrast

  • if is_random=False, one float number; smaller than one means blur.
  • if is_random=True, tuple of two float numbers, (min, max).

saturation : change saturation

  • if is_random=False, one float number; smaller than one means less saturated.
  • if is_random=True, tuple of two float numbers, (min, max).

is_random : whether the parameters are randomly set.

Examples

  • Random
>>> x = illumination(x, gamma=(0.5, 5.0), contrast=(0.3, 1.0), saturation=(0.7, 1.0), is_random=True)
  • Non-random
>>> x = illumination(x, 0.5, 0.6, 0.8, is_random=False)

RGB to HSV

tensorlayer.prepro.rgb_to_hsv(rgb)[source]

Input RGB image [0~255], return HSV image [0~1].

Parameters: rgb : a numpy array with values between 0 and 255.

HSV to RGB

tensorlayer.prepro.hsv_to_rgb(hsv)[source]

Input HSV image [0~1], return RGB image [0~255].

Parameters: hsv : a numpy array with values between 0.0 and 1.0.
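
The value ranges of the two conversions can be checked on single pixels with Python's standard colorsys module. This is only a per-pixel illustration of the [0~255] to [0~1] convention described above, not the array-based library code; the two helper names are hypothetical.

```python
import colorsys


def rgb255_to_hsv_sketch(r, g, b):
    """RGB channels in 0..255 -> (h, s, v), each in 0..1."""
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)


def hsv_to_rgb255_sketch(h, s, v):
    """(h, s, v) in 0..1 -> RGB channels rounded to integers in 0..255."""
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return tuple(int(round(c * 255)) for c in (r, g, b))


red_hsv = rgb255_to_hsv_sketch(255, 0, 0)
green_hsv = rgb255_to_hsv_sketch(0, 255, 0)
round_trip = hsv_to_rgb255_sketch(*rgb255_to_hsv_sketch(200, 100, 50))
print(red_hsv, green_hsv, round_trip)
```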

Adjust hue

tensorlayer.prepro.adjust_hue(im, hout=0.66, is_offset=True, is_clip=True, is_random=False)[source]

Adjust hue of an RGB image. This is a convenience method that converts an RGB image to float representation, converts it to HSV, adds an offset to the hue channel, converts back to RGB and then back to the original data type.
For TF, see tf.image.adjust_hue and tf.image.random_hue.

Parameters:

im : a numpy array with values between 0 and 255.

hout : float.

  • If is_offset is False, set all hue values to this value. 0 is red; 0.33 is green; 0.66 is blue.
  • If is_offset is True, add this value as the offset to the hue channel.

is_offset : boolean, default True.

is_clip : boolean, default True.

  • If True, set negative hue values to 0.

is_random : boolean, default False.

Examples

  • Random, add a random value between -0.2 and 0.2 as the offset to every hue value.
>>> im_hue = tl.prepro.adjust_hue(image, hout=0.2, is_offset=True, is_random=True)
  • Non-random, set every hue to green.
>>> im_green = tl.prepro.adjust_hue(image, hout=0.33, is_offset=False, is_random=False)
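
The difference between is_offset=True and is_offset=False can be sketched for one pixel with the standard colorsys module. The function shift_hue_sketch is hypothetical; it follows the parameter descriptions above (offset versus fixed hue, negative hues set to 0 when is_clip is True) and works on a single (r, g, b) tuple instead of a whole array.

```python
import colorsys


def shift_hue_sketch(rgb, hout, is_offset=True, is_clip=True):
    """Adjust the hue of one (r, g, b) pixel with channels in 0..255."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # Offset mode adds to the existing hue; otherwise the hue is overwritten.
    h = h + hout if is_offset else hout
    if is_clip:
        h = max(h, 0.0)  # negative hue values are set to 0
    return tuple(int(round(c * 255)) for c in colorsys.hsv_to_rgb(h, s, v))


red = (255, 0, 0)
# Red has hue 0; adding one third moves it to green.
offset_result = shift_hue_sketch(red, hout=1.0 / 3.0, is_offset=True)
# Overwriting the hue with 0 turns a greenish pixel red, keeping s and v.
fixed_result = shift_hue_sketch((10, 200, 30), hout=0.0, is_offset=False)
print(offset_result, fixed_result)
```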

Resize

tensorlayer.prepro.imresize(x, size=[100, 100], interp='bicubic', mode=None)[source]

Resize an image by given output size and method. Warning: this function
rescales the values to [0, 255].

Parameters:

x : numpy array

An image with dimension of [row, col, channel] (default).

size : int, float or tuple (h, w)

  • int, Percentage of current size.
  • float, Fraction of current size.
  • tuple, Size of the output image.

interp : str, optional

Interpolation to use for re-sizing (‘nearest’, ‘lanczos’, ‘bilinear’, ‘bicubic’ or ‘cubic’).

mode : str, optional

The PIL image mode (‘P’, ‘L’, etc.) to convert arr before resizing.

Returns:

imresize : ndarray

The resized array of image.

Pixel value scaling

tensorlayer.prepro.pixel_value_scale(im, val=0.9, clip=[], is_random=False)[source]

Scales each value in the pixels of the image.

Parameters:

im : numpy array for one image.

val : float.

  • If is_random=False, multiply this value with all pixels.
  • If is_random=True, multiply a value between [1-val, 1+val] with all pixels.

Examples

  • Random
>>> im = pixel_value_scale(im, 0.1, [0, 255], is_random=True)
  • Non-random
>>> im = pixel_value_scale(im, 0.9, [0, 255], is_random=False)
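
The two modes of val can be sketched in NumPy as follows. The function pixel_value_scale_sketch and its rng argument are illustrative assumptions; it follows the description above, multiplying by val directly or by a factor drawn from [1-val, 1+val], then clipping when a (min, max) pair is given.

```python
import numpy as np


def pixel_value_scale_sketch(im, val=0.9, clip=None, is_random=False, rng=None):
    """Multiply all pixels by one factor, then optionally clip to (min, max)."""
    im = np.asarray(im, dtype=np.float64)
    if is_random:
        if rng is None:
            rng = np.random.default_rng()
        factor = rng.uniform(1.0 - val, 1.0 + val)
    else:
        factor = val
    out = im * factor
    if clip:  # an empty list or None means no clipping
        out = np.clip(out, clip[0], clip[1])
    return out


im = np.array([[100.0, 200.0], [250.0, 0.0]])
fixed = pixel_value_scale_sketch(im, 0.9, [0, 255])
rand = pixel_value_scale_sketch(im, 0.1, [0, 255], is_random=True,
                                rng=np.random.default_rng(0))
print(fixed)
```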

Normalization

tensorlayer.prepro.samplewise_norm(x, rescale=None, samplewise_center=False, samplewise_std_normalization=False, channel_index=2, epsilon=1e-07)[source]

Normalize an image by rescale, samplewise centering and samplewise std normalization, in order.

Parameters:

x : numpy array

An image with dimension of [row, col, channel] (default).

rescale : rescaling factor.

If None or 0, no rescaling is applied, otherwise we multiply the data by the value provided (before applying any other transformation)

samplewise_center : set each sample mean to 0.

samplewise_std_normalization : divide each input by its std.

epsilon : small positive value added to the standard deviation before dividing.

Notes

When samplewise_center and samplewise_std_normalization are both True:

  • For a greyscale image, the mean of the whole image is subtracted from every pixel, which is then divided by the std of the whole image.
  • For an RGB image, every pixel is normalized by the mean and std of that pixel, i.e. the mean and std of each pixel become 0 and 1.

Examples

>>> x = samplewise_norm(x, samplewise_center=True, samplewise_std_normalization=True)
>>> print(x.shape, np.mean(x), np.std(x))
... (160, 176, 1), 0.0, 1.0

tensorlayer.prepro.featurewise_norm(x, mean=None, std=None, epsilon=1e-07)[source]

Normalize every pixel by the same given mean and std, which are usually
computed from all examples.

Parameters:

x : numpy array

An image with dimension of [row, col, channel] (default).

mean : value for subtraction.

std : value for division.

epsilon : small positive value added to the standard deviation before dividing.
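
The contrast between samplewise and featurewise normalization can be sketched in NumPy. Both helper functions below are hypothetical and cover only the greyscale, whole-image case: the samplewise version takes its statistics from the image itself, while the featurewise version receives a mean and std computed once over the whole dataset.

```python
import numpy as np


def samplewise_norm_sketch(x, epsilon=1e-7):
    """Center and scale one image using its own mean and std."""
    x = np.asarray(x, dtype=np.float64)
    return (x - x.mean()) / (x.std() + epsilon)


def featurewise_norm_sketch(x, mean, std, epsilon=1e-7):
    """Center and scale one image using dataset-level mean and std."""
    x = np.asarray(x, dtype=np.float64)
    return (x - mean) / (std + epsilon)


# A toy dataset of 10 greyscale images, [batch, row, col, 1].
dataset = np.random.default_rng(0).uniform(0, 255, size=(10, 4, 4, 1))

one = samplewise_norm_sketch(dataset[0])

mean, std = dataset.mean(), dataset.std()
all_normed = np.stack([featurewise_norm_sketch(im, mean, std) for im in dataset])
print(one.mean(), one.std(), all_normed.mean(), all_normed.std())
```

With featurewise statistics, single images generally do not have zero mean; only the dataset as a whole does.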

Channel shift

tensorlayer.prepro.channel_shift(x, intensity, is_random=False, channel_index=2)[source]

Shift the channels of an image, randomly or non-randomly, see numpy.rollaxis.

Parameters:

x : numpy array

An image with dimension of [row, col, channel] (default).

intensity : float

Intensity of shifting.

is_random : boolean, default False

If True, randomly shift.

channel_index : int

Index of channel, default 2.

tensorlayer.prepro.channel_shift_multi(x, intensity, is_random=False, channel_index=2)[source]

Shift the channels of images with the same arguments, randomly or non-randomly, see numpy.rollaxis.
Usually used for image segmentation, where x=[X, Y] and X and Y must stay matched.

Parameters:

x : list of numpy array

List of images with dimension of [n_images, row, col, channel] (default).

others : see channel_shift.

Noise

tensorlayer.prepro.drop(x, keep=0.5)[source]

Randomly set some pixels to zero by a given keeping probability.

Parameters:

x : numpy array

An image with dimension of [row, col, channel] or [row, col].

keep : float (0, 1)

The keeping probability; the lower it is, the more values are set to zero.
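
The effect of keep can be sketched with a random binary mask in NumPy. The function drop_sketch is a hypothetical stand-in that draws an independent decision for every array element; whether the library masks per element or per pixel across channels is not stated here, so treat that as an assumption.

```python
import numpy as np


def drop_sketch(x, keep=0.5, rng=None):
    """Zero out each element independently with probability 1 - keep."""
    if rng is None:
        rng = np.random.default_rng()
    x = np.asarray(x)
    mask = rng.random(x.shape) < keep  # True where the value survives
    return x * mask


image = np.ones((100, 100, 1))
noisy = drop_sketch(image, keep=0.5, rng=np.random.default_rng(0))
kept_fraction = noisy.mean()  # close to keep for an all-ones image
print(kept_fraction)
```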

Moving the transform origin to the image center

tensorlayer.prepro.transform_matrix_offset_center(matrix, x, y)[source]

Return the transform matrix with its origin offset to the image center.

Parameters:

matrix : numpy array

Transform matrix

x, y : int

Size of image.

Examples

  • See rotation, shear, zoom.
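
What offsetting the center means can be sketched with 3x3 homogeneous matrices: translate the center to the origin, apply the transform, then translate back. The function offset_center_sketch is hypothetical, and the + 0.5 in the center coordinates follows a Keras-style convention that is assumed here, not confirmed by this document.

```python
import numpy as np


def offset_center_sketch(matrix, x, y):
    """Wrap a 3x3 transform so that it acts about the image center."""
    o_x = float(x) / 2 + 0.5  # assumed center convention
    o_y = float(y) / 2 + 0.5
    to_center = np.array([[1, 0, o_x], [0, 1, o_y], [0, 0, 1]])
    from_center = np.array([[1, 0, -o_x], [0, 1, -o_y], [0, 0, 1]])
    # Read right to left: move center to origin, transform, move back.
    return to_center @ matrix @ from_center


theta = np.deg2rad(40)
rotation = np.array([
    [np.cos(theta), -np.sin(theta), 0],
    [np.sin(theta), np.cos(theta), 0],
    [0, 0, 1],
])

centered = offset_center_sketch(rotation, 32, 32)
center_point = np.array([16.5, 16.5, 1.0])
print(centered @ center_point)
```

Without the offset, a rotation pivots around the top-left corner at index (0, 0); with it, the center point is left in place.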

Affine transform by matrix

tensorlayer.prepro.apply_transform(x, transform_matrix, channel_index=2, fill_mode='nearest', cval=0.0, order=1)[source]

Return transformed images by given transform_matrix from transform_matrix_offset_center.

Parameters:

x : numpy array

An image with dimension of [row, col, channel] (default).

transform_matrix : numpy array

Transform matrix (offset center), can be generated by transform_matrix_offset_center

channel_index : int

Index of channel, default 2.

fill_mode : string

Method to fill missing pixel, default ‘nearest’, more options ‘constant’, ‘reflect’ or ‘wrap’

cval : scalar, optional

Value used for points outside the boundaries of the input if mode='constant'. Default is 0.0

order : int, optional

The order of interpolation. The order has to be in the range 0-5.

Examples

  • See rotation, shift, shear, zoom.

Projective transform by points

tensorlayer.prepro.projective_transform_by_points(x, src, dst, map_args={}, output_shape=None, order=1, mode='constant', cval=0.0, clip=True, preserve_range=False)[source]

Projective transform by given coordinates, usually 4 coordinates. See scikit-image.

Parameters:

x : numpy array

An image with dimension of [row, col, channel] (default).

src : list or numpy

The original coordinates, usually 4 coordinates of (width, height).

dst : list or numpy

The coordinates after transformation, the number of coordinates is the same with src.

map_args : dict, optional

Keyword arguments passed to inverse_map.

output_shape : tuple (rows, cols), optional

Shape of the output image generated. By default the shape of the input image is preserved. Note that, even for multi-band images, only rows and columns need to be specified.

order : int, optional

The order of interpolation. The order has to be in the range 0-5:

  • 0 Nearest-neighbor
  • 1 Bi-linear (default)
  • 2 Bi-quadratic
  • 3 Bi-cubic
  • 4 Bi-quartic
  • 5 Bi-quintic

mode : {‘constant’, ‘edge’, ‘symmetric’, ‘reflect’, ‘wrap’}, optional

Points outside the boundaries of the input are filled according to the given mode. Modes match the behaviour of numpy.pad.

cval : float, optional

Used in conjunction with mode ‘constant’, the value outside the image boundaries.

clip : bool, optional

Whether to clip the output to the range of values of the input image. This is enabled by default, since higher order interpolation may produce values outside the given input range.

preserve_range : bool, optional

Whether to keep the original range of values. Otherwise, the input image is converted according to the conventions of img_as_float.

Examples

>>> # Assume X is an image from CIFAR-10, i.e. X.shape == (32, 32, 3)
>>> src = [[0,0],[0,32],[32,0],[32,32]]     # [w, h]
>>> dst = [[10,10],[0,32],[32,0],[32,32]]
>>> x = projective_transform_by_points(X, src, dst)

Numpy and PIL

tensorlayer.prepro.array_to_img(x, dim_ordering=(0, 1, 2), scale=True)

Converts a numpy array to PIL image object (uint8 format).

Parameters:

x : numpy array

An image with 3 dimensions and 1 or 3 channels.

dim_ordering : list or tuple of 3 int

Indices of row, col and channel; default (0, 1, 2), for Theano (1, 2, 0).

scale : boolean, default is True

If True, rescales the image to [0, 255] from any value range, e.g. [-1, 2].

Find contours

tensorlayer.prepro.find_contours(x, level=0.8, fully_connected='low', positive_orientation='low')

Find iso-valued contours in a 2D array for a given level value; returns a list of (n, 2)-ndarrays.
See skimage.measure.find_contours.

Parameters:

x : 2D ndarray of double. Input data in which to find contours.

level : float. Value along which to find contours in the array.

fully_connected : str, {‘low’, ‘high’}. Indicates whether array elements below the given level value are to be considered fully-connected (and hence elements above the value will only be face connected), or vice versa.

positive_orientation : either ‘low’ or ‘high’. Indicates whether the output contours will produce positively-oriented polygons around islands of low- or high-valued elements. If ‘low’ then contours will wind counter-clockwise around elements below the iso-value. Alternately, this means that low-valued elements are always on the left of the contour.

Points to map

tensorlayer.prepro.pt2map(list_points=[], size=(100, 100), val=1)

Takes a list of points and returns a 2D image.

Parameters:

list_points : list of [x, y].

size : tuple of (w, h) for output size.

val : float or int for the contour value.
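
As a rough illustration of the idea behind pt2map, the following pure-Python sketch draws a list of points onto an empty grid. The helper name points_to_map and the indexing convention (row = y, column = x, output of h rows by w columns) are assumptions made for this example and may differ from the library's behaviour.

```python
def points_to_map(points, size=(100, 100), val=1):
    """Return an h-by-w grid (list of rows) with `val` at each [x, y] point.

    Hypothetical helper: row = y and column = x is an assumption.
    """
    w, h = size
    grid = [[0] * w for _ in range(h)]
    for x, y in points:
        col, row = int(round(x)), int(round(y))
        if 0 <= row < h and 0 <= col < w:  # silently skip out-of-range points
            grid[row][col] = val
    return grid


grid = points_to_map([[1, 0], [2, 3]], size=(4, 5), val=1)
for row in grid:
    print(row)
```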

Binary dilation

tensorlayer.prepro.binary_dilation(x, radius=3)

Return fast binary morphological dilation of an image,
see skimage.morphology.binary_dilation.

Parameters:

x : 2D array image.

radius : int for the radius of mask.

Greyscale dilation

tensorlayer.prepro.dilation(x, radius=3)

Return greyscale morphological dilation of an image,
see skimage.morphology.dilation.

Parameters:

x : 2D array image.

radius : int for the radius of mask.

Binary erosion

tensorlayer.prepro.binary_erosion(x, radius=3)

Return binary morphological erosion of an image,
see skimage.morphology.binary_erosion.

Parameters:

x : 2D array image.

radius : int for the radius of mask.

Greyscale erosion

tensorlayer.prepro.erosion(x, radius=3)

Return greyscale morphological erosion of an image,
see skimage.morphology.erosion.

Parameters:

x : 2D array image.

radius : int for the radius of mask.
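
To make the effect of the radius parameter concrete, here is a minimal pure-Python sketch of binary dilation and erosion with a disk-shaped mask. It is not the scikit-image implementation that the functions above refer to; the helper names and the border handling (out-of-bounds pixels are ignored) are assumptions chosen for this example.

```python
def disk_offsets(radius):
    """Offsets (dy, dx) that fall inside a disk-shaped mask of the given radius."""
    r = range(-radius, radius + 1)
    return [(dy, dx) for dy in r for dx in r
            if dy * dy + dx * dx <= radius * radius]


def binary_dilate(img, radius=1):
    """Set a pixel to 1 if any in-bounds pixel under the mask is 1."""
    h, w = len(img), len(img[0])
    offs = disk_offsets(radius)
    return [[int(any(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy, dx in offs))
             for x in range(w)] for y in range(h)]


def binary_erode(img, radius=1):
    """Keep a pixel at 1 only if every in-bounds pixel under the mask is 1."""
    h, w = len(img), len(img[0])
    offs = disk_offsets(radius)
    return [[int(all(img[y + dy][x + dx]
                     for dy, dx in offs
                     if 0 <= y + dy < h and 0 <= x + dx < w))
             for x in range(w)] for y in range(h)]


img = [[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]]
dilated = binary_dilate(img, radius=1)    # the single pixel grows into a plus shape
eroded = binary_erode(dilated, radius=1)  # erosion shrinks the plus back to one pixel
```

The greyscale variants follow the same pattern, with any/all replaced by max/min over the pixels under the mask.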

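Object detection

Tutorial - Image augmentation

Hi, this is an image augmentation example based on the VOC dataset; please read this Zhihu article.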

import tensorlayer as tl

## Download the VOC 2012 dataset
imgs_file_list, _, _, _, classes, _, _,\
    _, objs_info_list, _ = tl.files.load_voc_dataset(dataset="2012")

## Parse the image annotations into list format
ann_list = []
for info in objs_info_list:
    ann = tl.prepro.parse_darknet_ann_str_to_list(info)
    c, b = tl.prepro.parse_darknet_ann_list_to_cls_box(ann)
    ann_list.append([c, b])

# Read one image and save it
idx = 2  # you can pick a different image
image = tl.vis.read_image(imgs_file_list[idx])
tl.vis.draw_boxes_and_labels_to_image(image, ann_list[idx][0],
     ann_list[idx][1], [], classes, True, save_name='_im_original.png')

# Left-right flip
im_flip, coords = tl.prepro.obj_box_left_right_flip(image,
        ann_list[idx][1], is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_flip, ann_list[idx][0],
        coords, [], classes, True, save_name='_im_flip.png')

# Resize the image
im_resize, coords = tl.prepro.obj_box_imresize(image,
        coords=ann_list[idx][1], size=[300, 200], is_rescale=True)
tl.vis.draw_boxes_and_labels_to_image(im_resize, ann_list[idx][0],
        coords, [], classes, True, save_name='_im_resize.png')

# Crop
im_crop, clas, coords = tl.prepro.obj_box_crop(image, ann_list[idx][0],
         ann_list[idx][1], wrg=200, hrg=200,
         is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_crop, clas, coords, [],
         classes, True, save_name='_im_crop.png')

# Shift
im_shift, clas, coords = tl.prepro.obj_box_shift(image, ann_list[idx][0],
        ann_list[idx][1], wrg=0.1, hrg=0.1,
        is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_shift, clas, coords, [],
        classes, True, save_name='_im_shift.png')

# Zoom (height and width)
im_zoom, clas, coords = tl.prepro.obj_box_zoom(image, ann_list[idx][0],
        ann_list[idx][1], zoom_range=(1.3, 0.7),
        is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_zoom, clas, coords, [],
        classes, True, save_name='_im_zoom.png')

In practice, you may want to process a batch of data with multiple threads, as follows.

import tensorlayer as tl
import random

# imgs_file_list, ann_list and classes come from the previous example
batch_size = 64
im_size = [416, 416]
n_data = len(imgs_file_list)
jitter = 0.2
def _data_pre_aug_fn(data):
    im, ann = data
    clas, coords = ann
    ## Randomly change the brightness, contrast and saturation of the image
    im = tl.prepro.illumination(im, gamma=(0.5, 1.5),
             contrast=(0.5, 1.5), saturation=(0.5, 1.5), is_random=True)
    ## Random left-right flip
    im, coords = tl.prepro.obj_box_left_right_flip(im, coords,
             is_rescale=True, is_center=True, is_random=True)
    ## Randomly resize, then crop to the target size; this also gives a random zoom effect
    tmp0 = random.randint(1, int(im_size[0]*jitter))
    tmp1 = random.randint(1, int(im_size[1]*jitter))
    im, coords = tl.prepro.obj_box_imresize(im, coords,
            [im_size[0]+tmp0, im_size[1]+tmp1], is_rescale=True,
             interp='bicubic')
    im, clas, coords = tl.prepro.obj_box_crop(im, clas, coords,
             wrg=im_size[1], hrg=im_size[0], is_rescale=True,
             is_center=True, is_random=True)
    ## Map the value range from [0, 255] to [-1, 1] (optional)
    im = im / 127.5 - 1
    return im, [clas, coords]

# Randomly read a batch of images and their annotations
idexs = tl.utils.get_random_int(min=0, max=n_data-1, number=batch_size)
b_im_path = [imgs_file_list[i] for i in idexs]
b_images = tl.prepro.threading_data(b_im_path, fn=tl.vis.read_image)
b_ann = [ann_list[i] for i in idexs]

# Process with multiple threads
data = tl.prepro.threading_data([_ for _ in zip(b_images, b_ann)],
              _data_pre_aug_fn)
b_images2 = [d[0] for d in data]
b_ann = [d[1] for d in data]

# Save each pair of images for inspection
for i in range(len(b_images)):
    tl.vis.draw_boxes_and_labels_to_image(b_images[i],
             ann_list[idexs[i]][0], ann_list[idexs[i]][1], [],
             classes, True, save_name='_bbox_vis_%d_original.png' % i)
    tl.vis.draw_boxes_and_labels_to_image((b_images2[i]+1)*127.5,
             b_ann[i][0], b_ann[i][1], [], classes, True,
             save_name='_bbox_vis_%d.png' % i)

Coordinate - pixel unit to ratio unit

tensorlayer.prepro.obj_box_coord_rescale(coord=[], shape=[100, 200])

Scale down one coordinate from pixel unit to the ratio of the image size, i.e. into the range [0, 1].
It is the reverse process of obj_box_coord_scale_to_pixelunit.

Parameters:

coord : list of 4 numbers for one coordinate [x, y, w, h]

shape : list of 2 integers for [height, width] of the image.

Examples

>>> coord = obj_box_coord_rescale(coord=[30, 40, 50, 50], shape=[100, 100])
... [0.3, 0.4, 0.5, 0.5]

Coordinates - pixel unit to ratio unit (multiple coordinates)

tensorlayer.prepro.obj_box_coords_rescale(coords=[], shape=[100, 200])

Scale down a list of coordinates from pixel unit to the ratio of the image size, i.e. into the range [0, 1].

Parameters:

coords : list of list for coordinates [[x, y, w, h], [x, y, w, h], ...]

shape : list of 2 integers for [height, width] of the image.

Examples

>>> coords = obj_box_coords_rescale(coords=[[30, 40, 50, 50], [10, 10, 20, 20]], shape=[100, 100])
>>> print(coords)
... [[0.3, 0.4, 0.5, 0.5], [0.1, 0.1, 0.2, 0.2]]
>>> coords = obj_box_coords_rescale(coords=[[30, 40, 50, 50]], shape=[50, 100])
>>> print(coords)
... [[0.3, 0.8, 0.5, 1.0]]
>>> coords = obj_box_coords_rescale(coords=[[30, 40, 50, 50]], shape=[100, 200])
>>> print(coords)
... [[0.15, 0.4, 0.25, 0.5]]
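
The arithmetic behind these examples can be sketched in a few lines: x and w are divided by the image width, y and h by the image height. The helper name rescale_coords is hypothetical and the code is only an illustration of the computation, not the library source.

```python
def rescale_coords(coords, shape):
    """Convert [[x, y, w, h], ...] from pixel units to ratios of the image size.

    shape is [height, width], as in the parameter description above.
    """
    imh, imw = shape[0], shape[1]
    return [[x / imw, y / imh, w / imw, h / imh] for x, y, w, h in coords]


ratios = rescale_coords([[30, 40, 50, 50]], shape=[100, 200])
print(ratios)
```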

Coordinate - ratio unit to pixel unit

tensorlayer.prepro.obj_box_coord_scale_to_pixelunit(coord, shape=(100, 100, 3))

Convert one coordinate [x, y, w (or x2), h (or y2)] in ratio format to image coordinate format.
It is the reverse process of obj_box_coord_rescale.

Parameters:

coord : list of float, [x, y, w (or x2), h (or y2)] in ratio format, i.e. values in the range [0, 1].

shape : tuple of (height, width, channel (optional))

Examples

>>> x, y, x2, y2 = obj_box_coord_scale_to_pixelunit([0.2, 0.3, 0.5, 0.7], shape=(100, 200, 3))
... (40, 30, 100, 70)
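
The reverse conversion multiplies the horizontal values by the image width and the vertical values by the image height. The sketch below uses a hypothetical helper name and rounds to the nearest integer, which is an assumption; the library may truncate instead.

```python
def ratio_to_pixel(coord, shape):
    """Convert one [x, y, w, h] (or [x, y, x2, y2]) from ratios to pixel units.

    shape is (height, width[, channel]). Rounding to nearest is an assumption.
    """
    imh, imw = shape[0], shape[1]
    x, y, w, h = coord
    return (int(round(x * imw)), int(round(y * imh)),
            int(round(w * imw)), int(round(h * imh)))


pixels = ratio_to_pixel([0.2, 0.3, 0.5, 0.7], shape=(100, 200, 3))
print(pixels)
```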

Coordinate - [x_center, y_center, w, h] to up-left and bottom-right format

tensorlayer.prepro.obj_box_coord_centroid_to_upleft_butright(coord, to_int=False)

Convert one coordinate [x_center, y_center, w, h] to [x1, y1, x2, y2] in up-left and bottom-right format.

Examples

>>> coord = obj_box_coord_centroid_to_upleft_butright([30, 40, 20, 20])
... [20, 30, 40, 50]

Coordinate - up-left and bottom-right format to [x_center, y_center, w, h]

tensorlayer.prepro.obj_box_coord_upleft_butright_to_centroid(coord)

Convert one coordinate [x1, y1, x2, y2] to [x_center, y_center, w, h].
It is the reverse process of obj_box_coord_centroid_to_upleft_butright.

Coordinate - [x_center, y_center, w, h] to up-left and width-height format

tensorlayer.prepro.obj_box_coord_centroid_to_upleft(coord)

Convert one coordinate [x_center, y_center, w, h] to [x, y, w, h].
It is the reverse process of obj_box_coord_upleft_to_centroid.

Coordinate - up-left and width-height format to [x_center, y_center, w, h]

tensorlayer.prepro.obj_box_coord_upleft_to_centroid(coord)

Convert one coordinate [x, y, w, h] to [x_center, y_center, w, h].
It is the reverse process of obj_box_coord_centroid_to_upleft.
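
The four box-format conversions above differ only in whether half the width and height are added or subtracted. The following sketch spells out the arithmetic with hypothetical helper names; it returns floats and does not model the to_int option.

```python
def centroid_to_corners(coord):
    """[x_center, y_center, w, h] -> [x1, y1, x2, y2]."""
    xc, yc, w, h = coord
    return [xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2]


def corners_to_centroid(coord):
    """[x1, y1, x2, y2] -> [x_center, y_center, w, h]."""
    x1, y1, x2, y2 = coord
    w, h = x2 - x1, y2 - y1
    return [x1 + w / 2, y1 + h / 2, w, h]


def centroid_to_upleft(coord):
    """[x_center, y_center, w, h] -> [x, y, w, h] with (x, y) the up-left corner."""
    xc, yc, w, h = coord
    return [xc - w / 2, yc - h / 2, w, h]


def upleft_to_centroid(coord):
    """[x, y, w, h] with (x, y) the up-left corner -> [x_center, y_center, w, h]."""
    x, y, w, h = coord
    return [x + w / 2, y + h / 2, w, h]


corners = centroid_to_corners([30, 40, 20, 20])
print(corners)
```

Each pair of functions is an exact inverse of the other, so converting back and forth returns the original box.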

Darknet format - string to list

tensorlayer.prepro.parse_darknet_ann_str_to_list(annotation)

Takes an annotation string in the format class, x, y, w, h and returns it as a list of lists.

Darknet format - split a list into classes and coordinates

tensorlayer.prepro.parse_darknet_ann_list_to_cls_box(annotation)

Takes a list of [[class, x, y, w, h], ...] and returns two lists: [class, ...] and [[x, y, w, h], ...].
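
The two parsing steps can be pictured with the sketch below. It assumes the common Darknet text layout, one object per line with whitespace-separated class, x, y, w, h; that layout, the helper names and the sample annotation string are assumptions for illustration.

```python
def parse_ann_str(annotation):
    """Parse lines of 'class x y w h' into [[class, x, y, w, h], ...]."""
    rows = []
    for line in annotation.splitlines():
        parts = line.split()
        if len(parts) == 5:  # skip blank or malformed lines
            rows.append([int(parts[0])] + [float(p) for p in parts[1:]])
    return rows


def split_cls_box(rows):
    """Split [[class, x, y, w, h], ...] into ([class, ...], [[x, y, w, h], ...])."""
    classes = [r[0] for r in rows]
    boxes = [r[1:] for r in rows]
    return classes, boxes


ann = "11 0.34 0.52 0.20 0.40\n14 0.60 0.55 0.30 0.70\n"  # made-up sample
classes, boxes = split_cls_box(parse_ann_str(ann))
print(classes, boxes)
```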

Image - flip

tensorlayer.prepro.obj_box_left_right_flip(im, coords=[], is_rescale=False, is_center=False, is_random=False)

Left-right flip the image and coordinates for object detection.

Parameters:

im : numpy array

An image with dimension of [row, col, channel] (default).

coords : list of list for coordinates [[x, y, w, h], [x, y, w, h], ...]

is_rescale : boolean, default False

Set to True, if the input coordinates are rescaled to [0, 1].

is_center : boolean, default False

Set to True, if the x and y of coordinates are the centroid. (i.e. darknet format)

is_random : boolean, default False

If True, randomly flip.

Examples

>>> im = np.zeros([80, 100])    # as an image with shape width=100, height=80
>>> im, coords = obj_box_left_right_flip(im, coords=[[0.2, 0.4, 0.3, 0.3], [0.1, 0.5, 0.2, 0.3]], is_rescale=True, is_center=True, is_random=False)
>>> print(coords)
... [[0.8, 0.4, 0.3, 0.3], [0.9, 0.5, 0.2, 0.3]]
>>> im, coords = obj_box_left_right_flip(im, coords=[[0.2, 0.4, 0.3, 0.3]], is_rescale=True, is_center=False, is_random=False)
>>> print(coords)
... [[0.5, 0.4, 0.3, 0.3]]
>>> im, coords = obj_box_left_right_flip(im, coords=[[20, 40, 30, 30]], is_rescale=False, is_center=True, is_random=False)
>>> print(coords)
... [[80, 40, 30, 30]]
>>> im, coords = obj_box_left_right_flip(im, coords=[[20, 40, 30, 30]], is_rescale=False, is_center=False, is_random=False)
>>> print(coords)
... [[50, 40, 30, 30]]
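
The four examples above follow one rule: the new x is the full width minus the old x, and when x marks the left edge rather than the centre, the box width is subtracted as well. A minimal sketch of the coordinate part only (the image flip itself is omitted, and the helper name is hypothetical):

```python
def flip_boxes_left_right(coords, im_width, is_rescale=False, is_center=False):
    """Mirror the x value of each [x, y, w, h] box about the vertical image axis."""
    total = 1.0 if is_rescale else im_width  # full width in the boxes' own unit
    flipped = []
    for x, y, w, h in coords:
        new_x = total - x if is_center else total - x - w
        flipped.append([new_x, y, w, h])
    return flipped


print(flip_boxes_left_right([[20, 40, 30, 30]], im_width=100, is_center=True))
print(flip_boxes_left_right([[20, 40, 30, 30]], im_width=100, is_center=False))
```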

Image - resize

tensorlayer.prepro.obj_box_imresize(im, coords=[], size=[100, 100], interp='bicubic', mode=None, is_rescale=False)

Resize an image, and compute the new bounding box coordinates.

Parameters:

im : numpy array

An image with dimension of [row, col, channel] (default).

coords : list of list for coordinates [[x, y, w, h], [x, y, w, h], ...]

size, interp, mode : see tl.prepro.imresize for details.

is_rescale : boolean, default False

Set to True if the input coordinates are rescaled to [0, 1]; in that case the coordinates are returned unchanged.

Examples

>>> im = np.zeros([80, 100, 3])    # as an image with shape width=100, height=80
>>> _, coords = obj_box_imresize(im, coords=[[20, 40, 30, 30], [10, 20, 20, 20]], size=[160, 200], is_rescale=False)
>>> print(coords)
... [[40, 80, 60, 60], [20, 40, 40, 40]]
>>> _, coords = obj_box_imresize(im, coords=[[20, 40, 30, 30]], size=[40, 100], is_rescale=False)
>>> print(coords)
... [[20, 20, 30, 15]]
>>> _, coords = obj_box_imresize(im, coords=[[20, 40, 30, 30]], size=[60, 150], is_rescale=False)
>>> print(coords)
... [[30, 30, 45, 22]]
>>> im2, coords = obj_box_imresize(im, coords=[[0.2, 0.4, 0.3, 0.3]], size=[160, 200], is_rescale=True)
>>> print(coords, im2.shape)
... [[0.2, 0.4, 0.3, 0.3]] (160, 200, 3)
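
The coordinate update in these examples amounts to scaling x and w by new_width / old_width and y and h by new_height / old_height, while ratio coordinates stay as they are. The sketch below covers only the boxes, not the image resize; the helper name, the im_shape parameter and the truncation to int are assumptions chosen to reproduce the numbers above.

```python
def resize_boxes(coords, im_shape, size, is_rescale=False):
    """Scale [[x, y, w, h], ...] for an image resized from im_shape to size.

    im_shape is (height, width[, channel]); size is [new_height, new_width].
    """
    if is_rescale:
        return [list(c) for c in coords]  # ratios do not depend on the image size
    imh, imw = im_shape[0], im_shape[1]
    new_h, new_w = size
    return [[int(x * new_w / imw), int(y * new_h / imh),
             int(w * new_w / imw), int(h * new_h / imh)]
            for x, y, w, h in coords]


print(resize_boxes([[20, 40, 30, 30]], im_shape=(80, 100, 3), size=[60, 150]))
```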

Image - crop

tensorlayer.prepro.obj_box_crop(im, classes=[], coords=[], wrg=100, hrg=100, is_rescale=False, is_center=False, is_random=False, thresh_wh=0.02, thresh_wh2=12.0)

Randomly or centrally crop an image, and compute the new bounding box coordinates.
Objects outside the cropped image will be removed.

Parameters:

im : numpy array

An image with dimension of [row, col, channel] (default).

classes : list of class ID (int).

coords : list of list for coordinates [[x, y, w, h], [x, y, w, h], ...]

wrg, hrg, is_random : see tl.prepro.crop for details.

is_rescale : boolean, default False

Set to True, if the input coordinates are rescaled to [0, 1].

is_center : boolean, default False

Set to True, if the x and y of coordinates are the centroid. (i.e. darknet format)

thresh_wh : float

Threshold; a box is removed if the ratio of its width (or height) to the image size is less than this value.

thresh_wh2 : float

Threshold; a box is removed if its width-to-height ratio, or vice versa, is higher than this value.
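
The two thresholds can be read as a filter over the boxes that remain after the transform. The sketch below is a hypothetical helper, not the library code; it assumes the boxes are already in ratio units and applies both tests, dropping a box together with its class label.

```python
def filter_boxes(classes, coords, thresh_wh=0.02, thresh_wh2=12.0):
    """Drop boxes (in ratio units) that are too small or too elongated."""
    kept_classes, kept_coords = [], []
    for cls, (x, y, w, h) in zip(classes, coords):
        if w <= 0 or h <= 0 or w < thresh_wh or h < thresh_wh:
            continue  # too small relative to the image
        if w / h > thresh_wh2 or h / w > thresh_wh2:
            continue  # aspect ratio too extreme
        kept_classes.append(cls)
        kept_coords.append([x, y, w, h])
    return kept_classes, kept_coords


classes_in = [1, 2, 3]
coords_in = [[0.5, 0.5, 0.2, 0.3],    # kept
             [0.5, 0.5, 0.01, 0.3],   # width below thresh_wh
             [0.5, 0.5, 0.9, 0.05]]   # width / height = 18 > thresh_wh2
kept = filter_boxes(classes_in, coords_in)
print(kept)
```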

Image - shift

tensorlayer.prepro.obj_box_shift(im, classes=[], coords=[], wrg=0.1, hrg=0.1, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1, is_rescale=False, is_center=False, is_random=False, thresh_wh=0.02, thresh_wh2=12.0)

Shift an image randomly or non-randomly, and compute the new bounding box coordinates.
Objects shifted outside the image will be removed.

Parameters:

im : numpy array

An image with dimension of [row, col, channel] (default).

classes : list of class ID (int).

coords : list of list for coordinates [[x, y, w, h], [x, y, w, h], ...]

wrg, hrg, row_index, col_index, channel_index, is_random, fill_mode, cval, order : see tl.prepro.shift.

is_rescale : boolean, default False

Set to True, if the input coordinates are rescaled to [0, 1].

is_center : boolean, default False

Set to True, if the x and y of coordinates are the centroid. (i.e. darknet format)

thresh_wh : float

Threshold; a box is removed if the ratio of its width (or height) to the image size is less than this value.

thresh_wh2 : float

Threshold; a box is removed if its width-to-height ratio, or vice versa, is higher than this value.

Image - zoom

tensorlayer.prepro.obj_box_zoom(im, classes=[], coords=[], zoom_range=(0.9, 1.1), row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1, is_rescale=False, is_center=False, is_random=False, thresh_wh=0.02, thresh_wh2=12.0)

Zoom in and out of a single image, randomly or non-randomly, and compute the new bounding box coordinates.
Objects outside the zoomed image will be removed.

Parameters:

im : numpy array

An image with dimension of [row, col, channel] (default).

classes : list of class ID (int).

coords : list of list for coordinates [[x, y, w, h], [x, y, w, h], ...]

zoom_range, row_index, col_index, channel_index, is_random, fill_mode, cval, order : see tl.prepro.zoom.

is_rescale : boolean, default False

Set to True, if the input coordinates are rescaled to [0, 1].

is_center : boolean, default False

Set to True, if the x and y of coordinates are the centroid. (i.e. darknet format)

thresh_wh : float

Threshold; a box is removed if the ratio of its width (or height) to the image size is less than this value.

thresh_wh2 : float

Threshold; a box is removed if its width-to-height ratio, or vice versa, is higher than this value.

Sequence

For more related functions, see tensorlayer.nlp

Padding

tensorlayer.prepro.pad_sequences(sequences, maxlen=None, dtype='int32', padding='post', truncating='pre', value=0.0)

Pads each sequence to the same length: the length of the longest sequence.
If maxlen is provided, any sequence longer than maxlen is truncated to maxlen.
Truncation happens at either the beginning (default) or the end of the sequence.
Supports post-padding (default) and pre-padding.

Parameters:

sequences : list of lists where each element is a sequence

maxlen : int, maximum length

dtype : type to cast the resulting sequence.

padding : 'pre' or 'post', pad either before or after each sequence.

truncating : 'pre' or 'post', remove values from sequences longer than maxlen, either at the beginning or at the end of the sequence.

value : float, the value used for padding.

Returns:

x : numpy array with dimensions (number_of_sequences, maxlen)

Examples

>>> sequences = [[1,1,1,1,1],[2,2,2],[3,3]]
>>> sequences = pad_sequences(sequences, maxlen=None, dtype='int32',
...                  padding='post', truncating='pre', value=0.)
... [[1 1 1 1 1]
...  [2 2 2 0 0]
...  [3 3 0 0 0]]
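
The interplay of maxlen, padding and truncating can be sketched in plain Python as follows. This hypothetical helper returns a list of lists instead of a numpy array and ignores dtype; it is meant only to illustrate the rules described above.

```python
def pad(sequences, maxlen=None, padding='post', truncating='pre', value=0):
    """Pad or truncate every sequence to the same length."""
    if maxlen is None:
        maxlen = max(len(s) for s in sequences)
    out = []
    for s in sequences:
        s = list(s)
        if len(s) > maxlen:
            # 'pre' drops items from the front, 'post' drops them from the back
            s = s[len(s) - maxlen:] if truncating == 'pre' else s[:maxlen]
        fill = [value] * (maxlen - len(s))
        out.append(s + fill if padding == 'post' else fill + s)
    return out


padded = pad([[1, 1, 1, 1, 1], [2, 2, 2], [3, 3]])
for row in padded:
    print(row)
```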

Remove Padding

tensorlayer.prepro.remove_pad_sequences(sequences, pad_id=0)

Remove padding.

Parameters:

sequences : list of list.

pad_id : int.

Examples

>>> sequences = [[2,3,4,0,0], [5,1,2,3,4,0,0,0], [4,5,0,2,4,0,0,0]]
>>> print(remove_pad_sequences(sequences, pad_id=0))
... [[2, 3, 4], [5, 1, 2, 3, 4], [4, 5, 0, 2, 4]]
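
Note that in the example only trailing padding is removed; the 0 in the middle of the third sequence is kept. A minimal sketch of that behaviour, with a hypothetical helper name:

```python
def strip_trailing_pad(sequences, pad_id=0):
    """Remove pad_id values from the end of each sequence only."""
    out = []
    for seq in sequences:
        end = len(seq)
        while end > 0 and seq[end - 1] == pad_id:
            end -= 1
        out.append(list(seq[:end]))
    return out


stripped = strip_trailing_pad([[2, 3, 4, 0, 0], [5, 1, 2, 3, 4, 0, 0, 0], [4, 5, 0, 2, 4, 0, 0, 0]])
print(stripped)
```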

Process

tensorlayer.prepro.process_sequences(sequences, end_id=0, pad_val=0, is_shorten=True, remain_end_id=False)

Set all tokens (ids) after the END token to the padding value, and then optionally shorten the sequences to the maximum sequence length in this batch.

Parameters:

sequences : numpy array or list of list with token IDs.

e.g. [[4,3,5,3,2,2,2,2], [5,3,9,4,9,2,2,3]]

end_id : int, the special token for END.

pad_val : int, the value that replaces the end_id and the ids after end_id.

is_shorten : boolean, default True.

Shorten the sequences.

remain_end_id : boolean, default False.

Keep an end_id in the end.

Examples

>>> sentences_ids = [[4, 3, 5, 3, 2, 2, 2, 2],  # <-- end_id is 2
...                  [5, 3, 9, 4, 9, 2, 2, 3]]  # <-- end_id is 2
>>> sentences_ids = process_sequences(sentences_ids, end_id=vocab.end_id, pad_val=0, is_shorten=True)
... [[4, 3, 5, 3, 0], [5, 3, 9, 4, 9]]
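
The example can be traced in two steps: everything from the first END token onward becomes pad_val, then all sequences are cut to the longest remaining length. The sketch below is a hypothetical, non-mutating helper that omits the remain_end_id option; treating a sequence without an END token as full length is an assumption.

```python
def mask_after_end(sequences, end_id=0, pad_val=0, is_shorten=True):
    """Replace the first end_id and everything after it with pad_val."""
    out, max_len = [], 0
    for seq in sequences:
        seq = list(seq)
        cut = seq.index(end_id) if end_id in seq else len(seq)
        max_len = max(max_len, cut)
        out.append(seq[:cut] + [pad_val] * (len(seq) - cut))
    if is_shorten:
        out = [s[:max_len] for s in out]  # trim to the longest useful length
    return out


result = mask_after_end([[4, 3, 5, 3, 2, 2, 2, 2],
                         [5, 3, 9, 4, 9, 2, 2, 3]], end_id=2, pad_val=0)
print(result)
```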

Add Start ID

tensorlayer.prepro.sequences_add_start_id(sequences, start_id=0, remove_last=False)

Add a special start token (id) at the beginning of each sequence.

Examples

>>> sentences_ids = [[4,3,5,3,2,2,2,2], [5,3,9,4,9,2,2,3]]
>>> sequences_add_start_id(sentences_ids, start_id=2)
... [[2, 4, 3, 5, 3, 2, 2, 2, 2], [2, 5, 3, 9, 4, 9, 2, 2, 3]]
>>> sequences_add_start_id(sentences_ids, start_id=2, remove_last=True)
... [[2, 4, 3, 5, 3, 2, 2, 2], [2, 5, 3, 9, 4, 9, 2, 2]]
  • For Seq2seq
>>> input = [a, b, c]
>>> target = [x, y, z]
>>> decode_seq = [start_id, a, b]  # <-- sequences_add_start_id(input, start_id, True)

Add End ID

tensorlayer.prepro.sequences_add_end_id(sequences, end_id=888)

Add a special end token (id) at the end of each sequence.

Parameters:

sequences : list of list.

end_id : int.

Examples

>>> sequences = [[1,2,3],[4,5,6,7]]
>>> print(sequences_add_end_id(sequences, end_id=999))
... [[1, 2, 3, 999], [4, 5, 6, 7, 999]]
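
Adding start and end tokens is plain list concatenation. The sketch below uses hypothetical helper names and returns new lists rather than modifying the input; with remove_last=True the last token is dropped so the sequence length stays the same.

```python
def add_start_id(sequences, start_id=0, remove_last=False):
    """Prepend start_id; optionally drop the last token to keep the length."""
    return [[start_id] + (list(s[:-1]) if remove_last else list(s))
            for s in sequences]


def add_end_id(sequences, end_id=888):
    """Append end_id to every sequence without dropping any token."""
    return [list(s) + [end_id] for s in sequences]


with_start = add_start_id([[4, 3, 5]], start_id=2)
shifted = add_start_id([[4, 3, 5]], start_id=2, remove_last=True)
with_end = add_end_id([[1, 2, 3], [4, 5, 6, 7]], end_id=999)
print(with_start, shifted, with_end)
```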

Add End ID after pad

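tensorlayer.prepro.sequences_add_end_id_after_pad(sequences, end_id=888, pad_id=0)

Add a special end token (id) right after the last non-padding token of each sequence, i.e. replace the first pad_id with end_id. Sequences without padding are left unchanged.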

Parameters:

sequences : list of list.

end_id : int.

pad_id : int.

Examples

>>> sequences = [[1,2,0,0], [1,2,3,0], [1,2,3,4]]
>>> print(sequences_add_end_id_after_pad(sequences, end_id=99, pad_id=0))
... [[1, 2, 99, 0], [1, 2, 3, 99], [1, 2, 3, 4]]
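
The example output suggests that the first padding slot of each sequence is overwritten by the end token. A minimal sketch under that assumption, with a hypothetical helper name; it would also overwrite a pad_id that appears in the middle of a sequence.

```python
def add_end_id_after_pad(sequences, end_id=888, pad_id=0):
    """Overwrite the first pad_id in each sequence with end_id."""
    out = []
    for seq in sequences:
        seq = list(seq)
        if pad_id in seq:
            seq[seq.index(pad_id)] = end_id
        out.append(seq)  # sequences without padding are returned unchanged
    return out


ended = add_end_id_after_pad([[1, 2, 0, 0], [1, 2, 3, 0], [1, 2, 3, 4]], end_id=99, pad_id=0)
print(ended)
```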

Get Mask

tensorlayer.prepro.sequences_get_mask(sequences, pad_val=0)

Return a mask for the sequences: 1 for tokens, 0 for trailing padding.

Examples

>>> sentences_ids = [[4, 0, 5, 3, 0, 0],
...                  [5, 3, 9, 4, 9, 0]]
>>> mask = sequences_get_mask(sentences_ids, pad_val=0)
... [[1 1 1 1 0 0]
...  [1 1 1 1 1 0]]
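
In the example the 0 at index 1 of the first sequence is still masked as 1, so only trailing padding counts as padding. The sketch below reproduces that with a hypothetical helper and returns nested lists instead of a numpy array.

```python
def get_mask(sequences, pad_val=0):
    """Return 1 for every position up to the last non-pad token, 0 afterwards."""
    mask = []
    for seq in sequences:
        end = len(seq)
        while end > 0 and seq[end - 1] == pad_val:
            end -= 1
        mask.append([1] * end + [0] * (len(seq) - end))
    return mask


mask = get_mask([[4, 0, 5, 3, 0, 0],
                 [5, 3, 9, 4, 9, 0]], pad_val=0)
for row in mask:
    print(row)
```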

Tensor Opt

Note

These functions will be deprecated. For how to use Tensor operators, please refer to tutorial_cifar10_tfrecord.py
