API - Layers
Table of Layers
Augmentation 2D
AugMix | Performs the AugMix data augmentation technique.
RandAugment | RandAugment performs the Rand Augment operation on input images.
TrivialAugmentWide | TrivialAugmentWide performs the Wide version of Trivial Augment operation on input images.
RandomAffine | Applies a random affine transformation to the images, keeping the center invariant.
RandomCrop | Randomly crops the input images.
RandomCropAndResize | Randomly crops a part of an image and resizes it to the provided size.
RandomFlip | Randomly flips the input images.
RandomResize | Randomly resizes the images in a batch manner.
RandomRotate | Randomly rotates the input images.
RandomZoomAndCrop | RandomZoomAndCrop implements resize with scale distortion.
ChannelShuffle | Shuffles channels of the input images.
RandomBlur | Randomly blurs the images using random-sized kernels.
RandomChannelShift | Randomly shifts values for each channel of the input images.
RandomCLAHE | Randomly applies Contrast Limited Adaptive Histogram Equalization to the input images.
RandomColorJitter | Randomly applies brightness, contrast, saturation and hue adjustments sequentially to the input images.
RandomGamma | Randomly adjusts gamma of the input images.
RandomGaussianBlur | Applies a Gaussian Blur with random strength to an image.
RandomHSV | Randomly adjusts the hue, saturation and value on given images.
RandomJpegQuality | Randomly applies JPEG compression artifacts to the input images.
RandomPosterize | Randomly reduces the number of bits for each color channel.
RandomSharpness | Randomly performs the sharpness operation on given images.
RandomSolarize | Randomly applies (max_value - pixel + min_value) for each pixel in the input images.
CutMix | CutMix implements the CutMix data augmentation technique.
MixUp | The MixUp data augmentation technique.
The Mosaic data augmentation technique used by YOLO series.
Randomly drops channels of the input images.
Randomly cuts out rectangles from images and fills them.
Randomly erases rectangles from images and fills them.
RandomGridMask performs the Grid Mask operation on input images.
Randomly applies an augmentation or a list of augmentations with a given probability.
RandomChoice constructs a pipeline based on provided arguments.
RepeatedAugment augments each image in a batch multiple times.
Preprocessing 2D
Center crops the images.
Pads the images if needed.
Resizes the images.
Performs the AutoContrast operation on the input images.
Performs histogram equalization on a channel-wise basis.
Grayscale transforms RGB images to grayscale images.
Inverts the inputs.
Normalizes the given images using mean and std.
Rescales the inputs to a new range.
Applies nothing to the inputs.
Removes degenerate/invalid bounding boxes.
Base 2D
Abstract base layer for vectorized image augmentation.
Augmentation 2D
Auto
- class keras_aug.layers.AugMix(value_range, severity=[0.01, 0.3], num_chains=3, chain_depth=[1, 3], alpha=1.0, seed=None, **kwargs)
Performs the AugMix data augmentation technique.
AugMix aims to produce images with variety while preserving the image semantics and local statistics. During the augmentation process, each image is augmented num_chains different ways, each way consisting of chain_depth augmentations. Augmentations are sampled from the list: [translation, shearing, rotation, posterization, histogram equalization, solarization and auto contrast]. The results of each chain are then mixed together with the original image based on random samples from a Dirichlet distribution.
- Parameters:
value_range (Sequence[float]) - The range of values the incoming images will have. This is typically either [0, 1] or [0, 255] depending on how your preprocessing pipeline is set up.
severity (float|(float, float)|keras_aug.FactorSampler, optional) - The range of the strength of augmentations. When represented as a single float, the factor will be picked between [0.01, upper]. Defaults to [0.01, 0.3].
num_chains (int, optional) - The number of different chains to be mixed. Defaults to 3.
chain_depth (int, Sequence[int], optional) - The range of the number of transformations in the chains. Defaults to [1, 3].
alpha (float, optional) - The probability coefficients for the Beta and Dirichlet distributions. Defaults to 1.0.
seed (int|float, optional) - The random seed. Defaults to None.
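The chain-and-mix procedure described for AugMix can be sketched in plain Python. This is only an illustration under stated assumptions, not the keras_aug implementation: the image is a flat list of values in [0, 1], the function names (dirichlet_weights, augmix_sketch) are hypothetical, and the caller supplies its own list of range-preserving operations.

```python
import random


def dirichlet_weights(alpha, count, rng):
    """Sample `count` weights that sum to 1 from a symmetric Dirichlet(alpha)."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(count)]
    total = sum(draws)
    return [d / total for d in draws]


def augmix_sketch(pixels, ops, num_chains=3, chain_depth=(1, 3), alpha=1.0, seed=None):
    """Mix `num_chains` augmented copies of `pixels`, then blend with the original.

    `ops` is a list of callables mapping a list of pixels to a new list
    (hypothetical stand-ins for translation, shearing, posterization, ...).
    """
    rng = random.Random(seed)
    chain_weights = dirichlet_weights(alpha, num_chains, rng)
    mixed = [0.0] * len(pixels)
    for weight in chain_weights:
        augmented = list(pixels)
        # Each chain applies a random number of randomly chosen operations.
        for _ in range(rng.randint(chain_depth[0], chain_depth[1])):
            augmented = rng.choice(ops)(augmented)
        mixed = [m + weight * a for m, a in zip(mixed, augmented)]
    # Blend the mixture with the original image using a Beta(alpha, alpha) sample.
    keep = rng.betavariate(alpha, alpha)
    return [keep * p + (1.0 - keep) * m for p, m in zip(pixels, mixed)]


if __name__ == "__main__":
    invert = lambda px: [1.0 - p for p in px]
    darken = lambda px: [0.5 * p for p in px]
    print(augmix_sketch([0.0, 0.25, 1.0], [invert, darken], seed=0))
```

Because every step is a convex combination, the output stays inside the input value range as long as each operation does.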
- class keras_aug.layers.RandAugment(value_range, augmentations_per_image=2, magnitude=10, magnitude_stddev=0, translation_multiplier=150.0 / 331.0, use_geometry=True, interpolation='nearest', fill_mode='reflect', fill_value=0, exclude_ops=None, bounding_box_format=None, seed=None, **kwargs)
RandAugment performs the Rand Augment operation on input images.
RandAugment can be thought of as an all-in-one image augmentation layer. The policy implemented by RandAugment has been benchmarked extensively and is effective on a wide variety of datasets.
The input images are converted to the range [0, 255], augmented by RandAugment, and then converted back to the original value range.
For object detection tasks, you should set fill_mode="constant" and fill_value=128 to avoid artifacts. Moreover, you can set use_geometry=False to turn off all geometric augmentations if the distortion of the bounding boxes is too large.
- Parameters:
value_range (Sequence[int|float]) - The range of values the incoming images will have. This is typically either [0, 1] or [0, 255] depending on how your preprocessing pipeline is set up.
augmentations_per_image (int, optional) - The number of layers to use in the rand augment policy. Defaults to 2.
magnitude (float, optional) - The shared magnitude across all augmentation operations. Represented as M in the paper. The best values are usually in the range [5, 10]. Defaults to 10.
magnitude_stddev (float, optional) - The randomness of the severity as proposed by the authors of the timm library. Defaults to 0. When enabled, Gaussian noise with magnitude_stddev as sigma will be added to magnitude.
translation_multiplier (float, optional) - The multiplier for applying translation. Defaults to 150.0 / 331.0, which is the value for an ImageNet classification model. For CIFAR, it is set to 10.0 / 32.0. The best value is usually in the range [1.0 / 3.0, 1.0 / 2.0].
use_geometry (bool, optional) - Whether to include geometric augmentations. This should be set to False when performing object detection. Defaults to True.
interpolation (str, optional) - The interpolation mode. Supported values: "nearest", "bilinear". Defaults to "nearest".
fill_mode (str, optional) - The fill mode. Supported values: "constant", "reflect", "wrap", "nearest". Defaults to "reflect".
fill_value (int|float, optional) - The value to be filled outside the boundaries when fill_mode="constant". Defaults to 0.
exclude_ops (list(str), optional) - Exclude selected operations. Defaults to None.
bounding_box_format (str, optional) - The format of bounding boxes of input dataset. Refer to https://github.com/james77777778/keras-aug/blob/main/keras_aug/datapoints/bounding_box/converter.py for more details on supported bounding box formats.
seed (int|float, optional) - The random seed. Defaults to None.
- class keras_aug.layers.TrivialAugmentWide(value_range, use_geometry=True, interpolation='nearest', fill_mode='reflect', fill_value=0, exclude_ops=None, bounding_box_format=None, seed=None, **kwargs)
TrivialAugmentWide performs the Wide version of Trivial Augment operation on input images.
TrivialAugmentWide can be thought of as an all-in-one image augmentation layer. The policy implemented by TrivialAugmentWide has been benchmarked extensively and is effective on a wide variety of datasets.
The input images are converted to the range [0, 255], augmented by TrivialAugment, and then converted back to the original value range.
For object detection tasks, you should set fill_mode="constant" and fill_value=128 to avoid artifacts. Moreover, you can set use_geometry=False to turn off all geometric augmentations if the distortion of the bounding boxes is too large.
- Parameters:
value_range (Sequence[int|float]) - The range of values the incoming images will have. This is typically either [0, 1] or [0, 255] depending on how your preprocessing pipeline is set up.
use_geometry (bool, optional) - Whether to include geometric augmentations. This should be set to False when performing object detection. Defaults to True.
interpolation (str, optional) - The interpolation mode. Supported values: "nearest", "bilinear". Defaults to "nearest".
fill_mode (str, optional) - The fill mode. Supported values: "constant", "reflect", "wrap", "nearest". Defaults to "reflect".
fill_value (int|float, optional) - The value to be filled outside the boundaries when fill_mode="constant". Defaults to 0.
exclude_ops (list(str), optional) - Exclude selected operations. Defaults to None.
bounding_box_format (str, optional) - The format of bounding boxes of input dataset. Refer to https://github.com/james77777778/keras-aug/blob/main/keras_aug/datapoints/bounding_box/converter.py for more details on supported bounding box formats.
seed (int|float, optional) - The random seed. Defaults to None.
Geometry
- class keras_aug.layers.RandomAffine(rotation_factor=None, translation_height_factor=None, translation_width_factor=None, zoom_height_factor=None, zoom_width_factor=None, shear_height_factor=None, shear_width_factor=None, same_zoom_factor=False, interpolation='bilinear', fill_mode='constant', fill_value=0, bounding_box_format=None, bounding_box_min_area_ratio=None, bounding_box_max_aspect_ratio=None, seed=None, **kwargs)
Applies a random affine transformation to the images, keeping the center invariant.
The transformation combines random rotation, translation, zoom and shear. RandomAffine applies them through a single combined transformation matrix, so it is fast.
- Parameters:
rotation_factor (float|Sequence[float]|keras_aug.FactorSampler) - The range of the degree for random rotation. When represented as a single float, the factor will be picked between [0.0 - lower, 0.0 + upper]. A positive value means rotating counterclockwise, while a negative value means clockwise. Defaults to None.
translation_height_factor (float|Sequence[float]|keras_aug.FactorSampler) - The range for random vertical translation. When represented as a single float, the factor will be picked between [0.0 - lower, 0.0 + upper]. A negative value means shifting the image up, while a positive value means shifting the image down. Defaults to None.
translation_width_factor (float|Sequence[float]|keras_aug.FactorSampler) - The range for random horizontal translation. When represented as a single float, the factor will be picked between [0.0 - lower, 0.0 + upper]. A negative value means shifting the image left, while a positive value means shifting the image right. Defaults to None.
zoom_height_factor (float|Sequence[float]|keras_aug.FactorSampler) - The range for random vertical zoom. When represented as a single float, the factor will be picked between [1.0 - lower, 1.0 + upper]. A negative value means zooming in while a positive value means zooming out. Defaults to None.
zoom_width_factor (float|Sequence[float]|keras_aug.FactorSampler) - The range for random horizontal zoom. When represented as a single float, the factor will be picked between [1.0 - lower, 1.0 + upper]. A negative value means zooming in while a positive value means zooming out. Defaults to None.
shear_height_factor (float|Sequence[float]|keras_aug.FactorSampler) - The range for random vertical shear. When represented as a single float, the factor will be picked between [0.0 - lower, 0.0 + upper]. Defaults to None.
shear_width_factor (float|Sequence[float]|keras_aug.FactorSampler) - The range for random horizontal shear. When represented as a single float, the factor will be picked between [0.0 - lower, 0.0 + upper]. Defaults to None.
same_zoom_factor (bool, optional) - If True, the zoom factor sampled from zoom_height_factor will be applied to both height and width. This is useful for keeping the aspect ratio. Defaults to False.
interpolation (str, optional) - The interpolation mode. Supported values: "nearest", "bilinear". Defaults to "bilinear".
fill_mode (str, optional) - The fill mode. Supported values: "constant", "reflect", "wrap", "nearest". Defaults to "constant".
fill_value (int|float, optional) - The value to be filled outside the boundaries when fill_mode="constant". Defaults to 0.
bounding_box_format (str, optional) - The format of bounding boxes of input dataset. Refer to https://github.com/james77777778/keras-aug/blob/main/keras_aug/datapoints/bounding_box/converter.py for more details on supported bounding box formats.
bounding_box_min_area_ratio (float, optional) - The threshold to apply sanitize_bounding_boxes. Defaults to None.
bounding_box_max_aspect_ratio (float, optional) - The threshold to apply sanitize_bounding_boxes. Defaults to None.
seed (int|float, optional) - The random seed. Defaults to None.
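To illustrate what "a combined transformation matrix" that keeps the center invariant can look like, here is a minimal sketch in plain Python. It is not the keras_aug code: the function names, the pixel units for translation, the composition order (zoom, shear, rotate) and the sign conventions are all assumptions chosen for the example.

```python
import math


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def affine_matrix_sketch(height, width, angle_deg=0.0, tx=0.0, ty=0.0,
                         zoom_x=1.0, zoom_y=1.0, shear_x=0.0, shear_y=0.0):
    """Compose zoom, shear, rotation and translation into one 3x3 matrix.

    The linear part is applied around the image center, so with
    tx == ty == 0 the center pixel maps to itself.
    """
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    theta = math.radians(angle_deg)
    to_origin = [[1.0, 0.0, -cx], [0.0, 1.0, -cy], [0.0, 0.0, 1.0]]
    zoom = [[zoom_x, 0.0, 0.0], [0.0, zoom_y, 0.0], [0.0, 0.0, 1.0]]
    shear = [[1.0, shear_x, 0.0], [shear_y, 1.0, 0.0], [0.0, 0.0, 1.0]]
    rotate = [[math.cos(theta), -math.sin(theta), 0.0],
              [math.sin(theta), math.cos(theta), 0.0],
              [0.0, 0.0, 1.0]]
    from_origin = [[1.0, 0.0, cx + tx], [0.0, 1.0, cy + ty], [0.0, 0.0, 1.0]]
    matrix = to_origin
    for step in (zoom, shear, rotate, from_origin):
        matrix = matmul3(step, matrix)  # later steps multiply on the left
    return matrix


def apply_matrix(matrix, x, y):
    """Map the point (x, y) through the 3x3 affine matrix."""
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])


if __name__ == "__main__":
    m = affine_matrix_sketch(5, 5, angle_deg=30.0, zoom_x=1.2, shear_x=0.1)
    print(apply_matrix(m, 2.0, 2.0))  # the center of a 5x5 image
```

Building one matrix means each pixel is resampled once, rather than once per operation, which is where the speed benefit comes from.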
- class keras_aug.layers.RandomCrop(height, width, interpolation='bilinear', bounding_box_format=None, bounding_box_min_area_ratio=None, bounding_box_max_aspect_ratio=None, seed=None, **kwargs)
Randomly crops the input images.
This layer will randomly choose a location to crop images down to (height, width). If an input image is smaller than (height, width), the input will be resized and cropped to return the largest possible window in the image that matches the target aspect ratio.
- Parameters:
height (int) - The height of the result image.
width (int) - The width of the result image.
interpolation (str, optional) - The interpolation mode. Supported values: "nearest", "bilinear". Defaults to "bilinear".
bounding_box_format (str, optional) - The format of bounding boxes of input dataset. Refer to https://github.com/james77777778/keras-aug/blob/main/keras_aug/datapoints/bounding_box/converter.py for more details on supported bounding box formats.
bounding_box_min_area_ratio (float, optional) - The threshold to apply sanitize_bounding_boxes. Defaults to None.
bounding_box_max_aspect_ratio (float, optional) - The threshold to apply sanitize_bounding_boxes. Defaults to None.
seed (int|float, optional) - The random seed. Defaults to None.
- class keras_aug.layers.RandomCropAndResize(height, width, crop_area_factor, aspect_ratio_factor, interpolation='bilinear', bounding_box_format=None, seed=None, **kwargs)
Randomly crops a part of an image and resizes it to the provided size.
This implementation takes an intuitive approach, where we crop the images to a random height and width, and then resize them. To do this, we first sample a random value for area using crop_area_factor and a value for aspect ratio using aspect_ratio_factor. We then get the new height and width by dividing and multiplying the old height and width by the random area respectively. Next, we sample offsets for height and width and clip them such that the cropped area does not exceed image boundaries. Finally, we do the actual cropping operation and resize the image to (height, width).
- Parameters:
height (int) - The height of the result image.
width (int) - The width of the result image.
crop_area_factor (float|Sequence[float]|keras_aug.FactorSampler) - The range of the ratio of the cropped area to the area of the original image. For self-supervised pretraining a common value for this parameter is (0.08, 1.0). For fine-tuning and classification a common value is (0.8, 1.0).
aspect_ratio_factor (float|Sequence[float]|keras_aug.FactorSampler) - The ratio of width to height of the cropped image. When represented as a single float, the factor will be picked between [1.0 - factor, 1.0]. For most tasks, this should be (3/4, 4/3).
interpolation (str, optional) - The interpolation mode. Supported values: "nearest", "bilinear". Defaults to "bilinear".
bounding_box_format (str, optional) - The format of bounding boxes of input dataset. Refer to https://github.com/james77777778/keras-aug/blob/main/keras_aug/datapoints/bounding_box/converter.py for more details on supported bounding box formats.
seed (int|float, optional) - The random seed. Defaults to None.
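The sampling steps described for RandomCropAndResize can be sketched as follows. This is a hedged illustration, not the library's code: the helper name sample_crop_box is hypothetical, and the square-root formula (which makes the crop cover the sampled fraction of the image area, with the aspect ratio applied relative to the original image) is one common way to realize the description and is an assumption here.

```python
import math
import random


def sample_crop_box(height, width, crop_area_factor=(0.08, 1.0),
                    aspect_ratio_factor=(3 / 4, 4 / 3), seed=None):
    """Return (top, left, crop_height, crop_width) for one random crop."""
    rng = random.Random(seed)
    area = rng.uniform(*crop_area_factor)        # fraction of the image area
    aspect = rng.uniform(*aspect_ratio_factor)   # width / height distortion
    # Divide the height and multiply the width by sqrt(aspect) so that
    # crop_height * crop_width is roughly area * height * width.
    crop_h = min(height, max(1, round(height * math.sqrt(area / aspect))))
    crop_w = min(width, max(1, round(width * math.sqrt(area * aspect))))
    # Offsets are limited so the crop never leaves the image.
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)
    return top, left, crop_h, crop_w


if __name__ == "__main__":
    print(sample_crop_box(224, 224, seed=0))
```

The returned box would then be cropped out and resized to the target (height, width) with the chosen interpolation.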
- class keras_aug.layers.RandomFlip(mode='horizontal', bounding_box_format=None, seed=None, **kwargs)
Randomly flips the input images.
This layer will flip the images horizontally and/or vertically based on the mode attribute.
- Parameters:
mode (str, optional) - The flip mode to use. Supported values: "horizontal", "vertical", "horizontal_and_vertical". Defaults to "horizontal". "horizontal" is a left-right flip and "vertical" is a top-bottom flip.
rate (float, optional) - The frequency of flipping. 1.0 indicates that images are always flipped. 0.0 indicates no flipping. Defaults to 0.5.
bounding_box_format (str, optional) - The format of bounding boxes of input dataset. Refer to https://github.com/james77777778/keras-aug/blob/main/keras_aug/datapoints/bounding_box/converter.py for more details on supported bounding box formats.
seed (int|float, optional) - The random seed. Defaults to None.
- class keras_aug.layers.RandomResize(heights, widths=None, interpolation='bilinear', antialias=False, bounding_box_format=None, seed=None, **kwargs)
Randomly resizes the images in a batch manner.
This layer is useful for multi-scale training.
Notes
The aspect ratio might be different from the original images.
- Parameters:
heights (list(int)) - The heights to be sampled for the result image.
widths (list(int), optional) - The widths to be sampled for the result image. Defaults to None. If None, widths will be the same as heights.
interpolation (str, optional) - The interpolation mode. Supported values: "nearest", "bilinear". Defaults to "bilinear".
antialias (bool, optional) - Whether to use antialias. Defaults to False.
bounding_box_format (str, optional) - The format of bounding boxes of input dataset. Refer to https://github.com/james77777778/keras-aug/blob/main/keras_aug/datapoints/bounding_box/converter.py for more details on supported bounding box formats.
seed (int|float, optional) - The random seed. Defaults to None.
- class keras_aug.layers.RandomRotate(factor, interpolation='bilinear', fill_mode='constant', fill_value=0, bounding_box_format=None, seed=None, **kwargs)
Randomly rotates the input images.
The unit of the factor is degree. A positive value means rotating counterclockwise, while a negative value means clockwise.
- Parameters:
factor (float|Sequence[float]|keras_aug.FactorSampler) - The range of the degree for random rotation. When represented as a single float, the factor will be picked between [0.0 - lower, 0.0 + upper]. A positive value means rotating counterclockwise, while a negative value means clockwise.
interpolation (str, optional) - The interpolation mode. Supported values: "nearest", "bilinear". Defaults to "bilinear".
fill_mode (str, optional) - The fill mode. Supported values: "constant", "reflect", "wrap", "nearest". Defaults to "constant".
fill_value (int|float, optional) - The value to be filled outside the boundaries when fill_mode="constant". Defaults to 0.
bounding_box_format (str, optional) - The format of bounding boxes of input dataset. Refer to https://github.com/james77777778/keras-aug/blob/main/keras_aug/datapoints/bounding_box/converter.py for more details on supported bounding box formats.
seed (int|float, optional) - The random seed. Defaults to None.
- class keras_aug.layers.RandomZoomAndCrop(height, width, scale_factor, crop_height=None, crop_width=None, interpolation='bilinear', antialias=False, postion='center', padding_value=0, bounding_box_format=None, seed=None, **kwargs)
RandomZoomAndCrop implements resize with scale distortion.
RandomZoomAndCrop takes a three-step approach to size-distortion based image augmentation. This technique is specifically tuned for object detection pipelines. The layer takes an input of images and bounding boxes, both of which may be ragged. It outputs a dense image tensor, ready to feed to a model for training. As such, this layer will commonly be the final step in an augmentation pipeline.
The augmentation process is as follows: The image is first scaled according to a randomly sampled scale factor. The width and height of the image are then resized according to the sampled scale. This is done to introduce noise into the local scale of features in the image. A subset of the image is then cropped randomly according to (crop_height, crop_width). This crop is then padded to be (height, width). Bounding boxes are translated and scaled according to the random scaling and random cropping.
- Parameters:
height (int) - The height of the result image.
width (int) - The width of the result image.
scale_factor (float|Sequence[float]|keras_aug.FactorSampler) - The range of the scale factor that is used to scale the input image. When represented as a single float, the factor will be picked between [1.0 - lower, 1.0 + upper]. To reproduce the results of the MaskRCNN paper, pass (0.8, 1.25).
crop_height (int, optional) - The height of the image to crop from the scaled image. Defaults to height when not provided.
crop_width (int, optional) - The width of the image to crop from the scaled image. Defaults to width when not provided.
interpolation (str, optional) - The interpolation method. Supported values: "nearest", "bilinear", "bicubic", "area", "lanczos3", "lanczos5", "gaussian", "mitchellcubic". Defaults to "bilinear".
antialias (bool, optional) - Whether to use antialias. Defaults to False.
position (str, optional) - The padding method. Supported values: "center", "top_left", "top_right", "bottom_left", "bottom_right", "random". Defaults to "center".
padding_value (int|float, optional) - The padding value. Defaults to 0.
bounding_box_format (str, optional) - The format of bounding boxes of input dataset. Refer to https://github.com/james77777778/keras-aug/blob/main/keras_aug/datapoints/bounding_box/converter.py for more details on supported bounding box formats.
seed (int|float, optional) - The random seed. Defaults to None.
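The scale, crop and pad steps described for RandomZoomAndCrop can be traced with a small shape-only sketch. It computes sizes and offsets, not pixels, and is not the keras_aug implementation: the function name zoom_and_crop_plan, the rounding, the assumption that scaling is relative to the input size, and the centered padding are all choices made for illustration.

```python
import random


def zoom_and_crop_plan(in_height, in_width, height, width,
                       scale_factor=(0.8, 1.25),
                       crop_height=None, crop_width=None, seed=None):
    """Return the sampled scale, crop window and padding for one image."""
    rng = random.Random(seed)
    crop_height = height if crop_height is None else crop_height
    crop_width = width if crop_width is None else crop_width

    # Step 1: scale the image by a randomly sampled factor.
    scale = rng.uniform(*scale_factor)
    scaled_h = max(1, round(in_height * scale))
    scaled_w = max(1, round(in_width * scale))

    # Step 2: choose a random crop window that fits inside the scaled image
    # and inside the output size.
    win_h = min(crop_height, scaled_h, height)
    win_w = min(crop_width, scaled_w, width)
    top = rng.randint(0, scaled_h - win_h)
    left = rng.randint(0, scaled_w - win_w)

    # Step 3: pad the crop up to (height, width), here centered.
    pad_top = (height - win_h) // 2
    pad_left = (width - win_w) // 2
    return {
        "scale": scale,
        "scaled_size": (scaled_h, scaled_w),
        "crop": (top, left, win_h, win_w),
        "pad": (pad_top, height - win_h - pad_top,
                pad_left, width - win_w - pad_left),
    }


if __name__ == "__main__":
    print(zoom_and_crop_plan(300, 400, 256, 256, seed=0))
```

Whatever the input size, the crop plus padding always adds up to (height, width), which is why the layer can emit a dense batch from ragged inputs.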
Intensity
- class keras_aug.layers.ChannelShuffle(groups=3, seed=None, **kwargs)
Shuffles channels of the input images.
- Parameters:
groups (int, optional) - The number of groups into which the input channels are divided. Defaults to 3.
seed (int|float, optional) - The random seed. Defaults to None.
- class keras_aug.layers.RandomBlur(factor, seed=None, **kwargs)
Randomly blurs the images using random-sized kernels.
This layer applies a mean filter with varying kernel sizes to blur the images. The sampled kernel sizes are always odd numbers.
- Parameters:
factor (int|Sequence[int]|keras_aug.FactorSampler) - The kernel size range for blurring the input image. If the factor is a single value, the range will be (1, factor). The value range of the factor should be in (1, +inf). When the sampled kernel size is 1, there is no blur effect.
seed (int|float, optional) - The random seed. Defaults to None.
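A mean filter with a randomly sampled odd kernel size can be sketched in plain Python as below. This is an illustration only: the function names are hypothetical, the image is a single-channel nested list, and the edge handling (clamping to the border) is an assumption, not necessarily what keras_aug does.

```python
import random


def sample_odd_kernel_size(factor, rng):
    """Pick an odd kernel size from (1, factor) or from a (low, high) pair."""
    low, high = (1, factor) if isinstance(factor, int) else factor
    odd_sizes = [k for k in range(low, high + 1) if k % 2 == 1]
    return rng.choice(odd_sizes)


def mean_blur(image, kernel_size):
    """Apply a kernel_size x kernel_size mean filter to a 2D list of floats."""
    rows, cols = len(image), len(image[0])
    radius = kernel_size // 2
    blurred = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Clamp coordinates so border pixels are repeated.
                    yy = min(max(y + dy, 0), rows - 1)
                    xx = min(max(x + dx, 0), cols - 1)
                    total += image[yy][xx]
            blurred[y][x] = total / (kernel_size * kernel_size)
    return blurred


if __name__ == "__main__":
    rng = random.Random(0)
    size = sample_odd_kernel_size(5, rng)
    image = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
    print(size, mean_blur(image, size))
```

A kernel size of 1 averages each pixel with itself only, which is why it leaves the image unchanged.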
- class keras_aug.layers.RandomChannelShift(value_range, factor, channels=3, seed=None, **kwargs)
Randomly shifts values for each channel of the input images.
- Parameters:
value_range (Sequence[int|float]) - The range of values the incoming images will have. This is typically either [0, 1] or [0, 255] depending on how your preprocessing pipeline is set up.
factor (float|Sequence[float]|keras_aug.FactorSampler) - The range of the channel shift factor. When represented as a single float, the factor will be picked between [0.0 - lower, 0.0 + upper]. 0.0 gives the original image.
channels (int, optional) - The number of channels to shift. Defaults to 3, which corresponds to an RGB shift.
seed (int|float, optional) - The random seed. Defaults to None.
- class keras_aug.layers.RandomCLAHE(value_range, factor=(4, 4), tile_grid_size=(8, 8), seed=None, **kwargs)
Randomly applies Contrast Limited Adaptive Histogram Equalization to the input images.
- Parameters:
value_range (Sequence[int|float]) - The range of values the incoming images will have. This is typically either [0, 1] or [0, 255] depending on how your preprocessing pipeline is set up.
factor (int|Sequence[int]|keras_aug.FactorSampler) - The range of the threshold values for contrast limiting. If the factor is a single float value, the range will be (1, clip_limit). Defaults to (4, 4).
tile_grid_size (Sequence[int]) - The size of the grid for histogram equalization. Defaults to (8, 8).
seed (int|float, optional) - The random seed. Defaults to None.
- class keras_aug.layers.RandomColorJitter(value_range, brightness_factor=None, contrast_factor=None, saturation_factor=None, hue_factor=None, seed=None, **kwargs)
Randomly applies brightness, contrast, saturation and hue adjustments sequentially to the input images. It expects RGB images as input.
- Parameters:
value_range (Sequence[int|float]) - The range of values the incoming images will have. This is typically either [0, 1] or [0, 255] depending on how your preprocessing pipeline is set up.
brightness_factor (float|Sequence[float]|keras_aug.FactorSampler, optional) - The range of the brightness factor. When represented as a single float, the factor will be picked between [1.0 - lower, 1.0 + upper]. 0.0 will make the image black. 1.0 will make the image white. Defaults to None.
contrast_factor (float|Sequence[float]|keras_aug.FactorSampler, optional) - The range of the contrast factor. When represented as a single float, the factor will be picked between [1.0 - lower, 1.0 + upper]. 0.0 gives a solid gray image. 1.0 gives the original image, while 2.0 increases the contrast by a factor of 2. Defaults to None.
saturation_factor (float|Sequence[float]|keras_aug.FactorSampler, optional) - The range of the saturation factor. When represented as a single float, the factor will be picked between [1.0 - lower, 1.0 + upper]. 1.0 gives the original image. 0.0 makes the image fully grayscale. 2.0 enhances the saturation by a factor of 2. Defaults to None.
hue_factor (float|Sequence[float]|keras_aug.FactorSampler, optional) - The range of the hue factor. When represented as a single float, the factor will be picked between [0.0 - lower, 0.0 + upper]. 0.0 means no shift. -0.5 or 0.5 gives an image with complementary colors. Defaults to None.
seed (int|float, optional) - The random seed. Defaults to None.
References
Tensorflow Model augment <https://github.com/tensorflow/models/blob/master/official/vision/ops/augment.py>
- class keras_aug.layers.RandomGamma(value_range, factor, seed=None, **kwargs)
Randomly adjusts gamma of the input images.
This layer will randomly increase/reduce the gamma for the input images by the equation: y = x ** factor. Gamma is adjusted independently for each image. The image is adjusted by converting the pixel value range to [0, 1] and applying RandomGamma. The image is then converted back to the original value range.
- Parameters:
value_range (Sequence[int|float]) - The range of values the incoming images will have. This is typically either [0, 1] or [0, 255] depending on how your preprocessing pipeline is set up.
factor (float|Sequence[float]|keras_aug.FactorSampler) - The range of the gamma factor. When represented as a single float, the factor will be picked between [1.0 - lower, 1.0 + upper]. 1.0 will give the original image.
seed (int|float, optional) - The random seed. Defaults to None.
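The equation y = x ** factor, together with the conversion to [0, 1] and back, can be written out as a short sketch. The function name adjust_gamma and the flat list of pixel values are assumptions for illustration; the layer itself operates on image tensors and samples the factor per image.

```python
import random


def adjust_gamma(pixels, value_range, factor):
    """Apply y = x ** factor after mapping `pixels` from value_range to [0, 1]."""
    low, high = value_range
    span = high - low
    return [low + span * (((p - low) / span) ** factor) for p in pixels]


if __name__ == "__main__":
    rng = random.Random(0)
    factor = rng.uniform(0.5, 1.5)  # one sampled gamma per image
    print(factor, adjust_gamma([0.0, 127.5, 255.0], (0, 255), factor))
```

With this formula a factor above 1.0 darkens mid-tones, a factor below 1.0 brightens them, and the two ends of the value range stay fixed.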
- class keras_aug.layers.RandomGaussianBlur(kernel_size, factor, seed=None, **kwargs)
Applies a Gaussian Blur with random strength to an image.
- Parameters:
kernel_size (int|Sequence[int]) - The x and y dimensions for the kernel used.
factor (float|Sequence[float]|keras_aug.FactorSampler) - The range of the factor that controls the extent to which the image is blurred. When represented as a single float, the factor will be picked between [0.0, 0.0 + upper]. Mathematically, factor represents the sigma value in a Gaussian blur. 0.0 makes this layer a no-op. High values make the blur stronger.
seed (int|float, optional) - The random seed. Defaults to None.
- class keras_aug.layers.RandomHSV(value_range, hue_factor=None, saturation_factor=None, value_factor=None, seed=None, **kwargs)
Randomly adjusts the hue, saturation and value on given images.
This layer will randomly increase/reduce the hue, saturation and value for the input RGB images. The image hue, saturation and value are adjusted by converting the image(s) to HSV and rotating the hue channel (H) by the hue factor, multiplying the saturation channel (S) by the saturation factor and multiplying the value channel (V) by the value factor. The image is then converted back to RGB.
- Parameters:
value_range (Sequence[int|float]) - The range of values the incoming images will have. This is typically either [0, 1] or [0, 255] depending on how your preprocessing pipeline is set up.
hue_factor (float|Sequence[float]|keras_aug.FactorSampler, optional) - The range of the hue factor. When represented as a single float, the factor will be picked between [0.5 - lower, 0.5 + upper]. 0.0 means no shift. -0.5 or 0.5 gives an image with complementary colors. Defaults to None.
saturation_factor (float|Sequence[float]|keras_aug.FactorSampler, optional) - The range of the saturation factor. When represented as a single float, the factor will be picked between [1.0 - lower, 1.0 + upper]. 1.0 gives the original image. 0.0 makes the image fully grayscale. 2.0 enhances the saturation by a factor of 2. Defaults to None.
value_factor (float|Sequence[float]|keras_aug.FactorSampler, optional) - The range of the value factor. When represented as a single float, the factor will be picked between [1.0 - lower, 1.0 + upper]. 1.0 gives the original image. 0.0 sets the image to zero values. 2.0 enhances the value by a factor of 2. Defaults to None.
seed (int|float, optional) - The random seed. Defaults to None.
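The convert, rotate, scale and convert-back procedure can be illustrated for a single pixel using Python's standard colorsys module. This is a sketch under assumptions, not the layer's code: the function name adjust_hsv is hypothetical, the pixel is an RGB triple in [0, 1], and clipping S and V to [0, 1] after scaling is an assumption.

```python
import colorsys


def adjust_hsv(rgb, hue_shift=0.0, saturation_scale=1.0, value_scale=1.0):
    """Rotate H by hue_shift and multiply S and V for one RGB pixel in [0, 1]."""
    hue, saturation, value = colorsys.rgb_to_hsv(*rgb)
    hue = (hue + hue_shift) % 1.0  # hue is cyclic, so wrap around
    saturation = min(max(saturation * saturation_scale, 0.0), 1.0)
    value = min(max(value * value_scale, 0.0), 1.0)
    return colorsys.hsv_to_rgb(hue, saturation, value)


if __name__ == "__main__":
    red = (1.0, 0.0, 0.0)
    print(adjust_hsv(red, hue_shift=0.5))          # complementary color of red
    print(adjust_hsv(red, saturation_scale=0.0))   # fully desaturated
```

A hue shift of 0.5 (or -0.5) moves a color halfway around the hue circle, which is why it yields the complementary color.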
- class keras_aug.layers.RandomJpegQuality(value_range, factor, seed=None, **kwargs)
Randomly applies JPEG compression artifacts to the input images.
Performs the JPEG compression algorithm on the image. This layer can be used to make your model robust to artifacts introduced by JPEG compression.
- Parameters:
value_range (Sequence[int|float]) - The range of values the incoming images will have. This is typically either [0, 1] or [0, 255] depending on how your preprocessing pipeline is set up.
factor (int|Sequence[int]|keras_aug.FactorSampler) - The range of the compression factor. When represented as a single int, the factor will be randomly picked between [100 - factor, 100]. 50 will give the image with 50% JPEG compression. 100 will still give a lossy compression. This value is passed to tf.image.adjust_jpeg_quality().
seed (int|float, optional) - The random seed. Defaults to None.
- class keras_aug.layers.RandomPosterize(value_range, factor, seed=None, **kwargs)
Randomly reduces the number of bits for each color channel.
- Parameters:
value_range (Sequence[int|float]) - The range of values the incoming images will have. This is typically either [0, 1] or [0, 255] depending on how your preprocessing pipeline is set up.
factor (int|Sequence[int]|keras_aug.FactorSampler) - The number of bits to keep for each channel. Must be a value between [1, 8]. factor=(5, 8) means RandomPosterize will randomly keep 5 to 8 bits for the image.
seed (int|float, optional) - The random seed. Defaults to None.
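Keeping only the top bits of each channel value can be sketched with a bit mask. The example assumes 8-bit integer pixels in [0, 255] and a hypothetical helper named posterize; a layer working on other value ranges would first convert to that range.

```python
import random


def posterize(pixels, bits):
    """Keep the `bits` most significant bits of each 8-bit pixel value."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    mask = (0xFF << (8 - bits)) & 0xFF  # e.g. bits=3 -> 0b11100000
    return [p & mask for p in pixels]


if __name__ == "__main__":
    rng = random.Random(0)
    bits = rng.randint(5, 8)  # mirrors factor=(5, 8): keep 5 to 8 bits
    print(bits, posterize([0, 37, 200, 255], bits))
```

Keeping all 8 bits leaves the image unchanged, while fewer bits collapse nearby intensities into the same level.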
- class keras_aug.layers.RandomSharpness(value_range, factor, seed=None, **kwargs)
Randomly performs the sharpness operation on given images.
The sharpness operation first blurs the image, then blends between the original image and the blurred image. Depending on the factor, the edges of the result are softer or sharper than in the original image.
- Parameters:
value_range (Sequence[int|float]) – The range of values the incoming images will have. This is typically either [0, 1] or [0, 255] depending on how your preprocessing pipeline is set up.
factor (float|Sequence[float]|keras_aug.FactorSampler) – The range of the sharpness factor. When represented as a single float, the factor will be picked between [1.0 - lower, 1.0 + upper]. 1.0 will give the original image. 0.0 gives a blurred image. 2.0 will enhance the sharpness by a factor of 2.
seed (int|float, optional) – The random seed. Defaults to None.
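The blend described above can be sketched for a single value as follows. The function name blend_sharpness is hypothetical and the linear blend is an assumption consistent with the documented factor values (0.0 blurred, 1.0 original, 2.0 sharpened); the blur itself and the final clipping to value_range are left out.

```python
def blend_sharpness(original: float, blurred: float, factor: float) -> float:
    """Blend between a blurred value and the original value."""
    # factor=0.0 returns the blurred value, factor=1.0 the original,
    # and factors above 1.0 extrapolate away from the blur.
    return blurred + factor * (original - blurred)


for factor in (0.0, 1.0, 2.0):
    print(factor, blend_sharpness(100.0, 80.0, factor))
```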
References
- class keras_aug.layers.RandomSolarize(value_range, threshold_factor, addition_factor=0, seed=None, **kwargs)
Randomly applies (max_value - pixel + min_value) for each pixel in the input images.
When created without the threshold_factor parameter, the layer performs solarization to all values. When created with a specified threshold_factor, the layer only augments pixels that are above the threshold_factor value.
- Parameters:
value_range (Sequence[int|float]) – The range of values the incoming images will have. This is typically either [0, 1] or [0, 255] depending on how your preprocessing pipeline is set up.
threshold_factor (float|Sequence[float]|keras_aug.FactorSampler) – The range of the threshold factor. Only the pixel values above the threshold will be solarized. When represented as a single float, the factor will be picked between [0, upper]. 255 means no thresholding.
addition_factor (float|Sequence[float]|keras_aug.FactorSampler, optional) – The range of the addition factor that is added to each pixel before solarization and thresholding. When represented as a single float, the factor will be picked between [0, upper]. 0 means no addition. Defaults to 0.
seed (int|float, optional) – The random seed. Defaults to None.
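A minimal per-pixel sketch of the order of operations described above: add, clip, then invert values above the threshold. The name solarize_value is hypothetical, and treating "above the threshold" as strictly greater than is an assumption; the library may handle the boundary differently.

```python
def solarize_value(pixel, threshold, addition=0.0, value_range=(0.0, 255.0)):
    """Solarize one pixel value following the documented formula."""
    min_value, max_value = value_range
    # The addition is applied first and clipped to the valid range.
    pixel = min(max(pixel + addition, min_value), max_value)
    # Assumption: "above the threshold" means strictly greater than.
    if pixel > threshold:
        return max_value - pixel + min_value
    return pixel


print(solarize_value(200.0, threshold=128.0))
print(solarize_value(100.0, threshold=128.0))
print(solarize_value(100.0, threshold=128.0, addition=50.0))
```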
References
Mix
- class keras_aug.layers.CutMix(alpha=1.0, seed=None, **kwargs)
CutMix implements the CutMix data augmentation technique.
CutMix only supports dense images as inputs.
- Parameters:
alpha (float, optional) – The inverse scale parameter, between 0 and +inf, for the gamma distribution. This controls the shape of the distribution from which the smoothing values are sampled. Defaults to 1.0, which is a recommended value when training an ImageNet classification model.
seed (int|float, optional) – The random seed. Defaults to None.
References
- class keras_aug.layers.MixUp(alpha=0.2, seed=None, **kwargs)
The MixUp data augmentation technique.
The MixUp data augmentation technique involves taking 2 images from a given batch and fusing them together using a ratio sampled from a beta distribution. Labels are mixed by the same ratio. Bounding boxes are concatenated according to the position of the 2 images.
- Parameters:
alpha (float, optional) – The inverse scale parameter, between 0 and +inf, for the gamma distribution. This controls the shape of the distribution from which the smoothing values are sampled. Defaults to 0.2, which is a recommended value when training an ImageNet classification model. For object detection, it is recommended to use a larger value. For example, YOLOV8 uses 32.0.
seed (int|float, optional) – The random seed. Defaults to None.
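The fusing step can be sketched in plain Python as below. The helper names are hypothetical and images are flat lists for brevity. The ratio is drawn from Beta(alpha, alpha), which can itself be formed from two gamma draws; whether keras_aug samples it exactly this way is an assumption.

```python
import random


def sample_mix_ratio(alpha: float, rng: random.Random) -> float:
    """Draw a mixing ratio in [0, 1] from Beta(alpha, alpha)."""
    return rng.betavariate(alpha, alpha)


def mixup(image_a, image_b, label_a, label_b, ratio):
    """Blend two flat pixel lists and their one-hot labels with one ratio."""
    image = [ratio * a + (1.0 - ratio) * b for a, b in zip(image_a, image_b)]
    label = [ratio * a + (1.0 - ratio) * b for a, b in zip(label_a, label_b)]
    return image, label


rng = random.Random(0)
ratio = sample_mix_ratio(0.2, rng)
image, label = mixup([0.0, 100.0], [200.0, 100.0], [1.0, 0.0], [0.0, 1.0], ratio)
print(ratio, image, label)
```

A small alpha such as 0.2 concentrates the ratio near 0 or 1, so most mixed samples stay close to one of the two source images.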
References
- class keras_aug.layers.Mosaic(height, width, offset=(0.25, 0.75), fill_value=0, bounding_box_format=None, seed=None, **kwargs)
The Mosaic data augmentation technique used by the YOLO series.
The Mosaic data augmentation first takes 4 images from the batch and arranges them in a grid. Then, based on the offset, a crop is taken to form the mosaic image. Labels are mixed in the same ratio as the area each image occupies in the output image. Bounding boxes are translated according to the position of the 4 images.
- Parameters:
height (int) – The height of the result image.
width (int) – The width of the result image.
offset (float|Sequence[float]|keras_aug.FactorSampler) – The offset of the mosaic center from the top-left corner of the mosaic. If a tuple is used, the x and y coordinates of the mosaic center are sampled between the two values for every image augmented. When represented as a single float, the values will be picked between [0.5 - offset, 0.5 + offset]. Defaults to (0.25, 0.75).
fill_value (int|float, optional) – The value to be filled outside the boundaries when fill_mode="constant". Defaults to 0.
bounding_box_format (str, optional) – The format of bounding boxes of the input dataset. Refer to https://github.com/james77777778/keras-aug/blob/main/keras_aug/datapoints/bounding_box/converter.py for more details on supported bounding box formats.
seed (int|float, optional) – The random seed. Defaults to None.
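To make the area-based label ratio concrete, here is a simplified sketch. It assumes the mosaic center is given as relative (x, y) coordinates inside the output image and that each of the 4 images fills exactly one quadrant around that center. Both the helper names and this geometry are illustrative assumptions, not the library's implementation.

```python
def mosaic_label_weights(center_x: float, center_y: float):
    """Area fraction of each quadrant for a center in [0, 1] x [0, 1]."""
    top_left = center_x * center_y
    top_right = (1.0 - center_x) * center_y
    bottom_left = center_x * (1.0 - center_y)
    bottom_right = (1.0 - center_x) * (1.0 - center_y)
    return [top_left, top_right, bottom_left, bottom_right]


def mix_labels(labels, weights):
    """Weighted sum of four label vectors of equal length."""
    return [
        sum(w * lab[i] for w, lab in zip(weights, labels))
        for i in range(len(labels[0]))
    ]


weights = mosaic_label_weights(0.25, 0.75)
one_hot = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(weights, mix_labels(one_hot, weights))
```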
References
Regularization
- class keras_aug.layers.RandomChannelDropout(factor=(0, 2), fill_value=0, seed=None, **kwargs)
Randomly drop channels of the input images.
- Parameters:
factor (float|Sequence[float]|keras_aug.FactorSampler) – The range from which we choose the number of channels to drop. Defaults to (0, 2).
fill_value (int|float, optional) – The value to be filled for dropped channel. Defaults to 0.
seed (int|float, optional) – The random seed. Defaults to None.
References
- class keras_aug.layers.RandomCutout(height_factor, width_factor, fill_mode='constant', fill_value=0, bbox_removal_threshold=0.6, bounding_box_format=None, seed=None, **kwargs)
Randomly cut out rectangles from images and fill them.
- Parameters:
height_factor (float|Sequence[float]|keras_aug.FactorSampler) – The range of the height factor that controls the height of the cutout rectangle. When represented as a single float, the factor will be picked between [0.0, 0.0 + upper]. 0.0 means the rectangle will be of size 0% of the image height. 0.1 means the rectangle will have a size of 10% of the image height.
width_factor (float|Sequence[float]|keras_aug.FactorSampler) – The range of the width factor that controls the width of the cutout rectangle. When represented as a single float, the factor will be picked between [0.0, 0.0 + upper]. 0.0 means the rectangle will be of size 0% of the image width. 0.1 means the rectangle will have a size of 10% of the image width.
fill_mode (str, optional) – Pixels inside the cutout rectangle are filled according to the given mode. Supported values: "constant", "gaussian_noise". Defaults to "constant".
fill_value (int|float, optional) – The value to be filled in the cutout rectangle when fill_mode="constant". Defaults to 0.
bbox_removal_threshold (float, optional) – Bounding boxes whose cut-out content exceeds this threshold will be removed. Defaults to 0.6, which is the value used by the ultralytics/yolo series.
bounding_box_format (str, optional) – The format of bounding boxes of the input dataset. Refer to https://github.com/james77777778/keras-aug/blob/main/keras_aug/datapoints/bounding_box/converter.py for more details on supported bounding box formats.
seed (int|float, optional) – The random seed. Defaults to None.
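The relation between the factors and the rectangle size can be sketched on a 2D image stored as a list of rows. The function name cutout, the explicit center arguments, and the rounding are assumptions made for illustration; the layer samples the factors and the position randomly.

```python
def cutout(image, center_row, center_col, height_factor, width_factor, fill_value=0):
    """Fill a rectangle of a 2D image (list of rows) with a constant value."""
    rows, cols = len(image), len(image[0])
    rect_h = int(round(height_factor * rows))
    rect_w = int(round(width_factor * cols))
    # Clip the rectangle so it never extends past the image borders.
    top = max(center_row - rect_h // 2, 0)
    bottom = min(center_row - rect_h // 2 + rect_h, rows)
    left = max(center_col - rect_w // 2, 0)
    right = min(center_col - rect_w // 2 + rect_w, cols)
    out = [row[:] for row in image]  # copy so the input is left untouched
    for r in range(top, bottom):
        for c in range(left, right):
            out[r][c] = fill_value
    return out


image = [[1] * 10 for _ in range(10)]
result = cutout(image, center_row=5, center_col=5, height_factor=0.2, width_factor=0.4)
print(sum(row.count(0) for row in result))
```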
References
- class keras_aug.layers.RandomErase(area_factor=(0.02, 0.4), aspect_ratio_factor=(0.3, 1.0 / 0.3), fill_mode='constant', fill_value=(125, 123, 114), seed=None, **kwargs)
Randomly erase rectangles from images and fill them.
- Parameters:
area_factor (float|Sequence[float]|keras_aug.FactorSampler, optional) – The range of the area factor that controls the area of the erased region. When represented as a single float, the factor will be picked between [0.0, 0.0 + upper]. 0.0 means the rectangle will be of size 0% of the image area. 0.1 means the rectangle will have a size of 10% of the image area. Defaults to (0.02, 0.4).
aspect_ratio_factor (float|Sequence[float]|keras_aug.FactorSampler, optional) – The range of the aspect ratio factor that controls the aspect ratio of the erased region. When represented as a single float, the factor will be picked between [1.0 - lower, 1.0 + upper]. 1.0 means the erased region will be square. Defaults to (0.3, 1.0 / 0.3).
fill_mode (str, optional) – Pixels inside the erased region are filled according to the given mode. Supported values: "constant", "gaussian_noise". Defaults to "constant".
fill_value (tuple(float), optional) – The values to be filled in the erased region when fill_mode="constant". Defaults to (125, 123, 114), which are the ImageNet channel means.
seed (int|float, optional) – The random seed. Defaults to None.
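One way to turn a sampled area factor and aspect ratio into a rectangle size is sketched below. The helper erase_size is hypothetical, and the convention that the aspect ratio is height divided by width is an assumption; the source does not state which convention the layer uses.

```python
import math


def erase_size(image_height, image_width, area_factor, aspect_ratio):
    """Rectangle (height, width) for a given area fraction and aspect ratio."""
    erase_area = area_factor * image_height * image_width
    # Assumption: aspect_ratio = height / width, so height * width == erase_area.
    height = int(round(math.sqrt(erase_area * aspect_ratio)))
    width = int(round(math.sqrt(erase_area / aspect_ratio)))
    # Keep the rectangle inside the image.
    return min(height, image_height), min(width, image_width)


print(erase_size(100, 100, area_factor=0.25, aspect_ratio=1.0))
print(erase_size(100, 100, area_factor=0.04, aspect_ratio=4.0))
```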
References
- class keras_aug.layers.RandomGridMask(size_factor=(96.0 / 224.0, 224.0 / 224.0), ratio_factor=(0.6, 0.6), rotation_factor=(-180, 180), fill_mode='constant', fill_value=0.0, seed=None, **kwargs)
RandomGridMask performs the Grid Mask operation on input images.
- Parameters:
size_factor (float|Sequence[float]|keras_aug.FactorSampler, optional) – The relative size for grid masks. When represented as a single float, the factor will be picked between [0.0, 0.0 + upper]. Represented as d1, d2 in the paper. Defaults to (96/224, 224/224), which is the setting for an ImageNet classification model. For COCO object detection, it is set to (0.01, 1.0).
ratio_factor (float|Sequence[float]|keras_aug.FactorSampler, optional) – The ratio from spacings to grid masks. When represented as a single float, the factor will be picked between [0.0, 0.0 + upper]. Represented as ratio in the paper. Lower values make the grid mask smaller, and higher values make the grid mask larger. 0.5 indicates that grid and spacing will be of equal size. Defaults to (0.6, 0.6), which is the setting for an ImageNet classification model. For COCO object detection, it is set to (0.5, 0.5).
rotation_factor (float|Sequence[float]|keras_aug.FactorSampler, optional) – The range of the angle, in degrees, used to rotate the grid mask. When represented as a single float, the factor will be picked between [0.0 - lower, 0.0 + upper]. A positive value means rotating counterclockwise, while a negative value means clockwise. Defaults to (-180, 180), which is the setting for an ImageNet classification model. For COCO object detection, it is set to (0, 0).
fill_mode (str, optional) – The fill mode. Supported values: "constant", "gaussian_noise", "random". Defaults to "constant".
fill_value (int|float, optional) – The value to be filled inside the grid block when fill_mode="constant". Defaults to 0.
seed (int|float, optional) – The random seed. Defaults to None.
References
Utility
- class keras_aug.layers.RandomApply(layer, rate=0.5, batchwise=False, seed=None, **kwargs)
Randomly applies an augmentation or a list of augmentations with a given probability.
Notes
The shape and type of the outputs must be the same as those of the inputs.
- Parameters:
layer (VectorizedBaseRandomLayer|keras.Layer|keras.Sequential) – This layer will be applied to the batch when the sampled prob < rate. The layer should not modify the shape of the inputs.
rate (float, optional) – The value that controls the frequency of applying the layer. 1.0 means the layer will always apply. 0.0 means no op. Defaults to 0.5.
batchwise (bool, optional) – Whether to pass entire batches to the underlying layer. When set to True, each batch is passed to a single layer, instead of each sample to an independent layer. This is useful when using MixUp(), CutMix(), Mosaic(), etc. Defaults to False.
seed (int|float, optional) – The random seed. Defaults to None.
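The prob < rate rule can be illustrated per sample with a plain function. This is a sketch of the decision logic only; random_apply and its arguments are hypothetical names, not the layer's API.

```python
import random


def random_apply(transform, sample, rate, rng):
    """Apply `transform` to `sample` when a uniform draw falls below `rate`."""
    # rng.random() lies in [0.0, 1.0), so rate=1.0 always applies
    # the transform and rate=0.0 never does.
    if rng.random() < rate:
        return transform(sample)
    return sample


rng = random.Random(0)
outputs = [random_apply(lambda x: x + 1, 0, 0.5, rng) for _ in range(10)]
print(outputs)
```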
References
- class keras_aug.layers.RandomChoice(layers, batchwise=False, seed=None, **kwargs)
RandomChoice constructs a pipeline based on provided arguments.
The implemented policy does the following: for each input provided in call(), the policy selects a random layer from the provided list of layers. It then calls the layer on the inputs.
Notes
The shape and type of the outputs must be the same as those of the inputs.
- Parameters:
layers (list(VectorizedBaseRandomLayer|keras.Layer|keras.Sequential)) – The list of the layers that will be picked randomly for the pipeline.
batchwise (bool, optional) – Whether to pass entire batches to the underlying layer. When set to True, each batch is passed to a single layer, instead of each sample to an independent layer. This is useful when using MixUp(), CutMix(), Mosaic(), etc. Defaults to False.
seed (int|float, optional) – The random seed. Defaults to None.
References
- class keras_aug.layers.RepeatedAugment(layers, shuffle=True, seed=None, **kwargs)
RepeatedAugment augments each image in a batch multiple times.
This technique exists to emulate the behavior of stochastic gradient descent within the context of mini-batch gradient descent. When training large vision models, choosing a large batch size can introduce too much noise into aggregated gradients, causing the overall batch's gradients to be less effective than gradients produced using smaller batches. RepeatedAugment handles this by re-using the same image multiple times within a batch, creating correlated samples.
Notes
This layer increases your batch size by a factor of len(layers).
- Parameters:
layers (list(keras_aug.layers.*)) – The list of the layers to use to augment the inputs.
shuffle (bool, optional) – Whether to shuffle the results. Essential when using an asynchronous distribution strategy such as ParameterServerStrategy. Defaults to True.
seed (int|float, optional) – The random seed. Defaults to None.
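The batch-size growth noted above can be seen in a small sketch where plain functions stand in for augmentation layers. The function repeated_augment is a hypothetical stand-in, not the layer itself.

```python
import random


def repeated_augment(batch, transforms, shuffle=True, rng=None):
    """Apply every transform to every sample and concatenate the results."""
    out = []
    for transform in transforms:
        out.extend(transform(sample) for sample in batch)
    if shuffle:
        # Shuffling spreads the correlated copies across the enlarged batch.
        (rng or random.Random()).shuffle(out)
    return out


batch = [1, 2, 3]
transforms = [lambda x: x, lambda x: -x]
augmented = repeated_augment(batch, transforms, shuffle=True, rng=random.Random(0))
print(len(batch), "->", len(augmented))
```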
References
Preprocessing 2D
Geometry
- class keras_aug.layers.CenterCrop(height, width, padding_value=0, bounding_box_format=None, bounding_box_min_area_ratio=None, bounding_box_max_aspect_ratio=None, seed=None, **kwargs)
Center crops the images.
CenterCrop crops the central portion of the images to a specified (height, width). If an image is smaller than the target size, it will be padded and then cropped.
- Parameters:
height (int) – The height of the result image.
width (int) – The width of the result image.
padding_value (int|float, optional) – The padding value. Defaults to 0.
bounding_box_format (str, optional) – The format of bounding boxes of the input dataset. Refer to https://github.com/james77777778/keras-aug/blob/main/keras_aug/datapoints/bounding_box/converter.py for more details on supported bounding box formats.
bounding_box_min_area_ratio (float, optional) – The threshold to apply sanitize_bounding_boxes. Defaults to None.
bounding_box_max_aspect_ratio (float, optional) – The threshold to apply sanitize_bounding_boxes. Defaults to None.
seed (int|float, optional) – The random seed. Defaults to None.
- class keras_aug.layers.PadIfNeeded(min_height=None, min_width=None, height_divisor=None, width_divisor=None, position='center', padding_value=0, bounding_box_format=None, seed=None, **kwargs)
Pads the images if needed.
PadIfNeeded can be configured by specifying the height/width or the divisors. The images will be padded to (min_height, min_width), or so that both sides are divisible by height_divisor and width_divisor. You must specify either min_height or height_divisor, and either min_width or width_divisor.
- Parameters:
min_height (int, optional) – The height of the result image.
min_width (int, optional) – The width of the result image.
height_divisor (int, optional) – The value that the padded image height must be divisible by.
width_divisor (int, optional) – The value that the padded image width must be divisible by.
position (str, optional) – The padding method. Supported values: "center", "top_left", "top_right", "bottom_left", "bottom_right", "random". Defaults to "center".
padding_value (int|float, optional) – The padding value. Defaults to 0.
bounding_box_format (str, optional) – The format of bounding boxes of the input dataset. Refer to https://github.com/james77777778/keras-aug/blob/main/keras_aug/datapoints/bounding_box/converter.py for more details on supported bounding box formats.
seed (int|float, optional) – The random seed. Defaults to None.
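The two ways of choosing a padded size, and the split of padding for a centered position, can be sketched for one side of the image. The helpers padded_size and center_padding are hypothetical, and putting the extra pixel after the content when the padding is odd is an assumption.

```python
def padded_size(size, min_size=None, divisor=None):
    """Target length of one side, from a minimum size or a divisor."""
    if min_size is not None:
        return max(size, min_size)
    if divisor is not None:
        # Round up to the next multiple of `divisor`.
        return -(-size // divisor) * divisor
    raise ValueError("specify either min_size or divisor")


def center_padding(size, target):
    """Split the padding for one side into (before, after) amounts."""
    total = target - size
    before = total // 2
    return before, total - before


print(padded_size(100, min_size=128), padded_size(100, divisor=32))
print(center_padding(101, 128))
```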
References
- class keras_aug.layers.Resize(height, width, interpolation='bilinear', antialias=False, crop_to_aspect_ratio=False, pad_to_aspect_ratio=False, postion='center', padding_value=0, bounding_box_format=None, seed=None, **kwargs)
Resizes the images.
Resize will resize the images to (height, width). Set crop_to_aspect_ratio or pad_to_aspect_ratio to True to keep the aspect ratio.
When crop_to_aspect_ratio or pad_to_aspect_ratio is set to True, you can control the cropping position or padding position by setting position.
- Parameters:
height (int) – The height of the result image.
width (int) – The width of the result image.
interpolation (str, optional) – The interpolation mode. Supported values: "nearest", "bilinear". Defaults to "bilinear".
antialias (bool, optional) – Whether to use antialias. Defaults to False.
crop_to_aspect_ratio (bool, optional) – If True, the output images will be cropped to return the largest possible window in the images. Defaults to False.
pad_to_aspect_ratio (bool, optional) – If True, the output images will be padded to return the largest possible resize of the images. Defaults to False.
position (str, optional) – The padding method. Supported values: "center", "top_left", "top_right", "bottom_left", "bottom_right", "random". Defaults to "center".
padding_value (int|float, optional) – The padding value. Defaults to 0.
bounding_box_format (str, optional) – The format of bounding boxes of the input dataset. Refer to https://github.com/james77777778/keras-aug/blob/main/keras_aug/datapoints/bounding_box/converter.py for more details on supported bounding box formats.
seed (int|float, optional) – The random seed. Defaults to None.
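The "largest possible window" used when cropping to the target aspect ratio can be computed as in the sketch below. The helper crop_window_for_aspect is hypothetical and the integer rounding is an assumption; the window would then be resized to (height, width).

```python
def crop_window_for_aspect(src_h, src_w, target_h, target_w):
    """Largest (height, width) window in the source with the target aspect ratio."""
    # Compare aspect ratios by cross-multiplication to avoid float division.
    if src_w * target_h > src_h * target_w:
        # Source is relatively wider than the target: keep full height.
        crop_h = src_h
        crop_w = src_h * target_w // target_h
    else:
        # Source is relatively taller (or equal): keep full width.
        crop_w = src_w
        crop_h = src_w * target_h // target_w
    return crop_h, crop_w


print(crop_window_for_aspect(100, 200, 50, 50))
print(crop_window_for_aspect(100, 100, 50, 100))
```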
References
Intensity
- class keras_aug.layers.AutoContrast(value_range, **kwargs)
Performs the AutoContrast operation on the input images.
Auto contrast stretches the values of an image across the entire available value_range. This makes differences between pixels more obvious. An example of this is if an image only has values [0, 1] out of the range [0, 255], auto contrast will change the 1 values to be 255.
- Parameters:
value_range (Sequence[int|float]) – The range of values the incoming images will have. This is typically either [0, 1] or [0, 255] depending on how your preprocessing pipeline is set up.
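The stretching in the example above can be written as a linear remap of a list of values. The function auto_contrast here is a hypothetical single-channel sketch; leaving a constant image unchanged is an assumption.

```python
def auto_contrast(values, value_range=(0.0, 255.0)):
    """Linearly stretch values so that they span the whole value range."""
    low, high = min(values), max(values)
    range_min, range_max = value_range
    if high == low:
        # A constant image has no contrast to stretch.
        return list(values)
    scale = (range_max - range_min) / (high - low)
    return [(v - low) * scale + range_min for v in values]


print(auto_contrast([0, 1, 1, 0]))
```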
References
- class keras_aug.layers.Equalize(value_range, bins=256, **kwargs)
Performs histogram equalization on a channel-wise basis.
- Parameters:
value_range (Sequence[int|float]) – The range of values the incoming images will have. This is typically either [0, 1] or [0, 255] depending on how your preprocessing pipeline is set up.
bins (int, optional) – The number of bins to use in histogram equalization. Should be in the range [0, 256]. Defaults to 256.
References
- class keras_aug.layers.Grayscale(output_channels=3, **kwargs)
Grayscale transforms RGB images to grayscale images.
- Parameters:
output_channels (int, optional) – The number of the color channels of the outputs. Defaults to 3.
References
- class keras_aug.layers.Invert(value_range, **kwargs)
Inverts the inputs.
Inverts the pixel value by the equation: y = max_pixel_value - x.
- Parameters:
value_range (Sequence[int|float]) – The range of values the incoming images will have. This is typically either [0, 1] or [0, 255] depending on how your preprocessing pipeline is set up.
- class keras_aug.layers.Normalize(value_range, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), **kwargs)
Normalizes the given images by mean and std.
Normalize applies the following equation to the input images: y = (x - mean * max_pixel_value) / (std * max_pixel_value)
- Parameters:
value_range (Sequence[int|float]) – The range of values the incoming images will have. This is typically either [0, 1] or [0, 255] depending on how your preprocessing pipeline is set up.
mean (list(float)) – The mean values. Defaults to (0.485, 0.456, 0.406), which are the mean values from ImageNet.
std (list(float)) – The std values. Defaults to (0.229, 0.224, 0.225), which are the std values from ImageNet.
seed (int|float, optional) – The random seed. Defaults to None.
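The equation above, applied to one RGB pixel, looks as follows. The function normalize_pixel is a hypothetical sketch of the arithmetic; max_pixel_value is assumed to be the upper bound of value_range.

```python
def normalize_pixel(rgb, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225),
                    max_pixel_value=255.0):
    """Apply y = (x - mean * max_pixel_value) / (std * max_pixel_value) per channel."""
    return [
        (x - m * max_pixel_value) / (s * max_pixel_value)
        for x, m, s in zip(rgb, mean, std)
    ]


# A pixel equal to the scaled channel means maps to (approximately) zero.
mean_pixel = [0.485 * 255.0, 0.456 * 255.0, 0.406 * 255.0]
print(normalize_pixel(mean_pixel))
```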
- class keras_aug.layers.Rescale(scale, offset=0.0, **kwargs)
Rescales the inputs to a new range.
Rescale rescales every value of the inputs (often the images) by the equation: y = x * scale + offset.
- Parameters:
scale (int|float) – The scale to apply to the inputs.
offset (int|float, optional) – The offset to apply to the inputs. Defaults to 0.0.
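Two common choices of scale and offset, shown with a hypothetical rescale helper that applies the equation to a list of values:

```python
def rescale(values, scale, offset=0.0):
    """Apply y = x * scale + offset to every value."""
    return [v * scale + offset for v in values]


# Map [0, 255] to [0, 1], and [0, 255] to [-1, 1].
print(rescale([0, 51, 255], scale=1.0 / 255.0))
print(rescale([0, 255], scale=1.0 / 127.5, offset=-1.0))
```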
References
Utility
- class keras_aug.layers.SanitizeBoundingBox(min_size, bounding_box_format, **kwargs)
Removes degenerate/invalid bounding boxes.
- Parameters:
min_size (int) – The minimum size of the smaller side of bounding boxes.
bounding_box_format (str) – The format of bounding boxes of the input dataset. Refer to https://github.com/james77777778/keras-aug/blob/main/keras_aug/datapoints/bounding_box/converter.py for more details on supported bounding box formats.
References
Base 2D
- class keras_aug.layers.VectorizedBaseRandomLayer(seed=None, **kwargs)
Abstract base layer for vectorized image augmentation.
This layer contains base functionalities for preprocessing layers which augment image-related data, e.g. images and, in the future, labels and bounding boxes. Subclasses can avoid certain mistakes and reduce code duplication.
This layer requires you to implement one method, augment_images(), which augments the images during training. There are a few additional methods that you can implement for added functionality on the layer:
- augment_ragged_image() and compute_ragged_image_signature(), which handle ragged image augmentation, if the layer supports that.
- augment_labels(), which handles label augmentation, if the layer supports that.
- augment_bounding_boxes(), which handles the bounding box augmentation, if the layer supports that.
- augment_keypoints(), which handles the keypoints augmentation, if the layer supports that.
- augment_segmentation_masks(), which handles the segmentation masks augmentation, if the layer supports that.
- augment_custom_annotations(), which handles the custom annotations augmentation, if the layer supports that. This is useful for implementing augmentation of special annotations.
- get_random_transformations(), which should produce a batch of random transformation settings. The transformation object, which must be a batched Tensor or a dictionary where each input is a batched Tensor, will be passed to augment_images, augment_labels and augment_bounding_boxes to coordinate the randomness behavior, e.g. in the RandomFlip layer, the image and bounding_boxes should be changed in the same way.
The call() method supports two formats of inputs:
1. A single image tensor in 3D (HWC) or 4D (NHWC) format.
2. A dict of tensors with stable keys. The supported keys are "images", "labels", "bounding_boxes", "segmentation_masks", "keypoints" and "custom_annotations" at the moment. We might add more keys in the future when we support more types of augmentation.
The output of call() will be in one of these two formats, with the same structure as the inputs.
The call() method will unpack the inputs, forward them to the correct function, and pack the output back into the same structure as the inputs.
By default, the dense or ragged status of the output will be preserved. However, you can override this behavior by setting self.force_output_dense_images = True in your __init__() method. When enabled, images and segmentation masks will be converted to dense tensors by to_tensor() if ragged.
    class SubclassLayer(VectorizedBaseRandomLayer):
        def __init__(self):
            super().__init__()
            self.force_output_dense_images = True
Note that since randomness is also a common functionality, this layer also includes a keras.src.backend.RandomGenerator, which can be used to produce random numbers. The random number generator is stored in the self._random_generator attribute.
References
Base 3D
WIP