import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
The TensorFlow framework has two components:
This study note focuses on defining computational graphs, which corresponds to the defining stage in the workflow diagram.
A computational graph is a type of directed graph where:
Building computational graphs requires three elements:
- keras.Input(): start the model by defining an Input object
  - shape=() has to be defined at the input layers
- layers: chain layer calls to specify the model's forward pass
  - shape=() has to be defined at the output layers
- keras.Model(): groups layers into an object with training and inference features
  - inputs=[]
  - outputs=[]
Feature | Sequential API | Functional API |
---|---|---|
description | plain stack of layers | complex graph topologies |
model/layers inputs | one | multiple |
model/layers outputs | one | multiple |
layer sharing | False | True |
topology | linear | non-linear |
special usage | - | residual connections, multi-branch models |
model = keras.Sequential()
model.add(keras.Input(shape=(4,)))
model.add(layers.Dense(2, activation="relu"))
keras.utils.plot_model(model, "sequential_model.png", show_shapes=True)
main_input = keras.Input(shape=(4,), name="main_input")
aux_input = keras.Input(shape=(2,), name="aux_input")
x = layers.concatenate([main_input, aux_input])
main_output = layers.Dense(1, name="main_output")(x)
aux_output = layers.Dense(2, name="aux_output")(x)
model = keras.Model(
    inputs=[main_input, aux_input],
    outputs=[main_output, aux_output],
)
keras.utils.plot_model(model, "multi_input_and_output_model.png", show_shapes=True)
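
The comparison table lists layer sharing and residual connections as things only the Functional API supports. The following is a minimal sketch of both, not taken from the original note: the layer names, sizes, and the variables `residual_model` and `shared_model` are illustrative assumptions.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Residual connection: add the block's input back onto its output.
inputs = keras.Input(shape=(8,), name="features")
hidden = layers.Dense(8, activation="relu", name="hidden")(inputs)
residual = layers.add([inputs, hidden])  # both operands have shape (None, 8)
outputs = layers.Dense(1, name="prediction")(residual)
residual_model = keras.Model(inputs=inputs, outputs=outputs)

# Layer sharing: one Dense instance applied to two inputs reuses one set of weights.
left = keras.Input(shape=(8,), name="left")
right = keras.Input(shape=(8,), name="right")
shared_dense = layers.Dense(4, name="shared_dense")
merged = layers.concatenate([shared_dense(left), shared_dense(right)])
shared_model = keras.Model(inputs=[left, right], outputs=merged)

batch = np.ones((3, 8), dtype="float32")
residual_out = residual_model(batch)       # shape (3, 1)
shared_out = shared_model([batch, batch])  # shape (3, 8)
```

Because `shared_dense` is a single layer object, `shared_model` holds only one kernel and one bias even though the layer appears twice in the graph.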
Layers are functions with a known mathematical structure that can be reused and have trainable variables.
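The Layer class defines __call__(self, inputs, **kwargs), so a layer instance is callable and can be written with two consecutive pairs of parentheses: the first pair constructs the layer and the second applies it to an input.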
x = layers.Dense(64, activation='relu')(x)
is equivalent to:
fc = layers.Dense(64, activation='relu')
x = fc(x)
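
To make the callable behaviour concrete, here is a small sketch with a hypothetical custom layer named `Scale` (not part of Keras, introduced only for illustration). Subclasses implement `call`, and calling the instance routes through the base class `__call__`.

```python
import tensorflow as tf
from tensorflow.keras import layers


class Scale(layers.Layer):
    """Hypothetical layer that multiplies its input by a fixed factor."""

    def __init__(self, factor, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, inputs):
        # Invoked by Layer.__call__ when the instance is applied to a tensor.
        return inputs * self.factor


x = tf.constant([[1.0, 2.0], [3.0, 4.0]])

# Construct and apply in one expression.
one_step = Scale(2.0)(x)

# Construct first, apply later; the result is the same.
scale = Scale(2.0)
two_steps = scale(x)
```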
Since the input shape is the only one you need to define, Keras will demand it in the first layer. But in this definition, Keras ignores the first dimension, which is the batch size. Your model should be able to deal with any batch size, so you define only the other dimensions:
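- input_shape=(50,50,3): the shape of a single sample, as passed to the first layer
- (30,50,50,3): the shape of the actual input data for a batch of 30 such samples
- (None,50,50,3): the shape Keras reports for input_shape=(50,50,3), where None stands for the batch size, which can vary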
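
The batch dimension can be checked directly. This sketch assumes a trivial model with a single Flatten layer, chosen only so that the shapes are easy to follow.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Only the per-sample shape is declared; the batch size is left open.
inputs = keras.Input(shape=(50, 50, 3))
outputs = layers.Flatten()(inputs)
model = keras.Model(inputs, outputs)

symbolic_shape = tuple(inputs.shape)  # (None, 50, 50, 3)

# The same model accepts batches of different sizes.
out_30 = model(np.zeros((30, 50, 50, 3), dtype="float32"))  # shape (30, 7500)
out_5 = model(np.zeros((5, 50, 50, 3), dtype="float32"))    # shape (5, 7500)
```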
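Each type of layer requires input with a certain number of dimensions:
- Dense layers: (batch_size, input_size) or (batch_size, optional,...,optional, input_size)
- 2D convolutional layers with channels_last: (batch_size, imageside1, imageside2, channels)
- 2D convolutional layers with channels_first: (batch_size, channels, imageside1, imageside2)
- 1D convolutional and recurrent layers: (batch_size, sequence_length, features)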
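
The expected input ranks can be tried out eagerly. The sketch below assumes the default channels_last data format; all sizes are arbitrary illustrative values.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Dense acts on the last axis; any leading axes are kept.
dense_out = layers.Dense(4)(tf.zeros((8, 16)))         # (8, 4)
dense_nd_out = layers.Dense(4)(tf.zeros((8, 10, 16)))  # (8, 10, 4)

# Conv2D expects (batch, height, width, channels) with channels_last.
conv2d_out = layers.Conv2D(6, 3, padding="same")(tf.zeros((8, 28, 28, 3)))  # (8, 28, 28, 6)

# Conv1D and recurrent layers expect (batch, sequence_length, features).
conv1d_out = layers.Conv1D(6, 3, padding="same")(tf.zeros((8, 12, 7)))  # (8, 12, 6)
lstm_out = layers.LSTM(5)(tf.zeros((8, 12, 7)))                         # (8, 5)
```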
Two TensorFlow modules provide layer APIs at different levels of abstraction. Using 2D convolution as an example for comparison:
- tf.nn: wrappers for primitive neural net (NN) operations. This lower-level API is there for people with special needs, or who wish to keep finer control of what is going on.
  - tf.nn.conv2d: weights, biases, regularization and activation have to be declared manually
- tf.keras.layers: high-level wrapper built upon tf.nn. This higher-level API provides functions that greatly simplify the design of the most common neural nets.
  - tf.keras.layers.Conv2D: a single line of code creates a convolutional layer, with default settings for weights, biases, regularization and activation

Source of layer illustrations: https://raw.githubusercontent.com/rstudio/cheatsheets/master/keras.pdf
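
The difference between the two API levels can be sketched side by side. The tensor sizes and the variable names below are illustrative assumptions; copying the hand-made weights into the Keras layer is only done to show that both paths compute the same thing.

```python
import tensorflow as tf
from tensorflow.keras import layers

images = tf.random.normal((2, 8, 8, 3))  # (batch, height, width, channels)

# Low level: create the kernel and bias yourself, then apply the activation.
kernel = tf.Variable(tf.random.normal((3, 3, 3, 4), stddev=0.1))  # (h, w, in, out)
bias = tf.Variable(tf.zeros((4,)))
low_level = tf.nn.relu(
    tf.nn.conv2d(images, kernel, strides=1, padding="SAME") + bias
)

# High level: one line; weights are created and initialized by the layer.
conv = layers.Conv2D(filters=4, kernel_size=3, padding="same", activation="relu")
high_level = conv(images)

# With identical weights the two paths agree.
conv.set_weights([kernel.numpy(), bias.numpy()])
matched = conv(images)
```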
layer | description | illustration |
---|---|---|
tf.keras.layers.InputLayer | Layer to be used as an entry point into a Network (a graph of layers) | |
tf.keras.layers.Dense | regular densely-connected NN layer | |
tf.keras.layers.Activation | Apply an activation function to an output | |
tf.keras.layers.Dropout | Applies dropout to the input | |
tf.keras.layers.Reshape | Reshapes an output to a certain shape | |
tf.keras.layers.Permute | Permute the dimensions of an input according to a given pattern | |
tf.keras.layers.RepeatVector | Repeats the input n times | |
tf.keras.layers.Lambda | Wraps arbitrary expression as a layer | |
tf.keras.layers.ActivityRegularization | Layer that applies an update to the cost function based on input activity | |
tf.keras.layers.Masking | Masks a sequence by using a mask value to skip timesteps | |
tf.keras.layers.Flatten | Flattens an input | |
tf.keras.layers.Concatenate | Concatenates a list of inputs | |
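
Several of the core layers only rearrange data, which is easiest to see from the resulting shapes. The input tensor below is an arbitrary example.

```python
import tensorflow as tf
from tensorflow.keras import layers

# A batch of 2 samples, each of shape (3, 4), filled with 0..23.
x = tf.reshape(tf.range(24, dtype=tf.float32), (2, 3, 4))

flat = layers.Flatten()(x)                    # (2, 12)
reshaped = layers.Reshape((4, 3))(x)          # (2, 4, 3), element order kept
permuted = layers.Permute((2, 1))(x)          # (2, 4, 3), axes 1 and 2 swapped
repeated = layers.RepeatVector(5)(flat)       # (2, 5, 12)
joined = layers.Concatenate(axis=-1)([x, x])  # (2, 3, 8)
```

Reshape and Permute produce the same shape here but different contents: Reshape keeps the flattened element order, while Permute transposes the per-sample axes.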
layer | description | illustration |
---|---|---|
tf.keras.layers.Conv1D | 1D convolution, e.g. temporal convolution | |
tf.keras.layers.Conv1DTranspose | Transposed 1D convolution (sometimes called deconvolution) | |
tf.keras.layers.Conv2D | 2D convolution, e.g. spatial convolution over images | |
tf.keras.layers.Conv2DTranspose | Transposed 2D convolution (deconvolution) | |
tf.keras.layers.Conv3D | 3D convolution, e.g. spatial convolution over volumes | |
tf.keras.layers.Conv3DTranspose | Transposed 3D convolution (deconvolution) | |
tf.keras.layers.ConvLSTM2D | Convolutional LSTM | |
tf.keras.layers.SeparableConv1D, tf.keras.layers.SeparableConv2D | Depthwise separable 1D and 2D convolution | |
tf.keras.layers.UpSampling1D, tf.keras.layers.UpSampling2D, tf.keras.layers.UpSampling3D | Upsampling layer | |
tf.keras.layers.ZeroPadding1D, tf.keras.layers.ZeroPadding2D, tf.keras.layers.ZeroPadding3D | Zero-padding layer | |
tf.keras.layers.Cropping1D, tf.keras.layers.Cropping2D, tf.keras.layers.Cropping3D | Cropping layer | |
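
How the 2D variants of these layers change spatial size can be sketched as follows. The 8x8 input and the hyperparameters are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.zeros((1, 8, 8, 3))  # one 8x8 image with 3 channels

# A strided convolution halves the spatial size; the transposed one restores it.
conv = layers.Conv2D(4, kernel_size=3, strides=2, padding="same")(x)                # (1, 4, 4, 4)
deconv = layers.Conv2DTranspose(3, kernel_size=3, strides=2, padding="same")(conv)  # (1, 8, 8, 3)

separable = layers.SeparableConv2D(4, kernel_size=3, padding="same")(x)  # (1, 8, 8, 4)

# Layers without weights that only change the spatial extent.
upsampled = layers.UpSampling2D(size=2)(x)    # (1, 16, 16, 3)
padded = layers.ZeroPadding2D(padding=1)(x)   # (1, 10, 10, 3)
cropped = layers.Cropping2D(cropping=1)(x)    # (1, 6, 6, 3)
```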
layer | description |
---|---|
tf.keras.layers.MaxPool1D, tf.keras.layers.MaxPool2D, tf.keras.layers.MaxPool3D | Maximum pooling for 1D to 3D |
tf.keras.layers.AveragePooling1D, tf.keras.layers.AveragePooling2D, tf.keras.layers.AveragePooling3D | Average pooling for 1D to 3D |
tf.keras.layers.GlobalMaxPool1D, tf.keras.layers.GlobalMaxPool2D, tf.keras.layers.GlobalMaxPool3D | Global maximum pooling |
tf.keras.layers.GlobalAveragePooling1D, tf.keras.layers.GlobalAveragePooling2D, tf.keras.layers.GlobalAveragePooling3D | Global average pooling |
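
A small worked example makes the difference between windowed and global pooling visible. The 4x4 input holding the values 0 to 15 is an assumption chosen so the results can be checked by hand.

```python
import tensorflow as tf
from tensorflow.keras import layers

# One 4x4 single-channel image holding 0..15 row by row.
x = tf.reshape(tf.range(16, dtype=tf.float32), (1, 4, 4, 1))

# Windowed pooling: each 2x2 window is reduced to one value.
max_pooled = layers.MaxPool2D(pool_size=2)(x)         # [[5, 7], [13, 15]]
avg_pooled = layers.AveragePooling2D(pool_size=2)(x)  # [[2.5, 4.5], [10.5, 12.5]]

# Global pooling: the whole spatial extent is reduced to one value per channel.
global_max = layers.GlobalMaxPool2D()(x)         # [[15.]]
global_avg = layers.GlobalAveragePooling2D()(x)  # [[7.5]]
```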
layer | description |
---|---|
tf.keras.layers.Activation | Applies an activation function to an output |
tf.keras.layers.LeakyReLU | Leaky version of a rectified linear unit |
tf.keras.layers.PReLU | Parametric rectified linear unit |
tf.keras.layers.ThresholdedReLU | Thresholded rectified linear unit |
tf.keras.layers.ELU | Exponential linear unit |
tf.keras.layers.Softmax | Softmax activation function |
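
The activation layers can be compared on a single small input. This sketch relies on the layers' default parameters (a negative slope of 0.3 for LeakyReLU and alpha of 1.0 for ELU), which are not stated in the note itself.

```python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.constant([[-2.0, 0.0, 3.0]])

relu = layers.Activation("relu")(x)  # negative values become 0
leaky = layers.LeakyReLU()(x)        # negative values are scaled by the default slope 0.3
elu = layers.ELU()(x)                # negative values become exp(x) - 1 with the default alpha
softmax = layers.Softmax()(x)        # values along the last axis sum to 1
```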
layer | description |
---|---|
tf.keras.layers.Dropout | Applies dropout to the input |
tf.keras.layers.SpatialDropout1D, tf.keras.layers.SpatialDropout2D, tf.keras.layers.SpatialDropout3D | Spatial 1D to 3D version of dropout |
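
Dropout behaves differently during training and inference, which the `training` argument of the layer call controls. A minimal sketch with an arbitrary input of ones:

```python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.ones((4, 10))
drop = layers.Dropout(rate=0.5)

# Inference: dropout is a no-op.
at_inference = drop(x, training=False)

# Training: each unit is zeroed with probability 0.5 and the survivors
# are scaled by 1 / (1 - rate) = 2 so the expected sum is unchanged.
at_training = drop(x, training=True)
```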
layer | description |
---|---|
tf.keras.layers.SimpleRNN | Fully-connected RNN where the output is to be fed back to input |
tf.keras.layers.GRU | Gated recurrent unit (Cho et al., 2014) |
tf.keras.layers.LSTM | Long Short-Term Memory unit (Hochreiter & Schmidhuber, 1997) |
tf.keras.layers.ConvLSTM1D, tf.keras.layers.ConvLSTM2D, tf.keras.layers.ConvLSTM3D | Similar to an LSTM layer, but the input transformations and recurrent transformations are both convolutional |
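
Recurrent layers can return only the last step, the full sequence, or the final states, depending on the `return_sequences` and `return_state` arguments. The sizes in this sketch are arbitrary.

```python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.zeros((2, 6, 3))  # (batch, sequence_length, features)

# Default: only the output of the last time step is returned.
last_step = layers.SimpleRNN(4)(x)                   # (2, 4)

# return_sequences=True: one output per time step.
sequence = layers.GRU(4, return_sequences=True)(x)   # (2, 6, 4)

# return_state=True: an LSTM also returns its hidden and cell states.
output, hidden_state, cell_state = layers.LSTM(4, return_state=True)(x)
```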
layer | description |
---|---|
tf.keras.layers.LocallyConnected1D, tf.keras.layers.LocallyConnected2D | Similar to convolution, but weights are not shared, i.e. different filters for each patch |
layer | description |
---|---|
tf.keras.layers.Dot | Layer that computes a dot product between samples in two tensors. |
tf.linalg.matmul | Multiplies matrix a by matrix b, producing a * b |
tf.keras.layers.Attention | Dot-product attention layer, a.k.a. Luong-style attention |
tf.keras.layers.AdditiveAttention | Additive attention layer, a.k.a. Bahdanau-style attention. |
tf.keras.layers.MultiHeadAttention | MultiHeadAttention layer |
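
The shapes produced by these operations can be sketched as below. The query and value sizes and the MultiHeadAttention hyperparameters are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Dot product between matching samples of two batches of vectors.
dot = layers.Dot(axes=1)([tf.ones((2, 3)), tf.ones((2, 3))])  # (2, 1), each value 3

# Plain matrix multiplication.
product = tf.linalg.matmul(tf.ones((2, 3)), tf.ones((3, 4)))  # (2, 4), each value 3

# Attention: 5 query positions attend over 7 value positions.
query = tf.random.normal((2, 5, 8))
value = tf.random.normal((2, 7, 8))
luong = layers.Attention()([query, value])             # (2, 5, 8)
bahdanau = layers.AdditiveAttention()([query, value])  # (2, 5, 8)
multi_head = layers.MultiHeadAttention(num_heads=2, key_dim=4)(query, value)  # (2, 5, 8)
```

In all three attention variants the output has one row per query position, so its sequence length follows the query rather than the value.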
layer | description |
---|---|
tf.keras.layers.Add | Layer that adds (element-wise) a list of inputs. |
tf.keras.layers.Subtract | Layer that subtracts (element-wise) two inputs |
tf.keras.layers.Multiply | Layer that multiplies (element-wise) a list of inputs. |
tf.keras.layers.Maximum | Layer that computes the maximum (element-wise) of a list of inputs |
tf.keras.layers.Minimum | Layer that computes the minimum (element-wise) of a list of inputs |
function | description |
---|---|
tf.keras.layers.add | function that adds (element-wise) a list of inputs. |
tf.keras.layers.subtract | function that subtracts (element-wise) two inputs |
tf.keras.layers.multiply | function that multiplies (element-wise) a list of inputs. |
tf.keras.layers.maximum | function that computes the maximum (element-wise) of a list of inputs |
tf.keras.layers.minimum | function that computes the minimum (element-wise) of a list of inputs |
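
The capitalized names are layer classes and the lowercase names are functional shortcuts for them. A brief sketch on two arbitrary tensors:

```python
import tensorflow as tf
from tensorflow.keras import layers

a = tf.constant([[1.0, 5.0]])
b = tf.constant([[4.0, 2.0]])

# Layer class: instantiate, then call on a list of inputs.
added_by_layer = layers.Add()([a, b])  # [[5., 7.]]

# Functional shortcut: creates the layer and applies it in one call.
added_by_function = layers.add([a, b])  # [[5., 7.]]

subtracted = layers.subtract([a, b])  # [[-3., 3.]]
multiplied = layers.multiply([a, b])  # [[4., 10.]]
largest = layers.maximum([a, b])      # [[4., 5.]]
smallest = layers.minimum([a, b])     # [[1., 2.]]
```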
function | description |
---|---|
tf.math.reduce_sum | Computes the sum of elements across dimensions of a tensor. |
tf.math.reduce_prod | Computes tf.math.multiply of elements across dimensions of a tensor. |
tf.math.reduce_mean | Computes the mean of elements across dimensions of a tensor. |
tf.math.reduce_max | Computes tf.math.maximum of elements across dimensions of a tensor. |
tf.math.reduce_min | Computes the tf.math.minimum of elements across dimensions of a tensor. |
tf.math.reduce_variance | Computes the variance of elements across dimensions of a tensor. |
tf.math.reduce_std | Computes the standard deviation of elements across dimensions of a tensor. |
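
The `axis` and `keepdims` arguments decide which dimensions a reduction collapses. A short worked example on an arbitrary 2x2 tensor:

```python
import tensorflow as tf

x = tf.constant([[1.0, 2.0],
                 [3.0, 4.0]])

total = tf.math.reduce_sum(x)                 # 10.0, all axes reduced
column_sums = tf.math.reduce_sum(x, axis=0)   # [4., 6.]
row_means = tf.math.reduce_mean(x, axis=1)    # [1.5, 3.5]
product = tf.math.reduce_prod(x)              # 24.0
largest = tf.math.reduce_max(x)               # 4.0
smallest = tf.math.reduce_min(x)              # 1.0
variance = tf.math.reduce_variance(x)         # 1.25 (population variance)
std = tf.math.reduce_std(x)                   # sqrt(1.25)

# keepdims=True keeps the reduced axis with length 1.
kept = tf.math.reduce_sum(x, axis=1, keepdims=True)  # shape (2, 1)
```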