
DL Architectures

[Deep Learning Architecture - IBM]

- Overview

While DL is certainly not new, it is experiencing explosive growth because of the intersection of deeply layered neural networks and the use of GPUs to accelerate their execution. 

Big data has also fed this growth. Because DL relies on training neural networks with example data and rewarding them based on their success, the more data available, the better these deep learning structures can be built.

The range of architectures and algorithms used in DL is wide and varied. The artificial neural network (ANN) is the underlying architecture behind DL, and several algorithmic variations have been invented based on it.

DL models, built from architectures such as recurrent neural networks (RNN), convolutional neural networks (CNN), and deep belief networks (DBN), are structured to learn from complex data and to make predictions or classifications.

The architecture of a DL model typically consists of several interconnected layers, each serving a specific purpose in processing and transforming input data to generate useful outputs. 
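
As a minimal sketch of this layered structure, the following Python snippet (assuming the PyTorch library; the layer sizes and activation choices are illustrative assumptions, not taken from any particular model) chains a few fully connected layers so that each layer transforms the output of the previous one:

    import torch
    import torch.nn as nn

    # A minimal stack of interconnected layers: each layer transforms
    # the output of the previous one (sizes here are arbitrary).
    model = nn.Sequential(
        nn.Linear(784, 128),  # input layer -> hidden layer
        nn.ReLU(),            # non-linear activation
        nn.Linear(128, 64),   # hidden layer -> hidden layer
        nn.ReLU(),
        nn.Linear(64, 10),    # hidden layer -> output layer (e.g., 10 classes)
    )

    x = torch.randn(32, 784)  # a batch of 32 flattened inputs
    logits = model(x)         # forward pass produces a (32, 10) output
    print(logits.shape)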

 

- Recurrent Neural Network (RNN)

A recurrent neural network (RNN) is a deep learning model that converts sequential data input into a specific sequential data output. RNNs are a type of artificial neural network that use sequential data to solve temporal problems. They are characterized by their memory: they use information from previous inputs to influence the current output.

RNNs are commonly used for language translation, natural language processing (NLP), speech recognition, and image captioning, and they underpin products such as Siri, voice search, and Google Translate.

RNNs are derived from feedforward networks and exhibit behavior loosely analogous to how human brains function. They produce predictive results on sequential data that other algorithms can't. RNNs were created because feedforward neural networks can't handle sequential data: they consider only the current input and can't memorize previous inputs.
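
A minimal sketch of this idea in Python, assuming PyTorch (the input, hidden, batch, and sequence sizes are arbitrary assumptions), shows how the hidden state carries information from earlier time steps forward:

    import torch
    import torch.nn as nn

    # Hypothetical sizes: 10-dimensional inputs, 20-dimensional hidden state.
    rnn = nn.RNN(input_size=10, hidden_size=20, batch_first=True)

    x = torch.randn(4, 7, 10)   # batch of 4 sequences, 7 time steps each
    h0 = torch.zeros(1, 4, 20)  # initial hidden state (the network's "memory")
    output, hn = rnn(x, h0)     # the hidden state carries info across steps
    print(output.shape)         # (4, 7, 20): one output per time step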

 

- Convolutional Neural Network (CNN)

A convolutional neural network (CNN) is a network architecture for deep learning (DL) that learns directly from data.

CNNs are particularly useful for finding patterns in images to recognize objects, classes, and categories. They can also be quite effective for classifying audio, time-series, and signal data.

A CNN can have tens or hundreds of layers that each learn to detect different features of an image. Filters are applied to each training image at different resolutions, and the output of each convolved image is used as the input to the next layer. The filters can start as very simple features, such as brightness and edges, and increase in complexity to features that uniquely define the object.

A CNN is an extended version of the artificial neural network (ANN), used predominantly to extract features from grid-like data, for example visual datasets such as images or videos, where spatial patterns play an extensive role.

A CNN consists of multiple layers: an input layer, convolutional layers, pooling layers, and fully connected layers.

The convolutional layer applies filters to the input image to extract features, the pooling layer downsamples the image to reduce computation, and the fully connected layer makes the final prediction. The network learns the optimal filters through backpropagation and gradient descent.
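
The following sketch, assuming PyTorch and an illustrative 28x28 grayscale input (the filter counts and layer sizes are assumptions), shows this layer sequence; in practice the filters would be learned by backpropagation during training:

    import torch
    import torch.nn as nn

    # A minimal CNN: convolution extracts features, pooling downsamples,
    # and a fully connected layer makes the final prediction.
    model = nn.Sequential(
        nn.Conv2d(1, 16, kernel_size=3, padding=1),  # convolutional layer
        nn.ReLU(),
        nn.MaxPool2d(2),                             # pooling: 28x28 -> 14x14
        nn.Conv2d(16, 32, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),                             # pooling: 14x14 -> 7x7
        nn.Flatten(),
        nn.Linear(32 * 7 * 7, 10),                   # fully connected layer
    )

    x = torch.randn(8, 1, 28, 28)  # batch of 8 single-channel images
    print(model(x).shape)          # (8, 10) class scores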

 

- Deep Belief Networks (DBNs)

Deep Belief Networks (DBNs) are sophisticated artificial neural networks used in the field of deep learning, a subset of machine learning. They are designed to discover and learn patterns within large sets of data automatically. Imagine them as multi-layered networks, where each layer is capable of making sense of the information received from the previous one, gradually building up a complex understanding of the overall data. 

DBNs are composed of multiple layers of stochastic, or randomly determined, units. These layers are typically built from Restricted Boltzmann Machines (RBMs) or similar structures. Each layer in a DBN aims to extract different features from the input data, with lower layers identifying basic patterns and higher layers recognizing more abstract concepts.

This structure allows DBNs to effectively learn complex representations of data, which makes them particularly useful for tasks like image and speech recognition, where the input data is high-dimensional and requires a deep level of understanding. 

The architecture of DBNs also makes them good at unsupervised learning, where the goal is to understand and label input data without explicit guidance. This characteristic is particularly useful when labeled data is scarce or when the goal is to explore the structure of the data without any preconceived labels.
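
The sketch below is a minimal single-RBM example in Python/NumPy, trained with one-step contrastive divergence (CD-1) on toy binary data (all sizes and hyperparameters are illustrative assumptions). It illustrates the kind of layer a DBN stacks: a full DBN would train one such layer, then feed its hidden activations as the input to the next RBM.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # A single RBM: 6 visible units, 3 hidden units (sizes are assumptions).
    n_vis, n_hid = 6, 3
    W = rng.normal(0, 0.1, (n_vis, n_hid))
    b_vis = np.zeros(n_vis)
    b_hid = np.zeros(n_hid)

    v0 = rng.integers(0, 2, (10, n_vis)).astype(float)  # toy binary data

    lr = 0.1
    for _ in range(100):
        # Positive phase: hidden probabilities and samples from the data.
        p_h0 = sigmoid(v0 @ W + b_hid)
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
        # Negative phase: one step of Gibbs sampling (CD-1).
        p_v1 = sigmoid(h0 @ W.T + b_vis)
        p_h1 = sigmoid(p_v1 @ W + b_hid)
        # Contrastive-divergence updates for weights and biases.
        W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / len(v0)
        b_vis += lr * (v0 - p_v1).mean(axis=0)
        b_hid += lr * (p_h0 - p_h1).mean(axis=0)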

 


- Autoencoders

An autoencoder is a type of artificial neural network that uses unsupervised learning to learn an efficient coding (representation) of unlabeled data, typically for the purpose of dimensionality reduction.

The network is trained to compress the input into a lower-dimensional code and then reconstruct the output from this representation to match the original input as closely as possible, hence the name 'autoencoder'. 

The architecture of an autoencoder is designed to be symmetrical with a bottleneck in the middle. The network consists of two main parts:

  • Encoder: This is the part of the network that compresses the input into a latent-space representation. It encodes the input data as a compressed representation in a reduced dimension. The encoder layer maps the input data to the hidden layer.
  • Decoder: This part aims to reconstruct the input data from the latent space representation. The decoder layer maps the hidden layer to the reconstruction of the input data.

The bottleneck, which is the layer that contains the code, represents the compressed knowledge of the input data. The key idea is that the autoencoder learns to ignore the noise and capture the most salient features of the data.
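
A minimal sketch of this symmetrical encoder-bottleneck-decoder structure, assuming PyTorch (the 784-32-784 layer sizes and the training hyperparameters are arbitrary assumptions), trains the network to reconstruct its own input:

    import torch
    import torch.nn as nn

    # A symmetrical autoencoder with a 32-dimensional bottleneck.
    encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))
    decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))

    model = nn.Sequential(encoder, decoder)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    x = torch.rand(64, 784)       # a batch of unlabeled inputs
    for _ in range(5):            # a few illustrative training steps
        recon = model(x)          # encode to the 32-dim code, then reconstruct
        loss = loss_fn(recon, x)  # reconstruction error vs. the original input
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    code = encoder(x)             # the compressed latent-space representation
    print(code.shape)             # (64, 32)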

 

- Self Organizing Maps

The concept of the self-organizing map (SOM) was first proposed by Teuvo Kohonen. A SOM is an unsupervised neural network that reduces data dimensionality: it is trained to build a low-dimensional, discretized representation of the input space of the training samples. This representation is known as a map.

The SOM was influenced by biological models of neural systems from the 1970s. It employs an unsupervised learning methodology and uses a competitive learning algorithm to train its network.

To make complex data easier to interpret, SOMs are used for mapping and clustering (or dimensionality reduction) procedures that project multidimensional data onto a lower-dimensional space. A SOM is made up of two layers, the input layer and the output layer, and is also known as the Kohonen map, as the sketch below illustrates.
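
Here is a minimal sketch of SOM training in Python/NumPy (the grid size, learning rate, neighborhood width, and their decay schedules are all illustrative assumptions). It shows the competitive step, finding the best-matching unit, followed by the neighborhood update that gradually organizes the map:

    import numpy as np

    rng = np.random.default_rng(0)

    # A tiny SOM: a 5x5 output grid mapping 3-dimensional inputs.
    grid_h, grid_w, dim = 5, 5, 3
    weights = rng.random((grid_h, grid_w, dim))
    coords = np.stack(np.meshgrid(np.arange(grid_h), np.arange(grid_w),
                                  indexing="ij"), axis=-1)

    data = rng.random((200, dim))  # toy training samples

    lr, sigma = 0.5, 1.5
    for x in data:
        # Competitive step: find the best-matching unit (BMU).
        dists = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # Cooperative step: pull the BMU and its grid neighbors toward x.
        grid_dist = np.linalg.norm(coords - np.array(bmu), axis=-1)
        h = np.exp(-(grid_dist ** 2) / (2 * sigma ** 2))
        weights += lr * h[..., None] * (x - weights)
        lr *= 0.99      # decay the learning rate
        sigma *= 0.99   # shrink the neighborhood over time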
 
 

[More to come ...]

 
