## **1. Input Layers**
* Usage: Receive input data, propagate it to subsequent layers
* Description: The first layer in a neural network that receives input data
* Strengths: Essential for processing input data, easy to implement
* Weaknesses: Limited functionality, no learning occurs in this layer
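
As a minimal sketch, here is how an input layer can be declared with the Keras functional API (Keras and the 784-feature shape are illustrative choices, not prescribed by this page):

```python
from tensorflow import keras

# The Input "layer" only declares the expected shape of each sample;
# it holds no weights and performs no computation.
inputs = keras.Input(shape=(784,))            # batch of 784-dimensional vectors
outputs = keras.layers.Dense(10)(inputs)      # downstream layers consume it
model = keras.Model(inputs, outputs)
model.summary()
```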
## **2. Dense Layers (Fully Connected Layers)**
* Usage: Feature extraction, classification, regression
* Description: A layer in which every input unit is connected to every output unit through a learned weight matrix and bias
* Strengths: Excellent for feature extraction, easy to implement, fast computation
* Weaknesses: Can be prone to overfitting, computationally expensive for large inputs
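
A minimal dense-layer sketch (the unit count and input shape are arbitrary):

```python
import numpy as np
from tensorflow import keras

# Fully connected layer: output = activation(inputs @ W + b)
dense = keras.layers.Dense(units=64, activation="relu")
x = np.random.rand(32, 128).astype("float32")   # 32 samples, 128 features each
y = dense(x)
print(y.shape)                                   # (32, 64)
```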
## **3. Convolutional Layers (Conv Layers)**
* Usage: Image classification, object detection, image segmentation
* Description: A layer that applies filters to small regions of the input data, scanning the input data horizontally and vertically
* Strengths: Excellent for image processing, reduces spatial dimensions, retains spatial hierarchy
* Weaknesses: Computationally expensive, requires large datasets
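
For illustration, a 2D convolution over a batch of grayscale images (filter count, kernel size, and image size are made-up values):

```python
import numpy as np
from tensorflow import keras

# 16 filters of size 3x3 slide across the image; each produces one feature map.
conv = keras.layers.Conv2D(filters=16, kernel_size=3, padding="same", activation="relu")
x = np.random.rand(8, 28, 28, 1).astype("float32")   # 8 grayscale 28x28 images
y = conv(x)
print(y.shape)                                        # (8, 28, 28, 16)
```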
## **4. Pooling Layers (Downsampling Layers)**
* Usage: Image classification, object detection, image segmentation
* Description: A layer that reduces spatial dimensions by taking the maximum or average value across a region
* Strengths: Reduces spatial dimensions, reduces number of parameters, retains important features
* Weaknesses: Loses some information, can be sensitive to hyperparameters
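
A max-pooling sketch with 2x2 windows (input shape is illustrative):

```python
import numpy as np
from tensorflow import keras

# Max pooling keeps the largest value in each 2x2 window, halving height and width.
pool = keras.layers.MaxPooling2D(pool_size=2)
x = np.random.rand(8, 28, 28, 16).astype("float32")
y = pool(x)
print(y.shape)   # (8, 14, 14, 16): spatial size halved, channel count unchanged
```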
## **5. Recurrent Layers (RNNs)**
* Usage: Natural Language Processing (NLP), sequence prediction, time series forecasting
* Description: A layer that processes sequential data step by step, using a hidden state to capture temporal dependencies
* Strengths: Excellent for sequential data, can model long-term dependencies
* Weaknesses: Suffers from vanishing gradients, difficult to train, computationally expensive
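
A minimal SimpleRNN sketch (sequence length, feature size, and hidden size are illustrative):

```python
import numpy as np
from tensorflow import keras

# The RNN reads each sequence step by step, updating a hidden state of size 32.
rnn = keras.layers.SimpleRNN(units=32)            # returns the final hidden state
x = np.random.rand(4, 10, 8).astype("float32")    # 4 sequences, 10 steps, 8 features
y = rnn(x)
print(y.shape)                                    # (4, 32)
```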
## **6. Long Short-Term Memory (LSTM) Layers**
* Usage: NLP, sequence prediction, time series forecasting
* Description: A type of RNN that uses memory cells to learn long-term dependencies
* Strengths: Excellent for sequential data, can model long-term dependencies, mitigates vanishing gradients
* Weaknesses: Computationally expensive, requires large datasets
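
An LSTM sketch with `return_sequences=True` so the hidden state at every step is kept (all sizes are arbitrary):

```python
import numpy as np
from tensorflow import keras

# Input, forget and output gates control a memory cell, which helps
# gradients survive across long sequences.
lstm = keras.layers.LSTM(units=64, return_sequences=True)
x = np.random.rand(4, 50, 16).astype("float32")   # 4 sequences, 50 steps, 16 features
y = lstm(x)
print(y.shape)                                    # (4, 50, 64): one hidden state per step
```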
## **7. Gated Recurrent Unit (GRU) Layers**
* Usage: NLP, sequence prediction, time series forecasting
* Description: A simpler alternative to LSTM, using gates to control the flow of information
* Strengths: Faster computation, simpler than LSTM, easier to train
* Weaknesses: May not match LSTM performance on tasks with very long-range dependencies
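
A GRU sketch for comparison with the LSTM above (same illustrative shapes):

```python
import numpy as np
from tensorflow import keras

# The GRU merges cell and hidden state and uses only update and reset gates,
# so it has fewer parameters than an LSTM of the same size.
gru = keras.layers.GRU(units=64)
x = np.random.rand(4, 50, 16).astype("float32")
y = gru(x)
print(y.shape)   # (4, 64): final hidden state only
```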
## **8. Batch Normalization Layers**
* Usage: Normalizing inputs, stabilizing training, improving performance
* Description: A layer that normalizes its inputs using per-batch mean and variance, reducing internal covariate shift
* Strengths: Improves training stability, accelerates training, improves performance
* Weaknesses: Requires careful tuning of hyperparameters, can be computationally expensive
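
A batch-normalization sketch (the feature size is arbitrary; `training=True` forces the use of batch statistics rather than running averages):

```python
import numpy as np
from tensorflow import keras

# Each feature is normalized with the batch mean and variance,
# then rescaled by a learned gamma and shifted by a learned beta.
bn = keras.layers.BatchNormalization()
x = np.random.rand(32, 64).astype("float32")
y = bn(x, training=True)
print(float(np.mean(y)))   # roughly 0 after normalization
```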
## **9. Dropout Layers**
* Usage: Regularization, preventing overfitting
* Description: A layer that randomly zeroes a fraction of activations during training, reducing overfitting
* Strengths: Effective regularization technique, reduces overfitting, improves generalization
* Weaknesses: Can slow down training, requires careful tuning of hyperparameters
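
A dropout sketch showing the difference between training and inference (the 0.5 rate is just an example):

```python
import numpy as np
from tensorflow import keras

# During training, half the activations are randomly zeroed and the rest
# are scaled up by 1 / (1 - rate); at inference the layer does nothing.
drop = keras.layers.Dropout(rate=0.5)
x = np.ones((2, 8), dtype="float32")
print(drop(x, training=True))    # roughly half the entries are 0, the rest are 2.0
print(drop(x, training=False))   # unchanged
```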
## **10. Flatten Layers**
* Usage: Reshaping data, preparing data for dense layers
* Description: A layer that flattens each sample into a one-dimensional vector while keeping the batch dimension
* Strengths: Essential for preparing data for dense layers, easy to implement
* Weaknesses: Limited functionality, no learning occurs in this layer
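
A flatten sketch of the kind typically placed between convolutional/pooling blocks and dense layers (shapes are illustrative):

```python
import numpy as np
from tensorflow import keras

# Collapses everything except the batch dimension into one vector per sample.
flatten = keras.layers.Flatten()
x = np.random.rand(8, 14, 14, 16).astype("float32")
y = flatten(x)
print(y.shape)   # (8, 3136), since 14 * 14 * 16 = 3136
```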
## **11. Embedding Layers**
* Usage: NLP, word embeddings, language modeling
* Description: A layer that converts categorical data (such as word indices) into dense, trainable vectors
* Strengths: Excellent for NLP tasks, reduces dimensionality, captures semantic relationships
* Weaknesses: Requires large datasets, can be computationally expensive
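
A minimal embedding sketch (vocabulary size, embedding width, and token ids are made up):

```python
import numpy as np
from tensorflow import keras

# Maps integer token ids to dense, trainable 128-dimensional vectors.
embed = keras.layers.Embedding(input_dim=10000, output_dim=128)
tokens = np.array([[12, 7, 431, 2]])   # one sequence of 4 token ids
vectors = embed(tokens)
print(vectors.shape)                   # (1, 4, 128)
```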
## **12. Attention Layers**
* Usage: NLP, machine translation, question answering
* Description: A layer that computes weighted sums of its inputs, with weights derived from query-key similarity, so the model focuses on the most relevant positions
* Strengths: Excellent for sequential data, can model long-range dependencies, improves performance
* Weaknesses: Computationally expensive, requires careful tuning of hyperparameters
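
A self-attention sketch using Keras' built-in multi-head attention layer (head count, key size, and sequence shape are arbitrary):

```python
import numpy as np
from tensorflow import keras

# Self-attention: each position scores every other position (query-key similarity)
# and takes a weighted sum of their values.
attn = keras.layers.MultiHeadAttention(num_heads=4, key_dim=16)
x = np.random.rand(2, 10, 64).astype("float32")   # 2 sequences, 10 tokens, 64 features
y = attn(query=x, value=x, key=x)
print(y.shape)                                    # (2, 10, 64)
```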
## **13. Upsampling Layers**
* Usage: Image segmentation, object detection, image generation
* Description: A layer that increases spatial dimensions, using interpolation or learned upsampling filters
* Strengths: Excellent for image processing, improves spatial resolution, enables image generation
* Weaknesses: Computationally expensive, requires careful tuning of hyperparameters
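
An upsampling sketch; `UpSampling2D` interpolates, whereas a `Conv2DTranspose` layer would learn its upsampling filters (sizes are illustrative):

```python
import numpy as np
from tensorflow import keras

# Doubles height and width by bilinear interpolation (no learned parameters).
up = keras.layers.UpSampling2D(size=2, interpolation="bilinear")
x = np.random.rand(1, 14, 14, 16).astype("float32")
y = up(x)
print(y.shape)   # (1, 28, 28, 16)
```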
## **14. Normalization Layers**
* Usage: Normalizing activations, stabilizing training, improving performance
* Description: A family of layers (layer, instance, and group normalization) that normalize activations across features or channels rather than across the batch
* Strengths: Improves training stability, works with small or variable batch sizes, standard in transformer architectures
* Weaknesses: Requires choosing the right variant and normalization axes, adds computational overhead
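
A layer-normalization sketch for contrast with batch normalization above (shapes are arbitrary):

```python
import numpy as np
from tensorflow import keras

# Statistics are computed per sample over the last (feature) axis,
# so the result does not depend on the batch size.
ln = keras.layers.LayerNormalization()
x = np.random.rand(2, 10, 64).astype("float32")
y = ln(x)
print(y.shape)   # (2, 10, 64)
```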
## **15. Activation Functions**
* Usage: Introducing non-linearity, enhancing model capacity
* Description: A function that introduces non-linearity into the model, enabling complex representations
* Strengths: Enables complex representations, improves model capacity, enhances performance
* Weaknesses: A poor choice can cause vanishing gradients (sigmoid, tanh) or dead neurons (ReLU)
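
A short activation sketch using ReLU (any other built-in activation name would work the same way):

```python
import numpy as np
from tensorflow import keras

# ReLU zeroes negative values; without such non-linearities, a stack of
# dense layers would collapse into a single linear transformation.
relu = keras.layers.Activation("relu")
x = np.array([[-2.0, -0.5, 0.0, 1.5]], dtype="float32")
print(relu(x))   # values become [[0. 0. 0. 1.5]]
```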