From 5c508a4d686890a2a94e4df529a3874ff23102ac Mon Sep 17 00:00:00 2001 From: Rea Fernandes <zob06qih@rhrk.uni-kl.de> Date: Sun, 26 Nov 2023 21:29:44 +0100 Subject: [PATCH] Solution 2.1.3 --- exercises/exercise2/exercise2.tex | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/exercises/exercise2/exercise2.tex b/exercises/exercise2/exercise2.tex index 0063188..8863c49 100644 --- a/exercises/exercise2/exercise2.tex +++ b/exercises/exercise2/exercise2.tex @@ -79,6 +79,25 @@ We adopt the convention that for a 2D array $X$ (or $K$), $X_{i,j}$ denotes the \subsection{What is the difference between 'valid' and 'same' padding?} +Solution + +\textbf{VALID Padding:} +\begin{itemize} + \item No padding is added to the input image. + \item The filter window always stays inside the input image. + \item Loss of information may occur, especially on the right and bottom edges. + \item Output size is generally smaller than the input size. +\end{itemize} + +\textbf{SAME Padding:} +\begin{itemize} + \item Input is half padded to ensure the filter is applied to all input elements. + \item Output size is the same as the input size when the stride is 1. + \item "SAME" is commonly used during model training for computational convenience. + \item Output size calculation: $output\_spatial\_shape[i] = \lceil \frac{input\_spatial\_shape[i]}{strides[i]} \rceil$ +\end{itemize} + + \subsection{Why do we prefer Convolutional Neural Networks (CNNs) over a Multi-Layer Perceptron (MLP) for image data?} In contrast to CNN, which uses a tensor as input, Multi-Layer Perceptron uses a vector. Consequently, CNN has a superior understanding of the spatial relation—the relationship between adjacent pixels in a picture. Hence, CNN will function better for complex images. -- GitLab