Università degli Studi di Pavia

Facoltà di Ingegneria


Deep Learning

A.A. 2023-2024

Second Semester

Fri: 11:00 a.m. - 1:00 p.m., Aula C6

Fri: 2:00 p.m. - 4:00 p.m., Aula D8

Lectures & Suggested Readings:

  • Reports of errors in the resources below are always welcome
    1. 2024.03.08 (theory)

      Introduction [pdf]
      AI spring? Artificial Intelligence, Machine Learning, Deep Learning: facts, myths and a few reflections.

    2. 2024.03.08 (theory)

      Fundamentals: Artificial Neural Networks [pdf]
      Foundations of machine learning: dataset, representation, evaluation, optimization. Feed-forward neural networks as universal approximators.

    3. 2024.03.15 (theory)

      Flow Graphs and Automatic Differentiation [pdf]
      Tensorial representation, flow graphs. Automatic differentiation: primal graph, adjoint graph.

    4. 2024.03.22 (theory)

      Deep Networks [pdf]
      Deeper networks: potential advantages and new challenges. Tensorial layerwise representation. Softmax and cross-entropy.

      Aside 1: Tensor Broadcasting [pdf]

      Shannon Entropy (Wikipedia)

      Cross Entropy (Wikipedia)

    5. 2024.04.05 (theory)

      Learning as Optimization [pdf]
      Vanishing and exploding gradients. First and second order optimization, approximations, optimizers. Further tricks.

      Aside 2: Predictions [pdf]
      From in-sample optimization to out-of-sample generalization.

      Aside 3: Exponential Moving Average [pdf]

    6. 2024.04.12 (theory)

      Aside 4: Hardware for Deep Learning [pdf]
      Main differences between CPUs and GPUs, SIMT parallelism, bus-oriented communication, a few caveats.

      Aside 5: Differentiating Algorithms [pdf]
      Wengert list, ahead-of-time and runtime autodiff, lazy mode, just-in-time compilation, differences among TensorFlow, PyTorch, JAX.

    7. 2024.04.19 (theory)

      Deep Convolutional Neural Networks [pdf]
      Convolutional filter, filter banks, feature maps, pooling, layerwise gradients.

    8. 2024.05.03 (theory)

      Deep Convolutional Neural Networks and Beyond [pdf]
      Some insight into what happens in convolution layers. Different DCNN architectures. Transfer learning. Segmentation and object detection.

      J Yosinski, J Clune, Y Bengio, H Lipson, "How transferable are features in deep neural networks?" in Advances in Neural Information Processing Systems (NIPS 2014) [link]

    9. 2024.05.10 (theory)

      Deep Learning and Time Series [pdf]
      Recurrent Neural Networks (RNN), temporal unfolding, LSTM cells, GRU cells, encoder / decoder, convolution, time series analysis.

      Aside 6: Word Embedding [pdf]
      Skip-grams, probability distributions of context and center words, training and results, continuous bag of words (CBOW) model.

    10. 2024.05.17 (theory)

      Attention and Transformers [pdf]
      Attention as a kernel, attention maps, queries, keys and values, attention-based encoder and decoder, transformer architecture, translator.

      A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, A N Gomez, L Kaiser, I Polosukhin, "Attention Is All You Need" in Advances in Neural Information Processing Systems (NIPS 2017) [link]

    11. 2024.05.24 (theory)

      Aside 7: Auto-Encoders [pdf]
      A very popular and powerful network architecture pattern, which is also the basis for diffusion models. The relation between Auto-Encoders and Principal Component Analysis.

      Covariance Matrix (Wikipedia)

      Principal Component Analysis (Wikipedia)

      Aside 8: Kullback-Leibler divergence [pdf]
      Shannon's entropy in information theory: intuition and formalism. Cross-entropy and KL divergence: intuition and formalism.

      Kullback-Leibler Divergence (Wikipedia)

    12. 2024.05.31 (theory)

      Generative Networks [pdf]
      Generative Adversarial Networks (GAN). Variational Auto-Encoders (VAE): structuring the latent space. Diffusion Models: multiple denoising steps, implementation details. Latent Diffusion Models. Guided generation: conditioning on labels.

    13. 2024.06.07 (theory)

      Aside 9: Reinforcement Learning [pdf]
      A short recap about RL foundations, Markov decision process, state value function, policy, optimality, action value function, Q-learning.

      Deep Reinforcement Learning [pdf]
      Integrating DNNs into the RL paradigm, DQN algorithm, policy gradient, Actor-Critic methods.

    14. 2024.06.14 (theory)

      Monte Carlo Tree Search [pdf]
      Game trees, Monte Carlo strategy, Monte Carlo Tree Search (MCTS), Upper Confidence Bounds applied to Trees (UCT).

      Alpha Zero [pdf]
      MCTS + DNN, network architecture, replacing MCTS rollout with estimation, network training, AlphaZero in continuous spaces (hints).

      D J Mankowitz et al., "Faster sorting algorithms discovered using deep reinforcement learning", Nature 618, 257-263 (2023) [link]

    Instructor

    1. Marco Piastra

    2. Contact: marco.piastra@unipv.it


    Kiro

    1. Course info


    Exams

    1. See Faculty website


    Further resources:

    Video recordings and Colab notebooks are available on Kiro

      (There are no required textbooks for this course. The following books are recommended as optional readings)

      1. Christopher Bishop, Hugh Bishop
        Deep Learning: Foundations and Concepts
        Springer, 2024
        [Online version]

      2. Aston Zhang, Zachary Lipton, Mu Li, Alexander Smola
        Dive into Deep Learning
        Cambridge University Press, 2024
        [Online version, with exercises]

      3. Ian Goodfellow, Yoshua Bengio, Aaron Courville
        Deep Learning
        MIT Press, 2016
        [Online version]

      4. Kevin P. Murphy
        Probabilistic Machine Learning: Advanced Topics
        MIT Press, 2023
        [Pre-print]

      5. Richard S. Sutton, Andrew G. Barto
        Reinforcement Learning: An Introduction (second edition)
        MIT Press, 2018
        [Online version]


      Links

      1. Artificial Intelligence Reading Group


      1. Deep Learning, A.A. 2022-2023 and before