  • How is a grayscale image represented on a computer? How about a color image?
  • How are the files and folders in the MNIST_SAMPLE dataset structured? Why?
  • Explain how the "pixel similarity" approach to classifying digits works.
  • What is a list comprehension? Create one now that selects odd numbers from a list and doubles them.
  • What is a "rank-3 tensor"?
  • What is the difference between tensor rank and shape? How do you get the rank from the shape?
  • What are RMSE and the L1 norm?
  • How can you apply a calculation to thousands of numbers at once, many thousands of times faster than a Python loop?
  • Create a 3×3 tensor or array containing the numbers from 1 to 9. Double it. Select the bottom-right four numbers.
  • What is broadcasting?
  • Are metrics generally calculated using the training set, or the validation set? Why?
  • What is SGD?
  • Why does SGD use mini-batches?
  • What are the seven steps in SGD for machine learning?
  • How do we initialize the weights in a model?
  • What is "loss"?
  • Why can't we always use a high learning rate?
  • What is a "gradient"?
  • Do you need to know how to calculate gradients yourself?
  • Why can't we use accuracy as a loss function?