Examine the effect of augmentations in your browser

When working with image data, practitioners often use augmentations: techniques that artificially and randomly alter the data to increase its diversity. Applying such transformations to the training data makes the model more robust. For image data, common choices are rotating, resizing, and blurring. The effects of these transformations are easy to see and comprehend. Even multiple combined augmentations can be grasped quickly, as the following example shows:

Example augmentations. Image created by the author, based on a TensorFlow tutorial.
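To make the idea of random image augmentations concrete, here is a minimal sketch in plain Python. It is not taken from the TensorFlow tutorial mentioned above. It assumes a grayscale image stored as a list of rows, and the helper names (`rotate90`, `flip_horizontal`, `box_blur`, `augment`) are hypothetical, chosen only for illustration. Flipping stands in for the many other transformations one could add.

```python
import random


def rotate90(img):
    """Rotate a 2D grid (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip_horizontal(img):
    """Mirror every row from left to right."""
    return [row[::-1] for row in img]


def box_blur(img):
    """Apply a 3x3 mean filter; border pixels average over existing neighbours."""
    height, width = len(img), len(img[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            neighbours = [
                img[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(sum(neighbours) / len(neighbours))
        blurred.append(row)
    return blurred


def augment(img, rng):
    """Randomly combine the transformations above (assumed probabilities of 0.5)."""
    for _ in range(rng.randrange(4)):  # zero to three quarter turns
        img = rotate90(img)
    if rng.random() < 0.5:
        img = flip_horizontal(img)
    if rng.random() < 0.5:
        img = box_blur(img)
    return img


# A tiny 3x4 "image" with a bright bar in the middle.
image = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]

rng = random.Random(0)  # fixed seed so the run is repeatable
variants = [augment(image, rng) for _ in range(3)]
for variant in variants:
    print(variant)
```

Each call to `augment` yields a differently altered copy of the same input, which is the source of the added diversity in the training data.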

Such augmentations are not restricted to images (though they are pretty popular there). For audio data, there are similar ways to modify the training data. The downside is that one cannot observe…


Getting Started

Might have a minimal bias towards Deep Learning

Have you ever asked yourself where you currently are on your Machine Learning journey? And what is still left for you to learn?

This checklist helps you answer such questions. It provides an outline of the field, divided into three broad levels: Entry level (where everybody starts), intermediate level (where you quickly get to), and advanced level (where you stay for a long time). It does not list specific courses or software, but focuses on the general concepts:

An overview of the checklist, image by the author. It’s available on Notion here, and on GitHub here.

Let’s cover the levels in more detail, starting with the entry level.

Entry level


Getting Started

A guide to the evolution of generative networks

Photo by Anders Jildén on Unsplash.

Introduction

In short, the core idea behind generative networks is capturing the underlying distribution of the data. This distribution cannot be observed directly, but has to be approximately inferred from the training data. Over the years, many techniques have emerged that aim to generate data similar to the input samples.

This post intends to give you an overview of the evolution, beginning with the AutoEncoder, describing its descendant, the Variational AutoEncoder, then taking a look at the GAN, and ending with the CycleGAN as an extension to the GAN setting.

AutoEncoders

Early deep generative approaches used AutoEncoders [1]. These networks aim…


Use online courses to create your curriculum

The internet is full of courses and offers many learning materials. You can use these resources to replicate the curriculum of an ML Master’s degree.

Photo by MD Duran on Unsplash

A Bachelor’s study usually takes six semesters; a Master’s study takes four. But this is only an outline. I’ve witnessed people completing their Bachelor’s in three semesters and others taking nine. Sometimes there are so many exciting courses that you voluntarily stay longer to learn it all. Therefore, I’ve loosely structured the recreated curriculum into four semesters. The primary focus is on Machine Learning and Deep Learning.

First semester

Artificial Intelligence

As the first course in your curriculum…


From entry to expert level

The field of Machine Learning is huge. You can easily be overwhelmed by the amount of information out there. To keep you from getting lost, the following list helps you estimate where you are. It provides an outline of the vast Deep Learning space and does not emphasize specific resources. Where appropriate, I have included clues to help you orient yourself.

An excerpt of the list, by the author. The list is available at GitHub here and on Notion here.

Since the list has gotten rather long, I have included an excerpt above; the full list is at the bottom of this post.

Entry level

The entry level is split into five categories:

  • Data handling introduces you to small datasets
  • Classic Machine Learning covers key…


Transformers are at it, once again.

Day in and day out, technical devices play music. Your phone can do it, my computer can do it, and smart devices play songs on command. So why don’t we let these devices create music as well? Not merely playing back our results, but coming up with original creations?

Photo by Weston MacKinnon on Unsplash

There’s an ongoing field of research in this direction; DeepMind made a strong contribution with their WaveNet architecture, which is capable of generating raw audio. The drawback of generating raw audio (that is, generating the float values that make up the audio curve) is the immense computational power involved. …


Office Hours

Use freely available information to create your own curriculum

Many universities make their curricula publicly available, listing all courses required to attain a degree. The Computer Science field is no different. Using such freely accessible resources (see MIT (English), JMU (German), and KIT (German) as a starting point), one can create a custom schedule.

This post attempts to recreate a CS Bachelor’s degree, but with online resources. Some of them are available at no charge; others will cost you a small fee. All in all, they are an inexpensive alternative for learning things similar to those taught at universities. A remark: With the resources used I generally tried to…


From a single machine to multi-worker setups

After you have finally created that training script, it’s time to scale things up. Moving from a local development environment, be it an IDE or Colab, to a large computer cluster is quite a stretch. The following best practices make this transition easier.

Photo by Frank Busch on Unsplash

Argument parsers

The first is to use argument parsers. Python provides such functionality with the argparse module. Generally, you want to make the batch size, the number of epochs, and any directories selectable arguments. It’s very frustrating to go through a script manually, find all places where a specific parameter is used, and change them all by hand. Thus…
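As a minimal sketch of this practice, the following snippet exposes the batch size, the number of epochs, and two directories through Python's argparse module. The flag names and default values are assumptions made for illustration, not the ones used in the original post.

```python
import argparse


def build_parser():
    """Create a parser for a hypothetical training script."""
    parser = argparse.ArgumentParser(description="Hypothetical training script")
    # Flag names and defaults below are illustrative assumptions.
    parser.add_argument("--batch-size", type=int, default=32,
                        help="number of samples per training step")
    parser.add_argument("--epochs", type=int, default=10,
                        help="number of passes over the training data")
    parser.add_argument("--data-dir", type=str, default="data",
                        help="directory containing the training data")
    parser.add_argument("--output-dir", type=str, default="checkpoints",
                        help="directory for checkpoints and logs")
    return parser


# Passing an explicit list mimics a command line such as:
#   python train.py --batch-size 64 --epochs 5
args = build_parser().parse_args(["--batch-size", "64", "--epochs", "5"])
print(args.batch_size, args.epochs, args.data_dir, args.output_dir)
```

In a real script you would call `parse_args()` without arguments so that the values come from the command line. The same script can then run unchanged on a laptop and on a cluster, with only the flags differing.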


Presenting the key concepts that enable RL

Recent successes achieved with the help of Reinforcement Learning have been covered quite extensively by the media. One example is DeepMind’s AlphaGo algorithm: Using Reinforcement Learning (and a massive amount of expensive hardware to train it), AlphaGo learned to play the age-old game of Go and even developed its own playstyle.

Another example was demonstrated by OpenAI, whose researchers taught a robot hand to solve a Rubik’s Cube. …


Getting Started

Use the one-day-per-week principle to gradually tick it off

There are many awesome courses to learn data science. And there are plenty of certificates that certify your success.

But how do you track your progress? How do you know what you have already achieved, and what is still left to get your hands on?

The following checklist can help you get an overview of your progress. It’s intended for users looking for a general outline of where they are on their journey. Rather than emphasizing specific guides, courses, and software packages, it focuses on general concepts:

Image by the author. Available on Notion here, as a PDF here, and on GitHub here.

Let’s cover the levels in detail, beginning with the Entry level, continuing with the…

Pascal Janetzky

CS student.
