Reinforcement Learning Lab
A smooth start in Reinforcement Learning
This article presents RL-Lab, my idea for making Reinforcement Learning an easier topic to learn. It is the culmination of several years of experience in the field.
The Context
After several years of involvement in Reinforcement Learning, I have come to the conclusion that no matter how much you study and research this field, you are still left with the awkward feeling that you have not mastered it, even when it comes to the fundamentals.
This feeling stems from the fact that RL is not a simple subject; it is hard and frustrating, even when you have done the same project several times. It is not uncommon to stare at the results in shock after hours of waiting.
This frustration is even more accentuated if you are a newcomer!
You can go to OpenAI, download and install their framework, spend hours fighting with the tools, and finally manage to run an example you found on the internet.
Then you ask yourself: now what?
Well, nothing… The source code is tiny compared to standard programs, yet it takes far longer to execute, and it is even harder to understand what it is doing, even when you debug it!
The reason for this turmoil has to do with the data, not the code.
To add insult to injury, you do not have a ready-made dataset as in other machine learning disciplines; the data is generated by the algorithm itself as it interacts with the environment.
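To make this point concrete, here is a minimal sketch of what "the algorithm generates its own data" means. The environment (a hypothetical five-state random walk, chosen purely for illustration) and the function names are my own; the key idea is that the list of transitions returned at the end is the training data, and it only exists after the agent has acted.

```python
import random

def step(state, action):
    """Move left (-1) or right (+1) on a chain of states 0..4.
    Reward 1.0 only when the terminal state 4 is reached."""
    next_state = max(0, min(4, state + action))
    reward = 1.0 if next_state == 4 else 0.0
    done = next_state == 4
    return next_state, reward, done

def collect_episode(policy, start=2):
    """Run one episode; the returned transitions ARE the dataset."""
    state, data, done = start, [], False
    while not done:
        action = policy(state)
        next_state, reward, done = step(state, action)
        data.append((state, action, reward, next_state))
        state = next_state
    return data

random.seed(0)
# A random policy: even the amount of data varies from run to run.
episode = collect_episode(lambda s: random.choice([-1, 1]))
print(len(episode))
```

Notice that unlike supervised learning, where the dataset is fixed before training starts, here the policy itself decides what data gets collected, which is one reason debugging RL code tells you so little on its own.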
And the ordeal goes on…
So what is the solution?
The Project
I have been looking around for free and easy ways to make RL more approachable, especially for beginners.
That's when I came up with the idea of RL-Lab; it had been brewing in my mind for years. I searched for existing solutions on the internet, and although materials were not totally absent, they were less friendly, less playful (to keep someone interested), and less interactive.
The RL-Lab concept is to let students feel and experiment with RL before writing a single line of code. It allows them to create their own scenarios, choose the algorithms and parameterize them, run the learning process, and figure out from the results what went wrong and what went right.
They can also compare the performance of those algorithms, understand the effect of tweaking the parameters, and develop an intuition for why some scenarios take enormously more time than others despite using the same algorithms and, in appearance, being no more complicated.
Once this is done, students can start writing their own implementations one method at a time and run them on the same scenarios they have already created, so that they can compare their implementation with the built-in one.
Surely, the best environment to start with is the well-known GridWorld, with some extra interactive flavors and sparkling features. Then come the three classic tabular methods: Dynamic Programming, Monte Carlo, and Temporal Difference.
These come with additional features, such as downloading/uploading scenarios and code, predefined scenarios, and a rich set of parameters.
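To give a flavor of what a tabular method on GridWorld looks like, here is a minimal sketch of Temporal Difference learning (Q-learning) on a toy 3x3 grid. The grid layout, reward scheme, and all names and parameters here are illustrative assumptions of mine, not RL-Lab's actual API or built-in scenarios.

```python
import random
from collections import defaultdict

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
GOAL = (2, 2)                                  # bottom-right corner

def step(state, action):
    """Deterministic 3x3 grid: walls clip moves, -1 per step until the goal."""
    r, c = state[0] + action[0], state[1] + action[1]
    next_state = (max(0, min(2, r)), max(0, min(2, c)))
    reward = 0.0 if next_state == GOAL else -1.0
    return next_state, reward, next_state == GOAL

def q_learning(episodes=500, alpha=0.5, gamma=0.9, eps=0.1):
    Q = defaultdict(float)  # maps (state, action) -> estimated value
    for _ in range(episodes):
        state, done = (0, 0), False
        while not done:
            # epsilon-greedy action selection
            if random.random() < eps:
                action = random.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: Q[(state, a)])
            next_state, reward, done = step(state, action)
            # TD update toward the one-step bootstrapped target
            best_next = max(Q[(next_state, a)] for a in ACTIONS)
            Q[(state, action)] += alpha * (reward + gamma * best_next
                                           - Q[(state, action)])
            state = next_state
    return Q

random.seed(0)
Q = q_learning()
# The greedy value at the start should reflect the 4-step shortest path.
print(max(Q[((0, 0), a)] for a in ACTIONS))
```

Even on a grid this small, tweaking `alpha`, `gamma`, or `eps` visibly changes how fast the values settle, which is exactly the kind of intuition RL-Lab aims to build before students write such code themselves.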
The Community
Of course, this project is still in its embryonic phase, and there is a lot of work ahead to push it to maturity, fulfill its objectives, and meet users' expectations.
This is where I call on the community for suggestions and feedback so that this project moves forward. Your input will be very valuable in setting its direction.
I would like to thank all the people who saw something in this project and have already started giving their feedback.
You can check a small tutorial in this video.
Finally, you can connect with us on these links:
Thank you very much.