How Real-World Conditions Turned AI Racing Upside Down: Ropo Technology’s AWS DeepRacer Experience

In September 2024, Ropo’s Technology unit, consisting of approximately 40 Software Developers and Service Specialists, participated in an AWS DeepRacer event organized by AWS. This event provided an opportunity to learn about machine learning and reinforcement learning through hands-on experience with autonomous 1/18th scale race cars. Participants trained machine learning models to control the car’s behavior and competed in a fun-but-fierce race against each other.

Inside the AWS DeepRacer Experience

The car uses a type of machine learning called reinforcement learning to learn how to drive. In reinforcement learning, the car learns by trying different actions and getting rewards for good actions and penalties for bad ones. Over time, it learns the best way to drive around the track by trying to maximize the total reward.

Programming in a machine learning context differs from traditional programming. Instead of writing specific commands for each situation, each participant defines the logic by which the model (the car) is rewarded, and the model learns its driving behavior from those rewards.

“I primarily rewarded the model based on the proximity to the center line of the track. Driving on the edge of the track was also slightly rewarded, as long as all the tires stayed on the track. Another reward logic was the faster the model drove, the more reward points it received. This resulted in the car aiming for fast lap times while staying on the track. In addition, I deducted reward points for excessive steering.” — Timo Heikkinen, Developer, Ropo
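To make the reward logic in the quote above more concrete, here is a minimal sketch of what such a reward function could look like in Python. It is not Timo's actual code, which the article does not include. The `params` keys are input parameters that the DeepRacer reward function receives; the weights and thresholds (`SPEED_WEIGHT`, `STEERING_THRESHOLD_DEG`, `STEERING_PENALTY`, and the 0.3 edge reward) are assumptions chosen only for illustration.

```python
# Hypothetical tuning constants; the values are assumptions, not from the article.
SPEED_WEIGHT = 0.5             # extra reward per m/s of speed
STEERING_THRESHOLD_DEG = 15.0  # steering beyond this counts as "excessive"
STEERING_PENALTY = 0.8         # multiplier applied when steering is excessive


def reward_function(params):
    """Sketch: reward center-line driving and speed, penalize hard steering."""
    # No meaningful reward if any wheel has left the track.
    if not params["all_wheels_on_track"]:
        return 1e-3

    half_width = 0.5 * params["track_width"]

    # 1.0 on the center line, falling linearly to 0.0 at the track edge.
    closeness = max(0.0, 1.0 - params["distance_from_center"] / half_width)

    # Driving near the edge is still slightly rewarded (0.3),
    # driving on the center line is rewarded most (1.0).
    reward = 0.3 + 0.7 * closeness

    # The faster the car drives, the more reward it receives.
    reward += SPEED_WEIGHT * params["speed"]

    # Deduct reward for excessive steering in either direction.
    if abs(params["steering_angle"]) > STEERING_THRESHOLD_DEG:
        reward *= STEERING_PENALTY

    return float(reward)
```

With these assumed weights, a car on the center line earns more than one near the edge at the same speed, a faster car earns more than a slower one, and sharp steering reduces the reward. How strongly each term should count is exactly the kind of tuning the participants had to work out during training.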

Training AI: The Road to the Race

Participants’ path to machine learning started a few weeks before the actual racing event. In a workshop, a solutions architect from AWS introduced the basics of DeepRacer, machine learning, reinforcement learning, and reward functions. The presentation included a demo on how to train a model in the console and provided tips and tricks for the upcoming event.

Each participant got 20 hours of virtual training time, which could be used to train and evaluate models on the virtual race track using their reward functions. On race day, the trained model would drive independently: there would be no remote control once the car was on the track.

The main task for each participant was to come up with the reward function, which controls how the model is rewarded based on, for example, the car’s location, speed, or heading. On top of that, there were many car setup parameters to tinker with, so coming up with a fast and reliable model was no slam dunk.

Once the reward function was in place, the model could be trained on the virtual race track, and its progress could be observed in the simulation video and the simulation data. Based on the results, participants could adjust the reward function and re-run the training.

Simulation Models Face Reality on Race Day

The actual race day was exciting, since it was the first time everybody saw the physical cars and the quite massive race track (9 m x 5 m) with their own eyes instead of as a virtual simulation on a computer screen.

The participants were divided into nine teams matching the actual team structure in Ropo’s Technology unit. Each team had four members on average, but due to time constraints, each team chose two or three of its best models for the time trials. In total, 23 different models competed in the time trials, and each team then chose one model for the finals.

There was a range of emotions when the actual racing began. Some cars drove as well as, or even better than, in the virtual environment, but there were many surprised reactions when the real-world results were totally different from what was expected based on the training.

These differences are due to the so-called “simulated-to-real performance gap”. The simulation environment cannot capture all aspects of the real world accurately, which means that models trained in simulation may not perform as well in the real world.

Once the time trials were completed, it was time for the finals. In the finals, all previous lap times were ignored, and each team was under pressure to repeat its best results from the previous round. Once again, reality proved to be challenging, and only two teams managed to improve their lap times from the time trials. The ultimate winner, Senior Developer Petri Turunen, held his lead from the time trials all the way to the final.

Petri on his training tactics: “From the beginning, I emphasized staying on the track and driving close to the center line in the reward function. I created eight different versions of the function before realizing that giving the model more flexibility and not defining the boundaries too strictly yielded better results.

The final model version (v11), which won the competition, was trained for six hours.”
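The difference between strictly defined boundaries and a more flexible reward can be illustrated with a short sketch. This is not Petri's actual function, which the article does not include; both functions below, their names, and their thresholds are hypothetical. The `params` keys are DeepRacer reward function input parameters.

```python
def strict_reward(params):
    """Hypothetical 'strict' variant: only a narrow band around the
    center line earns a meaningful reward."""
    if not params["all_wheels_on_track"]:
        return 1e-3
    # Offset from the center line as a fraction of the half track width.
    offset = params["distance_from_center"] / (0.5 * params["track_width"])
    if offset <= 0.10:
        return 1.0
    if offset <= 0.25:
        return 0.5
    return 1e-3  # anything wider is treated almost like leaving the track


def flexible_reward(params):
    """Hypothetical 'flexible' variant: reward decays smoothly toward the
    edge, so the model is free to use the whole track width."""
    if not params["all_wheels_on_track"]:
        return 1e-3
    offset = min(1.0, params["distance_from_center"] / (0.5 * params["track_width"]))
    # 1.0 on the center line, 0.5 at the edge, with no hard cut-offs.
    return 1.0 - 0.5 * offset ** 2
```

In this sketch, a car that swings wide through a corner gets almost nothing from the strict variant but still a substantial reward from the flexible one. That leaves the model room to find its own line as long as it stays on the track.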

Emilia Ruusunen from AWS noted that Ropo performed relatively well, especially considering they didn’t retrain their models during the event. Compared to other customers who iterated on their models during the competition, Ropo’s performance held up well despite fewer adjustments.

Driving Forward with AI Together

The AWS DeepRacer concept and event helped Ropo’s developers and specialists to apply AI/ML skills in a practical and enjoyable way. They gained valuable insights into machine learning and reinforcement learning, and overcame various challenges through teamwork and collaboration.

AI in general is a fast-growing part of software development, and Ropo aims to be at the forefront of AI adoption and utilization. With the help of its deepening collaboration with AWS, Ropo will have the tools and resources it needs to achieve these goals.

Ropo looks forward to future events and continued collaboration with AWS. Special thanks to the participants and partners for their support in making this event a success.

At Ropo, we are committed to staying at the forefront of AI adoption and utilization. This competition was a valuable opportunity for our developers and specialists to put their machine learning skills to practical use, overcoming challenges through teamwork and collaboration.

If you are passionate about working in an environment where AI innovation is a key focus, we encourage you to explore our career opportunities. Visit our recruitment page to learn more about joining our team.

Antti Bruun
Head of Tech Resource Development at Ropo