=Supplementary material=

The videos below show how the learned policies progress at different episodes. We selected three goals; the videos show the outcome of the experiments for each of the agents considered:
* BL: Baseline (TD3)
* BL+DM: Baseline + Difficulty Manager
* BL+EP: Baseline + Episodic Noise
* OURS: Baseline + Difficulty Manager + Episodic Noise

Note that the goal is shown as a green arrow, while the red axis indicates the orientation of the robot.

==Episode 135==

<HTML>
<iframe src="https://quavstreams.quavlive.com/s/62ea92f3c095a50c8b0d75d9?t=0" width="640" height="360" frameborder="0" scrolling="no" marginwidth="0" allow="autoplay; fullscreen;" allowfullscreen mozallowfullscreen="true" webkitallowfullscreen="true"></iframe>
</HTML>

==Episode 605==

<HTML>
<iframe src="https://quavstreams.quavlive.com/s/62ea930bc095a5459a0d75da" width="640" height="360" frameborder="0" scrolling="no" marginwidth="0" allow="autoplay; fullscreen;" allowfullscreen mozallowfullscreen="true" webkitallowfullscreen="true"></iframe>
</HTML>

Latest revision as of 15:31, 3 August 2022

