Notes:


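reinforcement learning: the feedback gives no signal for how large the error is (the learner sees only a reward for the action it took, not the correct answer or its distance from it)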
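
toy sketch of the point above, not taken from any particular algorithm: the function names, the 0/1 reward, and the `tolerance` value are all assumptions made for illustration. it compares the feedback a supervised learner would get (a signed error) with reward-only feedback for the same guesses.

```python
def supervised_feedback(prediction: float, target: float) -> float:
    """Signed error: gives both the direction and the size of the miss."""
    return target - prediction


def reward_feedback(prediction: float, target: float, tolerance: float = 0.5) -> float:
    """Reward only (hypothetical 0/1 scheme): 1.0 if the guess is within
    `tolerance` of the target, else 0.0. The target is never revealed."""
    return 1.0 if abs(target - prediction) <= tolerance else 0.0


if __name__ == "__main__":
    target = 10.0
    for guess in (9.0, 2.0, -50.0):
        error = supervised_feedback(guess, target)
        reward = reward_feedback(guess, target)
        print(f"guess={guess}: signed error={error}, reward={reward}")
```

the three guesses miss by very different amounts, yet under this assumed reward scheme all of them get the same reward, so the learner cannot tell a near miss from a large one and has to find out by trying other actions.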