trying to run enhanced training system

2025-08-10 15:31:56 +03:00
parent b3c5076e37
commit ade4e117bf
4 changed files with 203 additions and 763 deletions
--- a/_dev/dev_notes.md
+++ b/_dev/dev_notes.md
@@ -110,23 +110,4 @@ I want it more to be a part of a proper rewardfunction bias rather than a algori
 THINK REALY HARD  


-do we evaluate and reward/punish each model at each reference? we lost track of our model training metrics. in the dash we show:
-Models & Training Progress
-Loaded Models (5)
-DQN_AGENT - ACTIVE (0) [CKPT]
-Inf
-Trn
-Route
-Last: NONE (0.0%) @ N/A
-Loss: N/A
-Rate: 0.00/s | 24h: 0
-Last Inf: None | Train: None
-ENHANCED_CNN - ACTIVE (0) [CKPT]
-Inf
-Trn
-Route
-Last: NONE (0.0%) @ N/A
-Loss: 2133105152.0000 | Best: 34.2300
-Rate: 0.00/s | 24h: 0
-Last Inf: None | Train: None
-DQN_AGENT and ENHANCED_CNN were the models we had the training working well. we had to include the others but it seems we still havent or at least do not store their metrics and best checkpoints
+do we evaluate and reward/punish each model at each reference?