trying to run enhanced training system

Dobromir Popov
2025-08-10 15:31:56 +03:00
parent b3c5076e37
commit ade4e117bf
4 changed files with 203 additions and 763 deletions

@@ -110,23 +110,4 @@ I want it more to be a part of a proper rewardfunction bias rather than a algori
THINK REALLY HARD
do we evaluate and reward/punish each model at each reference? we lost track of our model training metrics. in the dash we show:
Models & Training Progress
Loaded Models (5)
DQN_AGENT - ACTIVE (0) [CKPT]
Inf
Trn
Route
Last: NONE (0.0%) @ N/A
Loss: N/A
Rate: 0.00/s | 24h: 0
Last Inf: None | Train: None
ENHANCED_CNN - ACTIVE (0) [CKPT]
Inf
Trn
Route
Last: NONE (0.0%) @ N/A
Loss: 2133105152.0000 | Best: 34.2300
Rate: 0.00/s | 24h: 0
Last Inf: None | Train: None
DQN_AGENT and ENHANCED_CNN were the models whose training was working well. we had to include the others, but it seems we still haven't, or at least we do not store their metrics and best checkpoints
do we evaluate and reward/punish each model at each reference?