Optional numeric return head (predicts percent change for 1s,1m,1h,1d)

This commit is contained in:
Dobromir Popov
2025-08-23 15:17:04 +03:00
parent 9992b226ea
commit 81749ee18e
8 changed files with 124 additions and 30 deletions
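
A minimal sketch of what an optional numeric return head could look like, assuming a PyTorch backbone; the class name `NumericReturnHead`, layer sizes, and wiring are hypothetical illustrations, not the actual code in this commit:

```python
# Hypothetical sketch of an optional numeric return head (not the code in this commit).
import torch
import torch.nn as nn

HORIZONS = ("1s", "1m", "1h", "1d")

class NumericReturnHead(nn.Module):
    """Optional head mapping backbone features to predicted percent change per horizon."""

    def __init__(self, feature_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feature_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, len(HORIZONS)),  # one scalar per horizon
        )

    def forward(self, features: torch.Tensor) -> dict:
        # features: (batch, feature_dim) backbone embedding
        preds = self.mlp(features)  # (batch, 4) predicted percent change
        return {h: preds[:, i] for i, h in enumerate(HORIZONS)}

# The head is attached only when enabled, so the base model is unchanged otherwise:
# head = NumericReturnHead(feature_dim=256) if enable_numeric_head else None
```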


@@ -110,4 +110,10 @@ I want it more to be a part of a proper rewardfunction bias rather than a algori
THINK REALLY HARD
Do we evaluate and reward/punish each model at each inference?
In our real-time reinforcement learning training, how do we calculate the score (reward/penalty)?
Let's use the mean squared difference between the prediction and the empirical outcome. We should do a training run at each inference, using the last inference's prediction and the current price as the outcome. Do that for up to the last 6 predictions, calculating accuracy separately for each, to get a better picture of the ability to predict a couple of timeframes into the future. In addition to the frequent inference every 1 or 5 s (I forgot the current CNN rate), do an inference at each new timeframe interval. The model should get the full data (multi-timeframe: 1s, 1m, 1h, 1d for ETH (main), plus 1m for BTC, SPX and one more), but it should also know which timeframe it is predicting. We predict only on the main symbol, so in 4 timeframes, but on every hour we will do 4 inferences, one for each timeframe.
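
A minimal sketch of that scoring idea, assuming predictions arrive as percent-change values per timeframe; the buffer size handling, the names `PredictionRecord`, `score_prediction`, and `on_inference`, and the negated-squared-error reward are illustrative assumptions, not the project's actual API:

```python
# Hypothetical sketch of the MSE-based reward over the last few predictions (illustrative only).
from collections import deque
from dataclasses import dataclass

TIMEFRAMES = ("1s", "1m", "1h", "1d")
MAX_PENDING = 6  # evaluate up to the 6 most recent predictions

@dataclass
class PredictionRecord:
    timeframe: str
    predicted_pct_change: float
    price_at_prediction: float

pending: deque = deque(maxlen=MAX_PENDING)
errors_per_tf: dict = {tf: [] for tf in TIMEFRAMES}  # per-timeframe accuracy tracking

def score_prediction(record: PredictionRecord, current_price: float) -> float:
    """Squared error between predicted and realized percent change; reward is its negative."""
    realized_pct = (current_price - record.price_at_prediction) / record.price_at_prediction * 100.0
    squared_error = (record.predicted_pct_change - realized_pct) ** 2
    return -squared_error  # smaller error -> higher (less negative) reward

def on_inference(new_predictions: dict, current_price: float) -> list:
    """At each inference, score outstanding predictions against the current price,
    track error separately per timeframe, then queue the new predictions."""
    rewards = []
    for record in list(pending):
        reward = score_prediction(record, current_price)
        rewards.append(reward)
        errors_per_tf[record.timeframe].append(-reward)  # store squared error per timeframe
    for tf, pct in new_predictions.items():
        pending.append(PredictionRecord(tf, pct, current_price))
    return rewards  # fed into the RL training step
```

Keeping the per-timeframe error lists separate gives the "accuracy per horizon" picture described above, while the deque caps evaluation at the last few predictions.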