fix: Main Problem: Batch Corruption Across Epochs

This commit is contained in:
Dobromir Popov
2025-12-08 20:00:47 +02:00
parent cc555735e8
commit 81a7f27d2d
4 changed files with 205 additions and 15 deletions

@@ -2530,11 +2530,14 @@ class RealTrainingAdapter:
         OPTIMIZATION: Batches are already on GPU and grouped for efficient processing.
         Each mini-batch contains 5 samples for better GPU utilization.
-        IMPORTANT: Yields the same batch objects across epochs (no copying).
-        The train_step method should not modify batch contents in-place.
+        IMPORTANT: Creates a shallow copy of batch dict to prevent in-place modifications
+        from affecting subsequent epochs. Tensors themselves are shared (not copied).
         """
         for batch in grouped_batches:
-            yield batch
+            # Create shallow copy of batch dict to prevent modifications
+            # Tensors are shared (not cloned) for memory efficiency
+            batch_copy = {k: v for k, v in batch.items()}
+            yield batch_copy
         total_batches = len(grouped_batches)
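The fix works because a shallow dict copy gives each epoch its own key-to-value mapping while sharing the underlying values. A minimal sketch of the idea (plain Python lists stand in for GPU tensors, and `batch_generator` is a simplified stand-in for the adapter's generator, not the actual project code):

```python
def batch_generator(grouped_batches):
    """Yield a shallow copy of each batch dict, as in the fix above."""
    for batch in grouped_batches:
        # New dict object, but the same value objects (tensors are shared)
        yield {k: v for k, v in batch.items()}

batches = [{"inputs": [1, 2, 3], "mask": None}]

# Epoch 1: a train step rebinds a key on its copy of the batch dict.
for b in batch_generator(batches):
    b["inputs"] = [0, 0, 0]  # only the copy's mapping changes

# Epoch 2: the original batch dict is untouched, so training sees clean data.
for b in batch_generator(batches):
    assert b["inputs"] == [1, 2, 3]

# Caveat: the copy is shallow, so mutating a shared value in place
# (e.g. an in-place tensor op) still leaks into later epochs:
for b in batch_generator(batches):
    b["inputs"].append(4)  # mutates the shared object itself
assert batches[0]["inputs"] == [1, 2, 3, 4]
```

This is why the docstring stresses that tensors themselves are shared: the copy protects against key rebinding in the train step, not against in-place tensor mutation.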