folder structure reorganize
139
reports/REAL_MARKET_DATA_POLICY.md
Normal file
@ -0,0 +1,139 @@
# REAL MARKET DATA POLICY

## CRITICAL REQUIREMENT: ONLY REAL MARKET DATA

This trading system is designed to work EXCLUSIVELY with real market data from cryptocurrency exchanges. **NO SYNTHETIC, GENERATED, OR SIMULATED DATA IS ALLOWED** for training, testing, or inference.

## Policy Statement

### ✅ ALLOWED DATA SOURCES
- **Binance API**: Real-time and historical OHLCV data
- **Other Exchange APIs**: Real market data from legitimate exchanges
- **Cached Real Data**: Previously fetched real market data stored locally
- **TimescaleDB**: Real market data stored in time-series database

### ❌ PROHIBITED DATA SOURCES
- Synthetic data generation
- Random data generation
- Simulated market conditions
- Artificial price movements
- Generated technical indicators
- Mock data for testing

## Implementation Guidelines

### 1. Data Provider (`core/data_provider.py`)
- Only fetches data from real exchange APIs
- Caches real data for performance
- Never generates or synthesizes data
- Validates data authenticity

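The policy does not say how the data provider validates authenticity. As a minimal sketch, assuming candles arrive as simple OHLCV records, the check below flags rows that are internally inconsistent. The `Candle` type and `find_invalid_candles` function are hypothetical names, not part of `core/data_provider.py`, and passing these checks shows only that the data is plausible, not that it came from an exchange.

```python
from typing import Iterable, List, NamedTuple


class Candle(NamedTuple):
    """One OHLCV bar. Field names are illustrative, not the project's schema."""
    timestamp_ms: int
    open: float
    high: float
    low: float
    close: float
    volume: float


def find_invalid_candles(candles: Iterable[Candle]) -> List[str]:
    """Return human-readable problems; an empty list means no check failed."""
    problems = []
    previous_ts = None
    for index, c in enumerate(candles):
        if min(c.open, c.high, c.low, c.close) <= 0:
            problems.append(f"candle {index}: non-positive price")
        if c.high < max(c.open, c.close) or c.low > min(c.open, c.close):
            problems.append(f"candle {index}: high/low do not bound open/close")
        if c.volume < 0:
            problems.append(f"candle {index}: negative volume")
        if previous_ts is not None and c.timestamp_ms <= previous_ts:
            problems.append(f"candle {index}: timestamp not strictly increasing")
        previous_ts = c.timestamp_ms
    return problems
```

A provider could run a check like this on every fetched or cached batch and refuse to return data when the list is non-empty.
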
### 2. CNN Training (`models/cnn/scalping_cnn.py`)
- `ScalpingDataGenerator` only uses real market data
- Dynamic feature detection from actual market data
- Training samples generated from real price movements
- Labels based on actual future price changes

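To illustrate what "labels based on actual future price changes" can mean in practice, here is a minimal sketch. It is not the labeling used by `ScalpingDataGenerator`; the function name, the three-class scheme, and the `horizon` and `threshold` parameters are assumptions made for the example.

```python
from typing import List, Sequence


def label_future_moves(closes: Sequence[float], horizon: int, threshold: float) -> List[int]:
    """Label each candle by the realized return `horizon` candles later.

    Returns 1 (up), -1 (down) or 0 (flat). The last `horizon` candles have no
    known future, so they are dropped rather than filled in.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    labels = []
    for i in range(len(closes) - horizon):
        change = (closes[i + horizon] - closes[i]) / closes[i]
        if change > threshold:
            labels.append(1)
        elif change < -threshold:
            labels.append(-1)
        else:
            labels.append(0)
    return labels
```

Dropping the trailing candles instead of padding them keeps every label tied to a price that was actually observed.
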
### 3. RL Training (`models/rl/scalping_agent.py`)
- Environment uses real historical data for backtesting
- State representations from real market conditions
- Reward functions based on actual trading outcomes
- No simulated market scenarios

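The sketch below shows one way an environment can replay recorded prices and derive rewards from realized outcomes. It is an illustration under stated assumptions, not the environment in `models/rl/scalping_agent.py`: the class name, the position encoding, and the default `fee_rate` are all hypothetical.

```python
from typing import Sequence, Tuple


class HistoricalReplayEnv:
    """Steps through recorded closes in order; it never invents a price."""

    def __init__(self, closes: Sequence[float], fee_rate: float = 0.001):
        if len(closes) < 2:
            raise ValueError("need at least two recorded prices")
        self.closes = list(closes)
        self.fee_rate = fee_rate  # illustrative default, not an exchange fee
        self.reset()

    def reset(self) -> float:
        self.index = 0
        self.position = 0  # -1 short, 0 flat, 1 long
        return self.closes[0]

    def step(self, target_position: int) -> Tuple[float, float, bool]:
        """Advance one recorded candle and return (price, reward, done)."""
        if target_position not in (-1, 0, 1):
            raise ValueError("target_position must be -1, 0 or 1")
        if self.index >= len(self.closes) - 1:
            raise RuntimeError("recorded history exhausted; call reset()")
        price_now = self.closes[self.index]
        price_next = self.closes[self.index + 1]
        # Fee is proportional to the size of the position change.
        fee = abs(target_position - self.position) * self.fee_rate
        self.position = target_position
        # Reward is the realized return of the held position, net of fees.
        reward = self.position * (price_next - price_now) / price_now - fee
        self.index += 1
        done = self.index == len(self.closes) - 1
        return price_next, reward, done
```

Because the next price always comes from the recorded series, the agent's reward cannot depend on a simulated scenario.
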
### 4. Configuration (`config.yaml`)
```yaml
training:
  use_only_real_data: true  # CRITICAL: Never use synthetic/generated data
```

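A flag in `config.yaml` only helps if something checks it. As a minimal sketch, assuming the file has already been parsed into a dictionary by whatever loader the project uses, a startup guard could look like this. `SyntheticDataError` and `require_real_data` are hypothetical names.

```python
from typing import Any, Mapping


class SyntheticDataError(RuntimeError):
    """Raised when the configuration does not enforce real data only."""


def require_real_data(config: Mapping[str, Any]) -> None:
    """Abort unless training.use_only_real_data is the boolean True."""
    training = config.get("training") or {}
    # A missing key, False, or a truthy string such as "yes" all fail here.
    if training.get("use_only_real_data") is not True:
        raise SyntheticDataError("training.use_only_real_data must be true")
```

Calling the guard before any training or inference entry point turns a misconfiguration into an immediate failure instead of a silent policy violation.
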
## Verification Checklist

Before any training or testing session, verify:

- [ ] Data source is a legitimate exchange API
- [ ] No data generation functions are called
- [ ] All training samples come from real market history
- [ ] Cache contains only real market data
- [ ] No synthetic indicators or features

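Part of this checklist can be supported by a static scan. The sketch below uses Python's `ast` module to flag imports of random-number modules and attribute access such as `np.random`. The deny-list and function name are assumptions for illustration, and a finding is only a prompt for manual review: randomness also has legitimate uses (for example shuffling real samples), and a scan like this cannot prove that no data is generated.

```python
import ast
from typing import List

# Illustrative deny-list; adjust it to the project's actual dependencies.
SUSPECT_MODULES = {"random", "numpy.random", "faker"}


def find_suspect_usages(source: str) -> List[str]:
    """Return findings for suspect imports and `.random` attribute access."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        elif isinstance(node, ast.Attribute) and node.attr == "random":
            findings.append(f"line {node.lineno}: attribute access .random")
            continue
        else:
            continue
        for name in names:
            if any(name == m or name.startswith(m + ".") for m in SUSPECT_MODULES):
                findings.append(f"line {node.lineno}: import of {name}")
    return findings
```
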
## Code Examples

### ✅ CORRECT: Using Real Data
```python
# Fetch real market data
df = self.data_provider.get_historical_data(symbol, timeframe, limit=1000, refresh=False)

# Generate training cases from real data
features, labels = self.data_generator.generate_training_cases(
    symbol, timeframes, num_samples=10000
)
```

## Logging and Monitoring

All data operations must log their source:
```
2025-05-24 02:36:16,674 - models.cnn.scalping_cnn - INFO - Generating 10000 training cases for ETH/USDT from REAL market data
2025-05-24 02:36:17,366 - models.cnn.scalping_cnn - INFO - Loaded 1000 real candles for ETH/USDT 1s
```

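Log lines in this shape can be produced with Python's standard `logging` module. The following is a sketch, not the project's logging setup: the format string matches the layout of the sample lines, while `describe_load` and the trailing `(source: ...)` suffix are assumptions added to make the data source explicit in each message.

```python
import logging

# Timestamp, logger name, level, message: the layout of the sample lines.
LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"


def describe_load(candle_count: int, symbol: str, timeframe: str, source: str) -> str:
    """Build a load message that always names where the candles came from."""
    return f"Loaded {candle_count} real candles for {symbol} {timeframe} (source: {source})"


logging.basicConfig(level=logging.INFO, format=LOG_FORMAT)
logger = logging.getLogger("models.cnn.scalping_cnn")
logger.info(describe_load(1000, "ETH/USDT", "1s", "binance"))
```
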
## Testing Guidelines

### Unit Tests
- Test with small samples of real data
- Use cached real data for reproducibility
- Never create mock market data

### Integration Tests
- Use real API endpoints (with rate limiting)
- Validate data authenticity
- Test with multiple timeframes and symbols

### Performance Tests
- Benchmark with real market data volumes
- Test memory usage with actual feature counts
- Validate processing speed with real data complexity

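A unit test built on cached real data might look like the sketch below. The fixture path and the CSV column names (`timestamp`, `open`, `high`, `low`, `close`, `volume`) are hypothetical. The test is skipped when the fixture is absent, so it never falls back to fabricated values.

```python
import csv
import unittest
from pathlib import Path
from typing import Dict, List

# Hypothetical location of a small sample of previously fetched real candles.
FIXTURE = Path("tests/fixtures/ETHUSDT_1m_sample.csv")


def load_cached_candles(path: Path) -> List[Dict[str, float]]:
    """Read a cached CSV of candles, converting every column to float."""
    with path.open(newline="") as handle:
        return [{key: float(value) for key, value in row.items()}
                for row in csv.DictReader(handle)]


class CachedRealDataTest(unittest.TestCase):
    @unittest.skipUnless(FIXTURE.exists(), "cached real-data fixture not present")
    def test_high_is_never_below_low(self):
        candles = load_cached_candles(FIXTURE)
        self.assertTrue(candles, "fixture must contain at least one candle")
        for candle in candles:
            self.assertGreaterEqual(candle["high"], candle["low"])
```

Run it with `python -m unittest`; pinning the fixture file in version control keeps results reproducible across machines.
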
## Emergency Procedures

If synthetic data is accidentally introduced:

1. **STOP** all training immediately
2. **PURGE** any models trained with synthetic data
3. **VERIFY** data sources and pipelines
4. **RETRAIN** from scratch with verified real data
5. **DOCUMENT** the incident and prevention measures

## Compliance Verification

Regular audits must verify:
- Data source authenticity
- Training pipeline integrity
- Model performance on real data
- Cache content validation

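Cache content validation can be partly automated with checksums. The sketch below assumes a manifest that maps each cached file name to the SHA-256 digest recorded when the file was fetched. The manifest format and function names are hypothetical. This detects files that were altered or added after the fetch; it says nothing about whether the data was real when it was first stored.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large cache files do not load into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_cache_problems(cache_dir: Path, manifest: Dict[str, str]) -> List[str]:
    """Compare cached files against digests recorded at fetch time."""
    problems = []
    for name, expected in sorted(manifest.items()):
        path = cache_dir / name
        if not path.is_file():
            problems.append(f"{name}: missing from cache")
        elif sha256_of(path) != expected:
            problems.append(f"{name}: digest mismatch")
    # Files present on disk but never recorded are also suspicious.
    for path in sorted(cache_dir.iterdir()):
        if path.is_file() and path.name not in manifest:
            problems.append(f"{path.name}: not in manifest")
    return problems
```
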
## Contact and Escalation

Any questions about data authenticity should be escalated immediately. When in doubt, **ALWAYS** choose real market data over convenience.

---

**Remember: The integrity of our trading system depends on using only real market data. No exceptions.**

## ❌ **EXAMPLES OF FORBIDDEN OPERATIONS**

### **Code Patterns to NEVER Use:**

```python
# ❌ FORBIDDEN EXAMPLES - DO NOT IMPLEMENT

# These patterns are STRICTLY FORBIDDEN:
# - Any random data generation
# - Any synthetic price creation
# - Any mock trading data
# - Any simulated market scenarios

# ✅ ONLY ALLOWED: Real market data from exchanges
real_data = binance_client.get_historical_klines(symbol, interval, limit)
live_price = binance_client.get_ticker_price(symbol)
```