A real-time data pipeline for stationary Bitcoin ML features.
The QuantCandle Engine is a backend service designed to solve the "non-stationarity" problem in financial machine learning. It consumes live 1m K-Line streams via WebSockets, applies vectorized mathematical transformations in real time, and maintains a high-concurrency data bridge for the QuantCandle API.
- Real-Time Ingestion: Leverages `ThreadedWebsocketManager` for non-blocking Binance 1m K-Line streams.
- Gap-Resilient Pipeline: Features `backfilling` logic that identifies historical data gaps using SQL window functions (`LEAD`) and patches them via the Binance REST API to ensure calculation continuity.
- Vectorized Engineering: Transforms raw OHLCV data into scale-invariant features using NumPy, Pandas, and TA-Lib.
- Decoupled Persistence: Maintains an optimized SQLite database with independent tables for raw data (`ohlcv_1m`) and engineered features (`features_1m`).
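The gap detection mentioned in the list above can be sketched with a `LEAD` window function over the raw table. This is a minimal illustration, not the engine's actual query: the `open_time` column (milliseconds since epoch) and the `find_gaps` helper are assumptions made for the example, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

ONE_MINUTE_MS = 60_000  # spacing between consecutive 1m candle open times


def find_gaps(conn):
    """Return (gap_start, gap_end) open times of missing candles (hypothetical helper)."""
    query = """
        SELECT open_time + :step      AS gap_start,
               next_open_time - :step AS gap_end
        FROM (
            SELECT open_time,
                   LEAD(open_time) OVER (ORDER BY open_time) AS next_open_time
            FROM ohlcv_1m
        )
        WHERE next_open_time - open_time > :step
        ORDER BY gap_start
    """
    return conn.execute(query, {"step": ONE_MINUTE_MS}).fetchall()


# In-memory demo: candles at minutes 0, 1, 2, 5, 6 (minutes 3 and 4 are missing).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ohlcv_1m (open_time INTEGER PRIMARY KEY, close REAL)")
open_times = [0, 60_000, 120_000, 300_000, 360_000]
conn.executemany(
    "INSERT INTO ohlcv_1m VALUES (?, ?)", [(t, 100.0) for t in open_times]
)
gaps = find_gaps(conn)
print(gaps)
```

Each returned pair marks the first and last missing open time, which is the range a REST backfill request would need to cover.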
- Language: Python 3.12+
- Package Manager: uv
- Data Processing: Pandas, NumPy, TA-Lib
- Connectivity: python-binance (WebSocket + REST)
- Storage: SQLite3 with vectorized `update_db_pandas` logic
log_returns: Continuously compounded returns, standardized.
parkinson_volatility: Volatility estimator based on the High-Low range.
log_upper_wick:
log_lower_wick:
ibs (Internal Bar Strength): Position of the Close relative to the High-Low range. Naturally stationary and bounded in [0, 1].
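As a rough illustration of the price features above, the sketch below computes raw log returns, a single-bar Parkinson estimate, and IBS with NumPy. It rests on several assumptions: the textbook formulas are used, the standardization step for `log_returns` is omitted, the wick features are left out because their definitions are not given here, and the 0.5 fallback for zero-range bars is an arbitrary choice for the example. The `price_features` name is hypothetical.

```python
import numpy as np


def price_features(high, low, close):
    """Compute per-bar log return, Parkinson volatility, and IBS from OHLC arrays."""
    high = np.asarray(high, dtype=float)
    low = np.asarray(low, dtype=float)
    close = np.asarray(close, dtype=float)

    # Log return: ln(C_t / C_{t-1}); the first bar has no predecessor.
    log_returns = np.full(close.shape, np.nan)
    log_returns[1:] = np.diff(np.log(close))

    # Single-bar Parkinson estimator: sqrt(ln(H/L)^2 / (4 ln 2)).
    parkinson = np.sqrt(np.log(high / low) ** 2 / (4.0 * np.log(2.0)))

    # IBS: (C - L) / (H - L); zero-range bars fall back to 0.5 (assumed convention).
    bar_range = high - low
    ibs = np.divide(
        close - low, bar_range, out=np.full(close.shape, 0.5), where=bar_range > 0
    )
    return {"log_returns": log_returns, "parkinson_volatility": parkinson, "ibs": ibs}


features = price_features(
    high=[101.0, 103.0, 102.0],
    low=[99.0, 100.0, 102.0],
    close=[100.0, 102.0, 102.0],
)
print(features)
```

All three quantities depend only on price ratios or positions within the bar, so they stay on the same scale whether Bitcoin trades at 1,000 or 100,000.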
Time is decomposed into cyclical sine/cosine pairs to preserve temporal continuity (e.g., minute 59 is close to minute 0).
minute/hour/day... sin/cos: $\sin(2\pi t / T)$ and $\cos(2\pi t / T)$
(Where $t$ is the time component and $T$ is its period, e.g., 60 for minutes, 24 for hours).
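A small sketch of the cyclical encoding, assuming the standard form sin(2πt/T), cos(2πt/T). The helper names are made up for the example. It also checks the continuity claim by comparing distances between encoded minutes.

```python
import math


def cyclical_encode(value, period):
    """Map a time component onto the unit circle as a (sin, cos) pair."""
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def encoded_distance(a, b, period):
    """Euclidean distance between two encoded time components."""
    return math.dist(cyclical_encode(a, period), cyclical_encode(b, period))


# Minute 59 sits next to minute 0 on the circle; minute 30 is on the opposite side.
wrap_distance = encoded_distance(59, 0, 60)
half_distance = encoded_distance(30, 0, 60)
print(wrap_distance, half_distance)
```

With a raw integer encoding, minute 59 and minute 0 would be 59 units apart. On the circle they are as close as minutes 0 and 1.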
parkinson_volatility: A high-efficiency volatility estimator using the High/Low range.
bbands: Measures the relative width of the Bollinger Bands (volatility expansion/contraction).
ppo_line: Percentage Price Oscillator. Normalized version of MACD, making it naturally stationary.
ppo_signal: Exponential moving average of the `ppo_line`.
ppo_histogram: Divergence between PPO line and signal.
imbalance: Measures market aggressiveness.
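To show why the PPO is scale-invariant where MACD is not, here is a dependency-free sketch. The 12/26/9 periods are the conventional defaults and are an assumption here, as is seeding each EMA with the first value (TA-Lib seeds differently, so early values will not match it exactly). The `ema` and `ppo` helpers are illustrative, not the engine's code.

```python
def ema(values, period):
    """Exponential moving average with alpha = 2 / (period + 1), seeded with the first value."""
    alpha = 2.0 / (period + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1.0 - alpha) * out[-1])
    return out


def ppo(close, fast=12, slow=26, signal=9):
    """Return (ppo_line, ppo_signal, ppo_histogram) as lists."""
    fast_ema = ema(close, fast)
    slow_ema = ema(close, slow)
    # Dividing by the slow EMA turns the MACD difference into a percentage.
    line = [100.0 * (f - s) / s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(line, signal)
    histogram = [l - s for l, s in zip(line, signal_line)]
    return line, signal_line, histogram


prices = [100.0 + 0.5 * i for i in range(50)]
scaled_prices = [p * 10.0 for p in prices]

ppo_line, ppo_signal, ppo_histogram = ppo(prices)
scaled_line, _, _ = ppo(scaled_prices)

# Multiplying every price by 10 leaves the PPO line unchanged.
max_diff = max(abs(a - b) for a, b in zip(ppo_line, scaled_line))
print(max_diff)
```

The same series fed to a plain MACD would produce values ten times larger after scaling, which is the non-stationarity the percentage form avoids.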
Volume is normalized relative to a trailing window to detect anomalies relative to recent activity.
relative_volume: Volume normalized by its trailing-window moving average.
(Where the window length is set in the .env configuration)
relative_taker_buy_volume: Taker buy volume normalized by its window period moving average.
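The volume normalization above can be sketched as a ratio to a trailing moving average. This is plain Python for portability (the engine presumably does the same thing with Pandas), and it assumes the window includes the current bar. The `relative_volume` function and the window of 4 are illustrative only.

```python
def relative_volume(volume, window):
    """Divide each bar's volume by the mean of the trailing `window` bars (current bar included)."""
    out = []
    for i in range(len(volume)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
            continue
        mean = sum(volume[i + 1 - window : i + 1]) / window
        out.append(volume[i] / mean if mean > 0 else None)
    return out


volumes = [10.0, 10.0, 10.0, 10.0, 40.0]
rel = relative_volume(volumes, window=4)
print(rel)
```

A value near 1 means typical activity for the window, while the final bar's spike shows up as a ratio well above 1. The same helper would apply unchanged to the taker buy volume series.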
halving_sin/cos: A long-term cyclical feature representing Bitcoin's ~4-year supply shock cycle.
The engine operates on an event-driven loop: it triggers on every closed candle (kline['x']), commits raw data, verifies continuity via backfilling, and pushes new features into the database.
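The closed-candle trigger can be sketched as a small message filter. The field names (`k`, `x`, `t`, `o`, `h`, `l`, `c`, `v`, `V`) follow the Binance kline stream payload. The `parse_closed_kline` function and the shape of the returned row are assumptions for illustration, not the engine's actual handler.

```python
def parse_closed_kline(msg):
    """Return a row dict for a closed candle, or None while the candle is still forming."""
    kline = msg.get("k")
    if msg.get("e") != "kline" or not kline or not kline.get("x"):
        return None
    return {
        "open_time": int(kline["t"]),
        "open": float(kline["o"]),
        "high": float(kline["h"]),
        "low": float(kline["l"]),
        "close": float(kline["c"]),
        "volume": float(kline["v"]),
        "taker_buy_volume": float(kline["V"]),
    }


# Two sample messages: the same candle while still open, then once closed.
forming = {"e": "kline", "k": {"t": 1700000000000, "o": "100.0", "h": "101.0",
                               "l": "99.5", "c": "100.5", "v": "12.0",
                               "V": "7.0", "x": False}}
closed = {"e": "kline", "k": dict(forming["k"], x=True)}

skipped = parse_closed_kline(forming)
row = parse_closed_kline(closed)
print(skipped, row)
```

A function like this could be registered as the stream callback, with the returned row handed to the commit, backfill check, and feature steps in that order.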
This project uses uv for lightning-fast environment management.
git clone https://github.com/StarlitVienna/quantcandle-engine.git
cd quantcandle-engine
uv run python main.py
- Kaggle Dataset: Bitcoin Quant Core
- Live API: QuantCandle Data on RapidAPI