Merge pull request #18 from allora-network/clement/SOLU-1362

Add real-time data fetching and configuration options
kush-alloralabs 2024-09-04 14:26:37 -04:00 committed by GitHub
commit 70cf49d0a2
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
12 changed files with 387 additions and 160 deletions

.env.example Normal file

@@ -0,0 +1,7 @@
TOKEN=
TRAINING_DAYS=
TIMEFRAME=
MODEL=
REGION=
DATA_PROVIDER=
CG_API_KEY=

.gitignore vendored

@@ -9,6 +9,13 @@ inference-data
 worker-data
 config.json
-env
-.env
+/data
+env_file
+**/*.venv*
+**/.cache
+**/.env
+**/env_file
+**/.gitkeep*
+**/*.csv
+**/*.pkl
+**/*.zip

Dockerfile

@@ -1,5 +1,4 @@
-# Use an official Python runtime as the base image
-FROM amd64/python:3.9-buster as project_env
+FROM python:3.11-slim as project_env
 
 # Set the working directory in the container
 WORKDIR /app

README.md

@@ -1,12 +1,12 @@
-# Basic ETH Price Prediction Node
+# Basic Price Prediction Node
 
-This repository provides an example Allora network worker node, designed to offer price predictions for ETH. The primary objective is to demonstrate the use of a basic inference model running within a dedicated container, showcasing its integration with the Allora network infrastructure to contribute valuable inferences.
+This repository provides an example Allora network worker node, designed to offer price predictions. The primary objective is to demonstrate the use of a basic inference model running within a dedicated container, showcasing its integration with the Allora network infrastructure to contribute valuable inferences.
 
 ## Components
 
 - **Worker**: The node that publishes inferences to the Allora chain.
 - **Inference**: A container that conducts inferences, maintains the model state, and responds to internal inference requests via a Flask application. This node operates with a basic linear regression model for price predictions.
-- **Updater**: A cron-like container designed to update the inference node's data by daily fetching the latest market information from Binance, ensuring the model stays current with new market trends.
+- **Updater**: A cron-like container designed to update the inference node's data by daily fetching the latest market information from the data provider, ensuring the model stays current with new market trends.
 
 Check the `docker-compose.yml` file for the detailed setup of each component.
@@ -17,14 +17,45 @@ A complete working example is provided in the `docker-compose.yml` file.
 
 ### Steps to Setup
 
 1. **Clone the Repository**
 
-2. **Copy and Populate Configuration**
+2. **Copy and Populate Model Configuration environment file**
+
+   Copy the example .env.example file and populate it with your variables:
+   ```sh
+   cp .env.example .env
+   ```
+
+   Here are the currently accepted configurations:
+   - TOKEN
+     Must be one of ['ETH','SOL','BTC','BNB','ARB'].
+     Note: if you are using `Binance` as the data provider, any token can be used.
+     If you are using Coingecko, you should add its `coin_id` in the [token_map here](https://github.com/allora-network/basic-coin-prediction-node/blob/main/updater.py#L107). Find [more info here](https://docs.coingecko.com/reference/simple-price) and the [list here](https://docs.google.com/spreadsheets/d/1wTTuxXt8n9q7C4NDXqQpI3wpKu1_5bGVmP9Xz0XGSyU/edit?usp=sharing).
+   - TRAINING_DAYS
+     Must be an `int` >= 1.
+     Represents how many days of historical data to use.
+   - TIMEFRAME
+     This should be in the form `10m`, `1h`, `1d`, etc.
+     Note: for Coingecko, data granularity (the candle's body) is automatic - [see here](https://docs.coingecko.com/reference/coins-id-ohlc). To avoid downsampling, the following is recommended with Coingecko:
+     - TIMEFRAME >= 30m if TRAINING_DAYS <= 2
+     - TIMEFRAME >= 4h if TRAINING_DAYS <= 30
+     - TIMEFRAME >= 4d if TRAINING_DAYS >= 31
+   - MODEL
+     Must be one of ['LinearRegression','SVR','KernelRidge','BayesianRidge'].
+     You can easily add support for other models by [adding them here](https://github.com/allora-network/basic-coin-prediction-node/blob/main/model.py#L133).
+   - REGION
+     Must be `EU` or `US` - it is used for the Binance API.
+   - DATA_PROVIDER
+     Must be `Binance` or `Coingecko`. Feel free to add support for other data providers to personalize your model!
+   - CG_API_KEY
+     This is your `Coingecko` API key, if you've set `DATA_PROVIDER=coingecko`.
+
+3. **Copy and Populate Worker Configuration**
 
    Copy the example configuration file and populate it with your variables:
    ```sh
    cp config.example.json config.json
    ```
 
-3. **Initialize Worker**
+4. **Initialize Worker**
 
    Run the following commands from the project's root directory to initialize the worker:
   ```sh
@@ -35,11 +66,11 @@ A complete working example is provided in the `docker-compose.yml` file.
 
    - Automatically create Allora keys for your worker.
    - Export the needed variables from the created account to be used by the worker node, bundle them with your provided `config.json`, and pass them to the node as environment variables.
 
-4. **Faucet Your Worker Node**
+5. **Faucet Your Worker Node**
 
    You can find the offchain worker node's address in `./worker-data/env_file` under `ALLORA_OFFCHAIN_ACCOUNT_ADDRESS`. [Add faucet funds](https://docs.allora.network/devs/get-started/setup-wallet#add-faucet-funds) to your worker's wallet before starting it.
 
-5. **Start the Services**
+6. **Start the Services**
 
    Run the following command to start the worker node, inference, and updater nodes:
   ```sh

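As a quick sanity check of the configuration values described above, here is a minimal sketch (not part of this commit) that validates a populated `.env`; it assumes `python-dotenv` is installed and is run from the project root.

```python
# Minimal sketch (not part of this commit): validate a populated .env against
# the accepted configuration values documented in the README above.
import os
from dotenv import load_dotenv

load_dotenv()

allowed_models = {"LinearRegression", "SVR", "KernelRidge", "BayesianRidge"}
coingecko_tokens = {"ETH", "SOL", "BTC", "BNB", "ARB"}  # tokens in updater.py's token_map

token = (os.getenv("TOKEN") or "").upper()
provider = (os.getenv("DATA_PROVIDER") or "").lower()

assert provider in {"binance", "coingecko"}, "DATA_PROVIDER must be Binance or Coingecko"
assert int(os.getenv("TRAINING_DAYS", "0")) >= 1, "TRAINING_DAYS must be an int >= 1"
assert os.getenv("MODEL") in allowed_models, "Unsupported MODEL"
assert (os.getenv("REGION") or "").upper() in {"EU", "US"}, "REGION must be EU or US"
if provider == "coingecko":
    # Binance accepts any listed pair; the fixed token list applies to Coingecko's token_map.
    assert token in coingecko_tokens, f"{token} is not in token_map"
    assert os.getenv("CG_API_KEY"), "CG_API_KEY is required when DATA_PROVIDER=coingecko"
print("Configuration looks valid")
```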
app.py

@@ -1,43 +1,28 @@
 import json
-import pickle
-import pandas as pd
-import numpy as np
-from datetime import datetime
-from flask import Flask, jsonify, Response
-from model import download_data, format_data, train_model
-from config import model_file_path
+from flask import Flask, Response
+from model import download_data, format_data, train_model, get_inference
+from config import model_file_path, TOKEN, TIMEFRAME, TRAINING_DAYS, REGION, DATA_PROVIDER
 
 app = Flask(__name__)
 
 
 def update_data():
     """Download price data, format data and train model."""
-    download_data()
-    format_data()
-    train_model()
+    files = download_data(TOKEN, TRAINING_DAYS, REGION, DATA_PROVIDER)
+    format_data(files, DATA_PROVIDER)
+    train_model(TIMEFRAME)
 
 
-def get_eth_inference():
-    """Load model and predict current price."""
-    with open(model_file_path, "rb") as f:
-        loaded_model = pickle.load(f)
-    now_timestamp = pd.Timestamp(datetime.now()).timestamp()
-    X_new = np.array([now_timestamp]).reshape(-1, 1)
-    current_price_pred = loaded_model.predict(X_new)
-    return current_price_pred[0][0]
-
-
 @app.route("/inference/<string:token>")
 def generate_inference(token):
     """Generate inference for given token."""
-    if not token or token != "ETH":
+    if not token or token.upper() != TOKEN:
         error_msg = "Token is required" if not token else "Token not supported"
         return Response(json.dumps({"error": error_msg}), status=400, mimetype='application/json')
 
     try:
-        inference = get_eth_inference()
+        inference = get_inference(token.upper(), TIMEFRAME, REGION, DATA_PROVIDER)
         return Response(str(inference), status=200)
     except Exception as e:
         return Response(json.dumps({"error": str(e)}), status=500, mimetype='application/json')

config.example.json

@@ -3,12 +3,12 @@
     "addressKeyName": "test",
     "addressRestoreMnemonic": "",
     "alloraHomeDir": "",
-    "gas": "1000000",
+    "gas": "auto",
-    "gasAdjustment": 1.0,
+    "gasAdjustment": 1.5,
-    "nodeRpc": "http://localhost:26657",
+    "nodeRpc": "https://allora-rpc.testnet.allora.network",
     "maxRetries": 1,
     "delay": 1,
-    "submitTx": false
+    "submitTx": true
   },
   "worker": [
     {

config.py

@@ -1,5 +1,21 @@
 import os
+from dotenv import load_dotenv
+
+# Load environment variables from .env file
+load_dotenv()
 
 app_base_path = os.getenv("APP_BASE_PATH", default=os.getcwd())
 data_base_path = os.path.join(app_base_path, "data")
 model_file_path = os.path.join(data_base_path, "model.pkl")
+
+TOKEN = os.getenv("TOKEN").upper()
+TRAINING_DAYS = os.getenv("TRAINING_DAYS")
+TIMEFRAME = os.getenv("TIMEFRAME")
+MODEL = os.getenv("MODEL")
+REGION = os.getenv("REGION").lower()
+if REGION in ["us", "com", "usa"]:
+    REGION = "us"
+else:
+    REGION = "com"
+DATA_PROVIDER = os.getenv("DATA_PROVIDER").lower()
+CG_API_KEY = os.getenv("CG_API_KEY", default=None)
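The REGION handling above simply normalizes the user-facing value to the Binance API domain suffix. A small illustration (the value shown is an example, not a default):

```python
# Illustration of the REGION normalization above: "US"/"USA"/"com"-style inputs
# collapse to the domain suffix used in Binance API URLs.
import os

os.environ["REGION"] = "US"  # example value, as it would appear in .env
region = os.environ["REGION"].lower()
region = "us" if region in ["us", "com", "usa"] else "com"
print(f"https://api.binance.{region}/api/v3/klines")  # -> https://api.binance.us/api/v3/klines
```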

docker-compose.yml

@@ -1,12 +1,14 @@
 services:
   inference:
     container_name: inference-basic-eth-pred
+    env_file:
+      - .env
     build: .
     command: python -u /app/app.py
     ports:
       - "8000:8000"
     healthcheck:
-      test: ["CMD", "curl", "-f", "http://localhost:8000/inference/ETH"]
+      test: ["CMD", "curl", "-f", "http://localhost:8000/inference/${TOKEN}"]
       interval: 10s
       timeout: 5s
       retries: 12
 
@@ -31,7 +33,7 @@ services:
   worker:
     container_name: worker
-    image: alloranetwork/allora-offchain-node:latest
+    image: alloranetwork/allora-offchain-node:v0.3.0
     volumes:
       - ./worker-data:/data
     depends_on:

(worker initialization script)

@@ -36,7 +36,7 @@ ENV_LOADED=$(grep '^ENV_LOADED=' ./worker-data/env_file | cut -d '=' -f 2)
 if [ "$ENV_LOADED" = "false" ]; then
     json_content=$(cat ./config.json)
     stringified_json=$(echo "$json_content" | jq -c .)
 
-    docker run -it --entrypoint=bash -v $(pwd)/worker-data:/data -v $(pwd)/scripts:/scripts -e NAME="${nodeName}" -e ALLORA_OFFCHAIN_NODE_CONFIG_JSON="${stringified_json}" alloranetwork/allora-chain:latest -c "bash /scripts/init.sh"
+    docker run -it --entrypoint=bash -v $(pwd)/worker-data:/data -v $(pwd)/scripts:/scripts -e NAME="${nodeName}" -e ALLORA_OFFCHAIN_NODE_CONFIG_JSON="${stringified_json}" alloranetwork/allora-chain:v0.4.0 -c "bash /scripts/init.sh"
 
     echo "config.json saved to ./worker-data/env_file"
 else
     echo "config.json is already loaded, skipping the operation. You can set ENV_LOADED variable to false in ./worker-data/env_file to reload the config.json"

model.py

@@ -1,46 +1,55 @@
+import json
 import os
 import pickle
 from zipfile import ZipFile
-from datetime import datetime
 import pandas as pd
-from sklearn.model_selection import train_test_split
-from sklearn.linear_model import LinearRegression
-from updater import download_binance_monthly_data, download_binance_daily_data
-from config import data_base_path, model_file_path
+from sklearn.kernel_ridge import KernelRidge
+from sklearn.linear_model import BayesianRidge, LinearRegression
+from sklearn.svm import SVR
+from updater import download_binance_daily_data, download_binance_current_day_data, download_coingecko_data, download_coingecko_current_day_data
+from config import data_base_path, model_file_path, TOKEN, MODEL, CG_API_KEY
 
-binance_data_path = os.path.join(data_base_path, "binance/futures-klines")
-training_price_data_path = os.path.join(data_base_path, "eth_price_data.csv")
+binance_data_path = os.path.join(data_base_path, "binance")
+coingecko_data_path = os.path.join(data_base_path, "coingecko")
+training_price_data_path = os.path.join(data_base_path, "price_data.csv")
 
 
-def download_data():
-    cm_or_um = "um"
-    symbols = ["ETHUSDT"]
-    intervals = ["1d"]
-    years = ["2020", "2021", "2022", "2023", "2024"]
-    months = ["01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12"]
-    download_path = binance_data_path
-    download_binance_monthly_data(
-        cm_or_um, symbols, intervals, years, months, download_path
-    )
-    print(f"Downloaded monthly data to {download_path}.")
-    current_datetime = datetime.now()
-    current_year = current_datetime.year
-    current_month = current_datetime.month
-    download_binance_daily_data(
-        cm_or_um, symbols, intervals, current_year, current_month, download_path
-    )
-    print(f"Downloaded daily data to {download_path}.")
+def download_data_binance(token, training_days, region):
+    files = download_binance_daily_data(f"{token}USDT", training_days, region, binance_data_path)
+    print(f"Downloaded {len(files)} new files")
+    return files
 
 
-def format_data():
-    files = sorted([x for x in os.listdir(binance_data_path)])
+def download_data_coingecko(token, training_days):
+    files = download_coingecko_data(token, training_days, coingecko_data_path, CG_API_KEY)
+    print(f"Downloaded {len(files)} new files")
+    return files
+
+
+def download_data(token, training_days, region, data_provider):
+    if data_provider == "coingecko":
+        return download_data_coingecko(token, int(training_days))
+    elif data_provider == "binance":
+        return download_data_binance(token, training_days, region)
+    else:
+        raise ValueError("Unsupported data provider")
+
+
+def format_data(files, data_provider):
+    if not files:
+        print("Already up to date")
+        return
+
+    if data_provider == "binance":
+        files = sorted([x for x in os.listdir(binance_data_path) if x.startswith(f"{TOKEN}USDT")])
+    elif data_provider == "coingecko":
+        files = sorted([x for x in os.listdir(coingecko_data_path) if x.endswith(".json")])
 
     # No files to process
     if len(files) == 0:
         return
 
     price_df = pd.DataFrame()
+    if data_provider == "binance":
     for file in files:
         zip_file_path = os.path.join(binance_data_path, file)
@@ -65,34 +74,68 @@ def format_data():
                 "taker_volume",
                 "taker_volume_usd",
             ]
-            df.index = [pd.Timestamp(x + 1, unit="ms") for x in df["end_time"]]
+            df.index = [pd.Timestamp(x + 1, unit="ms").to_datetime64() for x in df["end_time"]]
             df.index.name = "date"
             price_df = pd.concat([price_df, df])
         price_df.sort_index().to_csv(training_price_data_path)
+    elif data_provider == "coingecko":
+        for file in files:
+            with open(os.path.join(coingecko_data_path, file), "r") as f:
+                data = json.load(f)
+                df = pd.DataFrame(data)
+                df.columns = [
+                    "timestamp",
+                    "open",
+                    "high",
+                    "low",
+                    "close"
+                ]
+                df["date"] = pd.to_datetime(df["timestamp"], unit="ms")
+                df.drop(columns=["timestamp"], inplace=True)
+                df.set_index("date", inplace=True)
+                price_df = pd.concat([price_df, df])
+        price_df.sort_index().to_csv(training_price_data_path)
 
 
-def train_model():
-    # Load the eth price data
+def load_frame(frame, timeframe):
+    print(f"Loading data...")
+    df = frame.loc[:,['open','high','low','close']].dropna()
+    df[['open','high','low','close']] = df[['open','high','low','close']].apply(pd.to_numeric)
+    df['date'] = frame['date'].apply(pd.to_datetime)
+    df.set_index('date', inplace=True)
+    df.sort_index(inplace=True)
+    return df.resample(f'{timeframe}', label='right', closed='right', origin='end').mean()
+
+
+def train_model(timeframe):
+    # Load the price data
     price_data = pd.read_csv(training_price_data_path)
-    df = pd.DataFrame()
+    df = load_frame(price_data, timeframe)
 
-    # Convert 'date' to a numerical value (timestamp) we can use for regression
-    df["date"] = pd.to_datetime(price_data["date"])
-    df["date"] = df["date"].map(pd.Timestamp.timestamp)
-    df["price"] = price_data[["open", "close", "high", "low"]].mean(axis=1)
+    print(df.tail())
 
-    # Reshape the data to the shape expected by sklearn
-    x = df["date"].values.reshape(-1, 1)
-    y = df["price"].values.reshape(-1, 1)
+    y_train = df['close'].shift(-1).dropna().values
+    X_train = df[:-1]
+
+    print(f"Training data shape: {X_train.shape}, {y_train.shape}")
 
-    # Split the data into training set and test set
-    x_train, _, y_train, _ = train_test_split(x, y, test_size=0.2, random_state=0)
+    # Define the model
+    if MODEL == "LinearRegression":
+        model = LinearRegression()
+    elif MODEL == "SVR":
+        model = SVR()
+    elif MODEL == "KernelRidge":
+        model = KernelRidge()
+    elif MODEL == "BayesianRidge":
+        model = BayesianRidge()
+    # Add more models here
+    else:
+        raise ValueError("Unsupported model")
 
     # Train the model
-    model = LinearRegression()
-    model.fit(x_train, y_train)
+    model.fit(X_train, y_train)
 
     # create the model's parent directory if it doesn't exist
     os.makedirs(os.path.dirname(model_file_path), exist_ok=True)
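The new `load_frame`/`train_model` pair resamples raw OHLC rows into `TIMEFRAME` buckets and trains against the next bucket's close. A minimal sketch with synthetic data (not part of this commit) showing the shapes involved:

```python
# Synthetic illustration of the resampling and next-close target used above.
import numpy as np
import pandas as pd

idx = pd.date_range("2024-01-01", periods=120, freq="1min")
raw = pd.DataFrame(np.random.rand(120, 4), columns=["open", "high", "low", "close"], index=idx)

df = raw.resample("10min", label="right", closed="right", origin="end").mean()
y_train = df["close"].shift(-1).dropna().values  # target: next period's close
X_train = df[:-1]                                # features: current period's mean OHLC
print(X_train.shape, y_train.shape)              # equal number of rows
```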
@@ -102,3 +145,22 @@ def train_model():
         pickle.dump(model, f)
 
     print(f"Trained model saved to {model_file_path}")
+
+
+def get_inference(token, timeframe, region, data_provider):
+    """Load model and predict current price."""
+    with open(model_file_path, "rb") as f:
+        loaded_model = pickle.load(f)
+
+    # Get current price
+    if data_provider == "coingecko":
+        X_new = load_frame(download_coingecko_current_day_data(token, CG_API_KEY), timeframe)
+    else:
+        X_new = load_frame(download_binance_current_day_data(f"{TOKEN}USDT", region), timeframe)
+
+    print(X_new.tail())
+    print(X_new.shape)
+
+    current_price_pred = loaded_model.predict(X_new)
+
+    return current_price_pred[0]
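A possible follow-up, not part of this commit: the `MODEL` if/elif chain in `train_model` could be folded into a lookup table, so adding a model becomes a one-line change.

```python
# Sketch of an alternative to the if/elif chain in train_model (not in this commit).
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import BayesianRidge, LinearRegression
from sklearn.svm import SVR

MODEL_REGISTRY = {
    "LinearRegression": LinearRegression,
    "SVR": SVR,
    "KernelRidge": KernelRidge,
    "BayesianRidge": BayesianRidge,
    # Add more models here
}

def build_model(name: str):
    try:
        return MODEL_REGISTRY[name]()  # instantiate the selected estimator
    except KeyError:
        raise ValueError("Unsupported model") from None
```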

requirements.txt

@@ -1,7 +1,9 @@
 flask[async]
 gunicorn[gthread]
-numpy==1.26.2
-pandas==2.1.3
-Requests==2.32.0
-scikit_learn==1.3.2
-werkzeug>=3.0.3 # not directly required, pinned by Snyk to avoid a vulnerability
+numpy
+pandas
+Requests
+aiohttp
+multiprocess
+scikit_learn
+python-dotenv

updater.py

@@ -1,59 +1,175 @@
 import os
+from datetime import date, timedelta
+import pathlib
+import time
 import requests
+from requests.adapters import HTTPAdapter
+from urllib3.util import Retry
 from concurrent.futures import ThreadPoolExecutor
+import pandas as pd
+import json
+
+
+# Define the retry strategy
+retry_strategy = Retry(
+    total=4,  # Maximum number of retries
+    backoff_factor=2,  # Exponential backoff factor (e.g., 2 means 1, 2, 4, 8 seconds, ...)
+    status_forcelist=[429, 500, 502, 503, 504],  # HTTP status codes to retry on
+)
+
+# Create an HTTP adapter with the retry strategy and mount it to session
+adapter = HTTPAdapter(max_retries=retry_strategy)
+
+# Create a new session object
+session = requests.Session()
+session.mount('http://', adapter)
+session.mount('https://', adapter)
+
+files = []
 
 
 # Function to download the URL, called asynchronously by several child processes
-def download_url(url, download_path):
-    target_file_path = os.path.join(download_path, os.path.basename(url))
-    if os.path.exists(target_file_path):
-        # print(f"File already exists: {url}")
-        return
-
-    response = requests.get(url)
-    if response.status_code == 404:
-        # print(f"File not exist: {url}")
-        pass
-    else:
-        # create the entire path if it doesn't exist
-        os.makedirs(os.path.dirname(target_file_path), exist_ok=True)
-
-        with open(target_file_path, "wb") as f:
-            f.write(response.content)
-        # print(f"Downloaded: {url} to {target_file_path}")
+def download_url(url, download_path, name=None):
+    try:
+        global files
+        if name:
+            file_name = os.path.join(download_path, name)
+        else:
+            file_name = os.path.join(download_path, os.path.basename(url))
+        dir_path = os.path.dirname(file_name)
+        pathlib.Path(dir_path).mkdir(parents=True, exist_ok=True)
+        if os.path.isfile(file_name):
+            # print(f"{file_name} already exists")
+            return
+
+        # Make a request using the session object
+        response = session.get(url)
+        if response.status_code == 404:
+            print(f"File does not exist: {url}")
+        elif response.status_code == 200:
+            with open(file_name, 'wb') as f:
+                f.write(response.content)
+            # print(f"Downloaded: {url} to {file_name}")
+            files.append(file_name)
+            return
+        else:
+            print(f"Failed to download {url}")
+            return
+    except Exception as e:
+        print(str(e))
 
 
-def download_binance_monthly_data(
-    cm_or_um, symbols, intervals, years, months, download_path
-):
-    # Verify if CM_OR_UM is correct, if not, exit
-    if cm_or_um not in ["cm", "um"]:
-        print("CM_OR_UM can be only cm or um")
-        return
-    base_url = f"https://data.binance.vision/data/futures/{cm_or_um}/monthly/klines"
-
-    # Main loop to iterate over all the arrays and launch child processes
-    with ThreadPoolExecutor() as executor:
-        for symbol in symbols:
-            for interval in intervals:
-                for year in years:
-                    for month in months:
-                        url = f"{base_url}/{symbol}/{interval}/{symbol}-{interval}-{year}-{month}.zip"
-                        executor.submit(download_url, url, download_path)
-
-
-def download_binance_daily_data(
-    cm_or_um, symbols, intervals, year, month, download_path
-):
-    if cm_or_um not in ["cm", "um"]:
-        print("CM_OR_UM can be only cm or um")
-        return
-    base_url = f"https://data.binance.vision/data/futures/{cm_or_um}/daily/klines"
-
-    with ThreadPoolExecutor() as executor:
-        for symbol in symbols:
-            for interval in intervals:
-                for day in range(1, 32):  # Assuming days range from 1 to 31
-                    url = f"{base_url}/{symbol}/{interval}/{symbol}-{interval}-{year}-{month:02d}-{day:02d}.zip"
-                    executor.submit(download_url, url, download_path)
+# Function to generate a range of dates
+def daterange(start_date, end_date):
+    for n in range(int((end_date - start_date).days)):
+        yield start_date + timedelta(n)
+
+
+# Function to download daily data from Binance
+def download_binance_daily_data(pair, training_days, region, download_path):
+    base_url = f"https://data.binance.vision/data/spot/daily/klines"
+    end_date = date.today()
+    start_date = end_date - timedelta(days=int(training_days))
+
+    global files
+    files = []
+
+    with ThreadPoolExecutor() as executor:
+        print(f"Downloading data for {pair}")
+        for single_date in daterange(start_date, end_date):
+            url = f"{base_url}/{pair}/1m/{pair}-1m-{single_date}.zip"
+            executor.submit(download_url, url, download_path)
+
+    return files
+
+
+def download_binance_current_day_data(pair, region):
+    limit = 1000
+    base_url = f'https://api.binance.{region}/api/v3/klines?symbol={pair}&interval=1m&limit={limit}'
+
+    # Make a request using the session object
+    response = session.get(base_url)
+    response.raise_for_status()
+
+    resp = str(response.content, 'utf-8').rstrip()
+    columns = ['start_time','open','high','low','close','volume','end_time','volume_usd','n_trades','taker_volume','taker_volume_usd','ignore']
+    df = pd.DataFrame(json.loads(resp), columns=columns)
+    df['date'] = [pd.to_datetime(x+1, unit='ms') for x in df['end_time']]
+    df['date'] = df['date'].apply(pd.to_datetime)
+    df[["volume", "taker_volume", "open", "high", "low", "close"]] = df[["volume", "taker_volume", "open", "high", "low", "close"]].apply(pd.to_numeric)
+
+    return df.sort_index()
+
+
+def get_coingecko_coin_id(token):
+    token_map = {
+        'ETH': 'ethereum',
+        'SOL': 'solana',
+        'BTC': 'bitcoin',
+        'BNB': 'binancecoin',
+        'ARB': 'arbitrum',
+        # Add more tokens here
+    }
+
+    token = token.upper()
+    if token in token_map:
+        return token_map[token]
+    else:
+        raise ValueError("Unsupported token")
+
+
+def download_coingecko_data(token, training_days, download_path, CG_API_KEY):
+    if training_days <= 7:
+        days = 7
+    elif training_days <= 14:
+        days = 14
+    elif training_days <= 30:
+        days = 30
+    elif training_days <= 90:
+        days = 90
+    elif training_days <= 180:
+        days = 180
+    elif training_days <= 365:
+        days = 365
+    else:
+        days = "max"
+    print(f"Days: {days}")
+
+    coin_id = get_coingecko_coin_id(token)
+    print(f"Coin ID: {coin_id}")
+
+    # Get OHLC data from Coingecko
+    url = f'https://api.coingecko.com/api/v3/coins/{coin_id}/ohlc?vs_currency=usd&days={days}&api_key={CG_API_KEY}'
+
+    global files
+    files = []
+
+    with ThreadPoolExecutor() as executor:
+        print(f"Downloading data for {coin_id}")
+        name = os.path.basename(url).split("?")[0].replace("/", "_") + ".json"
+        executor.submit(download_url, url, download_path, name)
+
+    return files
+
+
+def download_coingecko_current_day_data(token, CG_API_KEY):
+    coin_id = get_coingecko_coin_id(token)
+    print(f"Coin ID: {coin_id}")
+
+    url = f'https://api.coingecko.com/api/v3/coins/{coin_id}/ohlc?vs_currency=usd&days=1&api_key={CG_API_KEY}'
+
+    # Make a request using the session object
+    response = session.get(url)
+    response.raise_for_status()
+
+    resp = str(response.content, 'utf-8').rstrip()
+    columns = ['timestamp','open','high','low','close']
+    df = pd.DataFrame(json.loads(resp), columns=columns)
+    df['date'] = [pd.to_datetime(x, unit='ms') for x in df['timestamp']]
+    df['date'] = df['date'].apply(pd.to_datetime)
+    df[["open", "high", "low", "close"]] = df[["open", "high", "low", "close"]].apply(pd.to_numeric)
+
+    return df.sort_index()
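Taken together, the new modules form the pipeline that `app.py`'s `update_data()` drives. An end-to-end sketch, assuming a populated `.env` and the repository's modules on the import path:

```python
# End-to-end sketch of the new data/training/inference flow (mirrors app.py).
from config import TOKEN, TIMEFRAME, TRAINING_DAYS, REGION, DATA_PROVIDER
from model import download_data, format_data, train_model, get_inference

files = download_data(TOKEN, TRAINING_DAYS, REGION, DATA_PROVIDER)  # fetch raw OHLC files
format_data(files, DATA_PROVIDER)                                   # consolidate into price_data.csv
train_model(TIMEFRAME)                                              # fit the selected MODEL and pickle it
print(get_inference(TOKEN, TIMEFRAME, REGION, DATA_PROVIDER))       # predict the next close
```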