Cryptocurrency Price Prediction with LSTM using Machine Learning in Python
In this tutorial, we will learn how to forecast cryptocurrency prices with an LSTM network implemented in Python. Cryptocurrency price prediction has attracted a great deal of interest from people planning to invest in these currencies.
Let me walk you through the Long Short-Term Memory (LSTM) neural network. LSTMs are a special kind of recurrent neural network (RNN) that is well suited to time-series problems.
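To make the idea a little more concrete, here is a minimal NumPy sketch of a single LSTM time step. It is only an illustration of the standard LSTM equations, not the code Keras runs internally; the function name `lstm_step`, the gate ordering inside the stacked weights, and the random weights are all assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step (illustrative helper, not a Keras function).

    W: (4 * units, n_features), U: (4 * units, units), b: (4 * units,)
    Gate order in the stacked weights is a choice made for this sketch:
    input, forget, candidate, output.
    """
    units = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0 * units:1 * units])   # input gate: how much new information to store
    f = sigmoid(z[1 * units:2 * units])   # forget gate: how much old memory to keep
    g = np.tanh(z[2 * units:3 * units])   # candidate values for the cell state
    o = sigmoid(z[3 * units:4 * units])   # output gate: how much memory to expose
    c_t = f * c_prev + i * g              # updated cell state (long-term memory)
    h_t = o * np.tanh(c_t)                # updated hidden state (output)
    return h_t, c_t

# Run the cell over a made-up window of 7 time steps with 3 features each
rng = np.random.default_rng(0)
n_features, units = 3, 4
W = rng.normal(size=(4 * units, n_features))
U = rng.normal(size=(4 * units, units))
b = np.zeros(4 * units)
h, c = np.zeros(units), np.zeros(units)
window = rng.normal(size=(7, n_features))
for x_t in window:
    h, c = lstm_step(x_t, h, c, W, U, b)
print(h.shape)  # (4,)
```

The cell state `c` is what lets the network carry information across many time steps, which is why LSTMs tend to handle sequences better than a plain RNN.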
The data comes from the CryptoCompare API and is downloaded in step 1 below. The dataset contains the following features:
- Close Price – Closing price for the currency.
- Volume – The amount of the currency traded.
- Open Price – Market open price for the currency.
- High Price – The highest price recorded for the currency.
- Low Price – The lowest price recorded for the currency.
Forecasting Cryptocurrency prices with LSTM in Python
1. Getting the Data
First of all, make sure that the following libraries are installed on your system before you start coding.
import json
import requests
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from keras.models import Sequential
from sklearn.metrics import mean_absolute_error
from keras.layers import Activation, Dense, Dropout, LSTM
# Jupyter/IPython magic; remove this line when running as a plain script
%matplotlib inline
The data can now be loaded. The request below fetches daily BTC prices in CAD from the CryptoCompare API.
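endpt = 'https://min-api.cryptocompare.com/data/histoday'
res = requests.get(endpt + '?fsym=BTC&tsym=CAD&limit=500')
# Build a DataFrame from the 'Data' field of the JSON response
hist = pd.DataFrame(json.loads(res.content)['Data'])
# Use the Unix timestamp (in seconds) as a datetime index
hist = hist.set_index('time')
hist.index = pd.to_datetime(hist.index, unit='s')
# Column we want to predict
trgt_col = 'close'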
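Since a network request can fail or return an empty result, it can help to separate the parsing step from the request. The sketch below does that with a hypothetical helper, `payload_to_frame`, and a hand-made payload. Only the `Data`, `time` and `close` fields are taken from the code above; the other field names and all values are made up for illustration.

```python
import pandas as pd

def payload_to_frame(payload):
    """Turn an already-decoded JSON payload into a time-indexed DataFrame.

    Hypothetical helper for this tutorial; it assumes the payload has a
    'Data' list whose records contain a Unix 'time' field in seconds.
    """
    records = payload.get('Data')
    if not records:
        raise ValueError("payload has no 'Data' records")
    df = pd.DataFrame(records).set_index('time')
    df.index = pd.to_datetime(df.index, unit='s')
    return df.sort_index()

# Hand-made sample payload (values are invented, not real prices)
sample_payload = {
    'Data': [
        {'time': 1609545600, 'open': 101.0, 'high': 105.0, 'low': 99.0, 'close': 104.0},
        {'time': 1609459200, 'open': 100.0, 'high': 102.0, 'low': 98.0, 'close': 101.0},
    ]
}

sample_hist = payload_to_frame(sample_payload)
print(sample_hist['close'])
```

With the real response you could pass `json.loads(res.content)` to the helper instead of the sample payload.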
2. Train-Test Split
Next, I split the data into two sets: a training set with 75% of the data and a test set with the remaining 25%. Because this is a time series, the split is chronological: the earliest rows form the training set and the most recent rows form the test set.
The split ratio matters. If the training share is too high, too little data is left to evaluate the model properly, so you cannot tell which changes actually help. If it is too low, the model does not have enough data to learn from and accuracy will be poor.
def train_test_split(df, test_size=0.25):
    # Keep the last test_size fraction of rows for testing (no shuffling)
    split_row = len(df) - int(test_size * len(df))
    train_data = df.iloc[:split_row]
    test_data = df.iloc[split_row:]
    return train_data, test_data

train, test = train_test_split(hist, test_size=0.25)
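To see what the split does, here is a small example on a made-up DataFrame of 10 daily rows. The function is repeated so the snippet runs on its own; the toy data and the `toy_*` names are assumptions for illustration only.

```python
import pandas as pd

def train_test_split(df, test_size=0.25):
    # Same logic as in the tutorial: the last rows become the test set
    split_row = len(df) - int(test_size * len(df))
    return df.iloc[:split_row], df.iloc[split_row:]

# Toy data: 10 consecutive days with made-up closing prices
toy = pd.DataFrame(
    {'close': range(10)},
    index=pd.date_range('2021-01-01', periods=10, freq='D'))

toy_train, toy_test = train_test_split(toy, test_size=0.25)
print(len(toy_train), len(toy_test))                  # 8 2
print(toy_train.index.max() < toy_test.index.min())   # True
```

Because `int(0.25 * 10)` rounds down to 2, the test set gets 2 rows and the training set 8. Every test date comes after every training date, so the model is never evaluated on days that precede its training data.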
Let’s plot prices.
def line_plot(line1, line2, lbl1=None, lbl2=None, title='', lw=2):
    fig, ax = plt.subplots(1, figsize=(13, 7))
    ax.plot(line1, label=lbl1, linewidth=lw)
    ax.plot(line2, label=lbl2, linewidth=lw)
    ax.set_ylabel('price [CAD]', fontsize=14)
    ax.set_title(title, fontsize=16)
    ax.legend(loc='best', fontsize=16)

line_plot(train[trgt_col], test[trgt_col], 'training', 'test', title='')
3. Building the Model
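The data is split into windows of consecutive days (7 by default, an arbitrary choice of one week), and each window is normalised to a zero base, i.e. every value is expressed as the change relative to the first value in the window.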
def normalise_zero_base(df):
    # Express every value as the relative change from the first row
    return df / df.iloc[0] - 1

def normalise_min_max(df):
    # Scale every column to the range [0, 1]
    return (df - df.min()) / (df.max() - df.min())

def extract_wndw_data(df, wndw_len=7, zero_base=True):
    # Slide a window of wndw_len rows over the data, one row at a time
    wndw_data = []
    for idx in range(len(df) - wndw_len):
        temp = df[idx: (idx + wndw_len)].copy()
        if zero_base:
            temp = normalise_zero_base(temp)
        wndw_data.append(temp.values)
    return np.array(wndw_data)
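A quick worked example may help here. The snippet below applies both normalisations to a single made-up window of four prices, using the same formulas as the functions above.

```python
import pandas as pd

# One made-up window of four closing prices
window = pd.Series([100.0, 110.0, 90.0, 120.0])

# Zero-base: relative change from the first price in the window
zero_based = window / window.iloc[0] - 1
print(zero_based.round(3).tolist())   # [0.0, 0.1, -0.1, 0.2]

# Min-max: rescale so the lowest price is 0 and the highest is 1
min_max = (window - window.min()) / (window.max() - window.min())
print(min_max.round(3).tolist())      # [0.333, 0.667, 0.0, 1.0]

# Zero-base normalisation is easy to undo if you keep the first price
restored = window.iloc[0] * (zero_based + 1)
```

Zero-base normalisation makes windows from different price levels comparable, since each one starts at 0 and only the relative moves remain.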
Next, prepare the training and test sets: split the data, extract the windows as inputs, and use the closing price that follows each window as the target.
def prepare_data(df, trgt_col, wndw_len=10, zero_base=True, test_size=0.25):
    train_data, test_data = train_test_split(df, test_size=test_size)
    # Inputs: sliding windows over all columns
    X_train = extract_wndw_data(train_data, wndw_len, zero_base)
    X_test = extract_wndw_data(test_data, wndw_len, zero_base)
    # Targets: the closing price on the day after each window
    Y_train = train_data[trgt_col][wndw_len:].values
    Y_test = test_data[trgt_col][wndw_len:].values
    if zero_base:
        # Express each target relative to the first price of its window
        Y_train = Y_train / train_data[trgt_col][:-wndw_len].values - 1
        Y_test = Y_test / test_data[trgt_col][:-wndw_len].values - 1
    return train_data, test_data, X_train, X_test, Y_train, Y_test
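The way windows and targets line up is easy to get wrong, so here is a small NumPy sketch of the same alignment on six made-up prices with a window length of 3. It uses a single feature for simplicity, whereas the tutorial's windows contain every column of the DataFrame.

```python
import numpy as np

prices = np.array([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])  # made-up prices
wndw_len = 3

# One window per start index, exactly as in extract_wndw_data
windows = np.array([prices[i:i + wndw_len]
                    for i in range(len(prices) - wndw_len)])
# Zero-base normalisation: divide each window by its own first value
X = windows / windows[:, [0]] - 1

# Target for each window: the price on the day right after it
targets = prices[wndw_len:]    # 13, 14, 15
# First price of each window, used as the base for the target
bases = prices[:-wndw_len]     # 10, 11, 12
y = targets / bases - 1

# An LSTM layer expects input of shape (samples, timesteps, features)
X = X[..., np.newaxis]
print(X.shape, y.shape)        # (3, 3, 1) (3,)
```

Each target is normalised with the same base as its window, so inputs and target are on the same scale and a prediction can later be converted back to a price by multiplying by that base.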
The model uses a single LSTM layer, followed by dropout and a dense output layer.
def build_lstm_model(input_data, output_size, neurons=100, activ_func='linear',
                     dropout=0.25, loss='mse', optimizer='adam'):
    model = Sequential()
    # input_shape is (timesteps, features), taken from the prepared windows
    model.add(LSTM(neurons, input_shape=(input_data.shape[1], input_data.shape[2])))
    model.add(Dropout(dropout))
    model.add(Dense(units=output_size))
    model.add(Activation(activ_func))
    model.compile(loss=loss, optimizer=optimizer)
    return model
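To get a feel for the size of this model, the sketch below counts its trainable parameters by hand, using the usual formulas for an LSTM layer and a dense layer with biases. The helper names and the feature count of 6 are assumptions for illustration; the real number of features depends on the columns the API returns.

```python
def lstm_param_count(units, n_features):
    # Four gates, each with input weights, recurrent weights and a bias
    return 4 * (units * (n_features + units) + units)

def dense_param_count(n_inputs, n_outputs):
    # One weight per input-output pair, plus one bias per output
    return n_inputs * n_outputs + n_outputs

n_features = 6   # hypothetical number of columns in the DataFrame
units = 100      # the 'neurons' argument used in the tutorial

lstm_params = lstm_param_count(units, n_features)
dense_params = dense_param_count(units, 1)
print(lstm_params, dense_params, lstm_params + dense_params)  # 42800 101 42901
```

The Dropout and Activation layers add no parameters. After building the real model, `model.summary()` prints the actual counts for comparison.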
Training the model. Note that this call uses a window of 5 days and holds out 20% of the data for testing.
np.random.seed(42)
train, test, X_train, X_test, Y_train, Y_test = prepare_data(
    hist, trgt_col, wndw_len=5, zero_base=True, test_size=0.2)
model = build_lstm_model(
    X_train, output_size=1, neurons=100, dropout=0.25, loss='mse', optimizer='adam')
history = model.fit(
    X_train, Y_train, epochs=50, batch_size=16, verbose=1, shuffle=True)
Epoch 1/50 396/396 [==============================] - 0s 933us/step - loss: 0.0111
Epoch 2/50 396/396 [==============================] - 0s 267us/step - loss: 0.0061
Epoch 3/50 396/396 [==============================] - 0s 273us/step - loss: 0.0041
Epoch 4/50 396/396 [==============================] - 0s 266us/step - loss: 0.0060
Epoch 5/50 396/396 [==============================] - 0s 304us/step - loss: 0.0033
Epoch 6/50 396/396 [==============================] - 0s 311us/step - loss: 0.0054
Epoch 7/50 396/396 [==============================] - 0s 272us/step - loss: 0.0031
Epoch 8/50 396/396 [==============================] - 0s 277us/step - loss: 0.0031
Epoch 9/50 396/396 [==============================] - 0s 277us/step - loss: 0.0027
Epoch 10/50 396/396 [==============================] - 0s 275us/step - loss: 0.0035
Epoch 11/50 396/396 [==============================] - 0s 267us/step - loss: 0.0026
Epoch 12/50 396/396 [==============================] - 0s 272us/step - loss: 0.0027
Epoch 13/50 396/396 [==============================] - 0s 272us/step - loss: 0.0027
Epoch 14/50 396/396 [==============================] - 0s 274us/step - loss: 0.0023
Epoch 15/50 396/396 [==============================] - 0s 302us/step - loss: 0.0026
Epoch 16/50 396/396 [==============================] - 0s 289us/step - loss: 0.0024
Epoch 17/50 396/396 [==============================] - 0s 272us/step - loss: 0.0021
Epoch 18/50 396/396 [==============================] - 0s 266us/step - loss: 0.0028
Epoch 19/50 396/396 [==============================] - 0s 267us/step - loss: 0.0021
Epoch 20/50 396/396 [==============================] - 0s 280us/step - loss: 0.0020
Epoch 21/50 396/396 [==============================] - 0s 304us/step - loss: 0.0029
Epoch 22/50 396/396 [==============================] - 0s 304us/step - loss: 0.0020
Epoch 23/50 396/396 [==============================] - 0s 280us/step - loss: 0.0020
Epoch 24/50 396/396 [==============================] - 0s 313us/step - loss: 0.0019
Epoch 25/50 396/396 [==============================] - 0s 289us/step - loss: 0.0023
Epoch 26/50 396/396 [==============================] - 0s 283us/step - loss: 0.0022
Epoch 27/50 396/396 [==============================] - 0s 272us/step - loss: 0.0018
Epoch 28/50 396/396 [==============================] - 0s 276us/step - loss: 0.0017
Epoch 29/50 396/396 [==============================] - 0s 276us/step - loss: 0.0018
Epoch 30/50 396/396 [==============================] - 0s 268us/step - loss: 0.0016
Epoch 31/50 396/396 [==============================] - 0s 270us/step - loss: 0.0017
Epoch 32/50 396/396 [==============================] - 0s 279us/step - loss: 0.0020
Epoch 33/50 396/396 [==============================] - 0s 311us/step - loss: 0.0018
Epoch 34/50 396/396 [==============================] - 0s 279us/step - loss: 0.0018
Epoch 35/50 396/396 [==============================] - 0s 265us/step - loss: 0.0017
Epoch 36/50 396/396 [==============================] - 0s 266us/step - loss: 0.0022
Epoch 37/50 396/396 [==============================] - 0s 271us/step - loss: 0.0017
Epoch 38/50 396/396 [==============================] - 0s 293us/step - loss: 0.0032
Epoch 39/50 396/396 [==============================] - 0s 271us/step - loss: 0.0023
Epoch 40/50 396/396 [==============================] - 0s 267us/step - loss: 0.0021
Epoch 41/50 396/396 [==============================] - 0s 267us/step - loss: 0.0016
Epoch 42/50 396/396 [==============================] - 0s 298us/step - loss: 0.0016
Epoch 43/50 396/396 [==============================] - 0s 272us/step - loss: 0.0018
Epoch 44/50 396/396 [==============================] - 0s 290us/step - loss: 0.0019
Epoch 45/50 396/396 [==============================] - 0s 271us/step - loss: 0.0017
Epoch 46/50 396/396 [==============================] - 0s 270us/step - loss: 0.0018
Epoch 47/50 396/396 [==============================] - 0s 273us/step - loss: 0.0017
Epoch 48/50 396/396 [==============================] - 0s 275us/step - loss: 0.0015
Epoch 49/50 396/396 [==============================] - 0s 274us/step - loss: 0.0015
Epoch 50/50 396/396 [==============================] - 0s 275us/step - loss: 0.0017
4. Results
Let us now look at the predicted values and the actual values and see how accurate our model is.
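wndw_len = 5  # must match the window length passed to prepare_data
targets = test[trgt_col][wndw_len:]
# The model predicts relative changes; squeeze to a 1-D array
preds = model.predict(X_test).squeeze()
# Convert back to prices using the first price of each window
preds = test[trgt_col].values[:-wndw_len] * (preds + 1)
preds = pd.Series(index=targets.index, data=preds)
line_plot(targets, preds, 'actual', 'prediction', lw=3)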
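A plot shows the overall fit, but a single number is handy for comparing runs. The sketch below converts normalised predictions back to prices and computes the mean absolute error (MAE) with NumPy. All values are made up, and `mean_abs_error` is a hypothetical helper; the `mean_absolute_error` function imported from scikit-learn at the top of the tutorial computes the same quantity.

```python
import numpy as np

def mean_abs_error(y_true, y_pred):
    # Average absolute difference between actual and predicted values
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(np.abs(y_true - y_pred)))

# Made-up example: first price of each window, model output, actual price
base_prices = np.array([100.0, 200.0])
norm_preds = np.array([0.10, -0.05])    # predicted relative changes
actual = np.array([112.0, 188.0])

# Undo the zero-base normalisation: price = base * (relative change + 1)
pred_prices = base_prices * (norm_preds + 1)   # roughly 110 and 190
mae = mean_abs_error(actual, pred_prices)
print(round(mae, 2))   # 2.0
```

With the tutorial's variables, the equivalent call would be `mean_absolute_error(targets, preds)`, which gives the average error in CAD.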
Let’s zoom in on the last 30 days.
n = 30
line_plot(targets[-n:], preds[-n:], 'actual', 'prediction')
Congratulations! You have built an LSTM model in Python and used it to predict cryptocurrency prices.
Hope you had fun learning with me. Have a good day and happy learning!