Python Forecasting Bitcoin (BTC) with Machine Learning

SUMMARY

The forum discussion centers on implementing a machine learning project for forecasting Bitcoin (BTC) prices using a cascade of Support Vector Regression (SVR) and Long Short Term Memory Neural Networks (LSTM). The user expresses confusion regarding the training data split, where 80% is used for training and only 20% for prediction, questioning the efficacy of this approach. Despite following the methodology outlined in a referenced paper, the user reports poor prediction results, indicated by a high Mean Absolute Error (MAE). The discussion highlights the need for further examination of the model setup, including potential overfitting and the absence of a Monte Carlo method for hyperparameter estimation.

PREREQUISITES
  • Understanding of Support Vector Regression (SVR) and its implementation
  • Familiarity with Long Short Term Memory Neural Networks (LSTM) and their training processes
  • Knowledge of performance metrics such as Mean Absolute Error (MAE)
  • Basic concepts of machine learning model validation and overfitting
NEXT STEPS
  • Research techniques for improving SVR and LSTM model accuracy
  • Learn about hyperparameter tuning methods, including Monte Carlo simulations
  • Explore advanced error metrics for evaluating model performance in financial forecasting
  • Investigate data preprocessing techniques to enhance model training and prediction
USEFUL FOR

Data scientists, machine learning practitioners, and students focusing on financial forecasting using machine learning techniques, particularly those interested in cryptocurrency price prediction.

BRN
Hi everyone,
I apologize to the mod if I posted in the wrong section.

For my Machine Learning exam, I would like to implement part of the work presented in this paper. In this work, the authors used two ML methods in cascade to forecast Bitcoin. Starting from the initial data, they predicted the five main BTC indicators via SVR (Support Vector Regression), and these predictions were then used as inputs to predict the price of Bitcoin via LSTMNN (Long Short Term Memory Neural Networks).

There are two things that I don't understand:
  • Starting with a dataset of about 2600 rows, 80% of them are used for training and only 20% for the predictions coming out of SVR. These predictions are then split again when they go into the LSTMNN, so the prediction at this stage is made on a much smaller sample than the starting one. Isn't that a problem?
  • If the inputs to the LSTMNN are only the indicators predicted by SVR, how is it possible to forecast the BTC price? At this stage there is only the X matrix of indicators, but there is no price vector...
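
As a rough illustration of the two questions above, the sketch below counts how many rows survive each split and shows one way the price column could still serve as the LSTM target even though the inputs are SVR-predicted indicators. The 80/20 ratio for the second split, the `lookback` window and the function names are assumptions made for illustration; the thread does not say how the paper handles this stage.

```python
def chronological_split(n_rows, train_pct=80):
    """Split row indices in time order (no shuffling) into train and test ranges."""
    cut = n_rows * train_pct // 100
    return range(0, cut), range(cut, n_rows)


def make_windows(indicators, prices, lookback):
    """Pair each window of `lookback` indicator rows with the price on the next row.

    `indicators` would be the SVR-predicted indicator rows; `prices` is the real
    price column for the same dates, used only as the training target.
    """
    X, y = [], []
    for end in range(lookback, len(indicators)):
        X.append(indicators[end - lookback:end])
        y.append(prices[end])
    return X, y


# Stage 1: SVR trains on 80% of ~2600 rows and predicts indicators for the rest.
svr_train, svr_test = chronological_split(2600)

# Stage 2 (assumed 80/20 again): the LSTM only sees the rows SVR predicted.
lstm_train, lstm_test = chronological_split(len(svr_test))

print(len(svr_train), len(svr_test), len(lstm_train), len(lstm_test))

# Toy data: 6 days, 2 indicators per day, and the matching real prices.
toy_indicators = [[i, 2 * i] for i in range(6)]
toy_prices = [100 + i for i in range(6)]
X, y = make_windows(toy_indicators, toy_prices, lookback=3)
print(len(X), y)
```

Under these assumptions the final evaluation really does rest on a small sample (about a hundred rows), which is worth keeping in mind when reading the error figures.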

Can anyone clarify my ideas?

Thanks so much!
 
Create an artificial market with 30 to 40 simulated miners and sellers, and make that market resemble how BTC has been behaving. Train the algorithm on that, then introduce it to the real BTC market and continue training it once the first round of training on the simulated market has finished.
 
Superintendent said:
Create an artificial market with 30 to 40 simulated miners and sellers, and make that market resemble how BTC has been behaving. Train the algorithm on that, then introduce it to the real BTC market and continue training it once the first round of training on the simulated market has finished.
Thanks for the reply.

OK, I tried to implement it. With the SVR model, I predict the technical indicators, which, once fed into the LSTM model, give the BTC price. Everything works, but the results are really bad...

The code is too long to post here, but here are my Jupyter notebook and the dataset file.

If someone could give me some advice on how to improve it / correct it, I would be really grateful.

Thanks so much!
 
Sorry, I am not that good at machine learning. I just thought of that during school and thought I should share it with you.
 
Beware: 'Real Life' may totally trump logical predictions.
IIRC, Bitcoin value tumbled this morning as eg Russian miners 'cashed in', perhaps fearing financial sanctions following trouble along Ukraine's borders...

May be a 'Pump & Dump' ploy, may simply be 'Duck & Cover'...

Like the many TV ads for 'Gold Bullion', too many bit-coin mining proponents fail to mention that value of investment may go down as well as up. Worse, if playing with 'futures', your investment may be wiped out, and leave you with significant debts...

Due Care, Please ??
 
I don't have to invest money in BTC. As I already said in my first post, I just have to present a project for an exam.

The predicted price values are compared with the real values present in the test dataset. By calculating the MAE (Mean Absolute Error), I get a very high value. I would like to understand whether the problem lies in how I set up the model or in something else.

I ask anyone who is familiar with Machine Learning and Big Data to examine my Jupyter Notebook and tell me what they think.
 
Ah, you assume a 'Mature Bourse'.
Go for it...
 
BRN said:
The predicted price values are compared with the real values present in the test dataset. By calculating the MAE (Mean Absolute Error), I get a very high value.
Ahh! I see you are rapidly learning the vagaries of Bitcoin.

It is a bit like the Stock Market combined with a late-nite Infomercial*, but without the U.S. Securities and Exchange Commission to keep an eye on it.

(At least you are learning Economics without going broke. :wink:)

I rather liked @Nik_2213 's reply above:
Nik_2213 said:
Beware: 'Real Life' may totally trump logical predictions.

* in·fo·mer·cial
/ˈinfōˌmərSH(ə)l/

noun
a television program that promotes a product in an informative and supposedly objective way.
 
BRN said:
I don't have to invest money in BTC. As I already said in my first post, I just have to present a project for an exam.

The predicted price values are compared with the real values present in the test dataset. By calculating the MAE (Mean Absolute Error), I get a very high value. I would like to understand whether the problem lies in how I set up the model or in something else.

I ask anyone who is familiar with Machine Learning and Big Data to examine my Jupyter Notebook and tell me what they think.
Humans are irrational. Especially in crypto these days. I would be highly impressed by an AI that could accurately predict crypto price trends.
 
  • #10
valenumr said:
Humans are irrational. Especially in crypto these days. I would be highly impressed by an AI that could accurately predict crypto price trends.
Yeah, AI that could accurately predict price trends in pretty much anything would make its creator exceptionally wealthy.
 
  • #11
Maybe I explained myself badly.

I don't need an AI that can predict the BTC price accurately. I don't have to invest in crypto. This is just a project to be presented for my ML exam. I chose this topic because I don't want to present the usual project that most students on the course present (image recognition, subatomic particle analysis, etc.). I want to present something different.

I started from this article, in which the authors say that the two models in cascade, SVR+LSTM, give a better result than LSTM alone. Well, the way I implemented it, this doesn't happen. That is why I am asking someone to look at my code and give me an opinion on it.

  • Is my code correct?
  • Is it correct to put SVR and LSTM in cascade in this way?
  • Is it right to train the LSTM with all the data available?
  • Did I overfit?
  • Is it relevant that I did not implement a Monte Carlo method to estimate the LSTM hyperparameters?
  • and so on...

All of these are technical questions, and it doesn't matter whether they relate to a model that handles crypto data, subatomic particle data, potatoes or anything else.

If you don't believe that I have to take an exam, there is nothing I can do. Where I live, university exams are taken to obtain a degree.

I didn't think I would have to fight against prejudice as well...
 
  • #12
Just a quick starting question - is your code attempting to compute the exact same technical predictors, fit the exact same model, and using the exact same training and test data?
 
  • #13
BRN said:
The predicted price values are compared with the real values present in the test dataset. By calculating the MAE (Mean Absolute Error), I get a very high value
I think you should only look at relative errors for bitcoin, share, or commodity prices. A 1% error should be weighted the same when bitcoin is at $1 as when it is at $10,000.
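
A minimal sketch of the point about relative errors: the same 1% miss gives very different MAE values at different price levels, while a mean absolute percentage error stays the same. The function names and toy numbers are illustrative only and are not taken from the notebook.

```python
def mae(actual, predicted):
    """Mean Absolute Error, in the same units as the price."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error; assumes no actual value is zero."""
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100 * total / len(actual)


# Predictions that are 1% too high at two very different price levels.
low_actual, low_pred = [1.0, 2.0], [1.01, 2.02]
high_actual, high_pred = [10000.0, 20000.0], [10100.0, 20200.0]

print("MAE :", mae(low_actual, low_pred), mae(high_actual, high_pred))
print("MAPE:", mape(low_actual, low_pred), mape(high_actual, high_pred))
```

The MAE differs by a factor of ten thousand between the two cases, whereas the percentage error is about 1% in both, so a "very high" MAE on recent BTC prices may say more about the price level than about the model.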
 
  • #14
Office_Shredder said:
Just a quick starting question - is your code attempting to compute the exact same technical predictors, fit the exact same model, and using the exact same training and test data?
The technical predictors are the same as those used by the authors of the article. I also tried adding others (MACDS, MACDH and ROI), but the result does not change.
My dataset is more up to date, but I also tried using the data from the same period indicated in the article, and I always find that LSTM alone works better than SVR + LSTM.
I have no idea how the authors implemented the SVR+LSTM model, because they were not very detailed in describing this stage. That is why I am asking for your opinions here. I just know that my code is OK. The only difference I know of is that I have not implemented a Monte Carlo method to estimate the LSTM hyperparameters.
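
One common reading of a Monte Carlo approach to hyperparameters is random search: sample configurations at random, score each on held-out data, and keep the best. The sketch below assumes that reading, which may differ from what the article's authors did. The search space, the names and the `toy_validation_error` stand-in are hypothetical; in practice `evaluate` would train the LSTM with the given settings and return its validation error.

```python
import random


def random_search(evaluate, space, n_trials=50, seed=0):
    """Sample `n_trials` random configurations and return the lowest-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        # Draw one value per hyperparameter, independently and uniformly.
        params = {name: rng.choice(choices) for name, choices in space.items()}
        score = evaluate(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space for an LSTM.
space = {
    "units": [16, 32, 64, 128],
    "lookback": [5, 10, 20, 30],
    "dropout": [0.0, 0.1, 0.2, 0.5],
}


def toy_validation_error(params):
    """Stand-in for 'train the LSTM and return validation error' (not a real model)."""
    return (abs(params["units"] - 64) / 64
            + abs(params["lookback"] - 10) / 10
            + params["dropout"])


best_params, best_score = random_search(toy_validation_error, space)
print(best_params, best_score)
```

Scoring on a validation slice that is separate from the final test set keeps the reported test error honest, since the hyperparameters are then not tuned on the data used for the final comparison.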

willem2 said:
I think you should only look at relative errors for bitcoin, share, or commodity prices. A 1% error should be weighted the same when bitcoin is at $1 as when it is at $10,000.

I use the Mean Absolute Error because it is used in the article, but I can also use the relative one. I don't think that's a problem.
 
