Bitcoin price forecasting

Andrew Kalil Ortiz
5 min read · May 14, 2021


You have probably heard by now about Bitcoin and how it has taken off over the past decade. But what really is Bitcoin? Generally speaking, Bitcoin is a cryptocurrency introduced back in 2008 by an unknown person or group of people using the name Satoshi Nakamoto.

Now, what does this have to do with software engineering and machine learning? Quite a lot, if we are being honest. We can use software engineering to trace the history of this cryptocurrency, and machine learning to predict its value given the history of prices throughout the years.

This article focuses on one way to predict the value of Bitcoin given a dataset with the history of prices for the past decade. This is also known as time series forecasting. I created a Python program that trains an XGBoost model to predict the price given the values from 2014 to 2019. Therefore, in this article, we will be looking at:

  • Preprocessing the data
  • Visualizing this data in a graph
  • Model architecture used (XGBoost)
  • Results
  • Conclusion

Preprocessing the data

The data does not arrive in a form that is useful to us until we make some changes to it. First, I will demonstrate what the data looks like before it is preprocessed.

Looking at this, we can see that much of the data has a value of “NaN” and has not been filled with zeros or any other default value. There are also many columns that are unnecessary for the purpose of this experiment.

Therefore, we go ahead and clean the data, choosing the features that will help us obtain optimal results. For this we use only the timestamp, converted from seconds to dates at 1-hour intervals, and the weighted price of Bitcoin with the missing values filled in. It should look like this…

This allows us to use the timestamp as an index for our data. If you want to know how this was implemented, it was done with the following Python code:
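The article's original snippet is not reproduced here, but a minimal sketch of this preprocessing step might look like the following. It assumes a raw CSV with a Unix-seconds `Timestamp` column and a `Weighted_Price` column (common in public Bitcoin minute-level datasets); the column names and the forward-fill strategy are assumptions, not the author's exact code.

```python
import pandas as pd

def preprocess(filename):
    """Load the raw CSV, keep only the timestamp and weighted price,
    fill the NaN gaps, and resample to 1-hour intervals."""
    df = pd.read_csv(filename)
    # Convert the Unix timestamp (seconds) to datetimes and use it as the index
    df["Timestamp"] = pd.to_datetime(df["Timestamp"], unit="s")
    df = df.set_index("Timestamp")
    # Keep only the weighted price; forward-fill the missing values
    price = df["Weighted_Price"].ffill()
    # Downsample from minute-level rows to 1-hour intervals
    return price.resample("1H").mean()
```

The result is a single datetime-indexed price series, which is exactly the shape a forecasting model wants as input.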

Visualizing this data in a graph

Once we have processed our data, we can go ahead and create a graph to help us understand and visualize it better. This also helps to give an idea of how well the model should do. If the model's predictions track the data it trained on too closely, it may be overfitting. This simply means that the model does “too good a job” of memorizing what it has seen, and its results on new data may be faulty. However, if the predicted curve is not identical but reasonably close to the test data, we are on the right track.

That may have been a bit confusing, so I will demonstrate what I mean. This is how the historic trail of the data looks without any extra modifications:

We then split the data into training data and test data. The training data is fed into the model so it can learn, while the test data is held back and used once the model is trained to check its behavior on prices it has never seen. The split is made at June 25, 2018, which marked a very important date in Bitcoin history. It looks something like this:
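A date-based split like the one described above can be sketched in a few lines; the June 25, 2018 cutoff comes from the article, while the function and variable names are assumptions for illustration.

```python
import pandas as pd

def train_test_split_by_date(series, cutoff="2018-06-25"):
    """Split a datetime-indexed price series at a fixed date:
    everything before the cutoff trains the model, the rest tests it."""
    train = series[series.index < cutoff]
    test = series[series.index >= cutoff]
    return train, test
```

Unlike a random split, a chronological cutoff keeps the test set strictly in the future relative to the training set, which is essential for honest time series evaluation.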

Model architecture used

The model architecture used for this model is XGBoost, a popular and efficient open-source implementation of the gradient boosted trees algorithm, according to the Amazon SageMaker documentation. One of the main advantages of XGBoost is that it has quite a lot of hyperparameters that we can play around with to obtain the best results.

For this model, I trained with different variations of parameters. This is what is known as hyperparameter tuning. The best results were obtained when using the following:


The results obtained were as follows:

As you can see, the model made its best effort to predict the Bitcoin price for the range of dates from June 25, 2018 to January 1, 2019. Looking at the graph alone, we can tell the model did an okay job. From the trained model we can then extract the mean squared error and the mean absolute error, which are (4682903.985335034, 1740.4792080167) respectively.
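Those two error figures can be computed directly from the predictions; a minimal sketch using scikit-learn's metrics (the function and variable names are assumptions) might look like:

```python
from sklearn.metrics import mean_squared_error, mean_absolute_error

def evaluate(y_true, y_pred):
    """Return (MSE, MAE) for the model's predictions on the test range."""
    return (mean_squared_error(y_true, y_pred),
            mean_absolute_error(y_true, y_pred))
```

MSE squares the per-point errors, so a handful of large misses (such as during a sharp price swing) dominates it, while MAE stays in the original price units and is easier to read as "off by about $1,740 on average."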


XGBoost did a fairly good job of predicting the Bitcoin price for the second half of 2018. Perhaps some additional fine-tuning of the parameters would have helped to obtain better results, but for the purpose of this blog, the results were satisfying enough. If you would like to take a look at the entire code, you can check it out on my GitHub.