Accurate Professional Football Predictions



Structure for this Guide

Then we check whether the game ended in a home win, by testing if there were more home goals than away goals. If so, we add 1 to our counter for home underdog wins, which we have called upsets. We also update our bankroll as if we had bet on that game, multiplying our bet size by the odds we would have gotten. Finally, we add an else branch for the times the home team does NOT win.
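The loop described above might look roughly like the sketch below. The sample rows, the field names, the flat stake and the definition of a home underdog (home odds longer than away odds) are all assumptions made for illustration, not the original script. It also assumes decimal odds, so the profit on a winning bet is the stake times the odds minus the stake.

```python
# Hypothetical sample rows; in the guide these come from a season CSV.
games = [
    {"home_goals": 2, "away_goals": 1, "home_odds": 3.5, "away_odds": 2.0},
    {"home_goals": 0, "away_goals": 1, "home_odds": 4.0, "away_odds": 1.8},
    {"home_goals": 1, "away_goals": 1, "home_odds": 1.5, "away_odds": 6.0},
]


def backtest_home_underdogs(games, bankroll=100.0, bet_size=10.0):
    """Bet a flat stake on every home underdog and track the results."""
    underdog_games = 0
    upsets = 0
    for game in games:
        # Assumption: a home underdog is a home team priced longer than the away team.
        if game["home_odds"] <= game["away_odds"]:
            continue
        underdog_games += 1
        if game["home_goals"] > game["away_goals"]:
            upsets += 1
            # Decimal odds include the stake, so the profit is stake * (odds - 1).
            bankroll += bet_size * (game["home_odds"] - 1)
        else:
            # Draw or away win: the stake is lost.
            bankroll -= bet_size
    return underdog_games, upsets, bankroll


underdog_games, upsets, bankroll = backtest_home_underdogs(games)
```

With the made-up rows above, two games involve a home underdog and one of them is an upset.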

Now we are done with the loop and can run the program. But we also want to see what the program is calculating for us, so we should write out our variables in the interpreter. You can do this with the print command, which writes out whatever we want.

Now we also want to add some of our variables to the printouts, and that requires a bit of extra work. What you want to look at can vary, and how you want it presented is up to you, but here are some things that I wanted to peek at and made the program print for us. Among them were the number of games that were played, and how many of those included a home underdog. Next was how big our starting bankroll was, and finally what our ending bankroll was after the season ended.
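As a small illustration of mixing variables into printed text, here is one way to do it with f-strings. The variable names and the numbers are placeholders chosen for this sketch, not results from the guide.

```python
# Placeholder values standing in for what the loop would have produced.
games_played = 10
underdog_games = 4
starting_bankroll = 100.0
ending_bankroll = 115.0

# An f-string lets us drop variables straight into the text;
# ":.2f" rounds the bankroll to two decimals when printing.
summary = (
    f"Games played: {games_played}, of which {underdog_games} included a home underdog\n"
    f"Starting bankroll: {starting_bankroll:.2f}\n"
    f"Ending bankroll: {ending_bankroll:.2f}"
)
print(summary)
```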

Here is the output when I run the code: This after bets placed, where 30 of those won. This is likely a good result based purely on luck, but it could be the basis for further analysis. Maybe you want to check some of the other seasons as well to get a bigger sample size. Another idea is to check how it would have fared if you had bet on draws where the odds were 3. There are plenty of things you can explore once you know the basics of coding, so if you have any interest in betting on sports, getting into coding is highly recommended.

Now that we have looked at some basic coding and simple betting angles, we can move on to some bigger and potentially better things. Simple angles like the ones I have shown above are usually not good for profitable betting on future games. They are not predictive features, but rather patterns that happen to emerge.

That is what we are going to try our hand at now when we move over to cover the Poisson distribution. The way that we get started creating a model is to first identify what we are looking to predict.

For most sports this simply means determining which teams will score the most goals or points and concede the fewest. To do this we can look at the different factors that correlate with high goal scoring, like possession of the ball, shots on goal and other relevant ones. In this example, however, we will go with a much simpler approach: we will look at each team's previous scoring and conceding rates and compare them to the league averages.

A model that is often referenced when people are looking for ways to start predicting football matches is the excellent paper by Dixon and Coles, Modelling Association Football Scores and Inefficiencies in the Football Betting Market, mentioned earlier on this page. They propose that you can look at past results and scores between different teams within the same league system and use them to predict future scores and results.

We are now going to build a very basic version of this model to use for predicting future soccer results. Let us just get right into the bits and pieces of the code. You can scroll down and find the full code if you want to read it in one go.

From here I will simply explain what I have done and what the different pieces of the code do. First we import the different modules that we need in this script. For now these are csv, math, ast and numpy. Note that if you import numpy without an alias, you must write numpy instead of np in the code. Most of these will already be installed with your Python installation, but numpy you will probably need to fetch yourself.

There are plenty of guides out there to show you how to do it, so a simple Google search should be helpful. Here we are going to create our first function that we can reuse. Since we are going to use Poisson a couple of times throughout the script, we can write out the code once and then reuse it as often as we like with a single short line instead of the whole sequence. The variables in the parentheses are the arguments you must provide when you use it.
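A reusable Poisson function of this kind could be sketched as follows. The function name and argument names are my own choices for illustration; the formula itself is the standard Poisson probability mass function.

```python
import math


def poisson(actual, mean):
    """Probability of seeing exactly `actual` goals when `mean` goals are expected."""
    return (mean ** actual) * math.exp(-mean) / math.factorial(actual)


# Chance that a team expected to score 1.5 goals scores exactly 2 (roughly 0.251).
p_two_goals = poisson(2, 1.5)
```

Once defined, every later use is just a short call such as `poisson(0, 1.2)`.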

After the arguments comes the sequence you want the function to run. Next is a longer stretch of code, but a simpler one, where we are merely setting up the variables, data and other inputs we are going to use. First we open a text file. You do not need to create it beforehand: if it does not exist, Python will create it. Then we write the first line in the file, which is the start of a dictionary. This is needed to hold and update the data for each team's variables. Next we want to iterate over our data file and find all the team names that are going to be used.

This is done as in the previous example: we open the csv file, skip the first line and then create a for loop. For each game we check whether the teams are already in our list of team names. If they are not, they get added to the list. Then the script moves on to another for loop, which iterates over all the teams that have been found.

For each team, a line is written to the text file with the different variables that we are going to use. In this example we will record all the home goals and away goals they score, as well as how many goals they concede both at home and away.

We also track the number of home and away games. Last, we have some variables whose purpose might not be obvious at first glance. Remember what we have set out to do here. We then write the end of the file and close it.
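To make the per-team bookkeeping concrete, here is a simplified sketch. It departs from the approach described above in two ways: it keeps the dictionary in memory instead of writing it to a text file, and it finds the teams and updates their totals in a single pass. The sample data, the column names (`HomeTeam`, `AwayTeam`, `FTHG`, `FTAG`) and the dictionary keys are assumptions for illustration.

```python
import csv
import io

# Hypothetical sample in place of the season CSV; the column names are assumptions.
sample_csv = io.StringIO(
    "HomeTeam,AwayTeam,FTHG,FTAG\n"
    "Arsenal,Chelsea,2,1\n"
    "Chelsea,Everton,0,0\n"
    "Everton,Arsenal,1,3\n"
)


def new_team_record():
    """One empty record per team, mirroring the variables listed above."""
    return {
        "home_goals": 0, "away_goals": 0,
        "home_conceded": 0, "away_conceded": 0,
        "home_games": 0, "away_games": 0,
    }


team_stats = {}
# DictReader consumes the header line for us, so there is no line to skip.
for row in csv.DictReader(sample_csv):
    home, away = row["HomeTeam"], row["AwayTeam"]
    home_goals, away_goals = int(row["FTHG"]), int(row["FTAG"])

    # Add any team we have not seen before.
    for team in (home, away):
        if team not in team_stats:
            team_stats[team] = new_team_record()

    # Update the home side's totals.
    team_stats[home]["home_goals"] += home_goals
    team_stats[home]["home_conceded"] += away_goals
    team_stats[home]["home_games"] += 1
    # Update the away side's totals.
    team_stats[away]["away_goals"] += away_goals
    team_stats[away]["away_conceded"] += home_goals
    team_stats[away]["away_games"] += 1
```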

This saves what we have written in case we want to open it again. The next two lines create the dictionary where we will hold and update the data we read. Then we write a few variables we will use throughout the script. Next we open the data file again and skip the first line as usual, before we start our main loop.

We start our for loop, where we look at each individual game in our data file and do some work on each of them. Here we pull out the data we need: the team names and the goals scored by each of them. We then create some more variables that we are going to use as well, most of which should be self-explanatory. You may ask why these are created inside the loop. The reason is that we want to reset them after every game we have analyzed. Then we calculate the average number of goals scored both home and away. Note that we run an if statement before this calculation. That is because we impose a limit on how many data points we want before we start calculating variables and placing bets.

This is because the model can be quite erratic when it has few data points. Let us say Arsenal win their first two matches. The model may then think Arsenal is crazy good, with a high scoring rate and no goals conceded, and will probably bet on Arsenal at any odds. This might not be wrong, but you can adjust this value yourself depending on how you feel.

I think a waiting time of 4 weeks is a good number. After calculating our average values, we do the same for each individual team. Again we wait a set number of weeks before starting this calculation. We are looking to calculate the attacking rate of the home team and the away team, as well as both teams' defensive ratings.
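One common way to turn these ratings into numbers, and then into expected goals, is sketched below. This is an assumed formulation, not necessarily the exact one used in the script: each rating is a team's per-game rate divided by the matching league average, and the expected goals are attack rating times the opponent's defence rating times the league average. The dictionary keys and the sample totals are made up.

```python
def expected_goals(home, away, avg_home_goals, avg_away_goals):
    """Return (home_expected, away_expected) from per-team running totals.

    `home` and `away` are dicts of totals; the key names are hypothetical.
    """
    # Attack rating: a team's scoring rate relative to the league average.
    home_attack = (home["home_goals"] / home["home_games"]) / avg_home_goals
    away_attack = (away["away_goals"] / away["away_games"]) / avg_away_goals
    # Defence rating: a team's conceding rate relative to the league average.
    home_defence = (home["home_conceded"] / home["home_games"]) / avg_away_goals
    away_defence = (away["away_conceded"] / away["away_games"]) / avg_home_goals

    home_expected = home_attack * away_defence * avg_home_goals
    away_expected = away_attack * home_defence * avg_away_goals
    return home_expected, away_expected


# Made-up totals for two teams after five home and five away games each.
home_team = {"home_goals": 10, "home_conceded": 4, "home_games": 5}
away_team = {"away_goals": 5, "away_conceded": 8, "away_games": 5}
home_expected, away_expected = expected_goals(home_team, away_team, 1.5, 1.2)
```

In a real run you would only call this once both teams have played the minimum number of games, which also avoids dividing by zero.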

These values will then be used to calculate the expected number of goals scored by both the home and the away team. Now we are getting to the part where we use Poisson to calculate some probabilities, based on the previously calculated variables. We begin by setting up storage for the different probabilities we are about to create, so that we can use them later.

Then we start a for loop for a variable i between 0 and 10, and the same for j. These represent the number of goals scored by the home team and the away team, respectively, in the different scenarios.

We then calculate the probability of each scenario. For the first one, this is simply the chance of the home team scoring 0 goals multiplied by the chance of the away team scoring 0 goals. The probability of this score is then written to our text file, and the next iteration is run, until we have calculated every scoreline in that range. Next we read the file we just wrote and sum up all the probabilities of a home win, all the probabilities of a draw and all the probabilities of an away win.
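The double loop and the summing step can be sketched as follows. For brevity this version adds the probabilities up directly instead of writing them to a text file and reading them back. It also assumes that 10 goals is included in the range, and it redefines a small `poisson` helper so that it runs on its own. The expected-goal values in the example call are made up.

```python
import math


def poisson(actual, mean):
    """Probability of exactly `actual` goals when `mean` goals are expected."""
    return (mean ** actual) * math.exp(-mean) / math.factorial(actual)


def match_probabilities(home_expected, away_expected, max_goals=10):
    """Sum scoreline probabilities into home win, draw and away win."""
    home_win = draw = away_win = 0.0
    for i in range(max_goals + 1):        # goals scored by the home team
        for j in range(max_goals + 1):    # goals scored by the away team
            # Treat the two goal counts as independent, as described above.
            p = poisson(i, home_expected) * poisson(j, away_expected)
            if i > j:
                home_win += p
            elif i == j:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


home_win, draw, away_win = match_probabilities(1.6, 1.1)
```

The three values add up to very slightly less than 1, because scorelines with more than 10 goals for either side are left out.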

Now we have made our predictions for each team's chances of winning. With these probabilities we can start looking for good bets, and to do that we need to compare them to the price we can get on the different outcomes. Thus we fetch the odds from Bet (I will use Bet in this example, as it seems to be a popular bookmaker) and use these odds to calculate the expected value (EV) of the three different outcomes.

Out of these we pick the one with the highest EV, as we are not going to bet on two or more different outcomes here. After this we write an if statement to determine which bets we are placing.

We compare the EVs we calculated to find the outcome our model likes the most, and also make sure that its EV is positive. Then we calculate how our bet would have fared, since our data already contains the result of each game.
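A compact way to express the EV comparison and the settling of the bet is sketched below. It assumes decimal odds and defines EV per unit staked as probability times odds minus 1. The function names, the outcome labels and the sample numbers are all hypothetical.

```python
def pick_bet(probabilities, odds):
    """Return (outcome, ev) for the highest-EV outcome, or None if no EV is positive."""
    evs = {
        outcome: probabilities[outcome] * odds[outcome] - 1
        for outcome in probabilities
    }
    best = max(evs, key=evs.get)
    if evs[best] <= 0:
        return None
    return best, evs[best]


def settle(outcome, result, odds, stake):
    """Profit or loss of a bet on `outcome`, given the actual `result`."""
    if outcome == result:
        return stake * (odds[outcome] - 1)
    return -stake


# Made-up model probabilities and bookmaker odds for one game.
probabilities = {"home": 0.50, "draw": 0.25, "away": 0.25}
odds = {"home": 2.20, "draw": 3.40, "away": 3.60}
bet = pick_bet(probabilities, odds)
```

Here only the home win has a positive EV, so that is the single bet placed on this game. Calling `settle` with the actual result then gives the change to the bankroll.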

The last if statement is simply one I have written to have the Python interpreter write out all the bets that are placed. It is usually good practice to have the program write out different things here and there, so it is easier to find where problems might arise or changes could be made.
