In Part 1 of this two-part article, the pros and cons of each-way betting at the front of the market were examined. It was found that place betting at short prices was viable and should not be dismissed out of hand, but making a long-term profit by applying some pretty basic filters was unlikely to be achieved. It was also established that as the price lengthens the value of these bets using Bookmakers’ prices diminishes, and that the exchanges offered the best option for place betting.
I firmly believe there is scope to make a long-term profit from this area of the market if the correct modelling technique can be found. In fact, a few years ago I investigated place betting using k-nearest-neighbour (kNN) methodology. The theory underlying the work was that there may exist races for which the prices available make them more suited to place betting than to other forms of betting.
Essentially, my approach was to take the race in which I was hoping to bet, then compare it against all historical races and note the results for the closest matches. The results from these races would then be used to make an informed betting decision. Naturally, making the comparison is the key to this approach.
In order to do this, it is necessary to extract key race characteristics, those which adequately classify one race as distinct from other races, then form a matching algorithm which can determine the level of similarity between races, hence the use of a “nearest neighbour” algorithm.
The software I wrote used a simple set of attributes in the comparison process, then set ranges around all of the numerical variables in order to generalise the pattern and increase the number of matched races to produce a usable sample. Basically, these attributes could be used to describe the “shape” of the race, and the comparison procedure would look for similarly shaped races.
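The matching idea described above can be sketched in a few lines of Python. This is only an illustration, not the original software: the class names, field names and the `matches` function are my own assumptions, the grade value "X" is a placeholder, and the three historical races are invented toy rows. Prices are held as decimal odds (Evens = 2.0, 6/4 = 2.5 and so on).

```python
from dataclasses import dataclass
from typing import List, Tuple

Range = Tuple[float, float]  # inclusive (low, high)


@dataclass(frozen=True)
class Race:
    code: str                           # "flat" or "jumps"
    race_type: str                      # e.g. "2yo non-handicap"
    going: str                          # "turf" or "aw"
    grade: str                          # placeholder label, hypothetical
    runners: int
    prices: Tuple[float, float, float]  # first three in the market, decimal odds


@dataclass(frozen=True)
class Shape:
    code: str
    race_type: str
    going: str
    grade: str
    runners: Range                       # range set around the field size
    prices: Tuple[Range, Range, Range]   # range set around each price


def matches(shape: Shape, race: Race) -> bool:
    """True if the race has the same categories and falls inside every range."""
    same_category = (
        race.code == shape.code
        and race.race_type == shape.race_type
        and race.going == shape.going
        and race.grade == shape.grade
    )
    if not same_category:
        return False
    low, high = shape.runners
    if not low <= race.runners <= high:
        return False
    return all(lo <= p <= hi for (lo, hi), p in zip(shape.prices, race.prices))


def similar_races(shape: Shape, history: List[Race]) -> List[Race]:
    """Return the historical races that share the shape."""
    return [race for race in history if matches(shape, race)]


# Toy data, invented purely to exercise the functions.
example_shape = Shape("flat", "2yo non-handicap", "turf", "X",
                      (9, 13), ((2.0, 2.5), (3.0, 4.0), (4.5, 6.5)))
history = [
    Race("flat", "2yo non-handicap", "turf", "X", 11, (2.25, 3.5, 5.0)),  # matches
    Race("flat", "2yo non-handicap", "turf", "X", 8, (2.25, 3.5, 5.0)),   # too few runners
    Race("flat", "2yo non-handicap", "aw", "X", 11, (2.25, 3.5, 5.0)),    # different going
]
matched = similar_races(example_shape, history)
print(len(matched))  # 1
```

Strictly speaking this is a range filter rather than a ranked distance search, but it captures the same notion of "nearest" races: widening the ranges admits more neighbours at the cost of a looser match.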
The image shows the details the program examined. In order to define the shape of the race I used the race code (flat or jumps), the race age/type, the number of runners, going (AW/Turf), race grade and, most importantly, the prices for the first three in the market. Once the race and price data were entered, the program generated the race shape, and then calculated the average profit and loss figures for all similar historical races.
In the example, the race was classified as a 2yo non-handicap (excluding sellers and claimers), run on turf with a field size of nine to thirteen runners, where the first three in the market are priced between Evens and 6/4, 2/1 and 3/1, and 7/2 and 11/2 respectively. The program took the raw data (11 runners; 5/4, 5/2 and 4/1) and set up a more general shape by extending these into ranges in order to generate returns based on a reasonable sample size.
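To make the ranges concrete, the short sketch below converts the fractional prices to decimal odds and confirms that the raw data sits inside the generalised shape. The function and constant names are my own, and the code simply hard-codes the ranges quoted in the example; it makes no attempt to reproduce the rule the program used to derive them.

```python
from fractions import Fraction


def to_decimal(price: str) -> float:
    """Convert a fractional price such as '5/4' or 'Evens' to decimal odds."""
    text = price.strip()
    if text.lower() in ("evens", "evs"):
        return 2.0
    return float(Fraction(text)) + 1.0


# Ranges quoted in the example (hard-coded, not derived).
RUNNER_RANGE = (9, 13)
PRICE_RANGES = [("Evens", "6/4"), ("2/1", "3/1"), ("7/2", "11/2")]


def in_shape(runners: int, prices: list) -> bool:
    """True if the field size and all three prices fall inside the ranges."""
    low, high = RUNNER_RANGE
    if not low <= runners <= high:
        return False
    return all(
        to_decimal(lo) <= to_decimal(p) <= to_decimal(hi)
        for (lo, hi), p in zip(PRICE_RANGES, prices)
    )


raw_prices = ["5/4", "5/2", "4/1"]
print([to_decimal(p) for p in raw_prices])  # [2.25, 3.5, 5.0]
print(in_shape(11, raw_prices))             # True
```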
The output shows that, for the historical races which matched this race shape in terms of race age, grade, going, runners and price ranges, the favourite won 44 times and was placed a total of 89 times from 123 races. However, backing all of them would have lost 18p/£ at Bookmakers’ starting price for win bets and 9p/£ for place bets (the EW Ret column heading is misleading; it actually refers to the place profit/loss).
The subscript figures (3.02 and 1.43) are the calculated value prices, based on the 123 similar races, in decimal format after being increased by 12.5% to provide some additional value. In other words, a price of 3.02 would be required on the exchanges to be considered a value bet for the favourite in the race based on the variables considered in the analysis, and 1.43 would be required for a place bet.
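The arithmetic behind these value prices is not spelled out, but one reading that reproduces both quoted figures is to take the fair decimal price implied by the strike rate (races divided by successes) and apply the 12.5% uplift to the odds portion only, that is, to the decimal price minus the returned stake. The function name and the `margin` parameter below are my own labels for this assumed calculation.

```python
def value_price(successes: int, races: int, margin: float = 0.125) -> float:
    """Decimal price needed for value, under the assumed calculation:
    fair price = races / successes, with the margin applied to (price - 1)."""
    fair_decimal = races / successes
    return 1.0 + (fair_decimal - 1.0) * (1.0 + margin)


win_value = value_price(44, 123)    # favourite won 44 of 123
place_value = value_price(89, 123)  # favourite placed 89 of 123
print(round(win_value, 2), round(place_value, 2))  # 3.02 1.43
```

For the win bet, 123/44 gives a fair price of about 2.80; increasing the 1.80 of odds by 12.5% gives 2.02, hence 3.02. Applying the uplift to the whole decimal price instead would give roughly 3.14, which does not match the figure shown.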
Since the value prices for both the favourite and second favourite were not available on the exchanges, and their historical profiles showed losses for all bets placed (-18p/£, -9p/£; -14p/£, -4p/£), the best option was to consider the third favourite. Backing the third favourite in the closest historical races would have returned a profit for both the win and place bets (+16p/£ and +5p/£); consequently, for this race the program was suggesting that the third favourite should be backed, certainly to win, and possibly to place as well.
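The reasoning in this example can be reduced to two simple checks per runner: is the historical profile profitable, and does the exchange price reach the value price? The sketch below is an assumption about how such checks might be coded, not the program itself. The function names are mine, and the exchange prices of 2.30 and 1.25 for the favourite are hypothetical figures chosen only to illustrate a case where the value prices are not on offer.

```python
def profile_signals(win_return: float, place_return: float) -> dict:
    """Flag profitable historical profiles (returns in pence per £ staked)."""
    return {
        "positive_win_profile": win_return > 0,
        "positive_place_profile": place_return > 0,
    }


def value_signals(exchange_win: float, exchange_place: float,
                  win_value: float, place_value: float) -> dict:
    """Flag exchange prices (decimal) that reach the required value prices."""
    return {
        "win_value_available": exchange_win >= win_value,
        "place_value_available": exchange_place >= place_value,
    }


# Historical profiles quoted in the example.
favourite = profile_signals(-18, -9)
second_favourite = profile_signals(-14, -4)
third_favourite = profile_signals(16, 5)

# Hypothetical exchange prices for the favourite, set against its value prices.
favourite_value = value_signals(2.30, 1.25, win_value=3.02, place_value=1.43)

print(third_favourite)
print(favourite_value)
```

With these inputs only the third favourite shows a positive win and place profile, which is the selection the program pointed to.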
Now for the caveats with this approach. The race data is easy to enter, but the range for the number of runners, which is set by the program, needs to be optimised after testing. The price data for the historical races is based on starting prices which are, of course, not available pre-race.
Consequently, it is necessary to use price shows as close to the off time as possible in order to get comparable data. Again, the ranges set by the program are flexible and would benefit from further research. Critically, the three prices need to be distinct, which rules out races with, for example, joint favourites.
In order to validate the usefulness of the program, I opted for a live test, initially of 1,600 non-handicaps. Adopting a live test would enable me to replicate the betting conditions as closely as possible using pre-race data. The positive advice from the program comes in four parts: a profitable win profile, a profitable place profile, a value win price and a value place price.
In other words, you can choose to bet all qualifiers for which the historical win and/or place returns are positive (the third favourite in the example), or all those for which the price available on the exchanges beats the advised win or place price (3.02, 1.43 etc. in the example). For these four options the live test produced the following returns:
Beating the Win Value Price: 1555 bets, +5p/£ profit
Beating the Place Value Price: 1467 bets, +1p/£ profit
Positive Win Profile: 1399 bets, +10p/£ profit for win bets, break even for place bets
Positive Place Profile: 1637 bets, -1p/£ loss for place bets, +6p/£ profit for win bets
So, a mixed bag of test results suggesting that much more work is required. If further testing of this approach resulted in a smaller profit margin, then modifying the way the ranges are set would be the obvious first step.
The other key change would be with respect to the variables that are used to form the shape of the race. For example, one additional factor I intended to test was an indicator of whether each horse had run before. This could be important for 2yo races, which feature many unraced horses, given that the sample of races analysed in Part 1 suggests that the profit and loss figures for unraced juveniles are, on average, poorer than for experienced runners, especially for those which are odds-against.
However, increasing the number of variables used will reduce the number of matched races, so a larger database of historical races would be required. Notwithstanding these issues, I think the approach has real promise, so if anyone takes it further please let me know how you get on.