Backtesting method/software

andy28
Posts: 379
Joined: Sat Jan 30, 2021 12:06 am
Location: NZ

I began by splitting my data at the outset. Instead of paper testing, I use Practice mode to execute the bets, which I suppose constitutes my modeling process. I acknowledge that Practice mode has its drawbacks, particularly around queuing and staking. However, given that my strategy revolves around value bets with stakes typically under $10 for a win bet, and it's set to take the second-best price, I'm confident it provides a fair representation of a live test.

My strategy is centered around Australian racing. Despite lacking data, I extended it to NSW racing for back bets and, remarkably, it yielded similar results in Practice mode. Recently I also applied it to UK racing, again without prior data, and saw a comparable pattern. While not a perfect match, there's certainly a strong correlation.

I think I have misunderstood what people mean by modeling. I've watched videos by Captain Jack on how to construct a model, which involves using historical data. This is likely where my confusion comes from.

I think I do suffer from Arse About Face Syndrome, as I always seem to go the wrong way about doing something, but maybe this time it has paid off: the live trading is 3 winning days from 3. Early days, I know, but I've got to take the leap at some stage.
Anbell
Posts: 2065
Joined: Fri Apr 05, 2019 2:31 am

andy28 wrote:
Wed Apr 03, 2024 9:34 am
I began by splitting my data at the outset. Instead of paper testing, I use Practice mode to execute the bets, which I suppose constitutes my modeling process. I acknowledge that Practice mode has its drawbacks, particularly around queuing and staking. However, given that my strategy revolves around value bets with stakes typically under $10 for a win bet, and it's set to take the second-best price, I'm confident it provides a fair representation of a live test.

My strategy is centered around Australian racing. Despite lacking data, I extended it to NSW racing for back bets and, remarkably, it yielded similar results in Practice mode. Recently I also applied it to UK racing, again without prior data, and saw a comparable pattern. While not a perfect match, there's certainly a strong correlation.

I think I have misunderstood what people mean by modeling. I've watched videos by Captain Jack on how to construct a model, which involves using historical data. This is likely where my confusion comes from.

I think I do suffer from Arse About Face Syndrome, as I always seem to go the wrong way about doing something, but maybe this time it has paid off: the live trading is 3 winning days from 3. Early days, I know, but I've got to take the leap at some stage.
There are a million ways to do it - but it all boils down to hypothesis>test>pretend profit>scale>actually profit

If you scale after 3 days of profit you will be broke within 2 weeks, so be patient.
andy28
Posts: 379
Joined: Sat Jan 30, 2021 12:06 am
Location: NZ

Yeah, I have no plans to scale at this stage. I want to run it for a year (unless it collapses) to see if there's a seasonal impact and the like. But as each day passes it will give me more results to analyse, and that can't be a bad thing. I might even make a few bucks along the way.
jimibt
Posts: 3678
Joined: Mon Nov 30, 2015 6:42 pm
Location: Narnia

firlandsfarm wrote:
Tue Apr 02, 2024 6:08 pm
How are you defining modelling, Peter? It's all based on past stats in one form or another.
+100 - this has always been my issue whenever backtesting (and especially MODELLING) is mentioned here (going back [no pun intended] at least 6 years).

For me, backtesting is exactly as sniffer mentioned in an earlier response: it's a confidence measure of how well your strategy MAY perform under unseen conditions (i.e. using an 80/20 data split of whatever flavour supports your approach).

I will continue to fight this corner in asserting that backtesting is crucial, and if the tools are using modelled data/algos, then so be it. Likewise, if the tools are a spreadsheet or notes on a fag packet, then also so be it - whatever gets the job done.

Pattern recognition doesn't get haphazardly plucked from the outerverse - it's an observable collection of data points that on their own may not offer up a silver bullet. Hence the need to construct cohesive models that support any number of confluences that can be used to identify entry and exit points within a given approach. It's a nonsense to expect individuals to have the headspace to deal with the sheer variety of available data points in realtime and apply them without some sort of modelling (aka backtesting) of the idea.

I would however be really happy to be shown alternative approaches that didn't involve the effort that I continually undergo but still assisted me in forming rational approaches (for both including and discounting ideas).

Anyway - each to their own, i know which side of the equation i'd rather reside on. :geek: :mrgreen:

[edit] - as an aside, my data models and the data points that are being backtested are totally unrelated to form, age, lineage or whether the horse has the word *Tulip* in its name. My approaches are all about repeatable mechanical market observations. This is why backtesting confers so much benefit.
Fugazi
Posts: 310
Joined: Wed Jan 10, 2024 7:20 pm

jimibt wrote:
Wed Apr 03, 2024 12:24 pm
firlandsfarm wrote:
Tue Apr 02, 2024 6:08 pm
How are you defining modelling, Peter? It's all based on past stats in one form or another.
+100 - this has always been my issue whenever backtesting (and especially MODELLING) is mentioned here (going back [no pun intended] at least 6 years).

For me, backtesting is exactly as sniffer mentioned in an earlier response: it's a confidence measure of how well your strategy MAY perform under unseen conditions (i.e. using an 80/20 data split of whatever flavour supports your approach).

I will continue to fight this corner in asserting that backtesting is crucial, and if the tools are using modelled data/algos, then so be it. Likewise, if the tools are a spreadsheet or notes on a fag packet, then also so be it - whatever gets the job done.

Pattern recognition doesn't get haphazardly plucked from the outerverse - it's an observable collection of data points that on their own may not offer up a silver bullet. Hence the need to construct cohesive models that support any number of confluences that can be used to identify entry and exit points within a given approach. It's a nonsense to expect individuals to have the headspace to deal with the sheer variety of available data points in realtime and apply them without some sort of modelling (aka backtesting) of the idea.

I would however be really happy to be shown alternative approaches that didn't involve the effort that I continually undergo but still assisted me in forming rational approaches (for both including and discounting ideas).

Anyway - each to their own, i know which side of the equation i'd rather reside on. :geek: :mrgreen:

[edit] - as an aside, my data models and the data points that are being backtested are totally unrelated to form, age, lineage or whether the horse has the word *Tulip* in its name. My approaches are all about repeatable mechanical market observations. This is why backtesting confers so much benefit.
Remind me Jim - you collect it with the API and C# rather than using Bet Angel's store data functions or Python? Got lots of data I want to collect but don't know how to do it efficiently.
jimibt
Posts: 3678
Joined: Mon Nov 30, 2015 6:42 pm
Location: Narnia

Fugazi wrote:
Wed Apr 03, 2024 1:05 pm
Remind me Jim - you collect it with the API and C# rather than using Bet Angel's store data functions or Python? Got lots of data I want to collect but don't know how to do it efficiently.
To cut a long story short: I got back into Bet Angel as someone on this forum asked me to take a look at how easy it would be to build a C# facade over the top of the Bet Angel API. As it turns out it was fairly straightforward, and in looking at it I thought I'd test a few things out, such as data collection and using the facade to build a strategy interface against. It seems to work quite well but it's still early days. There are also tools out there such as (mentioned earlier) flumine etc. https://github.com/betcode-org/flumine
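
For anyone weighing up the Python route instead, here's roughly what a minimal flumine setup looks like, along the lines of the betcode-org README - the credentials, the market filter and the print-only strategy body are placeholder assumptions on my part, not a tested setup:

```python
import betfairlightweight
from betfairlightweight.filters import streaming_market_filter
from flumine import BaseStrategy, Flumine, clients

class RecorderStrategy(BaseStrategy):
    """Placeholder strategy: just prints last traded prices as they stream."""

    def check_market_book(self, market, market_book):
        # only pass open markets through to process_market_book
        return market_book.status != "CLOSED"

    def process_market_book(self, market, market_book):
        for runner in market_book.runners:
            print(market.market_id, runner.selection_id, runner.last_price_traded)

# hypothetical credentials / app key
trading = betfairlightweight.APIClient("username", "password", app_key="appKey")
framework = Flumine(client=clients.BetfairClient(trading))

framework.add_strategy(
    RecorderStrategy(
        market_filter=streaming_market_filter(
            event_type_ids=["7"],   # horse racing
            country_codes=["GB"],
            market_types=["WIN"],
        )
    )
)
framework.run()  # blocks; streams matching markets and feeds the strategy
```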
conduirez
Posts: 298
Joined: Tue May 23, 2023 8:25 pm

jimibt wrote:
Wed Apr 03, 2024 2:03 pm
Fugazi wrote:
Wed Apr 03, 2024 1:05 pm
Remind me Jim - you collect it with the API and C# rather than using Bet Angel's store data functions or Python? Got lots of data I want to collect but don't know how to do it efficiently.
To cut a long story short: I got back into Bet Angel as someone on this forum asked me to take a look at how easy it would be to build a C# facade over the top of the Bet Angel API. As it turns out it was fairly straightforward, and in looking at it I thought I'd test a few things out, such as data collection and using the facade to build a strategy interface against. It seems to work quite well but it's still early days. There are also tools out there such as (mentioned earlier) flumine etc. https://github.com/betcode-org/flumine
If you're interested in Flumine, Fugazi, as you're learning Python, here is the exact link to the forum that ShaunWhite invited me to; it will help you get Flumine working for you.

The Slack Betcode channel.

https://join.slack.com/t/betcode-org/sh ... c9hokJNu7w
Euler
Posts: 24816
Joined: Wed Nov 10, 2010 1:39 pm
Location: Bet Angel HQ

Backtesting / backfitting is very different from modelling.

I think people often conflate the two. I collect data and lots of it, just to test assumptions or to understand the market better.

But I need to model the data to predict the future and deploy my strategy into it.
jimibt
Posts: 3678
Joined: Mon Nov 30, 2015 6:42 pm
Location: Narnia

Euler wrote:
Wed Apr 03, 2024 3:43 pm
Backtesting / backfitting is very different from modelling.
backfitting = the delusional approach of believing that historic data can predict the future
backtesting = the informed approach of understanding market dynamics and appropriately using data models to ASSIST in execution
Euler wrote:
Wed Apr 03, 2024 3:43 pm
I think people often conflate the two. I collect data and lots of it, just to test assumptions or to understand the market better.

But I need to model the data to predict the future and deploy my strategy into it.
if we want to be technically correct, what we actually do is the following:

1. collect and prepare data that forms attributes within a data model
2. create a strategy that is aligned to the attributes within the model
3. simulate the profile of the strategy using an 80% exposed / 20% blind split of the data
4. run the model against the 80% exposed data (parameter adjustment permitted at this stage)
5. rerun against the blind 20% to ascertain that we have not data/parameter fitted our assumptions
6. randomly reassign a further split on the data and rerun as required (steps 3-6 are sketched just after this list)
7. forward test using either a test account or with small stakes
8. scale into the live market - risk dictated by metrics obtained from backtesting (drawdown etc)
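
Purely to make steps 3-6 concrete, a minimal Python sketch - the record fields, the single threshold parameter and the toy 1-unit P&L rule are all hypothetical stand-ins for whatever the real data model holds:

```python
import numpy as np

rng = np.random.default_rng(42)

def backtest(records, threshold):
    """Toy P&L: place a 1-unit back bet whenever the signal clears the
    threshold. Each record is a dict with hypothetical 'signal', 'odds'
    and 'won' fields."""
    pnl = 0.0
    for r in records:
        if r["signal"] > threshold:
            pnl += (r["odds"] - 1.0) if r["won"] else -1.0
    return pnl

def run_split(records):
    # step 3: random 80% exposed / 20% blind split
    idx = rng.permutation(len(records))
    cut = int(0.8 * len(records))
    exposed = [records[i] for i in idx[:cut]]
    blind = [records[i] for i in idx[cut:]]

    # step 4: parameter adjustment is permitted on the exposed data only
    grid = np.linspace(0.0, 1.0, 21)
    best = max(grid, key=lambda t: backtest(exposed, t))

    # step 5: a single pass over the blind 20% with the chosen parameter -
    # a big drop versus the exposed result suggests parameter fitting
    return backtest(exposed, best), backtest(blind, best)

# step 6: repeat with fresh random splits and check the results are stable,
# e.g. for _ in range(10): print(run_split(records))
```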

I think we are all saying the same thing, except I am firmly of the belief that there is a widespread misconception that backtesting equates to looking at form data, running a spreadsheet against it and then creating a strategy. In my world it's a purely mechanical process which starts with an exploration of the data and then moves on to looking at ways to exploit any observations (an example might be one where the data shows that 95% of the time there is a huge swell in volume/prices around the -120 second mark, followed by many equally spaced standard price deviations from the previous mean on key runners - etc). Using this *data first* approach, every care is taken to mitigate data fitting and to focus only on outlier behaviours etc.
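
To illustrate the -120 second example, a hedged sketch of how such a swell might be flagged in recorded tick data - the (secs_to_off, price, volume) layout, the window length and both thresholds are assumptions for illustration only:

```python
import numpy as np

def swell_candidates(ticks, window=30, vol_mult=3.0, z_thresh=2.0):
    """Flag ticks where traded volume swells well above its recent mean
    while price sits several standard deviations from that mean.
    'ticks' is a hypothetical list of (secs_to_off, price, volume) rows."""
    flags = []
    for i in range(window, len(ticks)):
        secs, price, vol = ticks[i]
        hist = ticks[i - window:i]
        prices = np.array([t[1] for t in hist])
        vols = np.array([t[2] for t in hist])
        mu, sd = prices.mean(), prices.std() or 1e-9
        if vol > vol_mult * vols.mean() and abs(price - mu) / sd > z_thresh:
            flags.append(secs)  # e.g. look for clustering near -120s
    return flags
```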

I'm fairly risk averse (but have courage in my convictions), so for me, based on my skillset, the above gives me confidence when I reach points #7 and #8 above.

[edit] - current example which has reached #7 above:
Screenshot 2024-04-03 162619.png
sniffer66
Posts: 1681
Joined: Thu May 02, 2019 8:37 am

I'm actually at #6 & #7 with a model I've created purely from historical data and it's doing as expected

In tandem with that, I'm testing out something on the machine learning side that trains purely on live data, so kind of in reverse. Starting with the tiniest dataset, I'm pulling down a pretty comprehensive set of data per minute, per live match, getting the model to predict an outcome, then capturing the outcome and retraining the model on the fly with the data it used for the prediction, kicking out an accuracy report on every update.
So there's a feedback loop that updates the model in (almost) real time, in place for the next prediction. Basically, live learning.

I just kicked this off earlier and it's interesting watching how each prediction affects the overall accuracy and new predictions. The hope is that over time it becomes accurate enough to use; however, speed is going to be key. Time will tell :)
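
For anyone curious, a minimal sketch of that predict-then-retrain loop using scikit-learn's partial_fit (recent scikit-learn assumed) - the two callbacks, the binary outcome and the feature plumbing are my assumptions, not sniffer66's actual pipeline:

```python
from collections import deque

import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier(loss="log_loss")
CLASSES = np.array([0, 1])   # e.g. outcome happens / doesn't
pending = deque()            # (features, prediction) awaiting real outcomes
hits = total = 0

def on_minute(features):
    """Called once per minute per live match with that match's features."""
    X = np.asarray(features, dtype=float).reshape(1, -1)
    # can't predict until the model has been fitted at least once
    pred = int(model.predict(X)[0]) if total else 0
    pending.append((X, pred))
    return pred

def on_outcome(outcome):
    """Called when the real outcome for the oldest prediction is known:
    retrain on the exact data used for that prediction, then report."""
    global hits, total
    X, pred = pending.popleft()
    model.partial_fit(X, np.array([outcome]), classes=CLASSES)
    total += 1
    hits += int(pred == outcome)
    print(f"rolling accuracy: {hits / total:.1%} over {total} updates")
```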
Fugazi
Posts: 310
Joined: Wed Jan 10, 2024 7:20 pm

conduirez wrote:
Wed Apr 03, 2024 3:21 pm
jimibt wrote:
Wed Apr 03, 2024 2:03 pm
Fugazi wrote:
Wed Apr 03, 2024 1:05 pm
Remind me Jim - you collect it with the API and C# rather than using Bet Angel's store data functions or Python? Got lots of data I want to collect but don't know how to do it efficiently.
To cut a long story short: I got back into Bet Angel as someone on this forum asked me to take a look at how easy it would be to build a C# facade over the top of the Bet Angel API. As it turns out it was fairly straightforward, and in looking at it I thought I'd test a few things out, such as data collection and using the facade to build a strategy interface against. It seems to work quite well but it's still early days. There are also tools out there such as (mentioned earlier) flumine etc. https://github.com/betcode-org/flumine
If you're interested in Flumine, Fugazi, as you're learning Python, here is the exact link to the forum that ShaunWhite invited me to; it will help you get Flumine working for you.

The Slack Betcode channel.

https://join.slack.com/t/betcode-org/sh ... c9hokJNu7w
Thank you, and thank you Jim. I will save this info for when I've got used to Python and want to take the next step.
Fugazi
Posts: 310
Joined: Wed Jan 10, 2024 7:20 pm

jimibt wrote:
Wed Apr 03, 2024 4:15 pm
Euler wrote:
Wed Apr 03, 2024 3:43 pm
Backtesting / backfitting is very different from modelling.
backfitting = the delusional approach of believing that historic data can predict the future
backtesting = the informed approach of understanding market dynamics and appropriately using data models to ASSIST in execution
Euler wrote:
Wed Apr 03, 2024 3:43 pm
I think people often conflate the two. I collect data and lots of it, just to test assumptions or to understand the market better.

But I need to model the data to predict the future and deploy my strategy into it.
if we want to be technically correct, what we actually do is the following:

1. collect and prepare data that forms attributes within a data model
2. create a strategy that is aligned to the attributes within the model
3. simulate the profile of the strategy using an 80% exposed / 20% blind split of the data
4. run the model against the 80% exposed data (parameter adjustment permitted at this stage)
5. rerun against the blind 20% to ascertain that we have not data/parameter fitted our assumptions
6. randomly reassign a further split on the data and rerun as required
7. forward test using either a test account or with small stakes
8. scale into the live market - risk dictated by metrics obtained from backtesting (drawdown etc)

I think we are all saying the same thing, except I am firmly of the belief that there is a widespread misconception that backtesting equates to looking at form data, running a spreadsheet against it and then creating a strategy. In my world it's a purely mechanical process which starts with an exploration of the data and then moves on to looking at ways to exploit any observations (an example might be one where the data shows that 95% of the time there is a huge swell in volume/prices around the -120 second mark, followed by many equally spaced standard price deviations from the previous mean on key runners - etc). Using this *data first* approach, every care is taken to mitigate data fitting and to focus only on outlier behaviours etc.

I'm fairly risk averse (but have courage in my convictions), so for me, based on my skillset, the above gives me confidence when I reach points #7 and #8 above.

[edit] - current example which has reached #7 above:

Screenshot 2024-04-03 162619.png
Funny you should post this. I arrived at a similar strategy purely through trial and error, and it's why I'm wanting to learn to collect data, as unfortunately I'm not quite profitable the way I'm doing it. I've come close through trial and error, but even if I manage profitability it won't be anywhere near as optimal as using actual data. Without data on when exactly the price movements happen I'm having to green up instead, which hurts EV and is frequently mistimed, leading to a loss.
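
As an aside, the green-up arithmetic itself is mechanical - a minimal sketch of the standard hedge calculation (commission ignored):

```python
def green_up_lay(back_stake, back_odds, current_lay_odds):
    """Lay stake that equalises profit across all outcomes after an
    earlier back bet (standard hedging arithmetic, commission ignored)."""
    lay_stake = back_stake * back_odds / current_lay_odds
    profit = lay_stake - back_stake  # locked in whatever the result
    return lay_stake, profit

# e.g. backed $10 at 5.0 and the price has shortened to 4.0:
# green_up_lay(10, 5.0, 4.0) -> (12.5, 2.5), i.e. $2.50 whichever runner wins
```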
foxwood
Posts: 394
Joined: Mon Jul 23, 2012 2:54 pm

Fugazi wrote:
Wed Apr 03, 2024 5:14 pm
I will save this info for when I've got used to Python and want to take the next step.
No matter what your programming experience, using ChatGPT Plus (GPT-4) is a great time saver. If you are just starting out it will point you in the right direction, and if you have decades of experience and umpteen languages it will tell you the specific code solution for the current language.

It will write code for you, explain code you don't understand, and design logical solutions if you state your requirements clearly. Saves all the "how do I / shall I", "how to xyz in ... language" and all the rtfm that goes with coding.

It's not always right, and neither are the solutions always the best, but in the main it works fine as a reference point and definitely leaves the head clearer for looking at the problem and not the code/syntax etc.
jimibt
Posts: 3678
Joined: Mon Nov 30, 2015 6:42 pm
Location: Narnia

Fugazi wrote:
Wed Apr 03, 2024 5:22 pm
Funny you should post this. I arrived at a similar strategy purely through trial and error, and it's why I'm wanting to learn to collect data, as unfortunately I'm not quite profitable the way I'm doing it. I've come close through trial and error, but even if I manage profitability it won't be anywhere near as optimal as using actual data. Without data on when exactly the price movements happen I'm having to green up instead, which hurts EV and is frequently mistimed, leading to a loss.
Yes - you really NEED to be partying on top of the stddevs as they occur in this type of test (and tbh there are a few other confluence items to consider, which again requires several additional ways to chop up the current and near-term data elements). I found that even though I was easily(ish!) discovering the acceleration/deceleration points on selections, I was only ever at best making 40-50% correct entries. The additional confluences managed to notch that up a bit, though at the same time they obviously reduced the number of candidates per event.
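
A toy illustration of the confluence idea - several independent conditions all have to line up before an entry candidate counts; every threshold and the direction check here are made-up placeholders, not jimibt's actual rules:

```python
def entry_signal(z_dev, vol_ratio, accel, secs_to_off):
    """Hypothetical confluence gate. z_dev: price deviation from recent
    mean in std devs; vol_ratio: current vs recent average volume;
    accel: signed price acceleration; secs_to_off: negative pre-off."""
    checks = [
        abs(z_dev) > 2.0,            # price well off its recent mean
        vol_ratio > 3.0,             # volume swell versus recent average
        accel * z_dev > 0,           # move still accelerating the same way
        -180 <= secs_to_off <= -60,  # inside the window around -120s
    ]
    return all(checks)               # fewer candidates, better entries
```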
Fugazi
Posts: 310
Joined: Wed Jan 10, 2024 7:20 pm

foxwood wrote:
Wed Apr 03, 2024 5:50 pm
Fugazi wrote:
Wed Apr 03, 2024 5:14 pm
I will save this info for when I've got used to python and wanted to take the next step
No matter what your programming experience, using chatgptPlus (4) is a great time saver. If you are just starting out it will point you in the right direction and if you have decades of experience and umpteen languages it will tell you the specific code solution for the current language.

It will write code for you, explain code you don't understand, design logical solutions if you state your requirements clearly. Saves all the "how do I / shall I", "how to xyz in ... language" and all the rtfm that goes with coding.

It's not always right and neither are the solutions always the best but in the main it works fine as a reference point and definitely leaves the head clearer for looking at the problem and not the code / syntax etc
This is exactly what I'm doing. I'm halfway through a good Udemy course (it was only a tenner), and I use GPT as my tutor when I don't understand. Makes the learning so easy.

My next step is going to be getting GPT to help code prediction models; with basic Python knowledge I'll have a better understanding of what I'm asking it and the nuances.

Once I have made a few models I want to move up to what Jim/Cond are doing.