Trading Horse racing : Am I fooling myself?

The sport of kings.
boony
Posts: 6
Joined: Mon Nov 07, 2016 8:53 pm

Sun Sep 16, 2018 2:04 pm

ruthlessimon wrote:
Sat Sep 15, 2018 5:14 pm
boony wrote:
Sat Sep 15, 2018 2:54 am
I've been back-testing a strategy and it is showing a profit over the ~3300 races I've tested so far. However, the equity curve is "choppy" to say the least. If I filter out a particular set of courses it looks a lot better (see graph).
Is there data missing from the "everything" line?

Whenever I filter a variable (i.e. course), I always see a drop in frequency - so I'm interested in how you've got them to both match
Simon, there's no data missing. I have a row for each race in Excel, and one of the columns is the profit/loss for the race. I create a new column in Excel which is the 'everything' cumulative profit/loss. When I 'filter' I don't use the Excel filter; instead I have another column with a formula that drives the profit/loss for races at the filtered courses to zero. Then I can simply plot both columns on the same chart.
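For anyone who wants to try the same trick outside Excel, here is a minimal Python sketch of the idea described above: zero the profit/loss at excluded courses rather than dropping the rows, so both cumulative lines keep one point per race. The function name `equity_curves`, the course names and the figures are made up for illustration and are not taken from the actual spreadsheet.

```python
from itertools import accumulate

def equity_curves(races, excluded_courses):
    """Return (everything, filtered) cumulative P/L lists of equal length.

    races: list of (course, profit_loss) tuples in race order.
    """
    everything = list(accumulate(pl for _, pl in races))
    # Zero the P/L for excluded courses instead of removing the row,
    # so both curves share the same x-axis (one point per race).
    filtered = list(accumulate(0.0 if course in excluded_courses else pl
                               for course, pl in races))
    return everything, filtered

# Hypothetical sample data, for illustration only.
races = [("Ascot", 2.0), ("Kempton", -1.0), ("Ascot", 0.5), ("Kempton", -1.5)]
everything, filtered = equity_curves(races, {"Kempton"})
print(everything)  # [2.0, 1.0, 1.5, 0.0]
print(filtered)    # [2.0, 2.0, 2.5, 2.5]
```

Because the filtered line stays flat across excluded races instead of skipping them, the two lists can be plotted against the same race index.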

Thanks to all for the feedback - much appreciated.

I'm continuing to run the back-test on more races - I want to run against a full year. The problem is my back-testing software is painfully slow so the information I seek is going to take some time to obtain... and I'm impatient :)

ShaunWhite
Posts: 3706
Joined: Sat Sep 03, 2016 3:42 am

Sun Sep 16, 2018 5:36 pm

boony wrote:
Sun Sep 16, 2018 2:04 pm
The problem is my back-testing software is painfully slow so the information I seek is going to take some time to obtain... and I'm impatient :)
How long is a long time? I run my big backtests overnight. My 'full' test takes about 6 hours and is just about finishing when I get up again.

ruthlessimon
Posts: 1465
Joined: Wed Mar 23, 2016 3:54 pm

Sun Sep 16, 2018 5:45 pm

ShaunWhite wrote:
Sun Sep 16, 2018 5:36 pm
How long is a long time? I run my big backtests overnight. My 'full' test takes about 6 hours and is just about finishing when I get up again.
6hrs!?!

lol - I moan if it takes anything longer than a couple of minutes

boony
Posts: 6
Joined: Mon Nov 07, 2016 8:53 pm

Sun Sep 16, 2018 5:58 pm

ruthlessimon wrote:
Sun Sep 16, 2018 5:45 pm
ShaunWhite wrote:
Sun Sep 16, 2018 5:36 pm
How long is a long time? I run my big backtests overnight. My 'full' test takes about 6 hours and is just about finishing when I get up again.
6hrs!?!

lol - I moan if it takes anything longer than a couple of minutes
Lol

I kicked my back-test off at 01:30 on Friday. The 3300 races was a snapshot roughly 24 hours later.

Please tell me how you're doing it so quickly!!

I suspect it's the amount of data I'm processing that is the difference. I log full market depth from 30 mins out until market is suspended, including all the in-play data. My back-test then involves replaying all that data and simulating the bet placement and matching.

ruthlessimon
Posts: 1465
Joined: Wed Mar 23, 2016 3:54 pm

Sun Sep 16, 2018 6:30 pm

boony wrote:
Sun Sep 16, 2018 5:58 pm
Please tell me how you're doing it so quickly!!

I suspect it's the amount of data I'm processing that is the difference. I log full market depth from 30 mins out until market is suspended, including all the in-play data.
30 mins out + in-play! Blimey, yeah, that'd be a lot of data ;)

If I'm looking at a "specific group" - I will initially refine my full dataset (i.e. only Hcaps) - straight away that reduces the workload on Excel

But generally, I'll be working on 3mth samples, with only the top 4 runners - this usually (max) equals between 10,000 - 20,000 rows, 600 columns (5mins price, 5mins vol)

For me personally, the majority of speed issues seem to be related to inefficient formulas

boony
Posts: 6
Joined: Mon Nov 07, 2016 8:53 pm

Sun Sep 16, 2018 6:35 pm

So, I'm another couple thousand races processed and my analysis brings me back to my original concern as to whether it's a legit process to filter out courses.
Untitled2.png
Blue line is betting everything. Orange line is betting everything except races at the courses I decided were poor after looking at results after ~3300 races. Grey line is a new set of excluded courses which I determined looking at the results after ~5300 races.

After the first 3300 races, you can see that orange line moved about but ultimately flat-lined over the next 2000 races.

Obviously the grey line looks a lot better, but what will happen over the next 2000 races? Will it flat-line again and force me to come up with another set of course exclusions to make it profitable?

It seems to me that if the filters I come up with have no relevance on future races then they're pointless and this whole process of back-testing and applying filters is flawed.
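One way to test that worry directly is to pick the exclusions using only an early block of races and then score them on later races the filter never saw. The sketch below assumes a simple rule (exclude any course with a negative total in the early block), which may not be how the courses were actually chosen. The names `losing_courses` and `out_of_sample_check` and the data are hypothetical.

```python
from collections import defaultdict

def losing_courses(races):
    """Courses whose total P/L over the given races is negative."""
    totals = defaultdict(float)
    for course, pl in races:
        totals[course] += pl
    return {course for course, total in totals.items() if total < 0}

def out_of_sample_check(races, split):
    """Choose exclusions on races[:split], then score them on races[split:]."""
    train, test = races[:split], races[split:]
    excluded = losing_courses(train)
    unfiltered = sum(pl for _, pl in test)
    filtered = sum(pl for course, pl in test if course not in excluded)
    return excluded, unfiltered, filtered

# Hypothetical data: course "B" loses early on, then recovers later.
races = [("A", 1.0), ("B", -2.0), ("A", 1.0), ("B", -1.0),   # used to pick the filter
         ("A", -1.0), ("B", 2.0), ("A", 0.5), ("B", 1.0)]    # held back for testing
excluded, unfiltered, filtered = out_of_sample_check(races, split=4)
print(excluded, unfiltered, filtered)
```

If the filtered total on the held-back races is no better than the unfiltered one, as in this toy data, the exclusions were probably fitted to noise rather than to anything about the courses.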

CallumPerry
Posts: 186
Joined: Wed Apr 19, 2017 5:12 pm
Location: Wolverhampton

Sun Sep 16, 2018 7:07 pm

I personally think at this stage you just go in with small stakes and try it out. Otherwise you run the risk of over-fitting. That looks as good an indicator as anything I can imagine to give it a go with a couple of quid you don't mind losing. Track everything when you go live and then after a few hundred markets you should see whether it looks promising; after a few thousand, if it still works, you've cracked it!

May I ask, what programmes do you lot use to back test? I'm just using Excel. Say I use a one second logger and record all of the key information in the spreadsheet from 20 minutes out until 00:00:00, that's 1,200 rows of data per market. If I record thousands like you lot have and try to get Excel to create charts and stuff it would just freeze. I've seen a video before, I think it was on Nigelk's YouTube channel, where there was a loading bar. Is this something you lot use or is it a completely different programme? Point me in the right direction please!

boony
Posts: 6
Joined: Mon Nov 07, 2016 8:53 pm

Sun Sep 16, 2018 8:04 pm

CallumPerry wrote:
Sun Sep 16, 2018 7:07 pm
I personally think at this stage you just go in with small stakes and try it out. Otherwise you run the risk of over-fitting. That looks as good an indicator as anything I can imagine to give it a go with a couple of quid you don't mind losing. Track everything when you go live and then after a few hundred markets you should see whether it looks promising; after a few thousand, if it still works, you've cracked it!

May I ask, what programmes do you lot use to back test? I'm just using Excel. Say I use a one second logger and record all of the key information in the spreadsheet from 20 minutes out until 00:00:00, that's 1,200 rows of data per market. If I record thousands like you lot have and try to get Excel to create charts and stuff it would just freeze. I've seen a video before, I think it was on Nigelk's YouTube channel, where there was a loading bar. Is this something you lot use or is it a completely different programme? Point me in the right direction please!
I've written a suite of programs.

First one is the Logger; every half second it logs pretty much everything it gets from the Betfair API starting at 30 mins pre-scheduled off, until the race is suspended after going in-play. So the amount of data is vast - hence slow to process.

Then I have the Simulator that mimics the Betfair API and can replay the data logged by the Logger.

Finally, a Strategy-Runner calls the Simulator as though it were Betfair to get data and allows me to test strategies. The strategy receives the data one row at a time, does the processing and then places the bets. The bet matching is simulated to mimic real life as closely as possible. My simulator outputs a results file that I load into Excel to do charts and other stuff.

I've tested loads of strategies but have only managed to get one that is profitable so far - I'm hoping this one will be my second. The beauty is that once the strategy is coded, it's a one-line code change to make it use the real API... and the 'bot' is born :)
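To give a rough idea of why the switch can be a one-line change, here is a heavily simplified Python sketch. It is not the actual suite: the `MarketSource` interface, the `Simulator` class, the row fields and the toy betting rule are all assumptions made for illustration, and bet matching is not simulated at all.

```python
from typing import Iterator, Protocol

class MarketSource(Protocol):
    """Anything the strategy can read rows from and send bets to."""
    def rows(self) -> Iterator[dict]: ...
    def place_bet(self, selection: str, side: str, price: float, stake: float) -> None: ...

class Simulator:
    """Replays logged rows and records bets instead of sending them anywhere."""
    def __init__(self, logged_rows):
        self._rows = logged_rows
        self.bets = []

    def rows(self):
        yield from self._rows

    def place_bet(self, selection, side, price, stake):
        self.bets.append((selection, side, price, stake))

def run_strategy(source: MarketSource) -> None:
    """Process one row at a time, as the Strategy-Runner is described doing."""
    for row in source.rows():
        # Toy rule, for illustration only: back when the price is above 3.0.
        if row["back_price"] > 3.0:
            source.place_bet(row["selection"], "BACK", row["back_price"], 2.0)

# Hypothetical logged data.
logged = [{"selection": "Horse A", "back_price": 2.8},
          {"selection": "Horse A", "back_price": 3.2}]
source = Simulator(logged)  # the one line to change when going live
run_strategy(source)
print(source.bets)  # [('Horse A', 'BACK', 3.2, 2.0)]
```

The strategy only ever talks to the interface, so a live client exposing the same two methods could be swapped in on the marked line without touching the strategy code.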

Dublin_Flyer
Posts: 424
Joined: Sat Feb 11, 2012 10:39 am

Sun Sep 16, 2018 8:37 pm

Have you re-run the orange line with the bad races included, to see whether the recent flat line was an arbitrary/seasonal fvck up? This month and next month are notorious for weird results with the changeover from flat to NH, and likewise April/May from NH to flat.

5500 races is only about 3 or 3.5 months because of summer racing quantities, so it could be worth running it a good while longer to see if the present streak is a seasonal change/aberration. The last downhill streak of 250 or so races only equates to about a 7 or 8 day period when there are 35-45 races daily. Everyone has a bad week, and prior to the last week its growth was significant for the number of races involved.

I'm of the view that backfitting your system if you have a logical reason to do so is ok, it's when you start superbackfitting that it screws up.

CallumPerry
Posts: 186
Joined: Wed Apr 19, 2017 5:12 pm
Location: Wolverhampton

Sun Sep 16, 2018 9:13 pm

That is seriously impressive stuff boony... not going to lie, some people's knowledge and skills on this forum frighten me and my basic little robots. It's like I'm designing a really good paper airplane that is being thrown into a hurricane when other traders have made military standard fighter jets with their own hands.

Hope all of your work pays off for you!
