Morning everyone,
I have recently embarked on a deep data-analysis project and I'm looking for opinions on whether it is worthwhile.
In the 2 or so years since I started trading, I have tried many different methods and strategies, with mixed results. I recently came to the conclusion that I would like to start relying on my own data research and analysis rather than reading stats on sports sites.
What I'm currently looking to do is basically this: collect both STATS and ODDS data from games within leagues that have enough liquidity, and cross-reference the data to look for an edge in certain situations.
A very poor example would be...
IF the score is 0-0 by the 30th minute,
AND the home team scored X goals in its last few matches
AND the 0-0 odds are between X-Y
THEN back/lay/whatever (again, please don't take this example seriously)
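As a rough sketch of how a rule like this could be checked against collected data: the field names, the thresholds and the four sample rows below are all made up for illustration and do not come from any real feed or match.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """State of one match at a given minute (hypothetical schema)."""
    minute: int
    home_goals: int
    away_goals: int
    home_goals_recent: int  # goals the home team scored in its recent matches
    odds_0_0: float         # decimal odds of the 0-0 correct score


def rule_matches(s, min_recent_goals, odds_low, odds_high):
    """True when the snapshot satisfies every condition of the example rule."""
    return (
        s.minute == 30
        and s.home_goals == 0
        and s.away_goals == 0
        and s.home_goals_recent >= min_recent_goals
        and odds_low <= s.odds_0_0 <= odds_high
    )


# Illustrative rows only, not real matches.
snapshots = [
    Snapshot(30, 0, 0, 7, 4.2),
    Snapshot(30, 1, 0, 9, 15.0),
    Snapshot(30, 0, 0, 2, 4.0),
    Snapshot(30, 0, 0, 8, 9.5),
]

# Placeholder thresholds standing in for the X and X-Y of the rule.
selections = [s for s in snapshots if rule_matches(s, 6, 3.5, 5.0)]
print(len(selections), "of", len(snapshots), "snapshots qualify")
```

Keeping the rule as one small function makes it easy to see how quickly the number of qualifying matches drops as conditions are added.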
I like the idea of getting my own data. My question to you guys is this... do you see potential in this? Or do the bookies (and exchanges) already have an edge on me no matter how much time and effort I put into this?
Is this big project worth the time? (Opinions)
Juumanji wrote (Sun May 21, 2017 12:03 pm): "do you see a potential in this? Or do the bookies (and exchanges) already have an edge on me no matter how much time and effort I put into this?"
Once you have collected data and started using it, it will most probably start to revert, i.e. the stats will move back towards the norm for your particular scenario and your edge will be lost. During the reversion, you will lose money.
To take your trivial example, what's significant about the 30th minute? Why isn't the 29th minute and 52nd second of the match significant?
What's significant about there having been no goal scored by this time? Why isn't the home team having scored 1 goal by this time significant?
I could go on but I hope that I've made my point.
In the past, I tried this with horse racing data, as have others. I have yet to see data that hasn't reverted. The issue with reversion is that it could happen at any time and without warning. I can tell you, I have witnessed some real disasters caused by reverting data.
The issue is that when you analyse any kind of data, you will eventually find an edge. That edge usually exists because of randomness and not for any specific reason.
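To illustrate the point about randomness, here is a minimal simulation, assuming even-money bets (decimal odds 2.0) with a true 50% win rate, so no system has any real edge. The number of systems, the number of bets and the seed are arbitrary choices for the sketch.

```python
import random


def simulate_roi(rng, n_bets, odds=2.0, win_prob=0.5):
    """ROI of n_bets one-unit bets that have zero true edge."""
    profit = 0.0
    for _ in range(n_bets):
        profit += (odds - 1.0) if rng.random() < win_prob else -1.0
    return profit / n_bets


rng = random.Random(42)          # fixed seed so the run is repeatable
n_systems, n_bets = 200, 100     # 200 candidate "systems", 100 bets each

rois = [simulate_roi(rng, n_bets) for _ in range(n_systems)]
best_roi = max(rois)
mean_roi = sum(rois) / n_systems
n_lucky = sum(1 for r in rois if r >= 0.10)

print(f"best ROI {best_roi:+.2f}, mean ROI {mean_roi:+.3f}")
print(f"systems showing ROI of 10% or more: {n_lucky} of {n_systems}")
```

Every one of these systems is pure noise, yet some of them will look profitable over 100 bets. Searching enough scenarios in real data produces the same kind of lucky winners.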
The only systems that I have found to work are those where a theory has first been identified, based upon solid and logical reasoning. THEN, and ONLY THEN, should you interrogate historic data to determine if it supports your logic. If you find it does, you are probably onto a winner. If not, assume that your logic is at fault.
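One way to "interrogate" the data once the theory is fixed is a one-sided binomial test against the break-even win rate implied by the odds. This is only a sketch of that idea, not a method described in this thread, and the 230 wins from 400 bets are made-up numbers.

```python
from math import comb


def binomial_p_value(wins, bets, p_null):
    """P(at least `wins` wins in `bets` bets) if the true win rate were p_null."""
    return sum(
        comb(bets, k) * p_null**k * (1 - p_null) ** (bets - k)
        for k in range(wins, bets + 1)
    )


# Made-up example: 230 winners from 400 bets at average decimal odds of 2.0,
# so the break-even win rate is 1 / 2.0 = 0.5.
p = binomial_p_value(230, 400, 0.5)
print(f"p-value: {p:.4f}")
```

A small p-value only means something if the hypothesis was written down before the data was examined. If the same test is run on the best of many mined patterns, it will flatter a pattern that is really just luck.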
I wish you well and hope that this helps.
Anna
Should you collect and analyse your own data? - Absolutely.
The more work you put into it, the more likely you'll find something that no one else has, and the bigger the barriers to competition - like any business. If you're generating stats that aren't publicly available, then great! You have your own proprietary data.
But does that guarantee you success? - No! It just gives you a good starting point, and might offer a better chance that you can find an inefficiency that isn't factored into the market already.
In terms of your example strategy, I think Anna is spot on. It's best to have an idea and then use the data to verify your principle, rather than datamine and look for any patterns - otherwise you're bound to find something at some point somewhere! Try to minimise your input parameters, as the more parameters you have, the more likely you're just overfitting to the data. Try to generalise your principle as much as possible. This will also mean you'll have more data points to test against for significance. Too many inputs mean too few selections, and statistical arbitrage (basically what you're doing) needs many selections to produce enough consistency in the returns.
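The trade-off between inputs and selections can be put into rough numbers. The sketch below assumes, purely for illustration, 10,000 matches in the database, that each extra filter keeps 30% of the remaining matches, and one-unit bets at even money with a 50% win rate.

```python
from math import sqrt


def selections_left(n_matches, keep_fraction, n_filters):
    """Matches remaining after applying n_filters independent filters."""
    return round(n_matches * keep_fraction**n_filters)


def roi_standard_error(n_bets, odds=2.0, win_prob=0.5):
    """Standard error of the average return per one-unit bet."""
    # One bet returns (odds - 1) with probability p and -1 otherwise,
    # so its standard deviation is odds * sqrt(p * (1 - p)).
    per_bet_sd = odds * sqrt(win_prob * (1 - win_prob))
    return per_bet_sd / sqrt(n_bets)


for n_filters in range(5):
    n = selections_left(10_000, 0.3, n_filters)
    se = roi_standard_error(n)
    print(f"{n_filters} filters: {n:>6} selections, ROI standard error {se:.3f}")
```

With four such filters only 81 selections remain, and the noise in the measured ROI is then far larger than any realistic edge.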
Have people had success doing what you're doing? - Yes! I'm at least 1 example anyway.
Is it easy? - Generally, no! If it were easy, everyone would have done it already.
I think one of your problems, especially if you're only just starting, is that you may not have enough data to make reliable deductions. It's very easy to find 'winning' systems by backfitting. That's not to say you shouldn't try, and it's always best to have your own data. The majority of football markets are driven by bots betting to set algorithms these days, so they will always have 'chinks in their armour' that can be exploited by those willing to seek them out.
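A common guard against backfitting is to pick the system on one part of the data and judge it only on data it has never seen. Here is a minimal sketch on simulated even-money bets with no true edge. The 50 rules, the 400 bets per rule and the seed are arbitrary assumptions.

```python
import random

rng = random.Random(7)       # fixed seed so the run is repeatable
n_rules, n_bets = 50, 400

# Each rule gets a chronological list of results: +1 for a win, -1 for a loss.
results = [
    [1 if rng.random() < 0.5 else -1 for _ in range(n_bets)]
    for _ in range(n_rules)
]


def roi(bets):
    """Average profit per one-unit bet at even money."""
    return sum(bets) / len(bets)


split = n_bets // 2          # first half for fitting, second half held out

in_sample = [roi(r[:split]) for r in results]
best = max(range(n_rules), key=lambda i: in_sample[i])
out_of_sample = roi(results[best][split:])

print(f"best rule in-sample ROI:   {in_sample[best]:+.3f}")
print(f"same rule out-of-sample:   {out_of_sample:+.3f}")
```

The in-sample figure for the chosen rule is flattered by the selection itself. The held-out figure is the honest estimate, which is why a short data history makes backfitted systems so hard to trust.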
I have spent 5 years looking at the time of the opening goal and its effect on accuracy, and the in-play analysis is jaw-dropping.
How is the project going?
As a heads-up to save time: don't pay any attention to shots-on-target data. Go the route of game pathway analysis instead, i.e. survival analysis for football, looking at the expectation of a team fighting back after it concedes.
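For anyone unfamiliar with survival analysis, here is one possible reading of the suggestion as a Kaplan-Meier estimate: how long a lead "survives" after a goal before the conceding team equalises. The post gives no detail, so the framing and the six sample durations below are assumptions made up for illustration.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each time where at least one event was observed.

    durations: minutes from the goal until the equaliser, or until the final
               whistle if no equaliser came (a censored observation).
    observed:  True if the equaliser happened, False if censored.
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Made-up sample: six leads, three of them cancelled out by an equaliser.
durations = [10, 25, 25, 40, 60, 60]
observed = [True, True, False, True, False, False]

curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(f"{t:>3} min after the goal: lead still intact with probability {s:.3f}")
```

Matches that end without an equaliser still count towards the number at risk until the final whistle. That handling of censored cases is the main thing survival analysis adds over a plain average of equaliser times.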