Sample size question

Pinocchio
Posts: 4
Joined: Wed Oct 14, 2009 11:21 pm

Hi,

I'm a pro poker player who is having a go at Bet Angel and trading on football matches to see if I can make any cash.

I have sort of developed a betting system and I started testing it yesterday. From what I can work out it is +EV (but I'm probably wrong) and should get an ROI of 10-15%. So far I'm hitting 33% ROI, but I'm probably running hot.

What I'm wondering is: what sample size do I need to get an accurate idea of the system's ROI? I was thinking around 100. Any ideas?

Ty
MarkWraith
Posts: 8
Joined: Sat Aug 15, 2009 8:08 pm

Funnily enough I'm a poker player as well.

As posed, your question is unanswerable, because the sample size needed to reasonably determine population EV depends on the standard deviation of the sample.

That is, you have observed some data with an EV of +33%. Imagine now that the s.d. of this data was zero - every trial, you get a return of 33%. That would be akin to having a system that has a guaranteed 100% success rate of picking winners priced at 1.33.

Conversely, your system might give you a method of picking winners priced at 1000 correctly one time in 666.

The first system needs a very small sample to determine its EV - zero trials. The second system needs a very large sample (because you could easily never have picked a winner after 1000 races, even if your system wins in the 'long run').
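To put a rough number on the long-shot case, here is a minimal Python sketch. It assumes every bet is independent and has the same win chance, which real bets won't exactly satisfy; the function name is just made up for illustration.

```python
def prob_no_winner(win_prob: float, trials: int) -> float:
    """Chance of picking zero winners in `trials` independent bets.

    Assumes each bet wins with the same probability `win_prob`.
    """
    return (1.0 - win_prob) ** trials


# The long-shot system from the post: a winner one time in 666.
long_shot = prob_no_winner(1 / 666, 1000)
print(f"P(no winner in 1000 bets) = {long_shot:.3f}")

# The zero-variance system: every pick wins, so a blank run is impossible.
sure_thing = prob_no_winner(1.0, 1)
print(f"P(no winner for the sure thing) = {sure_thing:.3f}")
```

Under those assumptions the long-shot system goes 1000 bets without a single winner somewhat more than one time in five, which is why a small sample says so little about it.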

If you provide the average odds backed at, and the EV, we can then estimate a confidence interval. There are a few other assumptions needed but can't be bothered to explain them right now.

Suppose our average odds backed at are p, and we always use the same stake, say '1 unit' (it's just a scaling factor, if you bet £100 then £100 is '1 unit').

If we win, we gain p-1 units. If we lose, we lose 1 unit. If we win a fraction q of the time (q between 0 and 1), then our EV in units is:

q(p-1) - (1-q)


You have told us that from your sample, this comes out to be 0.33

q(p-1) - (1-q) = 0.33


We can now calculate q. For example, if p were 4, then we get 3q - (1-q) = 0.33, so 4q - 1 = 0.33, 4q = 1.33 and q = 0.3325 in this example.
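The same rearrangement works for any price: q(p-1) - (1-q) simplifies to qp - 1, so q = (EV + 1) / p. A small Python sketch of that, assuming decimal odds and level 1-unit stakes as above (the function names are my own, not from any library):

```python
def ev_per_unit(odds: float, win_rate: float) -> float:
    """EV of a 1-unit back bet at decimal odds `odds`, winning `win_rate` of the time."""
    return win_rate * (odds - 1.0) - (1.0 - win_rate)


def implied_win_rate(odds: float, ev: float) -> float:
    """Win rate q that produces the given EV at the given decimal odds.

    Comes from solving q*(odds - 1) - (1 - q) = ev, i.e. q*odds - 1 = ev.
    """
    return (ev + 1.0) / odds


q = implied_win_rate(4.0, 0.33)
print(f"q = {q:.4f}")                             # win rate for the p = 4 example
print(f"check EV = {ev_per_unit(4.0, q):.4f}")    # plugs q back into the EV formula
```

Swap in your own average odds and observed EV to get the q for your system.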

Now with our p and q we estimate the variance of one trial. Note that this is an estimate based on the sample, but sample s.d. converges to population s.d. extremely quickly in relation to the convergence of sample mean to population mean, so it works as a ballpark figure.

The formula for variance is:

Var(X) = E[(X - E(X))^2]

E(X) = 0.33 here, our expectation.

In our case, this works out to be:

Var(X) = q(p-1-0.33)^2 + (1-q)(-1-0.33)^2

Using the example I gave earlier where p = 4 and thus q = 0.3325:

Var(X) = 0.3325(2.67)^2 + 0.6675(-1.33)^2

= 3.55 units squared.

The variance of n trials is 3.55n units squared, and so the s.d. of n trials is about 1.88sqrt(n). Again, this is for my example with p=4, yours will be different.
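For anyone who wants to plug in their own numbers, here is a rough Python sketch of the variance calculation above. It assumes level 1-unit stakes and a single average price for every bet, which is a simplification; the function names are illustrative only.

```python
import math


def single_bet_variance(odds: float, ev: float) -> float:
    """Variance of the profit on one 1-unit bet, in units squared.

    Assumes every bet is struck at the same decimal odds and that the
    win rate is the one implied by the observed EV.
    """
    win_rate = (ev + 1.0) / odds
    win_dev = (odds - 1.0) - ev    # profit on a win, minus the mean
    lose_dev = -1.0 - ev           # profit on a loss, minus the mean
    return win_rate * win_dev ** 2 + (1.0 - win_rate) * lose_dev ** 2


variance = single_bet_variance(4.0, 0.33)
sd_one_bet = math.sqrt(variance)
print(f"variance per bet = {variance:.3f} units squared")
print(f"s.d. per bet     = {sd_one_bet:.3f} units")
print(f"s.d. of 100 bets = {sd_one_bet * math.sqrt(100):.2f} units")
```

Run unrounded, the p = 4 example should agree with the figures above to within rounding.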

You want an 'accurate' idea of the system's ROI. Suppose we say that 'accurate' is +/- 1%. The standard error of the average return per bet after n trials is 1.883/sqrt(n), and we want this to be less than 0.01.

1.883/sqrt(n) < 0.01
188.3/sqrt(n) < 1
188.3 < sqrt(n)


n > 35456. This is the point at which 1 standard error is less than 0.01 EV. About 68% of samples fall within 1 s.d. either side of the mean.
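The same inequality can be solved for any precision. A minimal Python sketch, assuming the normal approximation holds and using a hypothetical helper name; the z parameter is my addition so you can ask for more than 1 s.d. of confidence (for example 1.96 for roughly 95% two-sided):

```python
import math


def trials_needed(sd_per_bet: float, margin: float, z: float = 1.0) -> int:
    """Smallest whole n with z * sd_per_bet / sqrt(n) no bigger than `margin`.

    z = 1.0 reproduces the 1 s.d. (about 68%) band used in the post.
    """
    return math.ceil((z * sd_per_bet / margin) ** 2)


n_one_sd = trials_needed(1.883, 0.01)
n_two_sided_95 = trials_needed(1.883, 0.01, z=1.96)
print(f"bets for +/- 1% at 1 s.d.:    {n_one_sd}")
print(f"bets for +/- 1% at 1.96 s.d.: {n_two_sided_95}")
```

Because n grows with the square of z and with the inverse square of the margin, halving the margin quadruples the number of bets needed.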

As a poker player it probably shouldn't surprise you that the number is so large, in fact, a sample of 35k hands of a given limit of poker would not be much of a guide to whether you were a winning player at all. [The s.d. in poker is much higher, as a proportion of EV, than it is in sports-betting].

As a second example, if you just wanted to know if you were likely to be a winner, you could take accurate to be +/- 33% since this is your observed win-rate.


1.883/sqrt(n) < 0.33. Treat 0.33 as 1/3, then multiply both sides by 3 and by sqrt(n).
5.649 < sqrt(n)
n > 31.9, so 32 trials is enough.


So after only 32 trials observing 0.33 EV, if you are backing at an average price of 4's, you would be about 68% sure that your strategy is a winner.
[Edit: actually 84% sure, we only care about results that are 1 s.d. below observed, not 1 s.d. above].
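The 84% figure is the one-sided area of a normal curve up to 1 s.d. above the mean. As a rough check, here is a Python sketch that assumes the average return is approximately normal and treats the observed EV as the centre of that curve; that is a heuristic, not a rigorous statement about the true EV, and the function names are illustrative.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_true_ev_positive(observed_ev: float, sd_per_bet: float, trials: int) -> float:
    """Rough one-sided confidence that the true EV is above zero.

    Assumes a normal approximation centred on the observed EV, with
    standard error sd_per_bet / sqrt(trials).
    """
    standard_error = sd_per_bet / math.sqrt(trials)
    return normal_cdf(observed_ev / standard_error)


confidence = prob_true_ev_positive(0.33, 1.883, 32)
print(f"confidence the system is a winner = {confidence:.1%}")
```

With only 32 bets the normal approximation is itself fairly crude, so treat the result as a ballpark.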
Last edited by MarkWraith on Thu Oct 15, 2009 1:39 pm, edited 1 time in total.
Pinocchio
Posts: 4
Joined: Wed Oct 14, 2009 11:21 pm

thank you for the detailed response
Quinny
Posts: 83
Joined: Sat May 02, 2009 4:57 pm

Mark,

Very good of you to take the time and effort with such a detailed reply.

I learned quite a bit from that.
Pinocchio
Posts: 4
Joined: Wed Oct 14, 2009 11:21 pm

I'm hitting a 14.56% ROI over a 25 game sample, looking good so far...