Better than A/B Testing?



You can't test more than 1 thing at once, because you won't know which variable caused the difference in results.

That is why we run controlled tests, changing 1 thing at a time and making adjustments.

Also, showing 3 different landing pages at the same time is pointless.


Iteration 1:
1/1 = 100% 1/1 = 100% 1/1 = 100%

Iteration 2:

1/2 = 50% 1/2 = 50% 1/2 = 50%

Iteration 4:

1/4 = 25% 1/4 = 25% 2/4 = 50%

We choose subject #3 to show in 100% of the cases, until its results prove unsatisfactory.

Here is where he fails. If for some reason those results are skewed and subject #3 is not the best, it will take 5 more iterations until that is proven.

That is why you do it like this:

10,000 guaranteed impressions. 9k impressions on your favorite copy, 1k for the challenger.

If the challenger wins, it becomes your favorite copy. Repeat.
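
For what it's worth, the 9k/1k routine above can be sketched in a few lines of Python. This is only an illustration: the function names and the 10% default are my own assumptions, and the winner is picked on raw conversion rate with no significance check.

```python
def split_impressions(total, challenger_share=0.1):
    """Split a guaranteed impression budget between favorite and challenger."""
    challenger = int(round(total * challenger_share))
    return total - challenger, challenger


def next_favorite(fav_conversions, fav_impressions,
                  chal_conversions, chal_impressions):
    """Return which copy should be the favorite in the next round.

    Compares raw conversion rates only (assumption: no significance test).
    """
    fav_rate = fav_conversions / fav_impressions
    chal_rate = chal_conversions / chal_impressions
    return "challenger" if chal_rate > fav_rate else "favorite"


fav_imps, chal_imps = split_impressions(10_000)
print(fav_imps, chal_imps)
```

The conversion counts you feed into next_favorite come from your own tracking; nothing here generates them.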

You can't just automate that. The test results that lead to displaying different variations are gathered from statistically insignificant numbers. You can potentially lose a lot more because of statistical errors, with various factors skewing your decision-making program.

I hope that makes sense. Thanks for the link though.
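
To put a number on "statistically insignificant", here is a rough sketch of a two-proportion z-test using only the standard library. The function name is my own, and the normal approximation it relies on is itself shaky at counts as small as 4, so treat the output as a sanity check rather than a verdict.

```python
import math


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates.

    Uses the pooled two-proportion z-test (normal approximation).
    """
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # identical all-or-nothing results: no evidence either way
    z = (conv_b / n_b - conv_a / n_a) / se
    # P(|Z| >= |z|) for a standard normal variable
    return math.erfc(abs(z) / math.sqrt(2))


# The iteration 4 numbers from above: 1/4 versus 2/4
print(two_proportion_p_value(1, 4, 2, 4))
```

With 1/4 against 2/4 the p-value comes out far above the usual 0.05 cutoff, so the "lead" of subject #3 at that point is indistinguishable from noise.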
 
This method of testing is essentially flawed. Why? You still need the same number of tests to determine which result is best, and the 10% method doesn't evenly account for dayparting conversion rates. Pick your "confidence" number (the number of iterations required before you decide a choice is bad) and run it for that number. Once you hit that number, remove the bad performers. It's not that hard at all. If anything, the 10% method takes longer to arrive at the same conclusion.
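
A minimal sketch of that "confidence number" routine, assuming each variation's results are kept as a (conversions, trials) pair. The function name, the variation labels and the choice to keep a single winner are illustrative assumptions.

```python
def prune_variants(stats, min_trials, keep=1):
    """Drop bad performers once every variation has reached min_trials.

    stats maps a variation name to a (conversions, trials) tuple.
    Returns the names to keep showing.
    """
    if any(trials < min_trials for _, trials in stats.values()):
        return list(stats)  # not enough data yet: keep everything running

    def rate(name):
        conversions, trials = stats[name]
        return conversions / trials

    ranked = sorted(stats, key=rate, reverse=True)
    return ranked[:keep]


# Hypothetical counts, for illustration only
results = {"A": (10, 1000), "B": (30, 1000), "C": (20, 1000)}
print(prune_variants(results, min_trials=1000))
```

Because every variation gets the same number of trials over the same period, time-of-day effects hit them all equally, which is the dayparting point made above.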
 
You can't test more than 1 thing at once, because you won't know which variable caused the difference in results.

That is why we run controlled tests, changing 1 thing at a time and making adjustments.

Also, showing 3 different landing pages at the same time is pointless.

Maybe you don't have enough traffic, but it's not pointless to deliver more than 3 LPs if you have an average of at least 1,000 daily users for every lander.
And sure, you can test more than 1 element of a page at a time. This is the reason full factorial testing was created.
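
For anyone unfamiliar with the idea: a full factorial test shows every combination of every element, so the effect of each element can be separated afterwards. A small sketch of how the combinations are enumerated (the element names and values below are made up for illustration):

```python
from itertools import product


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo))
            for combo in product(*(factors[name] for name in names))]


# Hypothetical page elements, two versions of each
page_elements = {
    "headline": ["H1", "H2"],
    "button": ["red", "green"],
    "image": ["photo", "drawing"],
}
landers = full_factorial(page_elements)
print(len(landers))
```

Three elements with two versions each already means 2 x 2 x 2 = 8 landers, so at 1,000 daily users per lander you would need about 8,000 users a day. That is why the traffic requirement matters.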
 
....<snip>

Iteration 4:

1/4 = 25% 1/4 = 25% 2/4 = 50%

We choose subject #3 to show in 100% of the cases, until its results prove unsatisfactory.

Here is where he fails. If for some reason those results are skewed and subject #3 is not the best, it will take 5 more iterations until that is proven.

....<snip>

Untrue.

Everyone seems to gloss over the very first few lines of his code. I quote:
    def choose():
        if math.random() < 0.1:
            # exploration!
            # choose a random lever 10% of the time.
        else:

This little bit keeps the algorithm from locking into a premature optimization.
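
Since the quote above is pseudocode (Python's math module has no random(); the standard library call is random.random()), here is a runnable sketch of the same idea, usually called epsilon-greedy. The class name, the bookkeeping and the optimistic starting value of 1.0 for untried levers are my assumptions, not necessarily what his code does.

```python
import random


class EpsilonGreedy:
    def __init__(self, n_levers, epsilon=0.1, rng=None):
        self.epsilon = epsilon
        self.rng = rng if rng is not None else random.Random()
        self.trials = [0] * n_levers
        self.rewards = [0.0] * n_levers

    def expectation(self, lever):
        """Average reward per trial; untried levers start at 1.0 (assumption)."""
        if self.trials[lever] == 0:
            return 1.0
        return self.rewards[lever] / self.trials[lever]

    def choose(self):
        if self.rng.random() < self.epsilon:
            # exploration: pick a random lever epsilon of the time
            return self.rng.randrange(len(self.trials))
        # exploitation: pick the lever with the best average so far
        return max(range(len(self.trials)), key=self.expectation)

    def record(self, lever, reward):
        """Store the outcome of one display (1 for a conversion, 0 otherwise)."""
        self.trials[lever] += 1
        self.rewards[lever] += reward


bandit = EpsilonGreedy(n_levers=3)
lever = bandit.choose()
bandit.record(lever, reward=1)
```

Even after subject #3 takes the lead, the other two keep getting shown whenever the random branch fires, so a skewed early result can still be corrected.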
