Better than A/B Testing?

Medici · May 30, 2012

20 lines of code that will beat A/B testing every time - Steve Hanov's Programming Blog

The math seems to make sense. Thoughts?

momchil9 · May 30, 2012

You can't test more than one thing at once, because you won't know which variable caused the difference in results.

That is why we run controlled tests, changing one thing at a time and making adjustments.

Also, showing 3 different landing pages at the same time is retarded.

Iteration 1:

1/1 = 100% 1/1 = 100% 1/1 = 100%

Iteration 2:

1/2 = 50% 1/2 = 50% 1/2 = 50%

Iteration 4:

1/4 = 25% 1/4 = 25% 2/4 = 50%

We choose subject #3 to show in 100% of cases until its results prove unsatisfactory.

Here is where he fails. If for some reason those results are skewed and subject #3 is not the best, it will take 5 more iterations to prove that.

That is why you do it like this:

10,000 guaranteed impressions: 9k impressions on your favorite copy, 1k for the challenger.

If the challenger wins, it becomes your favorite copy. Repeat.

You can't just automate that. The test results that decide which variation gets displayed are gathered from statistically insignificant numbers. You can potentially lose a lot more because of statistical errors, with various factors skewing your decision-making program.

I hope that makes sense. Thanks for the link though.
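For anyone who wants to check whether a challenger really "wins" in the 9k/1k split described above, here is a minimal sketch using a one-sided two-proportion z-test. The function name, the 0.05 threshold, and the conversion counts are all assumptions for illustration; nothing here comes from the linked article.

```python
from math import sqrt
from statistics import NormalDist


def challenger_beats_champion(champ_conv, champ_n, chall_conv, chall_n, alpha=0.05):
    """One-sided two-proportion z-test: is the challenger's rate higher?"""
    p_champ = champ_conv / champ_n
    p_chall = chall_conv / chall_n
    # Pooled rate under the null hypothesis that both copies convert equally.
    pooled = (champ_conv + chall_conv) / (champ_n + chall_n)
    se = sqrt(pooled * (1 - pooled) * (1 / champ_n + 1 / chall_n))
    if se == 0:
        # No conversions (or all conversions) on both sides: nothing to compare.
        return False, 1.0
    z = (p_chall - p_champ) / se
    p_value = 1 - NormalDist().cdf(z)
    return p_value < alpha, p_value


# Hypothetical counts: 9,000 impressions on the favorite, 1,000 on the challenger.
promote, p_value = challenger_beats_champion(270, 9000, 40, 1000)
print(promote, round(p_value, 4))
```

With only 1,000 impressions on the challenger, a 3% vs 4% difference sits close to the threshold, which is the "statistically insignificant numbers" worry in practice.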

netbert · May 30, 2012

Very detailed discussion about this article on HackerNews: 20 lines of code that beat A/B testing every time | Hacker News

tspesh · May 30, 2012

momchil9 said:
You can't test more than one thing at once, because you won't know which variable caused the difference in results.

Not entirely true.

Multivariate testing - Wikipedia, the free encyclopedia

momchil9 · May 30, 2012

tspesh said:
Not entirely true.

Multivariate testing - Wikipedia, the free encyclopedia

Thank you for the link. I will definitely look into it later, but I suppose it is a really time-consuming and complicated thing to do.

Insomniac · May 30, 2012

This method of testing is essentially flawed. Why? You still need the same number of tests to determine which result is best, and the 10% method doesn't evenly account for dayparting conversion rates. Pick your "confidence" number (the number of iterations required before you decide a choice is bad) and run the test for that number. Once you hit it, remove the bad performers. It's not that hard at all. If anything, the 10% method takes longer to arrive at the same conclusion.
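One way to pick that "confidence" number up front is the standard sample-size approximation for comparing two conversion rates. This is only a sketch: the function name, the 5% significance level, the 80% power, and the 3% vs 4% rates are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variant(baseline, target, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant for a two-sided two-proportion test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Sum of the binomial variances of the two rates being compared.
    variance = baseline * (1 - baseline) + target * (1 - target)
    return ceil((z_alpha + z_beta) ** 2 * variance / (target - baseline) ** 2)


# Hypothetical rates: detect a lift from 3% to 4% conversion.
print(visitors_per_variant(0.03, 0.04))
```

The required number grows quickly as the expected lift shrinks, so small differences between landers need a lot of traffic whichever method you use.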

tomaszjot · May 30, 2012

Something worth reading:

Landing Page Optimization: The Definitive Guide to Testing and Tuning for Conversions: Amazon.co.uk: Tim Ash: Books

faceblogger · May 30, 2012

tomaszjot said:
Something worth reading:

Landing Page Optimization: The Definitive Guide to Testing and Tuning for Conversions: Amazon.co.uk: Tim Ash: Books

Not sure if tomaszjot

Flash4Ever · May 30, 2012

momchil9 said:
You can't test more than one thing at once, because you won't know which variable caused the difference in results.

That is why we run controlled tests, changing one thing at a time and making adjustments.

Also, showing 3 different landing pages at the same time is retarded.

Maybe you don't have enough traffic, but it's not retarded to deliver more than 3 LPs if you average at least 1,000 daily users per lander.
And sure, you can test more than one element of a page at a time; this is the reason full factorial testing was created.
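To give a rough idea of what a full factorial test involves, the sketch below enumerates every combination of page elements. The element names and options are made up for illustration.

```python
from itertools import product

# Hypothetical page elements and their options, for illustration only.
elements = {
    "headline": ["A", "B"],
    "image": ["photo", "illustration"],
    "button": ["green", "orange", "red"],
}

# One variant per combination of options across all elements.
variants = [dict(zip(elements, combo)) for combo in product(*elements.values())]
print(len(variants))  # 2 * 2 * 3 = 12
```

At roughly 1,000 daily users per lander, 12 variants would need about 12,000 daily users, which is why this approach depends on having the traffic.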

Flash4Ever · May 30, 2012

faceblogger said:
Not sure if tomaszjot

Good reading indeed, especially for the well depicted case studies.

emp · May 30, 2012

momchil9 said:
....<snip>

Iteration 4:

1/4 = 25% 1/4 = 25% 2/4 = 50%

We choose subject #3 to show in 100% of cases until its results prove unsatisfactory.

Here is where he fails. If for some reason those results are skewed and subject #3 is not the best, it will take 5 more iterations to prove that.

....<snip>

untrue.

Everyone seems to gloss over the very first few lines of his code. I quote:

def choose():
    if math.random() < 0.1:
        # exploration!
        # choose a random lever 10% of the time.
    else:

This little bit keeps the algorithm from locking onto a premature winner: 10% of the time it still shows a random variation, so a lever that looked bad early on keeps collecting data.
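Here is a runnable sketch of that exploration step, for anyone who wants to try it. It is not the article's code: the class name, the in-memory counters, the rule that untried levers count as 1.0, and the simulated conversion rates are all assumptions. It uses `random.random()`, since Python's `math` module has no `random()` function.

```python
import random


class EpsilonGreedy:
    def __init__(self, n_arms, epsilon=0.1, rng=None):
        self.epsilon = epsilon
        self.rng = rng or random.Random()
        self.trials = [0] * n_arms   # times each lever was shown
        self.rewards = [0] * n_arms  # conversions credited to each lever

    def choose(self):
        if self.rng.random() < self.epsilon:
            # Exploration: pick a random lever (10% of the time by default).
            arm = self.rng.randrange(len(self.trials))
        else:
            # Exploitation: pick the lever with the best observed rate.
            # Assumption: untried levers count as 1.0 so each is shown at least once.
            rates = [r / t if t else 1.0 for r, t in zip(self.rewards, self.trials)]
            arm = rates.index(max(rates))
        self.trials[arm] += 1
        return arm

    def reward(self, arm):
        self.rewards[arm] += 1


# Simulation with made-up true conversion rates; lever 2 is actually the best.
true_rates = [0.02, 0.03, 0.05]
rng = random.Random(42)
bandit = EpsilonGreedy(len(true_rates), epsilon=0.1, rng=rng)
for _ in range(20000):
    arm = bandit.choose()
    if rng.random() < true_rates[arm]:
        bandit.reward(arm)
print(bandit.trials)
```

Because every lever keeps getting a share of the random 10%, skewed early results get corrected as more impressions come in, instead of one lever being shown 100% of the time.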

::emp::

netbert · Jun 1, 2012

Another rebuttal: Why multi-armed bandit algorithm is not "better" than A/B testing | Hacker News
