The primary difference between generative research and an evaluative experiment is that an experiment has a clear, testable hypothesis.
This article helps you learn how to write strong hypotheses. In GLIDR, this skill will be critical for running accurate and valuable Experiments.
Once you've figured out a good hypothesis, visit the Index of Methods page to figure out the best testing methodology.
Article excerpted from The Real Startup Book
Writing a Good Hypothesis
Lean startup practices turn project managers, business leaders, and designers into scientists who constantly validate their ideas by running an array of experiments. But experiments can get out of hand and turn perfectly sane people into mad scientists. A sure way to keep our sanity is to start with a strong hypothesis to give the experiment structure.
A strong hypothesis will tell you what you are testing and what you expect to get out of the test. By stating your expectations, you set the goal that decides whether the experiment counts as a success or a failure. This helps you decide when the experiment needs to be scrapped and when the idea is ready to be taken to market.
Key elements of writing a good hypothesis:
The change that you are making
The aspect that will change
The success or fail metric
How long you are going to run the test
A hypothesis will end up looking like this:
This new feature (the change) will cause a 10 percent increase (the metric) of new users visiting the homepage (the impact) in 3 months (the timeframe).
Let’s break down each aspect:
The change: This is the aspect that you are going to change, launch, or create that is going to affect your overall business or product. It can be as simple as changing the color of a button or as big as launching a new marketing campaign. Make sure that only one aspect is changed at a time; otherwise there is no way to tell which aspect contributed to the effect.
The impact: This is the expected result of the experiment. If you change x, then you expect y to happen.
The metric: This is a measurement that needs to be hit or surpassed. It can be a fail metric, where the project must pivot to a completely new direction if the experiment does not meet the minimum goal. It can also be a success metric, where the experiment is deemed a success if it hits the goal. Which one you choose depends on whether you want a baseline for when to scrap a project or for when to launch it.
The timeframe: This is the length of time it takes to run the test. If the timebox is too short, the amount of data might be too small, or there might not have been enough time for effects to take place. But if the timebox is too long, you are wasting valuable time collecting unnecessary data.
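The four elements above can also be written down as a small record, which makes it harder to leave one out. The following is only a minimal sketch: the class name `Hypothesis` and its field names are illustrative assumptions, not part of GLIDR or The Real Startup Book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """One testable hypothesis, split into its four elements (hypothetical names)."""
    change: str            # the single thing being changed, launched, or created
    metric_percent: float  # the success or fail threshold, as a percent increase
    impact: str            # the aspect that is expected to change
    timeframe: str         # how long the test will run

    def sentence(self) -> str:
        """Render the hypothesis using the sentence pattern from the article."""
        return (
            f"{self.change} will cause a {self.metric_percent:g} percent "
            f"increase of {self.impact} in {self.timeframe}."
        )


homepage = Hypothesis(
    change="This new feature",
    metric_percent=10,
    impact="new users visiting the homepage",
    timeframe="3 months",
)
print(homepage.sentence())
```

Because every field is required, a draft with no metric or no timeframe fails at construction time instead of slipping into an experiment.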
Let’s go through an example scenario of writing a hypothesis:
Say that you are a product manager at a startup that creates a mobile app to help waiters and waitresses keep track of their tips. You have noticed that users who document their tips four times a week have a higher retention rate. You want to see if you can increase the number of times current users use the app within a week.
What are you going to change within the app?
How about adding a notification system so the user can set reminders to ping them at the end of a shift?
What do you want the outcome to be?
You want more users to open the app four or more times in a week.
What is the metric of failure or success?
At this time you have 50,000 monthly users, and 10,000 of them use the app four or more times a week. You want to increase that number from 10,000 to 15,000. That moves the share of monthly users from 20 percent to 30 percent, a 10 percentage point increase (and a 50 percent increase in the number of such users).
How long are you going to run the test?
This always depends on a number of variables within the company, but let’s say that you are at a midsize company that has a little more time to get the correct data. So let’s say three months.
The end hypothesis would be:
If I add a notification feature that lets waiters and waitresses set reminders to log their tips, then the share of monthly users opening the app four or more times a week will rise by 10 percentage points (from 20 percent to 30 percent) over the next three months.
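The numbers in this example can be read in two ways, so it may help to work them out. The sketch below uses only the figures from the scenario (50,000 monthly users, 10,000 frequent users today, a target of 15,000). The function and variable names are assumptions made for illustration.

```python
def percent_of(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    return 100.0 * part / total


monthly_users = 50_000
baseline_frequent = 10_000   # users opening the app four or more times a week today
target_frequent = 15_000     # the goal from the scenario

baseline_share = percent_of(baseline_frequent, monthly_users)   # share of all users today
target_share = percent_of(target_frequent, monthly_users)       # share of all users at the goal

# Change in the share of monthly users, in percentage points.
point_increase = target_share - baseline_share

# Change in the count of frequent users, relative to today's count.
relative_increase = percent_of(target_frequent - baseline_frequent, baseline_frequent)


def met_success_metric(observed_frequent: int) -> bool:
    """True if the observed number of frequent users reaches the target."""
    return observed_frequent >= target_frequent


print(baseline_share, target_share, point_increase, relative_increase)
```

The "10 percent" in this example matches the 10 point rise in the share of monthly users (20 percent to 30 percent). Measured against today's 10,000 frequent users, the same goal is a 50 percent increase, so it is worth stating in the hypothesis which of the two you mean.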
Below is a worksheet that will test if you can figure out the strongest hypothesis for a given scenario.
Scenario 1: You work for a company that rents out toddlers’ clothes. It is a monthly subscription where families get a box of five pieces of clothing, and when the toddler grows out of them, they return the clothes for a new box. The data shows that there might be a correlation between members frequently sending back items and higher customer retention rates. Your goal is to have members return more boxes. You have decided that you can do this by adding pieces that are seasonal, holiday themed, or super trendy so that the family will need to keep updating the clothes.
A. By adding one piece of special occasion clothing, you will see a 10 percent rise in returned boxes in three months.
B. If you include one special occasion outfit, a new designer piece and a seasonal accessory, then you will see a 15 percent increase in returned boxes in the next 12 weeks.
C. When you add three seasonal pieces, families will learn to request more items, and you will see growth in the next 2 months.
D. By including one trending designer piece, you will see a 15 percent increase in requests for those designers once the experiment is completed.
Scenario 2: You already made your millions with the Uber for parrots so you decided to invest your money into saving the manatees. You designed a tracking app that shows boaters where herds of manatees are sleeping so they don’t run the herds over. You are having a hard time getting the boaters to download the app, so you decide to start advertising. You want to conduct a test to see if a promotion will increase the app’s downloads.
A. If you partner with piers to give boat owners 10 percent off a month of docking when they download the app, then you will see a 10 percent increase in downloads over the next three years.
B. If you give out 10 percent coupons for boat rentals for downloading the app, offer 15 percent off at tack shops, and advertise around piers, then you will see a 15 percent increase in new downloads over the next three months.
C. By pairing up with 10 boat rentals to give a coupon for 10 percent off a boat rental for downloading the app, you will see a 5 percent increase in downloads over a 6 month period.
D. When you run a special where anyone who downloads the app gets a one-of-a-kind lure at Ted’s tack shop (which has 15 stores in Florida), you will see a 10 percent decrease in manatee deaths over the next 5 months.
Scenario 3: Your Labrador is obsessed with a tennis ball and you are tired of throwing the slobbery thing. It inspired you to start a drone company that drops tennis balls and takes funny pictures of the dogs. Your customer support team has received complaints that it is hard to understand how to download the pictures from the iPhone app. You want to test moving the photos section to various parts of the app.
A. If you add a photos section to the navigation bar, then you will see a 5 percent increase of new users over a 4 month period.
B. If you advertise about the photo feature in your app, you will have more users and fewer complaints within the next 10 weeks.
C. If you add three pages to the onboarding process that explain the pictures, then you will get a 25 percent increase of dog pictures.
D. By moving the download photos to be part of the home screen, you will receive 50 percent fewer complaints about the photos section in the next three months.
Scenario 4: You are so sick of wearing the same outfits that you developed AI software to pick out your clothes every morning. A venture capitalist saw your tweets about it and gave you a million dollars to start the company. You need the AI to address weather conditions when choosing the clothes. You want to run an experiment to test a method for collecting data for when it is 80 degrees and sunny.
A. If you poll people in popular cities on sunny days, then you will be able to add 5 percent more data points.
B. If you see what is in clothing stores on sunny days, you will be able to add 10 percent more data points to the algorithm in a month.
C. If you send out a survey to ask people what they are wearing when it is 80 degrees and sunny, then you will get a 75 percent answer rate in a week.
D. By matching the dates of Instagram selfies to days that are 80 degrees and sunny and collecting those data points, your AI can identify 75 percent of the clothing in 3 months.
Scenario 5: Men’s socks are a great way to jazz up an outfit, so you decided to start a men’s sock e-commerce store. Your customers are not completing the checkout process, and usability tests show that some users question the site’s security. You want to add a small adjustment to the payments page to see if more users complete the checkout process.
A. If you add a password strength indicator then more people will create passwords in the next 2 months.
B. If you add a lock icon next to the credit card information, the completion of the checkout process will increase by 15 percent in 3 months.
C. If you make the site prettier, the completion of the checkout process will increase by 25 percent in 6 months.
D. If you add a review page before the confirmation page, then 20 percent of customers will be able to complete the checkout flow in 10 minutes.
Answers:
1 - A
2 - C
3 - D
4 - D
5 - B
Learn from your mistakes. Look at the questions you got wrong and see which key element is either missing or vague.
There are too many variables. If you are testing multiple things then you cannot pinpoint which variable caused the results.
There is not an achievable metric attached to the hypothesis to know the point at which the experiment succeeds or fails.
The success or fail metric is not directly linked to the change being tested. If the result could have been caused by a number of other variables, then you don’t know whether the experiment was the reason for the change.
The timebox is too long or too short. Some experiments are going to take longer, but make sure that the timebox is reasonable for the growth rate of your company. You don’t want to have an experiment that takes years or extends past your runway.
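The four mistakes above can be turned into a rough checklist that you run over a draft before starting an experiment. This is only a sketch under stated assumptions: the `DraftHypothesis` fields, the `find_problems` function, and the 1 to 26 week timebox limits are hypothetical choices for illustration, and a reasonable timebox depends on your own growth rate and runway.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DraftHypothesis:
    """A draft hypothesis as filled in by the team (hypothetical structure)."""
    changes: List[str]                  # everything being changed in the test
    metric_percent: Optional[float]     # success or fail threshold, None if missing
    metric_follows_from_change: bool    # team's judgment: is the metric tied to the change?
    timebox_weeks: Optional[int]        # length of the test, None if missing


def find_problems(draft: DraftHypothesis,
                  min_weeks: int = 1,      # assumed lower limit
                  max_weeks: int = 26      # assumed upper limit
                  ) -> List[str]:
    """Return the common mistakes that apply to this draft."""
    problems = []
    if len(draft.changes) > 1:
        problems.append("too many variables")
    if draft.metric_percent is None:
        problems.append("no metric")
    if not draft.metric_follows_from_change:
        problems.append("metric not linked to the change")
    if draft.timebox_weeks is None or not (min_weeks <= draft.timebox_weeks <= max_weeks):
        problems.append("timebox too long or too short")
    return problems


# Scenario 1, option with three additions at once, 15 percent, 12 weeks.
three_additions = DraftHypothesis(
    ["special occasion outfit", "designer piece", "seasonal accessory"], 15, True, 12)

# Scenario 2, pier discount, 10 percent, three years (about 156 weeks).
three_years = DraftHypothesis(["pier docking discount"], 10, True, 156)

print(find_problems(three_additions))
print(find_problems(three_years))
```

The check for whether the metric follows from the change stays a human judgment in this sketch; the code only records the answer and reports it alongside the other checks.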
Other Resources (external links)
Is it simple and unambiguous?
I mean … is it simple...?
...and is it unambiguous?
Is it measurable?
Does it describe a relationship between two things?
Is the cause and effect relationship clear?
Is it achievable?
Is there any evidence that would convince us the hypothesis is invalid?