Running effective prototype tests takes more than good intentions. Without clear objectives and a repeatable process, you'll collect plenty of data and still walk away with nothing your team can act on.
If you're still deciding which prototype fidelity to build or which testing methodology fits your sprint, start with our Prototype Testing Guide, which covers fidelity levels, moderated versus unmoderated testing, and the metrics that prove impact.
This article picks up from there with a five-step framework you can run inside a single sprint, from writing your first hypothesis to prioritising what gets fixed.
Start with specific, testable questions rather than vague goals. Instead of "Let's see if users like the new design," ask "Can users complete a purchase in under 90 seconds using the one-page checkout?" This specificity determines everything else: which prototype fidelity to use, which testing methodology to employ, and which metrics to track.
Frame objectives as hypotheses with measurable success criteria. For example: "We believe that consolidating the checkout process to one page will enable users to complete purchases 20% faster than the current three-page flow." This hypothesis defines what you're testing (checkout consolidation), what you're measuring (completion time), and what would constitute success (20% improvement).
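To make the success criterion concrete, here is a minimal Python sketch of how you might check that hypothesis once you have completion times from both flows. The function name `meets_target`, the sample timings, and the reading of "20% faster" as "20% less time on task" are all assumptions made for illustration, and with only a handful of participants the result is a directional signal rather than statistical proof.

```python
from statistics import mean


def meets_target(baseline_times, variant_times, target_reduction=0.20):
    """Compare mean completion times (in seconds) for two flows.

    Returns the relative reduction in mean time and whether it reaches
    the target. Assumes "20% faster" means 20% less time on task.
    """
    baseline = mean(baseline_times)
    variant = mean(variant_times)
    reduction = (baseline - variant) / baseline
    return reduction, reduction >= target_reduction


# Invented example timings, one value per participant.
three_page_flow = [118, 131, 125, 140, 122]
one_page_flow = [92, 101, 97, 110, 95]

reduction, passed = meets_target(three_page_flow, one_page_flow)
print(f"Reduction in mean completion time: {reduction:.1%} (target met: {passed})")
```

Writing the threshold down before the test, as `target_reduction` does here, keeps the team from redefining success after seeing the data.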
Connect objectives to business goals by tying UX improvements to KPIs. If the business goal is reducing cart abandonment, your testing objective might focus on identifying friction points in the checkout flow. If the goal is increasing feature adoption, test whether users understand the value proposition and can find the feature without assistance.
Tying research questions to KPIs this way keeps testing focused on insights stakeholders actually care about, not just interesting observations.
Testing with colleagues or friends skews your results. Internal employees already understand your product's mental model and terminology. Friends want to be supportive and may unconsciously give positive feedback. Testing with the wrong people can be worse than not testing at all, because it creates false confidence in flawed designs.
Use screener surveys to recruit participants who match your actual target personas. If you're designing a budget airline booking app, recruit frequent travellers who regularly use budget carriers, not business travellers who prioritise premium services. If you're testing accounting software, recruit small business owners who manage their own books, not professional accountants with specialised training.
For CX Managers, this means tester demographics should mirror your actual customer base:
Sample size depends on your research goals. For qualitative usability testing, Jakob Nielsen's research suggests that five users uncover approximately 85% of usability issues, with each additional participant revealing fewer new problems. For quantitative preference testing or statistical validation, recruit at least 20 to 30 participants.
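The five-user figure comes from a simple model published by Nielsen and Landauer: the share of issues found after n participants is 1 - (1 - L)^n, where L is the chance that a single participant hits any given issue. The sketch below uses L = 0.31, the average Nielsen reports across the projects he studied. Your own product may have a different rate, so treat the output as a rough guide rather than a guarantee. The function name is hypothetical.

```python
def share_of_issues_found(n_users, detect_rate=0.31):
    """Expected share of usability issues seen by at least one of n_users.

    detect_rate is the probability that one participant encounters a given
    issue; 0.31 is the average reported by Nielsen, not a universal constant.
    """
    return 1 - (1 - detect_rate) ** n_users


for n in (1, 3, 5, 10, 15):
    print(f"{n:>2} users: {share_of_issues_found(n):.0%} of issues")
```

The curve flattens quickly, which is why several small rounds with a fix in between usually teach you more than one large round.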
External panels provide access to large participant pools, while a private customer community, like the one you can build in Leanlab's Customer Lab, lets you test with people who already use your product, without a recruitment cycle for every round.
Task design determines the quality of feedback you receive. Leading questions that telegraph the correct action invalidate results by creating artificially high success rates.
Avoid instructions like "Click the green 'Add to Cart' button." This tells users exactly what to do, bypassing the discovery process that reveals usability issues. Instead, create scenarios that reflect real-world context: "You're looking for a gift for a friend under £50. Find an item you like and proceed to the point of purchase." This scenario provides motivation and context without revealing the interface elements users should interact with.
Use neutral language that doesn't hint at expected paths. Don't say "Use the search function to find…" because this assumes users will use search. Instead, say "Find a product that meets these criteria" and observe whether users naturally gravitate toward search, browse categories, or use filters.
For moderated tests, prepare follow-up probes in advance. When a user hesitates, ask "What were you expecting to see?" or "How did that make you feel?" These open-ended questions reveal the cognitive friction behind observable behaviour.
During testing, your primary job is observation, not intervention. Encourage participants to use the think-aloud protocol, verbalising their thought process as they work. This narration reveals the gap between what users do and what they intend to do.
For moderated tests, observe:
A furrowed brow or frustrated sigh often indicates a usability problem even if the user eventually completes the task. Take notes on both actions and commentary, because these often diverge. A user might say "This is easy" while taking three wrong turns to complete a simple task.
For unmoderated tests, review recordings systematically, looking for patterns in navigation, error recovery attempts, and task abandonment. If multiple users make the same wrong turn, that's a design flaw, not user error.
Avoid intervening unless the user is completely stuck due to a prototype bug rather than a design issue. If a user struggles because the prototype doesn't respond to a click, you can clarify. If they struggle because they can't find the button, that's valuable feedback about your design.
Analysis begins by looking for patterns. If one user struggles with a feature, it might be an outlier or a misunderstanding of the task. If three or more users struggle with the same element, you've identified a design flaw that needs fixing.
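If your session notes live in a spreadsheet, the "three or more users" rule is easy to apply mechanically. The following is a rough sketch that assumes each note has been reduced to a (participant, issue) pair. The function name, the issue labels, and the sample notes are invented for illustration.

```python
from collections import defaultdict


def recurring_issues(observations, min_users=3):
    """Return issues hit by at least min_users distinct participants.

    observations is an iterable of (participant_id, issue_label) pairs.
    Repeated notes from the same participant count only once.
    """
    users_by_issue = defaultdict(set)
    for participant, issue in observations:
        users_by_issue[issue].add(participant)
    return {
        issue: len(users)
        for issue, users in users_by_issue.items()
        if len(users) >= min_users
    }


# Invented session notes.
notes = [
    ("P1", "missed promo code field"),
    ("P2", "missed promo code field"),
    ("P2", "missed promo code field"),  # same participant, counted once
    ("P3", "could not find guest checkout"),
    ("P4", "missed promo code field"),
    ("P5", "could not find guest checkout"),
]

print(recurring_issues(notes))
```

Counting distinct participants rather than raw notes matters: one talkative user who mentions the same problem four times is still one user.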
Use the Traffic Light system to prioritise issues:
Quantify findings by calculating success rates, average time on task, and error rates. These metrics provide objective measures of improvement across iterations. If your initial prototype had a 60% success rate and your revised version achieves 85%, you've demonstrated measurable progress.
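As a sketch of how these three numbers might be calculated from raw session data, the snippet below assumes one record per task attempt. The `Attempt` fields, the choice to average time over all attempts (not only successful ones), and the definition of error rate as errors per attempt are assumptions; teams define these differently, so agree on the definitions before comparing iterations.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Attempt:
    completed: bool   # did the participant finish the task?
    seconds: float    # time on task
    errors: int       # wrong turns or mis-clicks observed


def summarise(attempts):
    """Return success rate, average time on task, and errors per attempt."""
    if not attempts:
        raise ValueError("need at least one attempt")
    total = len(attempts)
    return {
        "success_rate": sum(a.completed for a in attempts) / total,
        "avg_seconds": mean([a.seconds for a in attempts]),
        "errors_per_attempt": sum(a.errors for a in attempts) / total,
    }


# Invented data for a first prototype round with five participants.
round_one = [
    Attempt(True, 84.0, 1),
    Attempt(False, 150.0, 4),
    Attempt(True, 95.0, 0),
    Attempt(True, 102.0, 2),
    Attempt(False, 139.0, 3),
]

print(summarise(round_one))
```

Run the same function on each iteration's data so that a change from, say, 60% to 85% success reflects the design and not a shift in how you counted.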
Create highlight reels by compiling short video clips of users struggling with specific features. For stakeholders who didn't observe testing sessions, watching a customer get confused is far more persuasive than reading statistics. These clips transform abstract data into concrete, empathetic understanding of user challenges.
Stockmann's CX team used this same step-by-step approach to rebuild their login flow: they split the redesign into three smaller pieces, tested each one in Leanlab over two weeks, and caught several assumptions that didn't match how customers actually behaved before development finalised the flow. As Arla Jussila, Lead Specialist for Customer Experience & Insight at Stockmann, put it: "If we didn't have Leanlab, I don't think we would have been able to do any testing. We probably would have ended up just launching and hoping for the best."
You can run this entire framework with pen, paper, and a spreadsheet. What Leanlab changes are steps 2 and 4. Instead of recruiting participants from scratch for every round, you invite them from your own private Customer Lab. Instead of waiting weeks for a testing agency, you can launch an unmoderated test on Thursday morning and have results by Friday afternoon. The hypotheses, task scripts, and Traffic Light prioritisation above stay exactly the same. Leanlab just removes the friction between steps.