The usability testing reporting template

M 32 Reporting Template

Use the following guidelines to ensure that the required elements are reported. Use the ones that are relevant to your testing purpose, method and setting.

NOTE: This template serves as a guideline. As mentioned above, not every item listed has to be described.
NOTE: This template is meant to be used for reporting all kinds of tests.

Title Page

Usability Report:	xxxx
Due date of Report:	xx.xx.200x
Actual submission date:	xx.xx.200x
Revised version:	xx.xx.200x (final)

Product name and version:	Tool name, version (proto or stable version)
Organisers of the test:	Xxxxx

Date of the test:	Xxx, xxx
Date of the report:	Xxx, xxx
Editor:	Xxx
Contact name(s):	Xxx

Executive summary

Provide a brief, high-level overview of the test (including the purpose of the test)
Name the product:
Purpose/objectives of the test:
Method 1:
Number and type of participants:
Tasks (if tasks are used):
Method 2:

Results in main points
e.g. as a bullet list (readers must be able to get the main results without reading the full report; this is important because reports serve different purposes and sometimes only a quick overview is needed)

Table of contents
Title Page
Introduction
Full Product Description
Test Objectives
Method
Participants
Context of Product Use in the Test
Test Facility
Experimental Design
Results

Introduction

Full Product Description

Formal product name and release or version
Describe what parts of the product were evaluated
The user population for which the product is intended
Brief description of the environment in which it should be used (i.e., the context of use of the product/tool: for example, is it an educational product used in primary school or higher education, or a research tool used in the field, and if so, what counts as the field?)

Test Objectives

State the objectives for the test and any areas of specific interest
Functions and components with which the user directly and indirectly interacted
Reason for focusing on a product subset

Method

Participants

The total number of participants tested
Segmentation of user groups tested, if more than one user group was tested
Key characteristics and capabilities of the user group (this information may have been acquired through the background (pre) questionnaires, in which case it can simply be referenced here, e.g. by linking to the description of the results of those questionnaires)
How participants were selected; whether they had the essential characteristics
Differences between the participant sample and the user population

Context of Product Use in the Test

Any known differences between the evaluated context and the expected context of use
Tasks
Describe the task scenarios for testing
Explain why these tasks were selected
Describe the source of these tasks
Include any task data/information given to the participants
Completion or performance criteria established for each task

Test Facility
Describe the setting, and type of space in which the evaluation was conducted
Detail any relevant features or circumstances that could affect the results (e.g., the server broke down, which disrupted the test for a while and created unnecessary tension; unforeseeable noise disturbed the test; etc.)

Participant's Computing Environment

Computer configuration, including model, OS version, and settings
Browser name and version
Relevant plug-in names and versions
NOTE: These items record, for example, which browsers and computers the users had in the test. In field trials, the technical partners do not know this information in advance. For example, in one of the tests in spring 2007, a user was at home using SSp during the test, so she was asked what she used (e.g., Internet Explorer 6 and Mozilla Firefox 2.0.0.6; Compaq Presario with Windows XP and IBM ThinkPad with Windows XP). If not everything is known, report what is available, but it is worth trying to obtain the information. Plug-ins can refer, for example, to browser add-ons (in Firefox these are found in the Tools menu at the top). Sometimes it is necessary to know whether certain plug-ins are on or off, because they may change or block some functions.
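One possible way to keep this information consistent across participants is a small structured record. The sketch below is only an illustration: the class `ComputingEnvironment`, its field names, and the helper `missing_fields` are hypothetical and not part of this template. Unknown items are left as `None` so that gaps can be listed and followed up with the participant.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ComputingEnvironment:
    """Hypothetical record of one participant's setup; None means 'not known'."""
    computer_model: Optional[str] = None
    os_version: Optional[str] = None
    browser: Optional[str] = None
    browser_version: Optional[str] = None
    plugins: Optional[List[str]] = None  # None = unknown, [] = no plug-ins


def missing_fields(env: ComputingEnvironment) -> List[str]:
    """Return the names of all items that are still unknown."""
    return [f.name for f in fields(env) if getattr(env, f.name) is None]


# Example values taken from the field-trial case described above.
env = ComputingEnvironment(
    computer_model="Compaq Presario",
    os_version="Windows XP",
    browser="Internet Explorer",
    browser_version="6",
)
print(missing_fields(env))
```

Here the plug-in information was never collected, so `plugins` is reported as missing and can be asked about after the session.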

Display Devices (report if relevant, e.g., when paper prototypes are tested or static prototypes are tested on screen)

If screen-based, screen size, resolution, and colour setting
If print-based, the media size and print resolution

Test Administrator Tools (report if relevant for the particular test)

If a questionnaire was used, describe or specify it here (add these to appendix)
Describe any hardware or software used to control the test or to record data (audio, video)

Experimental Design

Define independent variables and control variables
Describe the measures for which data were recorded (the scale/scope of the recorded data, if relevant for the particular test, e.g., written notes, think-aloud audio recordings, etc.)

Procedure

Operational definitions of measures (e.g., how it is decided that a task is completed)
Policies and procedures for interaction between tester(s) and subjects (e.g., whether the test conductor is allowed to answer the user's questions, provide help, etc.)
State which of the following were used: non-disclosure agreements, form completion, warm-ups, pre-task training, and debriefing
Specific steps followed to execute the test sessions and record data
Number and roles of people who interacted with the participants during the test session
Specify if other individuals were present in the test environment
State whether participants were paid
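
An operational definition is easiest to apply consistently when it is written down as an explicit rule. The following is a minimal sketch of one such rule for "task completed"; the `Observation` fields and the thresholds (300 seconds, no assists) are assumptions chosen for illustration, not values prescribed by this template.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical per-task observation noted by the test conductor."""
    reached_goal: bool  # participant reached the end state named in the task
    seconds: float      # time from task start to task end
    assists: int        # number of times the test conductor gave help


def is_completed(obs: Observation,
                 max_seconds: float = 300.0,
                 max_assists: int = 0) -> bool:
    """A task counts as completed only if the goal was reached
    within the time limit and without exceeding the allowed assists."""
    return (obs.reached_goal
            and obs.seconds <= max_seconds
            and obs.assists <= max_assists)


print(is_completed(Observation(reached_goal=True, seconds=120.0, assists=0)))
print(is_completed(Observation(reached_goal=True, seconds=120.0, assists=1)))
```

Stating the rule this way also documents the interaction policy: if the test conductor is allowed to help once, `max_assists=1` makes that explicit in the definition.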

Participant General Instructions (here or in Appendix)

Instructions given to the participants
Task instruction summary
Usability Metrics (if used)
Metrics for effectiveness
Metrics for efficiency
Metrics for satisfaction, etc.
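
As a rough illustration of how the three kinds of metrics might be computed, the sketch below assumes one record per participant and task with hypothetical fields `completed`, `seconds`, and a 1 to 5 `satisfaction` rating. It takes effectiveness as the completion rate, efficiency as the mean time on completed tasks, and satisfaction as the mean rating. These are common choices, not ones required by this template.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class TaskResult:
    """Hypothetical log entry for one participant on one task."""
    participant: str
    task: str
    completed: bool
    seconds: float
    satisfaction: int  # assumed 1 (low) to 5 (high) rating


def summarize(results: List[TaskResult]) -> dict:
    """Compute simple effectiveness, efficiency and satisfaction figures."""
    if not results:
        raise ValueError("no results to summarize")
    completed = [r for r in results if r.completed]
    return {
        # share of attempts that were completed
        "effectiveness": len(completed) / len(results),
        # mean time on task, counting completed attempts only
        "efficiency_mean_seconds": (
            mean(r.seconds for r in completed) if completed else None
        ),
        # mean rating over all attempts
        "satisfaction_mean": mean(r.satisfaction for r in results),
    }


sample = [
    TaskResult("P1", "T1", True, 40.0, 4),
    TaskResult("P2", "T1", False, 90.0, 2),
    TaskResult("P3", "T1", True, 60.0, 5),
    TaskResult("P4", "T1", True, 50.0, 5),
]
summary = summarize(sample)
print(summary)
```

Whatever definitions are chosen, the report should state them, since for example mean time on task changes depending on whether failed attempts are included.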

Results

Data Analysis
Quantitative data analysis
Qualitative data analysis
Presentation of the Results
From quantitative data analysis
From qualitative data analysis (descriptive and clarifying presentation of the results)
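
When quantitative results come from only a few participants, a bare percentage can look more precise than it is. One option, shown here as a sketch rather than a required method, is to report a completion rate together with an adjusted Wald (Agresti-Coull) interval. The function name and the default `z = 1.96` (roughly a 95% interval) are choices made for this example.

```python
import math
from typing import Tuple


def adjusted_wald_interval(successes: int, n: int,
                           z: float = 1.96) -> Tuple[float, float]:
    """Approximate confidence interval for a completion rate.

    Adds z^2/2 successes and z^2/2 failures before applying the
    usual Wald formula, then clamps the bounds to [0, 1].
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


# Example: 4 of 5 participants completed the task.
low, high = adjusted_wald_interval(4, 5)
print(f"completion rate 4/5, interval from {low:.2f} to {high:.2f}")
```

With five participants the interval is very wide, which is useful context when the report discusses how much weight the findings can bear.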

Reliability and Validity

Reliability is the question of whether one would get the same result if the test were to be repeated.
Repeatability is hard to achieve in usability tests, but the report can still argue how significant the findings are.
(Example from expert evaluation: This review was made by one reviewer in order to give quick feedback to the development team. To get more reliable results it would have been desirable to use three, or at least two reviewers, as it is often the case that different reviewers look at different things. We do feel, however, that for the purpose of this report, and the essence of quick feedback, one reviewer has given enough feedback to enhance the usability of the system.)

Validity is the question of whether the usability test measured what it was intended to measure, i.e., whether it answers the questions it was meant to answer. Typical validity problems include using the wrong users or giving them the wrong tasks.
(Example from expert evaluation: "The reviewer is an experienced usability professional that has evaluated systems for many years. We therefore feel that the method used as well as the tasks used give an appropriate view of how ordinary users would behave in the system.")

Summary Appendices

Custom Questionnaires (if used; e.g., in an expert evaluation there are no participants)
Participant General Instructions
Participant Task Instructions, if tasks were used in the test
