M 32 Reporting Template
Use the following guidelines to ensure that the required elements are reported. Use those that are relevant to your testing purpose, method and setting.
NOTE: This template serves as a guideline. As stated above, not every element listed has to be described.
NOTE: This template is meant to be used for reporting all kinds of tests.
Title Page
|Usability Report|xxxx|
|Due date of Report:|xx.xx.200x|
|Actual submission date:|xx.xx.200x|
|Revised version:|xx.xx.200x (final)|
|Product name and version:|Tool name, version (proto or stable version)|
|Organisers of the test:|Xxxxx|
|Date of the test:|Xxx, xxx|
|Date of the report:|Xxx, xxx|
|Editor:|Xxx|
|Contact name(s):|Xxx|
Executive summary
Provide a brief, high-level overview of the test (including the purpose of the test)
Name the product:
Purpose/objectives of the test:
Method 1:
Number and type of participants:
Tasks (if tasks are used):
Method 2:
Results in main points
e.g. as a bullet list (this lets readers get the main results without reading the full report, which is important because the reports serve different purposes and sometimes only a quick overview is needed)
Table of contents
Title Page
Introduction
Full Product Description
Test Objectives
Method
Participants
Context of Product Use in the Test
Test Facility
Experimental Design
Results
Introduction
Full Product Description
- Formal product name and release or version
- Describe what parts of the product were evaluated
- The user population for which the product is intended
- Brief description of the environment in which it should be used (i.e., the context of use of the product/tool: for example, is it an educational product used in primary school or higher education, or a research tool used in the field, and if so, what is the field?)
Test Objectives
- State the objectives for the test and any areas of specific interest
- Functions and components with which the user directly and indirectly interacted
- Reason for focusing on a product subset
Method
Participants
- The total number of participants tested
- Segmentation of user groups tested, if more than one user group was tested
- Key characteristics and capabilities of the user group (this information may have been collected through the background (pre-test) questionnaires, in which case it can simply be referenced here, e.g. by linking to the description of the questionnaire results)
- How participants were selected; whether they had the essential characteristics
- Differences between the participant sample and the user population
Context of Product Use in the Test
- Any known differences between the evaluated context and the expected context of use
- Tasks
- Describe the task scenarios for testing
- Explain why these tasks were selected
- Describe the source of these tasks
- Include any task data/information given to the participants
- Completion or performance criteria established for each task
Test Facility
Describe the setting and the type of space in which the evaluation was conducted
Detail any relevant features or circumstances that could affect the results (e.g. a server breakdown interrupted the test for a while and created unnecessary tension, or unforeseeable noise disturbed the test).
Participant's Computing Environment
- Computer configuration, including model, OS version and settings
- Browser name and version
- Relevant plug-in names and versions
(These bullets mean stating, for example, which browsers and computers the users used in the test. In field trials this information is not known to the technical partners. For example, in one of the tests in spring 2007, a user was at home using SSp during the test, so she was asked what she used: Internet Explorer 6 and Mozilla Firefox 2.0.0.6, on a Compaq Presario with Windows XP and an IBM ThinkPad with Windows XP. If not everything is known, report what is available, but it is worth trying to obtain the information. Plug-ins can refer, for example, to browser add-ons (in Firefox these are found in the Tools menu at the top). Sometimes it is necessary to know whether certain plug-ins are on or off, because this may change or prevent some functions.)
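The items above could be captured in a small per-participant record so that missing details are visible when the report is written. The following is a minimal sketch only: the class name `ParticipantEnvironment`, its field names and the `"unknown"` placeholder are assumptions made for illustration and are not part of this template. The example values are taken from the field-trial example above.

```python
from dataclasses import dataclass, field, asdict

# Placeholder for details that could not be obtained (hypothetical convention).
UNKNOWN = "unknown"


@dataclass
class ParticipantEnvironment:
    """One participant's computing environment, as far as it is known."""
    participant_id: str
    computer_model: str = UNKNOWN
    os_version: str = UNKNOWN
    browser: str = UNKNOWN
    plugins: list = field(default_factory=list)

    def missing_fields(self):
        """Return the names of the fields still marked as unknown."""
        return [name for name, value in asdict(self).items() if value == UNKNOWN]


# Illustrative record: the computer model was not reported by the participant.
env = ParticipantEnvironment(
    participant_id="P1",
    os_version="Windows XP",
    browser="Mozilla Firefox 2.0.0.6",
)
print(env.missing_fields())  # ['computer_model']
```

Listing the missing fields before the report is finalised gives the tester a chance to ask the participant for the remaining details.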
Display Devices (report if relevant, e.g., when paper prototypes or static on-screen prototypes are tested)
- If screen-based, screen size, resolution, and colour setting
- If print-based, the media size and print resolution
Test Administrator Tools (report if relevant for the particular test)
- If a questionnaire was used, describe or specify it here (add the questionnaire to the appendix)
- Describe any hardware or software used to control the test or to record data (audio, video)
Experimental Design
- Define independent variables and control variables
- Describe the measures for which data were recorded (the scale/scope of the recorded data, if relevant for the particular test, e.g., written notes, think-aloud audio recordings, etc.)
Procedure
- Operational definitions of measures (e.g., how it is decided that a task is completed)
- Policies and procedures for interaction between tester(s) and subjects (e.g., is the test conductor allowed to answer the user's questions, provide help, etc.)
- State which of the following were used: non-disclosure agreements, form completion, warm-ups, pre-task training, and debriefing
- Specific steps followed to execute the test sessions and record data
- Number and roles of people who interacted with the participants during the test session
- Specify if other individuals were present in the test environment
- State whether participants were paid
Participant General Instructions (here or in Appendix)
- Instructions given to the participants
- Task instruction summary
- Usability Metrics (if used)
- Metrics for effectiveness
- Metrics for efficiency
- Metrics for satisfaction, etc.
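As an illustration of the three metric types listed above, the following minimal sketch computes a completion rate (effectiveness), a mean time on task (efficiency) and a mean rating (satisfaction). The record layout, the field names and all values are hypothetical and chosen only for this example. The template does not prescribe particular metrics, and a real test may define them differently.

```python
from statistics import mean

# Hypothetical session records: one entry per participant and task.
# The values are illustrative only, not data from a real test.
sessions = [
    {"participant": "P1", "task": "T1", "completed": True, "seconds": 95.0, "rating": 4},
    {"participant": "P2", "task": "T1", "completed": False, "seconds": 180.0, "rating": 2},
    {"participant": "P3", "task": "T1", "completed": True, "seconds": 110.0, "rating": 5},
    {"participant": "P4", "task": "T1", "completed": True, "seconds": 125.0, "rating": 4},
]


def effectiveness(records):
    """Share of attempts that met the completion criterion (0.0 to 1.0)."""
    return sum(r["completed"] for r in records) / len(records)


def efficiency(records):
    """Mean time on task in seconds, counting completed attempts only."""
    times = [r["seconds"] for r in records if r["completed"]]
    return mean(times) if times else None


def satisfaction(records):
    """Mean post-task rating (assumed here to be on a 1 to 5 scale)."""
    return mean(r["rating"] for r in records)


print(f"Completion rate: {effectiveness(sessions):.0%}")
print(f"Mean time on task: {efficiency(sessions):.1f} s")
print(f"Mean rating: {satisfaction(sessions):.2f}")
```

Counting only completed attempts for efficiency is one possible convention. Whichever convention is used, it should be stated under the operational definitions of measures in the Procedure section.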
Results
- Data Analysis
- Quantitative data analysis
- Qualitative data analysis
- Presentation of the Results
- From quantitative data analysis
- From qualitative data analysis (descriptive and clarifying presentation of the results)
Reliability and Validity
Reliability is the question of whether one would get the same result if the test were to be repeated.
This is hard to achieve in usability tests, but the report can reason about how significant the findings are.
(Example from expert evaluation: This review was made by one reviewer in order to give quick feedback to the development team. To get more reliable results it would have been desirable to use three, or at least two reviewers, as it is often the case that different reviewers look at different things. We do feel, however, that for the purpose of this report, and the essence of quick feedback, one reviewer has given enough feedback to enhance the usability of the system.)
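One possible way to support such reasoning for quantitative results is to report a confidence interval alongside a completion rate, since small samples leave a wide margin of uncertainty. The sketch below uses the adjusted Wald (Agresti-Coull) interval, a technique chosen here for illustration and not one prescribed by this template. The function name and the example figures (4 of 5 participants completing a task) are hypothetical.

```python
import math


def adjusted_wald_interval(successes, trials, z=1.96):
    """Approximate confidence interval for a completion rate.

    Uses the adjusted Wald (Agresti-Coull) method; z=1.96 corresponds
    to a confidence level of roughly 95 %.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    n_adj = trials + z ** 2                      # adjusted sample size
    p_adj = (successes + z ** 2 / 2) / n_adj     # adjusted proportion
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp to the valid range for a proportion.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


# Hypothetical example: 4 of 5 participants completed the task.
low, high = adjusted_wald_interval(4, 5)
print(f"Observed rate: {4 / 5:.0%}, interval: {low:.0%} to {high:.0%}")
```

A wide interval signals that the observed rate could differ considerably if the test were repeated with other participants.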
Validity is the question of whether the usability test measured what it was intended to measure, i.e., whether it provides answers to the questions it was meant to answer. Typical validity problems involve using the wrong users or giving them the wrong tasks.
(Example from expert evaluation: "The reviewer is an experienced usability professional that has evaluated systems for many years. We therefore feel that the method used as well as the tasks used give an appropriate view of how ordinary users would behave in the system.")
Summary
Appendices
- Custom questionnaires, if used (e.g., in an expert evaluation there are no participants)
- Participant General Instructions
- Participant Task Instructions, if tasks were used in the test