GOMS Analysis & Web Site Usability

Mining the Web for Design Guidelines
Marti Hearst, Melody Ivory, Rashmi Sinha
UC Berkeley

The Usability Gap
  • 196M new Web sites in the next 5 years [Nielsen99]
  • A shortage of user interface professionals: ~20,000 [Nielsen99]
  • Most sites have inadequate usability [Forrester, Spool, Hurst] (users can't find what they want 39-66% of the time)

One Solution: Design Guidelines

Example Design Guidelines
  • Break the text up to facilitate scanning
  • Don't clutter the page
  • Reduce the number of links that must be followed to find information
  • Be consistent

Problems with Design Guidelines
  • Guidelines are helpful, but there are MANY usability guidelines
  • Survey of 21 web guidelines found little overlap [Ratner et al. 96]
  • Why? One idea: because they are not empirically validated
      • Sometimes imprecise
      • Sometimes conflict

Question: How can we identify characteristics of good websites on a large scale?
Question: How can we turn these characteristics into empirically validated guidelines?
  • Conduct usability studies: hard to do on a large scale
  • Find a corpus of websites already identified as good!
  • Use the WebbyAwards database

Talk Outline
  • WebbyAwards 2000
  • Study I: Qualities of highly rated websites
  • Study II: Empirically validated design guidelines
  • Putting this into use

Criteria for submission to the WebbyAwards
  • Anyone who has a current, live website
  • Should be accessible to the general public
  • Should be predominantly in English
  • No limit to the number of entries that each person can make

Site Category
  • Sites must fit into at least one of 27 categories. For example: Arts, Activism, Fashion, Health, News, Radio, Sports, Music, Personal Websites, Travel, Weird

Webby Judges
  • Internet professionals who work with and on the internet: new media journalists, editors, web developers, and other Internet professionals
  • Have clearly demonstrable familiarity with the category which they review

3 Stage Judging Process
  • Review Stage: from 3000 to 400 sites. 3 judges rate each site on 6 criteria and vote on whether it goes to the next stage.
  • Nominating Stage: from 400 to 135 sites. 3 judges rate each site on 6 criteria and vote on whether it goes to the next stage.
  • Final Stage: from 135 to 27 sites. Judges cast a vote for the best site.

Criteria for Judging
  • 6 criteria: Content, Structure & navigation, Visual design, Functionality, Interactivity, Overall experience
  • Scale: 1-10 (highest)
  • Nearly normally distributed

Content is the information provided on the site. Good content is engaging, relevant, and appropriate for the audience. You can tell it's been developed for the Web because it's clear and concise and it works in the medium.

Visual Design is the appearance of the site. Good visual design is high quality, appropriate, and relevant for the audience and the message it is supporting.

Interactivity is the way a site allows a user to do something. Good interactivity is more than sound effects and a Flash animation. It allows the user to give and receive. It's input/output in searches, chat rooms, ecommerce, etc.

Can overall rating be predicted by specific criteria?
  • Statistical technique: regression analysis
  • Question: What % of variance is explained by the 5 specific criteria?
  • Percentage of variance explained = 89%
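To make "percentage of variance explained" concrete, here is a minimal sketch of an ordinary least-squares regression and its R² value. It is not the authors' analysis code: the ratings are invented, only two of the five criteria are used to keep the example small, and the function names `solve_linear_system` and `r_squared` are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def r_squared(features, target):
    """Fraction of the variance in target explained by a least-squares fit."""
    rows = [[1.0] + list(f) for f in features]  # leading 1.0 is the intercept
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    predicted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(target) / len(target)
    ss_res = sum((y - p) ** 2 for y, p in zip(target, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in target)
    return 1.0 - ss_res / ss_tot


# Invented ratings on the 1-10 scale: (content, visual design) per site.
criteria_ratings = [(8, 6), (5, 7), (9, 8), (4, 4), (7, 5), (6, 9), (3, 5), (9, 6)]
overall_ratings = [8, 6, 9, 4, 6, 7, 4, 8]

explained = r_squared(criteria_ratings, overall_ratings)
print(f"Variance explained: {explained:.0%}")
```

A value near 100% would mean the specific criteria almost fully determine the overall rating. The 89% reported on the slide comes from the real judging data, not from this toy sample.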

Can votes be predicted by specific criteria?
  • Statistical technique: discriminant analysis
  • Question: Can we predict the votes from the 5 specific criteria?
  • Classification accuracy for sites = 91%

Review Stage: Which criteria contribute most to overall rating?
[Figure 2a. Review Stage: Contribution of specific criteria to overall site rating (Content, Navigation, Visual Design, Interactivity, Functionality)]
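The vote-prediction question can be illustrated with a two-class linear discriminant. The slides only say "discriminant analysis", so the variant below (Fisher's linear discriminant with equal priors) is an assumption, as are the invented ratings, the restriction to two criteria, and the function names.

```python
def mean_vector(points):
    """Mean of a list of 2-dimensional points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]


def scatter_matrix(points, mean):
    """2x2 within-class scatter: sum of outer products of deviations."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mean[0], p[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_discriminant(yes_sites, no_sites):
    """Return (weights, threshold) of Fisher's linear discriminant."""
    m_yes, m_no = mean_vector(yes_sites), mean_vector(no_sites)
    s_yes = scatter_matrix(yes_sites, m_yes)
    s_no = scatter_matrix(no_sites, m_no)
    s = [[s_yes[i][j] + s_no[i][j] for j in range(2)] for i in range(2)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    diff = [m_yes[0] - m_no[0], m_yes[1] - m_no[1]]
    # weights = S^-1 (m_yes - m_no), using the closed-form 2x2 inverse
    w = [(s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
         (-s[1][0] * diff[0] + s[0][0] * diff[1]) / det]
    # Equal priors assumed: cut halfway between the two class means
    midpoint = [(m_yes[i] + m_no[i]) / 2 for i in range(2)]
    threshold = w[0] * midpoint[0] + w[1] * midpoint[1]
    return w, threshold


def predict_vote(site, w, threshold):
    """True if the discriminant predicts a 'yes' vote for this site."""
    return w[0] * site[0] + w[1] * site[1] > threshold


# Invented (content, interactivity) ratings for sites that did / did not get a vote.
voted_yes = [(8, 7), (9, 6), (7, 8), (8, 9)]
voted_no = [(4, 5), (5, 3), (3, 4), (5, 5)]

w, threshold = fit_discriminant(voted_yes, voted_no)
correct = sum(predict_vote(s, w, threshold) for s in voted_yes)
correct += sum(not predict_vote(s, w, threshold) for s in voted_no)
accuracy = correct / (len(voted_yes) + len(voted_no))
print(f"Classification accuracy on the toy data: {accuracy:.0%}")
```

The accuracy here is measured on the same sites used to fit the discriminant, so it is optimistic. The slide does not say how its 91% was measured.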

Nominating Stage Analysis
  • 6 criteria: Content, Structure & Navigation, Visual Design, Functionality, Interactivity, Overall experience
  • 400 sites
  • 3 judges rated each site

Nominating Stage: Top sites for each category
[Figure: Histogram of overall rating (1.0 to 10.0). Mean = 7.6, SD = 1.66]

Which criteria contribute to overall rating at Nominating Stage?
  • 77% variance explained in overall rating
[Figure: Contribution of criteria (correlation) and unique contribution of criteria (partial correlation) for Content, Navigation, Visual Design, Interactivity, Functionality]

Summary of Study I Findings
  • The specific ratings do explain overall experience.
  • The best predictor of overall score is content.
  • The second best predictor is interactivity.
  • The worst predictor is visual design.

Are there differences between categories?
  • Arts, Activism, Fashion, Health, News, Sports, Music, Personal Websites, Travel

Art
[Figure: Arts: Contribution of criteria to overall rating (Content, Navigation, Visual Design, Interactivity, Functionality). Variance explained = 93%]

Commerce Sites
[Figure: Commerce Sites: Contribution of criteria to overall rating (Content, Navigation, Visual Design, Interactivity, Functionality). Variance explained = 87%]

Radio Sites
[Figure: Radio Sites: Contribution of criteria to overall rating (Content, Navigation, Visual Design, Interactivity, Functionality). Variance explained = 90%]

Conclusions: Study I
  • The importance of criteria varies by category.
  • Content is by far the best predictor of overall site experience.
  • Interactivity comes next.
  • Visual Design does not have as much predictive power except in specific categories.
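The Nominating Stage chart separates a criterion's contribution (correlation with the overall rating) from its unique contribution (partial correlation). The sketch below shows that distinction for the simplest case, controlling for a single other criterion. The study would have controlled for all the remaining criteria. The ratings and function names are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented ratings on the 1-10 scale for eight sites.
content = [8, 5, 9, 4, 7, 6, 3, 9]
visual = [6, 7, 8, 4, 5, 9, 5, 6]
overall = [8, 6, 9, 4, 6, 7, 4, 8]

print("contribution of content:", round(pearson(content, overall), 2))
print("unique contribution of content:",
      round(partial_correlation(content, overall, visual), 2))
```

A criterion can correlate strongly with the overall rating yet have a small partial correlation if most of its signal is shared with other criteria.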

Study II
  • An empirical bottom-up approach to developing design guidelines
  • Challenge: How to use Webby criteria to inform web page design?
  • Answer: Identify quantitative measures that characterize pages

Quantitative Measures
  • Page Composition: words, links, images, ...
  • Page Formatting: fonts, lists, colors, ...
  • Overall Characteristics: information & layout quality

Quantitative page measures
  • Word Count
  • Body Text %
  • Emphasized Body Text %
  • Text Cluster Count
  • Link Count
  • Page Size
  • Graphic %
  • Color Count
  • Font Count

[Example slides, one per measure: Word Count, Body Text %, Emphasized Body Text %, Text Positioning Count, Text Cluster Count, Link Count, Page Size (Bytes), Graphic %, Graphic Count]
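As a rough illustration of how a few of these measures could be computed from a page's HTML, here is a sketch using Python's standard `html.parser`. The slides do not give exact definitions, so the ones used here are assumptions: emphasized text is text inside `b`, `strong`, `i`, or `em`, and every `img` counts as a graphic. The class and function names are hypothetical.

```python
from html.parser import HTMLParser

EMPHASIS_TAGS = {"b", "strong", "i", "em"}
SKIPPED_TAGS = {"script", "style"}  # their contents are not visible text


class PageMeasures(HTMLParser):
    """Collects simple page-level counts while parsing HTML."""

    def __init__(self):
        super().__init__()
        self.word_count = 0
        self.emphasized_words = 0
        self.link_count = 0
        self.graphic_count = 0
        self._emphasis_depth = 0
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.link_count += 1
        elif tag == "img":
            self.graphic_count += 1
        elif tag in EMPHASIS_TAGS:
            self._emphasis_depth += 1
        elif tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in EMPHASIS_TAGS and self._emphasis_depth > 0:
            self._emphasis_depth -= 1
        elif tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return
        words = len(data.split())
        self.word_count += words
        if self._emphasis_depth:
            self.emphasized_words += words


def measure_page(html):
    """Return a dict of page measures for one HTML document."""
    parser = PageMeasures()
    parser.feed(html)
    parser.close()
    emphasized_pct = (100.0 * parser.emphasized_words / parser.word_count
                      if parser.word_count else 0.0)
    return {
        "word_count": parser.word_count,
        "link_count": parser.link_count,
        "graphic_count": parser.graphic_count,
        "emphasized_text_pct": emphasized_pct,
        "page_size_bytes": len(html.encode("utf-8")),
    }


sample = (
    "<html><body><h1>Travel deals</h1>"
    "<p>Find <b>cheap flights</b> and hotels.</p>"
    '<a href="/flights">Flights</a> <a href="/hotels">Hotels</a>'
    '<img src="logo.gif" alt="logo">'
    "</body></html>"
)
measures = measure_page(sample)
for name, value in measures.items():
    print(name, value)
```

Measures such as Text Cluster Count or Color Count would need layout and style information that this sketch does not attempt to recover.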

Study Design
  • Quantitative page metrics: Word Count, Body Text %, Text Cluster Count, Link Count, etc.
  • Webby ratings: highly rated sites (top 33%) vs. low rated sites (bottom 33%)
  • Model accuracy (top / bottom): across categories 67% / 63%; within categories 76% / 83%

Classification Accuracy
  • Comparing top vs. bottom
  • Accuracy higher for within categories

                          N     Top   Bottom
  Overall              1286     67%     63%
  Within categories
    Community           305     83%     92%
    Education           368     76%     73%
    Finance             142     77%     93%
    Health              165     93%     87%
    Living              106     42%     76%
    Services            208     86%     75%
    Cat. Avg.                   76%     83%

Which page metrics predict site quality?
  • All metrics played a role
  • However, their role differed for various categories of pages (small, medium & large)
  • Summary across all pages in the sample:
      • Good pages had significantly smaller graphics percentage
      • Good pages had less emphasized body text
      • Good pages had more colors (on text)

Role of Metrics for Medium Pages (230 words on average)
  • Good medium pages:
      • Emphasize less of the body text
      • Appear to organize text into clusters (e.g., lists and shaded table areas)
      • Use colors to distinguish headings from body text
  • Suggests that these pages are easier to scan

[Figure: Low rated page: no text clustering, no selective highlighting]
[Figure: High rated page: selective highlighting, text clustering]

Why does this approach work?

  • Superficial page metrics reflect deeper aspects of information architecture, interactivity, etc.

Possible Uses
  • A grammar checker to assess guideline conformance
      • Imperfect
      • Only suggestions, not dogma
  • Automatic template suggestions
  • Automatic comparison to highly usable pages/sites

Current Design Analysis Tools
  • Some tools report on easy-to-measure attributes
  • Compare measures to thresholds
  • Guideline conformance

Comparing a Design to Validated Good Designs
[Figure labels: Web Site Design, Comparable Designs, Profiles, Analysis Tool, Favorite Designs, Prediction, Similarities, Differences, Suggestions]

Future work
  • Distinguish according to page role
  • Better metrics
  • More aspects of info, navigation, and graphic design
  • Site level as well as page level
  • Category-based profiles
  • Home page vs. content vs. index
  • Use clustering to create profiles of good and poor sites
  • These can be used to suggest alternative designs

Conclusions: Study II
  • Automated tools should help close the Web Usability Gap
  • We have a foundation for a new methodology
  • We can empirically distinguish good pages
  • Empirical, bottom up
  • Empirical validation of design guidelines
  • Can build profiles of good vs. poor sites
  • Eventually build tools to help users assess designs

More Information
http://webtango.berkeley.edu
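The idea of comparing a design against a profile of good pages can be sketched as follows. This is a simplified stand-in, not the tool described in the talk: the profile here is just the mean and standard deviation of each measure over some highly rated pages (the slides propose clustering for this), and the measure values, the 1.5 standard-deviation cutoff, and the function names are all invented.

```python
from statistics import mean, stdev


def build_profile(pages):
    """Per-measure (mean, standard deviation) over a list of measure dicts."""
    profile = {}
    for name in pages[0]:
        values = [page[name] for page in pages]
        profile[name] = (mean(values), stdev(values))
    return profile


def compare_to_profile(page, profile, max_sd=1.5):
    """Report measures that differ from the profile by more than max_sd SDs."""
    differences = {}
    for name, (avg, sd) in profile.items():
        if sd == 0:
            continue  # no spread in the reference pages, nothing to compare
        z = (page[name] - avg) / sd
        if abs(z) > max_sd:
            differences[name] = "higher" if z > 0 else "lower"
    return differences


# Invented measures for four highly rated medium-sized pages.
good_pages = [
    {"word_count": 220, "link_count": 30, "emphasized_text_pct": 5.0},
    {"word_count": 240, "link_count": 35, "emphasized_text_pct": 4.0},
    {"word_count": 230, "link_count": 28, "emphasized_text_pct": 6.0},
    {"word_count": 250, "link_count": 40, "emphasized_text_pct": 5.0},
]
candidate = {"word_count": 235, "link_count": 90, "emphasized_text_pct": 25.0}

profile = build_profile(good_pages)
differences = compare_to_profile(candidate, profile)
for name, direction in differences.items():
    print(f"{name} is {direction} than in the highly rated pages")
```

In the spirit of the "only suggestions, not dogma" caveat, the output is a list of differences for a designer to review rather than a verdict on the page.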
