Voiceover 2 – a WebDev’s guide

VoiceOver, the screen reader for Apple’s OS X, has undergone a major update. This article looks at the new functions, and what they mean for people browsing websites. I am not looking at VoiceOver in general, just how different aspects of web pages affect the experience when using VoiceOver.

NB: Comparing Windows-based screen readers with VoiceOver is very difficult, and probably a fairly pointless exercise. I’m not trying to do that.

DOM Navigation

The first thing to check is always the settings, as these can have a massive impact on how a screen reader reacts: a setting like ‘verbosity’ will change what is read out. Under the "Web" section of the VoiceOver utility, there is a new setting to choose "DOM navigation" as well as the previously available "Group navigation".

One of the unique aspects of VoiceOver compared to traditional Windows screen readers was that you navigated in a two-dimensional manner, able to go up/down and left/right across a page. This had the effect of making some things quicker, but it also made it possible to miss content. For example, sometimes indented text (e.g. bullet points) would be missed because it was treated as a different column.

DOM navigation should eliminate that possibility, as the user goes through the content in source order. However, that does mean that in many cases people will have to go through more items before getting to the content. Therefore within-page navigation mechanisms become more important (see links and headings below).

DOM navigation is the default setting in VoiceOver, although some people may change it back.

In more general terms, VoiceOver no longer wraps around pages or other interfaces by default. When you get to the beginning or end of a page or interface, you get a definite ‘thud’ sound effect, like a door closing.


Source order matters, so the linear order of the HTML should make sense and be consistent.
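One rough way to see what a purely linear reading order looks like is to pull the text out of the markup in source order and read it top to bottom. The sketch below does that with Python's standard `html.parser`. The sample page and the names `SourceOrderText`, `linear_order` and `SAMPLE` are made up for illustration, and this only approximates what a screen reader walking the DOM would meet; it is not how VoiceOver itself works.

```python
from html.parser import HTMLParser


class SourceOrderText(HTMLParser):
    """Collect text chunks in the order they appear in the source."""

    SKIP = {"script", "style"}  # contents of these are not page text

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if text and not self._skip_depth:
            self.chunks.append(text)


def linear_order(html):
    parser = SourceOrderText()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Hypothetical page: the navigation is positioned at the top of the screen
# with CSS, but it comes last in the source.
SAMPLE = """
<div id="content"><h1>Welcome</h1><p>Main text.</p></div>
<ul id="nav" style="position:absolute; top:0"><li><a href="/">Home</a></li></ul>
"""

print(linear_order(SAMPLE))
```

If the list that comes out does not make sense when read in order, the page is unlikely to make sense with DOM navigation either.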

Structural Elements

Previously, VoiceOver basically read out text. That was it. Nothing structural was announced, let alone navigable.

There are now quite a few ‘navigate by structure’ commands, such as jump to the next heading, or the next quote. Unfortunately it seems that only the heading and link commands work.


The lack of special features for structure is especially unfortunate for tables, as it means tables are treated as standard text, read out row by row. There are commands in the manual for skipping to tables, reading out rows and columns, and reading the column heading of a cell. These commands work very well in iTunes (in a brief check), but don’t seem to have been applied to HTML tables in Safari.

So should web developers not use data tables? Of course not. Even apart from data tables being perfectly valid in the spec, this is a bug that’s likely to be fixed, perhaps even overnight in a minor update (in the same way that iTunes just started working).


Headings are now read out. For example, the main heading on this page would be read out as “Heading level 1, VoiceOver 2, a WebDev’s guide” (recording in mp3). This should help users understand the page structure much better, assuming it uses headings appropriately.

There are also good jump commands for skipping between headings, and for skipping to the next/previous heading of the same level. I put together a quick HTML test page for the purpose of testing this (I was offline at the time), and skipping to the next heading of the same level (e.g. 2) was invaluable, as you could skip the subsections.
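To check how a page's headings would come across, it can help to list them in the form they are announced and to see where a same-level jump would land. The following is a minimal sketch using Python's standard `html.parser`. The sample markup and the names `HeadingOutline`, `outline` and `next_same_level` are hypothetical, and the jump is a plain forward scan: how VoiceOver treats an intervening higher-level heading is not something I have established.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for h1-h6 in source order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None  # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            text = " ".join("".join(self._text).split())
            self.headings.append((self._level, text))
            self._level = None


def outline(html):
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.headings


def next_same_level(headings, index):
    """Index of the next heading at the same level as headings[index], or None."""
    level = headings[index][0]
    for i in range(index + 1, len(headings)):
        if headings[i][0] == level:
            return i
    return None


SAMPLE = """
<h1>VoiceOver 2</h1>
<h2>DOM navigation</h2>
<h3>Settings</h3>
<h3>Wrapping</h3>
<h2>Structural elements</h2>
"""

heads = outline(SAMPLE)
for level, text in heads:
    print(f"Heading level {level}, {text}")

# From "DOM navigation" (index 1), jump over its two subsections.
print(heads[next_same_level(heads, 1)])
```

A page whose outline skips levels or uses headings purely for styling will show up quickly in a listing like this.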

The link functions have changed quite a bit: you can still create a list of links (VOu), and you can now skip to visited links as well.

How links are announced has also changed. VoiceOver has options to have the link announced with the word ‘link’, a ping, or a change in pitch. I found it difficult to discern the difference in pitch with the default voice, especially with the pause before and after the link. However, the ‘speak link’ and ‘play tone’ options are very good.

One minor bug (perhaps) seems to be that links are not announced directly when in headings. You can get to them: the heading would be read as “Heading level 3 with two items”, and you can ‘interact’ with (think go down a layer into) the heading, and get to the text and link as though it were a paragraph.

You can also read out the URL of a link, which is very useful as the status bar does not show the currently focused link without using the mouse cursor.

The major bug with links is that within-page links don’t work: although the screen moves, the location of the VoiceOver cursor does not. There is a workaround where you put the mouse cursor within the window, use the link, and put the VoiceOver cursor where the mouse is. I don’t think this is something you can expect a regular user to know or do.

Inline elements

I’m not quite so enthused with how VoiceOver deals with inline elements such as b, abbr etc. It basically stops. So if you add a strong to a sentence, VoiceOver will read out the text up to it, stop, the user presses next, VoiceOver reads the text in the strong, and stops again. (Recording of the last sentence.)

It has the effect of completely breaking up a sentence, which is rarely what was intended by the writer.

Given the excellent new voice, it seems strange that a different intonation or other effect was not used to indicate inline elements.


Fairly straightforward: The alt attribute is read out followed by "image".


You can read out the appearance of any text, e.g. default text is: “16 point helvetica, black on black”. I’m not sure when that would be useful for browsing, although it must be very useful for editing documents.

Block elements

There is a little sound effect for going between block elements. Each time you go right (which often means down through a page), it gives you a little ‘do-di’, and going backwards through the document reverses it to ‘di-do’. However, apart from headings, structural block elements aren’t announced.

Lists seem to have taken a step backwards. In VoiceOver 1, bullets and numbering were read out. But in version 2 a bullet or number is treated as a new line, with no indication that it is a list.


The biggest problem here is that VoiceOver doesn’t understand within-page links. They essentially seem to be broken to the user because nothing happens. Although not that many sites use larger pages with internal links, if you wanted to change something on the screen with AJAX and send the user to the right place to read it, you can’t with VoiceOver.

VoiceOver has none of the caching issues that affect the Windows-based screen readers. However, there is also no automatic announcement of updates, nor any means for a developer to send the user to the updated part of the page.

Title attribute

Hoorah! VoiceOver can read title attributes on almost anything. Termed ‘help tags’, VOh will read out the title on acronyms, abbreviations, and just about any inline element I tried.

Titles can also be read out from block elements like ps and divs. And if there are titles on parent elements, it did what I hoped: read out the one closest to the current element.
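The "closest title wins" behaviour can be modelled as a walk up the element's ancestors. Here is a small sketch of that lookup, again with Python's standard `html.parser`. The class name `NearestTitle`, the `resolved` mapping and the sample markup are all invented for this example, and the sketch assumes well-formed markup with explicit end tags. It illustrates the rule as I observed it, not VoiceOver's actual implementation.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not stay on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class NearestTitle(HTMLParser):
    """For every element with an id, record the closest title attribute."""

    def __init__(self):
        super().__init__()
        self._stack = []    # title (or None) of each currently open element
        self.resolved = {}  # id -> closest title, or None if there is none

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        own = attrs.get("title")
        # Nearest ancestor that carries a title, searching innermost first.
        inherited = next((t for t in reversed(self._stack) if t), None)
        if "id" in attrs:
            self.resolved[attrs["id"]] = own or inherited
        if tag not in VOID:
            self._stack.append(own)

    def handle_endtag(self, tag):
        if tag not in VOID and self._stack:
            self._stack.pop()


SAMPLE = """
<div title="Contact details">
  <p id="intro">Call us.</p>
  <p id="hours" title="Opening hours">Open
    <abbr id="days" title="Monday to Friday">Mon-Fri</abbr>.</p>
</div>
<p id="footer">No title here.</p>
"""

parser = NearestTitle()
parser.feed(SAMPLE)
parser.close()
for element_id, title in parser.resolved.items():
    print(element_id, "->", title)
```

In this sample the first paragraph picks up the title from its parent div, while the second paragraph and the abbr use their own.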

If only you could tell when something has a ‘help tag’, as there doesn’t seem to be any way to get them announced.


VoiceOver users aren’t going to notice abuses of the title attribute, but without other uses in the OS I can’t see it being used much.


Forms behaviour in VoiceOver definitely favours better marked up forms, as text inputs, radio buttons and check boxes all use a properly attached label, rather than assuming nearby text is probably the label. Other elements (textboxes and selects) do not announce the label, presumably because they generally have sufficient explanation either within or around the element.

Given that there is no ‘forms mode’ of the kind that the Windows-based screen readers tend to use, not using labels on these elements will probably work reasonably in most situations, especially when the explanation is read out first. For other screen readers, using the forms mode means you can lose the context around the form elements, which isn’t the case here.
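Since text inputs, radio buttons and check boxes rely on a properly attached label, a quick check for controls without one can be useful. The sketch below looks only for explicit `for`/`id` pairings; it does not consider inputs wrapped inside a label element, which I have not tested. The names `LabelCheck` and `unlabelled_controls` and the sample form are hypothetical.

```python
from html.parser import HTMLParser

# Input types that, per the observations above, depend on an attached label.
LABELLED_TYPES = {"text", "radio", "checkbox"}


class LabelCheck(HTMLParser):
    def __init__(self):
        super().__init__()
        self.label_targets = set()  # values of <label for="...">
        self.controls = []          # (type, id or None) in source order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label" and attrs.get("for"):
            self.label_targets.add(attrs["for"])
        elif tag == "input":
            # An input with no type attribute is a text input.
            input_type = (attrs.get("type") or "text").lower()
            if input_type in LABELLED_TYPES:
                self.controls.append((input_type, attrs.get("id")))


def unlabelled_controls(html):
    """Return (type, id) for each checked input with no matching label."""
    parser = LabelCheck()
    parser.feed(html)
    parser.close()
    return [c for c in parser.controls if c[1] not in parser.label_targets]


SAMPLE = """
<form>
  <label for="name">Name</label> <input type="text" id="name">
  Email <input type="text" id="email">
  <input type="checkbox" id="news"> <label for="news">Newsletter</label>
  <input type="radio" name="fmt">
</form>
"""

print(unlabelled_controls(SAMPLE))
```

Here the email field has nearby text but no label, and the radio button has no id for a label to point at, so both are reported.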

There is one fairly major bug that was identified on the MacVisionaries list (I think by Rich Caloggero) where a multiple select is simply ignored by VoiceOver, and just completely skipped over. This was introduced in the Safari 3 beta, and has remained since (hat-tip to David Poehlman).

Accessibility oriented attributes

tabindex and accesskeys are generally not needed or usable, but it’s worth knowing what different technologies do.

Tabindex in VoiceOver works as it does when using Safari in general: anything with tabindex is first in the tabbing order, and other links and form controls come afterwards in source order. Unfortunately from an ARIA point of view, values of 0 and -1 on non-links don’t do anything.
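The ordering described above can be sketched as a two-part sort. This is a simplified model under several assumptions of mine: only links with an href and form controls are considered, positive tabindex values are taken in ascending order as the HTML specification describes, and a negative tabindex drops a control from the order. The names `TabStops` and `tab_order` and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser

FORM_CONTROLS = {"input", "select", "textarea", "button"}


class TabStops(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stops = []  # (tabindex, source position, label)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        is_link = tag == "a" and "href" in attrs
        if not (is_link or tag in FORM_CONTROLS):
            return  # tabindex on other elements was observed to do nothing
        try:
            tabindex = int(attrs.get("tabindex") or 0)
        except ValueError:
            tabindex = 0  # treat an unparsable value as if it were absent
        label = attrs.get("id") or tag
        self.stops.append((tabindex, len(self.stops), label))


def tab_order(html):
    parser = TabStops()
    parser.feed(html)
    parser.close()
    # Positive tabindex first (ascending, ties in source order) ...
    positive = sorted(s for s in parser.stops if s[0] > 0)
    # ... then everything without one, in source order.
    natural = [s for s in parser.stops if s[0] == 0]
    return [label for _, _, label in positive + natural]


SAMPLE = """
<a href="#content" id="skip">Skip</a>
<input type="text" id="search" tabindex="2">
<a href="/about" id="about">About</a>
<div id="panel" tabindex="0">Panel</div>
<input type="submit" id="go" tabindex="1">
"""

print(tab_order(SAMPLE))
```

The skip link comes first in the source but third in the tabbing order, which is the kind of surprise that makes tabindex confusing for the people who do rely on it.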

Accesskeys work with the ctrl key, and generally work except for clashes. For example, I started off testing with "1" like the UK government guidelines suggest, but that activated ‘spaces’ rather than the accesskey.


Again, if you are using AJAX type updates in the page it is difficult to help screen reader users. Use of tabindex is likely to be confusing if people do use it, and unnoticed if they don’t.

Accesskeys will work in general, but aren’t announced and are likely to be unused without going to extraordinary lengths to draw people’s attention to them.

Flash content

You wouldn’t know it existed.


The new version of VoiceOver has many improvements from a web browsing point of view, albeit with a few bugs and omissions remaining. The headings and DOM navigation are likely to have the biggest impact on people’s use and the general usability of pages when using VoiceOver.

The main aspects I would like to see improved are making within-page links work as you would expect, and making the table functions as good in Safari as they are in iTunes. There will also be a lot of content locked away behind Flash, even accessible Flash.

5 contributions to “Voiceover 2 – a WebDev’s guide”

  1. I put a link to this on ATMac so hopefully people looking for advice for content production and OS X accessibility will spot it more easily. And yes, I’m waaaay behind in reading MacVisionaries which is why I just found out that you had this here!

  2. When I read a whole page with VO-a, it reads each header twice. But, it reads properly when I cursor to a header. Have others seen this problem too?

  3. I’d ask on the MacVisionaries list. I haven’t found that, but I tend to arrow around anyway. (When I use VO, I’m not VI so it’s usually for testing things.)

  4. Missed this article when it was originally posted, but regarding the section on inline elements, try launching VoiceOver Utility app, and under Web settings, select “Group navigation” instead of “DOM navigation.” You may like that better.
