Maze Day 2008

About once a year, K-12 kids with disabilities from all over North Carolina (and beyond) travel to UNC-Chapel Hill to take part in Maze Day. Throughout the day, the kids, their teachers, and their parents wander Sitterson Hall to try out the numerous games, applications, activities, and demos designed to help them learn and have fun at the same time.

The kids always have a blast, and this year was no exception. Some of the 21 projects demoed this year included the following:

  • Carolina…
http://mindtrove.info/maze-day-2008/

posted : Wednesday, April 30th, 2008

tags : mindtrove_info from_feed

Blog moving

My defense is over (passed), it’s a Friday, and I now have time to do things like, say, finally set up WordPress on my own domain.

If you actually care to read everything I put on my blog (i.e., accessibility), update your reader. (Certainly the posts will be greater in quantity and higher in quality now that my brain isn’t stuck in the thesis-writing loop, right?) If you only care about my occasional GNOME-specific ramblings, fret not. I’m configuring my tumblr account to automatically import new WP posts tagged with “gnome” so they continue to show up here, and hence on Planet.

posted : Friday, April 25th, 2008

tags :

Validate your accessibility

Eitan committed a new plug-in for Accerciser that makes it dirt simple to find basic accessibility problems. You know, the ones that cause grief for apps like Orca, GOK, On-Board, etc. To use it, run Accerciser, point it at part of a GUI, click validate, and wait for the report.

The rules in the plug-in aren’t the greatest right now. But the plug-in is extensible with new rule sets called schemas. For instance, you could have a “Desktop” schema to check basic GUI problems, a “Web” schema to test document accessibility, and an “Orca” schema to check a program’s fitness for Orca scripting. The sky’s the limit, and I’m sure Eitan, Will, and company will come up with quite a few useful tests.

To ward off any fear brought on by the word “schema,” I should note that they’re really just Python modules with simple, three-method classes in them. For example:

# Assumed imports; the exact module paths depend on your Accerciser setup
from pyatspi import STATE_FOCUSABLE
from validate import Validator

class CheckFocusable(Validator):
  def condition(self, acc):
    # only test accessibles that have the action interface
    return acc.queryAction()

  def after(self, acc, state, view):
    # check an accessible after checking its descendants
    # acc is the accessible
    # state is a dictionary of whatever you need to store across tests
    # view logs errors, warnings, etc.
    pass

  def before(self, acc, state, view):
    # check an accessible before checking its descendants
    s = acc.getState()
    if not s.contains(STATE_FOCUSABLE):
      view.error('actionable widget is not focusable')
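
To make the roles of the three methods concrete, here is a rough, self-contained sketch of how a validation pass might call them while walking a widget tree. None of it is Accerciser code: the Validator base class, Node, ListView, walk, and run_demo are hypothetical stand-ins invented for illustration, and the real plug-in may order or guard these calls differently.

```python
class Validator:
    """Stand-in base class (not Accerciser's); every hook is optional."""
    def condition(self, acc):
        return True

    def before(self, acc, state, view):
        pass

    def after(self, acc, state, view):
        pass


class Node:
    """Hypothetical stand-in for an accessible: a name, a flag, children."""
    def __init__(self, name, focusable=True, children=()):
        self.name = name
        self.focusable = focusable
        self.children = list(children)


class ListView:
    """Collects error messages in a list instead of showing a report."""
    def __init__(self):
        self.errors = []

    def error(self, text):
        self.errors.append(text)


class CheckFocusableSketch(Validator):
    def before(self, acc, state, view):
        # runs before the node's descendants are visited
        if not acc.focusable:
            view.error('%s is not focusable' % acc.name)

    def after(self, acc, state, view):
        # runs after the descendants; state persists across calls
        state['visited'] = state.get('visited', 0) + 1


def walk(acc, validators, state, view):
    """Depth-first walk: before() on the way down, after() on the way up."""
    active = [v for v in validators if v.condition(acc)]
    for v in active:
        v.before(acc, state, view)
    for child in acc.children:
        walk(child, validators, state, view)
    for v in active:
        v.after(acc, state, view)


def run_demo():
    tree = Node('window', children=[
        Node('ok button'),
        Node('toolbar', children=[Node('icon', focusable=False)]),
    ])
    state, view = {}, ListView()
    walk(tree, [CheckFocusableSketch()], state, view)
    return state, view.errors
```

Calling run_demo() walks a four-node tree, reports the one unfocusable node, and leaves a visit count in the shared state dictionary.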

No more excuses for inaccessible apps now, right? :)

posted : Wednesday, January 16th, 2008

tags :

Spatial PulseAudio

In his interview about PulseAudio in Fedora 8, Lennart Poettering mentions support for spatial sound as one of his future goals:

Spatial event sounds: click on a button on the left side of your screen, and the event sound comes out of your left speaker. Click on one on the right side of your screen, and the event sound comes out of the right speaker. It’s earcandy, but I think this could actually be quite useful, but only if we get better quality event sounds, than we have right now.

While spatialized event sounds may be “earcandy” as Lennart admits, there are other benefits of using 3D audio over mono sounds in certain applications. One interesting use concerns the separation of concurrent sound streams such that a user can distinguish and “pick out” one of many. The theory of auditory scene analysis (Bregman, 1990) says (among many other things) that we humans can better segregate different sound sources and select one for attentive processing if the acoustic and semantic properties of streams from distinct sources differ along certain dimensions while certain properties within a stream remain constant over time.

For instance, say I make two audio recordings of the same person speaking two different utterances. In one recording, the person says “What a lovely bunch of coconuts.” In the other, “That dog certainly has fleas.” If I mix these two recordings into a mono track with the two utterances starting at exactly the same time, you will have a hard time determining and understanding the two independent phrases. If I create a stereo sound, with one phrase played in the left speaker and another in the right, you’ll have a much easier time identifying the original phrases. (But you’ll likely have to listen to the sound more than once before you can repeat both phrases: another tenet of auditory scene analysis.) Better still, if I apply a head-related transfer function (HRTF) to each recording such that the two utterances appear to come from the left and right sides of your head in a 3D space, your task becomes even easier.
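
The difference between the mono mix and the stereo version can be sketched in a few lines of Python. This is plain constant-power amplitude panning, not an HRTF, and it has nothing to do with PulseAudio's API; the function names (pan_gains, mix_mono, mix_stereo, tone) and the sine tones standing in for the two recordings are all made up for illustration.

```python
import math


def pan_gains(pan):
    """Constant-power (left, right) gains for pan in [-1.0, +1.0]."""
    angle = (pan + 1.0) * math.pi / 4.0  # maps pan to 0 .. pi/2
    return math.cos(angle), math.sin(angle)


def mix_mono(a, b):
    """Average two equal-length sample lists into one channel."""
    return [0.5 * (x + y) for x, y in zip(a, b)]


def mix_stereo(a, pan_a, b, pan_b):
    """Place each source at its pan position; returns (left, right) pairs."""
    la, ra = pan_gains(pan_a)
    lb, rb = pan_gains(pan_b)
    return [(la * x + lb * y, ra * x + rb * y) for x, y in zip(a, b)]


def tone(freq, n=8, rate=8000):
    """A short sine burst standing in for a recorded utterance."""
    return [math.sin(2 * math.pi * freq * i / rate) for i in range(n)]


coconuts = tone(440.0)
fleas = tone(660.0)

# Both sources collapse into one channel and overlap completely.
mono = mix_mono(coconuts, fleas)

# Hard left and hard right: each channel carries (almost) only one source.
stereo = mix_stereo(coconuts, -1.0, fleas, 1.0)
```

In the mono list every sample is a blend of both sources, while in the stereo list the left value tracks the first source and the right value tracks the second, which is the separation the ear exploits.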

In other words, spatialized sounds aid segregation and selection of independent streams of speech and sound. In fact, research (McGookin, 2004) suggests that spatialization alone is sufficient to aid recognition of information encoded in properties of concurrent musical sounds (earcons).

Applying the concept of distinguishable sound streams to screen readers is an interesting endeavor. (Or, at least, I think so.) Screen readers currently rely on a single, serial stream of speech and sound to describe the multitasking, high-bandwidth graphical desktop. In a single-stream design, reports of peripheral information outside the application focus are either nonexistent, delayed, or delivered as interruptions, and are easily missed. For instance, if a screen reader is busy reading an email when the user receives an instant message in another application, the screen reader has to decide whether to keep reading the email or announce the new message in some manner. If the screen reader interjects, the user might confuse the instant message content with that of the email or become annoyed with the interruption. If the screen reader decides to wait for the email reading to finish, the late announcement about the chat message runs the risk of being stale.

Worse yet, any single-stream announcement of the new message can be inadvertently interrupted at any time by the next user command. In such a situation, unless the user tabs around looking for the new instant message or the chat program is set to play a sound every time a message is received (which still doesn’t indicate which of potentially many chats has the new message), the user may never learn of the existence of the new message.

Concurrent streams provide an answer to this peripheral awareness problem, but only if the screen reader can present them in a way that avoids masking other simultaneous streams. And this is exactly where the ability to spatialize sound helps. Without interrupting or modifying the stream of speech reading the email, another stream can pipe up and announce the new chat message with a sound, speech, or both according to the verbosity settings of the user (Zing! or “Message from Harvey” or “Harvey says ‘Hey! Stop reading your email and answer me! This is important!’”). As long as these streams are spatially separated according to some simple rules, the user will be able to effectively distinguish them, ignore one, listen to one, and switch attention back and forth between them.
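
As a purely illustrative guess at what one such "simple rule" could look like, the sketch below keeps the focused stream in the center and fans the peripheral streams out alternately to the left and right. The function name assign_positions, the spread parameter, and the stream names are all hypothetical; no existing screen reader is claimed to work this way.

```python
def assign_positions(streams, focus, spread=0.8):
    """Map stream names to pan positions in [-1.0, 1.0].

    The focused stream sits at the center (0.0). The remaining streams
    alternate left and right, spaced evenly out to +/- spread.
    """
    others = [name for name in streams if name != focus]
    positions = {focus: 0.0}
    per_side = (len(others) + 1) // 2  # slots needed on the fuller side
    for i, name in enumerate(others):
        side = -1.0 if i % 2 == 0 else 1.0  # even index left, odd right
        rank = i // 2 + 1                   # 1 = closest to the center
        positions[name] = side * spread * rank / per_side
    return positions


layout = assign_positions(['email', 'chat', 'clock'], focus='email')
```

With these three streams, the email reading stays in the center, the chat announcements come from the left, and the clock from the right. Whether a layout this naive is actually usable is exactly the kind of design question that needs testing with real users.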

Instant messaging is just one example of a modern desktop application that begs for concurrent streams in screen readers. Just from looking at my GNOME desktop I see a system monitor, the clock applet, my network status, a popup balloon, and a log monitor all updating in the background while I write this post. Of course, a user can’t cope with all of these event sources reporting at once. But that’s where the interesting design problems start: how do we construct a usable multi-stream auditory display?

An open source library supporting spatial sound is a fundamental building block for this investigation (and, I’m certain, others). I hope Lennart pursues it.

posted : Tuesday, January 1st, 2008

tags :

posted : Thursday, December 20th, 2007

tags :

Android speech synth (where are you?)

I took a peek at the Google Android class hierarchy today. As far as UI goes, it looks like there’s great support for 2D/3D visuals. There are some APIs for doing MIDI and sampled sound output. There’s even a class for doing speech recognition.

What I don’t see is anything supporting synthesized speech output. That’s a bit depressing. It would be a huge boon to have an open environment for developing mobile audio apps. Talking cell phones can be a bit pricey because they’re primarily intended as assistive technologies (i.e., small market). But I can imagine a ton of applications with speech-displays that could be useful to sighted and blind users alike: listening to your email while you walk instead of reading it on a tiny screen, announcements about upcoming meetings in your calendar, voice-jockey-like naming of songs about to come up on your MP3 playlist, spoken caller ID, …

Perhaps it’s possible to add custom classes to support FreeTTS or some other Java-accessible engine. However, it would be much nicer to have the speech API in the platform itself so it’s available everywhere. Maybe they left it out because all the free engines are too resource-hungry? Somehow, I can’t imagine something like eSpeak being too bulky for a mobile platform.

posted : Tuesday, December 18th, 2007

tags :

posted : Thursday, November 8th, 2007

tags :

posted : Friday, August 10th, 2007

tags :

“ Mitt Romney — who recently faced questions about his common sense for strapping his dog in its carrier to the top of his car during a 12-hour drive, causing the animal to defecate over his windshield — said the format is beneath his dignity.

posted : Monday, July 30th, 2007

tags :

FC7 build

Pleasantly surprised to see that LSR built as an RPM, installed, and ran without a hitch on FC7. The new default Festival voice is interesting.

Updated the “LSR in retrospect” document. I recently realized I never uploaded the final draft to the website.

posted : Wednesday, July 11th, 2007

tags :