Thursday, November 19, 2009

Automatic captions on YouTube



UPDATE 11/24/09: Full-length video of our announcement event in Washington, D.C. has been posted to YouTube and is embedded at the bottom of this blog post. We have included English captions using our new auto-timing feature. Enjoy.

Today, here in D.C., we announced the preliminary roll-out of automatic captioning in YouTube, an innovation that takes advantage of our speech recognition technology to turn the spoken word into text captions. We also announced that if you have a transcript of your video, you can upload it to YouTube and we'll time the captions for you.

This is useful for anyone who is deaf or hearing impaired, but it will have broader effects as well. For example, YouTube captions can be automatically translated, making video more accessible across languages. And while we've had the ability to manually caption videos for a while, automatic captions and automatically timed transcripts lower the barriers and, we hope, help open YouTube to everyone.

Indeed, with 20 hours of video uploaded to YouTube every minute, captioning YouTube through purely manual means would be very difficult. That's why we're excited about today's announcement. Please note that only 13 YouTube channels will feature automatic captions at this time so that we can gather feedback, but all video owners will be able to upload transcripts and automatically time them. Ken Harrenstien, the software engineer who led this project, describes today's announcements in more detail on the Official Google Blog.

Visit our Picasa photo album for more pictures from the event.

This morning's introductions were also exciting because over 60 accessibility leaders from the National Association of the Deaf, Gallaudet University, AAPD and other organizations joined us to be the first to learn about these new features. We made the announcement in our Washington office, in fact, just so that they could be here to give our engineers their direct feedback.

Have a look at the video below to learn more about what was announced today, and check back here tomorrow for full video from the event. You can bet it'll be captioned—we'll be uploading the transcript of the event to YouTube, which will turn it into captions that are timed just right.





5 comments:

mrburns_saranga said...

Excellent! What progress!

visit http://www.saranga.com.ar/

foggy said...

Bless them, and keep your fingers crossed that their soon-to-be-put-into-use Google phone numbers and system will also keep the deaf, and the blind, close to the forefront of their collective mind. So many of us miss the relay of old, and yet what Google will be unveiling will be even better because hearing people will be able to phone us and their messages will be converted to print!

Rob Mitchell said...

Fabulous work, Google. It's great to see automatic first-pass voice recognition applied to videos uploaded to YouTube. Although this is being launched under the accessibility umbrella, the marketing industry is just waiting for the option to tap into all that rich, context-based, searchable text locked up in the video a user just uploaded.

Time will tell who really benefits from this: the end user, the marketing folks out there, Google, or all of them!

In any case great work with this new launch.

If you are interested in seeing who else is pioneering in this area check out ramp.com (aka everyzing, aka podzinger).

Rob

Check out my blog: Web 3.0
www.gorobmitchell.com

Carina Novarese said...

Great! But which channels can use automatic translation? And will this feature be available to everybody? Thanks!

Tony said...

Fantastic!

This will allow communication between people in countries that do not speak the same language! A great way to get to know other people and cultures.

Now I'll be able to understand my friends' German and Spanish videos, and all the other languages I'd never be able to learn on my own! I know translators aren't 100% effective yet, but they work well enough to understand the gist of what's being communicated. It's the speech-to-text part that's tricky to program. But now there will be a good reason to improve that technology too. If people upload the text scripts, it should work out pretty well!

Thanks Google! :)