Introducing Opened Captions

I made something awesome last week: Opened Captions.

At face value it just looks like a live feed of C-SPAN’s Closed Captions. This alone is actually pretty cool if you think about it, especially if you are a deaf political junkie who sits far away from the TV and can’t read the closed captions.

Of course there is more. The real excitement comes when you contemplate what’s happening to get those words to appear on your screen.

This system unlocks and syndicates a real-time dataset that used to be a pain in the ass to access. Now anyone can build applications and visualizations that update before those crafty politicians have even finished making their points. This post explains why Opened Captions is worth hacking with, what it takes to use it, and how it works.

What is it Good For?

The Internet is filled with real-time updates triggered by online activity, but it still feels like magic when we see automatic updates driven by the real world. Opened Captions makes it easy for programmers to use live TV transcripts as an input.

Note: version .001 only supports a single channel (and my server is pointed to C-SPAN). Eventually the protocol should expand to allow multiple channels.

Let’s consider C-SPAN. If a computer knows what is being said on C-SPAN this very second, it can do things like:

  • Change the background of your email client to reflect the issues being debated right this moment on the Senate floor.
  • Generate modified, more amusing transcripts by replacing key words and phrases with Tolkien lore (i.e. C-SPAN for Middle-earth).
  • Search through lyrics and generate a C-SPAN medley for you to rock out to while voting.
  • Send SMS messages 24/7 commanding you to “drink” when certain phrases are spoken on air.

There are also possibilities that aren’t ridiculous. For instance, you could make tools that…

  • Improve the transcript by automatically adding contextual information, such as definitions and histories lifted from Wikipedia.
  • Send emails with transcript snippets whenever a specific representative or state is being discussed on TV so you know what’s going on.
  • Parse out paraphrases of known fact checks and insert a credibility layer over the transcript feed (real time fact-checking).
  • Draw parallels between what is being said on TV and what is being said on Twitter.

I could go on and on and on. There is just so much potential!
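To make one of the ideas above concrete, here is a minimal sketch of the "tell me when a representative or state is mentioned" tool. It assumes the captions arrive as a sequence of individual words; that shape, the function name watch_captions, and the context parameter are all invented for illustration and are not part of the Opened Captions code or protocol. It only matches single-word terms, and a real tool would swap the print for an email or SMS call.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple

# Punctuation that captions tend to glue onto words.
PUNCTUATION = ".,;:!?\"'()"


def watch_captions(
    words: Iterable[str],
    watch_terms: Iterable[str],
    context: int = 8,
) -> Iterator[Tuple[str, str]]:
    """Yield (term, snippet) each time a watched single-word term appears.

    `words` is a hypothetical stand-in for the live feed: any iterable of
    caption words. The snippet is the last `context` words, ending with
    the word that matched.
    """
    terms = {term.lower() for term in watch_terms}
    recent = deque(maxlen=context)  # rolling window of the latest words
    for word in words:
        recent.append(word)
        cleaned = word.strip(PUNCTUATION).lower()
        if cleaned in terms:
            yield cleaned, " ".join(recent)


# Stand-in for a live feed: a fixed list of words instead of a network stream.
sample = "THE SENATOR FROM OHIO YIELDS THE FLOOR.".split()
for term, snippet in watch_captions(sample, ["Ohio"], context=3):
    print(f"[{term}] {snippet}")
```

Because the function is a generator, it works the same way on an endless stream as on the short sample list: it reports each match as soon as the word arrives.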

The Backend

Behind the stream is a first stab at a distributed architecture for Closed Captioning live feeds. Opened Captions servers can pull a CC stream over a serial port, or (more likely) they will connect to an existing Opened Captions server and pull the stream from there. In plain terms, anybody can set up a server that does exactly what mine is doing, even if they don’t have access to hardware, software, or a live TV stream.

When I say exactly, I mean it — your new project runs the same code as mine, and will serve the feed too. People can connect their servers to yours in the same way you connected yours to mine. Practically speaking this architecture means a few things:

  1. Once your amazing mashup gets popular it won’t break my server. Your application is syndicating the captions to your users. I serve the captions to you, you serve them to the world!
  2. Your server creates a fork of my stream. Want to modify the text so the politicians sound drunk? Add extra layers of information to the message payload? Translate the captions to Klingon? Go for it. If your tweaks happen server side then others can build their apps from your stream to modify it further.
  3. You don’t have to rely on anyone else for the Closed Captions. If you want to spend some extra time setting up your own scraper you can point your server to that source instead of a third party. You have total control.
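The three points above can be modeled in a few lines. This is a toy, in-process sketch of the relay idea only, not the actual Opened Captions server or its wire protocol: the Relay class and its subscribe, publish, and follow methods are hypothetical names, and plain function calls stand in for network connections.

```python
from typing import Callable, List, Optional


class Relay:
    """A node that receives caption text, optionally changes it, and re-serves it."""

    def __init__(self, transform: Optional[Callable[[str], str]] = None) -> None:
        # With no transform, the relay passes text through untouched.
        self.transform = transform or (lambda text: text)
        self.subscribers: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        """Register a downstream consumer (an app, or another relay)."""
        self.subscribers.append(callback)

    def publish(self, text: str) -> None:
        """Apply this relay's transform, then fan the result out downstream."""
        out = self.transform(text)
        for callback in self.subscribers:
            callback(out)

    def follow(self, upstream: "Relay") -> None:
        """Connect to another relay the same way a client would."""
        upstream.subscribe(self.publish)


# "mine" plays the original server; "yours" forks the stream and edits it.
mine = Relay()
yours = Relay(transform=lambda text: text.lower())
yours.follow(mine)

# An app built on the fork only ever talks to "yours".
yours.subscribe(lambda text: print("app received:", text))

mine.publish("THE CHAIR RECOGNIZES THE GENTLEMAN")
```

The original relay has exactly one subscriber no matter how many apps hang off the fork, which is point 1. The transform is where a server-side tweak from point 2 would live. For point 3, nothing stops you from calling publish on your own relay from your own caption source.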

Check ‘Em

Wondering if this is worth your time? Well, it doesn’t require much of it. The service takes about two minutes to set up if you already have Node.js and Git installed on your computer. Here’s a video to prove it:

Installation instructions can be found in the readme, and you can always get in contact with me through the blog or on Twitter.

About Dan

Dan's just this guy, you know?


  • Andy Foster

    Wow this is awesome. I’m going to check it out. Also this could be of great benefit to organizations like NAD, etc. Thanks!

  • http://twitter.com/nivertech Zvi

    I think the server is down

    • http://www.slifty.com Daniel Schultz

      The stream is down — getting it back up.

  • anon

    Note that “open captions” – according to WP – refers to burnt-in captions, so calling them “opened captions” is quite confusing.

    • http://www.slifty.com Daniel Schultz

      Noted! I looked into it when brainstorming a name and made a judgement call; may have been the wrong one, but I’d need a good replacement…

  • Paul the Caffeinated

    A lot of captions/subtitles I’ve seen for live broadcasts, news, etc. are very inaccurate.

    What’s the quality of these captions like (they’re the source for all your data, I guess)?
    I guess a crowdsourced/social quality improvement layer could help.

    • http://www.slifty.com Daniel Schultz

      These are coming from the Closed Caption feed, i.e. from the television. We have noticed plenty of inaccuracies (especially in the names of foreign leaders), but for the most part it is accurate enough to do useful things. Most applications still work great even with 95% accuracy. Down the line, things like an automated spell check would be nice; you could easily make an OC relay that autocorrected the stream.

      Unfortunately I don’t see the crowd as a scalable option, simply because we aren’t going to have people who watch these feeds all the time and are willing to update them. Plus, by the time the corrections came in, the “real time” component would be over.

  • Henry Foulds

    This is an awesome idea. It’d be great if you could have a captioner producing their own captions in an Opened Captions console, and they can be picked up on the website. I’m a captioner, and I would pay to use it.

    • Herve

      If I understand correctly, you are proposing to involve the whole community in doing closed captions? That would be great! (Even if it would probably generate a lot of mess.)

      • Henry Foulds

        I don’t mean that. What I mean is: provide a way for a captioner who is captioning an event, not a TV broadcast, to send their text over the internet so it can be viewed on an Opened Captions website.

  • Herve

    Hi,

    it looks like a very interesting tool/library. However, I am new to this field and there are various things that I do not understand. I downloaded Opened Captions and installed it locally. It works directly, out of the box! Great! I can launch the server using the command "node app", and then I can view the page on localhost:3000.

    However, what I don’t understand is how to connect this server to a TV + caption stream, e.g. how to connect it to C-SPAN. More specifically, how do I get all the closed caption info from that channel?

    Other questions: do you know of other channels that I can get with your tools? For example, I am interested in European channels. Would it work? (Perhaps I would first need a special box?)
    Any ideas? Hints?

    Thank you for your help and happy new year,
    Herve.