TENET is a new Christopher Nolan film that looks like a heist movie mixed with time-travel shenanigans. I am a sucker for almost literally anything involving time travel or displacement. Back to the Future, Primer, Avengers: Endgame, Interstellar, Terminator movies, Palm Springs, Hot Tub Time Machine... I will seriously scrape the bottom of that barrel.
Unfortunately, given the state of things with the global COVID-19 pandemic, and because it's not being released directly to video-on-demand services, I have no idea when I'm going to see it. So I'll do the next best thing and take a really shallow, really quick dive into identifying the tools they used to make their promotional website at tenetfilm.com.
Movie Promotions and Alternate Reality Games
Movie promotions that try to create immersive, interactive games in the real world have always intrigued me. Uncovering easter eggs, chasing down clues via real-world phone numbers or websites for fake companies and products that only exist in the game, or even connecting with MySpace profiles for the characters a year before launch. I am old enough (just barely!) to remember "The Beast" as part of the marketing for the film A.I., but I only remember hearing about it and being intrigued by the idea of "alternate reality" games that bleed over into real life in interesting ways.
The TENET Site
The TENET site isn't that, but it reminded me of those sorts of things. At its core the site lets you speak two different words into your computer or phone. Which two? You'll have to find out and infer from the trailer! Or get a dictionary and spend half the day talking to a website. Or spend 1 minute looking this stuff up on Reddit.
Depending on the word you say it'll do one of two things: show you a trailer you've probably already seen, or let you download a slightly boring gif.
You can speak words into it and it'll show you different things
How does it work? I was curious and took a peek under the hood. It looks like it was built on TensorFlow.js and a custom model they trained to recognize two speech commands. Actually, it's technically trained to recognize three things, if you count background noise:
You can see these word labels in the metadata JSON file here:
Links and configurations for the specific models look to be here:
TensorFlow Speech Command Recognizer
Most of the code looks like it's based on the TensorFlow Speech Command Recognizer demo, with the (non-trivial!) exception that they're using models they've trained themselves. The stock models you can use have been trained on directional commands, the digits zero through nine and a few other command-style words; nothing like "tenet" or "twilight" in there.
You can try the demo showcasing the default speech models in action.
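As far as I can tell, a recognizer like this hands back one score per label each time it hears something, and it's up to the page to decide what counts as a match. Here's a minimal sketch of that decision step, not the site's actual code. The label names, the 0.9 threshold and the `pickCommand` helper are all my own assumptions for illustration.

```typescript
// Assumed labels for illustration; the real ones live in the site's metadata JSON.
const LABELS: string[] = ["_background_noise_", "tenet", "twilight"];
const NOISE_LABEL = "_background_noise_";

interface Match {
  label: string;
  score: number;
}

// Return the highest-scoring label, or null when the winner is background
// noise or does not clear the confidence threshold.
function pickCommand(
  scores: ArrayLike<number>,
  labels: string[] = LABELS,
  threshold = 0.9, // assumed cutoff, tune to taste
): Match | null {
  if (scores.length !== labels.length) {
    throw new Error("expected exactly one score per label");
  }
  if (scores.length === 0) return null;

  // Plain argmax over the scores.
  let best = 0;
  for (let i = 1; i < scores.length; i++) {
    if (scores[i] > scores[best]) best = i;
  }

  if (labels[best] === NOISE_LABEL || scores[best] < threshold) return null;
  return { label: labels[best], score: scores[best] };
}
```

Treating background noise as "no match" is what keeps the page from reacting to every cough and keyboard clack.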
Training your own models is not for the faint of heart, or those with end-of-day deadlines! There is good documentation here on generating models to recognize a corpus of about 10 different words and all the work involved:
It'll depend on your machine, but you can expect the training to take up to several hours.
Missinglink.ai has a great tutorial that puts most of these steps and code in one place for you:
Kudos to the person who spent the time creating a custom model for this site! That is some above-and-beyond commitment to make a movie website and a nice way to get TensorFlow.js and machine-learning experience on your CV :)
Other Speech-Recognition Approaches
I think the approach used for the TENET website is great and probably the "correct" one, even if it requires a modern device & browser and is pretty CPU intensive.
While I was at Recurse Center this past winter I collaborated on a voice-directed game we called Talky-Blocky. In our game you could talk to a little block on the screen, tell it to move around a grid and change the underlying grid item to basically any color, provided it existed somewhere in the HTML color names list.
My personal favorite was telling Talky-Blocky to change a block to chartreuse, salmon or tomato. When we presented it during a 10-15 minute demo session I think those kinds of instructions caught people off guard and created the impression it might've been a little more sophisticated than it was.
The truth is, it was kind of a parlor trick! We were using the Speech Recognition API, which offloads all the heavy-lifting speech recognition to Google's servers somewhere, since Chrome and Chromium-based browsers are basically the only ones supporting this feature for now. Once you have that in place it's just a matter of listening for particular keywords, ignoring all the other words and processing the matches in sequence. It allowed us to get a little conversational with Talky-Blocky, which adds to the illusion.
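The "listen for a few words, ignore the rest" step is simpler than it sounds. Here's a rough sketch of the idea, assuming the browser has already handed you a transcript string. This is not the actual Talky-Blocky code: the `parseTranscript` helper and the tiny vocabularies are made up for illustration.

```typescript
type Command =
  | { kind: "move"; direction: string }
  | { kind: "paint"; color: string };

// Hypothetical vocabularies; the real game accepted any HTML color name.
const DIRECTIONS: string[] = ["up", "down", "left", "right"];
const COLORS: string[] = ["chartreuse", "salmon", "tomato"];

// Scan the transcript word by word, keep only the words we recognize,
// and emit the matching commands in the order they were spoken.
function parseTranscript(transcript: string): Command[] {
  const words: string[] = transcript.toLowerCase().match(/[a-z]+/g) || [];
  const commands: Command[] = [];
  for (const word of words) {
    if (DIRECTIONS.indexOf(word) !== -1) {
      commands.push({ kind: "move", direction: word });
    } else if (COLORS.indexOf(word) !== -1) {
      commands.push({ kind: "paint", color: word });
    }
  }
  return commands;
}
```

Because everything outside the vocabulary is dropped, "Hey, could you go left and then make it tomato?" and "left tomato" produce the same commands, which is most of the conversational illusion.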
You can try the demo for Talky-Blocky here:
I'd like to explore swapping that out with Mozilla Voice STT. Some day.
If I'd been in charge of building out the TENET site, I might've explored a Speech Recognition API-first approach, with TensorFlow as a fallback. The TensorFlow approach is much cooler though, works offline (all the processing happens in the browser) and is a little more privacy-friendly in that regard, since your audio never touches a server.
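For what it's worth, the "try one, fall back to the other" decision could be a small feature check. This is a sketch of how I might do it, not anything the TENET site does. `chooseEngine` and the `SpeechGlobals` shape are hypothetical names, and I'm assuming the constructor is exposed as either `SpeechRecognition` or the prefixed `webkitSpeechRecognition`.

```typescript
// Minimal shape of the globals we probe. In a real page you would pass
// `window` here (cast to this type if the compiler complains).
interface SpeechGlobals {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
}

type Engine = "web-speech" | "tensorflow";

// Prefer the browser's built-in recognizer when a constructor is present;
// otherwise fall back to running a model in the page with TensorFlow.js.
function chooseEngine(globals: SpeechGlobals): Engine {
  const hasWebSpeech =
    typeof globals.SpeechRecognition === "function" ||
    typeof globals.webkitSpeechRecognition === "function";
  return hasWebSpeech ? "web-speech" : "tensorflow";
}
```

Taking the globals as a parameter instead of reaching for `window` directly keeps the check easy to test outside a browser.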
All in all, the TENET website was a really cool project! I bet it was a lot of fun to make and I hope the team that worked on it is proud.