Inside This Movie
Whilst googling Stephen King’s Storm Of The Century, I stumbled upon some features of Amazon’s US site that I hadn’t noticed before. Amazon’s Inside This Book feature has, for some time, enabled customers to peek inside a book and look at a selection of scanned pages.
Recently, this feature has been extended with the ability to search the full text of a book, along with a snapshot of what to expect inside, including:
Statistically Improbable Phrases / SIPs – A set of the most distinctive phrases in a book.
Capitalised Phrases / CAPs – A set of the likely characters, people, places, topics and events.
SIPs and CAPs are actually not far from Mark’s proof-of-concept for movie dialogue search. Essentially an Inside This Movie, Mark’s demo indexes a DVD’s closed-caption dialogue against timecodes, then gives the user a way to search for particular phrases inside a movie, as well as a concordance of all spoken dialogue.
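To make the idea concrete, here is a minimal sketch of a caption index with phrase search and a word concordance. It is not Mark’s actual implementation: the caption data, the function names and the choice of seconds for timecodes are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical caption data: (start time in seconds, spoken line).
CAPTIONS = [
    (125.0, "I have a bad feeling about this."),
    (310.5, "The Death Star plans are not in the main computer."),
    (4200.0, "May the Force be with you."),
]


def tokenize(text):
    """Lower-case a line and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_concordance(captions):
    """Map every spoken word to the timecodes of the lines it appears in."""
    index = defaultdict(list)
    for start, line in captions:
        for word in set(tokenize(line)):  # one entry per line, even if repeated
            index[word].append(start)
    return index


def search_phrase(captions, phrase):
    """Return the timecodes of caption lines containing the phrase."""
    needle = tokenize(phrase)
    if not needle:
        return []
    hits = []
    for start, line in captions:
        words = tokenize(line)
        # Slide a window of len(needle) words across the line.
        for i in range(len(words) - len(needle) + 1):
            if words[i:i + len(needle)] == needle:
                hits.append(start)
                break
    return hits
```

Calling `search_phrase(CAPTIONS, "bad feeling")` gives the timecode of the matching line, which a player could then seek to. A real index would also need to handle phrases that span two caption lines, which this sketch ignores.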
Applying Amazon’s SIP and CAP algorithms to Mark’s data could yield a more useful concordance of dialogue, pulling out character, people, place, topic and event names (Death Star, Tatooine, Rebel Alliance) from any encoded movie, as well as any distinctive phrases (May The Force Be With You, I Have A Bad Feeling About This).
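Amazon has not published how SIPs and CAPs are computed, so the following is only a rough stand-in applied to caption lines. It assumes CAPs can be approximated by runs of capitalised words, and SIPs by word n-grams that recur in the movie but are rare in some background corpus. Every name and threshold here is hypothetical.

```python
import re
from collections import Counter

# Assumption: sentence-initial words like "The" are not part of a name.
LEADING_STOPWORDS = {"The", "A", "An", "And", "But"}
CAP_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")


def capitalised_phrases(lines):
    """Count runs of two or more capitalised words (a crude CAP heuristic)."""
    counts = Counter()
    for line in lines:
        for run in CAP_RUN.findall(line):
            words = run.split()
            while words and words[0] in LEADING_STOPWORDS:
                words.pop(0)
            if len(words) >= 2:
                counts[" ".join(words)] += 1
    return counts


def ngram_counts(lines, n):
    """Count word n-grams within each line, ignoring case and punctuation."""
    counts = Counter()
    for line in lines:
        words = re.findall(r"[a-z']+", line.lower())
        counts.update(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return counts


def distinctive_phrases(lines, background_lines, n=3, top=5):
    """Rank repeated n-grams by how rare they are in the background (crude SIP)."""
    movie = ngram_counts(lines, n)
    background = ngram_counts(background_lines, n)
    scores = {
        phrase: count / (1 + background[phrase])
        for phrase, count in movie.items()
        if count >= 2  # only phrases that recur within the movie
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top]
```

The capitalisation heuristic only finds multi-word names, so a single-word place like Tatooine would slip through, and caption files written in all caps would defeat it entirely.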
Our TV3.0 project includes a use case that allows users to attach comments to a piece of movie content, with timecode references, so that users can click straight through to individual scenes. SIP and CAP for movies would allow us to auto-comment video content with links to distinctive or significant scenes within a film, based on spoken dialogue…of course, movies from the silent era could be problematic 😉
The potential for SIP+CAP-enabled movie services is compelling. Imagine being able to search for all pop-culture references to Star Wars across episodes of Friends, The Simpsons and countless other shows…assembling a playlist of TV/movie scenes as a virtual search folder syndicated to other individuals and applications.