Google wants you to have the best selfie

Building on last year’s GIF builder, Motion Stills, Google Research has just released two more ‘appsperiments’ in time for your holiday merriment: Scrubbies and Selfissimo!

Scrubbies lets you “shoot a video in the app and then remix it by scratching it like a DJ. Scrubbing with one finger plays the video. Scrubbing with two fingers captures the playback so you can save or share it.”

Selfissimo! lets you “tap the screen to start a photoshoot. The app encourages you to pose and captures a photo whenever you stop moving. Tap the screen to end the session and review the resulting contact sheet.”

Are you worried that taking so many selfies might give you “selfitis” and turn you into a narcissist? Well, don’t be. Snopes debunked that supposed mental disorder.

What I love about Selfissimo! is that by taking the photos for you, it gives you more of a true photo-session experience, heightened by the fact that it shoots only in black and white. Think of the photo-shoot scene in ‘Austin Powers: The Spy Who Shagged Me’, which is itself an homage to the photo-shoot scene in Michelangelo Antonioni’s masterful 1966 film ‘Blow-Up’.

My take: I highly recommend Selfissimo! because it’s so much fun! Here’s to a great 2018, everyone!

Battling AIs create new realities

The adage “Seeing is believing” is no longer true.

Three researchers at Nvidia, Ming-Yu Liu, Thomas Breuel and Jan Kautz, have created an AI that can generate lifelike images.

In their system, multiple neural networks learn together by trying to fool each other with better and better solutions to the problem at hand. Networks trained this way are called generative adversarial networks, or GANs.
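The adversarial idea itself is simple enough to sketch in a few dozen lines. The toy below is not Nvidia’s system (which translates whole images); it is a minimal one-dimensional GAN in which a two-parameter generator learns to mimic samples from a bell curve, with both “networks” trained by hand-derived gradient steps:

```python
import math
import random

random.seed(0)

REAL_MEAN, REAL_STD = 4.0, 0.5   # the "real" data: samples from N(4, 0.5)
wg, bg = 1.0, 0.0                # generator G(z) = wg*z + bg, z ~ N(0, 1)
wd, bd = 0.0, 0.0                # discriminator D(x) = sigmoid(wd*x + bd)

def sigmoid(u):
    u = max(-60.0, min(60.0, u))  # clamp to avoid overflow
    return 1.0 / (1.0 + math.exp(-u))

lr, batch = 0.03, 64
for _ in range(3000):
    # Discriminator step: ascend log D(real) + log(1 - D(fake)).
    gw = gb = 0.0
    for _ in range(batch):
        xr = random.gauss(REAL_MEAN, REAL_STD)
        xf = wg * random.gauss(0, 1) + bg
        dr, df = sigmoid(wd * xr + bd), sigmoid(wd * xf + bd)
        gw += (1 - dr) * xr - df * xf
        gb += (1 - dr) - df
    wd += lr * gw / batch
    bd += lr * gb / batch

    # Generator step: ascend log D(fake), i.e. learn to fool D.
    gw = gb = 0.0
    for _ in range(batch):
        z = random.gauss(0, 1)
        df = sigmoid(wd * (wg * z + bg) + bd)
        gw += (1 - df) * wd * z
        gb += (1 - df) * wd
    wg += lr * gw / batch
    bg += lr * gb / batch

gen_mean = sum(wg * random.gauss(0, 1) + bg for _ in range(5000)) / 5000
print(f"generator output mean ~ {gen_mean:.2f} (target {REAL_MEAN})")
```

If the adversarial training works, the generator’s output mean drifts from 0 toward the real data’s mean of 4, even though the generator never sees a real sample directly, only the discriminator’s reaction to its fakes.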

See their paper and GitHub. A sample below:

My take: this is kinda scary. It’s neat to think of “environmental” filters to add to genuine footage (think Nighttime, Winter, Rainy, etc.), but that this technology can create genuine-looking unreal footage is downright Orwellian. How do we distinguish truth from fiction, real from fake? The only conclusion is that everything is now suspect. Sad.

Seeing is not believing

At the recent Adobe Max conference, one of the sneak peeks really caught my eye: Adobe Cloak.

This “content aware fill for video” is amazing and could be revolutionary if it ever sees the light of day in a product or service.

It’s powered by Adobe Sensei and it works by imagining what’s underneath the objects you want to remove.

By the way, if you want to do this today, you can use the Remove Module in Mocha Pro.

My take: the ease and speed of this are astounding. There were lots of great sneak peeks this year, including SonicScape for 360/VR sound editing. First come the tools, then comes the art.

Computational Video Editing may replace Assistant Editors

Eric Escobar writes on Film Independent about his trip to Siggraph 2017 and the one technology that blew his mind: Computational Video Editing.

Three researchers from Stanford University and one from Adobe demonstrated a system that:

“automatically selects the most appropriate clip from one of the input takes, for each line of dialogue, based on a user-specified set of film-editing idioms. Our system starts by segmenting the input script into lines of dialogue and then splitting each input take into a sequence of clips time-aligned with each line. Next it labels the script and the clips with high-level structural information (e.g., emotional sentiment of dialogue, camera framing of clip, etc.). After this pre-process, our interface offers a set of basic idioms that users can combine in a variety of ways to build custom editing styles. Our system encodes each basic idiom as a Hidden Markov Model that relates editing decisions to the labels extracted in the pre-process. For short scenes (< 2 minutes, 8-16 takes, 6-27 lines of dialogue) applying the user-specified combination of idioms to the pre-processed inputs generates an edited sequence in 2-3 seconds.”

That’s right. Three seconds. For a 90-second scene that might take a human 90 minutes. If my math is correct, that’s 5,400 seconds versus 3, making this system roughly 1,800 times (180,000%) as fast!

The idioms, from the research notes:

  • Avoid jump cuts
  • Change zoom gradually
  • Emphasize character
  • Intensify emotion
  • Mirror position
  • Peaks and valleys
  • Performance fast/slow
  • Performance loud/quiet
  • Short lines
  • Speaker visible
  • Start wide
  • Zoom consistent
  • Zoom in/out

Editors combine a number of these idioms and weight them to generate different assemblies of the rushes, automatically.
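To get a feel for how weighted idioms can drive take selection, here is a toy dynamic program in the spirit of the paper’s Hidden Markov Model. All takes, framings, and weights below are invented; the real system scores far richer labels (emotion, loudness, framing) extracted in its pre-process:

```python
# Toy "computational video editing": choose one take per line of dialogue
# by combining weighted idioms, Viterbi-style. Each candidate take is
# (name, camera framing, speaker visible in frame?). All values invented.
lines = [
    [("take1", "wide", True),  ("take2", "close", False)],
    [("take1", "wide", False), ("take2", "close", True)],
    [("take1", "wide", True),  ("take3", "medium", True)],
]

W_SPEAKER = 2.0   # idiom: "speaker visible" bonus
W_JUMPCUT = 3.0   # idiom: "avoid jump cuts" penalty

def emission(take):
    name, framing, speaker_visible = take
    return W_SPEAKER if speaker_visible else 0.0

def transition(prev, cur):
    # Cutting between two takes with the same framing reads as a jump cut.
    return -W_JUMPCUT if prev[1] == cur[1] else 0.0

def best_edit(lines):
    # paths: candidate index -> (score so far, takes chosen so far),
    # keeping the best-scoring path that ends in each candidate take.
    paths = {i: (emission(t), [t]) for i, t in enumerate(lines[0])}
    for line in lines[1:]:
        nxt = {}
        for i, t in enumerate(line):
            score, path = max(
                ((s + transition(p[-1], t) + emission(t), p)
                 for s, p in paths.values()),
                key=lambda sp: sp[0],
            )
            nxt[i] = (score, path + [t])
        paths = nxt
    return max(paths.values(), key=lambda sp: sp[0])

score, edit = best_edit(lines)
chosen = [t[0] for t in edit]
print(chosen, score)
```

Re-weighting W_SPEAKER and W_JUMPCUT changes which path wins, which is exactly how combining idioms yields different automatic assemblies of the same rushes.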

Of course, editors will then proceed to polish these rough cuts, tweaking the edits and finessing the sound.

My take: this promises to take the tedium out of editing and let editors focus on truly being creative. Eric envisions a client-side version in which every viewer’s cut of a film is custom-generated for them, based on their favourite editing style. That may be going a little too far, but what I find fascinating about this system is that it starts with the script, once again highlighting how crucial the script is.

OPA chips may one day replace optical lenses

Caltech researchers have created an optical phased array chip that can capture images.

The technological breakthrough has the potential to revolutionize photography.

Ali Hajimiri, Bren Professor of Electrical Engineering and Medical Engineering in the Division of Engineering and Applied Science at Caltech, claims:

We’ve created a single thin layer of integrated silicon photonics that emulates the lens and sensor of a digital camera, reducing the thickness and cost of digital cameras. It can mimic a regular lens, but can switch from a fish-eye to a telephoto lens instantaneously — with just a simple adjustment in the way the array receives light.

He continues:

“The ability to control all the optical properties of a camera electronically using a paper-thin layer of low-cost silicon photonics without any mechanical movement, lenses, or mirrors, opens a new world of imagers that could look like wallpaper, blinds, or even wearable fabric.”

Read the PDF.

My take: this is perhaps the unforeseen conclusion of digitization. First film. Soon lenses. Both usurped by ones and zeroes. I wonder what the future of visual storytelling will look like when almost anything flat (walls, windows, ceilings) can become an image-capturing tool.

Snap Spectacles

Snap Inc. has sold more than 100,000 of its funky retro Spectacles.

Formerly Snapchat Inc., the social multimedia firm now considers itself…

“…a camera company. We believe that reinventing the camera represents our greatest opportunity to improve the way people live and communicate. Our products empower people to express themselves, live in the moment, learn about the world, and have fun together.”

In related news, a judge has ruled in Snap’s favour in a trademark infringement case brought by Eyebobs of Minnesota. They…

“…felt the similarity of the eyeball logos would lead a Spectacles user or Eyebobs customer to think the two companies were partners, or had collaborated.”

However, Snap…

“…denied these claims, adding that a crucial flaw in Eyebobs’s argument was the trademark in question. While Snap held a trademark on its eyeball logo, Eyebobs only had a trademark on its name, not its logo.”

Ouch!

You be the judge:

My take: I’m intrigued by the POV angle of the camera in these smartglasses, but the circular 1088×1088 resolution is a drawback. And getting your video out of Snap and into something you can edit might take some finagling.

How to encode movies in cells using DNA

As reported widely last week, Seth Shipman of Harvard Medical School has used CRISPR-Cas technology to encode a 36×26-pixel movie into the DNA of living E. coli bacteria.

“The mini-movie, really a GIF, is a five-frame animation of a galloping thoroughbred mare named Annie G. The images were taken by the pioneering photographer Eadweard Muybridge in the late 1800s for his photo series titled ‘Human and Animal Locomotion.'”

They explain it all in a bigger movie:

They hope to turn cells into living recorders to store information from the immediate environment.
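The encoding side of this is surprisingly simple: with four nucleotides, each base can carry two bits of information. Here is a toy encoder/decoder using that naive mapping; note the Harvard team actually used a pixel-value-to-nucleotide-triplet scheme, and the hard part of their work was writing the sequence into living cells with CRISPR-Cas, not the coding itself:

```python
# Toy DNA storage: two bits per nucleotide. This is only the
# coding-theory half of the trick, using a naive 2-bit mapping.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Turn bytes into a nucleotide string, 4 bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(dna: str) -> bytes:
    """Invert encode(): 4 bases back into one byte."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

frame = bytes([0b10110001, 0b00000000, 0b11111111])  # three toy "pixels"
dna = encode(frame)
print(dna)  # GTAC AAAA TTTT (spaces added here for readability)
```

A 36×26 one-bit frame is 117 bytes, or 468 nucleotides per frame under this scheme, which makes it clear why the researchers spread the movie across a population of cells.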

Curiously, the scientists who did this in March of this year don’t seem to have received much coverage, even though they accomplished much more: encoding, among other things, a gift card and a computer virus. Obviously, the Harvard brand has better publicists.

And similar feats have been done before. IBM spelled out its name in atoms in 1989.

My take: this is just a stunt to prove we can encode information in DNA, something Mother Nature has been doing for billions of years. But of course, let’s not forget the unintended consequences. When you mess around with Mother Nature, things don’t always go as planned. Imagine encoding ‘Godzilla’ — and then the DNA mutates!

Interactive video comes to Netflix

Casey Newton reports on The Verge that Netflix is testing interactive video with half its audience — kids.

The first title is Puss in Book: Trapped in an Epic Tale.

Carla Engelbrecht Fisher, Netflix’s director of product innovation, says:

“Kids are already talking to the screen. They’re touching every screen. They think everything is interactive.”

The result is a branching story with viewing lengths that vary from 18 to 39 minutes.

A second title will have a simpler structure with four endings.

Note that Netflix has not invented branching stories — Choose Your Own Adventure published 250 million gamebooks over two decades in the previous millennium.
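Structurally, a branching title is just a directed graph of segments, and the spread in viewing lengths is the spread between its shortest and longest root-to-ending paths. A toy sketch, with segment names and durations invented to reproduce that 18-to-39-minute range:

```python
# Toy branching-story graph: each segment maps to (duration in minutes,
# list of segments the viewer can choose next). An empty list is an ending.
# All names and durations are made up.
story = {
    "intro":  (4,  ["quest", "nap"]),
    "quest":  (12, ["dragon", "escape"]),
    "nap":    (6,  ["escape"]),
    "dragon": (23, []),   # ending
    "escape": (8,  []),   # ending
}

def runtime_range(story, start):
    """Return (shortest, longest) total runtime from `start` to any ending."""
    duration, choices = story[start]
    if not choices:
        return duration, duration
    ranges = [runtime_range(story, c) for c in choices]
    return (duration + min(lo for lo, _ in ranges),
            duration + max(hi for _, hi in ranges))

shortest, longest = runtime_range(story, "intro")
print(f"viewing length: {shortest} to {longest} minutes")
```

For writers, this framing makes the craft problem concrete: every added branch multiplies the paths that all have to land dramatically.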

My take: it’s interesting that Netflix only has the technology working on half of its platforms. Nevertheless, the potential is seductive. Think of the dramatic possibilities: “Feeling lucky, punk?” or “You take the blue pill—the story ends, you wake up in your bed and believe whatever you want to believe. You take the red pill — you stay in Wonderland, and I show you how deep the rabbit hole goes.” One thing writers will have to get used to is the cavalier ease with which the audience will be able to change the narrative — kind of like print designers had to let go of exact specificity when the web came along.

Google promotes VR180 on Youtube

Frank Rodriguez, Google’s VR Product Manager, has posted The world as you see it with VR180 on both Google and Youtube.

“VR180 videos focus on what’s in front of you, are high resolution, and look great on desktop and on mobile. They transition seamlessly to a VR experience when viewed with Cardboard, Daydream, and PSVR, which allow you to view the images stereoscopically in 3-D, where near things look near, and far things appear far. VR180 also supports livestreaming videos so creators and fans can be together in real time.”

There are two main differences between 360 Video and VR180:

  • VR180 lacks the ‘back 180’ (which can therefore allow for higher resolution up front)
  • VR180 is stereoscopic, using two lenses to create true 3D, from the camera’s fixed point of view.

In addition, Google announced that they want to help build new VR180 cameras, initially partnering with Lenovo, LG, and YI Technology. Team Lucid tells me, “We hope to be the first certified camera for this program and will be sending updates via social on our progress.”

See the VR180 playlist on Youtube’s official Virtual Reality channel.

My take: glad to see Google/Youtube agree with me: 3D VR180 is a friendlier version of both 360 Video and true Virtual Reality. My only concern is that in browsers, desktop or mobile, there are no mouse or keyboard controls; the immersive goodies are for VR headsets exclusively. However, for a filmmaker like me who likes to shoot on a tripod, this promises to be the best of both worlds.

360 Video Heatmap Analytics

In your 360-degree and VR videos, the audience can look almost anywhere. You might hope they’re looking over here, but what if they’re looking over there?

Now there’s a way to find out where viewers are looking.

Youtube Creator Blog has just posted Hot and Cold: Heatmaps in VR.

“Today we’re introducing heatmaps for 360-degree and VR videos with over 1,000 views, which will give you specific insight into how your viewers are engaging with your content. With heatmaps, you’ll be able to see exactly what parts of your video are catching a viewer’s attention and how long they’re looking at a specific part of the video.”

Some key findings based on their research:

  • People spend 75% of their time within the front 90 degrees of a video.
  • Almost 20% of views are directly behind.
  • Mobile viewers using Google Cardboard need a couple of seconds to get situated before the action starts.

They also suggest: “Try using markers and animations to draw attention to different parts of the scene.”
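If you log your own head-tracking data, the “front 90 degrees” statistic is easy to reproduce: normalize each sampled yaw angle and count how many fall within 45 degrees of straight ahead. The trace below is synthetic, purely for illustration:

```python
# Toy 360-video attention analysis: given per-sample view yaw angles
# (0 = straight ahead, +/-180 = directly behind), compute what fraction
# of viewing time falls in the front 90 degrees. Trace data is synthetic.
def front_90_fraction(yaws):
    # Normalize each yaw to its absolute offset from straight ahead,
    # then count samples within 45 degrees either side of center.
    front = sum(1 for y in yaws if abs(((y + 180) % 360) - 180) <= 45)
    return front / len(yaws)

# Synthetic trace: mostly front-facing, a few glances sideways and behind.
trace = [0, 5, -10, 30, 44, -44, 170, -175, 90, 2]
print(f"{front_90_fraction(trace):.0%} of samples in the front 90 degrees")
```

Binning the same normalized angles into, say, 10-degree sectors would give you a crude heatmap of your own to compare against Youtube’s.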

My take: these analytics are golden if you’re making immersive video. My advice is to map out where you think attention will linger and then compare that prediction with the actual results; this should help you refine your content. These findings also bolster my contention that 180-degree 3D video is superior to flat 360 for narrative immersive work.