As we continue to deliver robots to backers and people begin building skills for Misty, it’s important to us that you remain informed about the state of the art (technology) and the state of the product (Misty II).
One of the capabilities we are working towards for Misty II is the ability to autonomously move about an environment, avoiding obstacles and ledges while knowing where it is. Misty’s Occipital Structure Core sensor (a stereo pair of infrared cameras, an ultra-wide-angle camera, and an IMU) is used for simultaneous localization and mapping (SLAM). This technology allows us to build maps of Misty’s environment and track where and how the robot moves through that space, opening the door to many skills and use cases for Misty. We’ve shared an in-depth breakdown of how Misty’s Occipital Structure Core depth sensors work here on the blog and on Medium, if you’re interested in reading more.
Today, we’re sharing a guest post from Jeff Powers, co-founder of Occipital, so you can learn more about this technology. From how Occipital got started, to what to expect of Misty’s autonomous movement skills when you receive your robot, to what’s ahead for Occipital: here are your updates, straight from the source.
Bringing Dense 3D Mapping to Misty
Jeff Powers, Co-Founder, Occipital
Decades from now, roboticists will simply be able to grab an off-the-shelf vision module and install it in their robot, giving it perception that will be in many ways superior to human perception. The system will know its location in a space to sub-millimeter accuracy, and it will understand everything structurally and semantically, from a couch, to a kid’s toy, to a human.
Today, no such vision module exists; we’re only scratching the surface.
How we got here
When I started Occipital 10 years ago with my co-founder Vikas, one of the things we imagined was a software vision engine that could solve the perception problem not only for robots, but also for things like AR headsets. We got started on this in late 2011, but we were a little ahead of our time: mobile processing power was more than 100X slower than it is now, and mobile cameras produced fuzzy, grainy images that barely worked at all in dim environments.
Alan Kay said, “People who are really serious about software should make their own hardware.” So, that’s exactly what we did, creating our own camera to work around the inadequacy of what was available at the time. Specifically, we built an active 3D camera called Structure Sensor, which we launched in late 2013 on Kickstarter.
“Active” means that it actually acts on the environment by emitting a pattern of infrared (IR) light. This light pattern bounces off of objects and returns to an IR camera onboard. From the image received, a depth map is computed, giving a 3D coordinate for every pixel.
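The geometry behind that depth map is classic triangulation: the farther the projected pattern appears to shift between the projector and the camera, the closer the surface. Here’s a minimal sketch of that relationship; the baseline and focal-length numbers are illustrative, not Structure Sensor’s actual calibration.

```python
# Toy triangulation: depth from the pixel shift (disparity) of a
# projected IR pattern. Baseline and focal length are made-up values,
# not the real sensor's calibration.

def depth_from_disparity(disparity_px, baseline_m=0.065, focal_px=580.0):
    """Return depth in meters for a given disparity in pixels."""
    if disparity_px <= 0:
        return float("inf")  # pattern not detected / surface too far
    return baseline_m * focal_px / disparity_px

# A larger shift of a projected dot means a closer surface:
near = depth_from_disparity(40.0)  # big disparity -> close object
far = depth_from_disparity(5.0)    # small disparity -> distant object
```

Running this per pixel over the whole IR image is what turns one camera frame into a dense depth map.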
We also made progress on the software for perception — dense 3D mapping, sparse feature tracking, loop closure, to name a few. But we never put these all together into a package for robotics. That is, until Ian Bernstein reached out to tell us about his new idea: the Misty robot.
We didn’t expect to work on robot perception so soon, but Misty was an ideal partner. To make a long story short, we signed up to help Misty bring true 3D perception to Misty’s platform, helping on both hardware and software.
How it works
The Misty robot incorporates our second-generation active 3D sensor (Structure Core), which comes with an ultra-wide-angle fisheye camera, dual IR cameras, and an IMU. Structure Core communicates to Misty’s main processor, where 3D perception software takes over.
Challenges we’ve faced
Today, Misty is able to map a room, track itself within the room, and compute safe navigation trajectories to reach other locations in the room. But to get this far, Occipital and Misty have closely collaborated to clear a number of hurdles.
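To give a feel for what “computing safe navigation trajectories” means in its simplest form, here’s a toy planner: breadth-first search over an occupancy grid, stepping only through free cells. Real planners also account for the robot’s footprint and path smoothness; the grid, cell convention, and function names here are illustrative.

```python
# Toy trajectory computation on an occupancy grid via breadth-first
# search. grid[y][x] == 0 means free space, 1 means an obstacle.

from collections import deque

def plan_path(grid, start, goal):
    """Return a list of (x, y) cells from start to goal, or None."""
    frontier = deque([start])
    came_from = {start: None}
    while frontier:
        cur = frontier.popleft()
        if cur == goal:
            path = []
            while cur is not None:  # walk parent links back to start
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        x, y = cur
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            if (0 <= ny < len(grid) and 0 <= nx < len(grid[0])
                    and grid[ny][nx] == 0 and nxt not in came_from):
                came_from[nxt] = cur
                frontier.append(nxt)
    return None  # goal unreachable

room = [[0, 0, 0],
        [1, 1, 0],   # a wall with a gap on the right
        [0, 0, 0]]
path = plan_path(room, (0, 0), (0, 2))  # route around the wall
```

The path it finds hugs the gap in the wall rather than cutting straight across occupied cells, which is the essence of obstacle-avoiding navigation on a map.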
CPU & USB speed
Although Misty comes with a powerful processor, we’ve had to make major reductions in CPU usage to make sure the robot doesn’t get too hot and has room to run other processing. We modified algorithms to produce maps that were still very rich for robotics, but are much more compute-efficient to build.
We also discovered that USB can behave very differently on mobile processors than on the laptops and PCs we were used to. For example, it turned out that IMU data, which should normally arrive over USB just 1-2 milliseconds after data acquisition, could sometimes take over 100 milliseconds on a mobile CPU! This created a cascade of problems until we ultimately found ways to get the latency under control.
Along with compute reduction, it was also necessary to find ways to reduce the amount of RAM that Misty required to map a room. Because Misty builds not only an occupancy map but also a dense mesh of the room, and memorizes a sparse map as it explores, the amount of RAM required is significant.
We made a number of optimizations to get memory usage down, including using a special 2.5D data structure to accumulate depth information in an efficient way. We also revamped the way we save maps to flash memory, to avoid huge memory spikes when writing data.
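The appeal of a 2.5D structure is that each floor-plan column stores only a small summary of the heights seen there, instead of a full 3D voxel volume. Here’s a toy accumulator in that spirit; the cell size, stored statistics, and API are illustrative, not the actual data structure used on Misty.

```python
# Toy 2.5D accumulator: one cell per (x, y) column over the floor plan,
# each keeping only a running min/max height and a hit count, instead of
# a memory-hungry stack of 3D voxels.

class Grid25D:
    def __init__(self, cell_size=0.05):
        self.cell_size = cell_size   # meters per cell (illustrative)
        self.cells = {}              # (ix, iy) -> [min_z, max_z, count]

    def add_point(self, x, y, z):
        """Fold one depth-map point into its column's summary."""
        key = (int(x // self.cell_size), int(y // self.cell_size))
        cell = self.cells.get(key)
        if cell is None:
            self.cells[key] = [z, z, 1]
        else:
            cell[0] = min(cell[0], z)
            cell[1] = max(cell[1], z)
            cell[2] += 1

grid = Grid25D()
for z in (0.10, 0.30, 0.05):
    grid.add_point(0.02, 0.02, z)  # three depth returns, same column
# One small cell now summarizes the whole column of measurements.
```

Memory then scales with floor area rather than with room volume, which is the saving that matters when a whole room has to fit in RAM.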
Another hurdle we faced, again and again, was streaming reliability with our new Structure Core sensor. Murphy’s Law was in effect: Anything that can go wrong…
Both the driver and firmware for Structure Core were new (and brittle), with plenty of fun and exciting bugs to discover. Many of these issues appeared for the first time on Misty, because we were usually using Structure Core with laptops and desktops, and we’d become too used to just unplugging & re-plugging the sensor when something went wrong — a luxury you don’t really have on the robot! Today, we’ve crushed dozens of issues, and while a few likely still exist, the reliability of streaming has been dramatically improved.
Mapping & Tracking from “robot height”
Misty is just over one foot tall, and putting a 3D sensor at this height has some interesting consequences. For one, you often find yourself looking at a flat wall and nothing else! Think of a kitchen island as viewed from the floor, versus up high.
For years, we had been successfully mapping rooms from human height, using detailed depth maps, but with Misty, we could no longer rely on the depth maps always being detailed. So rather than relying on depth alone, Misty turns to its fisheye camera, which gives it a field of view (FOV) much closer to a human’s.
Doing that meant switching to a completely different tracking pipeline, one that tracks sparse visual features. Packing this new algorithm into a footprint small enough to run on Misty was another challenge we had to overcome. But now, Misty leverages both depth and fisheye information together to track its movement around your space, meaning it can deal with a much wider variety of conditions.
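“Sparse visual features” are image points, typically corner-like, that a tracker can recognize from frame to frame. This toy scorer hints at why: it responds only where intensity gradients vary in both directions, so a blank wall scores zero while a high-contrast corner scores high. Real detectors (Harris, FAST, and friends) are far more robust; this is only a sketch on a synthetic image.

```python
# Toy corner-ness score: sum of squared gradients over a 3x3 window,
# taking the weaker of the two directions. A point is trackable only if
# the image changes along BOTH axes around it.

def corner_score(img, x, y):
    """Crude Harris-style score at pixel (x, y); img is a 2D list."""
    gx2 = gy2 = 0.0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            gx = img[y + dy][x + dx + 1] - img[y + dy][x + dx - 1]
            gy = img[y + dy + 1][x + dx] - img[y + dy - 1][x + dx]
            gx2 += gx * gx
            gy2 += gy * gy
    return min(gx2, gy2)  # high only if gradients exist in both directions

# A blank wall yields no features; a checkerboard corner does.
wall = [[0.0] * 8 for _ in range(8)]
corner = [[1.0 if (xx < 4) == (yy < 4) else 0.0 for xx in range(8)]
          for yy in range(8)]
```

This is also why the beta tips below warn against blank walls: with no corner-like points in view, a sparse-feature tracker simply has nothing to hold on to.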
Getting the best results from the SLAM mapping beta
Putting everything together (and dozens of things I haven’t mentioned here), we have arrived at a “SLAM beta” on the first Misty robots shipping out into the world. Some of the limitations of the beta are:
Misty doesn’t automatically explore; you need to drive Misty around.
That means you need to be careful not to drive Misty into a position where it can’t see anything visually interesting. In other words, you should avoid getting too close to blank walls, or pointing off too far into the distance (where everything in view is 10+ meters away).
Misty cannot yet memorize an entire home (unless you have a studio apartment!)
You will need to map around one room at a time, for now. This is because Misty stores the entire room in RAM during exploration. In the future, we’ll allow Misty to save parts of the room to Flash memory in real-time, enabling exploration and mapping of an entire home. For now, try to explore the whole room in a single pass, without re-exploring the same areas over and over, to minimize memory usage and allow you to capture the largest area possible.
Misty may not always be able to determine where it is inside a mapped room unless you explore a little bit and show it a viewpoint it is more familiar with.
For example, if you map a room facing North, then stop mapping, turn around, and face South, Misty may not realize where it is. Some robots on the market today solve this by simply auto-exploring to acquire their position. Since Misty doesn’t have this feature yet, you’ll need to keep in mind the orientations and positions Misty explored when mapping in the first place.
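The intuition behind that limitation can be sketched in a few lines: if the map only stores appearance from the headings you explored, a query from an unseen heading has nothing to match against. The headings and tolerance below are illustrative, not how Misty’s relocalizer actually works.

```python
# Hypothetical viewpoint check for relocalization: succeed only if the
# current heading is close to some heading seen while mapping.

def can_relocalize(query_heading_deg, mapped_headings_deg, tolerance_deg=45.0):
    """True if some mapped viewpoint is within tolerance of the query."""
    def angle_diff(a, b):
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)  # wrap around the compass
    return any(angle_diff(query_heading_deg, h) <= tolerance_deg
               for h in mapped_headings_deg)

mapped = [0.0, 20.0, 350.0]        # room explored mostly facing North
facing_north = can_relocalize(10.0, mapped)   # familiar viewpoint
facing_south = can_relocalize(180.0, mapped)  # never seen during mapping
```

Turning back toward a previously mapped viewpoint, as suggested above, is exactly what moves the query heading back inside the matchable range.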
Where we’re headed
We’re working continuously with the Misty team to improve things like overall reliability, robustness of relocalization, and size of mappable areas. We’d like Misty to be able to reliably map and navigate an entire home or office.
From there, Misty should get even smarter, being able to detect changes in the environment, and navigate around them, or alert you to them! Combined with deep learning, Misty will be able to do smart things like “go look for the dog”, or “go into the kitchen”, even if you never told it which room was the kitchen.
It has been a challenging but rewarding two years working with the Misty team to bring true 3D perception to their home robotics platform. We’re far from complete, and I can’t wait to continue bringing new perception updates to Misty, getting closer to the engine we envisioned ten years ago!
A special thanks to Jeff Powers for sharing his insight and updates from Occipital. If you have questions about Occipital or any of the other technology Misty comes with, head over to the Misty Community Forum and let us know.