Synthetic Vision Robot Tracks Ground Target for Precision Cargo Delivery using Augmented Reality Technology
- February 15th, 2012
During the week of December 5, 2011, Tanagram joined a team of engineers and adventurers in the Arizona desert (a surprisingly cold desert) for a week of flight trials for the DARPA-BAA-10-57 Tactical Expandable Maritime Platform (TEMP) program. Tanagram’s assignment was to develop a synthetic vision system to facilitate the safe delivery of 3,000-pound loads of emergency supplies via enormous, autonomous (yup, robots go!) powered parasail aircraft.
Our solution, we’re proud to say, is surprisingly simple and elegant. We used already-ubiquitous fiducial marker tracking technology (so last year) and created a giant 20′x20′ fiducial marker to serve as the cargo landing site (we call them Drop Targets). We then programmed a robot to detect and track those markers while strapped onto our test aircraft. During this testing program we sought to prove our system could detect our marker, and keep tracking it, in a real-time flight environment from various altitudes and approach directions and under various lighting conditions. The good news: we were successful!
Here’s how the system works.
Typically, fiducial markers are used for Augmented Reality overlays, where the user holds a printed marker in front of a webcam and is presented with a wonderful 3D image drawn on top of the marker. Something like this:

What’s great about fiducial marker tracking technology is that it is pre-built to handle the marker being viewed at ANY angle (including nearly flat) in the crappiest of lighting conditions. A marker does not have to be smack in front of the camera to be recognized. Another bonus is that it is easy to embed a two-dimensional barcode within the marker. A 2D barcode works like any barcode you’ve seen on a box of cereal: it’s a picture that contains a number meant to be read by a computer. The system we used was extremely simple and supported only 64 unique addresses. Here’s an example:

64 numbers (really addresses) might not seem like a lot, but when an emergency response supply camp is set up, it’s hard to imagine more than 20 unique cargo landing sites, so the small number worked perfectly for our purposes.
The following image is a screenshot of our synthetic vision technology viewing our Drop Target marker from atop a 30′ scissor lift.
It’s huge, ya? We actually chose these dimensions to maximize detection distance and also provide a manageable deployment scenario. It takes 3 people just under 2 minutes to spread out our Drop Target.
The red box surrounding the marker serves two purposes: 1) it indicates the marker has been properly detected and locked, and 2) it is the system’s interpretation of the distortion (or 3D placement) of the Drop Target in the viewable world. This is important because those dimensions give us very accurate information about the viewing angle of our camera, and we can use that information to triangulate our aircraft’s exact location in relation to the Drop Target. Very handy when the end goal is to place a 3,000-pound box of supplies on the tarp, eh? In the center of the image you will see a red number eight. That is the ID number for this particular marker. This tarp is Drop Target number eight.
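For the curious, here is a rough sketch of the kind of geometry that the red box makes possible. This is not our production code. It assumes a simple pinhole camera, a known focal length in pixels (`focal_px`, a made-up parameter name), and a target far enough away that the far-field approximation holds. The function names are hypothetical too.

```python
import math

def estimate_range_ft(marker_edge_ft, edge_px, focal_px):
    """Pinhole-camera range from the apparent size of an edge of known length.

    edge_px should be an edge that runs across the line of sight, since that
    edge is (roughly) not foreshortened by the viewing angle.
    """
    if edge_px <= 0:
        raise ValueError("edge_px must be positive")
    return focal_px * marker_edge_ft / edge_px

def estimate_elevation_deg(width_px, depth_px):
    """Far-field look-down angle from how squashed the square appears.

    Seen from straight above, the square is undistorted (ratio 1, 90 degrees).
    Seen from near the horizon, it flattens toward a line (ratio 0, 0 degrees).
    """
    if width_px <= 0:
        raise ValueError("width_px must be positive")
    ratio = max(0.0, min(1.0, depth_px / width_px))
    return math.degrees(math.asin(ratio))

# A 20-foot edge spanning 16 pixels with an assumed 800-pixel focal length.
range_ft = estimate_range_ft(20, edge_px=16, focal_px=800)
print(range_ft)  # 1000.0

# A box that looks half as deep as it is wide implies roughly a 30-degree look-down angle.
elevation_deg = estimate_elevation_deg(width_px=100, depth_px=50)
```

A full solution would recover the complete camera pose from all four corners. This sketch only shows why the size and squash of the box carry range and angle information.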
The best part of this technology is how unstable it is. I know that’s a weird thing to say, but we actually needed a system that would break if anything (mostly humans and cows) got in between the marker and the camera. We don’t want to deliver a payload on an occupied or occluded Drop Target. Bad things happen…
Our goal was to have an occluded glyph defeat detection, and therefore prevent the autonomous craft from dropping its payload. In this instance the vehicle would instead opt for a “go-around” (increase altitude and circle) to resurvey the area from a higher altitude.
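The drop-or-go-around rule can be sketched in a few lines. The post only says that a lost lock should lead to a go-around. The idea of requiring several consecutive locked frames, the default of five, and the function name are assumptions added for illustration.

```python
def approach_decision(lock_history, required_locked_frames=5):
    """Decide whether to keep approaching or go around.

    lock_history is a list of booleans, oldest first, with one entry per
    video frame: True if the Drop Target was detected and locked in that frame.
    """
    if required_locked_frames < 1:
        raise ValueError("required_locked_frames must be at least 1")
    recent = lock_history[-required_locked_frames:]
    # Continue only if we have enough frames and every one of them had a lock.
    if len(recent) == required_locked_frames and all(recent):
        return "continue"
    return "go-around"

print(approach_decision([True, True, True, True, True]))   # continue
print(approach_decision([True, True, False, True, True]))  # go-around
```

The point of the design is that the safe action is the default: anything short of a clean, sustained lock sends the craft around again.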
Take a look at this picture taken moments later when Angela’s shadow is crossing on top of the Drop Target marker.
Note the red box and number are now missing. That’s because the system can no longer identify the marker. AWESOME!!!
One of the biggest challenges we faced was getting a realistic detection distance. In the industry today, a 15x multiplier (that is, the length of one edge of the marker multiplied by 15) is considered a good read. That means a 20-foot marker could normally only be detected from 300 feet away. We needed much more. Working with Patrick at Patched Reality, we hotrodded our code and are now getting a 52x multiplier! That means we can detect our marker at greater than 1,000′ line-of-sight (20′ × 52 = 1,040′). That’s plenty of room for a powered parasail aircraft to negotiate an approach.
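The multiplier math is simple enough to check in a few lines. The function names below are just for illustration. The numbers are the ones quoted above.

```python
def detection_range_ft(marker_edge_ft, multiplier):
    """Detection distance is the marker's edge length times the multiplier."""
    return marker_edge_ft * multiplier

def required_edge_ft(target_range_ft, multiplier):
    """Edge length a marker would need to be detected at a given range."""
    return target_range_ft / multiplier

print(detection_range_ft(20, 15))  # 300
print(detection_range_ft(20, 52))  # 1040
```

Run the other way, `required_edge_ft(1040, 15)` shows that at the typical 15x multiplier a marker would need edges of nearly 70 feet to be seen from the same distance.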
“So what about this robot you keep mentioning?” you say. The following image is a snap of the robot’s “arm.”
It’s really a robotic camera gimbal we designed using parts from a Sparkfun Robotic Claw kit. For those “DIY inclined folks” out there, we reversed the position of the servos to reduce the length of the armature and gain a wee bit of stability, and added a few extra holes in the unit so we could mount two units together. My favorite part is the gaffer’s tape holding the USB webcam onto the front of the device. (Some of this concept was taken from a cool Sparkfun project that uses OpenCV and a webcam to track human faces. Sparkfun rocks!) Similar to the project just referenced, our bot was controlled by a GPS-enabled Arduino that received commands from a human-mounted laptop.
The following video is one of the successful tests. You’ll see the craft on approach heading 270 (that’s West), and out in the distance just past a white building you’ll see a small black dot, and if you look REALLY hard you’ll see a small white dot. The black dot is the ground team next to their vehicle and the white dot is the Drop Target. The green lines dancing all over the place are the algorithm scanning the view for shapes that *could* represent its desired target from any specific angle. In other words, the green represents candidates to be considered as Drop Targets. As the video progresses you’ll see our synthetic vision system start to pick up the Drop Target with an occasional red blip and then get a lock. Once it has a lock it pans the target into the center of the visible area and holds it there. It’s kind of like a reverse-engineered image stabilization system that works really well. You’ll see how much the aircraft buffets before the lock is obtained and then how steadily the target is held in the center after the lock. At the end of the video the robot arm reaches its maximum angle of view (straight down) and is programmed at that point to snap back to scanning position (straight forward).
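The “pan the target into the center and hold it there” behavior can be sketched as a simple proportional controller for the tilt axis. This is an illustration, not our flight code. The angle convention (0 degrees is straight forward, 90 degrees is straight down), the gain value, and the function name are all assumptions.

```python
SCAN_TILT_DEG = 0.0  # assumed scanning position: straight forward
MAX_TILT_DEG = 90.0  # assumed maximum angle of view: straight down

def next_tilt(tilt_deg, target_y_px, image_height_px, gain_deg_per_px=0.05):
    """Return the tilt angle to command for the next frame.

    target_y_px is the target's vertical pixel position (0 is the top row),
    or None when there is no lock.
    """
    if target_y_px is None:
        return tilt_deg  # no lock: hold the current position
    # Positive error means the target sits below the image center,
    # so the camera needs to tilt further down.
    error_px = target_y_px - image_height_px / 2
    new_tilt = tilt_deg + gain_deg_per_px * error_px
    if new_tilt >= MAX_TILT_DEG:
        return SCAN_TILT_DEG  # hit straight down: snap back to scanning
    return max(SCAN_TILT_DEG, new_tilt)

# Target 100 pixels below center of a 480-pixel-tall frame: tilt down a little.
tilt = next_tilt(10.0, target_y_px=340, image_height_px=480)
```

Running the same correction every frame is what makes the target appear to hold still in the center while the aircraft bounces around it. A pan axis would work the same way using the horizontal pixel error.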
This next video is another good demonstration of our occlusion protection as the craft buzzes the ground team preparing to remove the Drop Target. You’ll notice no lock is obtained.
Much more was learned and there is much more to share but I fear this post is already TL;DR.
Stay tuned for an update on our next flight test in March!




























