Monday, November 14, 2011



Today, an easy one :) A small OpenCV program that implements a cheap stereo webcam and displays the left and right images. All you need are two USB webcams plugged into your computer (preferably the same brand and model):

 Download the source code:

Compile it:

tar xzvf OpenCVStereoWebcam-1.0.tgz
cd OpenCVStereoWebcam

And there you go, your cheap USB stereo webcam.

Have fun!

Sunday, October 30, 2011



Cleverbot is an Artificial Intelligence conversation system designed to learn from the conversations it has with people, trying to mimic human behavior in a conversation.

To find out the level of intelligence of this kind of artificial entity, the world-famous mathematician Alan Turing proposed the Turing Test in 1950. Basically, a human judge engages in a natural language conversation with a human and with a machine designed to generate performance indistinguishable from that of a human being. All participants are separated from one another. If the judge cannot reliably tell the machine from the human, the machine is said to have passed the test.

The Turing Test takes place every year; this year it was held on September 3rd at Techniche 2011, IIT Guwahati, India. In this scenario, several human judges had 5-minute conversations with another entity without knowing whether it was human or not. Some people believe that if 50% of the judges classify the entity they are talking to as human when in reality it is a machine, then the machine has passed the test.

This year Cleverbot was classified as human 59% of the time and people might say it passed the Turing Test, but we humans still have hope... humans were classified as humans 63% of the time. I will start getting scared when Cleverbot is classified as human more often than real humans xD

Anyway, it is fun to talk to Cleverbot, you really have to try it! How? You can do it right here! Just enter some text in the text bar above and hit the "Think About It!" button. Have fun!!

EDIT [2011/11/14]: I removed the widget to talk to Cleverbot directly because it was causing undesired behavior on the website. You can still talk to it by clicking on its logo and visiting Cleverbot's homepage.

Monday, October 10, 2011



Today I would like to publish the answer to another question in the comments of a previous post that might be worth its own post:

Hi Martin,

You gave such an informative article. Good Job Martin:-)

I'll explain steps that I performed in calculating distance of object.

1. Calibrated stereo camera with chessboard 8x6 cm with 2.8cm square size.

2. My calibration was success with rms error 0.206 and rectified images was good.

3. I found my interest point on left-right image. I tried to determine distance of object using method you specified. There is an error in distance measured from 3cm for closer object to 12cm or more for distant object.
Say, actual - 27cm ; measured - 24cm.
actual - 60cm ; measured - 48cm.

My queries are,
- why does there comes this much big varition is distance measurement?
- What may be reason/solution for this error?

Is there any mistake in my procedure or do i miss parameters?

Sathya Kumar

Dear Sathya,

First of all, thank you very much for your kind comment. Regarding your queries, I am afraid that what you describe is quite normal. The steps that you followed are correct, so I don't think that you made any mistake in your procedure or missed any parameters.

The thing is that the relation between disparity and depth is non-linear: depth is inversely proportional to disparity.


This is why there is such a big variation in the distance measurement error: for close objects a small variation in disparity means a small variation in depth, but for far objects a small variation in disparity means a big variation in depth.
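To put rough numbers on this, here is a small sketch based on the standard pinhole stereo relation Z = f * B / d (depth Z, focal length f in pixels, baseline B, disparity d in pixels). The focal length and baseline below are made-up values, not Sathya's actual calibration, so only the trend matters.

```python
def depth_from_disparity(f_px, baseline_cm, disparity_px):
    """Depth in cm from the pinhole stereo relation Z = f * B / d."""
    return f_px * baseline_cm / disparity_px


def depth_error(f_px, baseline_cm, depth_cm, disparity_error_px=1.0):
    """Depth error when the disparity is off by `disparity_error_px`.

    The true disparity for the given depth is d = f * B / Z; the depth is
    recomputed with the perturbed disparity and the difference is returned.
    """
    true_disparity = f_px * baseline_cm / depth_cm
    measured = depth_from_disparity(f_px, baseline_cm,
                                    true_disparity + disparity_error_px)
    return depth_cm - measured


if __name__ == "__main__":
    F_PX = 600.0        # hypothetical focal length in pixels
    BASELINE_CM = 6.0   # hypothetical distance between the two cameras
    for z in (27.0, 60.0, 120.0):
        err = depth_error(F_PX, BASELINE_CM, z)
        print(f"depth {z:6.1f} cm -> error for 1 px of disparity: {err:5.2f} cm")
```

With the same one-pixel mistake in disparity, the error grows roughly with the square of the distance (differentiating Z = f * B / d gives an error of about Z^2 / (f * B) per pixel of disparity), which matches the pattern in the measurements above.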

So, there is no easy solution. A bigger error for distant objects is unavoidable, but you can try to mitigate the effect by getting a calibration that is as good as possible. To do that, capture the calibration pattern in as many different positions and orientations as possible: far, near, inclined, etc. Increasing the size of the chessboard a bit could also help you get better accuracy for distant objects.

I hope this solved your doubts.

Best regards,

Sunday, September 4, 2011



Hi everybody!

A couple of days ago one of the readers of the blog asked this in the comments of a previous post:


I'm Mike. Nice work! :)

I saw that some people have coloured depth maps and not just shades of grey/black/white. If I want to have coloured depth maps how can I achieve this?


Well Mike, some time ago I had the same question and, after researching a bit, I found two approaches: Pseudo Color and Chroma Depth. Let me answer your question with a blog post, so anybody else who is interested can find it easily.

Basically, what we have is a function that takes as an input a gray scale value and returns the corresponding RGB.  The Pseudo Color approach uses a trigonometric function to assign the RGB color to each gray value. The next picture shows an example:

 As you can see the output is somewhat similar to the images of the luggage scanners in airports and train stations.
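As a rough idea of what such a trigonometric mapping can look like, here is a minimal sketch. The phase offsets and the half-cycle sweep are illustrative assumptions, not the formula used by the application mentioned in this post.

```python
import math


def pseudo_color(gray, phases=(0.0, 2.0 * math.pi / 3.0, 4.0 * math.pi / 3.0)):
    """Map a gray value (0-255) to an (R, G, B) tuple with phase-shifted sines.

    The phases are a hypothetical choice made for this example only.
    """
    if not 0 <= gray <= 255:
        raise ValueError("gray must be in the range 0-255")
    # Sweep half a sine period over the gray range so 0 and 255 differ.
    angle = math.pi * gray / 255.0
    # Each channel is a sine shifted into 0..1, then scaled to 0..255.
    return tuple(
        min(255, max(0, int(round(255.0 * (0.5 + 0.5 * math.sin(angle + phase))))))
        for phase in phases
    )
```

Changing the phases (or the length of the sweep) changes the palette, which is the kind of parameter that is fun to play with.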

But for the purpose of depth visualization, Chroma Depth may be more convenient. This method assigns red to high depth values, magenta to low depth values and the colors of the rainbow to anything in between, as the next picture shows.
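One simple way to get that red-to-magenta rainbow is to sweep the hue of an HSV color. The sketch below assumes an 8-bit depth image and uses Python's standard colorsys module; it is only an approximation of the idea, not the code of the application.

```python
import colorsys


def chroma_depth(gray):
    """Map a depth value (0-255) to (R, G, B): 255 -> red, 0 -> magenta."""
    if not 0 <= gray <= 255:
        raise ValueError("gray must be in the range 0-255")
    # Hue 0 degrees is red and 300 degrees is magenta; sweep between them.
    hue = (1.0 - gray / 255.0) * (300.0 / 360.0)
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return (round(r * 255), round(g * 255), round(b * 255))
```

Intermediate values pass through yellow, green, cyan and blue on their way from red to magenta.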

To make it easy, you can download  the simple Gtk application that I made to visualize the output of both methods and play around with some parameters. The source code should be self explanatory, so I will skip the details in this post.

Just execute "make" and then "./GrayToPseudocolor -gray /path/to/your/image.something"
I would like to thank Sarah Martull for letting me use her depth image for this post.

Saturday, August 27, 2011



As you might well know, if you have 2 different views from the same scene then you can estimate the 3D coordinates of any point in the scene by finding the position of that point in the left image and in the right image and then apply some trigonometry.

Let's assume that we have no previous information about the relation between the cameras. We can find a point of interest in the left image, but we don't know where that point of interest will appear in the right image. So, what do we do? We have no other option than to scan the whole right image looking for our point of interest.

Now I can hear you say: "But that would be soooo slooooow!!". Yep, you are absolutely right, that brute force approach is really slow. But if we know the relation between both cameras, then we can calculate something called epipolar lines.

What is so special about these lines? Well, the magic of these lines is that a point in the left image will always have its corresponding point in the right image lying on the corresponding epipolar line! So now, instead of having to scan the whole right image to find our matching point, we only have to look along a single line :)

But wait!! There is more!! If the cameras are completely parallel then something very special happens... the epipolar lines become parallel. This means that the match of a point in the left image will appear in the exact same line on the right image! Isn't that awesome?
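To make the "same row" idea concrete, here is a small sketch that looks for a match by scanning only one row of the right image. It uses a plain sum of absolute differences (SAD) over a small window, and the image pair is synthetic (the right image is simply the left one shifted), so all names and sizes are assumptions made for the example.

```python
import random


def sad(left, right, row, col_left, col_right, half):
    """Sum of absolute differences between two square windows."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[row + dy][col_left + dx]
                         - right[row + dy][col_right + dx])
    return total


def match_on_row(left, right, row, col, half=2):
    """Find the column in `right` that best matches (row, col) of `left`.

    Only the same row is scanned, which is valid for rectified images.
    """
    width = len(right[0])
    best_col, best_cost = None, None
    for candidate in range(half, width - half):
        cost = sad(left, right, row, col, candidate, half)
        if best_cost is None or cost < best_cost:
            best_col, best_cost = candidate, cost
    return best_col


def make_shifted_pair(width=40, height=12, shift=5, seed=0):
    """Synthetic pair: a point at column c + shift in the left image
    appears at column c in the right image."""
    rng = random.Random(seed)
    left = [[rng.randrange(256) for _ in range(width)] for _ in range(height)]
    right = [[row[c + shift] if c + shift < width else 0 for c in range(width)]
             for row in left]
    return left, right


if __name__ == "__main__":
    left, right = make_shifted_pair()
    # Column of the best match in the right image for left pixel (6, 20).
    print(match_on_row(left, right, row=6, col=20))
```

Without rectification the loop over `candidate` would have to cover every row as well, which is exactly the slow brute force search described at the beginning.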

That is one of the main reasons for wanting to calibrate your stereo camera. Another good reason is that the lenses of the cameras introduce some distortion. That distortion makes straight lines in the real world appear curved in the image... and you don't like that, do you?

Let me show you a couple of images just to clarify.

The image above shows a couple of images taken with an uncalibrated stereo camera. The cameras are more or less parallel, but they are not perfectly aligned. Do you see the red point? See how it is not in the same line on the left and right images?

Now, almost all the methods used to calculate a dense disparity map rely on calibrated images, so if we try to use these images to calculate the dense disparity map we will get really poor results. You can check it in the next picture:

But now, if we apply the magic of calibration:

The image above has been rectified and undistorted (notice the black borders around the image; they are the result of removing the distortion and aligning the images so that the epipolar lines are parallel and appear on the same row in both images). See the green point? Do you see how it appears in the same row in both images?

Now, if we use this to calculate the dense disparity map:

There it is, much better results!!

To sum up, if you want to get the best out of stereo vision:

  1. Make sure that your cameras are as parallel as possible.
  2. Calibrate the stereo camera. See this post for instructions:
  3. Tune the parameters of your stereo matching algorithm. See this post to get an example:
  4. Have fun with it!
So you know, questions are welcome and any comments will be appreciated ;)

Sunday, August 21, 2011



In a previous post I talked about how to calibrate a stereo camera using OpenCV. Today, I would like to talk about the next step. Once your stereo camera is calibrated you can estimate the 3D position (relative to the camera) of any object given its position in the left and right images. For that, we need to calculate the stereo disparity for that object (stereo disparity = the difference in image location of an object seen by the left and right cameras). If we want to know the 3D position of all points in a stereo pair of images, then we need to compute a dense disparity map. And that is what this post is about.
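For a single point, the step from disparity to 3D position can be sketched as follows. This assumes rectified images, the same focal length for both cameras and a known principal point (cx, cy); the function and parameter names are hypothetical and the values would normally come from your calibration.

```python
def triangulate(x_left, x_right, y, f_px, baseline, cx, cy):
    """3D position (X, Y, Z) of a point seen in a rectified stereo pair.

    x_left / x_right are the pixel columns in each image, y the shared row.
    The result is in the units of `baseline`, relative to the left camera.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = f_px * baseline / disparity        # depth
    x = (x_left - cx) * z / f_px           # horizontal offset
    y3d = (y - cy) * z / f_px              # vertical offset
    return (x, y3d, z)
```

For example, with a hypothetical focal length of 600 pixels, a 6 cm baseline and a principal point at (320, 240), `triangulate(380, 320, 240, 600, 6, 320, 240)` places the point 60 cm in front of the camera and 6 cm to the side.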

A dense disparity map looks like this:

I am not going to explain the details or the math behind it; I am more of a practical kind of guy. So let's start.

Basically, OpenCV provides 2 methods to calculate a dense disparity map:

  1. cvFindStereoCorrespondenceBM
  2. cvFindStereoCorrespondenceGC

In this post I will focus on cvFindStereoCorrespondenceBM, which is based on Konolige's Block Matching Algorithm. The OpenCV call looks like this:

void cvFindStereoCorrespondenceBM(const CvArr* left, const CvArr* right, CvArr* disparity, CvStereoBMState* state)

The structure CvStereoBMState contains all the parameters that are applicable to the algorithm. There are a bunch of them (pre-filtering, Sum of Absolute Differences window size, disparity-related, post-filtering...). So, to make it easy, I implemented a small Gtk application that takes 2 images (left image and right image), calculates the disparity map using cvFindStereoCorrespondenceBM and allows you to play with the parameters.
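To give an idea of what two of those parameters control, here is a brute-force sketch of block matching in plain Python. It is not OpenCV's implementation (there is no pre-filtering or post-filtering and it is far slower); `sad_window` and `num_disparities` are stand-ins that loosely mirror the SADWindowSize and numberOfDisparities fields of CvStereoBMState.

```python
def block_matching(left, right, sad_window=5, num_disparities=16):
    """Dense disparity map via brute-force SAD block matching.

    left/right are rectified gray images given as lists of rows. Pixels too
    close to the border to be matched are set to -1.
    """
    half = sad_window // 2
    height, width = len(left), len(left[0])
    disparity = [[-1] * width for _ in range(height)]
    for y in range(half, height - half):
        for x in range(half, width - half):
            best_d, best_cost = -1, None
            # A point at column x in the left image sits at x - d in the right one.
            for d in range(num_disparities):
                if x - d - half < 0:
                    break  # the window would fall outside the right image
                cost = 0
                for dy in range(-half, half + 1):
                    for dx in range(-half, half + 1):
                        cost += abs(left[y + dy][x + dx]
                                    - right[y + dy][x - d + dx])
                if best_cost is None or cost < best_cost:
                    best_d, best_cost = d, cost
            disparity[y][x] = best_d
    return disparity
```

A bigger window gives smoother but less detailed maps, and the number of disparities limits how close an object can be before it can no longer be matched.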

The application is written in C and can be downloaded here: StereoBMTuner-1.0. The application depends on the libraries gtk+-2.0, gmodule-2.0 and opencv. Be sure to have them installed in your system.

Once the file is downloaded just execute:

tar xzvf stereoBMTunner-1.0.tgz
cd StereoBMTunner

The last command will execute the application

As you can see, the disparity map generated using the default parameters bears little resemblance to the first image in this post. But you can tune the parameters until you get a clearer disparity map. This video shows the use of the application:

Once the parameters are tuned, the disparity map is much better.

It is still not perfect, but it is not so bad either.
Now, to use this application with your own pair of images, the only thing you need to do is execute the application like this:

./main -left /path/to/my/image/left -right /path/to/my/image/right

And that's it. Please leave a comment if you found this useful, have any problems, questions, suggestions, impressions, etc...

Monday, August 1, 2011



Subversion is a very useful centralized version control system. It is very convenient for software development projects, as it allows the development team to work together on the same files at the same time, keep a record of all the changes and, if necessary, return to a previously stored version of the project if somebody did something wrong or decided to do things differently.

I have been using it for many years now in a professional environment and for over a year in my private projects (I use it even in my private life xD). I use it in conjunction with trac (which is a powerful tool for Project Management). You might think I am a bit crazy (or completely crazy), but when you have a lot on your mind it is worth having some kind of organization method and this has worked for me so far.

Anyway, recently I was working on my project, committed some changes and got this:

marpemar@MARTIN-PC:~$ svn ci -m "generate_training_samples_from_monocular_images generate color samples"
Sending generate_training_samples_from_monocular_images/main.cpp
Transmitting file data .
Committed revision 666.

Yay! I've reached 666 commits xD To celebrate it I would like to share this video recorded with gource showing the progress and evolution of my private repository until the moment of the evil commit.  Enjoy it!

Thursday, July 28, 2011



Today I came across a video of a tech talk at TED conference where a guy, working for the German automation company Festo, presented an astonishing development: A robot that flies like a real bird

Enjoy the video.

Saturday, July 23, 2011



Many people contact me through this blog to ask the following question: "Hey Martin! How could I get started in the world of Computer Vision?". Well, this book is the answer.

"OpenCV 2 Computer Vision Application Programming Cookbook" is more than just a Cookbook. The author, Robert Laganiere, makes no assumptions regarding the level of knowledge of the reader, so he starts from the basics and goes into more complex subjects progressively. It doesn't matter if you are a total beginner or an experienced user of OpenCV, all the explanations are complete and easy to follow. When the author considers that the audience could be eager to get more details about any of the topics covered in the book, he provides the appropriate bibliography

The code examples are programmed in C++, keeping in mind the performance and always trying to get the best out of Object Oriented programming paradigm. Actually, even an experienced programmer can learn many tips regarding OO programming best practices from this book. As I said: more than just a Cookbook.

You can see it for yourself by reading the sample chapter that you can find here.

Of course, the functionality and capabilities of OpenCV far exceed what can be covered in only one book, but thanks to this cookbook you will have no problem mastering OpenCV and being ready to unleash all the potential of the most used Computer Vision library.

Sunday, July 17, 2011



Packt Open Source has this week announced a series of discounts on its selection of best selling Open Source books. Readers will be offered exclusive discounts off the cover price of selected print books and eBooks for a limited period only. 

So far in 2011, Packt Open Source announced in March that its donations to Open Source projects had surpassed the $300,000 mark, while in April insight into various projects was offered during the ‘Believe in Open Source’ campaign. July’s series of discounts continues this trend of Packt showing its commitment to the Open Source community.

The Packt Open Source books included in this exclusive discount offer include well known books such as JBoss AS 5 Performance Tuning, PHP jQuery Cookbook, Drupal 7 Module Development and Blender Lighting and Rendering, amongst others.

“This special discount showcases a host of Packt Open Source topics and allows readers to purchase some of our most well renowned books at an exclusive price,” said Packt Open Source Marketing Executive Julian Copes.

To ensure you do not miss this fantastic offer, visit the special discount page now, where you can view the extensive list of books included in the offer and access an array of related articles that were written by the authors.

The exclusive discounts are available from 4th July 2011. To find out more, please visit the Packt website.

Thursday, July 14, 2011



After the release of the Kinect SDK, here is what many people were waiting for: Kinect Services for MRDS.

This package allows you to use your Kinect within Microsoft Robotics Developer Studio and moreover use it in a Simulated Environment! 

Good job MRDS team!

Thursday, June 16, 2011



Microsoft has launched a beta version of an SDK for Kinect, which aims to motivate people to develop crazy Kinect-based applications for PCs running Windows (check out ROS for the open source version).

The good news is that, recently, Trevor Taylor (Program Manager in the Microsoft Robotics Group and co-author of the book "Professional Microsoft Robotics Developer Studio") confirmed to me in the MRDS forum that the guys at Microsoft are working on providing a simulated version of Kinect for Robotics Developer Studio.

In my humble opinion, I believe that it will be a major contribution to the robotics community as you will be able to easily run very advanced experiments in a completely simulated environment!

Trevor promised that it will be ready soon so, get ready!

Tuesday, June 14, 2011



Last week a new book about OpenCV came to my attention: OpenCV 2 Computer Vision Application Programming Cookbook. Stay tuned for the upcoming review on this blog!

Monday, May 30, 2011



Today I would like to share a library for Arduino that makes it easy to control an SSC32.

But, what is a SSC32?
The SSC32 is a Serial Servo Controller, capable of controlling up to 32 servo motors at a time. Very useful in robotic projects such as arms or humanoids.

You can buy it for about $40. The thing about this piece of hardware is that it is very flexible. It is not only a servo controller; it also provides some pins where you can connect sensors and read their values. And the most useful of its features (in my humble opinion) is that you can specify a group of servos, set each servo in the group to move to a different position and specify how long it will take for all the servos to finish the movement. It doesn't matter whether each servo is far from or near its final position: the SSC32 will calculate and apply the speed for each servo so that they all finish at the same time!!

This is very powerful for creating complex movements, and they will be performed smoothly.
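As an illustration of the group move, here is a small sketch that builds the text command the SSC32 expects on its serial port, as far as I understand its manual: one "#channel Ppulse" pair per servo, a single "T" with the time in milliseconds and a carriage return at the end. The function names are made up for this example and have nothing to do with the Arduino library presented below.

```python
def group_move_command(targets, time_ms):
    """Build an SSC32 group move: every listed servo arrives after `time_ms`.

    `targets` maps a channel number (0-31) to a pulse width in microseconds.
    """
    parts = []
    for channel, pulse in sorted(targets.items()):
        if not 0 <= channel <= 31:
            raise ValueError("channel must be between 0 and 31")
        if not 500 <= pulse <= 2500:
            raise ValueError("pulse width must be between 500 and 2500")
        parts.append(f"#{channel} P{pulse}")
    # A single T applies to the whole group; a carriage return ends the command.
    return " ".join(parts) + f" T{time_ms}\r"


def implied_speeds(current, targets, time_ms):
    """Speed (microseconds of pulse width per second) each servo needs."""
    seconds = time_ms / 1000.0
    return {ch: abs(targets[ch] - current[ch]) / seconds for ch in targets}
```

For instance, moving servo 0 from 1500 and servo 1 from 900 to position 750 in 5 seconds means servo 0 has to travel five times faster than servo 1, which is the calculation the controller does for you.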

Let's get down to business.
In my case, I am using the SSC32 to control an arm with 6 servos (more about this in future posts)  and I thought it would be nice to be able to command it from an Arduino.

The only "problem" is that the serial protocol defined by the SSC32 is a bit "uncomfortable" to handle, so I created a class to handle it and make it a bit more comfortable.

You can find the library for Arduino here (decompress it inside the "libraries" folder of your arduino environment):

And here is an example of how to use the library:

#include <SSC32.h>

/*
  Tests the SSC32 library. By Martin Peris (
  This example code is in the public domain.
*/

SSC32 myssc = SSC32();

void setup() {

  //Start communications with the SSC32 device

}

void loop() {

  //Move motor 0 to position 750
  //The first command should not define any speed or time, is used as initialization by SSC32


  //Move motor 1 to position 750
  //The first command should not define any speed or time, is used as initialization by SSC32


  //Move motor 0 to position 1500. It will take 5 seconds to finish the movement.


  //Move motor 1 to position 900. It will take 5 seconds to finish the movement

  //Move both servos to position 750 at the same time.
  //The movement will take 5 seconds to complete
  //Notice that currently motor 0 is at position 1500 and motor 1 is at position 900,
  //but they will reach position 750 at the same time

}

[EDIT 2011/11/24]: The library has been adapted to the new version of arduino's IDE. Thanks to Marco Schwarz.

Monday, April 11, 2011



Lately I have been playing around with a WiiChuck connected to an Arduino. I bought an adapter to connect the WiiChuck to the arduino and downloaded the source code from the arduino playground, by Tim Hirzel.

I compiled the example and uploaded it to the arduino, but there was no response from the Wiichuck. The example was supposed to print all the data coming from the Wiichuck on the serial port, but it was not working.

I found the solution in this forum. The problem was that you need to define the "Power" and "Gnd" pins for the Wiichuck in order to power it up. So here is the modified WiiChuckClass:

/*
 * Nunchuck -- Use a Wii Nunchuck
 * Tim Hirzel
 notes on Wii Nunchuck Behavior.
 This library provides an improved derivation of rotation angles from the nunchuck accelerometer data.
 The biggest difference over existing libraries (that I know of) is the full 360 degrees of Roll data
 from the combination of the x and z axis accelerometer data using the math library atan2.

 It is accurate with 360 degrees of roll (rotation around axis coming out of the c button, the front of the wii),
 and about 180 degrees of pitch (rotation about the axis coming out of the side of the wii).  (read more below)

 In terms of mapping the wii position to angles, it's important to note that while the Nunchuck
 senses Pitch and Roll, it does not sense Yaw, or the compass direction.  This creates an important
 disparity where the nunchuck only works within one hemisphere.  As a result, when the pitch values are
 less than about 10, and greater than about 170, the Roll data gets very unstable.  Essentially, the roll
 data flips over 180 degrees very quickly.   To understand this property better, rotate the wii around the
 axis of the joystick.  You see the sensor data stays constant (with noise).  Because of this, it can't know
 the difference between arriving upside down via 180 degree Roll, or 180 degree pitch.  It just assumes it's always
 180 roll.

 * This file is an adaptation of the code by these authors:
 * Tod E. Kurt,
 * The Wii Nunchuck reading code is taken from Windmeadow Labs

 * Modified by Martin Peris, to declare which are the power pins
 * for the wiichuck, otherwise it will not be powered up
 */

#ifndef WiiChuck_h
#define WiiChuck_h

#include "WProgram.h"
#include <Wire.h>
#include <math.h>

// these may need to be adjusted for each nunchuck for calibration
#define ZEROX 510
#define ZEROY 490
#define ZEROZ 460
#define RADIUS 210  // probably pretty universal

#define DEFAULT_ZERO_JOY_X 124
#define DEFAULT_ZERO_JOY_Y 132

//Set the power pins for the wiichuck, otherwise it will not be powered up
#define pwrpin PORTC3
#define gndpin PORTC2

class WiiChuck {
    private:
        byte cnt;
        uint8_t status[6];              // array to store wiichuck output
        byte averageCounter;
        //int accelArray[3][AVERAGE_N];  // X,Y,Z
        int i;
        int total;
        uint8_t zeroJoyX;   // these are about where mine are
        uint8_t zeroJoyY; // use calibrateJoy when the stick is at zero to correct
        int lastJoyX;
        int lastJoyY;
        int angles[3];

        boolean lastZ, lastC;

    public:

        byte joyX;
        byte joyY;
        boolean buttonZ;
        boolean buttonC;

        void begin()
        {
            //Set power pins
            DDRC |= _BV(pwrpin) | _BV(gndpin);

            PORTC &=~ _BV(gndpin);

            PORTC |=  _BV(pwrpin);

            delay(100);  // wait for things to stabilize

            Wire.begin();  // join the I2C bus as master

            //send initialization handshake
            cnt = 0;
            averageCounter = 0;
            Wire.beginTransmission (0x52);      // transmit to device 0x52
            Wire.send (0x40);           // sends memory address
            Wire.send (0x00);           // sends memory address
            Wire.endTransmission ();    // stop transmitting
            for (i = 0; i<3;i++) {
                angles[i] = 0;
            }
            zeroJoyX = DEFAULT_ZERO_JOY_X;
            zeroJoyY = DEFAULT_ZERO_JOY_Y;
        }

        void calibrateJoy() {
            zeroJoyX = joyX;
            zeroJoyY = joyY;
        }

        void update() {

            Wire.requestFrom (0x52, 6); // request data from nunchuck
            while (Wire.available ()) {
                // receive byte as an integer
                status[cnt] = _nunchuk_decode_byte (Wire.receive()); //
                cnt++;  // count the bytes received so far
            }
            if (cnt > 5) {
                lastZ = buttonZ;
                lastC = buttonC;
                lastJoyX = readJoyX();
                lastJoyY = readJoyY();
                //averageCounter ++;
                //if (averageCounter >= AVERAGE_N)
                //    averageCounter = 0;

                cnt = 0;
                joyX = (status[0]);
                joyY = (status[1]);
                for (i = 0; i < 3; i++) {
                    //accelArray[i][averageCounter] = ((int)status[i+2] << 2) + ((status[5] & (B00000011 << ((i+1)*2) ) >> ((i+1)*2))); 
                    // 8 high bits from status[i+2], 2 low bits packed in status[5]
                    angles[i] = (status[i+2] << 2) + ((status[5] & (B00000011 << ((i+1)*2) ) >> ((i+1)*2)));
                }

                //accelYArray[averageCounter] = ((int)status[3] << 2) + ((status[5] & B00110000) >> 4); 
                //accelZArray[averageCounter] = ((int)status[4] << 2) + ((status[5] & B11000000) >> 6); 

                buttonZ = !( status[5] & B00000001);
                buttonC = !((status[5] & B00000010) >> 1);
                _send_zero(); // send the request for next bytes

            }
        }

    //byte * getStatus() {
    //    return status;
    //}

    float readAccelX() {
       // total = 0; // accelArray[xyz][averageCounter] * FAST_WEIGHT;
        return (float)angles[0] - ZEROX;
    }
    float readAccelY() {
        // total = 0; // accelArray[xyz][averageCounter] * FAST_WEIGHT;
        return (float)angles[1] - ZEROY;
    }
    float readAccelZ() {
        // total = 0; // accelArray[xyz][averageCounter] * FAST_WEIGHT;
        return (float)angles[2] - ZEROZ;
    }

    boolean zPressed() {
        return (buttonZ && ! lastZ);
    }
    boolean cPressed() {
        return (buttonC && ! lastC);
    }

    // for using the joystick like a directional button
    boolean rightJoy(int thresh=60) {
        return (readJoyX() > thresh and lastJoyX <= thresh);
    }

    // for using the joystick like a directional button
    boolean leftJoy(int thresh=60) {
        return (readJoyX() < -thresh and lastJoyX >= -thresh);
    }

    int readJoyX() {
        return (int) joyX - zeroJoyX;
    }

    int readJoyY() {
        return (int)joyY - zeroJoyY;
    }

    // R, the radius, generally hovers around 210 (at least it does with mine)
   // int R() {
   //     return sqrt(readAccelX() * readAccelX() +readAccelY() * readAccelY() + readAccelZ() * readAccelZ());  
   // }

    // returns roll degrees
    int readRoll() {
        return (int)(atan2(readAccelX(),readAccelZ())/ M_PI * 180.0);
    }

    // returns pitch in degrees
    int readPitch() {
        return (int) (acos(readAccelY()/RADIUS)/ M_PI * 180.0);  // optionally swap 'RADIUS' for 'R()'
    }

    private:
        byte _nunchuk_decode_byte (byte x)
        {
            x = (x ^ 0x17) + 0x17;
            return x;
        }

        void _send_zero()
        {
            Wire.beginTransmission (0x52);      // transmit to device 0x52
            Wire.send (0x00);           // sends one byte
            Wire.endTransmission ();    // stop transmitting
        }

};

#endif



Works like a charm for me :)
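If you want to check the arithmetic of the class on your computer before uploading anything, the byte decoding and the angle math can be reproduced in a few lines. This is only a sketch for experimenting: the function names are mine, and the clamp before acos is an addition so that noisy readings larger than RADIUS do not raise an error.

```python
import math

RADIUS = 210  # same constant as in the class above


def decode_byte(x):
    """Undo the nunchuck's byte obfuscation, keeping the result in 0-255."""
    return ((x ^ 0x17) + 0x17) & 0xFF


def accel_10bit(high_byte, status5, axis):
    """Combine the 8 high bits with the 2 low bits stored in status[5].

    axis is 0, 1 or 2 for X, Y, Z; their low bits sit at bit offsets 2, 4, 6.
    """
    shift = (axis + 1) * 2
    return (high_byte << 2) + ((status5 >> shift) & 0b11)


def roll_degrees(accel_x, accel_z):
    """Roll from the X and Z accelerations; atan2 covers the full circle."""
    return int(math.atan2(accel_x, accel_z) / math.pi * 180.0)


def pitch_degrees(accel_y):
    """Pitch from the Y acceleration; the ratio is clamped so acos is defined."""
    ratio = max(-1.0, min(1.0, accel_y / RADIUS))
    return int(math.acos(ratio) / math.pi * 180.0)
```

The accelerations passed to the two angle functions are the zero-corrected values, that is, what readAccelX(), readAccelY() and readAccelZ() return in the class.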

Wednesday, March 2, 2011



Today I would like to share an idea that I had some time ago. Since I came to Tsukuba I've met a lot of nice people; one of them is my good friend Rob Howland. You see, he is a skater, and he loves going around on his skate, sometimes even late at night.

The problem is that the area where we live is kind of dark at night and sometimes you don't even see the road. Not to mention the potential danger of cars not being able to see you.

So, that is why I designed the prototype board for the "Arduino Night Rider":

It will have 30 LEDs, 24 of which will be blue; they will be placed around the skate and used to make cool effects. There will be a white one at the front (I sketched it as a single white LED, but it will be a group of white LEDs so it can light up the way in front of you); this LED will always be on.

There will be a red LED at the back that will blink continuously (like in F1 cars) so you will be easily noticed from behind. There will be 4 yellow LEDs, one on each corner of the skate, which will be used as "direction lights".

3 tilt sensors will be placed along the skate. Two of them will sense which direction you are turning (left or right): if you turn left, the 2 yellow LEDs on the left side will blink, and the same goes for the right side when you turn right.

The last tilt sensor will sense when you raise the skate and trigger some cool stuff with the blue LEDs.

The electronics are quite simple and programming the Arduino would take me a couple of afternoons, but I think the most difficult part will be making it "look professional".

I don't know if we will have the time, but it would be very cool if we finally do this :) 

Sunday, February 6, 2011



As you might know, the simulator integrated in Microsoft Robotics Developer Studio can import 3D objects into your simulated world. The most common format used to import those objects is ".obj", a simple format that most 3D design programs can export to.

For the project that I am working on right now, I needed to insert a realistic 3D model of a human face into the simulation. The model of the 3D face was stored in 2 different sets of files:

Head 1.obj
Head 1.mtl
Head 1.bmp

Eyeballs 1.obj
Eyeballs 1.mtl
Eyeballs 1.bmp

The first set of files corresponds to the model of the head and the second set corresponds to the model of the eyes. The .obj files define the geometry of the objects, the .mtl define the properties of the materials of the objects and the .bmp is the texture.

So, first let's try to insert the head. The source code would look like this; add it to the definition of the simulated world:

//Insert the head 
SingleShapeEntity head =
 new SingleShapeEntity(
  new SphereShape(
   new SphereShapeProperties(
    new Pose(),
  new Vector3(0.0f, 0.5f, -2f));
head.State.Assets.Mesh = "Head 1.obj";
head.SphereShape.SphereState.Material = new MaterialProperties("sphereMaterial", 0.5f, 0.4f, 0.5f);
head.State.Name = "Head";


Ok, now compile and execute the simulation:

Uhmmm... WHERE IS MY HEAD? Ok, don't panic! Let's try to find if it is out there. Select the "edit mode" on the simulator window:

Well, it looks like the head is inserted in the simulation, but for some esoteric reason we are not able to see it. Let's look around and try to find it. Use the mouse to rotate the camera in the simulated environment.

Did you see that?

There are some vertices there... wait a minute! Let's move the camera about 20 meters away.

There it is!! But it is a huge head!! No problem... just resize it. To do that, just add the proper line of code before inserting the object into the Simulation Engine:

head.MeshScale = new Vector3(0.01f, 0.01f, 0.01f);

Great, but where is the texture? Mmm, that's odd. Why is it not showing the texture?

If you open the File "Head 1.obj" you will see something like this:

# 3ds Max Wavefront OBJ Exporter v0.97b - (c)2007 guruware
# File Created: 30.01.2011 20:28:48

mtllib Head 1.mtl


In the .obj file there is a reference to "Head 1.mtl". I'd bet that having spaces in the file name freaks MRDS out. Well... let's open that file in a text editor:

# 3ds Max Wavefront OBJ Exporter v0.97b - (c)2007 guruware
# File Created: 30.01.2011 20:28:48

newmtl defaultMat
 Ns 30.0000
 Ni 1.5000
 d 1.0000
 Tr 0.0000
 Tf 1.0000 1.0000 1.0000 
 illum 2
 Ka 0.5500 0.5500 0.5500
 Kd 0.5500 0.5500 0.5500
 Ks 0.0000 0.0000 0.0000
 Ke 0.0000 0.0000 0.0000
 map_Ka C:\Users\sama-sama\Desktop\Head 1\Head 1.bmp
 map_Kd C:\Users\sama-sama\Desktop\Head 1\Head 1.bmp

Mmm, interesting: the file "Head 1.mtl" has an absolute path reference to the file "Head 1.bmp" (a path that, by the way, doesn't even exist on my computer; it is a path on the computer of the designer who modeled the head. Thanks a lot, 3ds Max). Let's try something: remove the spaces in the file names and get rid of the absolute paths.

Now the files should be named as:

Head1.obj
Head1.mtl
Head1.bmp

Now the content of Head1.obj should look like:

# 3ds Max Wavefront OBJ Exporter v0.97b - (c)2007 guruware
# File Created: 30.01.2011 20:28:48

mtllib Head1.mtl


And the content of Head1.mtl should be like:

# 3ds Max Wavefront OBJ Exporter v0.97b - (c)2007 guruware
# File Created: 30.01.2011 20:28:48

newmtl defaultMat
 Ns 30.0000
 Ni 1.5000
 d 1.0000
 Tr 0.0000
 Tf 1.0000 1.0000 1.0000 
 illum 2
 Ka 0.5500 0.5500 0.5500
 Kd 0.5500 0.5500 0.5500
 Ks 0.0000 0.0000 0.0000
 Ke 0.0000 0.0000 0.0000
 map_Ka Head1.bmp
 map_Kd Head1.bmp

Save the changes to all the files, and execute again:

Yeah! Much better!! Now we only have to make the same changes in the files corresponding to the eyes and insert them into the simulation. The final source code would be:

//Insert the head
SingleShapeEntity head =
 new SingleShapeEntity(
  new SphereShape(
   new SphereShapeProperties(
    1.0f, // mass (assumed value)
    new Pose(),
    0.1f)), // radius (assumed value)
  new Vector3(0.0f, 0.5f, -2f));
head.State.Assets.Mesh = "Head1.obj";
head.SphereShape.SphereState.Material = new MaterialProperties("sphereMaterial", 0.5f, 0.4f, 0.5f);
head.State.Name = "Head";
head.MeshScale = new Vector3(0.01f, 0.01f, 0.01f);
//Insert the eyeballs
SingleShapeEntity eyeballs =
 new SingleShapeEntity(
  new SphereShape(
   new SphereShapeProperties(
    1.0f, // mass (assumed value)
    new Pose(),
    0.1f)), // radius (assumed value)
  new Vector3(0.0f, 0.5f, -2f));
eyeballs.State.Assets.Mesh = "eyeballs1.obj";
eyeballs.SphereShape.SphereState.Material = new MaterialProperties("sphereMaterial", 0.5f, 0.4f, 0.5f);
eyeballs.State.Name = "Eyeballs";
eyeballs.MeshScale = new Vector3(0.01f, 0.01f, 0.01f);

Let me introduce you to Mary:

I would like to specially thank Sama-Sama Studio for providing several realistic head models for the experiments of my project.

Sunday, January 30, 2011



Today I would like to talk about a very interesting computer vision technique to control robots: Visual Servoing. Also called Vision-Based Robot Control, it is a technique that uses the information gathered from a vision sensor (usually a camera) to control the motion of a robot.

A very good starting point is the tutorial Visual Servo Control, Part I: Basic Approaches by F. Chaumette and S. Hutchinson, published in IEEE Robotics and Automation Magazine, 13(4):82-90, December 2006.

The basic idea of a vision-based control scheme is to minimize the error between a set of measurements (usually the x,y coordinates of several image features) taken at the goal point of view and the same set of measurements taken at the current point of view.
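
Just to give that idea a shape, this is roughly how the tutorial cited above writes it down (quoted from memory, so please check the notation against the paper): s is the vector of features measured at the current point of view, s* is the same vector at the goal point of view, L_e is the so-called interaction matrix that relates the camera velocity v_c to the change of the error, and lambda is a positive gain.

```latex
\mathbf{e}(t) = \mathbf{s}(t) - \mathbf{s}^{*}, \qquad
\dot{\mathbf{e}} = \mathbf{L_e}\,\mathbf{v}_c, \qquad
\mathbf{v}_c = -\lambda\,\widehat{\mathbf{L_e^{+}}}\,\mathbf{e}
```

Asking for an exponential decrease of the error (the derivative of e equal to -lambda times e) and solving for the camera velocity with an estimate of the pseudo-inverse of L_e gives the last expression.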

In this post I will not go into the details of the method; for that I refer you to the article cited above. Instead, I would like to show you a video of the simulated robot I am working with doing Visual Servo Control.

First the robot is activated and it learns what the target pattern looks like from the starting point of view (this will be the "goal point of view"). Then the target pattern is moved 6cm up. Obviously, from this position the camera sees the target from a different point of view than before, so the robot moves to recover the original point of view.

Later the target is moved 50cm towards the robot, and again the robot moves itself until the goal point of view of the target is recovered.

The demo has been developed using Microsoft Robotics Developer Studio and EmguCV.

Sunday, January 9, 2011



One of the basic tasks in Computer Stereo Vision is to calibrate the stereo camera in order to obtain the parameters that will allow you to calculate 3D information of the scene.

Now, I could tell you a lot of stuff about camera projection models, stereoscopy, lens distortion, etc... but there is a lot of information available about such topics out there. So, this post is for those who simply need to calibrate a stereo camera system and start calculating 3D stuff right away by using OpenCV.

Anyway, I strongly recommend that you read the book Learning OpenCV: Computer Vision with the OpenCV Library by Gary Bradski and Adrian Kaehler, published by O'Reilly Media, October 3, 2008.

So... what do I need to calibrate my stereo camera? A chessboard like this:

Why a chessboard? Because its corners are very easy to find with computer vision algorithms and its geometry is very simple. In order to find out the position of any corner you only need to know how many horizontal and vertical squares there are in the chessboard and the size of a square. The chessboard in the image is a 9x6 chessboard, and if you print it on A4 paper the squares will measure roughly 2.5cm.
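
To see why those three numbers are enough, here is a small Python sketch that lists the position of every corner on the printed board. It is only an illustration: the function name chessboard_points is made up, and the board is assumed to lie on the Z = 0 plane with its first corner at the origin. Keep in mind that the OpenCV chessboard detector counts inner corners (the points where four squares meet), so the two counts you pass in must be given in those terms.

```python
def chessboard_points(nx, ny, square_size):
    """Return the (X, Y, Z) position of every corner, row by row.

    The board is taken as the Z = 0 plane with its first corner at the origin.
    """
    return [(col * square_size, row * square_size, 0.0)
            for row in range(ny)
            for col in range(nx)]


points = chessboard_points(9, 6, 2.5)
print(len(points))   # 54 corners
print(points[1])     # (2.5, 0.0, 0.0)
print(points[-1])    # (20.0, 12.5, 0.0)
```

The units of the result are the units of the square size, so measuring the squares in cm gives 3D coordinates in cm.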

OK, I've printed my chessboard and I have measured the real size of the squares, now what?
Now you just take multiple views of the chessboard in different positions and orientations with your stereo camera using your favorite software (maybe your own software, software provided by your camera manufacturer or some other free software like Coriander). The images should look like this:

(Yeah, that is me in Hawaiian shorts on a summer day :P)
The more varied the positions and orientations of the chessboard in the images, the better.
Great, you have taken a lot of shots of the chessboard in different positions, now create a text file with the paths to the images. For example:

Now download this software and compile it.

It is just one of the examples of the book mentioned above that I modified to accept some configuration parameters and store the results of the calibration. The usage of the software is as follows:

USAGE: ./stereo_calibrate imageList nx ny squareSize
imageList : Filename of the image list (string). Example : list.txt
nx : Number of horizontal squares (int > 0). Example : 9
ny : Number of vertical squares (int > 0). Example : 6
squareSize : Size of a square (float > 0). Example : 2.5

So, in this example the call to the program stereo_calibrate would be:

./stereo_calibrate list.txt 9 6 2.5

The program will start showing the detected chessboards, calculate the calibration parameters and store them in a bunch of xml files:

D1.xml D2.xml
M1.xml M2.xml
mx1.xml mx2.xml
my1.xml my2.xml
P1.xml P2.xml
R1.xml R2.xml
Q.xml

Congratulations! You have calibrated your stereo camera!! Now you can load these parameters into any other program that uses that stereo camera and play with them:

CvMat *Q = (CvMat *)cvLoad("Q.xml",NULL,NULL,NULL);
CvMat *mx1 = (CvMat *)cvLoad("mx1.xml",NULL,NULL,NULL);
CvMat *my1 = (CvMat *)cvLoad("my1.xml",NULL,NULL,NULL);
CvMat *mx2 = (CvMat *)cvLoad("mx2.xml",NULL,NULL,NULL);
CvMat *my2 = (CvMat *)cvLoad("my2.xml",NULL,NULL,NULL);

Each of the files contains a matrix. If you would like to know the meaning of each matrix, please refer to the book mentioned at the beginning of this post. Right now, the useful stuff is contained in the files mx1.xml, my1.xml, mx2.xml, my2.xml and Q.xml.

The files mx*.xml and my*.xml are the distortion models of the cameras (one x map and one y map per camera), so we need these matrices to undo the distortion that the lenses introduce in the images. This is done with the cvRemap() function:

cvRemap(imgLeftOrig, imgLeftUndistorted, mx1, my1);
cvRemap(imgRightOrig, imgRightUndistorted, mx2, my2);
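
If you are wondering what a remap actually does with those two matrices: every pixel (x, y) of the output image is read from the input image at the position stored in the x map and the y map. Here is a toy Python sketch of that idea. It is not the OpenCV implementation: the function name remap_nearest is made up, and it simply rounds to the nearest pixel, whereas cvRemap() interpolates between neighbouring pixels by default.

```python
def remap_nearest(src, map_x, map_y, fill=0):
    """dst[y][x] = src[map_y[y][x]][map_x[y][x]], rounded to the nearest pixel."""
    height, width = len(map_x), len(map_x[0])
    dst = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            sx = int(round(map_x[y][x]))
            sy = int(round(map_y[y][x]))
            # Positions that fall outside the source keep the fill value.
            if 0 <= sy < len(src) and 0 <= sx < len(src[0]):
                dst[y][x] = src[sy][sx]
    return dst


# Toy maps that shift a 2x3 image one pixel to the left.
image = [[10, 20, 30],
         [40, 50, 60]]
map_x = [[1.0, 2.0, 3.0],
         [1.0, 2.0, 3.0]]
map_y = [[0.0, 0.0, 0.0],
         [1.0, 1.0, 1.0]]
print(remap_nearest(image, map_x, map_y))
```

With real calibration maps the shift is different for every pixel, which is how the curved lines of a distorted image get straightened.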

The goal of all this is to be able to calculate the 3D position (in meters, cm, mm or whatever unit you chose for the square size) of a point given its position (in pixels) on the left image and the position of its correspondent on the right image. We are almost there, but for that we need the matrix Q. Given the position of an interest point in the left and right images, its 3D position can be calculated as follows:

d = pointRightImage.X - pointLeftImage.X;

X = pointLeftImage.X * Q[0, 0] + Q[0, 3];
Y = pointLeftImage.Y * Q[1, 1] + Q[1, 3];
Z = Q[2, 3];
W = d * Q[3, 2] + Q[3, 3];

X = X / W;
Y = Y / W;
Z = Z / W;
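
The same computation, wrapped in a small Python function with a made-up Q so you can check the numbers by hand. Everything in this sketch is an assumption for illustration: the function name reproject_to_3d, the focal length of 500 pixels, the principal point at (320, 240) and the value 0.125 in Q[3][2] (one over an 8 cm baseline). A real Q comes from your own Q.xml. Also watch the sign of d: the OpenCV stereo matchers report the disparity as left minus right, so if your Z values come out negative, flipping the sign of d is the first thing to try.

```python
def reproject_to_3d(x_left, y_left, d, Q):
    """Turn a left-image pixel and its disparity d into (X, Y, Z) using Q."""
    X = x_left * Q[0][0] + Q[0][3]
    Y = y_left * Q[1][1] + Q[1][3]
    Z = Q[2][3]
    W = d * Q[3][2] + Q[3][3]
    if W == 0:
        raise ValueError("zero disparity: the point is at infinity")
    return (X / W, Y / W, Z / W)


# Hypothetical Q: focal length 500 px, principal point (320, 240),
# Q[3][2] = 0.125 (one over an 8 cm baseline), principal points aligned.
Q = [[1.0, 0.0, 0.0, -320.0],
     [0.0, 1.0, 0.0, -240.0],
     [0.0, 0.0, 0.0, 500.0],
     [0.0, 0.0, 0.125, 0.0]]

print(reproject_to_3d(370.0, 240.0, 20.0, Q))  # (20.0, 0.0, 200.0), in cm
```

With these numbers a point 50 pixels to the right of the image center and with a disparity of 20 pixels ends up 20 cm to the right of the camera axis and 2 meters away.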

And that's pretty much it: now you know how to calculate 3D positions from 2 images using OpenCV. How to find interest points in one image and their correspondents in the other is an art that will be explained another day ;)

EDIT [2011/06/16]: Many people asked me about a good book to get started in OpenCV, so take a look at this:  OpenCV 2 Computer Vision Application Programming Cookbook

EDIT [2011/08/22]: What next? Check out the following post: OpenCV: Stereo Matching

EDIT [2011/08/27]: I replaced the Makefile of the software with a much simpler one without hard-coded stuff. I also moved the software to my own server (it was on Megaupload, sorry about that); the link to the software in this post has been updated.
EDIT [2012/01/05]: My hosting bandwidth was not enough to handle the traffic, so I had to host the software on Google Code; the link to the software in this post has been updated.