Face Tracking Pan and Tilt with an ESP32-CAM


Using the ESP-WHO library and a pan and tilt platform to track a moving face.

With the Espressif ESP-FACE library it’s easy to detect a face and find its location in the frame. The library provides a function called draw_face_boxes that is normally used to display a box around a detected face.

Face Detected

The X and Y co-ordinates of this box, combined with its height and width, can be used to find the centre of the box and therefore the centre of the face.

For example, if X is at 105px, Y is at 90px, and the box has a width of 50px and a height of 70px, then the centre can be found by adding half the width to the X value and half the height to the Y value: x+w/2, y+h/2. For the figures above, 105+50/2 and 90+70/2 give the face centre as x:130 and y:125.
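
As a rough sketch of that calculation, assuming the box is available as a top-left corner plus a width and height. The FaceBox and Point structs and the faceCentre function are made-up names for illustration and are not part of the ESP-WHO library:

```cpp
// Hypothetical types for illustration only; not the ESP-WHO API.
struct FaceBox {
    int x;  // left edge of the box in pixels
    int y;  // top edge of the box in pixels
    int w;  // box width in pixels
    int h;  // box height in pixels
};

struct Point {
    int x;
    int y;
};

// Centre of the box: top-left corner plus half the width and half the height.
Point faceCentre(const FaceBox &box) {
    Point centre;
    centre.x = box.x + box.w / 2;
    centre.y = box.y + box.h / 2;
    return centre;
}
```

With the figures above (x 105, y 90, width 50, height 70) this returns x 130 and y 125.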

Face Tracking Finding Co-ordinates

One of the tricky parts of using a pan and tilt platform to track a face is converting the distance of the face from the centre in pixels to the degrees the platform needs to move. I’ve chosen the simplest method using a basic conversion from pixels to degrees.

One guide I found recommended using the diagonal measurement of the sensor as below:
For QVGA (320×240)
sqrt(sq(320) + sq(240)) = 400
and then dividing this by the field of view (for my camera 45 degrees) to get the pixels per degree of rotation:
400/45 = 8.89

So for every ~9 pixels of movement in the frame, the servo moves 1 degree in that direction.
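
The conversion could be sketched like this. The function names pixelsPerDegree and offsetToDegrees are hypothetical, and the 45 degree field of view is just the figure for my camera, so treat this as an illustration of the simple method rather than the exact code in the sketch:

```cpp
#include <cmath>

// Pixels per degree using the frame diagonal divided by the field of view.
double pixelsPerDegree(int widthPx, int heightPx, double fovDegrees) {
    double diagonalPx = std::sqrt(static_cast<double>(widthPx) * widthPx +
                                  static_cast<double>(heightPx) * heightPx);
    return diagonalPx / fovDegrees;
}

// Convert the distance of the face from the frame centre (along one axis)
// into degrees. Negative values mean the face is left of / above the centre.
double offsetToDegrees(int facePx, int framePx, double pxPerDegree) {
    int offsetPx = facePx - framePx / 2;
    return offsetPx / pxPerDegree;
}
```

For QVGA and a 45 degree field of view, pixelsPerDegree(320, 240, 45.0) gives about 8.89. A face centred at x:130 is 30 pixels left of the frame centre (160), so offsetToDegrees(130, 320, 8.89) comes out at roughly -3.4 degrees.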

Face Tracking Calculating Movement

However, with the platforms I’ve used, the degrees of movement of the servos don’t coincide with the change in degrees of the view area because either the pan or tilt is offset from the centre of rotation.

Pan and Tilt Platform
ESP32-CAM Mount

My original plan was to take the reading and move the platform straight to that location, but it would often overshoot (possibly a problem with the off-centre sensor or maybe just my maths) and start oscillating back and forth. So I changed the code to move only half the registered distance each time until it reached the new location. I experimented with looping this movement until it completed and then returning to detection, but in the end I went for continuous detection and calculation.
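
A minimal sketch of the half-distance idea, in the same spirit as the pan_center = (pan_center + move_to_x) / 2; line quoted in the comments below. The halfStep function name is made up for illustration:

```cpp
// Move half way from the current servo angle towards the target angle.
// Both values are servo angles in degrees (0-180).
int halfStep(int currentAngle, int targetAngle) {
    return (currentAngle + targetAngle) / 2;
}
```

Starting at 90 with a target of 110, repeated calls give 100, 105, 107, 108 and then 109. Because of integer division the value settles one degree short of the target, which doesn't matter much in practice because a new face position is calculated on every frame.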

I’ve seen other tutorials where the servos are moved in the direction of the face until the face is in the centre of the frame, which is another approach. I think this might only work well when the frame rate is higher. The face detection runs at about 3 frames per second.

Another thing I’ve noticed is that variations in the detected face location mean the pan and tilt platform wanders a little when the face is centred. Some code could be added so that the servos are only activated if the face is outside of the centre area.
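
One possible way to do this is a simple dead zone check on each axis. This isn't in the sketch; the outsideDeadZone function and the 10 pixel threshold in the example are assumptions you would need to tune for your own setup:

```cpp
#include <cstdlib>

// Returns true when the face centre is further than deadZonePx from the
// frame centre along one axis, i.e. when it is worth moving the servo.
bool outsideDeadZone(int facePx, int framePx, int deadZonePx) {
    int offsetPx = facePx - framePx / 2;
    return std::abs(offsetPx) > deadZonePx;
}
```

For a 320 pixel wide frame and a 10 pixel dead zone, a face centred at x:165 would be ignored, while one at x:130 would trigger a pan movement.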

Face Tracking Video Demonstration

Wiring Diagram

The wiring is the same as the basic pan and tilt tutorial here. I use a USB power bank for the 5V source with one of these USB cable connectors to make it easy to connect and disconnect the power.

ESP32-CAM Pan and Tilt Wiring Diagram

The Code

If you’ve not used the ESP32-CAM before you will need to read through this tutorial first – https://robotzero.one/esp32-cam-arduino-ide/ to get familiar with it.

You also need to install the ArduinoWebsockets library by searching in Tools > Manage Libraries:

Library Manager ArduinoWebsockets

Copy and paste the Sketch below and save it. Copy these two files: camera_index.h and camera_pins.h to the same directory. You should be able to compile and run the same way as other ESP32-CAM projects.

If anyone has suggestions for improving the maths or how to calculate degrees of movement when the sensor pan or tilt movement is off the axis centre please let me know via the comments or contact form.


3D printable pan tilt mount: https://www.thingiverse.com/thing:3579507
Pan and tilt location calculation (complicated): https://stackoverflow.com/questions/44253787/translating-screen-coordinates-x-y-to-camera-pan-and-tilt-angles
Pan and tilt location calculation (simple – the one I used): https://stackoverflow.com/questions/17499409/opencv-calculate-angle-between-camera-and-pixel
The reason simple isn’t accurate: https://www.quora.com/How-can-I-find-the-pixels-per-degree-if-I-know-the-resolution-and-angle-of-view-for-a-pi-cam

15 Replies to “Face Tracking Pan and Tilt with an ESP32-CAM”

  1. Anonymous says:

    The code seems to have some Copy paste error. I mean some parts of the code repeats itself over and over again. Please check the uploaded code and make corrections in the website.

    1. WordBot says:

      Hi, Dunno what happened there but I fixed it.

  2. ilas says:

    Hello, everything works except the movement of the servos. Specifically, the signal to the servos stays constant at 50% in all conditions, even when the face is not centred. If I try to force the servo by hand I can feel that it tries to stay in the same position. What am I doing wrong?

    1. WordBot says:

      Hi. If you change this code
      ledcAnalogWrite(2, 90); // channel, 0-180
      to ledcAnalogWrite(2, 180); // channel, 0-180
      does the servo move to 180 position?

      You can also try some Serial.println(); statements to check the face movement is being detected.
      pan_center = (pan_center + move_to_x) / 2;
      for example

  3. ilas says:

    hello, thanks for the reply.
    Yes, if I change to ledcAnalogWrite (2, 180); // channel, 0-180, the servo moves at about 180 degrees
    If I include Serial.println (pan_center); immediately after the while, on the serial I always read “90” even, if I move my face at any point …
    I’ve just finished the box too …

    1. WordBot says:

      Hi, Does the CameraWebServer example work for you? File > Examples > ESP32 > Camera > CameraWebServer

  4. ilas says:

    Hi, yes,work correctly also CameraWebServer with face recognition and detection.
    I have CAMERA_MODEL_AI_THINKER module.
    The 5V is stable.
    The servo is good, It seems to always detect the face in the middle when I load your program because when it includes Serial.println (pan_center); I always read the value 90 on serial.
    I’m trying to figure out where I’m wrong ..

    1. WordBot says:

      Hi, I’ve found the problem. The draw_face_boxes() function isn’t being called because (I’m guessing) the face_detect() function has changed in the newer versions of the Espressif ESP32 Arduino hardware libraries. Face detection was broken in 1.0.2 so I used 1.0.1 for this script. It works if you choose 1.0.1 in the Boards Manager. I’ll update the script for 1.0.4 at some point.

  5. ilas says:

    Hi, I installed the 1.0.1 version but the problem persists.
    I’m thinking of everything, which browser do you use?
    The image via web I see only with Firefox while with the old Explorer and Explorer Edge does not pass streaming .. I have not tried with Chrome.
    I can confirm however that with the 1.0.4 version the “CameraWebServer” program works perfectly (my face appears with a yellow outline once it is hooked).
    During power-up, the servos each make a small movement (first one and then the other) and return to the central position.
    Thanks for your interest …

    1. WordBot says:

      Hi, I tried in IE11 and it doesn’t work. Probably doesn’t support WebSockets. Edge works for me. Try adding this code to see if you see anything in the serial monitor:

      pan_center = (pan_center + move_to_x) / 2;

  6. ilas says:

    Hi, check if the coffee has arrived 😉
    I wrote the Serial.println (pan_center); inside the loop and actually on the serial monitor I always read “90” (it’s a test I had already done before but with version 1.0.4 now have installed the 1.0.1).
    I would like to be able to solve this problem for Halloween as a treat for my children by putting a small paper ghost in front of the camera for face autotracking …
    I understand that it is hard work to compile for version 1.0.4 ..
    Do you have any other ideas?
    Thanks in advance.

    1. WordBot says:

      Thanks for the coffee! Try putting the println code in the draw_face_boxes() function to see if it’s being called by the loop.

  7. ilas says:

    Hi WordBot, yes, actually the draw_face_boxes () function is not called, or rather it is called but only twice at an interval of about 2 minutes, but then, it is no longer called.
    The video streaming on web it’s ever regular ( about 3fps ).

    1. WordBot says:

      I wonder if you need more light or the face is too close or far away. The draw_face_boxes() function is called when a face is detected.

  8. ilas says:

    it’s something that I thought too, I tried in all light and distance conditions, I don’t know if it’s the same detection method, but I noticed that with the “Camera Web server” example the yellow face detection panel works very well even in low light conditions, with a very close face (about 30 cm) and a face with a maximum distance of about 1.5 m work correctly.
    No fear, sometimes I’ll check on your site if you’ve updated this section.
    Thanks anyway..
