Artificial Intelligence

Design and realisation

Within the project, it is decided to use a modified Raspberry Pi camera that is capable of auto-focus. This feature is handy for making sharp images when the distance to the object is relatively small (e.g. 10 cm). As mentioned in the introduction, the delta robot is based on the design of Mr. Chasteau and adapted to mount the Jetson and the camera on the end effector. The target for the project is to write a central application that can control the auto-focus camera (AF CAM) and the delta robot. Further, the application must allow for the acquisition of images, the training of an artificial intelligence network model, testing of the model and, subsequently, its use for image (read: object) classification. The results can be output to file or any other format that is required. The schematic below shows a simplified block diagram of the system.

Block diagram of the delta robot system

After running through several NVIDIA tutorials on artificial intelligence (AI) for the Jetson, the delta robot is built and the base of the application is written. At this point in the project, a simple GUI written in Python has been realised, in which the camera images are shown and the robot can be controlled by entering Cartesian coordinates and transmitting these to the controller of the servo motors that move the robot's legs. In the robot controller, inverse kinematic functions are implemented to calculate the required angles of the servos. Details on this part of the project are not discussed on this page but can be explained further if required (just contact Mc'bedded).
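The exact controller code is not published here, but the calculation follows the standard inverse kinematics for a rotary delta robot. Below is a minimal Python sketch of that algorithm; the dimensions f, e, rf and re are hypothetical placeholders and must be replaced with the measured geometry of the actual robot.

```python
import math

# Hypothetical robot geometry (mm); replace with the real measured dimensions
f = 69.0    # side length of the fixed base triangle
e = 26.0    # side length of the end-effector triangle
rf = 88.0   # upper arm length
re = 128.0  # lower (parallelogram) arm length

def _angle_yz(x0, y0, z0):
    """Solve the servo angle for one arm, in that arm's own YZ plane."""
    y1 = -0.5 * f / math.sqrt(3)       # position of the shoulder joint
    y0 -= 0.5 * e / math.sqrt(3)       # shift centre to the effector edge
    # The elbow lies on the line z = a + b*y intersected with a circle of radius rf.
    a = (x0 ** 2 + y0 ** 2 + z0 ** 2 + rf ** 2 - re ** 2 - y1 ** 2) / (2 * z0)
    b = (y1 - y0) / z0
    d = -(a + b * y1) ** 2 + rf * (b ** 2 * rf + rf)   # discriminant
    if d < 0:
        raise ValueError("position not reachable")
    yj = (y1 - a * b - math.sqrt(d)) / (b ** 2 + 1)    # choose the outer elbow
    zj = a + b * yj
    return math.degrees(math.atan2(-zj, y1 - yj))

def inverse_kinematics(x, y, z):
    """Return the three servo angles (degrees) for effector position (x, y, z),
    with z negative below the base plate."""
    cos120, sin120 = -0.5, math.sqrt(3) / 2
    t1 = _angle_yz(x, y, z)
    t2 = _angle_yz(x * cos120 + y * sin120, y * cos120 - x * sin120, z)  # arm at +120 deg
    t3 = _angle_yz(x * cos120 - y * sin120, y * cos120 + x * sin120, z)  # arm at -120 deg
    return t1, t2, t3
```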

Some images of the delta robot and auto-focus camera after assembly: the delta robot, the delta robot and monitor, and the bottom of the delta robot with the ring LEDs off and on.

The white parts in the images are 3D printed at Mc'bedded. The three servo motors are attached to the robot's arms, and the end effector at the end of the arms is modified so that the Jetson Nano can be mounted on it. At the bottom of the end effector, a modified Raspberry Pi camera capable of auto-focussing is attached. A ring of LED lighting is taped around the camera to provide better illumination of the object. The object will be placed on the body base in a later phase of the project; at this moment a piece of paper with a black printed circle is placed on the base. The circle has a diameter of 5 mm and is used for calibration. The movie below is a desktop recording. It presents the base application in which the camera images are displayed. A sequence is run where the end effector of the robot is moved to different heights, after which a calibration routine is executed. The Cartesian coordinates and the matching calibration factor at these coordinates are displayed in the text box at the right of the application.
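How the calibration routine works is not spelled out above; the sketch below shows one plausible implementation, assuming OpenCV is available and the printed 5 mm circle is the dominant circular feature in the image. It derives a calibration factor in mm per pixel from the detected circle radius.

```python
import cv2

CIRCLE_DIAMETER_MM = 5.0   # diameter of the printed calibration circle

def calibration_factor(image_bgr):
    """Return the calibration factor (mm per pixel) at the current camera
    height, based on the printed 5 mm circle visible in the image."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.medianBlur(gray, 5)                      # suppress print noise
    circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1, minDist=100,
                               param1=100, param2=30,
                               minRadius=10, maxRadius=300)
    if circles is None:
        raise RuntimeError("calibration circle not found")
    _, _, radius = circles[0][0]                        # strongest detection
    return CIRCLE_DIAMETER_MM / (2.0 * radius)          # mm per pixel
```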

The basic movements and calibration are introduced for later use in the project, when it may be needed to position the object in the centre of the field of view (FOV) of the camera. For the next step, the text output field has been disabled on the GUI form.
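As an illustration of that later centring step, the small sketch below converts the pixel offset of an object from the image centre into an XY correction for the end effector, using the calibration factor from above. The sign convention is an assumption: it presumes the camera axes are aligned with the robot axes.

```python
def centring_move(object_px, image_shape, mm_per_px):
    """Convert an object's pixel position into an XY correction (mm) that
    moves it towards the centre of the field of view."""
    height, width = image_shape[:2]
    dx_px = object_px[0] - width / 2.0     # horizontal offset from centre
    dy_px = object_px[1] - height / 2.0    # vertical offset from centre
    return dx_px * mm_per_px, dy_px * mm_per_px
```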

The GUI has been given additional functionality to allow for the training of a neural network. It is decided to begin very simply: train a network that can distinguish between an allen key and a coin. Note that the type of vision application in this case is classification (not regression). Training a network requires multiple images of the object(s) to be recognised. To make the network more robust, the position of the camera has been varied during the image acquisition phase, as well as the colour of the ring lighting (white, yellow and red have been applied), and the allen key and coin have been moved around on the base of the delta robot. As the base for the first network, the ResNet-18 model has been used for transfer learning on the allen key and coin images. In total, approximately 250 images of the allen key and the coin have been acquired. The number of epochs (training cycles) has been set to 100. The total time for training the network was about 4 hours on the Jetson Nano. The example images in the table below show the input and output images of the allen key and the coin. The percentage preceding the class (allen key or coin in this case) indicates the confidence level of the neural network, i.e. the model is 99.640% sure that the image contains an allen key and 92.588% certain that the other image contains a coin.
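The training itself was done along the lines of the NVIDIA Jetson tutorials; the sketch below shows an equivalent transfer-learning loop in plain PyTorch. The folder layout (data/train/allen_key, data/train/coin), batch size and learning rate are assumptions, not the exact values used in the project.

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Assumed folder layout: data/train/allen_key/*.jpg, data/train/coin/*.jpg
tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
train_set = datasets.ImageFolder("data/train", transform=tf)
loader = torch.utils.data.DataLoader(train_set, batch_size=8, shuffle=True)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = models.resnet18(pretrained=True)          # start from ImageNet weights
model.fc = nn.Linear(model.fc.in_features, 2)     # two classes: allen key, coin
model = model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

model.train()
for epoch in range(100):                          # 100 epochs, as in the text
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

torch.save(model.state_dict(), "allenkey_vs_coin.pth")
```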

Single image acquisition and classification for the first test neural network, distinguishing between an allen key and a 1 euro coin: input image of the allen key, classified output image of the allen key with the confidence level, input image of a 1 euro coin, and classified output image of the 1 euro coin with the confidence level.

It is found that the confidence level of the model decreases when the "size" of the object in the image decreases (i.e. the ratio of object pixels to total image pixels decreases). The desktop video below shows this: both the allen key and the coin are placed on the delta robot base and, after pressing the "Classify" button, an image is acquired and put through the neural network model. The height of the camera is then adjusted and the classification exercise is repeated. The result is output to the desktop screen in an image. It can be seen in the movie that the confidence levels of both the allen key and the coin are well above 90% when the camera is at a height of 150 mm. When the camera is raised to 180 mm and the allen key and coin appear smaller, the confidence level of the coin drops to 86%. The allen key is still recognised with a confidence level of 99%.
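The confidence levels shown are, in the usual setup, simply the softmax probabilities of the network output. A minimal inference sketch, assuming the model file saved in the training sketch above:

```python
import torch
import torch.nn.functional as F
from torchvision import models, transforms
from PIL import Image

CLASSES = ["allen key", "coin"]   # order follows the ImageFolder class order

model = models.resnet18()
model.fc = torch.nn.Linear(model.fc.in_features, 2)
model.load_state_dict(torch.load("allenkey_vs_coin.pth", map_location="cpu"))
model.eval()

tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

def classify(path):
    """Return (class name, confidence in %) for a single image file."""
    x = tf(Image.open(path).convert("RGB")).unsqueeze(0)  # add batch dimension
    with torch.no_grad():
        probs = F.softmax(model(x), dim=1)[0]             # logits -> probabilities
    confidence, index = probs.max(0)
    return CLASSES[index], 100.0 * confidence.item()
```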

Compared to the previous desktop demonstration video of the delta robot presented above, several buttons have been added to the GUI form. The newly added buttons and drop-down lists are used to acquire images from the camera and store them in the folder structure that is required for training a neural network. Details and additional information on how to acquire images and train a custom neural network can be explained on request (just contact Mc'bedded).
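The exact folder structure used by the application is not shown here; the sketch below stores captured frames in the train/val/test layout that ImageFolder-style training expects, with hypothetical names for the root folder and labels.

```python
import os
import time
import cv2

DATA_ROOT = "data"   # hypothetical root, matching the training sketch above

def save_training_image(frame_bgr, split, label):
    """Store a captured camera frame as e.g. data/train/allen_key/1681234567.jpg,
    creating the class folder if it does not yet exist."""
    folder = os.path.join(DATA_ROOT, split, label)       # split: train/val/test
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, "%d.jpg" % int(time.time() * 1000))
    cv2.imwrite(path, frame_bgr)
    return path
```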

Tests have been executed in which more images of the coin were added to the training set, but this did not prevent the confidence level from dropping. However, since this is just a first test, it is decided not to put more time into this at the moment and to move on from classification to a regression-type application. For this, a new network will be trained to recognise a die (including the number facing upward) and locate it in a live camera stream (on the delta robot).

This project is currently being executed and is "work-in-progress". When presentable results are available, the progress will be reported on this page.

