Augmented reality technology features increasingly within manufacturing’s AI arsenal. It’s important to be aware, says Wendy Mlynarek, of the key role performed by tracking.
Tracking is the software process that locates a given product in a real-time camera video feed. It is the basis of any application that offers an augmented reality experience. It involves locating the camera in relation to specific target objects; understanding and mapping the environment; and recognising and following target objects in real time as they move.
In addition, the accuracy of the 3D model in relation to reality, the texture and shape (presence of edges) and lighting conditions (visibility) are key points to take into account for more effective tracking.
Tracking relies upon detection and tracking algorithms. These are trained to recognise and follow the object’s distinctive characteristics, such as its shape, colour, contours and so on.
For tracking to work, these elements are required:
- The real object being manipulated
- The specific object to be tracked
- The tracking model (or 3D model) of the object to be tracked
In the case of manual tracking initialisation, the user manually defines the first position to start tracking. The application then stores certain reference images taken during the process. These reference images are a starting point for the tracking initialisation process.
When the user defines the first position, the application captures images that represent the scene from different angles and perspectives. These images are then used as a reference for subsequent tracking. They may contain objects or specific elements of the scene that the user wishes to track. For initialisation, the reference images are compared with real-time images captured by the device’s camera. Image matching techniques are used to find correspondences between the pixels of the reference images and those of the real-time images.
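As a rough illustration of how correspondences between a reference image and a live image can be found, the sketch below matches feature descriptors by nearest neighbour and keeps only unambiguous matches using a ratio test. This is one common approach, not necessarily the one any given AR product uses. The descriptors are assumed to be plain lists of numbers already extracted from each image, and the function names and the 0.75 threshold are illustrative choices.

```python
from typing import List, Sequence, Tuple

Descriptor = Sequence[float]


def squared_distance(a: Descriptor, b: Descriptor) -> float:
    """Squared Euclidean distance between two descriptors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_descriptors(
    reference: Sequence[Descriptor],
    live: Sequence[Descriptor],
    ratio: float = 0.75,  # illustrative threshold, tune per application
) -> List[Tuple[int, int]]:
    """Return (reference_index, live_index) pairs that pass the ratio test.

    A match is kept only when the best candidate is clearly closer than the
    second best, which filters out ambiguous correspondences.
    """
    matches = []
    for i, ref in enumerate(reference):
        ranked = sorted(
            (squared_distance(ref, cand), j) for j, cand in enumerate(live)
        )
        if len(ranked) < 2:
            continue  # the ratio test needs at least two candidates
        (best, best_j), (second, _) = ranked[0], ranked[1]
        # Distances are squared, so the ratio is squared as well.
        if best < (ratio ** 2) * second:
            matches.append((i, best_j))
    return matches
```

Each returned pair links a feature in the reference image to its counterpart in the live image. Features with two equally plausible counterparts are dropped rather than guessed.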
Using these correspondences, the application can estimate the transformations required to align the reference images with the real-time images. This determines the initial position and orientation of the objects to be tracked in relation to the camera.
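To make the alignment step concrete, here is a minimal sketch that estimates a 2D similarity transform (scale, rotation and translation) from matched point pairs by least squares. It is a simplification: it works in the image plane only, whereas a real AR application would estimate a full 3D position and orientation. The function name and the representation of points as (x, y) tuples are assumptions made for the example.

```python
import cmath
from typing import Sequence, Tuple

Point = Tuple[float, float]


def estimate_similarity(
    reference: Sequence[Point], live: Sequence[Point]
) -> Tuple[float, float, Point]:
    """Fit live ~ scale * rotate(reference, angle) + translation.

    Points are treated as complex numbers, so the transform is q = a * p + b
    and the least-squares solution for a and b has a closed form.
    Returns (scale, angle_in_radians, (tx, ty)).
    """
    if len(reference) != len(live) or len(reference) < 2:
        raise ValueError("need at least two matched point pairs")
    p = [complex(x, y) for x, y in reference]
    q = [complex(x, y) for x, y in live]
    p_mean = sum(p) / len(p)
    q_mean = sum(q) / len(q)
    denominator = sum(abs(pi - p_mean) ** 2 for pi in p)
    if denominator == 0:
        raise ValueError("reference points must not all coincide")
    numerator = sum(
        (qi - q_mean) * (pi - p_mean).conjugate() for pi, qi in zip(p, q)
    )
    a = numerator / denominator   # encodes scale and rotation
    b = q_mean - a * p_mean       # translation
    return abs(a), cmath.phase(a), (b.real, b.imag)
```

Given the matched pairs, the returned scale, angle and translation describe how the reference view must be moved to line up with the live view, which is the kind of information an initial pose is built from.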
Once initialised, the tracking process begins. The application uses the initial position and orientation to follow the objects in the scene in real time, updating both throughout the video stream.
This method depends on the accuracy of the initial position set by the user. If the position is incorrect or imprecise, this can lead to tracking errors later on. However, this approach can be useful in cases where the user wishes to track specific objects and is able to provide a reasonable initial estimate of their position.
Tracking with AR
Deep learning-based tracking initialisation uses a trained model for this task. The model learns to recognise relevant features and patterns in reference images in order to provide a more robust and accurate tracking initialisation.
The process based on deep learning typically involves several steps:
Data collection: It is necessary to gather a training dataset that includes reference images taken during tracking. These images must cover a variety of scenes, illuminations and environmental conditions in order to obtain a model capable of generalising efficiently.
Model training: Once the training data has been collected, it is used to train a deep learning model to recognise features relevant to initialisation, such as objects of interest, distinctive patterns or landmarks in the image.
Validation and adjustment: Once the model has been trained, it is evaluated on a separate validation dataset to measure its performance and effectiveness. If necessary, adjustments can be made to improve model performance, such as increasing the training data, adding regularisation or optimising parameters.
Using the model: Once the model has been sufficiently trained and validated, it can be employed to initialise tracking in a real application. When the user starts tracking, the application uses the model to analyse the first images and estimate the initial position of the object or target to be tracked. This initial estimate is then used as the starting point for the continuous tracking process.
By using deep learning for tracking initialisation, we can therefore obtain a method that is more robust and adaptable to changes in illumination, scene background and other visual variations.
However, it should be noted that the effectiveness of the model depends on the quality and diversity of the training data, as well as the performance of the learning algorithm used.
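The validation step described above can be sketched as follows. This is a minimal illustration, not a deep learning implementation: the "model" is any function that maps image features to an estimated (x, y) position, and the names `split_dataset` and `mean_position_error`, the 20% hold-out fraction and the sample layout are all assumptions made for the example.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Position = Tuple[float, float]
# One sample: (image features, true position of the object in that image)
Sample = Tuple[Sequence[float], Position]


def split_dataset(
    samples: Sequence[Sample], validation_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Sample], List[Sample]]:
    """Shuffle the samples and hold out a fraction for validation."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


def mean_position_error(
    predict: Callable[[Sequence[float]], Position],
    validation: Sequence[Sample],
) -> float:
    """Average distance between predicted and true positions."""
    if not validation:
        raise ValueError("validation set is empty")
    total = 0.0
    for features, (true_x, true_y) in validation:
        predicted_x, predicted_y = predict(features)
        total += math.hypot(predicted_x - true_x, predicted_y - true_y)
    return total / len(validation)
```

Keeping the validation samples out of training is what makes the error figure meaningful: it measures how the model behaves on scenes it has not seen, which is the situation it will face when initialising tracking in production.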
Tracking vs tracking model
Used in many fields, such as computer vision, robotics and augmented reality, a tracking model is a model or algorithm used as part of the tracking process described above, more specifically in the tracking and localisation of objects in image or video sequences.
It is designed to estimate and predict the position, movement and characteristics of an object of interest over time. It can be based on machine learning techniques such as supervised learning, unsupervised learning or reinforcement learning.
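To illustrate what "estimate and predict the position and movement over time" can mean in practice, the sketch below uses an alpha-beta filter, a simple classical technique rather than a machine learning model. It tracks a single coordinate; a real tracker would handle a full 3D position and orientation. The class name and the default gain values are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class AlphaBetaTracker:
    """Tracks one coordinate of an object with an alpha-beta filter."""

    position: float
    velocity: float = 0.0
    alpha: float = 0.5  # how strongly a measurement corrects the position
    beta: float = 0.1   # how strongly a measurement corrects the velocity

    def predict(self, dt: float) -> float:
        """Predict where the object will be after dt seconds."""
        return self.position + self.velocity * dt

    def update(self, measurement: float, dt: float) -> float:
        """Blend the prediction with a new measurement taken dt seconds later."""
        predicted = self.predict(dt)
        residual = measurement - predicted
        self.position = predicted + self.alpha * residual
        self.velocity += (self.beta / dt) * residual
        return self.position
```

Calling `update` once per video frame keeps a smoothed position and a velocity estimate, while `predict` gives an expected position for the next frame, which can help narrow the search region when the object moves.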
The aim is to provide precise information on the position and movement of objects, enabling tracking, detection, recognition and analysis.
Tracking and the tracking model are not the same thing, but they are interdependent: one cannot function without the other.
For model tracking to be possible, several conditions must be met:
- 3D CAD models, containing one or more parts, must be available
- The models must be accurate in relation to reality
- The associated target object must remain visible to the camera during operations
Tracking is used in the DELMIA Augmented Experience solution to identify the equipment to be assembled or inspected thanks to its 3D model, to locate several elements simultaneously, and to display the digital information required for the industrial process in the right place, at the right time and at the right scale.
- Wendy Mlynarek is DELMIA business development and marketing director