IIT Madras and a US university develop AI-powered algorithms to enhance 3D effects in phone videos

Researchers from the Indian Institute of Technology (IIT), Madras and US-based Northwestern University have developed deep learning algorithms that can significantly improve depth perception and 3D effects in videos shot using smartphone cameras.

According to officials, such algorithms will keep smartphone footage from looking “flat” and give it a realistic 3D feel. A crucial advantage of the developed algorithm is that it eliminates the need for sophisticated equipment or an array of lenses to capture video with depth.

“It’s a common complaint, especially among amateur and professional photographers, that photographs and videos taken using smartphone cameras look flat and two-dimensional. Apart from the flat look, some 3D features such as the ‘bokeh effect’ (the aesthetic blurring of the background), which are easy to achieve with a DSLR camera, are difficult to achieve with smartphone cameras,” Kaushik Mitra, Assistant Professor, Department of Electrical Engineering, IIT Madras, told PTI.

“While a few mid-to-high-end smartphone cameras are now programmed to incorporate such effects into still photographs, especially in portrait mode, it is not yet possible to render them in videos captured using smartphones,” he added.

Mitra explained that advanced professional cameras capture information about the intensity and direction of light in a scene, known as the light field (LF), to give depth perception.

“LF capture is achieved through the use of a set of micro lenses inserted between the main camera lens and the camera sensor. Multiple micro lenses cannot be fitted into mobile phones due to space constraints. Instead, algorithms capable of post-processing the images captured by existing mobile cameras are being developed.

“Artificial intelligence and machine learning techniques are used for such image manipulation. Our team looked into this problem and built a deep learning algorithm that converts stereo images captured using a smartphone into LF images,” he said.

The research has been published in the “Proceedings of the International Conference on Computer Vision (ICCV), 2021”.

“The algorithm first captures two videos (called a stereo pair) simultaneously using the two adjacent cameras that are present in many smartphones these days. These stereo pairs go through a sequence of steps involving deep learning models. The stereo pairs are converted into a 7×7 image grid, mimicking a 7×7 camera array, thus producing the LF image,” Mitra explained.
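To make the idea of a 7×7 image grid concrete, the sketch below builds such a grid from a single image and a per-pixel disparity map. It is only an illustration of the data layout: the researchers' method uses deep learning models, whereas this code uses a naive pixel shift, and the names `shift_view` and `build_light_field` are hypothetical.

```python
GRID = 7  # the article describes a 7x7 grid of views


def shift_view(image, disparity, du, dv):
    """Naively warp `image` to the view at grid offset (du, dv).

    Each pixel moves by its disparity times the view offset. Pixels that
    land outside the frame are dropped, and unfilled holes stay at 0.
    This is a toy stand-in, not the deep learning model from the research.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            nx = x + round(disparity[y][x] * du)
            ny = y + round(disparity[y][x] * dv)
            if 0 <= nx < w and 0 <= ny < h:
                out[ny][nx] = image[y][x]
    return out


def build_light_field(image, disparity, grid=GRID):
    """Return a grid x grid nested list of views, indexed as lf[v][u]."""
    c = grid // 2  # index of the central view
    return [
        [shift_view(image, disparity, u - c, v - c) for u in range(grid)]
        for v in range(grid)
    ]


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    disparity = [[1] * 3 for _ in range(3)]  # assumed: one pixel per view step
    lf = build_light_field(image, disparity)
    print(len(lf), len(lf[0]))  # grid dimensions
    print(lf[3][4])             # the view one step to the right of centre
```

Nearby objects have larger disparity and so move more between neighbouring views, which is what gives the grid its sense of depth.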

“A crucial advantage of the algorithm developed by our team is that it eliminates the need for sophisticated equipment or an array of lenses to capture video with depth. Bokeh and other aesthetic 3D effects can be achieved with a smartphone equipped with a dual camera system.

“In addition to providing depth, our algorithm allows us to view the same video not from just one viewpoint, but from any of the viewpoints in the 7×7 grid,” he said.
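One standard way to obtain a bokeh-like blur from a grid of views is shift-and-add refocusing, sketched below. The article does not say whether the researchers render their effects this way, so treat this as a generic light-field technique; the function name `refocus` and the `slope` parameter are assumptions made for illustration.

```python
def refocus(light_field, slope):
    """Shift-and-add refocusing over a square grid of views.

    light_field[v][u] is a 2D list of pixel values. Each view is sampled
    at an offset proportional to its distance from the central view and
    the samples are averaged. Objects whose disparity matches `slope`
    line up and stay sharp; everything else is blurred.
    """
    grid = len(light_field)
    c = grid // 2  # index of the central view
    h = len(light_field[0][0])
    w = len(light_field[0][0][0])
    total = [[0.0] * w for _ in range(h)]
    count = [[0] * w for _ in range(h)]
    for v in range(grid):
        for u in range(grid):
            dx = round(slope * (u - c))
            dy = round(slope * (v - c))
            view = light_field[v][u]
            for y in range(h):
                for x in range(w):
                    sx, sy = x + dx, y + dy
                    if 0 <= sx < w and 0 <= sy < h:
                        total[y][x] += view[sy][sx]
                        count[y][x] += 1
    # The central view always contributes, so count is never zero.
    return [[total[y][x] / count[y][x] for x in range(w)] for y in range(h)]


if __name__ == "__main__":
    # Toy 7x7 light field of 2x2 views; every pixel of view (v, u) equals u.
    lf = [[[[u, u], [u, u]] for u in range(7)] for v in range(7)]
    print(refocus(lf, 0))  # with slope 0 this is the plain average of all views
```

Selecting a single viewpoint, as Mitra describes, is simply indexing one entry of the grid, for example `light_field[0][6]` for a corner view.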


Sharon D. Cole