
Clarkson Computer Science PhD Student Wins Best Paper Award at the 19th International Conference on Signal Processing and Multimedia Applications for Research on Using Video Motion Vectors for Structure from Motion 3D Reconstruction


Research by Clarkson Computer Science PhD student Richard Turner on the use of video motion vectors for structure from motion 3D reconstruction was awarded Best Paper at the 19th International Conference on Signal Processing and Multimedia Applications (SIGMAP 2022), held in Lisbon, Portugal. Richard is advised by Dr. Sean Banerjee and Dr. Natasha Banerjee, Associate Professors of Computer Science. SIGMAP brings together international scholars working on theory and practice related to the representation, storage, authentication, and communication of multimedia information from images, video, and audio data, as well as emerging sources of multimodal data such as text, social media, and healthcare.

Richard's work plays a critical role in reducing the computational resources required for 3D scene reconstruction using structure from motion (SfM). 3D scene reconstruction uses computer vision techniques to create three-dimensional models from a series of two-dimensional images, and it is an important component in giving robots and autonomous vehicles the ability to navigate their environments. SfM is computationally intensive and is usually performed offline on powerful computing hardware. There is growing interest in the research community in performing SfM on the low-power devices commonly found in unmanned aerial and ground vehicles as well as autonomous vehicles.

Richard's method reuses the motion vectors that video encoders already compute for compression. H.264 video compression has become a widespread choice for devices that require live video streaming, including mobile phones, laptops, and Micro Aerial Vehicles (MAVs). H.264 uses motion estimation to predict the displacement of pixels, grouped as macroblocks, between two or more video frames. This works well for live video because much of the content of each frame also appears in previous and future frames. By estimating a motion vector for each macroblock in each frame, the encoder achieves significant compression. Richard's method provides a near real-time feature detection and matching algorithm for SfM reconstruction using the motion estimation properties of H.264 video encoders. The method was validated using video taken from a MAV flying within an urban environment.
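To make the idea of macroblock motion vectors concrete, here is a minimal Python sketch of exhaustive block matching between two small grayscale frames. It is an illustration only, not Richard's algorithm and not how an H.264 encoder is implemented: real encoders use variable block sizes, sub-pixel search, and rate-distortion decisions, and a practical pipeline would presumably read the vectors from the compressed stream rather than recompute them. The function names (`sad`, `estimate_motion_vectors`, `vectors_to_matches`), the 4x4 block size, and the search range are all assumptions chosen for the example.

```python
def sad(prev, curr, py, px, cy, cx, size):
    """Sum of absolute differences between a block in prev and a block in curr."""
    return sum(
        abs(curr[cy + r][cx + c] - prev[py + r][px + c])
        for r in range(size)
        for c in range(size)
    )


def estimate_motion_vectors(prev, curr, block=4, search=2):
    """For each block of curr, find the best-matching block in prev.

    Returns {(y, x): (dy, dx)}, where (y, x) is the block's top-left corner in
    curr and (y + dy, x + dx) is the top-left corner of its best match in prev.
    """
    h, w = len(curr), len(curr[0])
    vectors = {}
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            # Start from zero motion so ties favor a stationary block.
            best_cost, best_dy, best_dx = sad(prev, curr, y, x, y, x, block), 0, 0
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    py, px = y + dy, x + dx
                    # Skip candidates that fall outside the previous frame.
                    if not (0 <= py <= h - block and 0 <= px <= w - block):
                        continue
                    cost = sad(prev, curr, py, px, y, x, block)
                    if cost < best_cost:
                        best_cost, best_dy, best_dx = cost, dy, dx
            vectors[(y, x)] = (best_dy, best_dx)
    return vectors


def vectors_to_matches(vectors, block=4):
    """Convert motion vectors into (prev_point, curr_point) block-center pairs.

    Points are (row, col). Pairs like these play the role that feature
    matches play in a conventional SfM front end.
    """
    half = block // 2
    return [
        ((y + dy + half, x + dx + half), (y + half, x + half))
        for (y, x), (dy, dx) in vectors.items()
    ]


if __name__ == "__main__":
    # Synthetic 8x8 frame; the next frame is the same content shifted right by 1 pixel.
    prev = [[r * 8 + c for c in range(8)] for r in range(8)]
    curr = [[row[max(c - 1, 0)] for c in range(8)] for row in prev]
    vectors = estimate_motion_vectors(prev, curr)
    for block_pos, vec in sorted(vectors.items()):
        print(block_pos, "->", vec)
    print(vectors_to_matches(vectors))
```

Each vector gives a correspondence between a location in the previous frame and a location in the current frame at no extra cost beyond what the encoder already did for compression. That is the kind of by-product a motion-vector-based SfM front end can reuse in place of a separate feature detection and matching stage.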

As a Senior Principal Software Engineer at Northrop Grumman, Richard works on next-generation space challenges. His research deals with real-time 3D mapping of environments for autonomous navigation. Richard's work has a wider impact on the development of technologies for performing landscape reconstruction on the low-resource devices found on unmanned aerial and ground vehicles, for use in challenging scenarios such as disasters or mass casualty events. Richard is currently leading an interdisciplinary team of graduate and undergraduate students in deploying his algorithm to unmanned aerial vehicles designed by the team.

Richard is a member of the Terascale All-sensing Research Studio (TARS) at Clarkson University. TARS supports the research of 15 graduate students and nearly 20 undergraduate students each semester. TARS has one of the largest high-performance computing facilities at Clarkson, with 275,000+ CUDA cores and 4,800+ Tensor cores spread across 50+ GPUs, and 1 petabyte of (almost full!) storage. TARS houses two imaging facilities. The Gazebo is a massively dense multi-viewpoint, multi-modal markerless motion capture facility for imaging multi-person interactions, with 192 high-speed (226 FPS) cameras, 16 Microsoft Azure Kinect RGB-D sensors, 12 Sierra Olympic Viento-G thermal cameras, and 16 surface electromyography (sEMG) sensors. The Cube is a single- and two-person 3D imaging facility with 4 high-speed cameras, 4 RGB-D sensors, and 5 thermal cameras. TARS conducts research on the use of deep learning to gain insight into natural multi-person interactions from large datasets, in order to enable next-generation technologies, such as intelligent agents and robots, that will integrate seamlessly into future human environments.
