Siddharth Choudhary

Applied Scientist, Amazon Lab126

Email, Google Scholar, Github, Linkedin

I am an Applied Scientist at Amazon Lab126. Before this, I was a Principal Computer Vision Researcher at Magic Leap. I received my Ph.D. from the School of Interactive Computing at Georgia Institute of Technology, where I was advised by Professor Henrik I. Christensen and Professor Frank Dellaert. I completed my Bachelor's and Master's degrees in Computer Science at IIIT Hyderabad, where I was advised by Prof. P J Narayanan.

My primary research interests are in Robotics and Computer Vision and their intersection with Machine Learning and High Performance Computing. Recently, I've started exploring the potential of CV/ML technologies for healthcare with Amazon Halo.


  • 2020.09 - Joined Amazon Lab126 as an Applied Scientist.
  • 2020.06 - Giving a tutorial in CVPR 2020 on the perception technologies behind the MagicVerse (AR Cloud).
  • 2020.03 - Released the 3D Object Recognition feature, which runs in the MagicVerse (AR Cloud). Check out videos here and here.
  • 2019.06 - Gave a tutorial in CVPR 2019 on Perception at Magic Leap.
  • 2017.10 - Joined the computer vision team at Magic Leap.
  • 2017.08 - Gave a successful demo of a decentralized exploration and mapping system during the Micro Autonomous Systems and Technology (MAST) Collaborative Technology Alliance (CTA) Capstone event. The event was covered in The Economist, Phys, APG News, ARL News.

Selected Research Projects (check Google Scholar for the complete list)

Multiuser, Scalable 3D Object Detection in the AR Cloud

As the AR Cloud gains importance, one key challenge is large-scale, multi-user 3D object detection. Current approaches typically focus on single-room, single-user scenarios. In this work, we present an approach for multi-user and scalable 3D object detection, based on distributed data association and fusion. We use an off-the-shelf detector to detect object instances in 2D and then combine them in 3D, per object, while allowing asynchronous updates to the map. The distributed data association and fusion allows us to scale the detection to a large number of concurrent users, while maintaining a lower memory footprint without loss in accuracy. We show empirical results on the ScanNet dataset, where the distributed approach achieves accuracy comparable to the centralized approach while reducing memory consumption by a factor of 15.

This feature was released on ML-19 and it runs in the MagicVerse (AR Cloud). Check out some videos here and here. Below is the initial pre-print. The full publication is a work in progress.

Abstract (with initial results):

[1] S. Choudhary, N. Sekhar, S. Mahendran, P. Singhal. Multi-user, Scalable 3D Object Detection in AR Cloud, in CVPR Workshop on Computer Vision for Augmented and Virtual Reality, Seattle, WA, 2020 [pdf] [web] [bibtex]

Distributed Object based SLAM with known Object Models

We propose a multi robot SLAM approach that uses 3D objects as landmarks for localization and mapping. The approach is fully distributed in that the robots only communicate during rendezvous and there is no centralized server gathering the data. Moreover, it leverages local computation at each robot (e.g., object detection and object pose estimation) to reduce the communication burden. We show that object-based representations reduce the memory requirements and information exchange among robots, compared to point-cloud-based representations; this enables operation in severely bandwidth-constrained scenarios. We test the approach in simulations and field tests, demonstrating its advantages over related techniques: our approach is as accurate as a centralized method, scales well to large teams, and is resistant to noise.


[1] S. Choudhary, L. Carlone, C. Nieto, J. Rogers, Z. Liu, H. I. Christensen, F. Dellaert. Multi Robot Object-based SLAM, in ISER 2016. [pdf] [bibtex]

[2] S. Choudhary, L. Carlone, C. Nieto, J. Rogers, H. I. Christensen, F. Dellaert. Distributed Mapping with Privacy and Communication Constraints: Lightweight Algorithms and Object-based Models, in IJRR 2017. [arxiv] [code] [bibtex]

Distributed Object based SLAM with Joint Object Modeling and Mapping

We extend the previous work on Distributed Object based SLAM to the case where object models are previously unknown and are modeled jointly with Distributed Object-based SLAM. We show that this approach further reduces the memory required to store the object models while maintaining the accuracy at the same level as state-of-the-art RGB-D mapping approaches.


[1] S. Choudhary, A. Trevor, H. I. Christensen, F. Dellaert. SLAM with Object Discovery, Modeling and Mapping, in IROS 2014. [pdf] [www] [bibtex]

[2] S. Choudhary. Distributed Object based SLAM. Ph.D. Thesis (2017). [pdf] [ppt]

Distributed Pose Graph Optimization and Visual SLAM

We propose a distributed algorithm to estimate the 3D trajectories of multiple cooperative robots from relative pose measurements. Our approach leverages recent results which show that the maximum likelihood trajectory is well approximated by a sequence of two quadratic subproblems. The main contribution of the present work is to show that these subproblems can be solved in a distributed manner, using the distributed Gauss-Seidel (DGS) algorithm. Our approach has several advantages. It requires minimal information exchange, which is beneficial in the presence of communication and privacy constraints. It has an anytime flavor: after a few iterations the trajectory estimates are already accurate, and they asymptotically converge to the centralized estimate. The DGS approach scales well to large teams, and it has a straightforward implementation. We test the approach in simulations and field tests, demonstrating its advantages over related techniques.
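As a rough illustration of the Gauss-Seidel idea mentioned above, the sketch below solves a small linear system in which each unknown is imagined to belong to one "robot", and each robot updates only its own unknown using the latest values received from the others. This is a toy under stated assumptions, not the DGS implementation from the paper: the matrix `A`, the vector `b`, and the function name `gauss_seidel` are hypothetical, the unknowns are scalars rather than blocks of rotations or poses, and the sweep runs sequentially in one process.

```python
def gauss_seidel(A, b, iterations=50):
    """Approximately solve A x = b with Gauss-Seidel sweeps.

    Assumes A is square with a nonzero diagonal and is diagonally
    dominant (or symmetric positive definite) so the sweeps converge.
    """
    n = len(b)
    x = [0.0] * n  # initial guess
    for _ in range(iterations):
        for i in range(n):
            # "Robot" i updates only its own unknown, using the most
            # recent estimates of the unknowns owned by the others.
            coupling = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - coupling) / A[i][i]
    return x


# Hypothetical 3-unknown chain: each unknown is coupled only to its neighbors.
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]

x = gauss_seidel(A, b)
print(x)  # close to [1.0, 2.0, 3.0]
```

Because every update reads only the current values of coupled unknowns, the estimate can be used after any number of sweeps, which is one way to picture the anytime behavior described above.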


[1] S. Choudhary, L. Carlone, C. Nieto, J. Rogers, H. I. Christensen, F. Dellaert. Distributed Trajectory Estimation with Privacy and Communication Constraints: a Two-Stage Distributed Gauss-Seidel Approach, in ICRA 2016. [pdf] [ppt] [www] [bibtex]

[2] T. Cieslewski, S. Choudhary, D. Scaramuzza. Data-Efficient Decentralized Visual SLAM, in ICRA 2018. [pdf] [video] [code] [bibtex]

Improving the Efficiency of Structure from Motion

Large-scale reconstructions of camera matrices and point clouds have been created using structure from motion from community photo collections. Such a dataset is rich in information; we can interpret it as a sampling of the geometry and appearance of the underlying space. In this dissertation, we encode the visibility information between and among points and cameras as visibility probabilities. The conditional visibility probability of a set of points on a point (or a set of cameras on a camera) can be used to select points (or cameras) based on their dependence or independence. We use it to efficiently solve the problems of image localization and feature triangulation. We show how the conditional probability can be combined with other measures to prioritize a set of points (or cameras) for matching and use it for fast guided search of points for the image localization problem. We define the problem of feature triangulation as the estimation of the 3D coordinate of a given 2D feature using the SfM data. Our approach can guide the search to quickly identify a subset of cameras in which the feature is visible.
Other than image localization and feature triangulation, bundle adjustment is a key component of the reconstruction pipeline and often its slowest and most computationally intensive step. It hasn't been parallelized effectively so far. We also present a hybrid implementation of sparse bundle adjustment on the GPU using CUDA, with the CPU working in parallel. The algorithm is decomposed into smaller steps, each of which is scheduled on the GPU or the CPU. We develop efficient kernels for the steps and make use of existing libraries for several steps. Our implementation significantly outperforms the standard CPU implementation, achieving a speedup of 30-40 times for datasets with up to 500 images on an Nvidia Tesla C2050 GPU.
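The conditional visibility probability described above can be pictured with a small counting example. The sketch assumes one simple estimator: the fraction of cameras that see point j which also see point i. The `visibility` table, the camera names, and the function `conditional_visibility` are hypothetical illustrations, and the estimator used in the thesis and paper may differ in its details.

```python
def conditional_visibility(visibility, i, j):
    """Estimate P(point i is visible | point j is visible).

    `visibility` maps a camera name to the set of point ids it sees.
    The estimate is a simple ratio of camera counts (an assumption).
    """
    cameras_seeing_j = [pts for pts in visibility.values() if j in pts]
    if not cameras_seeing_j:
        return 0.0  # point j is never observed, so no evidence either way
    seeing_both = sum(1 for pts in cameras_seeing_j if i in pts)
    return seeing_both / len(cameras_seeing_j)


# Hypothetical SfM visibility data: four cameras, four 3D points.
visibility = {
    "cam0": {0, 1, 2},
    "cam1": {0, 1},
    "cam2": {1, 3},
    "cam3": {2, 3},
}

# Three cameras see point 1; two of them also see point 0.
p_0_given_1 = conditional_visibility(visibility, 0, 1)
# Two cameras see point 0; neither of them sees point 3.
p_3_given_0 = conditional_visibility(visibility, 3, 0)
print(p_0_given_1, p_3_given_0)
```

A high value suggests the two points tend to appear together, so a match to one makes the other a promising candidate to try next; a value near zero suggests the points cover different parts of the scene.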


[1] S. Choudhary and P. J. Narayanan. Visibility Probability Structure from SfM Datasets and Applications, in ECCV 2012. [pdf] [www] [bibtex]

[2] S. Choudhary, S. Gupta, P. J. Narayanan. Practical Time Bundle Adjustment for 3D Reconstruction on GPU, in ECCV 2010. [pdf] [ppt]

S. Choudhary. Improving the Efficiency of SfM and its Applications, MS Thesis 2012. [pdf][www]


Data-Efficient Decentralized Visual SLAM

This is the code for the 2018 ICRA paper Data-Efficient Decentralized Visual SLAM by Titus Cieslewski, Siddharth Choudhary and Davide Scaramuzza.

Distributed Pose Graph Optimization

This library is an implementation of the algorithm described in Distributed Trajectory Estimation with Privacy and Communication Constraints: a Two-Stage Distributed Gauss-Seidel Approach (ICRA 2016, IJRR 2017). The core library is written in C++.


This library is an implementation of the algorithm described in Exactly Sparse Memory Efficient SLAM using the Multi-Block Alternating Direction Method of Multipliers (IROS 2015). The core library is written in C++.

Planning to Calibrate

This library is an implementation of the algorithm described in Active Planning Based Extrinsic Calibration of Exteroceptive Sensors in Unknown Environments (IROS 2016). The core library is written in C++ and MATLAB.


A multi-modular mapping framework.


Tutorial on Bundle Adjustment

This is a tutorial on Bundle Adjustment which I made for a reading group.

SLAM - Literature Survey

This is a literature survey on Scalable SLAM that I worked on during grad school.

Minimal Map Representation for localization and navigation

This is a literature survey on existing techniques for mapping and navigation using topological semantic maps.