Siddharth Choudhary
I am a Senior Applied Scientist at Amazon AGI, where I collaborate closely with AWS AI Labs. Before this, I was an Applied Scientist at Amazon Halo, where I worked on problems at the intersection of Computer Vision and Health.
Prior to Amazon, I was a Principal Computer Vision Researcher at Magic Leap. I received my Ph.D. from the School of Interactive Computing at Georgia Institute of Technology, where I was advised by Professor Henrik I. Christensen and Professor Frank Dellaert. I completed my Bachelor's and Master's in Computer Science at IIIT Hyderabad, where I was advised by Prof. P J Narayanan.
Email  / 
CV  / 
Google Scholar  / 
Twitter  / 
LinkedIn  / 
Github
|
|
Research
I'm interested in multimodal LLMs and their applications in robotics.
|
Vision-Language Models
|
|
Multi-modal hallucination control by visual information grounding
Alessandro Favero,
Luca Zancato,
Matthew Trager,
Siddharth Choudhary,
Pramuditha Perera,
Alessandro Achille,
Ashwin Swaminathan,
Stefano Soatto
CVPR, 2024  
arXiv / web
We address hallucination in Generative Vision-Language Models by introducing Multi-Modal Mutual-Information Decoding (M3ID), which amplifies the influence of reference images, reducing hallucinated responses by up to 28% without compromising linguistic fluency.
|
3D Human Reconstruction and Health CVML
|
|
Development and Validation of an Accurate Smartphone Application for Measuring Waist-to-Hip Circumference Ratio
Siddharth Choudhary,
Ganesh Iyer,
Brandon M. Smith,
Jinjin Li,
Mark Sippel,
Antonio Criminisi,
Steven B Heymsfield
Nature Digital Medicine, 2023 (Shipped with Amazon Halo Body)
pdf / web / citation
Proposes a CNN-based model called MeasureNet for accurately and reliably predicting body measurements and waist-to-hip ratio. The model is trained on a realistic synthetic dataset.
|
|
SplatArmor: Articulated Gaussian splatting for animatable humans from monocular RGB videos
Rohit Jena,
Ganesh Iyer,
Siddharth Choudhary,
Brandon M. Smith,
Pratik Chaudhari,
James C. Gee
arXiv, 2023  
arXiv / web
A fully articulated Gaussian splatting model for human avatars. Our model includes both rigid and non-rigid skinning components, and a Neural Color Field for implicit color regularization.
|
|
Mesh Strikes Back: Fast and Efficient Human Reconstruction from RGB videos
Rohit Jena,
Pratik Chaudhari,
James C. Gee,
Ganesh Iyer,
Siddharth Choudhary,
Brandon M. Smith
arXiv, 2023  
arXiv
Optimizing a SMPL+D mesh and an efficient, multi-resolution texture representation for novel view synthesis and pose synthesis.
|
Augmented Reality
|
|
Multiuser, Scalable 3D Object Detection in the AR Cloud
Siddharth Choudhary,
Nitesh Sekhar,
Siddharth Mahendran,
Prateek Singhal
CVPR Workshop on AR/VR, 2020   (Shipped on Magic Leap One)
pdf / project page / bibtex
An approach for multiuser and scalable 3D object detection, based on distributed data association and fusion.
|
Distributed Object SLAM
|
|
Data-Efficient Decentralized Visual SLAM
Titus Cieslewski,
Siddharth Choudhary,
Davide Scaramuzza
ICRA, 2018  
pdf / video / code / bibtex
Decentralized visual SLAM combining distributed PGO using Gauss-Seidel and efficient, distributed place recognition using NetVLAD features.
|
|
Distributed Object based SLAM
Siddharth Choudhary
PhD Thesis, 2017  
pdf / ppt
Decentralized object-based SLAM combining distributed PGO with joint modeling and mapping of object landmarks.
|
|
Distributed Mapping with Privacy and Communication Constraints: Lightweight Algorithms and Object-based Models
Siddharth Choudhary,
Luca Carlone,
Carlos Nieto,
John Rogers,
Henrik I. Christensen,
Frank Dellaert
IJRR, 2017  
arxiv / code / video / bibtex
Proposes a distributed implementation of two-stage pose graph optimization, using Successive Over-Relaxation (SOR) and Jacobi Over-Relaxation (JOR) as workhorses to split the computation among the robots, and extends it to work with object-based map models.
|
|
Multi Robot Object-based SLAM
Siddharth Choudhary,
Luca Carlone,
Carlos Nieto,
John Rogers,
Zhen Liu,
Henrik I. Christensen,
Frank Dellaert
ISER, 2016  
pdf / bibtex
Multi-robot SLAM approach that uses 3D objects as landmarks for localization and mapping. Leverages local computation at each robot (e.g., object detection and object pose estimation) to reduce the communication burden.
|
|
Distributed Trajectory Estimation with Privacy and Communication Constraints: a Two-Stage Distributed Gauss-Seidel Approach
Siddharth Choudhary,
Luca Carlone,
Carlos Nieto,
John Rogers,
Henrik I. Christensen,
Frank Dellaert
ICRA, 2016  
pdf / ppt / www / video / bibtex
Leverages recent results which show that the maximum likelihood trajectory is well approximated by a sequence of two quadratic subproblems and solves it in a distributed manner, using the distributed Gauss-Seidel (DGS) algorithm.
|
|
Active planning based extrinsic calibration of exteroceptive sensors in unknown environments
Varun Murali,
Carlos Nieto,
Siddharth Choudhary,
Henrik I. Christensen
IROS, 2016  
pdf / code
Plans a trajectory which actively reduces the uncertainty of the robot's calibration given a rough initial calibration estimate.
|
|
Exactly Sparse Memory Efficient SLAM using the Multi-Block Alternating Direction Method of Multipliers
Siddharth Choudhary,
Luca Carlone,
Henrik I. Christensen,
Frank Dellaert
IROS, 2015  
pdf / code / bibtex
Scalable SLAM using multiblock Alternating Direction Method of Multipliers (ADMM).
|
|
Information-based Reduced Landmark SLAM
Siddharth Choudhary,
Vadim Indelman,
Henrik I. Christensen,
Frank Dellaert
ICRA, 2015  
pdf / bibtex
Information-theoretic algorithm to efficiently reduce the number of landmarks and poses in a SLAM estimate without compromising the accuracy of the estimated trajectory.
|
|
SLAM with Object Discovery, Modeling and Mapping
Siddharth Choudhary,
Alexander J. B. Trevor,
Henrik I. Christensen,
Frank Dellaert
IROS, 2014  
pdf / bibtex
Proposes an approach for online object discovery and object modeling, and extends a SLAM system to use these discovered and modeled objects as landmarks to help localize the robot in an online manner.
|
Structure from Motion / GPU Computing
|
|
CPU and/or GPU: Revisiting the GPU Vs. CPU Myth
Kishore Kothapalli, Dip Sankar Banerjee, P. J. Narayanan, Surinder Sood, Aman Kumar Bahl, Shashank Sharma, Shrenik Lad, Krishna Kumar Singh, Kiran Matam, Sivaramakrishna Bharadwaj, Rohit Nigam, Parikshit Sakurikar, Aditya Deshpande, Ishan Misra, Siddharth Choudhary, Shubham Gupta
arXiv, 2013  
arXiv
|
|
Geometry directed browser for personal photographs
Aditya Deshpande, Siddharth Choudhary, PJ Narayanan, Krishna Kumar Singh, Kaustav Kundu, Aditya Singh, Apurva Kumar
ICVGIP, 2012  
pdf / bibtex
|
|
Improving the Efficiency of SfM and its Applications
Siddharth Choudhary
MS Thesis, 2012  
pdf / www
|
|
Visibility Probability Structure from SfM Datasets and Applications
Siddharth Choudhary,
P. J. Narayanan
ECCV, 2012  
pdf / www / bibtex
Encode the visibility information between and among points and cameras as visibility probabilities for improved localization and triangulation.
|
|
Practical Time Bundle Adjustment for 3D Reconstruction on the GPU
Siddharth Choudhary,
Shubham Gupta,
P. J. Narayanan
ECCV CVGPU Workshop, 2010  
pdf / ppt / tutorial
Hybrid implementation of sparse bundle adjustment on the GPU using CUDA, with the CPU working in parallel.
|
|