I am working within IsaacLab to systematically reduce observation dimensionality: RGBD sensor data for target objects is converted into compact states via an image approximation pipeline.

pretrained-both-frozen.mov

RGBD -> Object Recognition -> SAM (XMem) -> 3D deprojection -> centroid state
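The last two stages of the pipeline can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes a pinhole camera model, a depth image in metric units along the camera z-axis, and a boolean segmentation mask already produced by the earlier stages. The function name `deproject_mask_centroid` and the intrinsics parameters `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for this example.

```python
import numpy as np


def deproject_mask_centroid(depth, mask, fx, fy, cx, cy):
    """Return the 3D centroid (camera frame) of the masked pixels.

    depth: (H, W) array of metric depth along the camera z-axis (assumed).
    mask:  (H, W) boolean array, True where the target object was segmented.
    fx, fy, cx, cy: pinhole intrinsics in pixels (assumed camera model).
    """
    # Ignore masked pixels whose depth reading is missing or invalid.
    valid = mask & np.isfinite(depth) & (depth > 0)
    if not valid.any():
        return None  # object not visible in this frame

    v, u = np.nonzero(valid)  # pixel rows (v) and columns (u)
    z = depth[v, u]
    # Invert the pinhole projection for each masked pixel.
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    # The state is the mean of the deprojected points.
    return np.stack([x, y, z], axis=1).mean(axis=0)


# Toy example: a 2x2 object patch, 2 m away, centered on the optical axis.
depth = np.full((4, 4), 2.0)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
centroid = deproject_mask_centroid(depth, mask, fx=2.0, fy=2.0, cx=1.5, cy=1.5)
```

With this reduction, each target object contributes three numbers to the state instead of a full RGBD frame. The centroid is expressed in the camera frame; transforming it into a world or robot frame would need the camera pose, which is not covered here.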

detection.png

Screen Recording 2024-09-13 at 16.27.05.mov

I am working on making this process even more efficient by storing the visual information in a latent space and interleaving updates to that latent space across environments.
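One possible reading of the interleaving idea is a staggered, round-robin refresh: each environment keeps a cached latent, and only a subset of environments re-encode their frames on any given step. The sketch below is an assumption about how that could look, not the implementation used here. `InterleavedLatentCache`, `period`, and the stand-in `mean_pool_encoder` are all hypothetical names, and a real setup would use a learned encoder instead of mean pooling.

```python
import numpy as np


class InterleavedLatentCache:
    """Per-environment latent cache refreshed on a staggered schedule (hypothetical)."""

    def __init__(self, num_envs, latent_dim, period):
        self.latents = np.zeros((num_envs, latent_dim), dtype=np.float32)
        self.period = period
        self.env_ids = np.arange(num_envs)

    def update(self, step, frames, encode):
        # Environment i is refreshed when (i + step) is a multiple of `period`,
        # so roughly 1/period of the environments are encoded per step.
        due = self.env_ids[(self.env_ids + step) % self.period == 0]
        if due.size:
            self.latents[due] = encode(frames[due])
        return due


def mean_pool_encoder(frames):
    # Stand-in encoder (assumption): average each frame over height and width,
    # leaving one value per channel.
    return frames.mean(axis=(1, 2))


# Toy example: 4 environments, 3-channel 8x8 frames, refresh every 2 steps.
cache = InterleavedLatentCache(num_envs=4, latent_dim=3, period=2)
frames = np.ones((4, 8, 8, 3), dtype=np.float32)
due_step0 = cache.update(0, frames, mean_pool_encoder)  # environments 0 and 2
due_step1 = cache.update(1, frames, mean_pool_encoder)  # environments 1 and 3
```

The trade-off in this sketch is staleness against compute: a larger `period` encodes fewer frames per step, but each cached latent can be up to `period - 1` steps old when the policy reads it.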