TY - GEN
T1 - SARA-RT
T2 - 2024 IEEE International Conference on Robotics and Automation, ICRA 2024
AU - Leal, Isabel
AU - Choromanski, Krzysztof
AU - Jain, Deepali
AU - Dubey, Avinava
AU - Varley, Jake
AU - Ryoo, Michael
AU - Lu, Yao
AU - Liu, Frederick
AU - Sindhwani, Vikas
AU - Vuong, Quan
AU - Sarlos, Tamas
AU - Oslund, Ken
AU - Hausman, Karol
AU - Rao, Kanishka
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - We present Self-Adaptive Robust Attention for Robotics Transformers (SARA-RT): a new paradigm for addressing the emerging challenge of scaling up Robotics Transformers (RT) for on-robot deployment. SARA-RT relies on the new method of fine-tuning proposed by us, called up-training. It converts pre-trained or already fine-tuned Transformer-based robotic policies of quadratic time complexity (including massive billion-parameter vision-language-action models or VLAs), into their efficient linear-attention counterparts maintaining high quality. We demonstrate the effectiveness of SARA-RT by speeding up: (a) the class of recently introduced RT-2 models [1], the first VLA robotic policies pre-trained on internet-scale data, as well as (b) Point Cloud Transformer (PCT) robotic policies operating on large point clouds. We complement our results with the rigorous mathematical analysis providing deeper insight into the phenomenon of SARA.
UR - https://www.scopus.com/pages/publications/85199654519
U2 - 10.1109/ICRA57147.2024.10611597
DO - 10.1109/ICRA57147.2024.10611597
M3 - Conference contribution
AN - SCOPUS:85199654519
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 6920
EP - 6927
BT - 2024 IEEE International Conference on Robotics and Automation, ICRA 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 13 May 2024 through 17 May 2024
ER -