Does FAST+ use relative or absolute actions?
Hello,
One thing I could not find in the paper is whether the FAST+ tokenizer was trained on absolute or relative (velocity) actions. I had assumed it used absolute actions, but the DROID Policy Setup section says it was trained on joint velocities for that experiment. So, to use FAST+, should we provide relative or absolute actions? Also, do you have any insight into which should work better when training FAST from scratch on new data?
Thanks in advance!
It's trained on a mix -- most of the data is "relative in chunk", i.e. `absolute_actions - absolute_actions[0]` for each chunk.
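For anyone reading along, a minimal numpy sketch of that "relative in chunk" normalization (the function name and chunk shapes below are just illustrative, not from the FAST codebase):

```python
import numpy as np

def make_relative_in_chunk(action_chunks: np.ndarray) -> np.ndarray:
    """Turn absolute action chunks into 'relative in chunk' form,
    i.e. absolute_actions - absolute_actions[0] within each chunk.

    action_chunks: (num_chunks, chunk_len, action_dim) absolute actions.
    """
    return action_chunks - action_chunks[:, :1, :]

# Example: 2 chunks of 8 steps with a 7-DoF action.
chunks = np.cumsum(np.random.randn(2, 8, 7) * 0.05, axis=1)  # smooth-ish absolute trajectories
relative = make_relative_in_chunk(chunks)
assert np.allclose(relative[:, 0], 0.0)  # first step of every chunk becomes zero
```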
Some datasets like DROID use joint velocity actions out of the box, so that's what we train the tokenizer on.
We haven't ablated this super carefully, but my hunch is that it should work for either action space (though it would be interesting if you wanted to do a more careful comparison!).

Also, the intuition I have for the BPE part of FAST is that it mostly learns to squash away 0s in the quantized DCT coefficient matrix. As such, it's not too critical what data the BPE tokenizer was pre-trained on, and it should work for a wide range of downstream datasets (as our experiments suggest). Note that in any case, the BPE part of the tokenizer does not change the tokenization error (it's always lossless); it only changes the achievable compression ratio, and in practice the differences may not be too large!
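To make that intuition concrete, here is a rough sketch of the DCT + quantization step that produces those runs of zeros (the scale factor, chunk length, and action dimension are made up for illustration, not the tokenizer's actual settings):

```python
import numpy as np
from scipy.fft import dct

def quantized_dct_coeffs(chunk: np.ndarray, scale: float = 10.0) -> np.ndarray:
    """Per-dimension DCT over the time axis of one action chunk, then
    round the scaled coefficients to integers (the only lossy step)."""
    coeffs = dct(chunk, axis=0, norm="ortho")  # (chunk_len, action_dim)
    return np.round(coeffs * scale).astype(np.int64)

# A smooth 16-step, 7-DoF chunk: energy concentrates in the low-frequency
# coefficients, so most higher-frequency entries round to zero.
t = (np.arange(16) + 0.5) / 16
chunk = np.tile((t ** 2)[:, None], (1, 7))
q = quantized_dct_coeffs(chunk)
print(f"{(q == 0).mean():.0%} of quantized coefficients are zero")
# BPE then merges these long runs of zeros into a few tokens; it adds no
# reconstruction error on top of the rounding above, it only changes how
# many tokens the same quantized matrix compresses into.
```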
Got it, thank you for the very thorough answer and the amazing work on FAST!