Only the embedding layer is frozen. 10 epochs. 0.00001 learning rate. 8 batch size. 512 max tokens. AllQuAD dataset.
d2cdea1 (verified), alienit committed on Mar 3, 2024
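A minimal sketch of how the settings in this commit message might be recorded and how "only the embedding layer is frozen" could be expressed as a name-based filter. The commit does not name the base model or the training script, so everything beyond the six stated values is an assumption: the `FineTuneConfig` class, the `is_trainable` and `split_parameters` helpers, the `"embeddings"` name marker (typical of BERT-style Hugging Face models), and the example parameter names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    # Values taken from the commit message.
    num_epochs: int = 10
    learning_rate: float = 1e-5
    batch_size: int = 8
    max_tokens: int = 512
    dataset: str = "AllQuAD"
    # Assumption: embedding parameters contain this substring in their
    # names, as in BERT-style models. Not stated in the commit.
    frozen_name_marker: str = "embeddings"


def is_trainable(param_name: str, config: FineTuneConfig) -> bool:
    """Return False for embedding parameters, True for everything else."""
    return config.frozen_name_marker not in param_name


def split_parameters(param_names, config):
    """Partition parameter names into (frozen, trainable) lists."""
    frozen = [n for n in param_names if not is_trainable(n, config)]
    trainable = [n for n in param_names if is_trainable(n, config)]
    return frozen, trainable


config = FineTuneConfig()

# Hypothetical parameter names, for illustration only.
example_names = [
    "bert.embeddings.word_embeddings.weight",
    "bert.encoder.layer.0.attention.self.query.weight",
    "qa_outputs.weight",
]

frozen, trainable = split_parameters(example_names, config)
print("frozen:", frozen)
print("trainable:", trainable)
```

With a PyTorch model, the same predicate could be applied by looping over `model.named_parameters()` and setting `param.requires_grad = is_trainable(name, config)`; whether the original training run did it this way is not known from the commit.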