Why does this have the same inference speed as the actual Llama-2? Wasn't this supposed to be faster?