danpalmer | 9 months ago | on: Attention Wasn't All We Needed
The Llama models are substantially behind the state of the art, particularly in efficiency, so they're probably not the best example for gauging adoption of these sorts of techniques.