Ah, you're thinking of embeddings, which basically come from the encoder stack of the original transformer architecture. Modern GPT-like models (including Claude), however, drop the encoder and use decoder-only architectures.
I could imagine something where encoders pad up to the context length because causal masking doesn't apply and the self attention has learned to look across the whole context-window.
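To make the distinction concrete, here's a minimal numpy sketch (the function name and setup are my own, purely illustrative): the only difference between encoder-style and decoder-style self-attention is whether a causal mask zeroes out attention to future positions before the softmax.

```python
import numpy as np

def attention_weights(scores, causal):
    # scores: (seq_len, seq_len) raw attention logits (query x key)
    n = scores.shape[0]
    if causal:
        # decoder-style: position i may only attend to positions <= i,
        # so mask out the strict upper triangle with -inf
        mask = np.triu(np.ones((n, n), dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    # softmax over the key dimension
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

scores = np.zeros((4, 4))  # uniform logits, just to show the mask's effect
enc = attention_weights(scores, causal=False)  # every row attends to all 4 positions
dec = attention_weights(scores, causal=True)   # row i attends only to positions 0..i
```

With uniform logits, `enc` gives each position equal weight over the full window, while `dec`'s first row puts all its weight on position 0 and later rows spread weight only over earlier positions.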
Nope, a decoder-only transformer is a variant of the original architecture proposed by Google [1]. All variants of GPT that we know about (1 through 3) roughly use this same architecture, which takes only the decoder stack from the original Google paper and drops the encoder [2].