• So many misconceptions about architectures (esp. encoder-decoder vs. decoder-only) are partly due to the nomenclature being confusing.
  • EncDec, PrefixLMs, and causal decoder-only models are all autoregressive. Even T5/UL2’s objective is autoregressive.
  • All 3 archs are not that different. People…
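
One way to see how close the three architectures are is to compare their attention masks. The sketch below is an illustration, not something taken from the thread: the function names (`causal_mask`, `prefix_lm_mask`, `enc_dec_masks`, `combined_enc_dec_mask`) are hypothetical, and it deliberately ignores parameter sharing and the fact that cross-attention in an encoder-decoder reads the encoder's final-layer outputs. Under those assumptions, laying an encoder-decoder's masks out over the concatenated input and target positions gives the same pattern as a PrefixLM mask, and a PrefixLM with an empty prefix reduces to a causal decoder-only mask.

```python
# Illustrative attention masks: mask[i][j] is True when position i may attend to position j.
# All names here are hypothetical helpers, not any library's API.

def causal_mask(n):
    """Causal decoder-only: each position sees itself and earlier positions."""
    return [[j <= i for j in range(n)] for i in range(n)]

def prefix_lm_mask(n, prefix_len):
    """PrefixLM: bidirectional attention inside the prefix, causal afterwards."""
    return [[j <= i or j < prefix_len for j in range(n)] for i in range(n)]

def enc_dec_masks(src_len, tgt_len):
    """Encoder-decoder: three separate masks."""
    enc_self = [[True] * src_len for _ in range(src_len)]  # bidirectional over the input
    dec_self = causal_mask(tgt_len)                        # causal over the target
    cross = [[True] * src_len for _ in range(tgt_len)]     # every target sees all input
    return enc_self, dec_self, cross

def combined_enc_dec_mask(src_len, tgt_len):
    """Lay the encoder-decoder masks out over the concatenated [input; target] positions."""
    enc_self, dec_self, cross = enc_dec_masks(src_len, tgt_len)
    top = [row + [False] * tgt_len for row in enc_self]    # input never sees the target
    bottom = [cross[i] + dec_self[i] for i in range(tgt_len)]
    return top + bottom

if __name__ == "__main__":
    # Same connectivity pattern for encoder-decoder and PrefixLM.
    print(combined_enc_dec_mask(3, 2) == prefix_lm_mask(5, 3))
    # PrefixLM with an empty prefix is a plain causal mask.
    print(prefix_lm_mask(5, 0) == causal_mask(5))
```

In all three cases the target positions are generated left to right, each conditioned on the input and on earlier targets, which is what makes all of them autoregressive.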