• Excited to go to the NeurIPS conference tomorrow! It’s an annual gala for AI. Many revolutionary ideas debuted here, like AlexNet, Transformer, & GPT-3. I read all 15 Outstanding Papers and can’t wait to share my thoughts with you all. Here’s your front row seat to the festival: 🧵

  • For each paper, I’ll give a TLDR and a note on why I think it’s significant. I may also link interesting blog posts and websites that dive deeper. Original authors are welcome to chime in and expand the discussion or correct any mistakes! Tweets are indexed by paper.

  • Training Compute-Optimal Large Language Models. Hoffmann et al., @DeepMind. TLDR: introduces a new 70B LM called “Chinchilla” 🐭 that outperforms much bigger LMs (GPT-3, Gopher). To be compute-optimal, model size and training data must be scaled in equal proportion. 1.1/

  • Chinchilla’s discoveries are profound. It shows that most LLMs are severely starved of data and under-trained. Under the new scaling law, even if you pump a quadrillion parameters into a model (GPT-4 urban myth), the gains won’t match what you’d get from simply training on 4x more tokens 😱 1.2/
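
For intuition, here is a minimal back-of-the-envelope sketch of the Chinchilla rule of thumb: training compute C ≈ 6·N·D for a dense transformer, with the compute-optimal token-to-parameter ratio D/N roughly constant at ~20. The helper below is my own illustration, not the authors’ code.

```python
import math

def chinchilla_optimal(compute_flops: float, tokens_per_param: float = 20.0):
    """Approximate compute-optimal params (N) and tokens (D) for a budget C.

    Assumes C ~= 6 * N * D and D ~= tokens_per_param * N, the widely
    quoted reading of Hoffmann et al. -- an illustrative sketch only.
    """
    # Substitute D = r * N into C = 6 * N * D  =>  N = sqrt(C / (6 * r))
    n_params = math.sqrt(compute_flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# At a Gopher-scale budget (~5.76e23 FLOPs) the rule recovers roughly a
# 70B-parameter model trained on ~1.4T tokens -- Chinchilla's recipe.
n, d = chinchilla_optimal(5.76e23)
print(f"N ~= {n / 1e9:.0f}B params, D ~= {d / 1e12:.1f}T tokens")
```

Note that under this rule, doubling compute scales N and D by √2 each: model size and data grow together, which is exactly why piling on parameters alone stops paying off.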

  • Is this why OpenAI created the “Whisper” speech recognition system, so they can feed GPT-4 with another trillion text tokens harvested from YouTube audio? I guess we’ll find out soon! Chinchilla paper: https://t.co/wYC9ezzJcX Fantastic blog post: https://t.co/7xOMdONMkJ 1.3/

  • Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding. Saharia et al., @GoogleAI. TLDR: “Imagen” is a large text-to-image and super-resolution diffusion model that generates beautiful photorealistic images. Beats DALL-E 2 (May 2022) in human ratings. 2.1/

  • The biggest advance over DALL-E 2 is the use of a much stronger text encoder (T5-XXL) trained on an enormous text corpus. Though DALL-E 2’s CLIP text encoder is pixel-aware, it is not as strong as T5 at language understanding. This results in better image-text alignment. 2.2/
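
To make the design concrete, here is a minimal sketch of the frozen-text-encoder idea using the Hugging Face transformers API; “t5-small” stands in for the T5-XXL Imagen actually uses, and the U-Net call in the final comment is a hypothetical signature.

```python
import torch
from transformers import T5EncoderModel, T5Tokenizer

# Imagen conditions its diffusion models on embeddings from a *frozen*,
# text-only encoder rather than CLIP's image-aligned text tower.
tokenizer = T5Tokenizer.from_pretrained("t5-small")
encoder = T5EncoderModel.from_pretrained("t5-small").eval()

prompt = "a photorealistic chinchilla reading a NeurIPS paper"
tokens = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():  # the encoder stays frozen; no gradients flow
    text_emb = encoder(**tokens).last_hidden_state  # (1, seq_len, d_model)

# In Imagen these embeddings would feed the denoiser's cross-attention
# at every step, along the lines of (hypothetical):
#   unet(noisy_image, timestep, context=text_emb)
print(text_emb.shape)
```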

  • I’m still looking forward to a public portal to play with Imagen myself! Paper: https://t.co/NnRnwEnK9V Website with lots of fancy images: https://t.co/RmZl3kVBrB 2.3/

  • ProcTHOR: Large-Scale Embodied AI Using Procedural Generation. Deitke et al., @allen_ai. TLDR: ProcTHOR is a simulator that procedurally generates a large variety of interactive, customizable, and physics-enabled houses for training embodied agents. Huge open asset library! 3.1/

  • Like Chinchilla, embodied agent research also needs a ton of diverse data to scale. An agent generates its own experience data via interaction & exploration, so its abilities are upper-bounded by the simulator’s complexity. ProcTHOR offers a scalable way to enrich that experience 3.2/
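
As a rough sketch of what training on ProcTHOR looks like in practice, based on my reading of the project’s README (the `prior` dataset hub and the exact AI2-THOR calls are assumptions; check github.com/allenai/procthor for the pinned versions):

```python
import prior
from ai2thor.controller import Controller

# ProcTHOR-10K: 10,000 procedurally generated, interactive,
# physics-enabled houses, each stored as a JSON scene specification.
dataset = prior.load_dataset("procthor-10k")
house = dataset["train"][0]

# Load a generated house into the AI2-THOR simulator and step an agent.
controller = Controller(scene=house)
event = controller.step(action="MoveAhead")
print(event.metadata["agent"]["position"])
```

Sampling a fresh house per episode is what gives the agent the diverse experience described above, instead of overfitting to a handful of hand-built scenes.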