• Reflections on Competition in the AI Space Summary: The speaker introduces the concept of the ‘four wars’ as a framework for analyzing the significant events in the AI space. These wars are the data wars, the GPU rich-poor war, the multi-modality war, and the RAG/Ops war. The discussion turns to Inflection, a well-known contender in the AI domain that recently saw most of its team move to Microsoft and a new CEO appointed. This shift raises questions about the harsh reality of competition in the AI field and the importance of resources like talent and compute power.

    Speaker 1
    Yeah, so maybe I’ll take this one. So the four wars is a framework that I came up with when trying to recap all of 2023. I tried to write sort of monthly recap pieces, and I was trying to figure out what makes one piece of news last longer, or matter more, than another. And I think it’s basically always about battlegrounds: wars are fought over limited resources. Probably the most limited resource is talent, but talent expresses itself in a number of areas, so I focused on those areas first. The four wars that we cover are the data wars, the GPU rich-poor war, the multi-modality war, and the RAG/Ops war. And I think you actually did a dedicated episode on that, so thanks for covering it.
    Speaker 2
    Yeah, yeah. Not only did I do a dedicated episode, I actually use it. I can’t remember if I told you guys, but I did give you big shout-outs, and I used it as the framework for a presentation at a big AI event that they hold each year, where they bring together all their folks who are working on AI internally. And it totally resonated. So that’s amazing. Yeah. So what got me thinking about it again is specifically this Inflection news that we recently had. I can’t imagine that anyone who’s listening wouldn’t have thought about it. But, you know, Inflection is one of the big contenders, right? I think most folks would have put them just a half step behind the Anthropics and OpenAIs of the world in terms of labs. It’s a company that raised 1.3 billion dollars less than a year ago. Reid Hoffman is a co-founder, along with Mustafa Suleyman, who is a co-founder of DeepMind. So this is not a small startup, at least in terms of perception. And then we get the news that most of the team, it appears, is heading over to Microsoft, and they’re bringing in a new CEO. So I’m interested in your take: setting aside all the other things it might be about, how much does that reflect the stark, brutal reality of competing in the frontier model space right now, and just the access to compute?
  • The Battle of Modality Models in AI Development Summary: The battle in AI development revolves around the competition between large multi-modality companies and small dedicated-modality companies. The trend is shifting towards the large companies, as seen in Sora’s success in video generation. Having multiple state-of-the-art models under one roof brings synergy, as in the case of Sora and DALL-E: it allows for cross-modality enhancements and synthetic-data improvements. Startups focusing on a single modality face challenges in keeping up with these advancements. Despite this, each company carves out its niche, like Suno AI in the music domain, drawing engagement from users well beyond the obvious target audience. The recommendation is to read the Sora and DALL-E blog posts to understand the key methodologies and the advantage of having multiple models collaborating in one ecosystem, which is a limitation for dedicated-modality companies.

    Speaker 2
    We wandered very naturally into another one of these wars, the multimodality one, which is basically a question of whether it’s going to be these big everything-models that end up winning, or whether you’re going to have really specific things, like a DALL-E 3 inside of OpenAI’s larger models versus a Midjourney or something like that. And for most of the last, call it, six months, it felt pretty definitively both-and: you’re seeing great innovation on the everything-models, but you’re also seeing lots and lots happen at the level of individual use cases. But then Sora comes along and just obliterates where I think anyone thought we were when it comes to video generation. So how are you guys thinking about this particular battle or war at the moment?
    Speaker 1
    Yeah, this was definitely a both-and story, and Sora tipped things one way for me, in terms of scale being all you need. And there’s the benefit of having multiple models developed under one roof. I think a lot of people aren’t aware that Sora was developed in a similar fashion to DALL-E 3. DALL-E 3 had an interesting paper out where they talked about how they bootstrapped their synthetic data based on GPT-4 Vision and GPT-4, and it was all really interesting: if you work on one modality, it enables you to work on other modalities, and all of that is more beneficial if it’s all in the same house. Whereas the individual startups who carve out a single modality and work on that definitely won’t have the state-of-the-art stuff helping them on synthetic data. So I do think the balance has tilted a little bit towards the God-model companies, which is challenging for the dedicated-modality companies. But everyone’s carving out different niches. Like, we just interviewed Suno AI, the music model company, and I don’t see OpenAI pursuing music anytime soon.
    Speaker 2
    Yes, Suno has been phenomenal to play with. Suno has done that rare thing, which I think a number of different AI product categories have done, where people who don’t consider themselves particularly interested in doing the thing the AI enables find themselves doing a lot more of that thing, right? Like, it’d be one thing if just musicians were excited about Suno and using it, but what you’re seeing is tons of people who just like music all of a sudden playing around with it and finding themselves down that rabbit hole, which I think is the highest compliment you can give one of these startups in its early days.
    Speaker 1
    Yeah, you know, I asked them directly in the interview whether they consider themselves the Midjourney for music. He had a more nuanced response there, but I think the business model is probably going to be very similar, because he’s focused on the B2C element of it. So, just to tie back to the question about large multi-modality companies versus small dedicated-modality companies: I highly recommend people read the Sora blog post and then read through to the DALL-E 3 blog post, because OpenAI explicitly tied Sora to the same synthetic-data bootstrapping methods as DALL-E 3. Once you make those connections, you’re like, oh, it is beneficial to have multiple state-of-the-art models in house that all help each other. And that’s the one thing a dedicated-modality company cannot do.
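    To make that synthetic-data bootstrapping concrete, here is a minimal Python sketch of the recaptioning idea the DALL-E 3 report describes: replace most weak alt-text captions with dense captions from a vision-language model before training the image generator. The captioner function and the mixing ratio here are illustrative assumptions, not OpenAI’s actual pipeline.

        import random

        def recaption_dataset(pairs, captioner, keep_original_frac=0.05):
            """Swap most original alt-text captions for dense synthetic ones.

            pairs: list of (image, original_caption) tuples.
            captioner: any image -> caption function; hypothetical here,
                standing in for GPT-4 Vision-style recaptioning.
            keep_original_frac: keep a small share of human captions so the
                trained model still handles short, human-style prompts.
            """
            recaptioned = []
            for image, original_caption in pairs:
                if random.random() < keep_original_frac:
                    recaptioned.append((image, original_caption))
                else:
                    recaptioned.append((image, captioner(image)))
            return recaptioned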
  • Focus on Specificity for Success Summary: Companies are increasingly favoring vertical agents tailored to specific domains over generic, bottoms-up products. The challenge with overly broad applications is that they struggle to achieve reliability and real effectiveness. Entrepreneurs are now prioritizing specialized use cases, as this approach appeals more to investors, who seek clear, actionable solutions rather than abstract generic functionality. As a result, successful ventures are emerging in niches such as financial research, security, compliance, and legal, where targeted applications resonate more with user needs and market demands.

    Speaker 3
    And David Luan from Adept, in our episode, specifically said: we don’t want to do a bottoms-up product, we don’t want something that everybody can just use and try, because it’s really hard to get it to be reliable. So we’re seeing a lot of companies doing vertical agents that are narrow, for a specific domain, and very good at one thing. Mike Conover, who was at Databricks before and is also a friend of Latent Space, is doing this new company called Brightwave, building AI agents for financial research, and that’s it, you know, and they’re doing very well. There are other companies doing it in security, in compliance, in legal, all of these things where nobody just wakes up and says, oh, I cannot wait to go on AutoGPT and ask it to do a compliance review of my thing. That’s just not what inspires people. So I think the gap has been that on the developer side, the more bottoms-up hacker mentality is to build these very generic agents that can do lots of open-ended tasks, while the more business side of things is like, hey, if I want to raise my next round, I cannot just sit around and mess around with super generic stuff; I need to find a use case that really works. And I think that has worked for a lot of folks. In parallel, you have a lot of companies doing email.
  • Embrace the Evolution of Diffusion Technology Summary: Ongoing advances in diffusion technology are reshaping the landscape of AI-generated art and text. With pioneers like Bill Peebles leading the way, innovations such as Stable Diffusion 3, Hourglass diffusion, and SDXL Turbo are improving efficiency, reducing costs, and simplifying creation. Individuals who believe that generating diffusion art is slow or expensive are not up to date with the latest models. Additionally, text diffusion presents a promising alternative approach, generating entire text segments at once through diffusion models rather than traditional token-by-token decoding.

    Speaker 4
    The guy who wrote the diffusion transformer paper, Bill Peebles, is the lead technical guy on Sora. So you’ll just see a lot more diffusion transformer stuff going on.
    Speaker 1
    But there’s more experimentation happening in diffusion. I’m holding a meetup here in San Francisco that’s going to be on the state of diffusion, which I’m pretty excited about. Stability is doing a lot of good work. And if you look at the architecture of how they’re creating Stable Diffusion 3, Hourglass diffusion, and the consistency models or SDXL Turbo, all of these are very, very interesting innovations on the original idea of what Stable Diffusion was. So if you think it is expensive or slow to create AI-generated art with diffusion, you are not up to date with the latest models. If you think it is hard to create text in images, you are not up to date with the latest models. And people are still kind of far behind.
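    As one concrete data point on how cheap and fast this has become, here is the published diffusers-library usage for SDXL Turbo, which produces an image in a single denoising step; the model name and call pattern follow Stability’s example, though exact arguments may vary across library versions.

        import torch
        from diffusers import AutoPipelineForText2Image

        # SDXL Turbo is distilled (adversarial diffusion distillation) so that
        # sampling collapses to a single step, with guidance disabled.
        pipe = AutoPipelineForText2Image.from_pretrained(
            "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
        )
        pipe.to("cuda")

        image = pipe(
            prompt="a watercolor painting of a fox in the snow",
            num_inference_steps=1,  # one step instead of the usual 25-50
            guidance_scale=0.0,     # classifier-free guidance is off for Turbo
        ).images[0]
        image.save("fox.png")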
    Speaker 4
    The last piece, which is the wild card I always hold out, is text diffusion. So instead of using autoregressive transformers, can you use diffusion to generate text?
    Speaker 1
    Yes. You can use diffusion models to generate entire chunks of text all at once, instead of token by token.
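    To sketch that contrast in code: an autoregressive decoder emits one token per forward pass, while a text-diffusion model starts from a fully masked or noised sequence and refines every position in parallel over a fixed number of steps. The model interfaces below are hypothetical stand-ins, not any particular library’s API.

        def autoregressive_decode(model, prompt_ids, max_new_tokens):
            # One forward pass per new token; each step conditions on all prior tokens.
            ids = list(prompt_ids)
            for _ in range(max_new_tokens):
                ids.append(model.predict_next_token(ids))  # hypothetical API
            return ids

        def diffusion_decode(model, length, num_steps, mask_id=0):
            # Start from an all-masked sequence and refine ALL positions at once
            # on every step, so the number of model calls is num_steps, not length.
            ids = [mask_id] * length
            for step in range(num_steps):
                ids = model.denoise(ids, step)  # hypothetical API: predicts every token
            return ids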
  • Progress is Perpetually Five Years Away Summary: Advancements in technology, particularly in AI and autonomy, often take longer than anticipated, resembling the delayed rollout of self-driving cars. The concept of ‘levels of autonomy’ serves as a useful framework for understanding this progression, indicating that initial expectations may not align with real-world implementation. This highlights the importance of recognizing that technological breakthroughs may face substantial practical challenges, stretching anticipated timelines. Observing the gradual evolution of self-driving capabilities reinforces the notion that achieving practical autonomy demands sustained effort and patience.

    Speaker 4
    How long do you think it is till we get to early early early?
    Speaker 1
    This is my equivalent of AI timelines. I know, I know. There’s lots of activity; I mean, I have supported companies actively working on that. I think it’s more useful to think about levels of autonomy, and so my answer is: perpetually five years away until it figures it out. No, but my actual anecdote, the closest comparison we have, is self-driving. We’re doing this in San Francisco, for those who are watching the livestream. If you haven’t come to San Francisco and taken a Waymo ride, just come, get a friend, take a Waymo ride. I remember in 2014 we covered a little bit of autos at my hedge fund, and I remember telling a friend, self-driving cars are around the corner, this is it, parking will be a thing of the past. And it didn’t happen for the next 10 years. But now most of us in San Francisco can take it for granted. So I think you just have to be mindful that the rough edges take a long time. Yes, it’s going to work in demos, then it’s going to work a little bit further out, and it’s just going to take a long time. The more useful mental model I have is levels of autonomy. In self-driving, you have levels one through five, graded by the amount of human attention required. At first your hands are always at ten and two, and you have to pay attention to the driving every 30 seconds.