• Episode AI notes
  1. Alan Cowen’s goal at Hume AI is to optimize AI models for user happiness by using large-scale data to determine what influences people’s emotions.
  2. Reinforcement learning from human feedback (RLHF) relies on rater opinions that can be biased, so models end up optimized for positive ratings rather than genuine user experience.
  3. Concerns about the privacy implications of AI devices center on building trust by ensuring the AI's decisions prioritize individual well-being rather than engagement or product sales.

  • Optimizing AI Models for User Happiness Summary: Hume’s goal is to use large-scale data to determine what makes people happy or unhappy, and to optimize AI models for user happiness over time. This requires objective proxies of human emotional experience, because relying solely on human labels is not robust. In reinforcement learning from human feedback, rater judgments are biased toward inoffensive, carefully couched responses, which makes models sycophantic to the raters rather than optimized for the user’s experience. The ultimate aim is for AI models to generate responses that are tailored for user happiness.

    Speaker 1
    Hume’s mission is to take large-scale data and use it to extract information about what determines whether people are happy or sad, and then use that to optimize AI models for what makes people happy over short and long periods of time. You can only do that if you have measures of objective proxies of human emotional experience, because that’s scalable. Otherwise, you’d rely on human labels. Reinforcement learning from human feedback is a version of that that just relies on human labels. But it’s not as robust. It doesn’t involve people in their actual circumstances. It’s human raters’ opinions, and those raters are an arbitrary group of people, not necessarily experts in what people are asking about. The raters will generally say, oh, this is a very carefully thought-out response. They’re biased toward these more inoffensive responses that do a lot of couching and stay non-controversial. It makes the models boring, honestly, because at the end of the day, these models are not trained for you to have a good experience. They’re trained for these raters to give a positive rating. There’s a difference there. And the model ends up being more sycophantic; there are a lot of issues with it. What you want is, in the application, when you generate a response, for that response to be optimized for your happiness as a user. That’s what the model should be optimized for.
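
    A minimal toy sketch of the distinction Cowen draws here: the same set of candidate responses looks different depending on whether you score them by predicted rater approval (the RLHF-style objective) or by a proxy for the user’s actual experience. This is not from the episode or Hume’s stack; all names, scores, and the best-of-n selection stand-in are hypothetical illustrations of the point, not an implementation of either system.

    ```python
    # Toy sketch (hypothetical, not from the episode): "rater_approval" stands in
    # for a reward model trained on annotator preferences; "user_experience"
    # stands in for an objective proxy of how the actual user felt in context.
    from dataclasses import dataclass

    @dataclass
    class Candidate:
        text: str
        rater_approval: float   # predicted score from a preference-trained reward model
        user_experience: float  # measured/estimated signal from the user in context

    def pick_best(candidates, objective):
        """Best-of-n selection as a stand-in for training pressure:
        whichever score you optimize is the behavior you get more of."""
        return max(candidates, key=objective)

    candidates = [
        Candidate("Carefully couched, non-controversial answer.",
                  rater_approval=0.9, user_experience=0.4),
        Candidate("Direct, specific answer tailored to the user's situation.",
                  rater_approval=0.6, user_experience=0.9),
    ]

    # Optimizing for rater approval favors the inoffensive, "boring" response...
    print(pick_best(candidates, lambda c: c.rater_approval).text)
    # ...while optimizing for the user's experience favors the tailored one.
    print(pick_best(candidates, lambda c: c.user_experience).text)
    ```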
  • Optimizing Models for User Happiness Summary: Reinforcement learning from human feedback relies on human labels, which are not robust and reflect the opinions of an arbitrary group of raters. Models trained this way prioritize positive ratings over the user’s actual experience, leading to sycophantic, boring responses. The goal should be to optimize model responses for user happiness, keeping in mind how little the model can learn about the user from text input alone.

    Speaker 1
    What you want is, in the application, when you generate a response, for that response to be optimized for your happiness as a user. That’s what the model should be optimized for.
    Speaker 2
    Right. And you’re saying there’s only so much it can know about me from me typing in a text box. Yeah.
    Speaker 1
    Yeah. I mean, even if they have that data, the text is a very impoverished modality. It’s very narrow. And you’re not usually going to tell the model this was a bad response.
  • Building Trust in AI and Privacy Protection Summary: Concerns about devices listening and processing data break into two questions: whether the data is protected from exposure to other people, and whether the AI itself can be trusted with it. Building that trust requires the AI to demonstrate that it makes decisions based on what is right for the individual, moving away from optimizing for engagement or selling products and toward the individual’s well-being.

    Speaker 2
    What do you think about, I mean, I’m not the first person that’s asked this, but what do you think about the privacy implications of that? These devices are now listening in a way that we haven’t had before, and not just listening, but feeding a new type of dataset into a model like the one we’re talking about, which hasn’t really existed before. How are humans going to get over that hump?
    Speaker 1
    Yeah, I think that’s going to be a real concern, and people have to trust the AI that’s processing their data. There are two layers to the question. One is, what is the AI doing with your data? The other is, is this data protected from being exposed to other humans? Those are two different questions. The second one, privacy, you’ve in a sense already solved by assumption: you already have tons of data on your phone and on your computer, so in order to live in this world you’re already assuming that your privacy is protected by the devices you’re using. Once you’ve solved that part, the other side of the question is, is the AI doing something with your data that you want it to do? You really need to trust these models, these algorithms. How do they build that trust? I think they have to demonstrate that they make decisions on the basis of what’s right for you and not some third party. So we have to move away from optimizing algorithms for engagement, or for just selling you things, or for any third-party objective that could conflict with your own well-being.