Self-Harness: Harnesses That Improve Themselves

(arxiv.org)

62 points | by jonnonz 2 days ago

5 comments

  • drdeca 6 minutes ago
    Was surprised and somewhat disappointed that the article doesn’t appear to evaluate how well the models work when running in the harnesses optimized for the other models. Do they still do better than with the baseline harness? Does each model do worse with a harness optimized (by this process) for the other models than it does with the harness optimized for itself?
  • behnamoh 2 hours ago
    What else is new? Put it in Emacs and let the model improve the harness over time.
  • 7e 1 hour ago
    Pretty obvious stuff; see Terminator for the conclusion (Skynet). Or The Matrix. We really need more work on model alignment, trustworthiness, and control.
  • tlarkworthy 1 hour ago
    [flagged]
  • mncharity 1 day ago
    [dead]