Ask HN: If I cancel Codex today, what's the next best local inference agent?

Better place to ask than /r/LocalLLaMA.

7 points | by Bulbasaur2015 5 hours ago

2 comments

  • bigyabai 5 hours ago
    For local inference? It entirely depends on what your hardware is.
  • verdverm 4 hours ago
    OpenCode + vLLM. The model will depend on your hardware, but OpenCode also has a killer $10/month plan with quotas for some top-tier open-weight models.

    I'm using qwen3.6 on a DGX Spark; llama.cpp has prompt cache bugs for Qwen/Gemma models (among others being reported). I use my OpenCode-go sub when I want a bigger, more capable model.