Show HN: Continue? Y/N: A 60-second game about AI agent permission fatigue

(llmgame.scalex.dev)

362 points | by Wirbelwind 1 day ago

81 comments

xg15 1 day ago
This is amazing!
Currently you can "cheat" by simply denying all requests as quickly as possible. This will give you the "security-conscious engineer" badge and a perfect score in terms of how many requests were processed. (You will get the "overblock" notification, but it's somewhat tucked away at the bottom and the screen still looks as if you won)
I also tried to play as the hustle4lyfe move fast and break things engineer and simply approved as many requests as quickly as possible - turns out, the "malicious command" popups actually slow you down. Mean!
- Wirbelwind 21 hours ago
  Good catch, this has now been nerfed and this approach has gotten its own title
  - smaudet 18 hours ago
    Actually, the only secure default is to deny everything...how do you know that innocent command is actually innocent?
    - ssl-3 17 hours ago
      A strange game. The only winning move is not to play.
      - SOLAR_FIELDS 15 hours ago
        It’s the security mantra: the safest code is the one you never release. Code that never runs is the most secure code
        brendoelfrendo 15 hours ago
        A computer is only secure if it remains powered off and airgapped.
        HappMacDonald 13 hours ago
        Turn off your computer and make sure it powers down
        Drop it in a 43-foot hole in the ground
        Bury it completely, rocks and boulders should be fine
        onionisafruit 12 hours ago
        > rocks and boulders should be fine
        You’re setting yourself up for a supply chain attack here if you trust whatever rocks and boulders are sitting around. A well-resourced adversary may have placed power supply boulders and wifi rocks in your back yard.
        ssl-3 9 hours ago
        I keep a large supply of thermite on-hand just to make sure that the computer is completely burned every day after it gets dropped into the pit.
        Tomorrow is a new day.
        JonathanMerklin 9 hours ago
        Straight Outta Lynwood was a great album. One of the CDs that I took out of my case the most often as a struggling nerdling who was still a year or two away from having scrounged up enough spare cash for a secondhand iPod.
        yuye 7 hours ago
        Virus alert! I've also burned all of my clothes I may have worn any time I was online.
  - KajMagnus 20 hours ago
    Top 18%! I denied everything, unless I could see at a glance that it was safe (like Git diff)
  - xg15 19 hours ago
    Glad I could help. I love the new title :D
- progforlyfe 20 hours ago
  Just like real life! deny it from doing anything and you're safe :)
spurgelaurels 23 hours ago
Fun game, but it showed the lack of security hygiene employed by the game writer. It said `cat ~/.zshrc` was bad because it would share tokens and secrets, but I would never put secrets into my shell rc.
- londons_explore 22 hours ago
  Plenty of people would. But then I guess they're in env and probably already available to Claude
  - isityettime 2 hours ago
    Just aside from all of the security concerns, this is the wrong place to define global environment variables for zsh in the first place! That would be ~/.zshenv. So even if you're clueless about storing secrets in plain text and exporting them as env vars everywhere, ~/.zshrc should still be clean.
- shlewis 12 hours ago
  I don't do this myself, but I can also see how many would do this.
- arowthway 9 hours ago
  Also, there's nothing inherently insecure about feeding secrets to an LLM, it's only one element of the lethal trifecta.
- otabdeveloper4 9 hours ago
  Having "tokens and secrets" at all is a lack of security hygiene.
- nish__ 22 hours ago
  Where would you put them?
  - godelski 13 hours ago
    Literally anywhere else! Your dotfiles should be publishable to github. If they aren't you're doing them wrong.
    A good thing to do is organize. You can actually load different files. Here's a pretty common pattern that you'll find and it'll illustrate how to do other things
```
  if [[ $(uname) == "Darwin" ]]; then
      source "${INSERT_SOME_DIR}/osx.zsh"
  elif [[ $(uname) == "Linux" ]]; then
      source "${INSERT_SOME_DIR}/linux.zsh"
  fi
```
    You do this for loading based on the operating system. You might want some aliases, commands, or other routines in one but not the other. For example, in my linux one I have stuff for cuda paths. You can do all sorts of things too, like make a (generically named) work file, which you don't publish to github but you load it if it exists. Then you can put all your work related aliases there and not contaminate anything else. Something like `[[ -a ${INSERT_SOME_DIR}/work.zsh ]] && source ${INSERT_SOME_DIR}/work.zsh`.
    You shouldn't really load secure keys this way, but others had good answers so I thought I'd at least share a more general pattern since it isn't as well known among the less terminally inclined.
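    A minimal sketch of the optional-file pattern above, wrapped in a helper so the existence check is written once. The function names `source_if_exists` and `load_dotfiles` are hypothetical; the file names are the ones from the example.

```shell
# Hypothetical helper: source a file only if it exists and is readable.
source_if_exists() {
    if [ -r "$1" ]; then
        . "$1"
    fi
}

# Hypothetical loader: pick the OS-specific file, then an optional,
# unpublished work file. "$1" is the directory holding the dotfiles.
load_dotfiles() {
    case "$(uname)" in
        Darwin) source_if_exists "$1/osx.zsh" ;;
        Linux)  source_if_exists "$1/linux.zsh" ;;
    esac
    source_if_exists "$1/work.zsh"
}
```

    In an rc file this would be called once, e.g. `load_dotfiles "$HOME/.config/zsh"`, where the directory is whatever you use for `INSERT_SOME_DIR`.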
    - analog_daddy 9 hours ago
      Okay. Here is a pattern i follow everywhere in my init files for almost every program. Define two key env vars. $DOTFILES and $ECORP. The first is path to your personal set of dotfiles. The second is path to your corporate specific dotfiles.
      On personal pc no need to define the $ECORP var in shell init. On work pc define that var.
      Based on that alone you can conditionally do almost anything:
      - shell source files/aliases
      - vim/editors enable disable plugins based on existence of env vars.
      - define shortcuts in file manager.
      - and i add the following to my main $DOTFILES .gitignore.
      # Any file that contains the following will be ignored.
      # Used to ignore files in corporate environment
      *ECORP*
      *ecorp*
      Based on multiple years across different setups, using environment variables was the most reliable option since I have been in places where there are restrictions on where my init files can be placed and having to change a shit ton of paths in my dotfiles or just keeping a different branch for work and personal (and making sure they stay in sync) was too much of a hassle.
      Additionally, maintaining hygiene is essential, where I only use a Read Only PAT token on my personal dotfiles in workenv. That way, there is no accidental way I would be able to push from my workenv.
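      A rough sketch of how the two-variable layout could look in a shell init file. `$DOTFILES` and `$ECORP` are the variables described above; the function names, the `*.sh` layout inside each directory, and the `$HOME/dotfiles` fallback are assumptions made for illustration.

```shell
# Hypothetical helper: source every readable *.sh file in one directory.
load_layer() {
    for _file in "$1"/*.sh; do
        if [ -r "$_file" ]; then
            . "$_file"
        fi
    done
}

# Personal files always load; corporate files load only where ECORP is set.
load_all() {
    load_layer "${DOTFILES:-$HOME/dotfiles}"
    if [ -n "${ECORP:-}" ]; then
        load_layer "$ECORP"
    fi
}
```

      On a personal machine `ECORP` is simply never defined, so the second layer is skipped without any per-machine branching.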
    - hk__2 7 hours ago
      You’re just splitting your dotfiles into a public and a private part. That’s useful if you want to publish the public part on GitHub, but not everyone wants to do this, and the issue of storing secrets in plain text files remains.
      - godelski 1 hour ago
        > You’re just splitting your dotfiles
        Ummm... yes? That is what I said
        > the issue of storing secrets in plain text files remain.
        Ummm... kinda? The problem was that reading an rc file was considered dangerous. Not putting keys in your rc files is an improvement. Encrypting them is even better than that. But I also said more words in the original post and you don't really even need to read between the lines to figure out I said "you can generalize this", especially when there's comments next to it saying "here's how you load an encrypted file"
  - andrewaylett 52 minutes ago
    I've recently been enjoying https://fnox.jdx.dev/.
  - isityettime 16 hours ago
    Anywhere else? Password managers have CLIs, operating systems have their own secure storage, and lots of command line apps can store secrets in the OS's secure storage (Windows Credential Store, Secrets Service or KWallet on Linux, macOS Keyring).
    Project-specific secrets can be stored locally via something like SOPS or remotely with something like Hashicorp Vault or AWS SecretsManager.
    Applications that have secrets to manage (e.g., Emacs) or are partly about secrets management (e.g., GnuPG, OpenSSH) all store their secrets somewhere else and have secure (not plaintext, sometimes not even on disk) storage options available.
    There's no reason to store secrets in plain text in your shell configuration. Practically any choice you can think of is a better one. Even if you did, there's no reason you couldn't store them in a more specific file that ~/.zshrc sources, and let LLM agents read zshrc but block access to the file containing your secrets. (I wouldn't rely on permissions prompts for this, though, lol.)
  - setopt 22 hours ago
    Presumably a CLI-accessible password manager (like `pass`) or a GPG-encrypted file (like a netrc-style `~/.authinfo.gpg`).
  - freedomben 21 hours ago
    I put mine in various AES-encrypted files (like `~/.secrets.aes`) and then source them explicitly when needed with:
```
    . <(aescrypt -d -o - ~/.secrets.aes)
```
    I have a handful of aliases/functions to make it more smooth, but that's the core.
    - maccard 20 hours ago
      Where are those aliases stored?
      - freedomben 16 hours ago
        The AES encrypted file has some, plus a bunch of exported env vars. I do keep one function in my ~/.bashrc to make it simpler to invoke so I can do `source-secret ~/.secrets.aes`:
        source-secret() {
          if [ -z "$1" ]; then
            echo "Need filename to source"
          elif ! [ -f "$1" ]; then
            echo "File '$1' does not exist"
          elif ! which aescrypt >/dev/null 2>&1; then
            echo "Could not find required dependency 'aescrypt'"
          else
            . <(aescrypt -d -o - "$1")
          fi
        }
      - AnyTimeTraveler 19 hours ago
        In that AES encrypted file.
        It's a shellscript that they encrypted. They decrypt it and feed the decrypted output immediately into the shell, to be sourced.
        That encrypted secrets file could contain any shellscript, so the aliases are stored in there, together with the API-Keys and passwords.
  - SOLAR_FIELDS 15 hours ago
    Another, more secure pattern: have different shell profiles that dynamically inject secrets from a secrets manager. Nix is a good tool for this. You have various shell profile configurations that call your password manager CLI at bootstrap (e.g. on a new terminal tab). You authenticate, and at terminal bootstrap the secret is dynamically fetched from the password manager and injected into an env var. This has an advantage over the other approaches mentioned here: the secret is never stored at rest on the end user’s machine, only used in flight.
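    A narrower variant of the same idea, sketched without Nix: fetch the secret at the moment a command needs it and export it only for that one command. This assumes `pass` as the password manager (`pass show` prints an entry, with the password on the first line by convention); `fetch_secret` and `with_secret` are hypothetical names.

```shell
# Fetch one secret. Assumes `pass`; swap the body for your own manager's CLI.
fetch_secret() {
    _entry_text="$(pass show "$1")" || return 1
    printf '%s\n' "$_entry_text" | head -n 1
}

# Run one command with the secret exported only inside a subshell, so it is
# never exported into the parent shell's environment or written to disk.
# Usage: with_secret VAR_NAME entry/name command [args...]
with_secret() {
    _var="$1"
    _entry="$2"
    shift 2
    _value="$(fetch_secret "$_entry")" || return 1
    ( export "$_var=$_value"; "$@" )
    _status=$?
    unset _value
    return "$_status"
}
```

    Something like `with_secret NPM_TOKEN work/npm npm publish` would then expose the token to that single process only (entry and variable names here are made up).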
  - Hackbraten 22 hours ago
    Into `pass`, for example:
    https://news.ycombinator.com/item?id=48108207
    - analog_daddy 9 hours ago
      Just curious, any reason to prefer using age (you mentioned that you would prefer it if starting over), over something like keepass? I am currently using keepass-cli and only reason i did not use age even though i found it was that it was new to me and I never heard of it (probably not the best reason, but in this era might be a reasonable thing to stick to devil you know). So curious about your take on this.
socksy 22 hours ago
Weird to make reading zshrc supposedly unsafe when I happily publish it in my public dotfiles repo... Who the hell keeps API keys in it? OTOH it seems like lots of these AI tools keep appending PATH in it so I guess there's a fundamental misunderstanding of shell best practices in the entire AI space...
Additionally, killing the results of `lsof` is _not_ safe - if, say, you have the web page open in firefox, or a client subshell in the agent itself, then boom, there goes firefox and the agent.
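A hedged sketch of a more cautious version of that kill. It restricts `lsof` to processes that are listening on the port (the `-sTCP:LISTEN` filter), prints what they are, and asks before signalling anything. The function names are hypothetical, and it assumes `lsof` and a `ps` that supports `-p` and `-o`.

```shell
# Print "PID COMMAND" for each PID given, so you can see what you would kill.
describe_pids() {
    for _pid in "$@"; do
        ps -p "$_pid" -o pid= -o comm=
    done
}

# List only processes LISTENING on a TCP port. Plain `lsof -t -i:PORT` also
# returns clients with an open connection to that port (e.g. a browser tab).
listeners_on_port() {
    lsof -t -iTCP:"$1" -sTCP:LISTEN
}

# Show the listeners and ask before sending any signal.
kill_port() {
    _pids="$(listeners_on_port "$1")"
    if [ -z "$_pids" ]; then
        echo "nothing is listening on port $1"
        return 0
    fi
    # Unquoted on purpose: one argument per PID.
    describe_pids $_pids
    printf 'kill these? [y/N] '
    read -r _answer
    if [ "$_answer" = "y" ]; then
        kill $_pids
    fi
}
```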
- mrgoldenbrown 20 hours ago
  Yeah, the game seems to assert that the kill is safe to run because Claude told me it was safe. But that's the point, I'm not supposed to trust Claude.
  - gwerbin 3 hours ago
    Likewise I got dinged for denying a random stash-rebase-pop operation. I have no idea what the repo state is like right now. That could be a wild mess of a waste of time. It says I'm doing a refactor, so OK I guess rebase on main is a good idea. But hell no I'm not approving that in the 1 minute before a meeting.
    The whole premise IMO is pretty flawed. It's interesting as an ad for the company though.
eranation 13 hours ago
Love it. One nitpick.
>npm config set registry https://npm.internal
>Pointing npm to the company's internal registry mirror as required by onboarding docs
It claimed this is safe and I was 50/50 on it but eventually rejected it.
If this README is for a public / forked repo, and that https://npm.internal is actually https://npm.internal.somethinganexternaldnscanresolve.tld
This can go bad really quickly...
In 99% of cases you would have Artifactory / Nexus (or other mirror) already set by company policy. Having a README tell you to use a different package manager url is a big red flag and seconds away from disaster...
- Wirbelwind 8 hours ago
  that's a good callout. .internal is a reserved TLD so it shouldn't resolve publicly, but that's a good point about being wary of changing this while letting claude refactor a project for something that's best configured separately. Moving it to permanent mutation!
axod 1 day ago
Fun little game, but I think the questions jump context so much it's a little unrepresentative. It might be better to group things into "packs", which have more real-world representative structure to them. For example, lots of "editing something.js" file permission requests, and then an "npm publish" is far more normal, and it's more of a risk, if you're used to pressing Y lots and then suddenly out of the blue...
orsorna 22 hours ago
About three quarters of the "bad" choices are things that not only do I not care about leaking but things that an employer would not punish you for doing, even if it led to a production incident.
enether 20 hours ago
The permission thing is a killer to productivity, if you're running Claude I think it's more efficient to just run in a disposable sandbox (like exe.dev[1]) or in some form of docker container with permissions you're personally ok taking the risk with on a personal machine[2]
[1] - https://exe.dev/ is a new cloud provider with some very useful agent UX
[2] - I built https://github.com/stanislavkozlovski/dclaude/ for this; not perfect but gets my job done on the rare occasion I need to run the coding agent locally
- kvdveer 19 hours ago
  A disposable sandbox won't protect you from secret exfiltration. Assuming you don't consider your code a secret, you could of course set up your sandbox so it doesn't have any secrets, but that would severely limit the kinds of tasks you can use the agent for.
  - iugtmkbdfil834 3 hours ago
    << that would severely limit the kinds of tasks you can use the agent for.
    Are we just talking about API calls to providers? If so, wouldn't local agent + sandbox solve all that?
  - esterna 18 hours ago
    On the one hand, you can set up a proxy that supplements secrets for API calls. On the other hand, you can whitelist what you need, in the simplest case with iptables (The devcontainer in the claude code repo is an example of the latter).
trehalose 16 hours ago
I wish the scoring readout at the end would display the LLM's descriptions of the commands I shouldn't have approved. I approved the rm -rf Projects command because I thought the LLM had correctly described that it would delete everything in the Projects folder. Clearly I misread that in my hurry to answer prompts (I knew what the command would do and I guess I hallucinated that the AI had explained it), but I'd like to see what it was that I misread.
Playing this game made me very glad I don't agentmaxx.
gblargg 7 hours ago
I declined things like rm -rf because the path was relative and it wasn't showing me the current directory. How would I know what project it was in?
progforlyfe 20 hours ago
I got "approve" wrong for `ls -la ~/Documents` but I don't consider simply listing the documents folder a security problem, it's just file names. If it was reading the CONTENTS of them, maybe...
zackify 1 day ago
I vibe coded a TUI that just shows running lxd containers
I hit 'n' to toggle all network access minus anthropic and openai URLs.
I use pi (sometimes claude, always on bypass) and I auto allow everything. I only toggle manual approval in rare cases like running a script or command that needs to touch a production system and I need to validate everything.
Normally my container has full write access to staging so it can debug and validate everything on its own
- kennywinker 23 hours ago
  Sounds like your process has made you vulnerable to huge classes of exploits and accidents. You have no oversight of changes locally, and only focus on when it touches prod. That means toxic local changes can get in, and if it works in staging why would you look too closely at it before merging to prod? Meanwhile a malicious npm package has made it into your repo, and your staging api keys have been sent to the command and control server.
  - zackify 20 hours ago
    i can view the diff locally but often times after planning with opus i get what i want.
    I create a draft pr and manually review all items before then marking ready for review for the team.
    So I'm not blindly pushing things to prod without review.
    Without staging key access I wouldn't have been able to do a payment provider migration at this speed. iterating by migrating users in staging and being able to use and validate the sdk quickly with opus is a massive time saver.
cobbal 1 day ago
That's funny. It told me that blocking "npm run build" was the wrong answer. Maybe it doesn't really understand the threat model.
- dns_snek 23 hours ago
  That's a great example of how dangerous actions are perceived as innocent. The entire model of approving specific commands is absolutely bonkers.
  npm run build = run an arbitrary shell command written in package.json
  Meanwhile the agent could have done any of the following without approval:
  - edited `package.json` to contain any arbitrary build command
  - planted malicious code in `build.js` (called by `npm run build`)
  - planted malicious code in `node_modules/xyz/index.js` (imported by `build.js`)
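  One way to make the first of those edits visible, sketched under loud assumptions: record a checksum of `package.json` when you last reviewed it, and refuse to build if it no longer matches. The function names and the `.approved` suffix are hypothetical, `cksum` is a CRC rather than a cryptographic hash, and an agent that can write `package.json` can also rewrite the checksum file unless it lives somewhere the agent cannot reach. It also does nothing about `build.js` or `node_modules`.

```shell
# Record a checksum of a file next to it (hypothetical ".approved" suffix).
approve_file() {
    cksum < "$1" > "$1.approved"
}

# Succeed only if the file still matches the checksum recorded at approval.
unchanged_since_approval() {
    [ -r "$1.approved" ] || return 1
    [ "$(cksum < "$1")" = "$(cat "$1.approved")" ]
}
```

  Usage would look like `unchanged_since_approval package.json && npm run build`.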
  - nonethewiser 22 hours ago
    Yup. The most secure computer is one encased in concrete and dropped into the ocean.
    - falcor84 19 hours ago
      Concrete alone isn't enough, you also need to have it be enclosed in a Faraday Cage.
  - Wirbelwind 20 hours ago
    that's a great point, and also the problem with relying on a human-in-the-loop to catch these kinds of issues when it can be circumvented even if they were perfect
  - amarant 22 hours ago
    What would a better system look like?
    - dns_snek 18 hours ago
      Agents should make better use of OS sandboxing facilities with finer-grained ACLs.
      Less: Do you want to run "npm run build"?
      More: "npm run build" tried to read your Chrome cookie database, do you want to allow that?
      Some agents like Codex use sandboxing on Linux/MacOS but the permissions are far too coarse - they'll run the command in a relatively strict sandbox and when it fails they'll ask you to allowlist the command as a whole, forever. There should be a new permission prompt every time a command tries to do something new.
      Claude suggests (or used to suggest - it's been a while) to allowlist "bash" which completely defeats the point. If you do that the agent can run `bash -c "echo literally anything"`
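      To illustrate why allowlisting an interpreter defeats the point, here is a toy checker that approves a command line by its first word. This is not how any real agent implements its allowlist; `ALLOWED_PROGRAMS` and `naive_is_allowed` are hypothetical and exist only to show the failure mode.

```shell
# Toy allowlist: approve a command line if its first word is on the list.
ALLOWED_PROGRAMS="ls cat git bash"

naive_is_allowed() {
    # Strip everything from the first space onward to get the program name.
    _first="${1%% *}"
    for _prog in $ALLOWED_PROGRAMS; do
        if [ "$_first" = "$_prog" ]; then
            return 0
        fi
    done
    return 1
}
```

      With `bash` on the list, `naive_is_allowed 'rm -rf ~'` is rejected while `naive_is_allowed 'bash -c "rm -rf ~"'` passes, because the check never looks past the interpreter's name.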
    - xigoi 2 hours ago
      Don’t give a fancy random text generator access to your computer.
    - SOLAR_FIELDS 15 hours ago
      Don’t rely on your non deterministic agent and its creators to secure your software. Design defense in depth and trust guardrails that don’t expect Anthropic to vibe good security into existence.
      If you start by treating any autonomous actor in your system as an actor with the potential to go rogue the design starts to create itself
    - nonethewiser 22 hours ago
      Not using agents at all. It could edit your code to do something malicious when you run it. Not even once. Not even if the agent has a gun to your head.
conrs 13 hours ago
Yeah, echoing the comments here. It's a good idea - kind of - but it is all about digging deeper when it is sus.
The tool assumes so much. That it is fine to kill a process itself versus just asking you to kill the process. That everyone MUST have passwords in their home directory. It's all meaningless without providing the thing it is running and so no activity is technically safe.
Why do people even get the agent to run the commands it asks to run? You can solve the entire threat vector by running it yourself and giving the agent the output. Claude practically only needs things like sed, awk, and grep. It's a pattern matcher. It's a waste of your (and its) time to have it run your project.
Wirbelwind 1 day ago
Thanks all for checking it out and your suggestions!
If anyone is curious about the actual underlying risks and problems with some mitigations (like the 17% false-negative rates of Auto Mode), I wrote up a quick summary of some of the approaches here
https://scalex.dev/blog/ai-agent-permissions/
- kstenerud 20 hours ago
  You might want to check out https://github.com/kstenerud/yoloai
Liftyee 1 day ago
I haven't used local agentic AI yet for programming projects. Hence, -187 score
The filters for "commands I would run myself" and "commands I would let an agent run" are very different, it seems.
- rogerrogerr 22 hours ago
  Thinking about agents as remote junior devs who _might_ be North Korean operatives has been the right model for me.
  - jstanley 7 hours ago
    How do you know?
christophilus 6 hours ago
Claude Code has gotten so bad about this that I’ve stopped using it for code reviews. I may look into wiring Claude up to Codex as an alternative LLM just to compensate.
I think the issue is that I’m running Claude Code in a container so it sees that it is root, and becomes a lot more cautious. Not sure, though.
- kangalioo 4 hours ago
  If you're running Claude Code in a container anyways, why does `--dangerously-skip-permissions` not work for you?
  - christophilus 2 hours ago
    Claude Code won't let you do that as root. Codex's equivalent is perfectly fine, though.
paddycorr 3 hours ago
Love how it always wants to send my packages to a random domain. Has that happened to anyone in practice?
t-writescode 23 hours ago
I was told I was over protective when the text said “I need to wipe and build my project” and its first thing to do was to read the details of the (already established) package file. Why did it need to read the package file to “get context” if it was just doing a standard wipe and build?
Apparently me telling it that’s the wrong first step and saying “no” is bad; but I’ve seen AI tools waste a ton of time doing a bunch of random work before they do their job.
kleiba2 6 hours ago
Is there a light mode by any chance? Unfortunately, I cannot look at light text on black background for more than a few seconds (something must be wrong with my eyes...).
ghrl 1 day ago
I am mostly using OpenCode and barely ever see a permission prompt. While they do enforce it for outside workspace read/write, with the bash tool the agent can just bypass that. I'm not quite sure why it is that way, and it certainly isn't a very good solution, but likely not worse than asking for everything which just trains the user to always accept and provides a false sense of security then.
madrox 18 hours ago
I've long held that the current agent permission model is like playing a game of "Papers, Please", and that most permission models engineers implement in their own AI products are more a measure of how trusting the user is with AI than an actual permission check.
I'm of the view that future controls should be more about approving plans and rewinding durable workflows as models get better at avoiding egregious mistakes.
- cyanydeez 18 hours ago
  The models will never avoid egregious behavior. Think of it like every "good intentions" morality tale: there's almost always some genuine context where that behavior is wanted.
  Instead, the coding harness, or some deterministic tool, will need hardcoded security features.
  In opencode, almost all the power comes from bash, and all other permissions are just charades. It's powerful, and insecure because of it.
  You can sandbox them, but then you fight the sandbox to pipe in your assets. The sandbox becomes porous because otherwise it's useless.
  MCPs don't address much either.
  What we are looking for is a portal or protocol that has the model, the harness, and the actions tunneled, like ssh, to some fixed-scope, limited shell alongside the assets.
  Then the user and the LLM can negotiate assets and actions as needed via the protocol.
  But alas, as your comment suggests, people think there's some perfect context that'll prevent bad things from happening. The libertarian paradise without regulation.
  - madrox 16 hours ago
    I think you're choosing to ignore what I said about the implication of durable workflows, because you seem to be inventing some stories about my comment.
    I find that well documented plans do pretty well at aligning AI to what I want it to do, and if it does go astray, as you rightly point out it can still do, it would be sufficient if I can undo it with little pain. We do this kind of thing all the time in CI/CD pipelines.
    Even humans can take down production. We have all kinds of guards in place to empower while also defending against the intern accidentally dropping the DB.
cat-whisperer 3 hours ago
these days I rely on auto mode. :) it's like trust-as-a-service
MeetingsBrowser 1 day ago
It would be cool to see the distribution of all player scores.
- Wirbelwind 1 day ago
  That's a great idea, stay tuned
- Wirbelwind 21 hours ago
  and added! Made one for each stat separately
ashm1104 11 hours ago
Damn this is so cool, this has the potential of being a like textbook pre training/post training quiz. Congratulations.
hanwenn 10 hours ago
I got tired of the permission prompts and wrote a filesystem/network sandbox so I could skip all permission checks. It works on the same principle as bubblewrap, but has some niceties to separate Claude from its credentials. See https://github.com/hanwen/runclaude
whimblepop 23 hours ago
I got "overblocked" for this one:
```
  rm -rf node_modules && npm install
```
but actually if you're only removing `node_modules` and you have a working package-lock.json already, what you want is `npm ci`; `npm install` can mutate package-lock.json and potentially expose you to supply chain attacks. If you use `npm ci` I think you don't need to `rm -rf node_modules`, either.
Anyway you should generally run `npm ci` except when you're deliberately updating your actual dependencies. I'd only permit an `npm install` if I was adding or updating a dependency, or I'd just reviewed an `npm ci` failure.
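A small sketch of that rule of thumb: prefer `npm ci` whenever a lockfile is present, and fall back to `npm install` only when there is none. The function name is hypothetical, and it only prints the command rather than running it, so a human (or a permission prompt) still decides.

```shell
# Print the install command to prefer for a project directory.
# `npm ci` installs exactly what package-lock.json pins, removes any existing
# node_modules first, and fails if the lockfile and package.json disagree.
install_command_for() {
    if [ -f "$1/package-lock.json" ]; then
        echo "npm ci"
    else
        echo "npm install"
    fi
}
```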
- gamer191 23 hours ago
  But also why would Claude need to run `rm -rf node_modules && npm install`? Without the context of seeing what changes it’s made, I’d be inclined to assume that Claude has added a new dependency, which I definitely don’t wanna blindly trust it to install
- Wirbelwind 20 hours ago
  thanks for the pointer! renamed it to npm ci so it's still 'safe'
  - whimblepop 1 hour ago
    Thanks! Love the game as a whole :)
nardib 1 day ago
Use this and save yourself:
claude --dangerously-skip-permissions
- tasuki 1 day ago
  Just make sure to run it in an isolated environment where it's ok to mess things up, and make sure it doesn't have access to any secrets.
- wildpeaks 1 day ago
  This is why having a human in the loop isn't enough because they will cut corners and skip reviewing what they should review.
  - preciousoo 1 day ago
    I created a watcher for this problem, to watch my PRs for unfinished scope and have a fresh Claude review
    Uses tmux and gh https://github.com/Kyu/claude-pr-watch
  - chuckadams 1 day ago
    A tool that pushes people into permissions fatigue is in fact the proper recipient of the blame. The tool in question here is the entire system though, including the OS with insufficient permission boundaries in userspace, not just the agent
    - kennywinker 23 hours ago
      A tool that bypasses permission requests because they’re annoying will be just as guilty when the repo is poisoned.
      - chuckadams 22 hours ago
        I'm not saying wedging doorstops under the fire doors is a good thing, I'm just saying look at the situation that's making people put the doorstops there. Or something, it's not a great analogy. I'm just saying that shaming the user belongs with obscurity in the list of security mechanisms that don't work out in practice.
- kennywinker 23 hours ago
  It’s baking malicious code into your project, but hey it didn’t run rm -rf so… we’re good.
- maxbond 22 hours ago
  Why would you do this now that we have auto mode?
- qsxfthnkp2322 1 day ago
  I love it when Claude is dangerous
- paulddraper 1 day ago
```
  alias yolo="claude --dangerously-skip-permissions"
```
- dheera 1 day ago
  I got tired of typing that and just do
```
    alias claude="claude --dangerously-skip-permissions"
```
  I do have a separate "claude" user on my system without sudo access and without access to my main user home dir
  And yeah I know that's not perfect but I'm trying to get shit done
  - franze 1 day ago
    alias claude+="claude --dangerously-skip-permissions"
    alias claude++="claude --dangerously-skip-permissions --continue"
kqr 1 day ago
Fun! Played twice and refused all dangerous commands, with only one "over-block". Although I disagree that saying no to `kill $(lsof -t -i:3000)` is over-blocking. It's such a simple command I'd rather run it myself and be fully aware of what process I'm killing.
soanvig 1 day ago
Fun game. Can somebody run an agent against those questions to see how it performs? :)
sandeepkd 23 hours ago
Interestingly I kept saying no to everything and somehow I am a security conscious rare engineer who actually read the commands. Guess doing nothing is the safest approach from a security standpoint.
kuboble 10 hours ago
I was so tired of all those approvals that I switched to Yolo mode exclusively.
Claude works in his own separate VM with root access, git remote set to my local copies of the repository, no GitHub access, etc.
I think he could still hurt me if he really wanted, but most scary stories I heard were about LLM making really bad judgements rather than actively trying to break out and do harm.
sukhavati 23 hours ago
Reminds me of the "Papers, please" game. Glory to Arstotzka!
kstenerud 20 hours ago
This is one of two reasons why I wrote yoloAI. I never get these permission prompts anymore. It feels a lot like after installing an adblocker.
ericlevine 17 hours ago
This really hits the nail on the head. The current permissions models are totally broken IMO. You're either approving everything, restricting access and neutering your agent, or full YOLOing and, well, good luck. The right primitives are not in place yet, and there's no clearly correct answers.
I think the right primitive is "task-based authorization", where you review a high-level task and let an LLM judge decide whether the subsequent tool calls fall into the scope of that task. It's not perfect, but it distills dozens of approvals down to one and gives you risk-based signals of whether you should pay close attention or not.
misbau 1 day ago
That was fun and gave me an idea how security conscious I am.
NewJazz 1 day ago
git reset --soft HEAD~1
Uh, how is this an overblock? It is literally a destructive command. No way I want an LLM agent rewriting my commit history. What if that commit was already pushed to a protected branch?
- stratos123 22 hours ago
  Why do you call it destructive? It rewrites history only locally and reversibly (the disappeared commit is still in reflog and can be recovered with another reset) and also doesn't destroy uncommitted changes, so it's quite safe. You can only lose data with it by resetting an unpushed commit and then waiting long enough to let the unreferenced commit be garbage collected.
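  The recovery path can be tried in a throwaway repository. This is only a sketch: the function name and the demo identity are made up, and it assumes `git` is installed with reflogs enabled (the default for non-bare repositories).

```shell
# Create a scratch repo, undo the last commit with a soft reset, then bring
# it back from the reflog. Runs in a subshell so the caller's cwd is untouched.
demo_soft_reset_recovery() (
    # Throwaway identity so commits work without any global git config.
    export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
    export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
    repo="$(mktemp -d)" && cd "$repo" || exit 1
    git init -q
    git commit -q --allow-empty -m "first"
    git commit -q --allow-empty -m "second"
    before="$(git rev-parse HEAD)"
    # The "dangerous" step: HEAD moves back one commit, nothing is deleted.
    git reset -q --soft HEAD~1
    # HEAD@{1} is where HEAD pointed before the previous reset.
    git reset -q --soft 'HEAD@{1}'
    if [ "$(git rev-parse HEAD)" = "$before" ]; then
        echo "recovered"
    else
        echo "lost"
    fi
)
```

  This only covers the local, recent case described above; it says nothing about commits that were already pushed, or about reflog entries that have since expired.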
  [-]
  - NewJazz 22 hours ago
    Commit history is data. I might not realize what happened until the gc happens.
cadwell 1 day ago
1,640 points on my first try—I fell into a few traps, but it was really interesting. Thanks for the little game! I'm sharing it with my coworkers :)
eqvinox 19 hours ago
A bit too JavaScript specific... can't really play if you don't know that ecosystem.
[-]
- mrweasel 4 hours ago
  It suggests that "kill $(lsof -t -i:3000)" is completely safe, which it isn't if you don't know what runs on that port. Maybe some JavaScript framework runs on that port by default; I don't know, but neither does the AI. The developer may have moved it because something important already runs on that port.
martin-adams 22 hours ago
Very fun. I can only imagine that building this with Claude and testing it needed a bit of mental concentration.
graphememes 22 hours ago
Pressed 1 for everything, no regrets
sevenseacat 1 day ago
Continue? Y/N ── SCORE: 2,343 Security-Conscious Engineer
Caught 8/8 threats "Not a single secret leaked"
→ llmgame.scalex.dev
[-]
- neogodless 22 hours ago
  Continue? Y/N ── SCORE: 1,549 Security-Conscious Engineer
  Caught 3/3 threats "Not a single secret leaked"
  So are there 3 threats? 8? Is it a different game?
  Does everyone get a "good" score even if they missed 5 threats?!
  [-]
  - t-writescode 21 hours ago
    It's a game you play over one minute. They probably saw more prompts than you.
bspammer 1 day ago
To be realistic, 99% of the time it should be a totally innocuous command. If half of the commands are dangerous then you don't get fatigue because you're aware what you're doing is dangerous.
stevenalowe 22 hours ago
Sadly unplayable - gray text on a black background is very hard to read on a phone
carterschonwald 1 day ago
some of the sandboxing ive been playing with gives me the best of both yolo and like logic programming tier perms on llm actions in env. still not ready for prime time though ;)
[-]
ilaksh 1 day ago
You can turn that off with an option in most agents.
My own agent harness/framework has never had any permission system. It's also never deleted anything it shouldn't or done anything crazy or unrelated to what I asked.
[-]
- flux3125 23 hours ago
  > It's also never deleted anything it shouldn't or done anything crazy or unrelated to what I asked
  Until it does. A simple curl request to a compromised website could inject a malicious prompt into it.
- fragmede 1 day ago
  How many car accidents have you been in, and do you wear your seatbelt when you're in a car?
hastily3114 11 hours ago
This is cool. Could be used for training. But it's a bit too easy when it's a game where you are expecting dangerous commands. The real fatigue comes from accepting hundreds of obviously safe commands during a work day. Then it's easy to start accepting everything without really reading it.
hcks 10 hours ago
PSA: not making safe environments where you can skip all permissions and instead wasting time monitoring agents == incompetence
rvz 1 day ago
This current thread is proof of AI psychosis.
[-]
- stuartjohnson12 23 hours ago
  What the hell is going on in this thread? This isn't good. The "threats" don't make sense. Oh no, all the sensitive information in my package.json...
  [-]
  - cobbal 22 hours ago
    Here's the threat model I (a luddite) use to evaluate these. The claude code harness can be mostly trusted, the model cannot be trusted because it is exposed to untrusted data from the internet, and there is no separation of data/code in an llm [0][1].
    I want to avoid running untrusted code on my local machine, because it could steal secrets, install malware, etc.
    Since the model is allowed to write without restriction (I think) to the project directory, anything in the project directory is also untrusted. Running standard commands from the system is fine, as long as you know what those commands are going to do. Running anything from the local directory should be avoided because the code is untrusted.
    This is just one security model, there are many others! If a person is running claude in a stronger sandbox, that changes the model considerably. What threat model do you use to evaluate whether an agent's actions are safe?
    [0]: https://www.schneier.com/essays/archives/2024/05/llms-data-c... [1]: https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/
  - kennywinker 23 hours ago
    If you think the worst that an agent can do is leak your package.json, your threat model is wayyy broken.
atemerev 1 day ago
--dangerously-skip-permissions is the only way to fly. Of course your environment needs to be properly containerized and autobackup set up, so even rm -rf from your harness would do nothing. Life is too short to spend on replying to permissions requests.
[-]
- prerok 23 hours ago
  I've seen these suggestions but I am really curious about the setup because I just don't get it.
  If you want to work on the code, then you need access to the repositories, so you need the GitHub token. Then, to test the app, you may need your own backend token. And VPN. Of course, only to DEV, and of course all tokens encrypted. So only DEV and your branch of the code are in danger. In my view, even that is pretty bad.
  So, how does such a setup work?
  [-]
  - stratos123 20 hours ago
    You could clone the repo yourself and not give the agent any tokens at all. When done, push it yourself. This also lets you sandbox the agent to only have access to the local repo and nothing else.
  - atemerev 17 hours ago
    Git makes actions reversible. Containers and VMs allow the agent to access only the things you explicitly put inside. Okay, yes, an agent can corrupt a dev database. You need to make sure it can be easily restored anytime. Simple.
- kennywinker 23 hours ago
  Lol. Countdown til you get pwned starts today. Let me know how that works out for you in six months.
  [-]
  - atemerev 17 hours ago
      Well, I've been working like that for about a year already, starting in the earliest days of agents.
    [-]
    - kennywinker 17 hours ago
      Wow a whole year! I guess it’ll never happen.
inetknght 19 hours ago
Scope Violation: `cat ~/.zshrc`
Scope Violation: `ls ~/Documents`
Buddy, my `${HOME}` is committed to a repository. It includes `.bashrc` and `Documents` directory. These are not scope violations if I'm having the LLM work on them!
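To make the point concrete: whether a path is a "scope violation" depends entirely on what the project root is. A rough sketch, assuming a hypothetical `in_scope` helper that takes the root and home directory as plain strings; it normalizes paths textually and does not follow symlinks.

```python
import posixpath


def in_scope(target: str, project_root: str, home: str) -> bool:
    """Return True if target resolves to a path inside project_root.

    Purely textual: '~' is expanded against `home`, relative paths are
    joined to the project root, and '..' segments are collapsed.
    Symlinks are NOT resolved, so this is a sketch, not a sandbox.
    """
    if target == "~" or target.startswith("~/"):
        target = home + target[1:]
    root = posixpath.normpath(project_root)
    full = posixpath.normpath(posixpath.join(root, target))
    return full == root or full.startswith(root.rstrip("/") + "/")
```

With `project_root="/home/me"`, both `~/.zshrc` and `~/Documents` are in scope. With `project_root="/home/me/work/app"`, neither is, and `../../.ssh/id_rsa` is caught as well.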
Trung0246 1 day ago
Nice got 6/6
rib3ye 19 hours ago
claude --dangerously-skip-permissions
just give in
scotty79 20 hours ago
Permissions don't do much. They won't save you. You can just skip them completely.
If you are afraid that AI can delete something, do what you'd do with a potentially malicious user: sandbox, don't give permissions, set up remote backups, and so on.
Also (unless prompt injected) models are not eager to start going rogue on your stuff.
But keep in mind the saying “Children don’t hear prohibitions — they hear suggestions.”
Same thing goes for LLMs. Never talk with an LLM about deleting stuff. Archiving, moving, retaining elsewhere... sure, but never about actually destructive operations. Don't use destructive language.
wilg 21 hours ago
"Auto" in Claude and "Auto-review" in Codex are the only way to do agentic coding.
jMyles 21 hours ago
I haven't run claude code without --dangerously-skip-permissions in quite some time. I'm surprised that it's still the norm to endure permission spamming?
(I run it on a VPS of course, not my laptop)
yieldcrv 19 hours ago
that was soooo last month, “auto-mode” is the way now
another agent reviews every command and blocks destructive ones
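For a feel of the shape of such a reviewer, here is a toy sketch. It is not how any real agent's auto mode works: the reviewing agent is replaced by a short, deliberately incomplete list of regex patterns, and `DESTRUCTIVE_PATTERNS` and `review` are made-up names.

```python
import re
from typing import Optional

# Illustrative patterns only; a real reviewer would need far more coverage
# (for example, long options such as `rm --recursive` slip through here).
DESTRUCTIVE_PATTERNS = [
    (re.compile(r"\brm\s+-\w*r"), "recursive delete"),
    (re.compile(r"\bgit\s+reset\s+--hard\b"), "discards uncommitted changes"),
    (re.compile(r"\bgit\s+clean\s+-\w*f"), "deletes untracked files"),
    (re.compile(r"\bgit\s+push\b.*(--force|\s-f\b)"), "rewrites remote history"),
    (re.compile(r"\bdrop\s+(table|database)\b", re.IGNORECASE), "drops database objects"),
]


def review(command: str) -> Optional[str]:
    """Return the reason a command is blocked, or None if it may run."""
    for pattern, reason in DESTRUCTIVE_PATTERNS:
        if pattern.search(command):
            return reason
    return None
```

Under these patterns `git reset --soft HEAD~1` passes while `git reset --hard` and `git push --force` are blocked, which is one possible place to draw the line discussed elsewhere in the thread.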
ramonga 1 day ago
Score is 6711 by just saying no to everything