I built something like this but much worse. No extension, no recording: I literally sit there with Chrome DevTools open, do the action manually, copy the 3-4 network requests into a Python script, and replay them with urllib and a cookie jar.
It's absurd but it works. Gumroad's cover image upload for example, their actual API can't do it, but the browser makes 3 requests (presign to their Rails Active Storage endpoint, PUT the binary to S3, POST the signed_blob_id to attach it). Captured those once in April, been replaying them since. I uploaded covers and thumbnails to 9 products today without opening a browser.
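The three-request replay looks roughly like this. A hedged sketch, assuming the Rails Active Storage direct-upload flow: the endpoint paths, the attach URL, and the payload field names here are guesses standing in for whatever DevTools actually shows, not Gumroad's documented API.

```python
import json
import urllib.request
from http.cookiejar import MozillaCookieJar


def presign_payload(filename: str, byte_size: int,
                    content_type: str = "image/png") -> str:
    """Body for the presign call. Field names follow Active Storage
    conventions but are an assumption, not Gumroad's contract."""
    return json.dumps({"blob": {"filename": filename,
                                "byte_size": byte_size,
                                "content_type": content_type}})


def make_opener(cookie_file: str) -> urllib.request.OpenerDirector:
    """Reuse the browser session via cookies exported to Netscape format."""
    jar = MozillaCookieJar(cookie_file)
    jar.load(ignore_discard=True, ignore_expires=True)
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def upload_cover(opener, product_id: str, path: str) -> None:
    data = open(path, "rb").read()

    # 1. Presign against the Rails Active Storage direct-upload endpoint.
    req = urllib.request.Request(
        "https://gumroad.com/rails/active_storage/direct_uploads",  # assumed path
        data=presign_payload(path, len(data)).encode(),
        headers={"Content-Type": "application/json"})
    blob = json.load(opener.open(req))

    # 2. PUT the raw bytes to S3 using the presigned URL and headers.
    opener.open(urllib.request.Request(
        blob["direct_upload"]["url"], data=data,
        headers=blob["direct_upload"]["headers"], method="PUT"))

    # 3. POST the signed blob id back to attach the upload to the product.
    opener.open(urllib.request.Request(
        f"https://gumroad.com/products/{product_id}/cover",  # assumed path
        data=json.dumps({"signed_blob_id": blob["signed_id"]}).encode(),
        headers={"Content-Type": "application/json"}))
```

The cookie jar is the whole trick: as long as the exported session cookies stay valid, the server can't tell these three requests apart from the browser's.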
Obviously falls apart the second they change anything.
Yes exactly imagine now anyone, even non-technical people, can just prompt and interact with this hidden/deeper layer of the web, all in their regular browser!
Hey, that's a great idea; we'll look into this export option. But how would being a Playwright script save time?
Right now, since we re-execute the code in a custom sandbox, we use our own syntax and exposed methods. So even now you can edit the generated script.
Maybe there's a middle ground where a small local model can roll with the site variations that would break a script, while saving on per-token costs?
The reason we open our client-side code is to build trust in putting rtrvr's DOM intelligence in your web apps - https://github.com/rtrvr-ai/rover/tree/main . Our monetization is straightforward: subscriptions - https://www.rtrvr.ai/pricing . Extensions shipping sketchy updates or selling user data tends to happen when people build them as side gigs, not when they pour more than a year into building a highly accurate automation engine. We also have cloud sandboxes if you prefer executing with the same intelligence in the cloud rather than on your own device.
auditing the code is fairly straightforward if it isn't obfuscated - so long as it doesn't execute dynamic code, that is. but the big issue is you can't control when the extension itself gets updated (to my knowledge). and it isn't uncommon for browsing data, or the extension itself, to be sold to someone shadier than the original author down the road.
oh this is clever. running in the main world dodges a lot of the usual scraping pain. how do you handle sites with strict CSP that block inline scripts - is the extension somehow exempt?
The bigger goal is to build and maintain a global library of popular automations. Users can also quickly re-record to recreate and update the scripts.
Having to redo recordings once they break sounds like too much hassle.
We are thinking through self-healing mechanisms, like falling back to a live web agent and rewriting the script.
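The self-healing idea above can be sketched as a simple loop: replay the cheap deterministic script, and only when replay breaks, hand the task to the slower live agent and keep the rewritten script it emits. Everything here is hypothetical - `run_script` and `live_agent_rewrite` are stand-in names, not rtrvr's actual API.

```python
def run_with_self_heal(script: str, run_script, live_agent_rewrite,
                       max_heals: int = 1) -> str:
    """Replay a recorded script; on breakage, let a live agent rewrite it.

    run_script(script)              - replays the script, raises on failure
    live_agent_rewrite(script, err) - completes the task against the live
                                      DOM and returns an updated script
    Returns the (possibly rewritten) script that last succeeded.
    """
    for attempt in range(max_heals + 1):
        try:
            run_script(script)  # fast, deterministic replay path
            return script
        except Exception as err:  # site changed and the replay broke
            if attempt == max_heals:
                raise  # out of heal attempts; surface the failure
            # Fall back to the live web agent, which is slower but can
            # adapt to the current page and emit a fresh script.
            script = live_agent_rewrite(script, err)
    return script
```

The design choice is that healed scripts are persisted, so the expensive agent run pays for itself on every subsequent cheap replay.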
PS: Also, our data policy if you are interested: https://www.rtrvr.ai/blog/rtrvr-ai-privacy-security-how-we-h...
Does this work on sites that have protection against LLMs such as captchas, LLM tarpits and PoW challenges?
I just see this as a never ending cat and mouse game.
Since it runs inside your own browser, there should be no captchas or challenges. On failure, it can fall back to our regular web agent, which can solve captchas.
Big picture, with the launch of Mythos it might just become impossible for websites to keep up, and they'll have to go the way of Salesforce and just expose APIs for everything.