We built agents to test GitHub repo quickstarts associated with arXiv papers a couple of months before this paper was published, and wrote about it publicly here: https://remyxai.substack.com/p/self-healing-repos
To limit the attack surface, we added PR#1929 to AG2 so we could pass API keys to the DockerCommandLineCodeExecutor while using egress whitelisting to block an agent from reaching a compromised server: https://github.com/ag2ai/ag2/pull/1929
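For anyone who hasn't used it, the executor side looks roughly like the sketch below. The image, timeout, work_dir, and quickstart command are placeholders; the exact argument PR#1929 uses to hand API keys to the container, and the concrete firewall rules, are left as comments rather than guessed.

```python
from pathlib import Path

from autogen.coding import CodeBlock, DockerCommandLineCodeExecutor

# Run a paper's quickstart inside a throwaway container.
executor = DockerCommandLineCodeExecutor(
    image="python:3.11-slim",            # base image the quickstart runs in
    timeout=600,                          # kill runaway quickstarts
    work_dir=Path("./paper_workspace"),   # scripts and outputs land here
)
# PR#1929 adds the plumbing for passing API keys into this container (see the
# PR for the actual argument), and egress whitelisting is enforced outside the
# executor at the Docker network / host firewall level, so an injected script
# can only reach the endpoints we explicitly allow.

result = executor.execute_code_blocks([
    CodeBlock(language="bash",
              code="pip install -r requirements.txt && python quickstart.py"),
])
print(result.exit_code, result.output)
```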
Since then, we've been scaling this out with Kubernetes Ray workers so we can run it in the cloud and keep up with the hundreds of papers published daily.
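The scaling part is plain Ray fan-out. A minimal sketch, where the image naming (remyxai/<arxiv_id>), quickstart.sh, and the paper IDs are placeholders for what the real workers do:

```python
import subprocess

import ray

ray.init(address="auto")  # attach to the running cluster (e.g. KubeRay workers)

@ray.remote(num_cpus=1)
def test_quickstart(arxiv_id: str) -> dict:
    # Placeholder body: the real task drives the AG2 Docker executor; here we
    # just run the image we pushed for the paper and record whether it survives.
    proc = subprocess.run(
        ["docker", "run", "--rm", f"remyxai/{arxiv_id}", "bash", "quickstart.sh"],
        capture_output=True, text=True, timeout=900,
    )
    return {"arxiv_id": arxiv_id, "exit_code": proc.returncode}

todays_papers = ["2504.01234", "2504.04321"]  # placeholder IDs from the daily listing
print(ray.get([test_quickstart.remote(p) for p in todays_papers]))
```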
With the code running in Docker, the network interface constrained, deployment in the cloud, and humans kept in the loop through PR review, it's hard to see where a prompt-injection attack comes into play when testing the code.
Would love to get feedback from an expert on this; can you imagine an attack scenario, Simon?
I'll need to work out a check for the case where someone creates a paper with code instructing my agent to publish keys to a public HF repo for others to exfiltrate.
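The cheapest version of that check is probably a pre-execution filter over whatever the agent wants to run or commit. A rough sketch; the environment-variable naming convention and the suspicious-call patterns are illustrative, not our production rules:

```python
import os
import re

# Values of anything in our environment that looks like a credential.
SECRET_VALUES = [v for k, v in os.environ.items()
                 if k.endswith(("_API_KEY", "_TOKEN")) and v]

# Illustrative patterns for "publish to a public hub" behaviour we never request.
SUSPICIOUS_CALLS = re.compile(r"push_to_hub|huggingface_hub|create_repo|upload_file")

def looks_like_exfiltration(code: str) -> bool:
    """Flag agent-generated code that embeds a live secret or uploads to a hub."""
    if any(secret in code for secret in SECRET_VALUES):
        return True
    return bool(SUSPICIOUS_CALLS.search(code))

# Gate before handing code to the Docker executor or opening a PR.
snippet = "from huggingface_hub import upload_file"
print(looks_like_exfiltration(snippet))  # True -> block and flag for human review
```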
AI & ML engineering in particular is very research-adjacent.
That's why we began building agents to source ideas from the arXiv and implement the core methods from those papers in YOUR target repo, months before this publication.
No doubt, this toy demo will break your system if it runs the research repo's code unsecured.
We thought this through as we built a system that goes beyond running the quickstart to implement the core methods of arXiv papers as draft PRs against YOUR target repo.
Running the quickstart in a sandbox, on its own, is practically useless.
To limit the attack surface, we added PR#1929 to AG2 so we could pass API keys to the DockerCommandLineCodeExecutor and use egress whitelisting to limit an agent's ability to reach out to a compromised server: https://github.com/ag2ai/ag2/pull/1929
We'd been talking publicly about this for at least a month before this publication, and along the way we've built up nearly 1K Docker images for arXiv paper code: https://hub.docker.com/u/remyxai
The ability to accurately estimate distances from RGB image input is just at the frontier of current AI model capabilities.
Nonetheless, distance estimation is critical for perception and planning in embodied AI applications like robotics, which must navigate our 3D world.
We just released SpaceThinker, a 3B open-weight VLM designed specifically for spatial reasoning tasks like distance and size estimation from RGB images. It’s small and fast enough for on-device use, trained entirely on open-source data/code.
Interesting finding: if you switch the model name in this Colab to the non-reasoning variant SpaceQwen (https://huggingface.co/remyxai/SpaceQwen2.5-VL-3B-Instruct), the step-by-step reasoning prompt actually hurts performance, challenging the convention that it's non-reasoning models, rather than reasoning models, that benefit from complex step-by-step instructions.
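If you want to reproduce the comparison outside the Colab, here's a minimal sketch. It assumes SpaceQwen loads with the standard Qwen2.5-VL recipe from its model card; the image path and prompt wording are placeholders, not the Colab's exact prompts.

```python
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
from qwen_vl_utils import process_vision_info

model_id = "remyxai/SpaceQwen2.5-VL-3B-Instruct"
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)

def ask(prompt: str, image_path: str = "scene.jpg") -> str:
    """Run one image + text query and return the decoded answer."""
    messages = [{"role": "user", "content": [
        {"type": "image", "image": image_path},
        {"type": "text", "text": prompt},
    ]}]
    text = processor.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True)
    image_inputs, video_inputs = process_vision_info(messages)
    inputs = processor(text=[text], images=image_inputs, videos=video_inputs,
                       padding=True, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256)
    return processor.batch_decode(
        out[:, inputs.input_ids.shape[1]:], skip_special_tokens=True)[0]

# Direct question vs. the step-by-step phrasing; compare the distance estimates.
print(ask("How far apart are the chair and the table, in meters?"))
print(ask("Think step by step, then answer: how far apart are the chair "
          "and the table, in meters?"))
```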
Feedback, suggestions, and collaborators are welcome!
If you're interested in contributing, we open-sourced VQASynth, our implementation of the SpatialVLM approach for generating VQA-style datasets from images. VQASynth was used to create the SpaceThinker dataset, which powered the fine-tuning of the SpaceThinker model showcased here.
Exciting news, thanks for sharing! We've been applying this technique to create custom models on the fly with our no-code platform at https://remyx.ai
We're trying to build up our user base and get more feedback; try it out or check out our walkthrough: https://youtu.be/7SMySnRRTew?t=39
We've been pushing it further to implement draft PRs against your target repo, and wrote it up a month before this preprint: https://remyxai.substack.com/p/paperswithprs
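Mechanically, the human-in-the-loop handoff is just a draft PR: once the agent has pushed its implementation branch, the only write it performs against your repo is something like the call below (owner, repo, branch, and token names are placeholders):

```python
import os

import requests

resp = requests.post(
    "https://api.github.com/repos/your-org/your-repo/pulls",
    headers={
        "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github+json",
    },
    json={
        "title": "Draft: implement core method from arXiv:2504.01234",
        "head": "agent/arxiv-2504.01234",  # branch the agent pushed
        "base": "main",
        "body": "Automated draft implementation; please review before merging.",
        "draft": True,                      # stays a draft until a human promotes it
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["html_url"])
```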