For small codebases, you can run this tool on the entire directory and it will generate a well-formatted Markdown prompt detailing the source tree structure and all the code. You can then upload this document to GPT or Claude models with larger context windows and ask them to:
- Rewrite the code to another language.
- Find bugs/security vulnerabilities.
- Document the code.
- Implement new features.
You can customize the prompt template for any of these use cases. It essentially traverses a codebase and creates a prompt with all source files combined. In short, it automates copy-pasting multiple source files into your prompt and formatting them, and it tells you how many tokens your code consumes.
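To make the traversal idea concrete, here is a minimal Python sketch. It is not the tool's actual implementation: the function names `build_prompt` and `estimate_tokens`, the extension filter, and the 4-characters-per-token heuristic are all assumptions for illustration. A real token count needs the target model's tokenizer.

```python
from pathlib import Path

FENCE = "`" * 3  # built at runtime so this example nests cleanly in Markdown


def build_prompt(root, extensions=(".py", ".md")):
    """Combine a source tree listing and file contents into one Markdown prompt."""
    root = Path(root)
    # Collect matching files in a stable order (hypothetical extension filter).
    files = sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix in extensions
    )
    tree = "\n".join(f"- {p.relative_to(root).as_posix()}" for p in files)
    sections = []
    for p in files:
        rel = p.relative_to(root).as_posix()
        body = p.read_text(encoding="utf-8", errors="replace")
        sections.append(f"## {rel}\n\n{FENCE}\n{body}\n{FENCE}")
    return f"# Source tree\n\n{tree}\n\n# Files\n\n" + "\n\n".join(sections)


def estimate_tokens(text):
    """Very rough estimate: about 4 characters per token (an assumption)."""
    return len(text) // 4
```

Calling `build_prompt(".")` and passing the result to `estimate_tokens` gives a ballpark for whether the prompt fits a given context window.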
What do you mean by small codebase? I hope not the usual todo repos or basic apps. Can this be run on a production codebase, like a Java application with dozens of microservices inside it?
Yes, it just depends on the model you're using. Small to medium-sized codebases would fit inside Claude's 200K context window, and Gemini 1.5 has a 1M context window, which would fit essentially 99% of codebases.
For reference:
- The Flask web framework for Python: 131880 tokens
I have made a simple CLI utility[0] with this purpose in mind. It scans your entire filesystem for README.md and FUNDING.yml files, looks in them for a set of donation/sponsor links, and tags each link with the associated repo (no HTTP calls, just the assumption that most repos link their support URL in one of these files). The output is a CSV sheet listing the open-source dependencies/libraries on your system that accept donations.
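For anyone curious what that scan looks like, here is a rough Python sketch of the approach. It is not the utility's actual code: the list of sponsor hosts, the function names, and the rule that the containing directory is the repo name are all assumptions. It also only matches full URLs, so shorthand FUNDING.yml entries such as a bare username would be missed.

```python
import csv
import re
from pathlib import Path

# Hypothetical set of donation hosts; the real utility's list may differ.
SPONSOR_PATTERN = re.compile(
    r"https?://(?:www\.)?"
    r"(?:github\.com/sponsors|opencollective\.com|patreon\.com"
    r"|ko-fi\.com|buymeacoffee\.com|liberapay\.com)/[\w\-/]+",
    re.IGNORECASE,
)
TARGET_NAMES = {"README.md", "FUNDING.yml"}


def repo_name(path):
    """Guess the repo from the file location (assumption: directory name)."""
    parent = path.parent
    # FUNDING.yml usually lives in a .github/ folder inside the repo.
    if parent.name == ".github":
        parent = parent.parent
    return parent.name


def find_donation_links(root):
    """Return sorted (repo, url) pairs found in README.md / FUNDING.yml files."""
    found = set()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.name in TARGET_NAMES:
            text = path.read_text(encoding="utf-8", errors="replace")
            for url in SPONSOR_PATTERN.findall(text):
                found.add((repo_name(path), url))
    return sorted(found)


def write_csv(rows, out):
    """Write the (repo, url) pairs to a file-like object as CSV."""
    writer = csv.writer(out)
    writer.writerow(["repo", "donation_url"])
    writer.writerows(rows)
```

Everything stays local: the files are read from disk and matched with a regex, with no network access.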
I have plans to expand/plug this into a donation aggregator platform like you mentioned if time permits. But if there is an existing effort for the same, I am happy to contribute. :)
Hey, this is pretty great, and the code is so simple. I guess it only works if you have the sources checked out somewhere, which isn't the case for all build tooling and package managers, but I could see an extended version of this that hooks into the standard package managers to fetch the information needed to complete the report.
If you can also hook into an accounting system (e.g. plaintextaccounting.org), then you could also calculate the whole dollar amounts to donate as some percentage of income from the product.
It currently does gzip compression by default. Compression modes for specific files sound interesting; I will definitely get around to implementing that.
Backstory: What started as a personal project to quickly host some web pages turned into a rabbit hole of yak shaving, and that is how I ended up making Binserve. I automated the steps I usually take to host static pages into this project, which is tweakable via the configuration file. And it's also pretty fast.
TL;DR: Just a fun little project of mine that was born out of an idea to use Gist as a blogging platform. I hope you find it useful.
All I wanted to do was write a blog, but I got completely distracted choosing a static site generator or blogging solution. I have tried some static site generators, but I personally find entering the metadata section (frontmatter) for each Markdown file repetitious. I have used Blogspot and WordPress a lot, so I am used to writing blog posts in a web page (rather than a text editor) and just publishing them without thinking about files or metadata.
So I thought it would be useful to utilize Gist as it comes with a good Markdown editor, an integrated comment section, starring (bookmarking), and revisions (commits) for metadata. I didn't complete the blog but for those who actually want to write one, try out writing them via Gist.
NOTE: I hope someone who's good at CSS will contribute a better starter theme. I would really appreciate it.