I see there are lots of courses on evals being sold on Maven. Some cost as much as USD 3,500. Are they worth it? https://maven.com/parlance-labs/evals
Maybe sentences that are correct but not too polished are a better giveaway that it's not AI-generated, while still being good enough to get through ATS or AI screening?
I prefer Reddit communities over SO any day. On SO, folks are so high-handed that they'll bash you over anything that doesn't suit their framework. I'm sure that with GPTs they will slowly lose traffic.
Threads on Reddit don't get closed due to age (they used to be archived, but that stopped a while back). Mods can lock threads, but that's used to moderate content.
And which subreddit locked your thread because a similar question was asked six months ago? I find that difficult to believe.
In the end, in a few years it will be whoever has the better AI that wins, in all fields. A monopoly sort of thing. In the finance world, maybe they win most of the trades.
I use GitHub Copilot chat right now. First I use ask mode to ask it a question about the state of the codebase, outlining my current understanding of the condition of the code: "I'm trying to do X; I think the code currently does Y." I include the few source files I'm talking about. I correct any misconceptions the LLM may have about the plan and suggest stylistic changes to the code. Then, once the plan seems correct, I switch to agent mode and ask it to implement the change on the codebase.
Then I look through the changes and decide whether they're correct. Sometimes I can just run the code to check. Any compilation errors get pasted right back into the chat in agent mode.
Once the feature is done, I commit the changes and repeat for the next feature.
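For the "paste compilation errors back" step, here's a minimal sketch of how that loop can be scripted, assuming a generic build command; the npm invocation and file name are placeholders, not part of the workflow described above:

```python
# Rough sketch: run the build, and if it fails, dump the compiler output to a file
# that can be pasted straight back into agent mode.
import subprocess

BUILD_CMD = ["npm", "run", "build"]  # placeholder; use your project's build command

def collect_build_errors(path="build_errors.txt"):
    result = subprocess.run(BUILD_CMD, capture_output=True, text=True)
    if result.returncode != 0:
        with open(path, "w") as f:
            f.write(result.stdout)
            f.write(result.stderr)
        print(f"Build failed; output written to {path} - paste it into the chat.")
    else:
        print("Build succeeded.")
    return result.returncode

if __name__ == "__main__":
    collect_build_errors()
```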
I also do the same. I'm on the $200 Max plan. I often let the plan go to a pretty fine level of detail, e.g. describing exactly which test conditions to check and which code conditions to follow.
Do you write this to a separate plan file? I find myself doing this a lot since after compaction Claude starts to have code drift.
Do you also get it to add to its to-do list?
I also find that having the o3 model review the plan helps catch gaps. Do you do the same?
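For that plan-review step, here's a minimal sketch of asking a second model to critique a plan file, assuming the OpenAI Python SDK; the model name and the PLAN.md path are placeholders, not the setup anyone above described:

```python
# Rough sketch: feed a plan file to a reviewing model and ask it to flag gaps.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def review_plan(plan_path: str = "PLAN.md") -> str:
    plan = open(plan_path).read()
    response = client.chat.completions.create(
        model="o3",  # placeholder: whichever reviewing model you have access to
        messages=[{
            "role": "user",
            "content": "Review this implementation plan. List gaps, missing test "
                       "conditions, and risky assumptions:\n\n" + plan,
        }],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(review_plan())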
Yes, you can't switch between edit and ask/agent without losing context, but ask <-> agent is no problem. You can also switch to your custom chat modes https://code.visualstudio.com/docs/copilot/chat/chat-modes without losing context. At least that's what I just did in VS Code Insiders.
Also, I use tons of markdown documents for planning, results, research, and so on. This makes it easy to bring new agent sessions (or yourself) up to speed on the context.
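A minimal sketch of rolling those markdown docs into one primer a fresh agent session can read first; the docs/ folder and output file name are placeholders:

```python
# Rough sketch: concatenate planning/research notes into a single context file.
from pathlib import Path

def build_context_primer(docs_dir="docs", out_path="CONTEXT.md"):
    parts = []
    for md in sorted(Path(docs_dir).glob("*.md")):
        parts.append(f"## {md.name}\n\n{md.read_text()}")
    Path(out_path).write_text("\n\n".join(parts))
    print(f"Wrote {out_path} with {len(parts)} documents.")

if __name__ == "__main__":
    build_context_primer()
```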
Yes. I think it used to be separate tabs, but now chat/agent mode is just a toggle. After discussing a concept, you can just switch to agent mode and tell it to "implement the discussed plan."
GitHub Copilot models are intentionally restricted, which unfortunately makes them less capable.
I'm not the original poster, but regarding workflow, I've found it works better to let the LLM create one instead of imposing my own. My current approach is to have 10 instances generate 10 different plans, then I average them out.
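Reading "average them out" as asking a model to consolidate the candidate plans, here's a minimal sketch of that fan-out/merge idea, assuming the OpenAI Python SDK; the model name, prompts, and task are placeholders, not the poster's actual setup:

```python
# Rough sketch: draft N plans independently, then ask a model to merge them into one.
from openai import OpenAI

client = OpenAI()
N_PLANS = 10
TASK = "Add rate limiting to the public API"  # placeholder task description

def draft_plan(task: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder: any capable model works here
        messages=[{"role": "user", "content": f"Write an implementation plan for: {task}"}],
    )
    return resp.choices[0].message.content

def merge_plans(plans: list[str]) -> str:
    joined = "\n\n---\n\n".join(plans)
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user",
                   "content": "Merge these candidate plans into one consolidated plan, "
                              "keeping the steps most of them agree on:\n\n" + joined}],
    )
    return resp.choices[0].message.content

if __name__ == "__main__":
    plans = [draft_plan(TASK) for _ in range(N_PLANS)]
    print(merge_plans(plans))
```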