You don't need a startup. Millions of people have an effective tax rate of 0% and a net tax rate that's negative. They manage this simply by having no meaningful skills or knowledge.
- Additional modalities
- Faster FPS (inferences per second)
- Reaction time tuning (latency vs quality tradeoff) for visual and audio inputs/outputs (see the sketch below)
- Built-in planning modules in the architecture (think premotor frontal lobe)
- Time awareness during inference (towards an always-inferring / always-learning architecture)
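To make the latency/quality point concrete, here's a minimal sketch of what such a knob might look like. Everything in it is hypothetical (the class, fields, and heuristic are illustrative, not any real framework's API):

```python
from dataclasses import dataclass

@dataclass
class ReactionTuning:
    """Hypothetical knob trading answer quality for response latency."""
    max_latency_ms: int = 250   # hard budget for a reply (audio/visual loop)

    def settings_for(self, modality: str) -> dict:
        # Illustrative only: a tighter latency budget forces cheaper decoding.
        if self.max_latency_ms < 100:
            return {"modality": modality, "beams": 1, "draft_model": True}
        return {"modality": modality, "beams": 4, "draft_model": False}

# An audio loop might demand sub-100 ms reactions, while a visual
# captioning pass can afford deeper search.
fast = ReactionTuning(max_latency_ms=80)
print(fast.settings_for("audio"))  # {'modality': 'audio', 'beams': 1, 'draft_model': True}
```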
Steve here, one of the co-authors. Totally valid on OpenBio. I will say that comparison numbers for this paper were such a challenge, in part because we found that a lot of the LLMs on the Medical LLM leaderboard struggled to follow even slight changes in instructions. On one hand it felt inaccurate to just print '[something very low]% Accuracy' on structuring/abstraction tasks and call it a day, but it also seemed like the amount of engineering effort needed to get non-trivial results from those LLMs was saying something important about how they worked.
I think that's especially true when you look at how well GPT-4o worked out of the box -- it makes clear what you get from the battle-hardening that's done to the big commercial models. For the numbers we did include, the thought was that the most meaningful signal was that going from 8B to 70B with Llama3 actually gives you a lot in terms of mitigating that brittleness. That goes a step towards explaining the story of what we're seeing, more so than showing a bunch of comparison LLMs fall over out of the box.
In the end, we presented the models that did best with light tuning and optimization (say, a week's worth of iteration or so). I anticipate that we'll have to expand these results to include OpenBio as we work through the conference reviewer gauntlet. Any others you think we should definitely work to include? Suggestions would be really helpful!
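(If it helps to picture the brittleness: the shape of the check is roughly the sketch below, where `call_model` and the instruction variants are stand-ins for a real harness and real tasks, not our actual evaluation code.)

```python
# Rough sketch of an instruction-robustness probe: run the same structuring
# task under lightly paraphrased instructions and count how often the output
# still parses. call_model() is a placeholder for whatever inference API
# you're using.
import json

INSTRUCTION_VARIANTS = [
    "Return the findings as a JSON list of strings.",
    "Output only a JSON array of the findings.",
    "Respond with the findings formatted as a JSON list.",
]

def follows_format(raw: str) -> bool:
    # A brittle model may answer correctly for one phrasing and emit
    # unparseable prose for a near-identical one.
    try:
        parsed = json.loads(raw)
        return isinstance(parsed, list) and all(isinstance(x, str) for x in parsed)
    except json.JSONDecodeError:
        return False

def robustness(call_model, document: str) -> float:
    ok = sum(
        follows_format(call_model(f"{inst}\n\n{document}"))
        for inst in INSTRUCTION_VARIANTS
    )
    return ok / len(INSTRUCTION_VARIANTS)
```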
We're excited to share pitchpilot with the HN community. Our beta users have found the embedded audio particularly useful for enterprise sharing. We're keen to keep improving, and our mission is to make communication easier.
On the roadmap: video export, digital twin presentations, and real-time presentations. We don't wrap a public LLM, so we don't share any of your data.
Given that Generative AI can now read brain scans [1], and now this, I wonder how far away we are from "you thought negatively about something, the authorities are on their way".
Well, we're not infinitely far away from it, which is why we need to build political and legal systems that can respect human dignity even in the presence of such technologies.
1. First, models will predict pollution. The outcomes will help shape urban policy, but these won't solve crime or stop people from driving.
2. Second, models will predict individual behavior and track person-level emissions. The outcomes will force behavior changes, mostly freedom-limiting ones.
3. Third, and finally, models will predict thoughts. The thought of driving instead of walking might trigger a response.
It's a slippery slope and we need to draw a line between prediction and policy.
Even allowing for the ridiculously massive technical leap from 1 to 2 and then 2 to 3, it doesn't make much sense.
For one thing, if states are determined to enforce individual emissions limits, they can do it today with legislation. You don't need a predictive model. What does the model add?
Also, the only difference between 2 and 3 is whether a person acts on a thought.
So are you suggesting with #3 that predicted thoughts (i.e. not literal mind reading) which a person doesn't act upon will prompt state action?
Using the unqualified word "freedom" creates an ambiguity that political actors exploit. Freedom to do something is entirely separate from "free to live in a world where ___".
To be honest, I feel the latter sense of the word is a bit of a stretch - semantically, not politically.
But you see it because "freedom" is a powerful word in politics, and rather than argue against "freedom", pundits go up the ladder of abstraction and argue the definition instead.
Good question! eHealth Exchange (eHEX) is one of 3 national HIEs that we connect to (currently through Carequality). eHEX is mainly focused on connecting to state-level regional HIEs, which cover a different portion of providers than CommonWell or Carequality do.
For example, Cerner is a major EHR vendor (used by the VA and others) whose data can only be accessed through CommonWell, since they don't participate in other HIEs.
> that have many years of experience
Modern HIEs are a relatively new concept (Carequality was founded in 2014), so extra years of experience doesn't necessarily add any value, and usually just results in more legacy tech to deal with!
TL;DR: just to get started it's going to cost you $20k plus some months to integrate, $12.5k/yr as the base membership fee (up to $400k if you make a lot of money!), and they charge a per-query price.
The caveat here is that a query in eHEX isn't what a query is in Metriport. They literally mean every single query (remember the HTTP requests to thousands of endpoints to find patient records; each one of those would be a query). So, if you want to integrate with eHEX only to get limited, messy C-CDA data, then you're looking at paying ~$0.80 per full record retrieval for a patient with 2k documents.
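For a rough sense of how that compounds, here's a back-of-the-envelope sketch. The per-query rate is purely an assumption, picked so the math reproduces the ~$0.80 figure above; actual eHEX pricing tiers will differ, as will the hypothetical patient volume:

```python
# Back-of-the-envelope eHEX cost sketch. The per-query rate below is an
# assumption for illustration only; it is not published pricing.
ASSUMED_PER_QUERY_USD = 0.0004   # hypothetical per-query rate
DOCS_PER_PATIENT = 2_000         # every document fetch counts as a query

per_record = ASSUMED_PER_QUERY_USD * DOCS_PER_PATIENT
print(f"~${per_record:.2f} per full record retrieval")  # ~$0.80

# Year-one fixed costs on top of the per-query charges:
integration = 20_000   # one-time integration effort
membership = 12_500    # base annual fee (scales up to $400k by revenue)
patients = 10_000      # hypothetical query volume
total = integration + membership + patients * per_record
print(f"Year one at {patients:,} patients: ~${total:,.0f}")  # ~$40,500
```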
I think we've largely arrived in terms of capabilities, and companies are now just competing to work out the kinks and fully integrate their products. There will be some new innovations, but nothing like a moon landing that caps off "you've won". The winner(s) will just be whoever can keep funding long enough to find a profitable use for these models.
It seems to me like the moon is "chatbots which are somewhat convincing" and everybody is landing there in OpenAI's wake. The real problem is Mars - making a computer which can learn as quickly and reason as deeply as, say, a stingray or another somewhat intelligent fish[1].
[1] This task seems far beyond the capability of any transformer ANN absent extensive task-specific training, and it cannot be reasonably explained by stingray instinct:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8971382/
This is true in more ways than one. My question is – what happens once we do land on the moon? Will we become a spacefaring civilization in the decades to come, or will the whole thing just...fizzle out?
I don't think a pure language model of the sort under consideration here is heading towards AGI. I use language models extensively, and the more I use them the more I tend to see them as information retrieval systems whose surprising utility derives from a combination of a lot of data and the ability to produce language. Sometimes patterns in language are sufficient to do some rudimentary reasoning, but even GPT-4, if pushed beyond simple patternish reasoning and its training data, reveals very quickly that it doesn't really understand anything.
I admit, it's hard to use these tools every day and continue to be skeptical about AGI being around the corner. But I feel fairly confident that pure language models like this will not get there.