Hacker News | govideo's comments

Cool product. I see the need. Many vendors might look like they have similar capabilities in a PowerPoint, but when you look one level deeper, they really don't. E.g., webhook notifications at 1 minute intervals with one big JSON of multiple emails, vs continuous/instant/single.

btw, would love to hear stories of your journey thus far. fwiw, I think you're really onto smtg!


Thanks! We've posted some vlogs on X, but I think we have some good content for engineering blogs too. Appreciate the support


I'd love to hear more of your thoughts re open questions in biomedical ML. You sound like you have a crisp, nuanced grasp of the landscape, which is rare. That would be very helpful to me, as an undergrad in CS (with bio) trying to crystallize research to pursue in bio/ML/GenAI.

Thank you.


Thanks, but no one truly understands biomedicine, let alone biomedical ML.

Feynman's quote -- "A scientist is never certain" -- is apt for biomedical ML.

Context: imagine the human body as the most devilish operating system ever: 10b+ lines of code (more than merely genomics), tight coupling everywhere, zero comments. Oh, and one faulty line may cause death.

Are you more interested in data, ML, or biology (e.g., predicting cancerous mutations or drug toxicology)?

Biomedical data underlies everything and may be the easiest starting point because it's so bad/limited.

We had to pay Stanford doctors to annotate QA questions because existing datasets were so unreliable. (MCQ dataset partially released, full release coming).

For ML, MedGemma from Google DeepMind is open and at the frontier.

Biology mostly requires publishing, but still there are ways to help.

After sharing preferences, I can offer a more targeted path.


ML first, then Bio and Data. Of course, interconnectedness runs high (e.g., I just read about ML for non-random missingness in med records), and data is the foundational bottleneck/need across the board.

Interesting anecdote abt Stanford doctors annotating QA questions!

Each of your comments gets my mind going... I'm going to think about them more and may ping you on other channels, per your profile. Thanks!


More like alarming anecdote. :) Google did a wonderful job relabeling MedQA, a core benchmark, but even they missed some (e.g., question 448 in the test set remains wrong according to Stanford doctors).

For ML, start with MedGemma. It's a great family. 4B is tiny and easy to experiment with. Pick an area and try finetuning.

Note the new image encoder, MedSigLIP, which leverages another cool Google model, SigLIP. It's unclear if MedSigLIP is the right approach (open question!), but it's innovative and worth studying for newcomers. Follow Lucas Beyer, SigLIP's senior author and now at Meta. He'll drop tons of computer vision knowledge (and entertaining takes).

For bio, read 10 papers in a domain of passion (e.g., lung cancer). If you (or AI) can't find one biased/outdated assumption or method, I'll gift a $20 Starbucks gift card. (Ping on Twitter.) This matters because data is downstream of study design, and of course models are downstream of data.

Starbucks offer open to up to three people.


Wow, you sound like a mega voracious reader, to put it lightly. And prolific writer too.

Have you posted even just snippets (or a favorite sentence or graf), or perhaps pseudo/anonymously, anywhere?

It literally hurts my heart when personal, creative expression is stifled or not given a chance to grace our greater world.


Thanks for everyone's perspectives. Very educational and admittedly lots outside the boundaries of my current knowledge. I have thus far relied on Cloudflare's automatic HTTPS and simple instant subdomain setup for their worker microservice I'm using.

There are evidently technical/footprint implications of that convenience. Fortunately, I'm not really concerned with the subdomain being publicly known; was more curious how it became publicly known.


I had to scroll pretty far down to see the first comment referring to the second most likely leak (after certificate transparency lists): Some ISP sold their DNS query log, and yours was in it.

People buying such records do so for various reasons, for example to seed some crawler they've built.


I have just been relying on Cloudflare's automatic HTTPS. I will look into my own certs, though I will likely just use Cloudflare's. I don't mind the internet knowing the subdomain I posted about; was curious how the bots found it!


I'm so often amazed (but no longer surprised) at the depth of niche (relatively) info and tools out there.


Thanks for mentioning. I checked it out, and am learning lots of new stuff (ie, realize how much I do not know).


Interesting! Just checked them out.

"MerkleMap gathers its information by continuously monitoring and live tailing Certificate Transparency (CT) logs, which are operated by organizations like Google, Cloudflare, and Let's Encrypt."
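
A rough sketch of why that matters for a "private" subdomain: every publicly trusted certificate lists the hostnames it covers, and those entries are public. The snippet below works offline on made-up records; the `names` field and the `subdomains_seen` helper are hypothetical, chosen only for illustration, and do not reflect MerkleMap's or any real log's schema.

```python
# Hypothetical records: each stands in for one certificate seen in a CT log.
# The "names" field is an assumption for illustration, not a real log schema.
SAMPLE_ENTRIES = [
    {"names": ["example.com", "www.example.com"]},
    {"names": ["uploads.example.com"]},
    {"names": ["*.example.com", "example.com"]},
    {"names": ["unrelated.org"]},
]


def subdomains_seen(entries, apex):
    """Return the sorted hostnames under `apex` that appear in any entry."""
    suffix = "." + apex.lower()
    found = set()
    for entry in entries:
        for name in entry["names"]:
            name = name.lower().rstrip(".")
            # Wildcard names cover many hosts but reveal no specific one.
            if name.startswith("*."):
                continue
            if name.endswith(suffix):
                found.add(name)
    return sorted(found)


if __name__ == "__main__":
    print(subdomains_seen(SAMPLE_ENTRIES, "example.com"))
```

In this toy data, `uploads.example.com` shows up purely because a certificate was issued for it, with no DNS guessing involved. The wildcard entry, by contrast, exposes no specific hostname.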


I made this, thank you!


If it does, I did not set it up; it would have been automatically done by Cloudflare when I told it to use my custom subdomain for the upload URLs.


Nope, never emailed or posted to anyone. Just me (it's my solo project at the moment).

