It really varies. You are correct that most modern ones search the byte string for @ characters, but there are probably hundreds of different methods out there in black hat marketing circles to scrape emails.
You know what's funny is that LLMs are also as good at detecting spam as they are at generating it. I've got an automation that scores incoming emails and it's getting better and better each day (also more expensive haha)
I can’t explain it well, but I think there is an asymmetric issue here… that the ability for an LLM to write a plausible email, and the ability for an LLM to detect that it’s spam are mismatched.
If an LLM can make a plausible email, the best another LLM can do is to rank it as plausible. Blackbox creation and detection have to be on the same level.
Perhaps if you said the detection LLM had all your context and websearch. That it could know that a Penny Pollytree at Coco Co isn’t a real person, but… that just seems like burning a ton of coal to detect fraud where the creation LLM was able to easily come up with the fictitious spam cheaply.
The real story here is this will go beyond email verification. That every system we have is going to need to up its security. Paper birth certificates and social security cards and email addresses and all manner of identity is going to need new systems of auth. The challenge will be to prevent authoritarian centralization.
But I think there's also an asymmetry strongly favouring the defense, namely that for a spam mail to be worthwhile, it needs some call to action, a way to lure in the victims.
A link to a shady website, an infected attachment, a weird freemail address in the body or Reply-To header that doesn't match the forged From header, etc. They're trying to get cleverer for sure -- I started getting phishing mails where the malicious link is only in a QR code in an embedded image -- but I think the need to somehow link to the trap is an inherent weakness against any defense. SpamAssassin rules give a good overview of stuff that helps detection no matter how the rest of the mail is generated.
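To make the "call to action" point concrete, here is a rough sketch of a few such checks using only Python's standard library. The function name, the signal labels and the choice of checks are my own assumptions for illustration. They are not SpamAssassin rules, and real filters go much further (QR codes inside images, for instance, are only flagged here as "there is an image").

```python
import re
from email import message_from_string
from email.utils import parseaddr
from urllib.parse import urlsplit

# Loose URL matcher; good enough for a sketch, not a full URL grammar.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def domain_of(header_value):
    """Return the lowercased domain of an address header, or '' if absent."""
    _, addr = parseaddr(header_value or "")
    if "@" not in addr:
        return ""
    return addr.rpartition("@")[2].lower()


def call_to_action_signals(raw_message):
    """Hypothetical helper: list the 'lure' signals found in one raw email."""
    msg = message_from_string(raw_message)
    from_domain = domain_of(msg.get("From"))
    reply_domain = domain_of(msg.get("Reply-To"))
    signals = set()

    # A Reply-To that points somewhere other than the From domain.
    if reply_domain and reply_domain != from_domain:
        signals.add("reply-to-mismatch")

    for part in msg.walk():
        ctype = part.get_content_type()
        if ctype.startswith("image/"):
            # Could be hiding a QR code; decoding it is out of scope here.
            signals.add("embedded-image")
        elif part.get_filename():
            signals.add("attachment")
        elif ctype.startswith("text/"):
            payload = part.get_payload(decode=True) or b""
            text = payload.decode("utf-8", errors="replace")
            for url in URL_RE.findall(text):
                host = (urlsplit(url).hostname or "").lower()
                on_domain = from_domain and (
                    host == from_domain or host.endswith("." + from_domain)
                )
                if host and not on_domain:
                    signals.add("off-domain-link")

    return sorted(signals)
```

None of these checks care how convincing the prose is, which is the point: the lure has to show up somewhere in the headers, links or attachments.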
also I'm not a hardware hacker - are there any ESP32 kits with the speaker / led / button / amp already wired or in a kit at least? (that you recommend for this)
It's really wild seeing all the AC to DC changes. For those non electric engineers / hardware hackers (like myself), one of the biggest "examples" I've seen of this has been ceiling fans.
Installing a ceiling fan used to be treacherous and the fans were so heavy. They were also loud and buzzy once installed. Now the fans in these things are so lightweight and easy.
seeing the same in many more areas (lighting, etc)
Would love to see more mainstream DC lighting options and an updated code to match. I just finished a remodel of my workshop and blew over a hundred bucks on 14/2 for a 15 amp lighting circuit that is unlikely to ever see more than a 1 amp load.
The irony is all the recessed lights I picked out are DC, they all have little AC-DC boxes hanging off them using a proprietary connector. If I hadn't needed to pass a rough-in inspection going all DC would've been trivial.
I worked at fb, and I'm 100% certain we sponsored VLC and OBS at the time. It would be strange if we didn't sponsor FFMPEG, but regardless (as the article says) we definitely got out of our internal fork and upstreamed a lot of the changes.
I worked on live, and everyone in the entire org worships ffmpeg.
Meta has made more positive contributions to society and the world than every HN commenter combined, and more than most of the other FAANGS (Amazon being the exception).
Damned for virtue signalling if they make posts about their contributions, damned for destroying tech when they don't. I love these kinds of articles and share them with students all the time.
While contributing back to ffmpeg is great, this is insanely hyperbolic lol. Do you genuinely think Instagram and Facebook are positive contributions to society?
Which is kinda crazy to me, in light of how durable their business laptops have been in my experience. I’ve owned maybe 6 PC laptops in my career, and the only 2 that’ve survived that nearly 20-year span are both Dells.
Access logs were one of the main motivations (lots of repeated queries like IP/user-agent/path/status). If you try it, two tips:
1) Index once, then iterate on searches:
qlog index './access*.log'
qlog search 'status=403'
2) If you’re hunting patterns (e.g. suspicious UAs or a specific path), qlog really shines because it doesn’t have to rescan the whole file on each query.
If you run into anything weird with common log formats (nginx/apache variants), feel free to paste a few sample lines and I’ll make the parser more robust.
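For anyone wondering why "index once, search many times" avoids the rescan, here is a toy inverted index in Python. To be clear, this is not qlog's actual implementation or file format. The regex, the field names and the `build_index` / `search` helpers are all assumptions made up for the illustration.

```python
import re
from collections import defaultdict

# Assumed pattern for a common-log-style line; real nginx/apache formats vary.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def build_index(lines):
    """Map each 'field=value' token to the set of line numbers containing it."""
    index = defaultdict(set)
    for lineno, line in enumerate(lines):
        match = LINE_RE.match(line)
        if not match:
            continue  # unparsed lines are simply skipped in this sketch
        for field, value in match.groupdict().items():
            index[f"{field}={value}"].add(lineno)
    return index


def search(index, *terms):
    """Return sorted line numbers matching every term, without rescanning."""
    postings = [index.get(term, set()) for term in terms]
    if not postings:
        return []
    return sorted(set.intersection(*postings))
```

After the one-time pass in `build_index`, a query such as `search(idx, "status=403")` is a dictionary lookup plus a set intersection, so its cost depends on the number of matching lines rather than on the size of the log.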