How would flying drones be useful to a drug runner? Their priorities are to transport a large amount of material over a long distance and to avoid detection. Drones have relatively low payload capacity, limited range, and are easily detected - they're not practical.
(A very different kind of "drone" has seen quite a bit of use in drug running - remote-controlled submarines! They've proven able to carry a large load over a long distance while remaining hard to detect.)
There are commercially available drones that can carry a payload of high-single-digit to low-double-digit kilograms for at least 10km.[1] They fly low enough and are small enough to avoid most radar.
Their use in cross-border smuggling of weapons and drugs is well documented[2]; the interception rate is low enough that they can make multiple runs before being downed, and they can pay back their purchase cost with only a few successful runs. The typical concept of operations is similar to manned ground crossings, but with drones covering the most dangerous 5-10km of actually crossing the border: a team on one side loads them up and sends them to a team on the other side, with both having a LOT of real estate to hide in because of the drone's range.
(I work on counter-drone EW, and border-control customers are under intense pressure to get this under control.)
> Things like [...] are now considered table stakes.
One other feature that's absolutely considered table stakes now is persistent server-side history, with the ability to edit and delete messages. Modern chat services are less like IRC, and more like a web forum with live updates.
(Yes, you can poorly emulate server-side history on IRC with a bouncer. That's not enough, and it's a pain for users to set up.)
There's also Quassel, which solves the problem a bit like a bouncer but is far more integrated: it loads the scrollback on demand instead of just banging the latest 200 lines into my buffer when I connect. Solves the problem perfectly IMO, and there's a really excellent Android client.
It's still not server-side history, though - you can't join a channel and see what happened before you joined, or edit a message you've already sent. It's just a slightly cleaner implementation of an IRC bouncer.
Hmm, no - but that's usually a good thing. I've had some late-night chats where I knew everyone who was around, and it would not be so cool if anyone else could just join and scroll back through them.
In fact, this is the reason some IRC networks blocked Matrix bridges at first (they now have settings to disable this).
I'm not saying mainstream people should use IRC though. Matrix is better for that.
Even wilder - they're claiming to look at a user's activity on the platform - like what servers they're on, what games they play, and what hours they're active - and infer adulthood from that. No way that'd pass legal muster.
Account age and credit card history can tell a lot. If Discord can assume you were at least 7 when you first signed up for Nitro and you've been a Nitro member off and on since Discord started 11 years ago, you are at least 18.
It seems like these systems would be very easy to reverse engineer. Pretend to be an old person on Discord (whatever that entails) long enough to get them off the case.
> It's such a successful strategy, even Bitcoin scammers use it:
For years, email spammers have claimed to have tracked victims' porn habits to try to extort them. That's a far cry from actually doing so. (And no, they aren't actually doing it.)
Not to mention "seriously", "really", "truly", "very", "verily", etc. There's a long history of using words related to truth as intensifiers in English.
There's an easier and more effective way of doing that - instead of trying to give the model an extrinsic prompt which makes it respond with your text, you use the text as input and, for each token, encode the rank of the actual token within the set of tokens that the model could have produced at that point. (Or an escape code for tokens which were completely unexpected.) If you're feeling really crafty, you can even use arithmetic coding based on the probabilities of each token, so that encoding high-probability tokens uses fewer bits.
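For the curious, here's a minimal sketch of the rank-encoding idea in Python (the model choice and function names are mine, and a real codec would entropy-code the rank stream rather than store raw integers):

    # Rank-based encoding: replace each token with its rank in the model's
    # predicted next-token distribution. Likely tokens become small numbers.
    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tok = AutoTokenizer.from_pretrained("gpt2")
    lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    @torch.no_grad()
    def text_to_ranks(text):
        ids = [tok.bos_token_id] + tok.encode(text)
        ranks = []
        for i in range(1, len(ids)):
            logits = lm(torch.tensor([ids[:i]])).logits[0, -1]
            order = torch.argsort(logits, descending=True)
            ranks.append((order == ids[i]).nonzero().item())
        return ranks

    @torch.no_grad()
    def ranks_to_text(ranks):
        # Decoding replays the model and picks the token at each saved rank.
        ids = [tok.bos_token_id]
        for r in ranks:
            logits = lm(torch.tensor([ids])).logits[0, -1]
            order = torch.argsort(logits, descending=True)
            ids.append(order[r].item())
        return tok.decode(ids[1:])

Because the model usually ranks the true next token near the top, the rank stream is mostly small integers and compresses very well; arithmetic coding over the model's actual probabilities squeezes out even more.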
From what I understand, this is essentially how ts_zip (linked elsewhere) works.
Concur. Zstandard is a good compressor, but it's not magical; comparing the compressed size of Zstd(A+B) to the combined size of Zstd(A) + Zstd(B) is effectively just a complicated way of measuring how many words and phrases the two documents have in common. Which isn't entirely ineffective at judging whether they're about the same topic, but it's an unnecessarily complex and easily confused way of doing so.
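To make the comparison concrete, here's a sketch using the zstandard Python package (the ncd() helper is my naming; it's the usual normalized compression distance):

    # If A and B share substrings, compressing them together costs less
    # than compressing them separately.
    import zstandard as zstd

    def csize(data, level=19):
        return len(zstd.ZstdCompressor(level=level).compress(data))

    def ncd(a: bytes, b: bytes) -> float:
        ca, cb, cab = csize(a), csize(b), csize(a + b)
        return (cab - min(ca, cb)) / max(ca, cb)  # smaller = more overlap

Documents with many shared words and phrases get a small distance - which is exactly the "complicated overlap counter" behavior described above.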
Mostly. There are also confounding effects from factors like the length of the texts - e.g. in Zstd(A+B), it's more expensive to encode a backreference from B to some content in A when the distance to that content is longer, so longer texts will appear less similar to each other than short texts.
I don't know the inner details of Zstandard, but I would expect it to at least pick up suffix/prefix statistics or word-fragment statistics, not just whole words and phrases.
The thing is that two English texts on completely different topics will compress better together than, say, an English and a Spanish text on exactly the same topic. So compression really only looks at the form/shape of the text, not the meaning.
Yes of course, I don't think anyone will disagree with that. My comment had nothing to do with meaning but was about the mechanics of compression.
That said, lexical and syntactic patterns are often enough for classification and clustering in a scenario where the meaning-to-lexicons mapping is fixed.
The reason compression-based classifiers trail a little behind classifiers built from first principles, even in this fixed-mapping case, is subtle.
Optimal compression requires correct probability estimation, and correct probability estimation yields an optimal classifier. In other words, optimal compressors - equivalently, correct probability estimators - are sufficient.
They are, however, not necessary: one can obtain the theoretically best classifier without estimating the probabilities correctly.
So in the context of classification, compressors are solving a task that is much, much harder than necessary.
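A toy illustration of that gap (the numbers are made up; any strictly monotone distortion of the posteriors behaves the same way):

    import math

    p = {"spam": 0.7, "ham": 0.3}            # "correct" class posterior
    q = {c: v ** 2 for c, v in p.items()}    # miscalibrated, same ordering

    # Classification only needs the argmax, which survives the distortion:
    assert max(p, key=p.get) == max(q, key=q.get)

    # A compressor's code lengths (-log2 p) do not survive it:
    print(-math.log2(p["spam"]), -math.log2(q["spam"]))  # ~0.51 vs ~1.03 bits

The classifier's decision is invariant under any order-preserving distortion of the probabilities; code lengths are not. A compressor has to get the probabilities right, while a classifier only has to get their ordering right.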
It's not specifically aware of the syntax - it'll match any repeated substrings. That just happens to usually end up meaning words and phrases in English text.
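A quick way to see that (made-up strings; any repeated byte pattern works, word or not):

    import zstandard as zstd

    comp = zstd.ZstdCompressor()
    words = b"the quick brown fox " * 100   # repeated English words
    frags = b"thq#uickbr@wnfo!xx! " * 100   # repeated non-word junk

    # Both collapse to a tiny fraction of their size: the match finder
    # sees repeated byte substrings, not words.
    print(len(comp.compress(words)), len(comp.compress(frags)))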