So if I understand correctly, they want to scan all your photos, stored on your private phone, that you paid for, and they want to check if any of the hashes are the same as hashes of child porn?
So... all your hashes will be uploaded to the cloud? How do you prevent them from scanning other stuff (memes, leaked documents, trump-fights-cnn-gif,... to profile the users)?
Or will a huge hash database of child porn hashes be downloaded to the phone?
Honestly, I think it's one more abuse of terrorism/child porn to take away people's privacy, and to brand everyone opposing the law as terrorists/pedos.
...also, as in the thread from the original url, making false positives and spreading them around (think 4chan mass e-mailing stuff) might cause a lot of problems too.
> and they want to check if any of the hashes are the same as hashes of child porn?
... without any technical guarantee or auditability that any of the hashes they're alerting on are actually of child porn.
How much would you bet against law enforcement to abuse their ability to use this, and add hashes to find out who's got anti government memes or police committing murder images on their phones?
And that's just in the "land of the free"; how much worse will the abuse of this be in countries that, say, bonesaw journalists to pieces while they are alive?
I remember the story where some large gaming company permanently banned someone because they had a file with a hash that matched a "hacking tool". Turns out the hash was for an empty file.
Malware will definitely be created, almost immediately, that downloads files intentionally crafted to match CP - either for the purposes of extortion or just watching the world burn.
I'm usually sticking my neck out in defence of more government access to private media than most on HN because of the need to stop CP, but this plan is so naive, and so incredibly irresponsible, that I can't see how anyone with any idea of how easy it would be to manipulate would ever stand behind it.
Signal famously implemented, or at least claimed to implement, a rather similar-sounding feature as a countermeasure against the Cellebrite forensics tool:
If they told that, people would (try to) remove it. The whole point is that you can't know which (if any) of the hundreds of thousands of files on your device it is. So they aren't telling. Could be they have (or at least claim to have) written their system so it chooses files at random; I think that's what I would do (or claim to have done).
If you can recreate a file so its hash matches known CP, then that file is CP, my dude. The probability of just two hashes accidentally colliding is approximately 4.3×10⁻⁶⁰.
Even if you do a content aware hash where you break the file into chunks and hash each chunk, you still wouldn’t be able to magically recreate the hash of a CP file without also producing part of the CP.
The Twitter thread this whole HN thread is about shows just how to make collisions on that hash. So any image can be manipulated to trigger a match, even if that image isn’t CP.
It's the weights from the middle of a neural network that they're calling a "hash", because it encodes and regenerates an image it has classified as bad. Experts have trouble reasoning about what weights mean in a neural network. This is going to end badly.
If this were a hash then it would be as the parent describes; this is at best a very fuzzy match on an image, to take into account blurring/flipping/colour shifting.
It's vastly more likely that innocent people will be implicated for fuzzy matches on innocuous photos of their own children in shorts/swimming clothes than it is to catch abusers.
The other thing is, when you have nothing to hide you won't make any effort to hide it - meaning you'll upload all of your (completely normal) photos to iCloud without thinking about it again.
The monsters making these images know what they're doing is wrong, so they'll likely make efforts to scramble or further encrypt the data before uploading.
tl;dr: it's far likelier that this dragnet will only ever apply to innocent people than that it will catch predators.
All this said, I'm still in support of Apple taking steps in this direction, but it needs far more protections put in place to prevent false positives than this solution allows. A single false accusation by this system, even if retracted later and rectified, would destroy an entire family's lives (and could well cause suicides).
Look what happened in the Post Office case in the UK as an example of how these things can go wrong - scores of people went to prison for years for crimes they didn't commit because of a simple software bug.
> The monsters making these images know what they're doing is wrong, so they'll likely make efforts to scramble or further encrypt the data before uploading.
The ones that make national news from big busts do. The ones that don't, get caught much sooner and only make local news, because Google and other parties already have automatic CSAM identification online (server side, not client side, AFAIK) and are sending hits to Homeland Security.
That document you downloaded that is critical of the party will land you and your family in jail. Enjoy your iPhone.
Seriously, folks, we shouldn't celebrate Apple's death grip over their platform. It's dangerous for all of us. The more of you that use it, the more it creates a sort of "anti-herd immunity" towards totalitarian control.
Apple talks "privacy", but jfc they're nothing of the sort. Apple gives zero shits about your privacy. They're staking more ground against Facebook and Google, trying to take their beachheads. You're just a pawn in the game for long term control.
Apple cares just as much for your privacy as they do your "freedom" to run your own (un-taxed) software or repair your devices (for cheaper).
And after Tim Cook is replaced with a new regime, you'll be powerless to stop the further erosion of your liberties. It'll be too late.
But is there a realistically better alternative? Pinephone with a personally audited Linux distro? A jailbroken Android device with a non-stock firmware that you built yourself? A homebuilt RaspberryPi based device? A paper notepad and a film camera and an out of print street map?
The best bet is probably a Pixel phone with GrapheneOS. (Do note that CopperheadOS is a scam and is not to be used.)
GNU/Linux phones have nonexistent security beyond being niche (so security by obscurity at most). And they are not yet usable as a daily driver, at least for me personally.
Whether or not I'm allowed to inspect my front door, if it has no locks whatsoever it's still easy to open. And the reverse: even if I don't know the internals of the lock in my door, it still won't let others through.
What if I have a locksmith verify it for me? Apple and Android have been checked by several security researchers, and while they absolutely have holes, there are at least gates in place that can stop intruders. Sandboxing is the bare minimum an OS should do if it wants to have third-party applications installed.
Basically, Micay is a legitimate security researcher who created the project, which was later hijacked by the company funding some of it. That company has since tried to badmouth Micay anywhere it can, and is doing shady things on top of the still-open-source code base. Micay was professional enough to destroy the verification key at the time of the fork.
> Pinephone with a personally audited Linux distro?
Even if you don't personally audit it, you still benefit from other people doing it. Especially if the software is reproducible (and many packages are).
An Android device running non-stock is a realistically better scenario. The big problem there is that the state of Android drivers means your hardware options are severely cut down (in practice, to a selection about the size of Apple's - the Pixel line and some assorted others).
With non-stock (assuming not jailbroken but just a totally different operating system) I think (I might be wrong... I should know for sure but I awkwardly don't) you aren't even allowed to use Google Play Services at all?
You are allowed to use Google services. There is even an alternative called microG, which is compatible with apps requiring Google services but sends "fake" data to Google.
Viable alternatives are long gone. I really miss the days of Symbian and MeeGo - phones that were hackable yet intuitive to use (e.g. the Nokia N900 and N9).
Realistically, now we have Tizen and Jolla OS, which had backing from Samsung, but nobody gave a damn about them.
I bet that even if any of these vanilla mobile OSes gets big enough, it'll get bought by one of the three giants and suffocated to death, just like how Microsoft sniped Nokia.
Not the parent commenter, but for me: Samsung are just as morally vacuous as Google, but way less competent, at least on the software side (their component manufacturing seems to be world class in at least some areas).
They'll happily do evil shit, and execute it poorly. Samsung are _way_ more likely to leak the unnecessarily and possibly illegally collected personal data they hoover up than Google are.
Not really, and I'm not going to sway anyone deeply into the ecosystem.
My hope is that those of you that share my viewpoint will call your legislators and demand regulations or a break up. There are forces of good within the DOJ that are putting together an antitrust case against Apple, and the more of us that lend our voices, the louder and more compelling the argument.
The DOJ is really the last lever we have, and that's pretty good measure for the power Apple wields.
It always starts with child porn, and in a few years the offline Notes app will be phoning home if you write speech criticising the government in China.
This technology inevitably leads to the surveillance, suppression and murder of activists and journalists. It always starts with protecting the kids, or terrorism.
Perceptual hashes like what Apple is using are already used in WeChat to detect memes that critique the CCP.
What happens on local end user devices must be off limits. It is unacceptable that Apple is actively implementing machine learning systems that surveil and snitch on local content.
> in a few years the offline Notes app will be phoning home if you write speech criticising the government in China.
A totalitarian autocracy like China does not need this technology to search for wrongspeech, sadly. You are of course aware that all Chinese iCloud users get their data stored in a special set of datacenters that Apple actually doesn't control.
The problem is that this will be done in other, (currently) freer countries. E.g. Reddit has removed a lot of anti-China posts in the last few years. And of course, local leaders will use this to find anti-local-leader material on their citizens' phones too.
This seems to be done voluntarily by NVIDIA, at least partially. While in Seattle I set up geo-blocking on my LAN as an experiment. Later when I tried to create an NVIDIA account I couldn't because it was attempting to store my PII at nvidia.cn. When I changed the url to nvidia.com everything worked just fine. I've always wondered what non-evil reasons one could use to explain that choice by NVIDIA. Ping was at least 2x longer to .cn.
Which'd be fine if we had one global government with world wide jurisdiction. Or technology choices from companies which couldn't be pressured by governments outside your personal regulation jurisdiction.
I wouldn't hold your breath waiting for those regulations to become law in, say, China or Turkey or Saudi Arabia. I'd bet even Israel won't pass them; surely NSO have enough political lobbying swing (and probably also suitable blackmail material on sitting politicians).
>Which'd be fine if we had one global government with world wide jurisdiction.
I wholeheartedly disagree. A worldwide government would be catastrophic for whistleblowing. At least as a whistleblower you can live somewhat safely in a country opposed to your own. With a one-world government you would have nowhere to run.
And I wouldn't expect it to protect citizens' interests any better than current governments do. On the contrary, I think the lack of any counterbalance in the world would embolden it further.
They sell software designed to break into people's phones to oppressive regimes all over the planet. They definitely have the capability and requisite lack of ethics to compromise their own politicians; there's no probably about it.
But we don't have a global government, so the next best thing is for individual countries to pass such regulation, which would prevent products violating privacy like this from being offered and sold in those countries.
Think of GDPR, which is essentially each member of the EU saying in unison, "your product/service must comply with these data protection laws, or you can't legally do business with any of our citizens".
Come to think of it, I wonder if this Apple thing would even fly under GDPR?
> Come to think of it, I wonder if this Apple thing would even fly under GDPR?
Possibly? I’m not a lawyer, but if this is about compliance with a legal obligation, and they’re under that category of pressure? I think GDPR would allow that?
Certainly seems more likely allowed than the stuff Facebook complained Apple was preventing them from doing.
I agree, I would add that people have generated legal images that match the hashes.
So I want to ask what happens if you have a photo that is falsely identified as one in question and then an automated mechanism flags you and reports you to the FBI without you even knowing. Can they access your phone at that point to investigate? Would they come to your office and ask about it? Would that be enough evidence to request a wiretap or warrant? Would they alert your neighbors? How do you clear your name after that happens?
edit: yes, the hash database is downloaded to the phone and matches are checked on your phone.
Another point is that these photos used to generate the fingerprints are really legal black holes that the public is not allowed to inspect I assume. No one wants to be involved in looking at them, no one wants to be known as someone who looks at them. It could even be legally dangerous requesting to find out what has been put into the image database I assume.
>I would add that people have generated legal images that match the hashes.
That seems like a realistic attack. Since the hash list is public (it has to be, for client-side scanning), you could likely set your computer to grind out an image that matches a target hash but is actually just some meme, which you then distribute.
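A toy version of that grinding, using an invented 16-bit perceptual hash (`toy_phash` is made up for illustration; real schemes like the one in the linked thread are attacked with gradient methods, but the underlying weakness is the same):

```python
# Toy second-preimage attack on an invented 16-bit perceptual hash.
# Nothing here is Apple's scheme; it just shows why structure-preserving
# hashes are forgeable in a way cryptographic hashes are not.
import random

def toy_phash(pixels):
    """One bit per pixel: set if the pixel is brighter than the mean."""
    mean = sum(pixels) / len(pixels)
    return sum(1 << i for i, p in enumerate(pixels) if p > mean)

random.seed(0)
banned = [random.randrange(256) for _ in range(16)]  # stand-in "known image"
target = toy_phash(banned)

# Construct a completely different image with the same hash: bright pixel
# wherever the target bit is 1, dark pixel wherever it is 0.
forged = [200 if (target >> i) & 1 else 50 for i in range(16)]
print(toy_phash(forged) == target)  # True: different pixels, same "hash"
```

With a hash this weak you don't even need to grind; for real perceptual hashes you'd search with small perturbations instead, but the same structure-preservation is what makes the search feasible.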
At least one of these two things must be true: either Apple is going to upload hashes of every image on your device to someone else's server, or the database of hashes will be available somehow to your device.
Replying to my own question since I can’t edit anymore: it turns out “perceptual hashing,” which I didn’t know much about, has exactly this property, that small changes in the input result in small changes in the output.
One thing to note is these are not typical cryptographic hashes because they have to be able to find recompressed/cropped/edited versions as well. Perhaps a hash is not an accurate way to describe it.
There have been a number of cases where people have found ways to trick CV programs in to seeing something that no human would ever see. If you were sufficiently malicious I imagine it would be possible to do with this system as well.
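To the earlier point about these not being typical cryptographic hashes, here is a minimal sketch of an "average hash" (aHash), one of the simplest perceptual hashes; the 8×8 "images" below are synthetic stand-ins, not anything from a real scheme:

```python
# Minimal average-hash (aHash) sketch: each bit records whether a pixel of
# a downscaled grayscale image is brighter than the mean. Small edits flip
# few or no bits, which is exactly what makes recompressed/cropped copies
# findable - and what makes this unlike a cryptographic hash.

def average_hash(pixels):
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a, b):
    return bin(a ^ b).count("1")

# A synthetic 8x8 "image" and a lightly edited copy.
img = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
edited = [row[:] for row in img]
edited[0][0] += 10  # imperceptible brightness tweak

print(hamming(average_hash(img), average_hash(edited)))  # 0: hash unchanged
```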
No need to upload every hash or download a huge database with every hash. If I were building this system, I'd make a Bloom filter of the hashes. That gives compact storage and O(1)-time checking of a hash match, with some risk of false positives. I'd only send hashes that hit the filter to be checked against a full database.
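A minimal sketch of that idea (the sizes and SHA-256-based indexing are arbitrary choices for illustration, not anything Apple has described):

```python
# Hedged sketch of an on-device Bloom filter: "definitely not in the set"
# answers never leave the phone; only probable hits would be re-checked
# against the full server-side database. Parameters are made up.
import hashlib

class BloomFilter:
    def __init__(self, size_bits=1 << 20, num_hashes=7):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive k independent bit positions from SHA-256 of a salted item.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

bf = BloomFilter()
bf.add("hash-of-known-image")
print("hash-of-known-image" in bf)      # True - no false negatives
print("hash-of-my-vacation-pic" in bf)  # False (with high probability)
```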
No, your hashes are not uploaded to the cloud; yes, hashes are downloaded to your phone. Yes, it will be interesting to see if it gets spammed with false positives, although it seems that could be detected and handled silently, without the user knowing.
If the details of the "hashing" scheme used are publicized, I imagine it will be near trivial. Finding a digital description of an image such that two similar images compare equal, or at least similar, is a long-standing problem in computer vision.
State of the art for this field is deep learning, and a /huge/ problem with the DL approach is that you can generate adversarial examples - for example, a picture of a teacup that is identified by /most/ networks as a dog. It's particularly damning because these examples often transfer: you don't have to craft them for one particular deep network; different networks get tricked the same way, so to speak.
Indeed, at which point we’ll know if Apple has implemented an obviously broken solution which opens us up to egregious government surveillance, or whether that is all just speculation without a factual basis.
This isn't cryptographic though. That would make the entire database absolutely trivial to bypass with tiny imperceptible random changes to the images.
It cannot be a cryptographically secure hash, simply because avoiding detection would then be trivial: change one channel in one pixel by one. Imperceptible change, different cryptographic hash.
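The avalanche effect makes this concrete: with a real cryptographic hash like SHA-256, a one-bit change produces an unrelated digest (the input bytes here are just a stand-in):

```python
# Flipping one bit of the input ("change one channel in one pixel by one")
# produces a completely unrelated SHA-256 digest - the avalanche effect.
import hashlib

data = bytearray(b"stand-in for some image's raw bytes")
h1 = hashlib.sha256(bytes(data)).hexdigest()
data[0] ^= 1  # the imperceptible one-bit change
h2 = hashlib.sha256(bytes(data)).hexdigest()

print(h1 == h2)  # False: most of the 64 hex characters now differ
```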
The probability of Apple and all its devices winking out of existence due to quantum fluctuations is not zero. ‘Not zero’ is effectively zero if the number is small enough.
128 bit hashes are “expect your first collision when each human buys 2305 iPhones, all with one terabyte storage, and then fills them up with photos that are an average of 1MB in file size”
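That arithmetic checks out under the birthday bound, assuming a roughly 8-billion world population (a figure I'm supplying, not from the parent):

```python
# Back-of-envelope check of the "~2305 iPhones per human" figure.
# Birthday bound: expect the first collision among 128-bit hashes after
# about sqrt(2^128) = 2^64 samples.
import math

expected_samples = math.isqrt(2 ** 128)   # 2^64, about 1.8e19 hashes
photos_per_phone = 10 ** 12 // 10 ** 6    # 1 TB of 1 MB photos = 1e6 photos
population = 8 * 10 ** 9                  # assumed world population

phones_per_person = expected_samples / (population * photos_per_phone)
print(round(phones_per_person))  # 2306 - in line with the parent's 2305
```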
That's the stated purpose, but keep in mind that these databases (NCMEC's in particular, which is used by FB and very likely Apple) contain legal images that are NOT child porn.
Think of it this way: take a regular, legal set of adult pornographic pictures. While legal, we'd still classify this set of pictures as known porn if we were tracking it.
Now, the first few images might be the model completely clothed and not even be porn - maybe there's a picture of her lounging around a pool, then another picture of the pool itself. Still, it's part of a set of pictures that is known porn.
Heck most porn starts off with actors being clothed (so I hear lol).
No, no, no. It's not your phone. If it were your phone, you would have root access to it. It's their phone. And their photos. They just don't like it when there's something illegal in their photos, so they will scan them, just in case.