They wouldn't. Well, there's ETag and the like, but that's still a layer-7 round trip to the origin. However, the general pattern is to state in the response headers how long the content is good for, and cache for that duration. For example, a bitcoin pricing aggregator might say it's good for 60 seconds (with disclaimers on the page that this isn't market data), whilst My Little Town news might say that an article is good for an hour (to allow updates) and the homepage is good for 5 minutes, so a breaking news article doesn't appear too far behind.
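The per-page durations above boil down to a `Cache-Control: max-age` header on each response. A minimal sketch (the endpoint paths here are hypothetical illustrations, not anyone's real API):

```python
# Sketch: expressing per-page freshness with Cache-Control max-age.
# Paths and durations mirror the examples in the comment above.

FRESHNESS = {
    "/btc-price": 60,      # pricing aggregator: good for 60 seconds
    "/article/123": 3600,  # news article: good for an hour
    "/": 300,              # homepage: 5 minutes, so breaking news isn't stale
}

def cache_headers(path: str) -> dict:
    """Return response headers telling caches how long the body stays fresh."""
    max_age = FRESHNESS.get(path, 0)
    if max_age == 0:
        return {"Cache-Control": "no-store"}
    return {"Cache-Control": f"public, max-age={max_age}"}

print(cache_headers("/btc-price"))  # {'Cache-Control': 'public, max-age=60'}
```

Any intermediate cache (CDN or browser) can then serve the cached copy without touching the origin until `max-age` expires.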
Based on the post, it seems likely that they'd just delay per the robots.txt policy no matter what, and do a full browser render of the cached page to get the content. Probably overkill for lots and lots of sites. An HTML fetch + readability is really cheap.
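To illustrate how cheap the "HTML fetch + readability" path is compared to a full browser render, here's a crude stdlib-only text extractor. Real pipelines would use a proper readability library; this is just a sketch of the shape:

```python
# Crude sketch of readability-style extraction: strip markup and boilerplate
# elements, keep the visible text. A real readability implementation scores
# content blocks; this only illustrates how lightweight the step is.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    SKIP = {"script", "style", "nav", "header", "footer"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.depth = 0  # nesting level inside skipped elements

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0 and data.strip():
            self.parts.append(data.strip())

def extract_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.parts)

print(extract_text("<html><script>x()</script><p>Breaking news.</p></html>"))
# Breaking news.
```

One HTTP GET plus a single parse pass, versus spinning up a headless browser per page.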
Yes, but that leaves a report from the minor as the only way to identify this behavior. I'm not saying I trust TikTok to only do good things with access to DMs, but I think it's a fair argument in this scenario to say that a platform has a better opportunity to protect minors if messages aren't encrypted.
I'm not saying no E2E messaging apps should exist, but maybe they don't need to exist for minors in social media apps. However, an alternative could be allowing the sharing of the encryption key with a parent, so that someone has the ability to monitor messages.
> I think it's a fair argument in this scenario to say that a platform has a better opportunity to protect minors if messages aren't encrypted
Would it be a fair argument to say the police have a better opportunity to prevent crimes if they can enter your house without a warrant? People are wary of this sort of thing, and not because they think law enforcement is more effective when it is constrained. How easily crimes can be prosecuted is only one dimension of safety.
> However, an alternative could be allowing the sharing of the encryption key with a parent
Right, but this is worlds apart from "sharing the encryption key with a private company", is it not?
> Would it be a fair argument to say the police have a better opportunity to prevent crimes if they can enter your house without a warrant?
This is a false equivalence. I don't have to use TikTok DMs if I want E2EE. I don't have a choice about laws that allow the police to violate my rights. And I'm not claiming that all E2EE apps should be banned.
> Right, but this is worlds apart from "sharing the encryption key with a private company", is it not?
Exactly why I suggested that as a possible alternative.
I'm not making an equivalence. I'm just trying to get you to think about how something that is true at surface level is not necessarily a "fair argument".
> I don't have to use TikTok DMs if I want E2EE.
I don't know why you think this is a convincing argument. It is currently illegal to tap people's phone lines, but when phones were invented it obviously was not illegal. It became illegal in part because people had a reasonable expectation of privacy when using the phone. They also have a reasonable expectation of privacy when using TikTok DMs - that's why people call them "private messages" so often!
> Exactly why I suggested that as a possible alternative.
My point is that you are offering these as alternatives when they are profoundly different proposals. It is like me saying I am pro forced sterilization and then offering as an alternative "we could just only allow it when people ask for it". That's a completely different thing! Having autonomy over your online life as a family rather than necessarily as an individual is totally ok. Surrendering that autonomy is not.
> Police cannot access your E2EE DMs with a warrant.
They can and do, regularly. What they can't do is prevent you from deleting your DMs if you know you're under investigation and likely to be caught. But refusing to give up encryption keys, and suspiciously empty chat histories in the face of a valid warrant, is very good evidence of a crime in itself.
They also can't prevent you from flushing drugs down the toilet, but somehow people are still convicted for drug-related crimes all the time. So - yes, obviously, the police could prosecute more crimes if we gave up this protection. That's how limitations on police power work.
It certainly can be - destruction of evidence is a crime. If they can prove you destroyed evidence, even if they can't prove that the destroyed evidence incriminates you, that's criminal behaviour. For instance if it's known by some other means you have a conversation history with person X, but not whether that conversation history is incriminating, and then when your phone is searched the conversation history is completely missing, that is strong evidence of a crime.
And they shouldn't be able to. Police accessing DMs is more like "listening to every conversation you ever had in your house (and outside)" than "entering your house".
>Police cannot access your E2EE DMs with a warrant.
Well, they kind of can if they nab your cell phone or another device that has a valid access token.
I think it's kind of analogous to the police getting at one's safe. You might have removed the contents before they got there but that's your prerogative.
"In public" is the operative phrase (and surveillance cameras in public are extremely recent and very controversial, so not as strong an argument as you might think).
> I'm not saying no E2E messaging apps should exist, but maybe it doesn't need to for minors in social media apps. However, an alternative could be allowing the sharing of the encryption key with a parent so that there is the ability for someone to monitor messages.
The problem with that idea is that you're implying E2E should require age verification. Everyone should have access to secure end-to-end encryption.
Are you suggesting all messaged photos should be scanned, and potentially viewed by humans, in case it depicts a nude minor? Because no matter how you do that, that would result in false positives, and either unfair auto-bans and erroneous reports to law enforcement (so no human views the images), or human employees viewing other adults' consensual nudes that were meant to be private. Or it would result in adult employees viewing nudes sent from one minor to another minor, which would also be a major breach of those minors' privacy.
There is a program whereby police can generate hashes based on CSAM images, and then those hashes can be automatically compared against the hashes of uploaded photos on websites, so as to identify known CSAM images without any investigator having to actually view the CSAM and further infringe on the victim's privacy. But that only works vs. already known images, and can be done automatically whenever an image is uploaded, prior to encryption. The encryption doesn't prevent it.
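The matching step described above can be sketched in a few lines. Note this is only illustrative: real deployments use perceptual hashes (e.g. PhotoDNA) that survive re-encoding and resizing, whereas the plain SHA-256 comparison here only catches byte-identical files. The hash values are placeholders:

```python
# Sketch of upload-time hash matching against a known-image list, done
# before encryption. The service never views the image, only its hash.
import hashlib

KNOWN_HASHES = {
    # Hashes supplied by law enforcement; this value is a placeholder.
    hashlib.sha256(b"known-bad-image-bytes").hexdigest(),
}

def matches_known_image(image_bytes: bytes) -> bool:
    """Compare the uploaded image's hash against the known-hash set."""
    return hashlib.sha256(image_bytes).hexdigest() in KNOWN_HASHES

print(matches_known_image(b"known-bad-image-bytes"))  # True
print(matches_known_image(b"holiday-photo"))          # False
```

Because the comparison happens client- or server-side before the message is encrypted for transport, it's compatible with E2EE in a way that content scanning of decrypted messages isn't.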
Point being, disallowing encryption sacrifices a lot, while potentially not even being that useful for catching child abusers in practice.
I'm sure some offenders could be caught this way, but it would also cause so many problems itself.
I have a side project of a new type of JSON database. Schema discovery is performed on the fly and that schema is then used to compress the stored data. This eliminates the need to use short key names to save space in addition to reducing overall storage requirements.
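The core idea can be sketched roughly as follows. This is my illustration of schema-discovery compression in general, not the project's actual storage format: discover the shared keys once, then store each record as a tuple of values.

```python
# Sketch: discover a schema from homogeneous JSON objects, store the key
# names once, and store each record as a value row. Illustrative only.
import json

def compress(records):
    keys = sorted(records[0].keys())
    rows = [[r[k] for k in keys] for r in records]
    return {"schema": keys, "rows": rows}

def decompress(blob):
    return [dict(zip(blob["schema"], row)) for row in blob["rows"]]

records = [
    {"temperature": 21.5, "humidity": 40, "station": "a"},
    {"temperature": 19.0, "humidity": 55, "station": "b"},
]
packed = compress(records)
assert decompress(packed) == records

# Long, descriptive key names now cost space once, not once per record:
print(len(json.dumps(packed)) < len(json.dumps(records)))  # True
```

The saving grows with record count, since the per-record overhead collapses to the row delimiters.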
> First, it is hard, especially in at least somewhat portable manner.
I'm curious what portability concerns you've run into with JSON serialization. Unless you need to deal with binary data for some reason, I don't immediately see an issue.
> Such representation, which is by the way specified to be executable bidirectionally (roll back capabilities), is a full blown program
Of course this depends on the complexity of your problem, but I'd imagine this could be as simple as a few configuration flags for some problems. You have a function to execute the process that takes the configuration, and a function to roll back that takes the same configuration. This ties the representation very closely to the program itself, so it doesn't work if you want to be able to change the program and have previously generated "plans" continue to work.
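The "same config in, forward and backward functions" shape might look like this. The operation (creating and removing a directory) is a stand-in for whatever the real process does:

```python
# Sketch: an execute/rollback pair that both consume the same configuration.
# The directory operation here is a hypothetical stand-in for a real process.
import os
import tempfile

def execute(config: dict) -> None:
    """Apply the process described by the configuration."""
    os.makedirs(config["path"], exist_ok=True)

def rollback(config: dict) -> None:
    """Undo the process, driven by the same configuration."""
    os.rmdir(config["path"])

config = {"path": os.path.join(tempfile.mkdtemp(), "plan-demo")}
execute(config)
assert os.path.isdir(config["path"])
rollback(config)
assert not os.path.exists(config["path"])
```

This works cleanly only while the operation is self-inverse from the config alone; as the next comment notes, operations that overwrite prior state need more than the config to undo.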
> I'm curious what portability concerns you've run into with JSON serialization.
The hard part concerns instructions; it is not the technical implementation of serializing in-memory data structures into a serialization format (be it JSON or something bespoke) that is the root of the complexity.
> You have a function to execute the process that takes the configuration and a function to roll back that takes the same configuration.
Don't forget granularity and state tracking. The inverse of a seemingly simple operation like "set config option foo to bar" is not straightforward: you need to track the previous value. Does the dry run stop at computing the final value for foo and leave possible access-control issues to surface during the real run, or does it perform a "write nothing" operation to catch those?
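The previous-value problem can be made concrete with a small sketch (names are illustrative): the undo step must capture state at execute time, because the inverse of "set foo to bar" depends on what foo was, and on whether it existed at all.

```python
# Sketch: why "set foo to bar" has no config-only inverse. The undo record
# must capture the prior state at execute time.

def set_option(store: dict, key: str, value):
    """Apply the change and return an undo record holding the prior state."""
    undo = {"key": key, "had_key": key in store, "old": store.get(key)}
    store[key] = value
    return undo

def undo_set(store: dict, undo: dict):
    """Restore the previous value, or remove the key if it didn't exist."""
    if undo["had_key"]:
        store[undo["key"]] = undo["old"]
    else:
        del store[undo["key"]]

config = {"foo": "baz"}
undo = set_option(config, "foo", "bar")
assert config["foo"] == "bar"
undo_set(config, undo)
assert config == {"foo": "baz"}
```

A plan made only of forward configurations can't roll back; it has to be a log of undo records gathered during execution.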
> This does tie the representation very closely to the program itself so it doesn't work if you want to be able to change the program and have previously generated "plans" continue to work.
Why serialize then? Dump everything into one process space and call the native functions. Serialization implies strictly, out-of-band-controlled interfaces, which amounts to a fragile implementation of codegen + interpreter machinery.
Probably a much better idea to just go ahead and hit shutdown if you're on that screen anyway, since many phones are more susceptible to gear like Greykey or Cellebrite if they have ever been unlocked since the last power-on.
We wanted S2 to be one API. Started out with gRPC, added REST - then realized REST is what is absolutely essential and what most folks care about. gRPC did give us bi-directional streaming for append/read sessions, so we added that as an optional enhancement to the corresponding POST/GET data plane endpoints (the S2S "S2-Session" spec I linked to above). A nice side win is that the stream resource is known from the requested URL rather than having to wait for the first gRPC message.
The gRPC ecosystem is also not very uniform despite its popularity, comes with bloat, and is a bit of a mess in Python. I'm hoping QUIC enables a viable gRPC alternative to emerge.
I found at least one example[0] of authors claiming the reason for the hallucination was exactly this. That said, I do think for this kind of use, authors should go to the effort of verifying the correctness of the output. I also tend to agree with others who have commented that while a hallucinated citation or two may not be particularly egregious, it does raise concerns about what other errors may have been missed.
Note that the headline is from Langfuse, not ClickHouse. Reading the announcement from ClickHouse[0], the headline is "ClickHouse welcomes Langfuse: The future of open-source LLM observability". I think the Langfuse team is suggesting that they will be continuing to do the same work within ClickHouse, not that the entire ClickHouse organization has a goal of building the best LLM engineering platform.
I'm curious what definition the author is using of professionalism.