In the US government's view, as expressed in its brief in the Supreme Court:
"Because of the authoritarian structures and laws of the PRC regime, Chinese companies lack meaningful independence from the PRC’s agenda and objectives. As a result, even putatively ‘private’ companies based in China do not operate with independence from the government. Indeed, “the PRC maintains a powerful Chinese Communist Party committee ‘embedded in ByteDance’ through which it can ‘exert its will on the company.’ ... the committee includes “at least 138 employees,” including ByteDance’s “chief editor”
...
"Even assuming that the law would recognize Zhang as a bona fide domiciliary of Singapore and not the PRC, ByteDance would nevertheless qualify as being “controlled by a foreign adversary” under one or more of the other statutory criteria. For instance, ByteDance is “headquartered in” China, which is sufficient on its own.... ByteDance also is “subject to the direction or control of ” Chinese persons domiciled in China (in particular, Chinese Communist Party officials), which likewise is sufficient on its own."
The saddest part of this, to me, was watching congressional representatives try and fail to wrestle with the Singapore question in hearings. They seemed to think they had some kind of gotcha, when in reality all they did was publicly demonstrate how little they grasp the actual national security threat at play.
So, what is the closest thing in the open source world to what the author describes? (Setting aside the question of whether it is right for you, which, of course, depends.)
Any OLAP database that accepts unstructured data can be used in this manner.
The ELK stack is a popular choice, albeit with a focus on search rather than OLAP.
If SaaS is an option, a simple starting point in AWS might be Data Firehose into S3 with Athena. Snowflake can load and query the data too. All of these tools have multiple frontend options, and cost tends to scale with user-friendliness.
I honestly just do this in PostgreSQL until my project outgrows it. Create a table with a JSONB column and as few indexes as possible to improve write throughput. Cover a timestamp column with a BRIN index to filter by date range.
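A minimal sketch of that PostgreSQL setup, written as Python constants so it can be handed to any driver. The table name `events`, the column names, the index name, and the helper functions are all hypothetical, and the `%(name)s` placeholders assume a driver with pyformat-style parameters (psycopg, for example). Connecting and executing is left out.

```python
import json
from datetime import datetime, timezone
from typing import Optional

# Hypothetical schema: one timestamp column plus a JSONB blob, no other indexes.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS events (
    ts      timestamptz NOT NULL DEFAULT now(),
    payload jsonb       NOT NULL
)
"""

# BRIN stays small and cheap to maintain when rows arrive roughly in time order.
CREATE_BRIN = """
CREATE INDEX IF NOT EXISTS events_ts_brin ON events USING brin (ts)
"""

INSERT_EVENT = """
INSERT INTO events (ts, payload) VALUES (%(ts)s, %(payload)s::jsonb)
"""

# Filter by date range first (served by the BRIN index), then group by a JSON key.
COUNT_BY_FIELD = """
SELECT payload ->> %(field)s::text AS field_value, count(*) AS n
FROM events
WHERE ts >= %(start)s AND ts < %(end)s
GROUP BY 1
ORDER BY 2 DESC
"""


def insert_params(event: dict, ts: Optional[datetime] = None) -> dict:
    """Build parameters for INSERT_EVENT from an arbitrary event dict."""
    return {"ts": ts or datetime.now(timezone.utc), "payload": json.dumps(event)}


def count_params(field: str, start: datetime, end: datetime) -> dict:
    """Build parameters for COUNT_BY_FIELD over the half-open range [start, end)."""
    return {"field": field, "start": start, "end": end}
```

With a driver cursor, usage would look roughly like `cur.execute(INSERT_EVENT, insert_params({"status": 500, "route": "/login"}))`. Passing the JSON key as a parameter rather than formatting it into the query keeps ad hoc group-bys safe from injection.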
Where I work we’ve set up the OpenTelemetry SDK in the applications to expose traces, logs and metrics.
We run Grafana Agent as the OTEL collector on the application hosts, with Grafana Tempo as the backend for traces, Loki for logs and Prometheus for metrics.
The cool thing about Tempo is that it generates metrics for ingested spans and their labels (spanmetrics), which lets us explore “unknown unknowns”, as the author calls them, in a very cost-efficient way.
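To make the spanmetrics idea concrete, here is a toy sketch of deriving metrics from spans by grouping on their labels. This only illustrates the concept and is not how Tempo implements it; the `Span` type, its fields, and `span_metrics` are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    # Hypothetical minimal span: just the labels and duration we aggregate on.
    service: str
    name: str
    status: str
    duration_ms: float


def span_metrics(spans):
    """Aggregate spans into call counts and total latency per label set."""
    metrics = defaultdict(lambda: {"calls": 0, "duration_ms_sum": 0.0})
    for span in spans:
        key = (span.service, span.name, span.status)
        metrics[key]["calls"] += 1
        metrics[key]["duration_ms_sum"] += span.duration_ms
    return dict(metrics)


spans = [
    Span("checkout", "POST /pay", "ok", 10.0),
    Span("checkout", "POST /pay", "ok", 30.0),
    Span("checkout", "POST /pay", "error", 500.0),
]
metrics = span_metrics(spans)
```

Because every label combination becomes its own series, a slow or failing combination shows up in the metrics without anyone having defined a counter for it ahead of time. The trade-off is that high-cardinality labels multiply the number of series.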
http://www.supremecourt.gov/DocketPDF/24/24-656/336144/20241...