Thank you. Honestly, working with GraphDB and SPARQL inspired me to create this. I did consider whether I could create a "real" knowledge graph, and even went as far as searching for an in-memory graph database that lives in the client browser (maybe one built on top of IndexedDB - I thought, hey, if in-memory RDBMSes like H2 exist, is there an in-memory graph database available? :D) so that I could query it using SPARQL, but couldn't find anything. I wanted to do this without any infrastructure while keeping bundle sizes low, but yes, the way you explained is how it should actually have been done.
I recommend implementing this in a 3D WebXR AR/VR experience for immersive navigation, or looking at your data through Flow Immersive. Seeing your data in free space around you is a great way to gain insights.
This is an incredibly useful Swiss-army-knife-like tool. I have similarly found rclone to be quite useful. I'm wondering if people know of more tools of a similar nature to rclone and/or benthos.
> it's pretty difficult to build One ELN to Rule Them All given how flexible many kinds of biological experimental designs are - especially when you're working on the bleeding edge.
RDF is quite flexible, and using a combination of domain-specific ontologies like cheminf [1] and top-level ontologies like BFO [2] should allow you to capture most of the semantics.
> Ok, fine. But I'm not sure how this helps if you have six different systems with six different definitions of a customer, and more importantly, different relationships between customers and other objects like orders or transactions or locations or communications.
If you have this problem, consider giving RDF a look - you can fairly easily use RDF-based technologies to map the data in these systems onto a common model. Some tools that may be useful here are https://www.w3.org/TR/r2rml/ and https://github.com/ontop/ontop - you can also use JSON-LD to convert most JSON data to RDF. For more info, ask in https://gitter.im/linkeddata/chat
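As a rough sketch of the JSON-LD route (the IRIs and values here are made up for illustration): adding an @context to plain JSON is often all that is needed to turn it into RDF.

```json
{
  "@context": {
    "name": "https://schema.org/name",
    "worksFor": "https://schema.org/worksFor"
  },
  "@id": "https://example.com/customer/123",
  "name": "Jane Doe",
  "worksFor": { "@id": "https://example.com/org/acme" }
}
```

Parsed as JSON-LD, this yields ordinary RDF triples that can then be merged with data mapped from the other systems.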
To pile on a bit here: JSON-LD is based on RDF, which is an abstract syntax for data as semantic triples (i.e. RDF statements). There is also RDF-star, currently in development, which extends this basic data model to allow statements about statements.
RDF has concrete syntaxes, one of them being JSON-LD, and it can be used to model relational databases fairly well with R2RML (https://www.w3.org/TR/r2rml/), which essentially turns relational databases into a concrete syntax for RDF.
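As a sketch of what an R2RML mapping looks like (the table and column names here are hypothetical), this maps rows of a CUSTOMER table onto schema.org descriptions:

```turtle
@prefix rr:     <http://www.w3.org/ns/r2rml#> .
@prefix schema: <https://schema.org/> .

<#CustomerMap> a rr:TriplesMap ;
    rr:logicalTable [ rr:tableName "CUSTOMER" ] ;
    # each row becomes one RDF subject, minted from its primary key
    rr:subjectMap [
        rr:template "http://example.com/customer/{ID}" ;
        rr:class schema:Person
    ] ;
    # each mapped column becomes a predicate-object pair
    rr:predicateObjectMap [
        rr:predicate schema:name ;
        rr:objectMap [ rr:column "NAME" ]
    ] .
```

With a mapping like this, the same relational data can be queried with SPARQL alongside data from other sources.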
schema.org is also based on RDF; it is essentially an ontology (one of many) that can be used for RDF and non-RDF data - but mainly because almost all data can be represented as RDF, so non-RDF data is just data that does not have a formal mapping to RDF yet.
Ontologies are a concept used frequently in RDF but rarely outside of it; they are quite important for federated or distributed knowledge, and for descriptions of entities. RDF focuses heavily on modelling properties instead of modelling objects, so that whenever a property occurs, it can be understood within the context of an ontology.
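For example, a minimal description in Turtle (the subject IRI here is illustrative) might be:

```turtle
@prefix schema: <https://schema.org/> .

<example:JohnSmith> a schema:Person ;
    schema:birthDate "2000-01-01"^^<http://www.w3.org/2001/XMLSchema#date> .
```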
This tells me that the entity identified by the IRI <example:JohnSmith> is a person and that their birth date is 2000-01-01. I don't, however, expect that I will get all other descriptions of this person at the same time - I won't necessarily get their <https://schema.org/nationality>, for example, even though this is a property of a <https://schema.org/Person> defined by schema.org.
I can also combine https://schema.org/ based descriptions with other descriptions, and these descriptions can be merged from multiple sources and then queried together using SPARQL.
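For instance (again with illustrative IRIs), a second source might contribute just:

```turtle
<example:JohnSmith> <https://schema.org/nationality> <http://www.wikidata.org/entity/Q30> .
```

After merging both graphs, a single SPARQL query can see the properties contributed by each source, without either source needing to be complete on its own.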
> If you want to 'protect' FOSS projects you care about, take some time to find out what help is useful to the maintainers and contribute towards items that make sense to you. Joining OSI won't help those struggling projects you gain from using.
Indeed, it is an incredibly rewarding experience. Take one thing you use and like, go to its issue backlog, and start fixing/improving things - if there is nothing, take the next thing; there are likely tens of things you rely on every day that need contributors and contributions. The first issue will be hard, the next one easier. You will be a happier person for doing it, you will make a bigger impact than by starting another project you won't finish and that nobody will use, and you will become a better engineer.
Conan can be made to do what Homebrew does with minimal effort. I have written some convenience wrappers around it which make it slightly easier to use for this use case; you can have a look here: https://gitlab.com/aucampia/proj/xonan
> I’ve been considering using it for some projects, but the main thing that’s keeping me away is the concern that some moderator will decide that my data doesn’t fit and remove it.
To me the greatest value of Wikidata was making me aware of RDF and SPARQL.
In most cases, if you are relying on the data for business needs, it would be best to maintain your own RDF dataset and host it either just over HTTP, or on something like https://dydra.com/.
Wikidata desperately needs RDF ingestion, and if this is made available (it can be done outside of Wikidata), then it would be easier to periodically sync datasets with Wikidata.
On that note, however, you could export all the Wikidata triples you need and host them on your own SPARQL server (e.g. Jena), or use them with RDF tools like rdflib.
RDF ingestion is problematic for Wikidata, because importing a dataset into Wikidata requires reconciling it against existing entities so as to avoid duplicate entries. The easiest way to achieve that is to publish your dataset online, create a linking Wikidata property for it, then ask for it to be imported in https://mix-n-match.toolforge.org where reconciliation can be done by the crowd.
Last I checked, mix-n-match was using CSV; while this is okay, it would still be nicer to have direct RDF ingestion. And yes, I realize why Wikidata does not have it, but it is not impossible to provide - just really difficult. I would work on it if I had more time, and likely will sometime in the future.
So for example, using turtle syntax [1], instead of
<https://engineering.zalando.com/posts/2022/04/functional-tes...> <http://example.com/graph-edge> <https://www.testcontainers.org/>
have
<https://engineering.zalando.com/posts/2022/04/functional-tes...> <http://purl.org/dc/terms/subject> <https://www.testcontainers.org/>
The semantics of http://purl.org/dc/terms/subject are given at the URL itself, but in brief:
> A topic of the resource.
> Recommended practice is to refer to the subject with a URI. If this is not possible or feasible, a literal value that identifies the subject may be provided. Both should preferably refer to a subject in a controlled vocabulary.
This would be similar to how wikidata expresses knowledge [2]:
<http://www.wikidata.org/entity/Q28315661> <http://www.wikidata.org/prop/direct/P921> <http://www.wikidata.org/entity/Q750997>
Or in English:
"Go To Statement Considered Harmful"(Q28315661)'s "main subject"(P921) is "goto"(Q750997)
This also makes it easier to query [4]; for example, you could get all articles covering "goto" with the following SPARQL [5] query:
SELECT ?item WHERE { ?item <http://www.wikidata.org/prop/direct/P921> <http://www.wikidata.org/entity/Q750997> }
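The same query can be written more readably with prefixes (wd: and wdt: are the conventional Wikidata prefixes for entities and direct properties):

```sparql
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd:  <http://www.wikidata.org/entity/>

SELECT ?item WHERE { ?item wdt:P921 wd:Q750997 }
```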
It may also help to read the RDF primer [3].
[1]: https://www.w3.org/TR/turtle/
[2]: https://www.wikidata.org/wiki/Q28315661
[3]: https://www.w3.org/TR/rdf11-primer/
[4]: https://w.wiki/5RW2
[5]: https://docs.stardog.com/tutorials/learn-sparql