I'm working on JSONoid[0], a JSON schema discovery tool. JSONoid can discover many more features of JSON Schema than existing tools can, including regular expression patterns, formats, and dependencies. I'm also working on integrating it with some past work I've done on using LLMs to augment JSON Schemas[1,2].
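To give a rough idea of what "discovery" means here, below is a toy Python sketch that infers types, required keys, and a string `format` from a handful of flat documents. It is only an illustration under simplifying assumptions, not JSONoid's implementation or API: the names `json_type`, `FORMAT_CHECKS`, and `discover_schema` are made up for this example, the email check is a deliberately crude regex, and patterns and dependencies are not covered.

```python
import re
from datetime import date


def json_type(value):
    """Map a Python value to a JSON Schema type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def looks_like_date(text):
    """True if text is a valid YYYY-MM-DD calendar date."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        return False
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False


# Hypothetical format detectors, tried in order (illustrative only).
FORMAT_CHECKS = {
    "date": looks_like_date,
    "email": lambda s: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", s) is not None,
}


def discover_schema(docs):
    """Infer a simple JSON Schema from a list of flat JSON objects."""
    observed = {}  # key -> every value seen for that key
    for doc in docs:
        for key, value in doc.items():
            observed.setdefault(key, []).append(value)

    properties = {}
    for key, values in observed.items():
        types = sorted({json_type(v) for v in values})
        prop = {"type": types[0] if len(types) == 1 else types}
        # Only claim a format if every observed string satisfies it.
        if types == ["string"]:
            for name, check in FORMAT_CHECKS.items():
                if all(check(v) for v in values):
                    prop["format"] = name
                    break
        properties[key] = prop

    # A key is required only if it appeared in every document.
    required = sorted(k for k, v in observed.items() if len(v) == len(docs))
    return {"type": "object", "properties": properties, "required": required}
```

Calling `discover_schema` on a list of parsed documents returns a dictionary that can be serialized with `json.dumps`. Even this toy version shows why looking at a few documents isn't enough: a key is only reported as required, or a format only attached, if it holds across every document in the collection.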
There are a number of use cases for such a tool. One is helping data analysts who are handed a pile of JSON documents craft analytics pipelines more quickly and effectively when the data is heterogeneous and inspecting a few documents isn't sufficient. Another is helping to automate API specification generation and regression testing. I'm definitely interested in any feedback.
[0] https://github.com/dataunitylab/jsonoid-discovery/
[1] https://michael.mior.ca/blog/llms-for-schema-augmentation/
[2] https://arxiv.org/abs/2407.03286