1. We would prefer to be funded with donations like Wikipedia.
2. I don't think we can avoid it completely, but perhaps volunteers could help us determine the trustworthiness of websites. Do you have any suggestions?
3. I think programmers and people with experience raising money for nonprofits could help the most right now. But if you see some other way you would want to contribute, please let us know!
Regarding raising money, I wouldn't be surprised if, given the current state of things in the EU, you could manage to get some funding. I have no experience with it, but there are companies that specialize in helping write grant proposals.
1. Yes, any structured data could definitely help improve the results, I personally like the Wikidata dataset. It's just a matter of time and resources :)
2. The first step will probably be to handle this in our "post processing". We query several servers when doing a search and often get many more results than we need, so in this step we could quite easily remove identical results.
3. The ranking is currently heavily based on links (same as Google), so we will have similar issues. But hopefully we will find some ways to better determine which sites are actually trustworthy, perhaps with more manually verified sites if enough people want to contribute.
4. I think Gigablast and Marginalia Search are really cool, and it's interesting to see how much can be done with a very small team.
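The dedup step in point 2 could be sketched roughly like this. This is only an illustration, not the project's actual code; the field names (`url`, `score`) and the merge-by-URL strategy are assumptions:

```python
# Hypothetical sketch of the "post processing" step described above:
# merge ranked results from several index servers, then drop identical
# results, keeping the best-scoring copy of each URL.

def merge_and_dedup(result_lists):
    """Merge result lists from several servers and deduplicate by URL."""
    best = {}
    for results in result_lists:
        for result in results:
            url = result["url"]
            # Keep only the highest-scoring occurrence of each URL.
            if url not in best or result["score"] > best[url]["score"]:
                best[url] = result
    # Re-rank the merged, deduplicated results by score.
    return sorted(best.values(), key=lambda r: r["score"], reverse=True)

server_a = [{"url": "https://example.org/a", "score": 0.9},
            {"url": "https://example.org/b", "score": 0.5}]
server_b = [{"url": "https://example.org/a", "score": 0.7},
            {"url": "https://example.org/c", "score": 0.6}]

merged = merge_and_dedup([server_a, server_b])
```

A real implementation would likely also catch near-duplicates (same content under different URLs), which needs content hashing or shingling rather than exact URL matching.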
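For point 3, "heavily based on links (same as Google)" presumably means something in the PageRank family. A minimal, self-contained sketch of that idea, with an invented three-page link graph and the conventional 0.85 damping factor (both assumptions for illustration):

```python
# Minimal PageRank-style sketch: a page's rank is the chance a random
# surfer lands on it, following links with probability `damping` and
# jumping to a random page otherwise.

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Base probability of arriving via a random jump.
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, targets in links.items():
            if targets:
                # A page shares its rank equally among its outlinks.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
        rank = new_rank
    return rank

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
```

Here "c" ends up ranked above "b" because both "a" and "b" link to it. The trust problem mentioned above is exactly that this rewards being linked to, not being trustworthy, which is why manually verified seed sites could help.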
> Yes, any structured data could definitely help improve the results
Which syntaxes and vocabularies do you prefer? Microformats, as well as schema.org vocabularies represented as Microdata or JSON-LD, seem to be the most common according to the latest Web Data Commons Extraction Report[0]. The report is also powered by the Common Crawl.
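Of those syntaxes, JSON-LD is probably the easiest to consume from a crawl. A hedged, stdlib-only sketch of pulling schema.org JSON-LD blocks out of a crawled page; a real pipeline (like the Web Data Commons extraction over Common Crawl) also handles Microdata, RDFa, and microformats, which this does not cover:

```python
# Extract <script type="application/ld+json"> blocks from an HTML page
# using only the standard library.
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and ("type", "application/ld+json") in attrs:
            self.in_jsonld = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_jsonld = False

    def handle_data(self, data):
        if self.in_jsonld:
            try:
                self.blocks.append(json.loads(data))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD is common in the wild

html = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article", "name": "Example"}
</script>
</head><body>Page body</body></html>"""

parser = JsonLdExtractor()
parser.feed(html)
```

After `feed()`, `parser.blocks` holds the parsed schema.org objects, which could then be attached to the indexed document for ranking or rich result display.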
At the moment we primarily need help with development and funding. But if you have suggestions or want to help in some other ways, please let us know!