
I have a couple of similar scrapers as well. One is a private repo in which I collect visa information from Wikipedia (for Visalogy.com), and GeoIP information from the MaxMind database (used with their permission).

https://github.com/Ayesh/Geo-IP-Database/

It downloads the repo, dumps the data split by the first 8 bytes of the IP address, and saves the results to individual JSON files. On every scraper run, it creates a new tag and pushes it as a package, so dependents can simply update through their dependency manager.
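The splitting step can be sketched roughly like this. The record format and file layout here are assumptions for illustration (the actual repo's schema may differ): bucket each network by the leading part of its address so a lookup only has to load one small JSON file.

```python
import json
from collections import defaultdict
from pathlib import Path

# Hypothetical input: (network, country) pairs, e.g. parsed from a
# MaxMind CSV export. The real repo's fields may differ.
records = [
    ("1.0.0.0/24", "AU"),
    ("1.1.1.0/24", "AU"),
    ("8.8.8.0/24", "US"),
]

out_dir = Path("data")
out_dir.mkdir(exist_ok=True)

# Bucket records by the first octet of the network address.
buckets = defaultdict(list)
for network, country in records:
    first_octet = network.split(".", 1)[0]
    buckets[first_octet].append({"network": network, "country": country})

# One JSON file per bucket, so consumers fetch only what they need.
for octet, entries in buckets.items():
    (out_dir / f"{octet}.json").write_text(json.dumps(entries, indent=2))
```

The tag-and-release step on top of this would just be ordinary git commands in the scraper's CI job.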


