Hacker News | reincoder's comments

I work for IPinfo. We do not provide reputation scoring, by the way. Reputation is such a subjective matter.

It would be easy for us to make some very quick sales if we started offering reputation scoring, but we, as a company, would rather support fraud detection, threat intelligence and bot detection services with raw data from us.

In fact, the 1400 servers we operate for internet measurement all have very sophisticated honeypots baked into them, but still, we have not productized that data. In our experience of the fast-moving world of IP addresses, reputation scoring, even with the best intentions, can introduce some downsides. We can do many things which will be better than most things out there, but we have to really balance the consequences of our product.


Thank you for your work and insights. I am a very satisfied paid user for many years. Keep up the good work!

Appreciate the balanced view as well.

Reputation scoring is a useless metric IMHO, exactly for the reasons you stated: risk appetite and risk model are generally different for everyone. We actually do have IP scoring built on the datapoints we have plus what the ipinfo API gives us. It is tuned to specific projects and practically useless for anyone else.

One practical point for OP to consider: providing this sort of service requires a lot of intelligence collected from many sources, which OP may not have at this point. Even 1400 servers probably cover a limited scope.


Yes, we do use user agents to separate requests to our websites and API service. However, we request users to switch to api.ipinfo.io (dedicated API infrastructure) and use ipinfo.io/json for an explicit request for the API endpoint.

There are indeed some limitations to this implementation. The primary one being IPv6 support. The implementation prioritizes convenience over internet limitations, requiring us to roll out IPv6 dual stacking on the web request level as opposed to the DNS level. On an API level, this results in users with IPv6 addresses making API requests to v6.ipinfo.io.

So, last year we rolled out dedicated API infrastructure for api.ipinfo.io


I work for IPinfo. Even though we are in the VPN detection business, I will give Mullvad the benefit of the doubt, to be honest. They were one of the three VPN providers we found that did not attempt to submit inaccurate geolocation information to IP geolocation providers like us. I am sure they will fix the issue.


Who else ?


Windscribe and iVPN.

https://ipinfo.io/vpnreport


> five providers offered locations labeled as “Bahamas”: [...]. For all of them, measured traffic was in the United States, usually with sub-millisecond RTT to US probes.

Foiled by light speed once again :). Interesting blog post, thanks for sharing.
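To put a rough number on the light-speed point, here is a minimal sketch. It assumes signals travel through fiber at about two thirds of the speed of light in vacuum, which is a common rule of thumb and not a figure from the report. The function name and the 1 ms example are illustrative.

```python
# Upper bound on how far away a server can be, given a measured round-trip time.
# Assumption: propagation speed in fiber is ~2/3 of c (rule of thumb).
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
FIBER_FACTOR = 2 / 3


def max_distance_km(rtt_ms: float) -> float:
    """Return the largest one-way distance (km) consistent with an RTT in ms."""
    one_way_seconds = (rtt_ms / 1000) / 2
    return one_way_seconds * SPEED_OF_LIGHT_KM_PER_S * FIBER_FACTOR


if __name__ == "__main__":
    # A 1 ms RTT caps the one-way distance at roughly 100 km.
    print(f"{max_distance_km(1.0):.0f} km")
```

Under this bound, a sub-millisecond RTT to a probe puts the server within roughly 100 km of that probe. Real paths are slower and less direct, so the true distance is usually well below the bound.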

Checking out Windscribe pricing just now, I get a Cloudflare captcha. Really nice of them to make vendor selection that much easier: only two contenders left!


Looks like Windscribe is also recommended by Kagi Specials


Why do so many VPNs submit inaccurate info? Are we talking about an intention to mislead, or is it more about just scrambling / obscuring location?


You have to ask them. I tried but did not get a clear answer.

We operate nearly 1,400 servers across almost 160 countries ourselves. From our perspective, it is VERY hard to maintain and expand a network infrastructure of this scale. When you start getting servers in West Africa, Northern Africa, the plains of North America, Oceania, or the Eastern Indian Ocean, you can expect to pay orders of magnitude more than for servers with equal performance in NYC or Amsterdam. Maintaining such a diversified network infrastructure is extremely challenging from a technical point of view. Then there is the official and bureaucratic process.

Now, we are just scratching the surface. VPNs require high volume traffic throughput. Some countries (entire countries) just do not have the capacity to offer that.

So, most of the time VPN companies tend to work with specialty VPN infrastructure companies. They provide everything from hosting to networking across the dozens of locations they operate in. I believe there are even white-label VPN companies that handle everything from infrastructure all the way to billing and even support. You just bring your branding. It can be argued that there is little incentive to go out there and do it all from scratch.

Is it intentional or just obscuring? From what we see, it leans intentional. The locations they report are not inaccurate by accident; they look quite deliberate. Legacy IP geolocation services rely on something called a geofeed. A geofeed is a self-reported, unverifiable report published by a network operator. Geofeeds are not widely adopted (1.5% of IPv4 and 0.70% of IPv6 allocated prefixes, 2023 data), but VPN providers maintain theirs diligently. They actively publish the locations they want IP geolocation providers to report.

One point raised by a journalist on the reporting side: imagine your VPN server points to one of the offshore islands in the Caribbean that sit outside US jurisdiction, only to find out the actual VPN server is in Miami. That is a bit risky.


I used the same methodology to observe AI crawlers. This is not an investigative blog but is rather designed to address our (IPinfo) customers who are asking us to identify IP addresses as "AI Agents" or, more accurately, "AI Crawlers".

https://community.ipinfo.io/t/can-we-detect-ai-agents-we-can...

Most AI crawlers self-identify with a UA. However, Grok uses resproxies and sends a high volume of simultaneous requests. Even though we can detect resproxies, it is not possible to map these resproxy IPs to grok.

I still could not figure out why I saw legitimate Googlebot IPs when I requested Perplexity to review the website. I verified those Googlebot IPs using both the UA and the IP address ranges published by Google.
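The two-part check described above (UA plus published IP ranges) can be sketched roughly like this. The prefixes below are sample values for illustration only and are not guaranteed to match Google's current published list. The function names are hypothetical.

```python
import ipaddress

# Illustrative sample only: real ranges must come from Google's published list.
SAMPLE_PREFIXES = ["66.249.64.0/27", "2001:4860:4801:10::/64"]


def ip_in_ranges(ip: str, prefixes: list[str]) -> bool:
    """Return True if ip falls inside any of the CIDR prefixes."""
    addr = ipaddress.ip_address(ip)
    for prefix in prefixes:
        network = ipaddress.ip_network(prefix)
        # Skip prefixes of the other address family (IPv4 vs IPv6).
        if addr.version == network.version and addr in network:
            return True
    return False


def looks_like_googlebot(ip: str, user_agent: str, prefixes: list[str]) -> bool:
    """Both checks must pass: the UA claims Googlebot and the IP is in range."""
    return "Googlebot" in user_agent and ip_in_ranges(ip, prefixes)
```

The UA alone proves nothing since anyone can send it, so the IP range check is the part that carries the weight.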


I work for IPinfo and we operate a distributed network consisting of around 1,400 servers. I think we have reached a point where it is extremely hard for us to purchase VPSes from interesting ASNs.

To support lots of ISPs, universities, and different organizations we have been asking them if they have an old laptop lying around that they can host our software on. Goal is to reach 70,000 probes within the next couple of years.

It is a simple probe software and we share some data or we can pay 20-30 bucks a month for it. We have a couple of NUCs in remote regions but no laptops yet. Basically, we are even happy if an ISP (or any one) hosts our software from a laptop dangling by a charging cable from a socket in some random corner.

We can send over a RPI or NUC, but with remote hands, and setup and all that it can get quite expensive. So, we always first ask if they have an old laptop lying around and can install our software there.

For us, at least, we are not interested in the hardware aspect. We are interested in the network. The old laptop approach only acts as a last resort. We will be more than happy to go with the predictability of a traditional VPS hosted in a traditional data center. Colocation, no matter what form it takes, involves a lot of moving parts.


  ---
  Edit to my parent company
  ---
I am scaling back this enthusiasm a bit. We need to work with mature organizations instead of individuals.

We need SSH through a public IP address and use Ubuntu as an operating system. Since we need to determine where these servers are located, we need to collaborate with universities, various organizations, IXP, DCs, telecom/ISP consulting companies with a mature understanding of network engineering.


Interesting challenge! My first thought: 70k probes is a lot, and having to set that up is quite a task. Why not develop a phone app with exit node capabilities (similar to Tailscale) so you can use that for probing? The real win is that people move around, giving you even more data points from other networks.


We actually have app-based data collection capabilities and initiatives. Our goal, or more appropriately, vision, is to map the internet in real time. This involves SSH access to devices to run different forms of measurements at a very high frequency and have control over those devices.

Managing 70k probes is not going to be super hard.

Managing 1,400 servers is just a normal business operation, not a technical challenge. Each probe has a standard OS-level configuration. Automation and configuration are deployed from a central system. Each probe is actively monitored and troubleshot. Data is dumped to a data warehouse. We make incremental improvements to our network. When servers go down, we talk to vendors.

We do a lot of novel engineering across the infrastructure, data, and research teams. Having an identical set of servers really allows us to focus on product and performance engineering, not troubleshooting engineering. With application-based probing, I assume it will complicate things quite a bit, as there are different operating systems, different devices, etc.

For us, lately the challenge is not technical. It has been exclusively procurement. This quarter (https://ipinfo.io/blog/probenet-q1-2026-expansion), we exclusively focused on regional diversity which involved outreach to national ISPs or telecoms. Securing servers from telecoms is an extremely bureaucratic and expensive process. So, we are hoping to partner up with eyeball networks and the larger NOG community.


I work for IPinfo. We provide a free country and ASN database on a free tier with an unlimited amount of requests. You can download the entire database or use the API services. For country and ASN, it is free.

However, we do not offer city level data for free.

> How would one even go about verifying it?

We believe we are the most accurate IP data provider out there, but you should come to that conclusion yourself.

I can tell you why our data is super accurate compared to the rest of the industry. The industry as a whole uses self-reported information that is offered by ASNs and ISPs. It is called a "geofeed". The issue with geofeeds is that IP geolocation providers do not tend to verify their accuracy. Many providers just aggregate these public records and repeat what the ISPs and ASNs want them to say. This is quite a bad practice.

So we built a network of distributed servers (currently 1360 servers across 160 countries) that run ping, traceroute and other internet measurements to infer the location of IP addresses. This means that when you ask how you can know we are accurate, we can share our active measurement data and tell you that this is the evidence.

Now comes the question of how you verify accuracy yourself.

First, if you have access to a large pool of known locations of IP addresses, you can run comparisons across different vendors. You need a GPS-backed device to locate IP addresses.

If you do not have a large pool of well-known location IPs, you can take a sample of IP addresses and check them yourself across multiple vendors. You can then use a tool like ping.sx or our own tool ipinfo.io/probenet/live to see evidence of where these IP addresses are located based on latency.

Do not bet on consensus among IP geolocation providers; run your own tests.
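As a rough sketch of what such a latency-based test could look like: given RTTs from probes at known locations, you can rule out a vendor's claimed coordinates when the RTT is too short for the distance. This is not IPinfo's method, only an illustration. It assumes about 200 km per millisecond one way in fiber, and the function names and data shapes are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Assumption: ~2/3 of c in fiber, i.e. about 200 km per millisecond one way.
KM_PER_MS_ONE_WAY = 200.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def claim_is_feasible(claimed, probes):
    """claimed: (lat, lon). probes: list of (lat, lon, rtt_ms).

    A claim is infeasible if any probe's RTT is too short for the distance.
    """
    for lat, lon, rtt_ms in probes:
        max_km = (rtt_ms / 2) * KM_PER_MS_ONE_WAY
        if haversine_km(lat, lon, claimed[0], claimed[1]) > max_km:
            return False
    return True


if __name__ == "__main__":
    # Made-up measurement: a probe near Amsterdam sees a 2 ms RTT to the target.
    probes = [(52.37, 4.90, 2.0)]
    print(claim_is_feasible((40.71, -74.01), probes))  # vendor claims New York
    print(claim_is_feasible((52.37, 4.90), probes))    # vendor claims Amsterdam
```

A check like this can only rule locations out. A claim that passes is merely consistent with the measurements, which is why more probes close to the target narrow things down.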

Our data was evaluated by peer-reviewed academic research. You can take a look at that as well, if you want.

> I am not really using this data for anything other than have enough data to troubleshoot customer support/fraud.

Now, I will be honest... you should not pay us anything. The way you have described your issue, it seems like the free services we already offer should satisfy your need.

Do you really need large scale IP address enrichment of all the IP addresses that visit your website? If yes, then for the first layer use our free data that provides ASN and country information.

Then, when you need troubleshooting with your customers, you can look up those individual IP addresses for free on our website, where we provide all our data for free access.

---

Let me know if you need any help, always happy to answer questions.


So many of my open questions answered in one answer. Thank you.

A follow up based on new information - if 'geofeed' identifies something with wrong geo location, and your method detects different geolocation, what do I see as the consumer consuming your API? I am assuming the inferred data, but that also feels counter-intuitive (since the data does not align with what ASN/ISP are reporting).

How often does your active measurement data disagree with geofeed data?

How do you handle mobile/cellular IPs

> Do you really need large scale IP address enrichment of all the IP addresses that visit your website? If yes, then for the first layer use our free data that provides ASN and country information.

If I am troubleshooting a support case that is days/weeks/months old, wouldn't this mean that enriching this information at a later date may give me different data than what it was associated with at the time the requests were made? My understanding was that IPs get re-assigned.

How frequently do IP-to-location mappings change in practice?

Do you offer historical IP data snapshots?


> I am assuming the inferred data, but that also feels counter-intuitive (since the data does not align with what ASN/ISP are reporting).

That is a very good question. Now, geofeed does not have a verification system. Active measurement is something we use to verify ASN or ISP itself.

Even active measurement has its own limitations. In those cases where we see active measurements not producing reliable data, we reach out to ISPs and ASNs to purchase a server in their facility. Geofeed as a system is voluntary, and most major ISPs do not maintain or even publish one. For example, today I found out that a major UK-based telecom geolocated 500k IP addresses in a town with 200k people. ISPs are not inherently incentivized to maintain the accuracy of their self-reported, voluntarily published location data. So, we do proactive outreach to purchase a server from them so we can provide consistently accurate data for their IP addresses.

On the matter of advertised locations not matching actual location, I highly recommend reading this: https://ipinfo.io/blog/vpn-location-mismatch-report

For residential ISPs, we do a lot of outreach and open communication to build a good partnership with them. The goal is that we pay for the privilege to report accurate data for them.

> How often does your active measurement data disagree with geofeed data?

Very frequently.

Here is a summary of the peer-reviewed research paper on this matter: https://community.ipinfo.io/t/ip-geolocation-and-geofeeds-wh...

Active Measurement (1,330 probes, 27.7M RTTs):

  - Country-level: 92.0% accurate → 8% wrong country
  - City-level: 79.6% accurate → 20.4% wrong city

Mobile Device GPS (169 devices, 24 countries):

  - Country-level: 84.5% accurate
  - City-level: 29.9% accurate → 70% wrong city

> How do you handle mobile/cellular IPs

Primarily through active measurement. We are also running a lot of research around more reliable mobile geolocation data.

Because our data is updated daily, I think due to the refresh rate we have an accuracy advantage.

> If I am troubleshooting a support case that is days/weeks/months old, wouldn't this mean that enriching this information at a later date may give me different data than what it was associated with at the time the requests were made? My understanding was that IPs get re-assigned.

You will be surprised to know that historical IP location does not have much demand.

If you are evaluating a support case after some time, you should work with your current data. If the customer raises a question, you address this in real time with their current IP address.

Usually, I do not recommend storing historic IP geolocation information. In most operations, the enrichment happens in real time within the day. Unless you want to do periodic reporting of some sort.

Internally, we of course have the data, but because our IP geolocation is so accurate, it currently sits at around 700 MB. If you add a historical layer to that data, it will be a terabyte of data. There is not much consumer need for it.

> How frequently do IP-to-location mappings change in practice?

https://ipinfo.io/blog/how-many-ips-change-geolocation-over-...

At the city level, it is 1.3% each day and 16% each month.

> Do you offer historical IP data snapshots?

I highly recommend that you work with current day's data.

In cases where we provide historical data, it is usually for academic research.

---

Let me know if you have any more questions.


> On the matter of advertised locations not matching actual location, I highly recommend reading this: https://ipinfo.io/blog/vpn-location-mismatch-report

Good read

Do you happen to know if anyone is compiling all of this data about VPNs into one place? It would be super interesting to know which VPNs are providing genuine services vs masquerading the locations. Maybe even an SEO for you.

> I highly recommend that you work with current day's data.

Just to clarify: You are suggesting that we don't pro-actively enrich every IP address, store IPs, and only enrich them when troubleshooting something?


> Do you happen to know if anyone is compiling all of this data about VPNs into one place? It would be super interesting to know which VPNs are providing genuine services vs masquerading the locations. Maybe even an SEO for you.

We made that report independently and, according to our analysis, we identified only three VPNs that do not have virtual VPN server locations: Windscribe, Mullvad, and iVPN.

> Just to clarify: You are suggesting that we don't pro-actively enrich every IP address, store IPs, and only enrich them when troubleshooting something?

I think you should experiment with this yourself a little. The Lite API is completely free, so you can try both enrichment at ingestion and enrichment after the fact. See what works best for you.
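One way to run that experiment is to store the ingestion-time result with a timestamp and later compare it against a fresh lookup for the same IPs. This is a minimal sketch: `lookup` stands in for whatever client you use, and the `country` and `asn` keys are assumed field names, not a confirmed response schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Enrichment:
    ip: str
    country: str
    asn: str
    enriched_at: datetime


def enrich(ip: str, lookup: Callable[[str], Dict[str, str]]) -> Enrichment:
    """Run the (hypothetical) lookup and stamp the result with the current time."""
    data = lookup(ip)
    return Enrichment(ip, data["country"], data["asn"], datetime.now(timezone.utc))


def drifted(stored: List[Enrichment], lookup: Callable[[str], Dict[str, str]]) -> List[str]:
    """Return IPs whose country or ASN differs from what was stored at ingestion."""
    changed = []
    for old in stored:
        new = lookup(old.ip)
        if (new["country"], new["asn"]) != (old.country, old.asn):
            changed.append(old.ip)
    return changed
```

If `drifted` stays close to empty over the time window your support cases usually span, enriching only at troubleshooting time is probably good enough for that use.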


Did a quick dive to explore viability of migrating to ipinfo. My idea was: use lite version for enriching everything and then use pay-as-you-go for enriching authenticated user sessions.

I couldn't get /lite/ to work. In a sample of IPs I've tried with, multiple are returning 404. Your website for the same IPs is returning information. Looks like these are just not included in the lite dataset?

Turns out there is no pay-as-you-go tier. Subscription is the only option. Not a deal breaker, but a disappointing setup.


> I couldn't get /lite/ to work.

Email me: [email protected]

I think there is an issue with setting up our API.


Just to close the (public) loop, the issue was that we were using the wrong API endpoint: ipinfo.io/lite instead of api.ipinfo.io/lite.

Thank you Abdullah
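For anyone hitting the same 404, a small helper that pins the host can avoid the mix-up. This is only a sketch: the `api.ipinfo.io/lite` host and path come from this thread, while the `/lite/<ip>` layout and the `token` query parameter are assumptions to check against the official docs.

```python
import ipaddress
from urllib.parse import urlencode

# Host taken from the thread; the path layout and "token" parameter are assumptions.
API_HOST = "api.ipinfo.io"


def lite_url(ip: str, token: str) -> str:
    """Build a Lite lookup URL, rejecting strings that are not IP addresses."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on invalid input
    return f"https://{API_HOST}/lite/{addr}?{urlencode({'token': token})}"
```

Validating the address locally first also helps tell a malformed input apart from a wrong endpoint when the response is an error.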


I tried using the /lite/ endpoint to get country data, but it is giving 404 errors for valid IPs.

{ "status": 404, "error": { "title": "Wrong ip", "message": "Please provide a valid IP address" } }


I work for IPinfo. The accuracy you see actually comes from inferred data. Our IP address location should not perfectly pinpoint anyone, unless that IP address is a data center of some sort. The highest accuracy for a non-data center IP address is usually at the ZIP code level. In terms of carrier IP addresses, currently we do one data update per day. If we did more, I guess the accuracy of mobile IP addresses would improve, but on an overall scale, the gain would be quite minuscule.

Our country-level data (which is free) is 10-15 times larger than the free/paid country-level data out there. We constantly hear that the size of the database is an issue. The size is a consequence of accuracy in the first place. So, it is a balancing act.


> Our IP address location should not perfectly pinpoint anyone, unless that IP address is a data center of some sort.

By perfectly, I meant it got my city and zip correct, but I looked up the lat/lng and it's a 5 min drive away. So pretty dang close!

Not sure how you got it that close if it's only supposed to point to the nearest data center.


I work for IPinfo. Has our data been inconsistent for you? We actually invest heavily and continuously in data accuracy. I think for hosting IP addresses we are nearing the highest level of accuracy possible, especially with data center addresses. We are investing in novel, cutting-edge research for carrier IP geolocation.

I am curious about your experience with us so far.


I work for IPinfo. We track close to a hundred resproxy providers. So, if OP's router is compromised, the device IPs will likely be flagged.

From what I know, whenever a router is backdoored or a resproxy SDK gains access to a device to use their bandwidth, the access to that pool of devices is often shared among multiple resproxy vendors. Many resproxy vendors do not have their own SDKs for their services.

Also, as far as I know, not many resproxy operators manage their sim farms or hardware pools. It is mostly based on compromised devices or SDK access.


This is called a geofeed. Companies that own or operate IP addresses can customarily share the location of those IP addresses. This is less "IP-based Geolocation" and more "Geolocation of IP addresses".
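For reference, a geofeed is just a CSV file. As I understand the format (RFC 8805), each line is `prefix,country,region,city,postal_code`, with `#` marking comments. Here is a minimal parsing sketch. The sample entries are made up and use documentation prefixes, and the names are hypothetical.

```python
import csv
import ipaddress
from typing import NamedTuple


class GeofeedEntry(NamedTuple):
    prefix: str
    country: str
    region: str
    city: str


def parse_geofeed(text: str) -> list[GeofeedEntry]:
    """Parse geofeed CSV lines, skipping blanks and '#' comments."""
    lines = [ln for ln in text.splitlines() if ln.strip() and not ln.lstrip().startswith("#")]
    entries = []
    for row in csv.reader(lines):
        row = [field.strip() for field in row] + [""] * 4  # pad short rows
        network = ipaddress.ip_network(row[0])  # validates the prefix
        entries.append(GeofeedEntry(str(network), row[1], row[2], row[3]))
    return entries


SAMPLE = """# made-up entries using documentation prefixes
192.0.2.0/24,US,US-FL,Miami,
2001:db8::/32,BS,,Nassau,
"""

if __name__ == "__main__":
    for entry in parse_geofeed(SAMPLE):
        print(entry)
```

Nothing in the file format checks that the stated city is where the prefix is actually used. The parser only confirms the prefix is syntactically valid.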

