Kubecost fully supports labels by default. Until recent versions, however, we relied on kube-state-metrics for that data, which requires users to whitelist the labels they want to see. If you upgrade to our latest version, I'd be surprised if this still isn't working for you. Once you've upgraded, if labels still aren't appearing you can reach out to [email protected], or to me personally at [email protected], and I'd be happy to help troubleshoot.
Each use case is different, but I think expiring the data after 60d in your monitoring stack makes sense from both a scaling and a privacy perspective.
I wouldn't necessarily put _user_ data in labels, but team/product names and contact info of the coworkers responsible for the service seem fine to me.
Stackwatch is the company behind Kubecost (https://github.com/kubecost/cost-model). We're building tools to help devops and finops work together to track, manage, and optimize containers in Kubernetes. We're a small team of 4 engineers and 1 salesperson, all remote, and looking to grow. We're backed by some great investors and actively selling an open-code product to enterprises today.
There are definitely ways to do this by configuring your scrape configs to ignore sets of nodes. Curious, though: when did you last try Kubecost out? We've built out some caching mechanisms in the product over the last month or so that should dramatically reduce load and memory consumption on Prometheus. If you reach out on our Slack (kubecost.slack.com), we can discuss expected Prometheus resource consumption in more detail.
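To make the "ignore sets of nodes" idea concrete, here's a rough Python model of what a Prometheus relabel rule with `action: drop` does to discovered targets. This is only a sketch of the matching logic, not Prometheus or Kubecost code. The `nodepool` label and the node names are made up for illustration; the one detail I'm relying on is that Prometheus anchors relabel regexes at both ends, which `re.fullmatch` mimics here.

```python
import re


def keep_targets(targets, source_label, drop_regex):
    """Return the targets whose `source_label` value does NOT fully match `drop_regex`.

    Mirrors a relabel rule with `action: drop`: a missing label is treated
    as the empty string, and the regex must match the whole value.
    """
    pattern = re.compile(drop_regex)
    kept = []
    for labels in targets:
        value = labels.get(source_label, "")
        if pattern.fullmatch(value):
            continue  # this target would never be scraped
        kept.append(labels)
    return kept


# Hypothetical discovered node targets.
targets = [
    {"instance": "node-a", "nodepool": "batch"},
    {"instance": "node-b", "nodepool": "web"},
    {"instance": "node-c", "nodepool": "batch-gpu"},
    {"instance": "node-d"},
]

kept = keep_targets(targets, "nodepool", "batch.*")
print([t["instance"] for t in kept])
```

In a real setup the equivalent rule would live under `relabel_configs` in the scrape job, so dropped nodes are never scraped at all, which is where the load savings come from.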
Hey, former Google Cloud SRE here. I now work on projects to help companies manage cloud spend. Prediction, and especially recommendations for lowering cloud spend, can get tricky, but I wrote a piece of software and some Grafana dashboards that you can deploy with one click to calculate cluster costs if you're using Kubernetes. Obviously this doesn't include, say, S3 storage costs per bucket, and it doesn't help much if you're not using Kubernetes, but this is just a first step.
More general solutions can be instrumented with CloudHealth (https://www.cloudhealthtech.com/), but that's a bit more of an enterprise-y solution.
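For a feel of what "calculate cluster costs" can mean at its simplest, here's a minimal sketch that prices each pod's resource requests and rolls them up by namespace. It is not the actual tool's logic. The per-core and per-GiB prices, the namespaces, and the 730-hours-per-month approximation are all assumptions picked for illustration.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # assumed average number of hours in a month


@dataclass
class PodRequest:
    namespace: str
    cpu_cores: float  # requested CPU cores
    ram_gib: float    # requested memory in GiB


def monthly_cost_by_namespace(pods, cpu_price_per_core_hour, ram_price_per_gib_hour):
    """Sum the request-based monthly cost of every pod, grouped by namespace."""
    totals = {}
    for pod in pods:
        hourly = (pod.cpu_cores * cpu_price_per_core_hour
                  + pod.ram_gib * ram_price_per_gib_hour)
        totals[pod.namespace] = totals.get(pod.namespace, 0.0) + hourly * HOURS_PER_MONTH
    return totals


# Hypothetical workloads and prices.
pods = [
    PodRequest("checkout", cpu_cores=2.0, ram_gib=4.0),
    PodRequest("checkout", cpu_cores=1.0, ram_gib=2.0),
    PodRequest("search", cpu_cores=4.0, ram_gib=8.0),
]
costs = monthly_cost_by_namespace(
    pods, cpu_price_per_core_hour=0.03, ram_price_per_gib_hour=0.004
)
for namespace, dollars in sorted(costs.items()):
    print(f"{namespace}: ${dollars:.2f}/month")
```

Pricing requests rather than measured usage is the simplest choice; a fuller model would also have to account for idle capacity, storage, and network.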
If you look at the average comp at FAANG or any well-paying public company, it's "only" 250k. Compare that with the average case at a startup, which is closer to 150k (the average options are worth 0).
Peerkit's caching layer hacks around the HTML5 localStorage limit by opening iFrames to multiple top-level domains. Agreed that it's a messy solution that introduces some overhead (loading iFrames), but it seems to work.
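The general idea of spreading a cache over several origins can be modeled outside the browser. Below is a small Python sketch, not Peerkit's code: each "origin" is a dict with its own quota, and a stable hash of the key decides which origin holds it. The origin names, the quota, and counting size in characters are all simplifying assumptions.

```python
import hashlib


def origin_for_key(key, origins):
    """Pick an origin for `key` with a stable hash, so lookups find the same shard."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return origins[int.from_bytes(digest[:8], "big") % len(origins)]


class ShardedStore:
    """Models several per-origin stores, each capped at `quota_chars` characters."""

    def __init__(self, origins, quota_chars):
        self.origins = list(origins)
        self.quota_chars = quota_chars
        self.shards = {origin: {} for origin in self.origins}

    def put(self, key, value):
        shard = self.shards[origin_for_key(key, self.origins)]
        # Size of everything in this shard except the entry being overwritten.
        used = sum(len(k) + len(v) for k, v in shard.items() if k != key)
        if used + len(key) + len(value) > self.quota_chars:
            return False  # this origin's quota would be exceeded
        shard[key] = value
        return True

    def get(self, key):
        return self.shards[origin_for_key(key, self.origins)].get(key)


# Hypothetical origins; in the browser each would be an iframe on its own domain.
store = ShardedStore(["https://a.example", "https://b.example"], quota_chars=50)
print(store.put("user:1", "alice"), store.get("user:1"))
```

Total capacity grows with the number of origins, which is the whole point of the trick. In the browser the cost is the iframe load plus a message round trip to the frame for every read and write.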