A warning for people who, like me, long ago disabled telemetry in Visual Studio Code: Microsoft introduced a new setting that deprecates the two older settings... and the new setting defaults to "all"[1]. I just noticed this now after double-checking my settings. Sneaky, sneaky, Microsoft (but still totally expected).
If the message is to be believed (and the source code does agree: https://github.com/microsoft/vscode/blob/d32b92bd7a49ce8667b...), the old settings still apply: the new one may default to 'all', but the old settings override it, so they've avoided turning telemetry back on for those who had disabled it. In fact, if you hadn't updated your settings and for some reason had only turned off one of the telemetry options, they have actually reduced the amount of telemetry they collect from you.
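For reference, the relevant entries look roughly like this in settings.json (a sketch; per the precedence described above, the older booleans, when present, win over the newer unified setting):

```jsonc
{
  // Older, now-deprecated settings. If set, these still take precedence:
  "telemetry.enableTelemetry": false,
  "telemetry.enableCrashReporter": false,

  // Newer unified setting. Defaults to "all" when left unset:
  "telemetry.telemetryLevel": "off"
}
```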
You gotta love how the documented function return value (TelemetryConfiguration.OFF) doesn't match the actual return value (TelemetryConfiguration.NONE).
The end of the article mentions that VS Codium, while having less telemetry, still calls out to Microsoft services.
They try to remove all telemetry; read more about that here.
...
However, VSCodium can’t shut out all the data collection as it is the same codebase. And since extensions act independently with regard to data collection, you still need to be mindful of what extensions you install.
Sure, but in my opinion at least, browser extensions are minor niceties that aren't worth installing, whereas VS Code extensions are integral to the use of VS Code.
VS Code as an IDE is quite useless without its extensions. And that includes official(?) extensions from Microsoft that might also have other policies regarding tracking.
I think it's great that open license terms enable these sorts of forks and experiments so that the community can work together by sending pull requests.
Is there a good way to install code-server (hosted VSCode/vscodium in a browser tab) plugins from openvsx?
(E.g. the ml-workspace and ml-hub containers include code-server and SSH, which should be remotely-usable from vscodium?)
> Even though we do not pass the telemetry build flags (and go out of our way to cripple the baked-in telemetry), Microsoft will still track usage by default.
Just as showing ads to people willing to pay not to see ads is supposedly the most lucrative thing humans can envision (not sure how royally pissing off your users is profitable, but hey, that's me), it seems that tracking people who don't want to be tracked reveals the most precious data imaginable.
Furthermore, by the time humans react and disable the telemetry, things like system information and non-real-time telemetry have already been uploaded to Microsoft.
The claim that users are given control over their privacy needs to be scrutinized more.
Telemetry is a godsend for developers, since it lets them track down actual product usage and issues.
Telemetry is a privacy nightmare for users, since it sends data to the developers outside of an average user's control - data which is probably easily associated with an individual and kept for eons (disk space is cheap after all).
I think telemetry collection is starting to become a professional ethics issue for developers, and should be talked about more.
> Telemetry is a godsend for developers, since it lets them track down actual product usage
This is actually a terrible trap, as product usage tells you next to nothing about which features are useful. I suspect it's a big part of why a lot of software has turned to shit over the last decade or so.
Look at it this way: your average fire extinguisher sits mounted on a wall its entire life and is never actually used. You'd still rather have it and not need it than need it and not have it.
> You'd still rather have it and not need it than need it and not have it.
And there's the trap. Data is a liability, but until a company gets burned for losing it, they won't do anything about it. Well, being real, even if they do get burned it's just the "cost of doing business".
I get what you're saying and I agree but I also think you misinterpreted the parent comment.
> You'd still rather have it and not need it than need it and not have it.
Given the context of the fire extinguisher, "it" in this case is a useful (possibly critical), not-commonly-needed feature, rather than a collection of data.
Why are we just focused on the "stupid" developer here - the sort who thinks your later "wow the 'close popup' button is so popular" example is a good reason to add more popups?
Do we think the people making calls like that would suddenly have great ideas in the absence of data?
Meanwhile, there are plenty of folks out there who can reason about data in less blindly idiotic ways.
Let's take a different example. Let's say you have a search box.
Option A generates 10 times more searches than option B. This may mean the search box is 10 times more useful than in the B case, or the users require 10 times as many searches to find what they want. Looking at the data, you really can't tell. Even without the data, you can dogfood your application and fairly quickly tell whether the feature is good or bad.
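A toy illustration of that ambiguity (all numbers invented): two populations can produce identical search totals while telling opposite stories, and the disambiguating signal is exactly the thing raw usage counts don't capture.

```typescript
interface User {
  searches: number;  // searches issued
  successes: number; // searches that found what the user wanted
}

// Ten productive users vs. ten struggling users: same total search volume.
const productive: User[] = Array.from({ length: 10 }, () => ({ searches: 10, successes: 10 }));
const struggling: User[] = Array.from({ length: 10 }, () => ({ searches: 10, successes: 1 }));

const totalSearches = (us: User[]) => us.reduce((s, u) => s + u.searches, 0);
const searchesPerSuccess = (us: User[]) =>
  totalSearches(us) / us.reduce((s, u) => s + u.successes, 0);

console.log(totalSearches(productive), totalSearches(struggling));         // 100 100
console.log(searchesPerSuccess(productive), searchesPerSuccess(struggling)); // 1 10
```

The headline metric is identical in both cases; only the per-success ratio separates them, and "success" is rarely something telemetry can observe directly.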
I'm not saying anyone is stupid. Naive, perhaps, but not stupid.
The problem is that interpreting data is difficult. Incredibly difficult. Scientists, who construct experiments and interpret experimental data for a living, with a decade of education behind them, get this wrong all the time.
> Do we think the people making calls like that would suddenly have great ideas in the absence of data?
Personally, I would be so bold as to suggest that yes, the availability of data to people ill-equipped to interpret it can lead to WORSE decisions than the absence of data.
That is because statistics, and the general skill of correctly interpreting data, are actively counterintuitive. Untrained people are more likely to reach wrong conclusions than right ones.
So yes, data in the wrong hands can and will be harmful.
I'm not sure software is getting worse. It does more, and there is more. A lot of complaints on HN are about businesses they don't like, or technology they thought should never have been invented. It's unfortunate that tracking you across the web is hugely profitable, but the reason that business works isn't due to software malfunction. The software is working great.
Certainly, there is a lot of software out there with poor polish. My pet peeve: devices like multimeters used to have segmented VFDs, so they had a certain design aesthetic by default. Clean, straightforward, readable. Now those have moved to low-resolution LCDs, so you notice that the font they picked is terrible, and the font rendering engine they picked is even worse. It just looks bad, in exchange for more flexibility and features.
But none of this is indicative that the software itself is getting worse. In the 80s, trivial software bugs were literally killing people (https://en.wikipedia.org/wiki/Therac-25). In terms of radiation therapy machines, the underlying business hasn't changed much; people with tumors want them killed, and the difference between now and 40 years ago is that the machine doesn't kill the patient because of integer overflows.
(One interesting note about the Therac-25 is that it reused software for previous models that had hardware interlocks that masked the software defects. They didn't have any telemetry on when the hardware interlocks triggered, so they didn't know that the hardware interlocks were masking bugs in the software when it came time to remove them. If they had the data, they might have kept the hardware interlocks, or fixed the bugs before 6 people died. So maybe lack of telemetry is a greater ethics issue than including it!)
I don't know if that's an interesting data point. Humans are terrible at driving cars. AIs are terrible at driving cars. I'm not sure that Tesla is obviously doing a sub-par job here, as it's a challenging problem space.
The main effect of the telemetry in self-driving cars is that we can all Monday morning quarterback the exact defect that caused a particular crash. With human drivers, we don't have as much data to get to the root cause. If someone falls asleep at the wheel and drives into a truck, was it their medication? Their sleep schedule? An emotional conversation they just had? The music that the radio station chose to play? A combination of all of those factors? What actions do we take to prevent it from happening again?
> I don't know if that's an interesting data point.
And people are prone to putting the wrong coordinates or dosages into medical machines. It's as interesting a data point as the Therac.
> The main effect of the telemetry in self-driving cars is that we can all Monday morning quarterback the exact defect that caused a particular crash.
Unnecessary, and has been unnecessary for almost as long as we've had cars. Investigators (the NTSB) are fantastic at looking at wrecks and determining the underlying cause, even in the absence of the black and red boxes. There's no need for always-on-always-phoning-home telemetry.
And this is where I get annoyed - there's no value in companies having that data when their most public use of it is to get popular opinion on their side in a fatal wreck.
> This is actually a terrible trap, as product usage tells you next to nothing about which features are useful. I suspect it's a big part of why a lot of software has turned to shit over the last decade or so.
This seems exactly backwards to me. No one who has ever managed a deployed product with good failure telemetry would say that it tells you "next to nothing". Telemetry detects failures that mere testing simply never can and never will.
And no one who lived through earlier ages of consumer software would ever say that it's "turned to shit". Software is getting better. It's getting better steadily and inexorably, and it's been doing that for half a century. It may not be doing what you want, but it's doing it with higher quality than people in the 90's or 00's would have ever dreamed. When was the last time[1] you saw a major consumer application (the TikTok or Instagram client, or your web browser, or VSCode) crash hard and fail on you? This was a daily (or worse) event for most of software history!
And certainly robust telemetry practice is one of the big drivers of this quality revolution we've seen.
[1] Heh, cue the peanut gallery of everyone wanting to post crash reports. That's on me for poor framing, I guess. But seriously, everyone: Both anecdotally and via actual studies, consumer-visible software is getting much better.
> When was the last time you saw a major consumer application (the TikTok or Instagram client, or your web browser, or VSCode) crash hard and fail on you? This was a daily (or worse) event for most of software history!
Huh. I've used software since the mid 1990s, and I can't recall using software that crashed daily, let alone frequently. Which is honestly a bit of a miracle, since a lot of software back then was shipped without the ability to patch it at all.
> Telemetry detects failures that mere testing simply never can and never will.
Product usage and failure reports are different. It wasn't called telemetry when apps just asked to send failure reports.
> When was the last time you saw a major consumer application (the TikTok or Instagram client, or your web browser, or VSCode) crash hard and fail on you?
Google search on Chrome on my home computer has a habit of just returning a white page after about a day of computer uptime. I have to switch to a different profile to get it to show anything.
Magento CMS will silently hang if you try to perform any query on a tab that's logged out. Default login cookie expiration time is one hour.
iOS keyboard has started suggesting German words in autocorrect. A third of the words I get on swipe input are now worthless. German is not in the languages list on that device. I read a suggestion to uncheck "German" in the dictionaries list. It was indeed checked (why?? Because I frequently type in my German last name?) but unchecking it did nothing.
Discord app on iOS has a hard crash bug on the emote picker if you type a query that has no results, backspace it, then try to pick one of the new results. I actually tried reporting that one and they told me to fuck off. (Which is the default response of every customer service team to any bug report, of course, since 99% of bug reports will be from civilians who have no idea how to report a bug)
Oh, another iOS keyboard annoyance: if you do a swipe input and the keyboard guesses that you were trying to type an all-caps word, it will set caps lock, and then remember that setting if you backspace it! Do a keyboard swipe, get "SOS", backspace it, try to swipe "the", get "The" instead, even if it's in the middle of a sentence, which the iOS keyboard is otherwise really good at.
When was the last time[1] you saw a major consumer application [] crash hard and fail on you? This was a daily (or worse) event for most of software history!
The apps I use are now riddled with hang bugs rather than crash bugs. Huzzah.
> When was the last time you saw a major consumer application (the TikTok or Instagram client, or your web browser, or VSCode) crash hard and fail on you?
Yesterday. The kindle app. It has a habit of soft locking on a book's cover. TikTok's failure mode tends to be more of a "refuses to play a video" kind. Especially if you're scrolling through videos and one does not exist. It doesn't like that. Instagram soft reboots pretty often (you can tell because you're back to the home feed, as opposed to where you left off), especially after being suspended.
I have to reboot my Mac weekly, both for updates and to keep it from misbehaving oddly.
It's one thing our development culture actually fosters these days: let it fail and restart in a known state, as opposed to doing what you can to keep the software alive through non-fatal errors.
If your application keeps opening popups all the time, I will use the close popup button very often. Doesn't mean I enjoy closing your popups or want to have to close them. The only reason I'm using the feature is to get around an annoyance in your hypothetical application.
Frequent use of a feature is just as much a signal that the feature is an obstacle as it is a signal that the feature is beneficial.
Exactly. If I use feature X very often, that can be because:
* Feature X is good and I like to use it.
* Or feature X is broken so I have to try multiple times.
* Or feature Y is broken so I have to try using another feature instead.
Like, most of the times I open a settings window it's not because I like settings but because the program is not good enough as it is. Or if I follow a lot of links on your web page, it might not be because I like your web page but because I'm unable to find what I'm looking for.
I do not understand this. Because this mythical moronic figure everyone in this thread is talking about doesn't account for confounding variables, telemetry is just bad generally?
This is not a nuanced argument or a complex issue. Include an OFF switch. It's that simple.
Personally I refuse to work with or consult for companies that don't get this. It's a matter of professional pride. If I wouldn't use it myself, I'm not going to foist it on my users.
One of the challenges is defining what counts as telemetry.
So they include an OFF switch. Does it also switch off checking for new versions of VSCode? Does it turn off consulting online sources for dictionary updates? Does it turn off viewing the extension library? Those requests are technically also telemetry but if you disable them they look like broken features. The license agreement is written with open language to say "There are some things you can't turn off" because MS doesn't know a priori whether a court would consider "updating the extension library" telemetry until someone drags them into court and presses the issue legally, and they want their asses covered either way.
And the OFF switch doesn't switch off any extensions because the API is open enough that extensions can do their own network access independent of VSCode.
In practice, it turns out to be very hard to build one guaranteed-to-work OFF switch. Not that it isn't a laudable goal. But in general: if it has online features, assume telemetry == true.
Those examples are not telemetry because they only pull data from internet, without sending any specific information. You could check for new versions of VSCode or browse extensions library using just a browser.
Since this is a developer tool, I think it is absolutely possible for Microsoft and extension developers to write a deep and detailed explanation for users of why they need to collect which telemetry data.
As a user and engineer, I am more than happy to share usage and crash data with them, but at the same time, I would appreciate it if they were fair to us users.
Not hardly. In the early days of the Internet it was seen as tit-for-tat among peers: you're accessing my server for data I have, so of course I'm allowed to log and analyze your request. That was the exchange.
The exchange was rarely made explicit in any kind of formalized consent sense.
I spend most of my time in Google Docs as the editor I'm typing in.
And when I'm coding in emacs, I'm using Tramp to puppeteer a server somewhere else.
The veil is thin between the internet and offline computing these days.
> Telemetry is a godsend for developers, since it lets them track down actual product usage and issues.
Extremely true. Data gleaned from actual usage instrumentation is of radically higher quality than self-reported data.
> Telemetry is a privacy nightmare for users, since it sends data to the developers outside of an average user's control - data which is probably easily associated with an individual and kept for eons
None of that is usually true. In the common case, users don't care (after all, MS has the telemetry to estimate how many users disable telemetry). And product improvement data is pseudonymized and usually worthless after a version iteration.
I think the folks that have an anxiety about product use data collection don't realize how worthless individual data points are. The data is pseudonymized out of necessity if for no other reason; it comes in as such a firehose that it has to be immediately analytically bucketed or it overwhelms storage at the volume these tools get used. However, the telemetry options get defaulted to "on" because volumes of data are invaluable for concrete analysis of how the product is used, counts on feature access frequency, understanding pain points in usage flow, etc.
I've done some work in this space (not for MS) and am happy to answer any questions people might have that aren't "Who did you work for?"
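A toy sketch of what that immediate analytical bucketing might look like (all names and shapes here are illustrative, not any vendor's actual pipeline): raw events are folded into aggregates on arrival and then discarded, so individual data points never accumulate.

```typescript
interface TelemetryEvent {
  name: string;           // e.g. "search.open"
  pseudonymousId: string; // a rotated/hashed id, never a raw user id
}

const accessCounts = new Map<string, number>();       // feature -> hit count
const distinctUsers = new Map<string, Set<string>>(); // feature -> distinct ids

function ingest(e: TelemetryEvent): void {
  accessCounts.set(e.name, (accessCounts.get(e.name) ?? 0) + 1);
  let ids = distinctUsers.get(e.name);
  if (!ids) distinctUsers.set(e.name, (ids = new Set()));
  ids.add(e.pseudonymousId);
  // The raw event is dropped here; only the aggregates survive.
}

ingest({ name: "search.open", pseudonymousId: "a1" });
ingest({ name: "search.open", pseudonymousId: "a1" });
ingest({ name: "search.open", pseudonymousId: "b2" });
// accessCounts.get("search.open") === 3; distinct users for it === 2
```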
> Telemetry is a godsend for developers, since it lets them track down actual product usage
I have come to think that telemetry used to track actual product usage is a bad thing. It encourages shallow thinking about how users really use the product, and often leads to terrible product decisions.
I like them. I don't really understand how it would apply to telemetry - it's not something I've read into - and I don't think many other folks are looking into GDPR compliance with telemetry either.
All of the big companies have armies of lawyers deeply familiar with these laws and regulations. They make their engineering teams implement processes to mitigate the issues with storage and processing.
Examples: replacing/removing external IDs, ensuring data can't be linked to users, aggregating data, not retaining data, only processing data within data centers (no data extracted to laptops), deleting old data, etc.
This is why it's slow and painful to work at big companies compared to startups. It doesn't mean that accidents don't happen, but it's pretty well thought out. The same applies to how they do qualitative research. It's got reasonably high standards which are fairly consistently applied. It's not perfect, but nothing is. There is a lot of job security and motivation for people to make it better.
Smaller companies have almost zero understanding of how to deal with this due to lack of resources and expertise. The only way they get compliant is by using good tools which force the user into a safer, more compliant posture.
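One of the mitigations listed above, replacing external IDs, is often done with salted hashing where the salt is rotated on a schedule, so the same user cannot be linked across retention windows. A minimal sketch (all names illustrative):

```typescript
import { createHash, randomBytes } from "crypto";

let salt = randomBytes(16); // rotated on a schedule, e.g. monthly

export function rotateSalt(): void {
  salt = randomBytes(16); // old pseudonyms become unlinkable from here on
}

export function pseudonymize(userId: string): string {
  // Same user + same salt -> same pseudonym; different salt -> no linkage.
  return createHash("sha256").update(salt).update(userId).digest("hex");
}

const before = pseudonymize("user-42");
rotateSalt();
const after = pseudonymize("user-42");
// before !== after: rotation breaks linkage across retention windows
```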
I am one of those who began coding in Notepad and then quickly moved on to Vim and then to Emacs.
When the whole world of internet services began consolidating around the web and then later when the whole web began consolidating around a few powerful walled gardens, I saw how these trends began chipping away at our anonymity, privacy, web performance and software performance. Computers are ubiquitous but privacy is under attack. Hardware is faster but software is slower. Internet pipes are faster but websites are sluggish.
In all this turbulence I thought at least my software development tools are not affected by these terrible user-hostile trends. Emacs has become faster with faster hardware. While Vim was always fast, I am sure the Vim folks too would agree that they can now run Vim with a lot of plugins thanks to faster hardware. These editors do not do hidden telemetry. They don't add user hostile features. The core editor experience remains more or less the same year after year.
Even after all the disruption (I mean literally disruption, not in some positive metaphorical way) in the rest of the tech world, I have found consolation in my modest code editor, Emacs for me, Vim for some, other editors for other people. My code editor has always served my best interests. It helps me write code and documentation without any distractions. But when I read articles like this about VSCode and its telemetry, it really makes me anxious. Perhaps Emacs or Vim will never be afflicted by issues like this. But still... if developer tools meant for ordinary software developers like me are going to start snooping on my data and bundling user-hostile features, what hope is there for every other kind of software tool!
I agree with you, and I'm a long-time vimmer who moved to Emacs full-time last December, but... It's not that simple. If an entire development team is using VSCode, or you're doing development inside of docker containers, it can be hard to stick with Emacs, (neo)vim, etc.
I struggled to make the change. I think I tried half a dozen times to go from (neo)vim to Emacs and it never stuck. My problem was that I kept reaching for spacemacs and Doom Emacs, etc., right out of the gate, and I would be mystified by Emacs itself and Emacs Lisp as a result.
Two things helped get me into Emacs full-time (and this is after > 15 years of using vim):
1. I went step-by-step through Susam's Emfy Emacs config [0]. That helped me understand some of the basics at a foundational level. I extended that base configuration a little bit and became comfortable with the environment.
2. I then went step-by-step through the entire "Emacs from Scratch" playlist that System Crafters put out [1]. I pushed my personal configuration pretty far with that over the course of 2-3 months.
I eventually moved to Doom Emacs and married in pieces of my own configuration. That's been my daily driver for months now.
What a gross oversight on their part. Not only do they build a product that, as pointed out by the post, sends an excessive amount of data, but they also allow anyone to do the same.
> Why have licenses like these if they are just concerned with product improvement?
That nails it for me. I have no reason whatsoever to use vscode.
What's the alternative? Blocking extensions from accessing the Internet? How are you even supposed to do that given extensions already have (completely justified) access to the filesystem and can invoke executables?
There certainly is a valid argument for Internet access, for example for documentation lookup, schema validation, database editing, etc.
There would have to be a strict, and yet fallible, vetting process on the extension marketplace to make sure every extension complies with whatever rules are defined.
Expose the setting to extensions (if it isn't already), and set a policy in their extension catalog that extension telemetry must honor the global setting. It isn't perfect, but it's better than nothing.
You have a point. There is no way to block access to resources given the nature of the product, but they could, at least, make more effort to create a better community/environment around the tooling (regarding data handling).
> extensions may separately send telemetry, that do not have to adhere to the main VSCode telemetry configuration settings.
From what I see, those are things that should never happen and should result in a ban from the marketplace.
Like you mention, at the end of the day that is not enforceable, but a more transparent data privacy policy would show some good faith on their side.
That is fine from my point of view. You, VS Code, and the extension have an agreement on how the data is handled, where it is sent, and how it is processed. It becomes a problem when VS Code or an extension ignores the no-telemetry settings and keeps sending data, and we don't have much of a clue how it is going to be stored/processed/used.
VS Code extensions are open source, so the community should report extensions behaving badly and VS Code should ban them. That's the same way all plugin marketplaces work; there's nothing bad about what VS Code is doing with their extension setup.
That too seems to be only true in most cases but not all. How could I inspect the code of the Adobe extension linked above before installing it in Visual Studio Code for example? There doesn't seem to be any links to the actual code for it.
Yeah, that's what I'm trying to figure out how to do. The website I linked before doesn't seem to have any links to download it, so how do I actually do that? What's the link for downloading it?
From the 2 marketplace links you linked above, click "Download Extension" on the right side of the page. Rename that file from .vsix to .zip then extract the zip file to get a folder containing the source files for the extension. It's not as readable as if they published open source on GitHub, but you can still see what the extension is doing.
The extension author docs [1] say that extensions _should_ obey the global setting, even if using their own telemetry libraries:
> Extension authors who wish not to use Application Insights can utilize their own custom solution to send telemetry. In this case, it is still required that extension authors respect the user's choice by utilizing the isTelemetryEnabled and onDidChangeTelemetryEnabled API.
I suppose the quote in the article is technically correct, because there's no guarantee that extensions _will_ follow this. I'm curious whether I could report an extension for abuse and have it removed if it doesn't honor the global setting.
The article also says
> Microsoft’s C# extension (ms-vscode.csharp) sends data to Microsoft. There does not appear to be any setting offered by the extension to turn telemetry off.
I unzipped the extension and looked at the package.json, and it appears to use Microsoft's recommended extension-telemetry library, so I presume it is following the global setting.
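The `isTelemetryEnabled` / `onDidChangeTelemetryEnabled` pattern from the docs boils down to a gate like the following sketch. The `vscode` API is replaced here by a plain predicate so the logic stands alone; in a real extension the predicate would be `() => vscode.env.isTelemetryEnabled`, kept current by a `vscode.env.onDidChangeTelemetryEnabled` listener, and the reporter would be whatever telemetry library the extension uses.

```typescript
type Reporter = (eventName: string) => void;

// Wrap a reporter so events are dropped whenever telemetry is disabled.
function gateTelemetry(isEnabled: () => boolean, report: Reporter): Reporter {
  return (eventName) => {
    if (isEnabled()) report(eventName);
  };
}

// Simulate a user toggling telemetry.telemetryLevel:
let enabled = true;
const sent: string[] = [];
const send = gateTelemetry(() => enabled, (name) => sent.push(name));

send("extension.activated"); // recorded
enabled = false;             // user switches telemetry off
send("file.opened");         // silently dropped
// sent === ["extension.activated"]
```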
I wish Microsoft required extensions to publish detailed telemetry info (or, really, info on any and all external connections an extension might make) on their Marketplace page.
I can't take a "maybe it does what it says" to my boss and expect approval to use that extension. It might be different if the license didn't consent to telemetry, but as it stands, it means they can't even sue if the extension leaks private or secure information.
IMO it's because they've made very smart pivots / rebranding moves (VSCode, GitHub, NPM, even the Surfaces), and even people who know better start thinking "new Microsoft".
They were losing hearts and minds in the Web dev space by waging war against open source. It makes far more sense to co-opt it, as per the Halloween-document strategy.
TBH I saw this coming 20 or so years ago, about the time when IBM started putting on its open-source cheerleader outfit. Open source is formless, shapeless, like water per Bruce Lee. You cannot oppose it with force, it will just get everywhere; but by aligning yourself with it you can direct it. That's what I saw IBM doing and I knew Microsoft would follow suit out of lack of choice. And now here we are.
I'm a bit surprised at how many of my friends have jumped ship to Visual Studio Code, including those who are very much for free software. They have always been in the business of embrace, extend, extinguish[0]. People tend to forget how evil M$ used to be because recently they have seemed like a beacon for Open Source, but I think they are just still evil[1].
I think we're still dealing with the same Microsoft that we've dealt with through the 90s. They are not a champion of open source, and they are still up to their old tricks.[2]
Okay but companies don’t just add telemetry because it’s a fun and exciting engineering challenge. They do so for legitimate engineering and business reasons.
These solutions have privacy nuance; they are not black and white, evil vs. good.
If you are doing telemetry in a way that respects privacy, this won't be possible. Or at least there won't be a way to verify whatever data you have locally against their database.
Not to mention that if you do anything requiring security, then this all becomes a giant nightmare that results in the entire editor and toolchain being disallowed.
This is a political opinion and regardless an implementation detail. My response was that telemetry serves actual purposes and very useful ones at that.
Our goal should be to adapt and ensure privacy via regulations and standards. Wishing, or enforcing, that every application not have telemetry by default is both unrealistic and unwanted.
Telemetry can be an invaluable source of information to improve products, everyone knows that; it's sorta like having 100% of the users continuously reporting bugs and information from which software vendors can infer most wanted features or modifications to improve usability. Which business wouldn't want that? If done right it could be great for everyone, including users.
Unfortunately there's no way to ensure that invaluable tool won't one day be used against users to sneak out data that wouldn't contribute in any way to products improvement, namely making profits from users profiling.
But the point isn't being good or evil; businesses have no sense of morals. They don't obey common sense but rather take the path that leads to the highest profits, and these days profiling users unfortunately brings high profits. It's not a matter of which company is good or bad, or whether telemetry can be done right, but rather how long until a software product with built-in telemetry uses it the wrong way, because sooner or later every one of them will be presented with a scenario in which doing the bad thing brings more profit; and this principle applies way beyond telemetry.
Factory farming is also an engineering challenge, with legitimate business reasons: it's way more economical. I think most of the time horrible things just make sense in some other framing.
My point is - something being useful is a very low bar. It's not the point of the discussion here. Maybe some privacy zealots won't acknowledge this, but telemetry can be very useful, and at the same time a breach of privacy.
And regarding Microsoft, it's also about their attitude towards it. It's their way or the highway. It has always been like this - the software might be the way it is, but regarding their business dealings they are and always were ruthless. That's what made them great. So now what we see is a continuation of their previous attitude. And a slice of the current zeitgeist which is always-on internet with everyone slurping up user data.
My concern is that anybody at a company can suggest a telemetry point and it's added with little or no oversight, then stored forever. It may not be a fun engineering challenge (though it could be, depending on the metric), but sifting through the results is a "fun data interpretation" event. Especially for marketing and sales groups.
The solutions should have a privacy nuance... but don't. For example from TFA, collecting git branch and repo names. That seems like a terrible idea, since it's not like Microsoft's devs will have access to reproduce issues from those repos. It's also a potential leak of sensitive corporate data.
I just checked, and the only telemetry setting that I have in my VS code config is `"telemetry.telemetryLevel": "off"`. I have the non-free version installed (i.e. not Codium). I installed OpenSnitch and started running it, and it caught a grand total of 7 requests over a few minutes of coding: 2 to marketplace.visualstudio.com (one to port 443 and one to port 53), presumably which occurred during startup to check for updates to my extensions, and then 5 to a subdomain of vo.msecnd.net (split across ports 443 and 53). I hadn't heard of this domain before, but from a quick google, I found it listed in the VS Code documentation as their CDN (https://code.visualstudio.com/docs/setup/network). Overall, this seems...not that crazy? I'm not sure if they just are less egregious on Linux machines or if the CDN is just where they send the analytics to, but it doesn't really seem like much sketchy stuff was happening behind my back.
It just makes me feel very happy that I use neovim. OSS used in all dev tools is the only way around this problem. Not that I have full OSS in my whole pipeline; I still use GitHub (though I am seriously considering a move to sr.ht).
If you installed it as a flatpak, just use flatseal. And to install new extensions, just download them from open-vsx or the official vscode website, and then drag-drop them into VSCode
What we need, and this may already exist, is a way to sandbox apps with the ability to firewall off incoming and outgoing HTTP requests.
Does this exist?
What would be even better is a solution that had profiles for common usages and apps, where the profiles could be created by anyone, shared via a community repo, and extended and tweaked.
Does this exist, is it feasible, or does anyone have a better simpler idea?
On MacOS you can use Little Snitch [1] to do exactly this. You can watch all network connections per app, even grab a pcap of traffic just for the app as well. You can set up restrictions to block incoming and outgoing traffic. It basically is a per-app firewall.
On linux if you want to block all network access to an app you can use FlatSeal. It's great for VSCode / VSCodium if you don't want to worry about malicious extensions exfiltrating your data. To install new extensions, just download them from open-vsx or the official vscode website, and then drag-drop them into VSCode
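For the Flatpak case, the same network cut-off can also be done from the command line instead of through Flatseal's GUI. This is a sketch assuming the Flathub app id `com.visualstudio.code`; it is guarded so it does nothing on systems without flatpak.

```shell
# Revoke network access for the sandboxed VS Code flatpak (app id is an
# assumption; check yours with `flatpak list`).
APP_ID="com.visualstudio.code"
if command -v flatpak >/dev/null 2>&1; then
    flatpak override --user --unshare=network "$APP_ID" || true
    # Show the active per-app overrides to verify the change took effect:
    flatpak override --user --show "$APP_ID" || true
else
    echo "flatpak not installed; use Flatseal's GUI instead"
fi
```

Flatseal writes the same per-app override files under the hood; `flatpak override --user --reset "$APP_ID"` undoes the change.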
How often is this updated? I looked at it previously but there were enough caveats/comments to make me wary of using it. I checked the website too but couldn't get (or missed) a definitive answer
It's updated several times a month https://github.com/VSCodium/vscodium/releases
The biggest caveat would be to be aware of the default connection to an alternative extension store, https://open-vsx.org, instead of Microsoft's own store, which does not have all the extensions the official store has. But that's less and less an issue, thanks to projects such as https://github.com/open-vsx/publish-extensions. In the worst case, I just manually `git clone`d the desired extension in my local extension folder. Nothing to complain about otherwise
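The manual `git clone` fallback mentioned above looks roughly like this; the repository URL and folder name are purely illustrative placeholders, and the extensions path assumes VSCodium's default on Linux.

```shell
# VSCodium's per-user extensions folder on Linux (path is an assumption;
# stock VS Code uses ~/.vscode/extensions instead).
EXT_DIR="${HOME}/.vscode-oss/extensions"
mkdir -p "$EXT_DIR"
# Clone the extension's source into a folder named <publisher>.<name>-<version>.
# The URL and names below are hypothetical; left commented out to avoid a
# network fetch:
# git clone https://github.com/example-publisher/example-extension \
#     "$EXT_DIR/example-publisher.example-extension-1.0.0"
ls -d "$EXT_DIR"
```

Note that this only works if the repository contains the built extension; many extensions need an `npm install` and compile step before the editor can load them.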
Dunno about PyCharm, but IntelliJ developers solve WSL2 issues in low-priority mode. I posted a few and followed a few others; resolution time is many months long.
JetBrains just in general treats all issues as low priority. There are issues open for literal years that they refuse to do anything about. They say “we looked at this issue, and it’s not on our roadmap, so no.” Ok, but what is your roadmap? Why can’t it include a fix for a 5 year old bug? Oh, because you’re working on a replacement that’ll take 3 years to finish. It’s extremely frustrating sometimes.
the more people complain about telemetry, the more i know I'm right not to be concerned. if it is a problem for you, go back to emacs 19 or vi or whatever and leave the rest of us alone.
alternatively, go climb the ladder at any company that collects this terrible scourge of invaluable personal explosive data and get them to stop.
I'm ok with this. and just like you didn’t like me belittling your position, as i did above, i’ll kindly ask that you understand my reaction here to being told (without asking, by the way) that telemetry is an unparalleled bad thing.
i wish people who are convinced that some illuminati-like cadre is behind all of these things were employed at large companies which could collect this info. there is nowhere nearly enough cohesion between employees to make anything useful out of all the data collected, much less to act upon it, outside of the developers trying to chase the bug that has haunted them for 36 months, and the UX team trying desperately to reduce the amount of choices in menus. i mean, manage a few projects with people who are all on board with the effort - even those groups are hard to keep focused, most of the time.
these companies use this data for the stated purposes; to help make their products better. if i could explain some of the technical considerations behind the apparently odd choice of data to collect, it would make a lot more sense, but i’m pretty awful at communicating, and even if i weren’t, and i were able to succinctly convey that info, i feel like most of you would just argue with me anyway.
The TL;DR is Microsoft won't tell you what they collect, and they ignore settings that limit telemetry.
If you use VSCode or its extensions you can't keep commercial secrets from those working at Microsoft unless you block network traffic from VSCode. This seems to be true even in the open source builds like VSCodium.
Except the article does not. If you turn off telemetry and crash reporting, then there is no telemetry. (Except possibly the telemetry from plugins that choose not to honor the telemetry toggle). This post provides no examples of telemetry being sent when the option is turned off.
Now the program does have a variety of spots where it is designed to contact Microsoft services, like say the extension store, or update checks, or other similar functionality. Those services will (obviously) keep some level of logging data, which you cannot opt out of if VSCode talks to those services. This is what the terms are talking about when they indicate that not all collected data can be opted out of.
There is no "Don't talk to any Microsoft cloud service" feature toggle, sure, but that would not really be a sane toggle people actually want. For example, that toggle would need to forcibly disable the git support to ensure those features don't try to talk to GitHub if you open code from some GitHub checked-out repo. It would disable the extension store. It would prevent update checking, etc. VSCodium would end up wanting to rip chunks of that toggle's functionality back out; for example, VSCodium uses a different non-Microsoft extension store by default, so it would not want the extension store disabled.
FTA: "Visual Studio Code collects telemetry data, which is used to help understand how to improve the product. For example, this usage data helps to debug issues, such as slow start-up times, and to prioritize new features."
I don't work at Microsoft but I work somewhere similar on a product that collects similar types of telemetry and those words are not empty, they are very, very true. We absolutely add, remove, and change things about the UI based on what our telemetry says people do.
And it's not just about moving things around (which I know HN hates), it's about making the page load faster (which I know HN loves!). We are going through a concerted effort to lower page TTVC at the moment and a big part is going to have to be turning off some features that show by default that take too long to load and guess how we'll be making the call about which features to turn off?
I agree but I think this is a niche case where we are talking specifically about a development tool. I assume VSCode developers are dogfooding VSCode. They know what users want because as developers they themselves are VSCode users. Telemetry, other than crash information, seems unnecessary.
VSCode supports a wide range of languages and workflows. The workflow of the VSCode developers doesn't necessarily align with the workflows of other VSCode users.
MS sales team would probably love to get a grep of all the "#include <foo.h>" lines at a company to see if they can do a sales pitch to eat some vendor's lunch.
IMO: Whether they make money off of it or not doesn't matter. They're collecting and storing gobs of data that they don't need. It's arguably not their data (i.e. the branch and repo from my .git directory), so they shouldn't be collecting or storing it.
Found this while looking for a way to fully disable telemetry on vscode.
After reading, this seems impossible, short of going down to DNS or network blocks of some kind: the "disable telemetry" setting doesn't fully disable it; extensions have their own telemetry; MS considers the data sufficiently anonymous to be exempt from GDPR; and even the VSCodium maintainers say they found it impossible to block VS Code's eager telemetrizing.
I wonder what is the most effective way of actually completely stopping it phoning home, apart from yanking its ethernet cable?
I use IntelliJ for $dayjob, and while I love it, it too submits telemetry. I do not know how pervasive it is, but the controls for it are not obvious in the general config window.
The only way to really escape telemetry today is a local firewall that can block network access by application.
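On Linux, one low-level way to get such a per-application block (outside of a Flatpak sandbox) is iptables' owner match combined with a dedicated group. This is a sketch only: the commands are printed rather than executed, since they require root, and the group name `nonet` is an arbitrary choice.

```shell
# Commands for dropping all egress traffic from anything run under the
# "nonet" group. Stored in a variable and printed instead of run, because
# they need root privileges.
NONET_SETUP='groupadd nonet
iptables -A OUTPUT -m owner --gid-owner nonet -j DROP
sg nonet -c "code"   # launch VS Code with its outbound traffic dropped'
echo "$NONET_SETUP"
```

Tools like OpenSnitch (mentioned earlier in the thread) or firejail's `--net=none` achieve a similar effect with less plumbing, and interactively in OpenSnitch's case.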
I would like to see some analysis of the telemetry Jetbrains sends and whether you can turn it off. It's a paid pro IDE so it wouldn't surprise me if you can really disable it because some commercial customers will not want telemetry. In high security environments it's often considered bad due to the possibility of accidentally exposing secret information.
And even if you don't like fiddling with dotfiles you can install some really nice distribution like spacemacs or doom for emacs and lunar for neovim. They have a pleasant dev experience out of the box.
On windows notepad++ was my goto for basic text editing. These days I use the jetbrains editor of choice for the language in question or IntelliJ for basic text.
even the vscodium maintainers say they found it impossible to block vs code's eager telemetrizing
Source for this? Does VSCodium with no extensions send telemetry, or is it the extension issue?
EDIT: Oh the quote from VSCodium is in the article:
Even though we do not pass the telemetry build flags (and go out of our way to cripple the baked-in telemetry), Microsoft will still track usage by default.
The funny thing is that I was about to install VSCodium for the first time TODAY, because I want live markdown preview. (I want a doc editor, not necessarily a code editor)
The rationale for VSCodium says it's about telemetry, although strictly speaking it doesn't promise they removed all the telemetry!
but the product available for download (Visual Studio Code) is licensed under this not-FLOSS license and contains telemetry/tracking
-----
I should also say that on the dev side, crash reports are EXTREMELY useful and really do improve the quality of the software. But you used to have to confirm this data would be sent.
It's safe to assume Microsoft is GDPR compliant. There's no sense in risking the fines GDPR could potentially levy for the sake of ... what exactly? There's no upside even if this data was somehow de-anonymized. The sensible thing to do is to delete everything within 90 days. Even if the developers behind VS Code didn't agree with that, legal and privacy at MS would make sure they would.
All this talk about how this data can be used to target people seems far fetched. Microsoft doesn’t have a serious ads business. What they do have are various products and services to sell to developers - Github and Azure among them. That’s the whole point of VSCode - give it away for free so they can understand developer practices and also upsell paid products. And if anyone doesn’t like that, they can simply … not pay Microsoft any money.
How can you possibly tell without knowing what exact data is being collected, how it is being used, for how long it will be kept, and where it will end up?
I work around and with climate scientists. What I worry about has changed.
Metrics and crash statistics are fine. And if they are collecting some dystopian level of data, then I bet that google and facebook have them beat already.
Why even make vscode if not for telemetry? It’s the whole point of the app, isn’t it? It’s like chrome or edge or that open source png viewer that started selling your info a few months ago
Telemetry has been a major focus of Windows since the current CEO took over. It’s kind of his thing
Note there is an open source version called vscodium that iirc has telemetry disabled but I found my extensions didn’t work with it last year when I tried it
I honestly wouldn't mind the telemetry, but seeing as VSCode on Windows will always inevitably completely freeze the entire OS to the point of requiring a restart if you leave it open for long enough, it doesn't give me much faith anyone is looking at it that closely in terms of using it to fix fundamental issues.
I guess you are not referring to some MS extension but to the editor itself, right?
I had an issue around a year ago with cpptools on Linux, where a single 80k LOC header would cause obscene memory usage (10+ G) and the extension would basically stop working.
[1] https://imgur.com/a/RbrsuyA
You will want these three settings in your default settings JSON (the first two just to be safe):
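Presumably, based on the two deprecated settings and the newer `telemetry.telemetryLevel` setting discussed earlier in the thread, something like this (VS Code's settings file accepts JSON with comments):

```json
{
    // The two older, now-deprecated settings ("just to be safe"):
    "telemetry.enableTelemetry": false,
    "telemetry.enableCrashReporter": false,
    // The newer consolidated setting, which otherwise defaults to "all":
    "telemetry.telemetryLevel": "off"
}
```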