A fork of a private repo is private. When you make the original repo public, the fork is still a private repo, but the commits can now be accessed by hash.
According to the screenshot in the documentation, though, new commits made to the fork will not be accessible by hash. So private feature branches in forks may be accessible via the upstream that was changed to public, if those branches existed at the time the upstream's visibility changed, but new feature branches made after that time won't be accessible.
OK but say a company has a private, closed source internal tool, and they want to open-source some part of it. They fork it and start working on cleaning up the history to make it publishable.
After some changes which include deleting sensitive information and proprietary code, and squashing all the history to one commit, they change the repo to public.
According to this article, any commit on either repo which was made before the 2nd repo was made public, can still be accessed on the public repo.
> After some changes which include deleting sensitive information and proprietary code, and squashing all the history to one commit, they change the repo to public.
I know this might look like a valid approach at first glance but... it is stupid for anyone who knows how git or GitHub API works? The remote (GitHub's) reflog is not GC'd immediately; you can try to get commit hashes from the events history via the API, and then try to get the commits from the reflog.
> it is stupid for anyone who knows how git or GitHub API works?
You need to know how git works and GitHub's API. I would say I have a pretty good understanding of how (local) git works internally, but was deeply surprised about GitHub's brute-forceable short commit IDs and the existence of a public log of all reflog activity [1].
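To put numbers on the brute-force claim (assuming, as described above, that prefixes as short as 4 hex characters can be resolved), the search space is small enough to enumerate:

```python
# Rough sizing of the short-commit-ID search space. The 4-character minimum
# is taken from the discussion above; treat it as an assumption.
from itertools import product

HEX = "0123456789abcdef"

def prefix_space(length):
    """Number of distinct hex prefixes of the given length."""
    return 16 ** length

print(prefix_space(4))   # 65536 -- trivially enumerable against an API
print(prefix_space(7))   # 268435456 -- git's usual abbreviation, still finite

# Generating every 4-character candidate is a one-liner:
candidates = ("".join(p) for p in product(HEX, repeat=4))
print(next(candidates))  # 0000
```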
When the article said "You might think you’re protected by needing to know the commit hash. You’re not. The hash is discoverable. More on that later." I was not able to deduce what would come later. Meanwhile, data access by hash seemed like a non-issue to me – how would you compute the hash without having the data in the first place? Checking that a certain file exists in a private branch might be an information disclosure, but is not usually problematic.
And in any case, GitHub has grown so far away from its roots as a simple git hoster that implicit expectations change as well. If I self-host my git repository, my mental model is very close to git internals. If I use GitHub's web interface to click myself a repository with complex access rights, I assume they have concepts in place to thoroughly enforce these access rights. I mean, GitHub organizations are not a git concept.
> You need to know how git works and GitHub's API.
No; just knowing how git works is enough to understand that force-pushing squashed commits or removing branches on remote will not necessarily remove the actual data on remote.
GitHub API (or just using the web UI) only makes these features more obvious. For example, you can find and check a commit referenced in MR comments even if it was force-pushed away.
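A minimal local sketch of this (assuming only that git is installed; the repo and commit messages are made up) shows that rewinding a branch does not delete the commit object:

```shell
# Rewinding a branch does not remove the commit from the object store; until
# GC runs, the commit stays addressable by its hash.
set -e
tmp=$(mktemp -d)
cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q
git commit -q --allow-empty -m "public base"
git commit -q --allow-empty -m "secret work"
secret=$(git rev-parse HEAD)
git reset -q --hard HEAD~1   # "remove" the secret commit from the branch
git cat-file -t "$secret"    # the object is still there: prints "commit"
```

On a hosted remote the same applies server-side: a force-push only moves the ref, and the now-unreachable objects linger until the host garbage-collects them.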
> was deeply surprised about GitHub's brute-forceable short commit IDs
Short commit IDs are not a GitHub feature; they are a git feature.
> If I use GitHub's web interface to click myself a repository with complex access rights, I assume they have concepts in place to thoroughly enforce these access rights.
Have you ever tried to make a private GitHub repository public? There is a clear warning that code, logs and activity history will become public. Maybe they should include an additional clause about forks there.
Dereferenced commits which haven't yet been garbage collected on a remote are not available to your local clones via git... I suppose there could be some obscure way to pull them from the remote if you know the hash (though I'm not actually sure), but either way (via the web interface or the CLI) you'd have to know the hash.
And it's completely reasonable to assume no one external to the org when it was private would have those hashes.
It sounds like GitHub's antipattern here is retaining a log of all events, which may leak these hashes, and that's really not something I'd expect a git user to anticipate.
Is it hard to brute force? Even if it is, the hash is still something that can be sniffed, much like a car keyfob signal: you can't unlock the car without first recording a genuine button press.
Barter economy is a myth. If you think that's wrong, do a mental exercise -- try to imagine how that would work in practice.
We never, as a species, had any meaningful instance of barter economies. We've had the concept of debt since before we had currency. I imagine this is what we would try to "return" to if cash disappeared, and people needed to exchange goods and services without authorities and intermediaries. This would though require a community, a social fabric -- something that has been steadily eroding for a long, long time.
Some rural communities already do this and have small private forums restricted to the locals. That does not exclude using cash, it's just another method of trade to fall back on or to use when it makes sense. If the internet is down they can all meet in a church parking lot or someone's field.
Cigarette cartons are an unintuitively good currency object. They're high-density ($/volume or weight), storable, non-perishable, sub-dividable ("packs" in "cartons", I think that's how they're subdivided?). They're easy to count. They come in standard, audited sizes, the trusted-brand manufacturers certifying their weight (?) and stuff. The large population of tobacco addicts guarantees a stable demand, hence a stable value.
(From a 30,000-foot view, it's kind of analogous to cryptocurrency. In a failed state, you don't accept cigarettes as payment because you want to smoke them—but because you know you can exchange them again to someone else, and to someone else, eventually reaching a smoker, who creates the demand sink. Similarly, people accept cryptocurrency units because they know at the end of the commerce chain, there will be drug dealers guaranteeing an intrinsic demand).
Fakes in the sense of much lower quality tobacco, often mixed with cheaper materials to bulk it out, and sold in lower amounts. So people get a product at the black market price that is not the expected one.
Tax stamp trafficking is used to cover for those fakes as it makes them look real, and it is also used to avoid getting arrested for selling contraband.
Easylist should serve the Indian browser (based on user-agent) a giant file (expensive), a corrupt file, or some response which causes the app to crash. If the browser crashes on every startup due to a malicious response from the Easylist server, users will likely delete it.
Serving a giant file is going to affect their servers more than the end device. If they could identify the user agent it would be a lot easier to just block it entirely.
And a terrible idea, because the users are not at fault; the developers of the app are. And in this case it might crash, so they'd only report "app crashes on startup", if at all.
Then the browser could just pretend it is Chrome (if it isn't already) and that would easily work around your solution.
I think something like this would be best served by moving to IPFS or Bittorrent. A magnet link could be provided and then browsers and plugins could use that to download the file. That way, you can distribute the load.
I don't believe the kids behind this Twitter account. I don't know why they're doing it exactly, probably some form of clout or to scam buyers on darknet marketplaces, but I know that many of their screenshots are faked. I know people at one of the companies they claimed to have hacked - they posted a Ruby on Rails directory structure as proof of hacking them but the company does not have Ruby code. So I would not trust any of their tweets.
I don't trust them (AgainstTheWest, not Troy Hunt) either — and frankly I'm surprised to see that they're still active.
Earlier this year they claimed to have discovered an NGINX 0-day RCE and tested it against a Canadian bank. Not only was it a big nothing-burger, but they ended up purging their Telegram channel afterwards with claims of infighting (screenshots for posterity: https://imgur.com/a/5AThvTv).
> they posted a Ruby on Rails directory structure as proof of hacking them but the company does not have Ruby code
I think it's extremely suspicious, but oftentimes breaches like this aren't through the core platform itself. For example, the Equifax breach came through a support site that was hosted and built separately from their main platform.
This whole thing does smell like BS to me as well, though.
Trying to think how to anonymise datetimes hurts my head. You might want to randomise the date of an event. But you also need this random date to be consistent with respect to both the current time and the order of other related rows in the database.
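One ad-hoc way to square those two requirements (a sketch, not a standard anonymization scheme): cap the random jitter below half the smallest gap between neighbouring events, so the original ordering can never flip.

```python
# Jitter timestamps while preserving their relative order. The cap of half
# the minimum positive gap guarantees no two events can swap places.
import random
from datetime import datetime, timedelta

def jitter_preserving_order(timestamps, seed=0):
    rng = random.Random(seed)
    ts = sorted(timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ts, ts[1:])]
    positive = [g for g in gaps if g > 0]
    cap = min(positive) / 2 if positive else 0.0
    return [t + timedelta(seconds=rng.uniform(-cap, cap)) for t in ts]

events = [datetime(2021, 6, 1, 12, 0), datetime(2021, 6, 1, 12, 5),
          datetime(2021, 6, 1, 13, 0)]
anonymized = jitter_preserving_order(events)
assert anonymized == sorted(anonymized)  # ordering survives the jitter
```

Note that this only hides exact times; the relative timing between events is preserved, and that timing can itself be identifying.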
The answer is always “it depends,” but I think if a date time is a UTC timestamp, such as a record of when an event happened, then with random sampling, it shouldn’t matter? It’s just a timestamp. The amount of information it contains might include location, might include timing to other events, could be correlated, but… on its own? It doesn’t need anonymization. Likewise the sequence of events, should be safe to use.
I get that you can look up or de-anonymize an event by its timestamp and the same is true of ID numbers. But it’s worse for ID numbers because these are often permanent and re-used for multiple events.
But yeah, the risk in anonymized data is that it’s never truly both anonymous and useful. Truly anonymous data might be considered junk or random data.
Anonymized data has some utility purpose to fulfil. Perhaps “realistic” analytics is required, or you want to troubleshoot a production issue without revealing who did what to engineers. So you anonymize the fields they shouldn’t see, and create a subset of data that reproduces the issue…?
Anonymized data is almost always a bad approach compared to generating data from algorithmic or random sources, but sometimes we need anonymized or restricted data to start that process.
True? But I wouldn’t call creating new data from non-anonymous data “making data anonymous”. Instead, that’s new random data whose values are constrained or based on real-world data. I’d call that newly generated data, not anonymized data.
To me, anonymized data has an inherent risk of leaking the original transaction because it is a one-to-one mapping of the original data. If you generate new data, it will by definition diverge from the production dataset in some way that might be unrealistic. For example, fields with address components might not actually point to real places, or might not be written the same way as they would be in production. Perhaps a portion of production data includes international addresses or rural routes that your software might fail to generate, or worse, maybe it would generate them incorrectly.
Frankly, generating data is a better approach than anonymized data. And I know of anonymization techniques where good data is mixed with bad data and statistically, the bad data can be filtered out later but only in aggregate, etc. But I’m drawing a line in the sand between anonymized data that closely matches real data, and that which is “generated data”, because you can still potentially learn from the anonymized data but you can’t learn from generated data much more than you would from the initial model that created the constraints used to generate the data. I’m probably explaining this poorly, it’s a bit late at night in my time zone. :)
Also, if you don't have a public access block in place, a private bucket can contain public files! Even if you can't list the files in the bucket, there are tools which try to guess common file names from guessed bucket names e.g. sega-secret-sauce.s3.amazonaws.com/.env - if someone uploaded a file there without setting the ACL correctly there could be an unprotected file in the private bucket.
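The guessing those tools do amounts to something like this sketch (the bucket name and wordlist here are invented for illustration):

```python
# Probe common object names under a bucket's public URL. Listing may be
# denied while individual objects are still world-readable.
import urllib.error
import urllib.request

COMMON_NAMES = [".env", "config.json", "backup.sql", "db_dump.tar.gz"]

def candidate_urls(bucket):
    return [f"https://{bucket}.s3.amazonaws.com/{name}" for name in COMMON_NAMES]

def is_world_readable(url):
    """HEAD the object: 200 means public; 403/404 mean protected or absent."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        return urllib.request.urlopen(req, timeout=5).status == 200
    except (urllib.error.HTTPError, urllib.error.URLError):
        return False

print(candidate_urls("example-bucket")[0])
# https://example-bucket.s3.amazonaws.com/.env
```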
By temporarily defacing the Sega website and modifying files I think they have crossed the line. Enumerating what access they have, rooting through S3 and reporting it is OK, but by messing around like script kiddies they can no longer claim good faith. Publicising that you've illegally defaced the website is a little silly. Of course, Sega should not have got themselves so completely owned. Sega deserved to be punished, but these VPN twits have clearly committed a crime and Sega should maybe sue their company.
The store owner was gone on vacation, and thus the side of his store was riddled with graffiti. He deserved to get graffiti because he didn't take basic security precautions.
You don't need to break security to spray the side of a store. You do need to break security to deface a website.
Analogies are analogies, they're unnecessary in this case (nowadays). Because we got law to punish people who deface a website, and the law stands on its own.
It's akin to people who call 'copyright infringement' 'theft'. It's not the same; it's a different mechanic, the damages are different, and... different laws apply. That doesn't mean one's right or wrong or anything like it; like I said: the laws stand on their own, respectively.
The store owner should have hired security staff to prevent their store from getting graffitied.
I can construct any sort of scenario such that victim blaming is always possible, when the reality is they shouldn't have to worry about their property being messed with.
To me this situation seems more like a store owner forgetting to lock the door; somebody noticed, came inside, put up a sign on the front window saying that the store owner is too stupid to lock his own door, and then called the owner to tell him about it.
I think "deserves" is a better word than "deserved".
The punishment for grossly negligent handling of PII should not be a childish website defacement, and should not be enforced by vigilantes. Obviously.
The punishment for mishandling PII like this should be a painful fine, a rigorous externally imposed technical audit, and possibly civil/criminal implications for senior leadership.
(If the last one sounds unreasonable, consider Equifax. Many executives in charge of security orgs do not have technical degrees and, more importantly, have not booked any time in the trenches. Being self-taught and having non-engineering degrees can be okay, but combining that with no in-the-trenches experience is inexcusable. Assigning security to corporate politicians who don't understand the work that they are managing should be considered criminal negligence.)
It's more like a store owner who left all his customer's names, addresses, credit cards, purchasing history and everything else just lying out there in the open. Public embarrassment is too light a punishment for the inevitable day when someone else comes and takes it. The real victims are all the people harmed by their negligence.
So the store owner can just leave all his customers’ credit card information lying around and ignore PCI compliance etc. because anyone who would possibly use it for nefarious purposes is a criminal?
The ones who are damaged by the negligence sue for negligence.
Similarly: those people who act recklessly can get sued for more, or even criminally prosecuted. Finally, someone who acts out with malicious intent can be sued / criminally charged with the highest crimes.
-----------
So in this "Sega" case: Sega can sue their security for the negligence.
Then, the hackers can be sued for something between recklessness and malicious intent.
Yeah, the law is flexible. "Justice" as a concept in the Western world revolves around both actions + intent. (With intent / state of mind in roughly 3 states: negligence, recklessness, and malice in that order).
It's a flexible system, albeit sometimes imperfect... but just applying it in a textbook manner to this case produces acceptable results IMO.
Strong disagree (not with the law claims, I'll leave those to the law-knowers, but with the moral implications of 'crossing a line'). It reads like they revealed security vulnerabilities that had the possibility to harm others. I think they can be allowed some leeway in their methods.
Nope. That can come after responsible disclosure. Did they try the responsible path first? It looks like they notified and then kept going for another 10 days.
This is the problem I have. They kept going without permission. Leaving your key in the door doesn't give someone the moral authority to go through your house and look for other issues.
What if there were signs of a current and urgent matter?
This seems like the wrong time to bring in analogies, given that we all understand what's being done well enough to talk about it directly. Given that there were obvious problems that implied a clear and present danger to people, it seems reasonable to take more immediate, more effective measures.
My understanding is that the 'responsible' path can have groups pursue you while they try to cover up and deflect blame, instead of fixing the problem. Going down that path does not sound very responsible to me.
it seems like there's a couple of hundred consumer-facing VPN service providers, all with slick looking marketing websites to sell you a $5/mo service.
lots of them are nothing more than 1 or 2 people and some rented 1U servers or dedicated servers somewhere on whatever ISP they can find with cheap IP transit / DIA rates. maybe a part time website design/graphic arts person they found via fiverr to make things look cool.
from the perspective of a colocation-specialist ISP or medium sized generalist ISP that offers colo, they get lots of weird requests for colo and dedicated server services from VPN companies they've never heard of before. often with something like a corporate entity that exists in cyprus, panama or even weirder places.
looking at this in terms of the risk that a VPN provider presents to an ISP's reputation, IP space, attracting unusual volumes and numbers of DDoS, etc... there is a certain amount of "KYC" (exact same idea as finance industry KYC) that needs to go into a potential vpn service provider as a colocation client before quoting them a price or accepting them as a customer. fail to do that at your own risk.
it's very much in the weird/shady/grey market end of the ISP market.
the level of technical acumen and professionalism varies greatly between VPN providers.
Tangential to the thread, but I've never understood what people mean when they say this.
Do you run all your personal traffic through a VPS or something? That's not really offering the same thing as most VPNs. It hides your traffic from your ISP so they can't sell your data or snoop on you, but it doesn't accomplish the anonymizing that an actual multi-user VPN can provide by mixing in additional traffic under the same IP.
So, what do YOU mean when you say you "do your own VPN"?
One of the VMs that I have on a system in colocation is my own customized OpenVPN setup, where I also run the openssl CA for it. My phone, laptop, etc all have their own keys.
It's set up for my own needs when I want to use a VPN from a weird place. Or simply to bypass artificial restrictions on traffic if I'm on amenity wifi in somebody's office, airport, hotel, etc. Since I can arbitrarily reconfigure it at will, and run multiple openvpn daemons from different .conf files listening on different ports with unique configurations (all relying on the same CA), I can do things like have one VPN that pushes a default route for my spouse's need to do internet things on restricted amenity wifi.
Another part of it pushes only routes to a few /24 that are my personal project servers, and the routing table on vpn clients remains otherwise unmodified. Sometimes known as a split horizon VPN.
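Concretely, the difference between the two daemons comes down to what the server pushes; a sketch of the relevant OpenVPN directives (the subnets are documentation placeholders, not real ranges):

```conf
# Full-tunnel instance: the client routes all traffic through the VPN
push "redirect-gateway def1"

# Split-horizon instance: only the project subnets go through the tunnel;
# the client's default route is left untouched
push "route 203.0.113.0 255.255.255.0"
push "route 198.51.100.0 255.255.255.0"
```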
>95% of the time I am not using it to run all my traffic through there.
It's also the gateway and pushes routing table entries to things that exist for my personal test/project/development VMs that are in private IP space, so I need to be connected to the VPN in order to talk to those.
I don't understand this way of thinking. They made a serious security oversight, but that doesn't mean that they deserve to have their website defaced.
So if they didn't create a new user account and IAM account, what would you see? If they just used the remote shell and the installed aws cli, e.g. `aws s3 ls`, would you be able to detect it?
This article is an ad.
You'd still see the activity of that machine in AWS CloudTrail logs.
From [1]: "CloudTrail records two types of events: Management events capturing control plane actions on resources such as creating or deleting Amazon Simple Storage Service (Amazon S3) buckets, and data events capturing data plane actions within a resource, such as reading or writing an Amazon S3 object."
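For illustration, a heavily trimmed CloudTrail record for an `aws s3 ls` (which calls ListBuckets, a management event) might look roughly like this; all values are placeholders:

```json
{
  "eventSource": "s3.amazonaws.com",
  "eventName": "ListBuckets",
  "eventTime": "2021-06-01T12:00:00Z",
  "userIdentity": {
    "type": "IAMUser",
    "arn": "arn:aws:iam::123456789012:user/intruder",
    "accessKeyId": "AKIAEXAMPLEKEY"
  },
  "sourceIPAddress": "198.51.100.7"
}
```

The `sourceIPAddress` and `accessKeyId` fields are what let you spot activity from an unexpected machine even when no new accounts were created.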