It's not just faces. When recognizing objects in the environment, we normally filter out a great number of details as the signal passes through the visual cortex - by the time information from our eyes hits the level of conscious awareness, it's more of a scene graph.
Table; chair behind and a little to the left of the table; plant on table
Most people won't really have conscious access to all the details that we use in recognizing objects - but that is a skill that can be consciously developed, as artists and painters do. A non-artist would be able to identify most of the details, but not all (I would be really bad compared to an actual artist with colors and spatial relationships), and I wouldn't be able to enumerate the important details in a way that makes any kind of sense for forming a recognizable scene.
So it follows that our ability to recognize faces is not purely - or even primarily - an attribute of what we would normally call "memory", certainly not in the sense of conscious memory where we can recall details on demand. Like you alluded to re: mammals and spaces, we're really good at identifying, categorizing, and recognizing new forms of structure.
You'd have to be stupid and desperate to steal from a garage.
The people who work there aren't office workers; you've got blue collar workers who spend all day working together and hanging out, with heavy equipment right in the back. And they're going to be well acquainted with the local tow truck drivers and the local police - so unless you're somewhere like Detroit, you'd better be on your way across state lines the moment you're out of there. And you're not conning a typical corporate drone who sees 100 faces a day; they'll be able to give a good description.
And then what? You're either stuck filing off VINs and faking a bunch of paperwork, or you have to sell it to a chop shop. The only way it'd plausibly have a decent enough payoff is if you're scouting for unique vehicles with some value (say, a mint condition 3000GT), but that's an even worse proposition for social engineering - people working in a garage are car guys; when someone brings in a cool vehicle, everyone's talking about it and the guy who brought it in. Good luck with that :)
Dealership? Even worse proposition: they're actual targets, so they know how to track down missing vehicles.
If you really want to steal a car via social engineering, hit a car rental place, give them fake documentation, then drive to a different state to unload it - you still have to fake all the paperwork, and strip anything that identifies it as a rental, and you won't be able to sell to anyone reputable so it'll be a slow process, and you'll need to disguise your appearance differently both times so descriptions don't match later. IOW - if you're doing it right so it has a chance in hell of working, that office job starts to sound a whole lot less tedious.
Stolen cars are often sold for small amounts of money - like $50 - and then used to commit crimes that can't be traced back through the plates. It hasn't really been possible to steal and resell a car in the United States for many years, barring a few carefully watched loopholes (Vermont's out-of-state registration loophole is one example that was recently closed).
When Kia and Hyundai were recently selling models without engine immobilizers, that was the main thing folks did with the ones they stole.
In Canada there's been a big problem with stolen cars lately - mostly trucks and other high value vehicles. Selling them locally isn't feasible, but there's a criminal organization that's gotten very good at getting them onto container ships and out to countries that don't care if the vehicles are stolen. So even with tracking, there's nothing people can do. Stopping it at the port is the obvious fix, but somehow that's not what's being done - probably bribery to look the other way.
Same thing in Australia - some gang was busted recently for stealing mid-range four wheel drives, packing them in shipping containers with partially dismantled cars (I guess so that a cursory inspection would just show "car parts" rather than a single nice looking car) and then shipping them around the world (I guess an overseas buyer isn't checking if a car with this VIN has been stolen on the other side of the world).
Yeah, the only way to do it would be a cash transaction where you'd have to forge a legitimate looking title/registration and pass it off to a naive buyer. So it's still technically possible, but not in any kind of remotely scalable way.
Well, there's a flip side of that, which is that all our critical infrastructure is now open source.
And if you're comparing where we're at now, culturally, with where we were in the early days of the internet - Jon Postel, the RFC process, the guys building up the early protocols, running DNS and all that - there's been a different kind of shift.
The way I look at it is: a lot of us hackers (the category I'd put myself in), academics, and hardcore engineers who worked in industry but didn't give a damn about anything except doing solid work other people could rely on - we built up the modern tech stack, and then industry jumped on it as a cost cutting measure, and it's been downhill from there.
And this puts us all in a real bind when the critical infrastructure we all rely on is dominated by a few corporate giants who still have the mindset that they want to own everything; they pay only lip service to the community, and even getting bug fixes in is a problem if it's something they don't care about.
This mindset invading the Linux kernel is a huge part of the reason for the bcachefs split, btw. We had a prominent filesystem maintainer recently talking openly about how they'll only fix bugs if they feel like it, or as part of a quid pro quo with another established player - and that's just not OK. Open source is used by the entire world, not just Google/IBM/Facebook/Amazon.
"How we manage critical infrastructure as a commons - responsibly" needs to be part of the conversation.
People lose touch with reality when life becomes too rich and comfortable, and they become too focused on security - missing all the other corrosive influences on society.
I've travelled the entire United States, multiple times over, and seen quite a bit of Europe and South America, and I'm in Colombia now.
Latin America, and Colombia in particular would be far more of a "narcostate" according to the popular Northern definition - but perception often isn't reality.
I've never seen the grinding poverty and desperation that's common in the United States anywhere in Latin America; even the poorer communities here tend to be vibrant and well functioning, with families and little farming communities everywhere that are living life well. The fabric of society functions pretty well - health care and healthy food are far more available, and there's far less conflict with government apparatuses (try walking into a DMV anywhere in the states vs. walking into a government office in Latin America - I think you'll find it enlightening).
The security-obsessed mindset in the United States and Europe leads people to want to stamp out the mafia and cartels, but if you look at the actual outcomes I think it's pretty clear that that approach fails in the long run. Look at Mexico for the worst example of what can happen: being next to the United States, the pressure has been high, it hasn't worked, and cartel violence is absolutely ludicrous.
When people have more of a "live and let live" approach, things tend to stabilize in unconventional arrangements that are on the whole much less toxic to society. So Colombia, which does have cartels, doesn't have the same level of warfare and violence affecting the average person that Mexico does - in Mexico you'll regularly see a half dozen army/SWAT guys on patrol in a pickup with M-16s. Even so, that provokes less tension than seeing an LEO presence in the United States, where it often means outright harassment of the populace.
There's a lot more to having a functional society than just eliminating elements that run contrary to "popular order".
> I've never seen the grinding poverty and desperation that's common in the United States anywhere in Latin America; even the poorer communities here tend to be vibrant and well functioning, with families and little farming communities everywhere that are living life well.
With all due respect, what a naive paragraph. I suggest you get away from the touristic places, or go into a poor part of any big city in Latin America - the stuff is nasty. What you're comparing are relatively stable rural families, which would be akin to the rural middle class in the USA... you could say that in almost 100% of cases, a middle-class North American is the equivalent of someone from the upper class here - in terms of goods/comfort, not work. And if, as a traveler, you still romanticize these poor backcountry communities, I suggest you try a week or two of their work. Just take up the routine of a 40+ year old man to see what being 'middle class' is about. Living on the hunger line in a bare house is poverty, and Latin America has many examples.
Have you seen the poorer parts of the United States? Or walked around the Tenderloin? Or seen what meth has done to parts of the rust belt, and the farming communities that have been hollowed out and eviscerated across the midwest?
You're comparing a marginalized demographic against people who belong to the middle class in Latin America - it makes no sense. We also have Cracolândia and favelas, and people dying of diarrhoea and of hunger in some regions.
Please don't visit a country on what was probably a tourist-style trip and then sum up a whole continent's socioeconomics, or whatever category your empirical sociological observation falls into.
Ok, if you're actually from Latin America, I should apologize - I don't mean to say that those kinds of issues don't exist (and actually, I have seen some - Honduras). I often assume I'm talking to someone from the states, and Americans have gotten insular and really out of touch; most have no idea how much things have changed over the past 50 years.
That said, I'd rather live in lower-middle class Latin America than Estados Unidos any day. The food is probably going to be better - in too many places in the States, Walmart is the only practical option now - health care won't bankrupt you, and people in Latin America are almost universally better educated and less depressed about social issues.
And I think a lot of that can be traced to a culture that's a bit less authoritarian, because people understand the history of why that doesn't work. Just going to war with the Mafia or the narcos is a trite answer, and it usually doesn't solve things in the long run.
Edit - also, you really should compare the poorer parts of the big cities you're talking about to Detroit or New Orleans or the Tenderloin. In my experience, people in Latin America can also have a skewed perspective. The world is a big place.
Yeah, that's not a good environment for this kind of engineering. You need long term stability for a project like this, slow incremental development with a long term plan, and that's antithetical to VC culture.
On the other hand, Rust code and the culture of writing Rust leads to far more modularity, so maybe some useful stuff will come of it even if the startup fails.
I have been excited to see real work on databases in Rust; there are massive opportunities there.
Where do you see these opportunities? Personally I didn't see a lot of issues in this domain where Rust would be better than C. Care to elaborate? (Genuinely curious!)
Personally I see more benefit in Rust for, say, ORMs and the layers that talk to the database. (Those are often useful to have in such an ecosystem so you can use the database safely and sanely - like Python, but, you know, fast and secure.)
You need to be crazy to use an ORM. I personally think that even SQL is redundant. I would like to see a high quality embedded database written in Rust.
It's painful having to switch to another language to talk to the database, and ORMs are the worst kind of leaky abstractions. With Rust, we've finally got a systems language that's expressive enough to do a really good job with the API to an embedded database.
The only thing that's really missing is language support for properly ergonomic Cap'n Proto usage - Swift has stuff in this vein already. That'd mean serializable ~native types with no serialization/deserialization overhead, and it's applicable to a lot of things; Swift developed the support so they could do proper dynamically linked libraries (including handling version skew).
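To make the zero-copy idea concrete, here's a minimal sketch - this is not Cap'n Proto's actual layout, which adds arena allocation, inter-object pointers, and versioning/defaults on top of this basic trick:

    #include <stdint.h>
    #include <string.h>

    /* Fixed-layout record: the in-memory representation is the wire
     * representation, so there is no encode/decode step. */
    struct record {
        uint64_t id;
        uint32_t flags;
        char     name[32];
    } __attribute__((packed));

    /* "Serializing" is just writing the bytes... */
    static void record_write(void *buf, const struct record *r)
    {
        memcpy(buf, r, sizeof(*r));
    }

    /* ...and "deserializing" is just pointing at them - the packed
     * attribute makes the compiler handle unaligned member access: */
    static const struct record *record_read(const void *buf)
    {
        return (const struct record *) buf;
    }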
If I might plug my project yet again (as if I don't do that enough :) - bcachefs has a high quality embedded database at its core, and one of the dreams has always been to turn that into a real general purpose database. Much of what remains for making it truly general purpose is stuff we're going to want sooner or later for the filesystem anyways, and while it's all C today I've done a ton of work on refactoring and modernizing the codebase to hopefully make a Rust conversion tractable in the not too distant future.
(Basically, with the cleanup attribute in modern C, you can do pseudo RAII that's good enough to eliminate goto error handling in most code. That's been the big obstacle to getting a C codebase "close enough" to what the Rust version would look like that the conversion is mostly syntactic rather than a rewrite, and that work is mostly done in bcachefs.)
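For the unfamiliar, a minimal sketch of the pattern - the attribute itself is real GCC/Clang, but the _cleanup_ macro and helper names here are illustrative (systemd uses this same style):

    #include <stdio.h>
    #include <stdlib.h>

    #define _cleanup_(fn) __attribute__((cleanup(fn)))

    static void freep(void *p)    { free(*(void **) p); }
    static void fclosep(FILE **f) { if (*f) fclose(*f); }

    /* No gotos: the cleanups run automatically on every return path,
     * like Rust destructors would. */
    static int count_bytes(const char *path, size_t *out)
    {
        _cleanup_(fclosep) FILE *f   = fopen(path, "r");
        _cleanup_(freep)   char *buf = malloc(4096);
        size_t total = 0, n;

        if (!f || !buf)
            return -1;   /* both cleanups still run */

        while ((n = fread(buf, 1, 4096, f)) > 0)
            total += n;

        *out = total;
        return 0;
    }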
The database project is very pie in the sky, but if the project gets big enough (it's been growing, slowly but steadily), that's the dream. One of them, anyways.
A big obstacle to codebases that we can continue to understand, maintain, and improve over the next 100 years is giant monorepos, and anything we can do to split giant monorepos apart into smaller, cleaner, reusable components is pure gold.
No multithreaded write benchmarks. That's a major omission, given that's where you'll see the biggest difference between b-trees and LSM trees.
The paper also talks about the overhead of the mapping table for node lookups, and says "Bf-Tree by default pins the inner nodes in memory and uses direct pointer addresses to reference them. This allows a simpler inner node implementation, efficient node access, and reduced contention on the mapping table".
But you don't have to pin nodes in memory to use direct pointer lookups. Reserve a field in your key/value pair for a direct in-memory pointer, and after chasing it check that you got the node you expect; only fall back to the mapping table (i.e. hash table of cached nodes) if the pointer is uninitialized or you don't get the node you expect.
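Roughly like this - illustrative types, not the paper's or bcachefs's actual code, and a real implementation also has to make the speculative pointer chase memory-safe (e.g. via RCU or epoch-based reclamation):

    #include <stdint.h>

    struct node {
        uint64_t id;            /* stable node identifier */
        /* ... keys, children ... */
    };

    struct child_ref {
        uint64_t     id;        /* authoritative reference */
        struct node *cached;    /* direct pointer; may be NULL or stale */
    };

    /* The mapping table, i.e. hash table of cached nodes: */
    struct node *mapping_table_lookup(uint64_t id);

    static struct node *deref_child(struct child_ref *ref)
    {
        struct node *n = ref->cached;

        /* Fast path: chase the pointer, then verify it's the node we
         * expected: */
        if (n && n->id == ref->id)
            return n;

        /* Slow path: pointer uninitialized or stale (node evicted,
         * memory reused) - fall back to the mapping table, re-cache: */
        n = mapping_table_lookup(ref->id);
        ref->cached = n;
        return n;
    }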
"For write, conventional B-Tree performs the worst, as a single
record update would incur a full page write, as evidenced by the
highest disk IO per operation."
Only with a random distribution - but early on, the paper talks about benchmarking with a Zipf distribution. Err?
The benchmark does look like a purely random distribution, which is not terribly realistic for most use cases. The line about "a single record update incurring a full page write" also ignores the effect of cache size vs. working set size, which is a rather important detail. I can't say I trust the benchmark numbers.
Prefix compression - nice to see this popping up.
"hybrid latching" - basically, they're doing what the Linux kernel calls seqlocks for interior nodes. This is smart, but given that their b-tree implementation doesn't use it, you shouldn't trust the b-tree benchmarks.
However, I found that approach problematic - it's basically software transactional memory, with all the complexity that implies, and it bleeds out into too much of the rest of your b-tree code. Using a different type of lock for interior nodes where read locks only use percpu counters gives the same performance (read locks touch no cachelines shared by other CPUs) for much lower complexity.
Not entirely state of the art, and I see a lot of focus on optimizations that likely wouldn't survive in a larger system, but it does look like a real improvement over LSM trees.
Sure, but in principle, looking at the paper, I'd expect it to outperform B-trees since write amplification is generally reduced. Are you thinking about cases requiring ordering of writes to a given record (lock contention)?
I think their claims of write amplification reduction are a bit overstated given more realistic workloads.
It is true that b-trees aren't ideal in that respect, and you will see some amount of write amplification - but in my experience, not enough that it should be a major consideration.
You really have to take into account working set size and cache size to make any judgements there; your b-tree writes should be driven by journal/WAL reclaim, which will buffer up updates.
A purely random update workload will kill a conventional b-tree on write amplification - like I mentioned, that's the absolute worst case scenario for a b-tree (updating a single 100 byte record at the cost of rewriting a full 4k page is roughly 40x write amplification, before even counting interior nodes). But it just doesn't happen in the real world.
For the data I can give you, that would be bcachefs's hybrid b-tree - large btree nodes (256k, typically) which are internally log structured; I would consider it a minor variation on a classical b-tree. The log structuring means that we can incrementally write only the dirty keys in a node, at the cost of some compaction overhead (drastically less than a conventional LSM).
In actual real world usage, when I've looked at the numbers (not recently, so this may have changed) we're always able to do giant highly efficient b-tree writes - the journal and in-memory cache are batching things up as much as we want - which means write amplification is negligible.
Also, you can use dense B+-trees for reads - possibly with bloom filters or the like if you expect/profile a high fraction of negative lookups - and use an LSM to eventually compact. That gets you SSD/ZNS friendly write patterns as well as full freedom to compact a layer only once its finer state is no longer relevant to any MVCC/multi-phase-commit schemes. For example: run a compression algorithm until you just exceed the storage page size, take its state from just before it exceeded, and begin the next bundle with the entry that made you exceed the page size. It's quite helpful when storage space or IO bandwidth is somewhat scarce.
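A sketch of that bundling loop - the compressor API here is entirely hypothetical (real streaming APIs like zstd's look different), it just shows the shape of the idea:

    #include <stddef.h>

    #define PAGE_SIZE 4096

    /* Hypothetical streaming compressor with cloneable state: */
    struct cstate;
    struct cstate *cstate_clone(const struct cstate *);
    void           cstate_free(struct cstate *);
    /* Compresses one entry, returns total compressed size so far: */
    size_t         compress_entry(struct cstate *, const void *, size_t);

    /*
     * Feed entries until the output would exceed one page, then roll
     * back to the snapshot taken just before the overflowing entry;
     * that entry begins the next bundle.
     */
    static size_t pack_page(struct cstate **cur,
                            const void **entries, const size_t *lens,
                            size_t nr, size_t *first_unpacked)
    {
        size_t i, out = 0;

        for (i = 0; i < nr; i++) {
            struct cstate *snap = cstate_clone(*cur);
            size_t new_out = compress_entry(*cur, entries[i], lens[i]);

            if (new_out > PAGE_SIZE) {
                cstate_free(*cur);
                *cur = snap;            /* roll back */
                break;
            }
            cstate_free(snap);
            out = new_out;
        }

        *first_unpacked = i;            /* first entry of the next bundle */
        return out;
    }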
If you're worried about the last layer being a giant, unmanageably large B+-tree, just shard it similarly in key space, so that you don't need much free temporary working space on the SSD to stream the freshly compacted data to while the inputs to the compaction still serve real-time queries.
Of course mileage may vary with different workloads, but are there any good benchmarks/suites to use for comparison in cases like these? They used YCSB but I don't know if those workloads ([1]) are relevant to modern/typical access patterns nor if they're applicable to SQL databases.
You thinking about running some benchmarks in a bcachefs branch (:pray:)?
I want to see this data structure prototyped in PostgreSQL.
They're ancient; I only have pure random and sequential benchmarks - no Zipf distribution, which really should be included.
Feel free to play around with them if you want :) I could even find the driver code, if you want.
I've always been curious about PostgreSQL's core b-tree implementation. I ran into a PostgreSQL developer at a conference once, and exchanged a few words that as I recall were enough to get me intrigued, but never learned anything about it.
In a system as big, complex and well optimized as either bcachefs or postgres, the core index implementation is no longer the main consideration - there are layers and layers, and the stuff that's fun to optimize and write papers about eventually gets buried (and you start thinking a lot more about how to lay out your data structures, and less about optimizing the data structures themselves).
But you know that in something like that there are going to be some clever tricks that few people know about or even remember anymore :)
For a reductionist, it might be better understood as - step outside of your usual mode of thinking. Remember that you don't know everything. Or just - take time to stop and smell the flowers. Try to spend more time noticing and less time analyzing.
There are things that are difficult to communicate directly in the reductionist mode of thought - and are intended to have meaning at multiple levels of abstraction. You have to think a bit more laterally.