Ask HN: Language-agnostic concepts a Backend Engineer should know?

danpalmer · on July 2, 2021

I think it's important to know about different data storage options and their trade-offs. Managing state is one of the hardest parts of backend development, particularly at scale, so an understanding of the trade-offs in databases/caches/blob-storage/queues, and when each is useful is important.

I'd pay close attention to speed and "correctness". What's the consistency model of a system? Can we lose data and if so how? What's the throughput? Latency?

These help you choose good technology for backend systems, and help answer questions like:

- Can we do this in-band while serving a user request?

- Can we do it 100 times to serve a request?

- If it completes successfully can we trust it or do we still need to handle failure?

- Can we trust it immediately or eventually?

There are lots of technologies and terms for all of this but I've specifically avoided them because the important bit is the mental model of how these things fit together and the things they allow/prevent.

SOLAR_FIELDS · on July 2, 2021

Absolutely this.

Also, I’ve had to explain this to so many other engineers, both junior and senior to me: most data is inherently relational. This next statement is a bit opinionated: 9 times out of 10 you probably want an RDBMS. I’ve seen so many attempts to shoehorn some ElasticSearch/Mongo/Neo4j/whatever database into a design because the developer wanted to work on CoolDatabaseTech. Then you’re stuck dealing with joins in CoolDatabase that it wasn’t really designed to do and frustrated at CoolDatabase’s lack of drivers in X language. Later on you’re dealing with stability and scalability issues you would never see with BattleTestedRDBMS.

The amount of capability a well designed Postgres instance can output is insane. I’ve seen a single vertically scaled Postgres instance compete with 100+ node Spark clusters on computations.

danpalmer · on July 2, 2021

Exactly, but it goes a lot further than RDBMSs. For example does the application expect that all items on a queue will be processed? If so then you need a durable queue and Redis probably isn't a good idea, and this will likely reduce the throughput of the queue which might change how it needs to be used.

One I've been bitten by several times is expecting APIs to allow me to read-my-writes, only to find that their underlying data store is eventually consistent. The integration point/API client on our end may end up being twice as complex or more just to handle that.
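To make the read-my-writes problem concrete, here is a minimal Python sketch. `LaggyStore` is a hypothetical toy that serves stale data for a couple of reads after a write, and `read_your_write` is one possible client-side workaround (poll until the version you wrote shows up). Real APIs differ in how, or whether, they expose a version, so treat the names and the versioning scheme as assumptions.

```python
import time


class LaggyStore:
    """Toy stand-in for an eventually consistent API (hypothetical)."""

    def __init__(self, stale_reads=2):
        self._visible = {}      # key -> (value, version) that reads return
        self._in_flight = {}    # key -> [value, version, stale reads left]
        self._stale_reads = stale_reads
        self._version = 0

    def write(self, key, value):
        self._version += 1
        self._in_flight[key] = [value, self._version, self._stale_reads]
        return self._version    # caller keeps this to recognise its own write

    def read(self, key):
        pending = self._in_flight.get(key)
        if pending is not None:
            if pending[2] == 0:  # simulated replication has caught up
                self._visible[key] = (pending[0], pending[1])
                del self._in_flight[key]
            else:
                pending[2] -= 1  # still serving the old value
        return self._visible.get(key, (None, 0))


def read_your_write(store, key, min_version, attempts=5, delay_s=0.0):
    """Poll until the store serves at least the version we wrote."""
    for _ in range(attempts):
        value, version = store.read(key)
        if version >= min_version:
            return value
        time.sleep(delay_s)
    raise TimeoutError(f"{key!r} still stale after {attempts} reads")
```

Even in this toy form the client needs a version to compare against, a retry budget, and a failure path, which is roughly where the extra complexity comes from.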

mamcx · on July 2, 2021

> 9 times out of 10 you probably want an RDBMS.

And the last 1 can be done (modeled) in a RDBMS when the scale/pressure/volume is low. In other words, wait until you feel the heat.

SOLAR_FIELDS · on July 2, 2021

Yeah, Postgres off the top of my head does NoSQL (JSONB), graph and time series stuff either natively or through some cheap or freely available add ons. It really can do anything. It’s not gonna be the best at that non-relational stuff, but it will do a “good enough” job for most use cases until you introduce heavy scaling.

Zealotux · on July 2, 2021

I'm currently learning back-end coming from a front-end career, and I started reading "Designing Data-Intensive Applications" by Martin Kleppmann, seems to be a must-have for anyone who wants to get serious in this field.

davidkell · on July 2, 2021

+1 For this recommendation, an exceptional book. In particular chapters 2-3 demystify how databases work, and provide an intuition on when to use different kinds of databases and query languages without the typical hype (spoiler alert - use an RDBMS!)

humbleMouse · on July 2, 2021

Yes this is a great book and a perfect example of language agnostic knowledge that is useful.

throwaway81523 · on July 2, 2021

Security mindset: read the book Security Engineering (it is online), less for specific technical info than for the many war stories etc. which will help you guard against vulnerabilities and unforeseen consequences.

Basics of cryptography: there are many dumb errors to avoid.

Antirez's general advice about "10x programmers" is good: http://antirez.com/news/112

Thorough (not just basic) knowledge of SQL, if you don't count that as a language. The sqlite.org "technical and design documents" about sqlite's virtual machine and its query planner are well worth reading, and apply to other databases as well. ORM's are less important than SQL, and are usually language specific as someone mentioned.
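As a small way to start poking at a query planner, here is a sketch using Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN`. The `users` table and `idx_users_email` index are made up for illustration, and the exact wording of the plan text varies between SQLite versions, so the comments only describe the general shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # detail is the last column


lookup = "SELECT name FROM users WHERE email = ?"

# Without an index the planner has to scan the whole table.
plan_before = query_plan(conn, lookup, ("a@example.test",))

# With an index on the filtered column it can search instead.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, lookup, ("a@example.test",))

print(plan_before)  # mentions a SCAN of users
print(plan_after)   # mentions a SEARCH using idx_users_email
```

Other databases have their own `EXPLAIN` syntax and output, but the habit of checking the plan before and after adding an index carries over.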

Reasonable clue about socket programming, even if you're doing everything with libraries that wrap the details.

Comfort using debugging and profiling tools.

Lots of other stuff, I'm sure.

comprev · on July 2, 2021

This book? https://www.cl.cam.ac.uk/~rja14/book.html

throwaway81523 · on July 3, 2021

Added: Oh nice, there is a third edition (2020) now. I only knew about the 1st and 2nd editions. The 2nd ed is completely online but the 3rd only has sample chapters online: OTOH, the 3rd is in all likelihood an updated/expanded version of the 2nd. So you could read the 2nd and decide about buying the 3rd.

I'd say it's a book for inspirational bedtime reading, rather than careful study or reference. But it's great in that way. Security is about mindset more than anything else, and the book puts you right into it.

throwaway81523 · on July 2, 2021

Yes. It is a little bit old by now, but it is a great book.

nogbit · on July 3, 2021

IAM, who and what is authenticating, how and what permissions will it have.

What data is coming into your system and its variety, velocity and volume.

Do you really need NoSQL, probably not.

Do you really need that ORM and all the schema, migrations and ops to go with it? Know the pros and cons.

Are your boundaries defined well? Networking, firewalls etc? Are or do they need to be identity aware?

Are you logging what you need to log, where you need to log it and do the right people have access to it? Maybe metrics are really what you need.

What’s the dev story like? Can I run things locally or easily without spending days recreating an environment? IAC is one thing, but debugging some Python locally vs deploying and print statements sucks. Have a good readme and leave the repo better than you found it.

Tackle the hard problems first, then create a reproducible developer story, then hand it off to someone Jr. so they can do the grunt work and you can help them out in a jiff.

CI/CD, incrementally improve it over time and don’t spend time boiling the ocean here. A simple bash script to deploy may suffice for an SRE to take it to the next step as IAC or to drop it into some CI tool.

Apply the practice of least privilege from the very start.

KISS, if what you are building is too confusing, it’s because you need to spend more time writing about it vs actually writing it.

tekstar · on July 2, 2021

C and how to debug it.

If you understand the system a layer of abstraction or two below the layer you work in, you will be able to debug deeper. Learn system calls and the various ways to examine processes.

I learnt a lot of this back in the day by completing war games on a site called digital evolution (dievo). Those are antiquated now but still a really fun way to learn it.

pdpi · on July 2, 2021

Also, learn Javascript and browser tech in general. One of the keys to high-performance systems is figuring out how to make the layer above and the layer below talk to each other while removing yourself from the process as much as possible.

cudgy · on July 2, 2021

I would think that most front end developers already know JavaScript. It might make sense to look at some other approaches like Go (and maybe Rust/C/C++), which is designed for back-end work unlike JavaScript which was basically shoehorned into back-end work in order to leverage all the JavaScript developers in many companies.

tifadg1 · on July 2, 2021

sounds very niche from an employability PoV.

sidlls · on July 2, 2021

These skills have helped me at every single job I've had, including my current one, where I've employed the skill a couple of times to discover bugs or unintended side-effects in third-party library code. For context I've worked in the industry for about 15 years now, in roles ranging from low-level programming on real-time systems to (currently) high level machine learning work.

It might be niche, but it can also be a differentiator--even if I'm not the fastest coder or the best architect I have these other skills that make me valuable at critical times. That's worth something anywhere.

And it's not even knowledge of C or C++, or syscalls or whatever: it's just basic "use a debugger" (not merely pantomime the commands, but understand what's going on) skills that can be the real game changer.

SOLAR_FIELDS · on July 2, 2021

Totally agree with this. I have never written C or C++ in a professional manner but just knowing how to debug and compile it has proved to be an invaluable skill. Lots of software ecosystems often have some underlying C/C++ code that they are calling out to and being able to dive into that when there is an issue is an incredible skill to have.

Make, Ninja etc. are fairly straightforward compared to something like Gradle and just knowing my way around that and Clang/GnuCC has gotten me a lot farther in my career.

pdpi · on July 2, 2021

It doesn't make you "employable". It does make you incredibly valuable once you're inside. Being able to debug systems and being able to talk to other teams working on different levels of abstraction are both incredibly important skills.

fennecfoxen · on July 2, 2021

You are by all means welcome to remain ignorant of how computers and frameworks and libraries all work, incapable of tracking down and fixing bugs in your application stack that affect you. Build a career filling in the boilerplate that Spring or Rails or whatever generates, and when the system doesn't perform as you expected it to, throw up your hands and say "I don't know what it's doing," and end the matter there. This is, presumably, a quite ordinary and common approach to software development.

But if you do so, then I must ask: what are you doing on this site? What can you possibly get out of it?

cudgy · on July 2, 2021

Your first paragraph is dead on. When I first encountered Ruby on Rails after developing in C++ for years, it was very difficult to accept the black box of Ruby on Rails. In fact, my instincts are to fear tools like that. The sheer number of dependencies on third parties when using tools like Ruby on Rails is downright frightening. Good luck making sure that all of the libraries you're linking to are secure and that you can fix any issues that come up.

konart · on July 2, 2021

>>frameworks and libraries all work, incapable of tracking down and fixing bugs in your application stack...

What does any of this have to do with C?

>What can you possibly get out of it?

I hope you realise that many of the commenters here are not even software engineers. HN covers more topics than just 'another arrogant software dev lectures someone on topic X'

whynaut · on July 2, 2021

> 'another arrogant software dev lectures someone on topic X'

Not GP. But this is explicitly a software thread.

> >>frameworks and libraries all work, incapable of tracking down and fixing bugs in your application stack...

Pretty sure they are implying that C/C++ lies somewhere in most software stacks.

konart · on July 2, 2021

>Pretty sure they are implying that C/C++ lies somewhere in most software stacks.

So do many other things. It's nice to know every bit of tech behind the scenes but one has only this much time.

jgwil2 · on July 2, 2021

Well, one doesn't have time to learn everything, so it's entirely reasonable to do a cost-benefit analysis (in terms of employability or personal interest) before investing.

pdpi · on July 2, 2021

The OP's statement makes the context clear though: It's not about knowing everything. It's about knowing a couple of abstraction layers under you (and, as I argued elsewhere, above you too).

It's useful to understand a few classic networking problems that a backend engineer might face — e.g. weird latency issues caused by Nagle's algorithm, or TCP CLOSE_WAIT leading to ephemeral port exhaustion between a proxy and an application node. It's useful to understand why we mostly moved to event loop-based servers instead of thread-per-connection servers as a way to handle c10k.

A backend engineer doesn't necessarily have to be an expert on any of these things, but they should be able to follow along if an expert explains that sort of problem.
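For the Nagle's algorithm case, the usual knob is the `TCP_NODELAY` socket option. Nagle's algorithm holds back small writes so they can be coalesced, which can interact badly with delayed ACKs on chatty request/response protocols. A minimal Python sketch, where `make_low_latency_socket` is just an illustrative helper name:

```python
import socket


def make_low_latency_socket():
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 sends small writes immediately instead of coalescing them.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_low_latency_socket()
# Read the option back to confirm it took effect (non-zero means enabled).
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
```

Whether disabling it is the right call depends on the traffic pattern. Many small writes with Nagle off means more packets, so it is a trade-off and not a free win.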

jgwil2 · on July 2, 2021

OP was responding to a comment specifically about debugging C. Someone questioned the value of this to employability and then the comment I responded to laid into him/her, suggesting no one on HN should ask that. I am defending that question.

tekstar · on July 2, 2021

Yeah it's the complete opposite of niche. It's the fundamental understanding of processes, signals, syscalls, memory that every modern computing system is based upon.

cudgy · on July 2, 2021

So much for being language agnostic

Diggsey · on July 2, 2021

IME, it's the following considerations that make back-end development hard:

- Fault tolerance.

- Backwards (or forwards) compatibility.

- Scalability.

- Testability.

- Everything around state (backup/restore, migration strategies, data integrity, etc.)

Most other things are a one-time cost. These things are an ongoing burden to consider, but if you forget to consider them it can be devastating.

Also remember: any time you give a (internal or external) customer programmatic access to something, that is an API, and APIs have huge costs to maintain. That includes when you dump your database into a "data lake" for internal reporting...

ianpurton · on July 2, 2021

Some that spring to mind

- Database Migrations

- Kubernetes

- Basic RPC and code generation i.e. gRPC, OpenAPI and GraphQL.

- Realtime Concepts, i.e. Kafka, MQTT

- DevSecOps

- Builds. i.e. make files.

- Jobs, i.e. cron or batch and job workflows.

- Offsite incremental DB backups and restore.

- Infrastructure as Code i.e. Pulumi.

comprev · on July 2, 2021

- How to write clean and concise documentation, including references to further reading material which helped you solve that particular problem.

- Basics of server/runtime environment security (RBAC, least privileges, common threats, etc.)

agentultra · on July 2, 2021

- HTTP: REST/HATEOAS, headers, transport layer caching, rate limiting, load balancing

- Authentication: OAuth2 is probably the most widely used

- Authorization: RBAC

- Some rudimentary statistics: know how to read metrics, write metrics, etc

- Learn one RDBMS inside and out. Other database systems have their place but you’ll almost always encounter a Postgres, MySQL, MSSQL. Learn how to read EXPLAIN output, cursor based pagination, and indices.
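A rough sketch of cursor-based (keyset) pagination, using Python's `sqlite3` so it runs anywhere. The `items` table and the `fetch_page` helper are invented for the example. The idea is to filter on the last-seen key instead of using `OFFSET`, so each page is an index range scan rather than a skip over all earlier rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 6)],
)


def fetch_page(conn, after_id=0, limit=2):
    """Return (rows, next_cursor); next_cursor is None on a short page."""
    rows = conn.execute(
        "SELECT id, name FROM items WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, limit),
    ).fetchall()
    # A full page may have more rows behind it, so hand back the last id.
    next_cursor = rows[-1][0] if len(rows) == limit else None
    return rows, next_cursor


# Walk all pages by feeding each cursor back in.
pages = []
cursor = 0
while cursor is not None:
    rows, cursor = fetch_page(conn, after_id=cursor)
    pages.append([row[0] for row in rows])
```

With this simple cursor rule, a table whose row count is an exact multiple of the page size yields one extra empty page at the end. Fetching `limit + 1` rows is a common way to avoid that.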

chris_j · on July 2, 2021

Understand the importance of having good visibility of your system. Implement good logging and collect metrics, for example the four golden signals of throughput, latency, saturation and error rate. The Google SRE book gives a good introduction to some of these concepts. See for example https://sre.google/sre-book/monitoring-distributed-systems/.

Understand how to load test your system and to reason about its behaviour under load and its failure modes when you push it too hard. It's one thing to be able to build a system and functionally test it such that you're confident that it behaves correctly when you send one request at a time. It's another thing to let thousands or millions of real users hit it for real in production and to have confidence that you are giving them all a good experience.
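One small illustration of why latency is usually reported as percentiles and not as an average. This sketch uses the nearest-rank percentile definition, which is only one of several in common use, and the sample latencies are made up.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds, with one slow outlier.
latencies_ms = [10, 12, 11, 13, 500]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)
```

Here the mean lands far above what a typical request sees and far below the slow one, while the median and the tail percentile each describe something a user actually experienced.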

gilfoyle · on July 2, 2021

Would totally recommend this book "Patterns of Enterprise Application Architecture" by Martin Fowler

https://martinfowler.com/books/eaa.html

bbkane · on July 3, 2021

I would add soft skills - how to sell your ideas; how to ask for help; how to offer help without giving offense; how to write docs, emails, or proposals; how to avoid taking offense when it feels like you're being ignored or slighted (usually not the case); how to keep stakeholders updated; creating realistic timelines. I'm sure I'm missing some. You can accomplish more than your technically astute colleagues and work on more interesting projects if people trust you, like you, and feel inspired and happy to work with you.

stueynz · on July 3, 2021

Privacy management. Privacy laws are severely restricting who may see people’s PII.

Are you using the PII data for purposes other than those it was originally collected for?

Can you synthesise a good enough set of test data so you don’t have to anonymise production data? Hint: you can’t sufficiently anonymise production data and still have it be useful
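A bare-bones sketch of what synthesising test data can look like, using only Python's standard library. The name lists, field names, and the `synth_users` helper are all invented for illustration. Realistic data usually needs to match the distributions and edge cases of production, which this does not attempt.

```python
import random

# Hypothetical vocab; nothing here is derived from real people.
FIRST_NAMES = ["Alex", "Sam", "Jo", "Kim"]
LAST_NAMES = ["Rivera", "Chen", "Okafor", "Novak"]


def synth_users(n, seed=0):
    """Generate n fake user records, reproducibly for a given seed."""
    rng = random.Random(seed)  # private RNG so tests are deterministic
    users = []
    for i in range(1, n + 1):
        first = rng.choice(FIRST_NAMES)
        last = rng.choice(LAST_NAMES)
        users.append(
            {
                "id": i,
                "name": f"{first} {last}",
                # .test is a reserved TLD, so these can never reach a real inbox.
                "email": f"user{i}@example.test",
                "birth_year": rng.randint(1950, 2005),
            }
        )
    return users
```

Seeding the generator means a failing test can be reproduced with the same records.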

yewenjie · on July 2, 2021

Slightly related, what are the most high-quality resources for learning backend engineering?

Sonata · on July 2, 2021

Having a good knowledge of HTTP is useful in many different contexts.

- The correct semantics for each HTTP method

- What different status codes indicate

- Common headers, particularly around caching

- HTTP 1.1 vs HTTP 2

- Common authentication protocols - OAuth 2.0, JWTs, etc.

bovermyer · on July 2, 2021

Besides all of the excellent suggestions others have said, I would add:

- the OSI model, DNS, TCP/UDP, TLS, and networking in general

- CPU flame graphs and other low-level performance/debugging tools

- the Knightmare devops story

- anger management

rasikjain · on July 3, 2021

Here are some of the concepts in no particular order. This is the quick list I came up with based on my experience and usage.

1) RDBMS, NoSQL Concepts

2) Writing Queries and Joins

3) Connecting to Database native and ORM

4) HTTP Verbs like POST, GET, DELETE, PUT etc

5) Restful API and GraphQL Concepts

6) Session State, Application State, Caching and Safe Error Handling

7) Distributed Systems, SOA, Async Functions (i.e file handling)

8) Design Patterns, OOPs concepts (Abstraction, Interfaces etc)

9) Authentication, Authorization, Cryptography

10) Configuration, Minimum Privileges (e.g dbrole, server account etc)

eatonphil · on July 2, 2021

How to set up a decently secure, monitored and backed up server.

corobo · on July 2, 2021

Look into and learn as much as possible about penetration testing.

No better way to know how to secure your code than the mindset of "Ok how would I break into this" :)

perelin · on July 3, 2021

I always thought the roadmap.sh Backend Developer Roadmap gives a nice overview of technologies and concepts. Even though it misses some things imo (identity management comes to mind)

https://roadmap.sh/backend

aszen · on July 2, 2021

I have found that the most important bit as a backend engineer is modelling the application domain precisely. A taste for making good APIs helps and knowing how to measure performance.

These days most backend engineers also tend to manage data sources so understanding them is also a plus

koolba · on July 2, 2021

Don’t hide the async nature of external requests. It seems like a magically good idea until it all falls apart. Timeouts and error propagation need to flow from the request ingress all the way to the backend services.
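One way to picture timeouts flowing from ingress to the backends is a single deadline created when the request arrives and passed down to every downstream call. The sketch below is a simulation under stated assumptions: `Deadline`, `FakeClock`, `call_backend` and `handle_request` are hypothetical names, and the fake clock stands in for real network waits so the example stays deterministic.

```python
import time


class DeadlineExceeded(Exception):
    """Raised when the request's time budget is used up."""


class Deadline:
    """An absolute expiry time shared by every call made for one request."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + timeout_s

    def remaining(self):
        left = self._expires_at - self._clock()
        if left <= 0:
            raise DeadlineExceeded("request budget exhausted")
        return left


class FakeClock:
    """Manually advanced clock so the example needs no real sleeping."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


def call_backend(deadline, clock, cost_s):
    """Simulated downstream call that takes cost_s seconds."""
    timeout = deadline.remaining()  # per-call timeout is whatever budget is left
    if cost_s > timeout:
        clock.now += timeout        # we waited the full timeout, then gave up
        raise DeadlineExceeded("backend call timed out")
    clock.now += cost_s
    return "ok"


def handle_request(clock, backend_costs, budget_s=1.0):
    """Ingress: create the deadline once, let errors propagate to the caller."""
    deadline = Deadline(budget_s, clock)
    return [call_backend(deadline, clock, cost) for cost in backend_costs]
```

The point of the shape is that no layer invents its own timeout or swallows the error. The second backend call only gets the time the first one left over, and a failure surfaces at the ingress where it can be turned into a proper response.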

etaham · on July 2, 2021

Distributed systems, consensus & data consistency concepts

aristofun · on July 3, 2021

Any (almost) software engineering concept is language agnostic.

The sooner you get this attitude the more comfortable you'll be with your skills development.

slifin · on July 3, 2021

How to move your IO and side effects to the edge of your system

So that you can perform tests without firing the side effects
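A tiny sketch of that shape, with invented names (`Email`, `plan_welcome_emails`, `send_all`): the decision logic is a pure function that returns a description of what should happen, and a thin outer function is the only place the side effect actually fires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Email:
    to: str
    subject: str


def plan_welcome_emails(users):
    """Pure core: decides what to send, touches no network or database."""
    return [
        Email(to=user["email"], subject=f"Welcome, {user['name']}!")
        for user in users
        if not user["welcomed"]
    ]


def send_all(emails, send):
    """Edge of the system: the injected send callable performs the IO."""
    for email in emails:
        send(email)
    return len(emails)
```

In a test you can assert on the output of `plan_welcome_emails` directly, or pass something like `sent.append` as `send`, and no real email ever leaves the process.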

tompazourek · on July 2, 2021

Almost everything is actually language agnostic or transferable. So it's a big question.

eterps · on July 2, 2021

Domain Modeling (check out the book 'Domain Modeling Made Functional').

dudul · on July 2, 2021

Isn't an ORM the perfect example of something that is not language agnostic?

onei · on July 2, 2021

The implementation is, but the concept is transferable. Pretty much the same for most concepts in programming.

exdsq · on July 2, 2021

SQL is far more useful to learn than any one ORM or even the general concept of an ORM.

loevborg · on July 2, 2021

ORMs are language-specific, but the insight to avoid ORMs for the rest of your life because of impedance mismatch is general and deep.

Aeolun · on July 2, 2021

Lazy loading

Dependency inversion/injection

throwaway019254 · on July 2, 2021

Idempotence
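For anyone who hasn't met the term: an operation is idempotent if doing it twice has the same effect as doing it once, which is what makes retries safe. A common pattern is an idempotency key supplied by the client. The sketch below is a toy in-memory version with hypothetical names (`PaymentService`, `charge`). A real service would persist the keys and handle concurrent requests.

```python
class PaymentService:
    """Toy service that deduplicates charges by idempotency key."""

    def __init__(self):
        self._results = {}     # idempotency key -> result of the first attempt
        self.charges_made = 0  # how many real charges happened

    def charge(self, idempotency_key, amount_cents):
        # A retry with a known key returns the stored result, no new charge.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_made += 1
        result = {"charge_id": self.charges_made, "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result


service = PaymentService()
first = service.charge("order-42", 1999)
retry = service.charge("order-42", 1999)  # e.g. client retried after a timeout
```

After the retry, `first` and `retry` are the same result and only one charge has been made.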

Oleg2tor · on July 2, 2021