The author claims this produces “Uncaught SyntaxError: Invalid or unexpected token”. This is incorrect; you end up with the body being `return; [item];` thanks to Automatic Semicolon Insertion, which is unlikely to be what was intended, and will produce a warning in some environments (e.g. “unreachable code after return statement” in Firefox), but is not a syntax error.
I presume the author wrote slightly different code. Most especially, I don’t believe anything is ever indentation-sensitive.
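For reference, here's a minimal sketch of what ASI actually does with a newline after `return` (the article's exact code may have differed; the names here are made up):

  function wrap(item) {
    return      // ASI inserts a semicolon here, so the function returns undefined
    [item];     // this line becomes an unreachable expression statement
  }
  console.log(wrap(42)); // undefined, not [42] - and no syntax error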
This is why I hate that Javascript has automatic semicolon insertion. A whole new set of rules to keep in mind because some developers can't be bothered to put semicolons after their expressions and statements. Unlike languages like Python, which were designed from the ground up to avoid semicolons, or Bash, which requires very explicit multi-line splits, Javascript just doesn't work well without them, and the language clearly suffers for it.
Javascript shouldn't be indentation dependent, but I have no trouble believing that it's doing something weird to indents because of this "feature".
What is this set of rules exactly? As far as I'm concerned all you have to do is not put newlines between your "return" and whatever is being returned, which is hardly a common thing anyways.
Generally, the rule seems to be "insert a semicolon wherever the syntax messes up but can be fixed by adding semicolons". Then there are some special cases (++/--/continue/break/return/throw/arrow functions/yield) with more specific rules about where semicolons are placed and when; those are probably why the example produces "unexpected" results.
I think the added complexity of these special rules and cases is an unnecessary burden, and honestly the language should never have allowed leaving out semicolons when it was clearly designed to have them.
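To make one of those special cases concrete, here's a small sketch (the values are arbitrary): a postfix ++/-- must sit on the same line as its operand, so a newline before it changes the parse:

  let a = 1, b = 2, c = 3;
  a = b
  ++c
  console.log(a, b, c); // 2 2 4 - parsed as `a = b; ++c;` rather than `a = b++; c;`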
Semicolons are like the periods of sentences It's so much easier to read something someone else wrote when thoughts are properly terminated Why would you ever demand that semicolons are removed They also prevent issues such as pronouns like SemanticStrength if used with mechanical automatic period insertion
Seriously, semicolons are something that was baked into the design assumptions of JavaScript. Python's an example of a language where the syntax was clearly designed from the inception to avoid them. Most such languages make explicit tradeoffs and TBH I'd much rather have semicolons than whitespace sensitivity.
No the analogy is wrong.
I appreciate your example devoid of periods, and while I agree that periods in natural language significantly reduce cognitive overhead, that is only because sentences in English can contain multiple statements.
Periods are not needed if each sentence spans exactly one line, which is the case in programming languages.
The separators &&, || and () do the intra-statement disambiguation, while between statements {} suffice; semicolons bring no value in practice. Using multiple semicolons on a single line is pathological code that should never exist in the first place.
When I code in Kotlin (which, unlike Python, is not whitespace-sensitive) I really notice that reading semicolon-less code is smoother: it takes fewer cognitive resources (parsing is faster) and is more pleasant because the reading is not interrupted by vacuous symbols. There are no ambiguities in practice. The thing is, habits take time to die, but ask anyone with experience in an officially semicolon-less programming language; most will tell you it is a net improvement.
BTW the vast majority of newer programming languages are semi-colon-less (Kotlin, Go, Swift, Julia, Scala, etc)
A series of individual invocations of arcane commands; sometimes with pipes
Though it is not uncommon for one-liners to exist (sometimes with sub-shells), and in that case the semicolons are not optional. The same goes even when another type of escape is used to visually break the one-liner up into a list-like series of commands that happen to run concurrently, most frequently for really long pipe sets.
Putting each "sentence" of code on its own line is actually the crutch that leads to low density and high line counts. As humans en masse become more comfortable parsing and writing machine inter-languages I imagine things will trend the other way.
You could do other things to prevent it from sticking out of the side of the monitor, but if it has to say "return true if this and that and that and that and that", it would (to me) look better on separate lines.
One particularly annoying case comes up when using IIFEs:
let c = a + b
(() => { ... })()
This is evaluated as:
let c = a + (b(() => {...}))()
which usually results in "TypeError: b is not a function." Obviously this can be solved with a semicolon after the "let c..." line, but another common approach is to just prefix all IIFEs with semicolons, since empty statements are no-ops in JS:
let c = a + b
;(() => {...})()
I personally prefer to forego semi-colons, but I have a hard time getting worked up over it either way.
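For anyone who wants to try it, here's a self-contained version of the snippets above, with placeholder values filled in:

  const a = 1, b = 2

  // Without a terminating semicolon the next two lines parse as
  // `let c = a + b(() => { ... })()` and throw "TypeError: b is not a function":
  //
  //   let c = a + b
  //   (() => { console.log('hi') })()

  // With the leading-semicolon convention (or a semicolon after `b`) it works:
  let c = a + b
  ;(() => { console.log('IIFE ran') })()   // prints: IIFE ran
  console.log(c)                           // 3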
Prettier can screw that situation up pretty badly. A stray return can suddenly turn into a multiline return statement with boolean logic and assignments inside.
Go handles this in the lexer with one simple, predictable rule rather than anything like JavaScript's ASI: semicolons are inserted at line ends automatically, and you only need to write them yourself to split multiple statements on the same line. So it falls into the same category as Bash in the GP's examples; Go's syntax has been designed around having semicolons omitted.
Whereas JavaScript technically requires them, but browsers decided to support parsing missing semicolons for the sake of making bad websites render (much like how most browsers will accept a variety of incorrectly formatted HTML documents as well). The problem here is that JavaScript's syntax wasn't designed to have semicolons omitted, which leads to all sorts of edge-case problems and no clear answer on how they should be handled.
The JS part of your comment isn't correct at all. ASI rules are fully described in the very first JS spec, and in 20+ years of JS coding I can't recall ever having heard of browsers handling it inconsistently.
Maybe it's the way the language is taught that has changed? Originally (pre-ECMA days) you were taught that you should always include a semi-colon. The older specifications certainly don't encourage dropping semicolons liberally either. Take this passage from the specification:
> Most ECMAScript statements and declarations must be terminated with a semicolon. Such semicolons may always appear explicitly in the source text. For convenience, however, such semicolons may be omitted from the source text in certain situations. These situations are described by saying that semicolons are automatically inserted into the source code token stream in those situations.
Which, to me at least, reads as saying JavaScript must include a semicolon but interpreters must be tolerant of the occasional missing semicolon where intent is obvious.
Somewhere along the line the attitude seems to have changed from "you should include them" to "don't bother". Or at least that's the perception I get these days.
I've not been heavily involved in JavaScript for a few years, but I was one of the early adopters back in the days of Netscape Navigator and, later, Internet Explorer (albeit with its differing but ever-so-similar "JScript").
As an aside, since I'm talking about the early days of web scripting, wouldn't it be great if we brought back the language attribute of <script> to support other languages? It wouldn't be sensible in practice (for a whole plethora of reasons) but one can dream....
JS appeared in Dec 95 and the first ECMA spec came in June 97, so that's a pretty narrow window. But it's certainly true that ASI style (omitting semicolons) was relatively rare before, say, 2010-2014 or so. Its subsequent rise in popularity is probably thanks to Node (which uses ASI style), and a small group of ASI developers whose code suddenly got more visible once npm came along.
But as a practical matter, the rules on ASI have been the same since forever, and the rules are that they're nearly always optional. The only realistic case where omitting them causes issues is when you start a line of code with a parenthesis or a square bracket. Back when ASI first got popular that was the one big gotcha, but nowadays thanks to linters even that's not an issue.
What you may be remembering is that, some years back, there were some cases where certain JS tools didn't support ASI. Famously, there was a thing where a popular minifier (JSMin) broke ASI code and Crockford didn't want to fix it because he didn't like ASI style (he later relented). But I've never heard of an actual JS engine not supporting ASI correctly.
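The square-bracket variant of that gotcha looks like this (the array contents are just illustrative):

  const rows = [['a'], ['b'], ['c']]
  const copy = rows
  [0, 1].forEach(x => console.log(x))
  // Parsed as `const copy = rows[0, 1].forEach(...)`: the comma operator
  // evaluates to 1, so forEach runs on rows[1] and logs 'b',
  // and `copy` ends up undefined (forEach's return value).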
> JS appeared in Dec 95 and the first ECMA spec came in June 97, so that's a pretty narrow window.
In theory yes, in practice not really. IE4 was the first version of JScript to be based on ECMA and that was released at the end of 1997. But in that era people didn't update regularly. Sure some folk (myself included) installed IE4 on top of Windows 95 SE[1]. But a lot of people wouldn't have had access to IE4 until they bought a new PC with Windows 98 pre-installed. So there was a good few years in practice.
Also worth noting that the web changed so much between 95 and 98. It might have only been three years in real terms but it felt like a lifetime in terms of technical revolutions.
The second most popular formatter is semi-colon-less https://standardjs.com/
Also yes, I strongly agree the web should support other languages, and it would be very easy to achieve by integrating the polyglot and interoperable GraalVM.
I never have trouble with automatic semicolon insertion in real JS-based projects thanks to similar format tools. JS only really has this problem when it is serving its unique position as the scripting language of the web.
This one was a real surprise, the rest were "sharp eyes" or "understand how to use your tools".
function foo() {
  let a = b = 0
  a++
  return a
}
console.log(typeof a)  // "undefined" - a is scoped to foo
console.log(typeof b)  // "undefined" here too, but "number" once foo() has been called, because b leaks to the global scope
As expected, `a` is defined with `let`, so it is a local variable. But `b` is assigned without let, so it actually becomes a global. I personally do not like declaring multiple variables in a single statement like this, so I've probably not coded any bugs with this behaviour, but I have most certainly worked on code which I now understand to have latent bugs due to this behaviour.
That's why we introduced the strict mode ("use strict" pragma)[1]. Among other things, it prevents you from accidentally declaring a global variable this way, throwing a ReferenceError instead.
Actually, at this point, with the proliferation of strict mode and linters, I'd say that these old gotchas mostly belong to quizzes and spec discussions, since personally I have not seen new code written this way, even in vanilla JS, for years now.
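A quick sketch of the difference (the names are illustrative):

  // Sloppy mode: assigning to an undeclared name silently creates a global.
  function sloppy() { let a = b = 0; }
  sloppy();
  console.log(typeof b);   // "number" - b leaked onto the global object

  // Strict mode: the same assignment throws instead of leaking.
  function strict() { 'use strict'; let c = d = 0; }
  strict();                // ReferenceError: d is not defined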
I should apologize: my normal way of writing made it seem like I wasn't sure about having seen it, but I was sure. I often use phrases that imply inexact information (for example: "I guess") when what I mean is: oh yeah, I am sure about this thing.
I didn't notice this quirk about global vars either, and indeed if you copy-paste it into jsfiddle, both console.logs actually print "undefined".
TIL if you use strict mode, undefined vars do not become global.
TIL Array.sort sorts integers lexically by default. I wonder how many bugs that has caused over the years.
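For reference, a small sketch of the default behaviour and the usual fix:

  const nums = [10, 1, 3, 21];
  console.log(nums.sort());                 // [1, 10, 21, 3] - elements are compared as strings
  console.log(nums.sort((a, b) => a - b));  // [1, 3, 10, 21] - numeric comparator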
As a side-comment, I often see JavaScript learning channels and Twitters focusing a lot on JS gotchas and stuff like the prototype chain. I've worked on production JavaScript apps for years and have never needed to touch the JS prototype system. Time could probably be spent better elsewhere, as fun as the gotchas are. NB: I'm not necessarily saying the author of this blog is guilty of this - for all I know this is the only post like that they have written; but I have seen many channels/Twitters where this kind of stuff is most of what they discuss (possibly because of interview questions in their areas).
The more surprising thing about `Array.prototype.sort` is that it both sorts the array in place and returns the (now-sorted) array. You can go a long time writing `const arr2 = arr1.sort()` without realizing that what you're doing is wrong, and the code you're writing is misleading readers about what it does.
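A small sketch of the in-place behaviour and one way around it (newer engines also offer the non-mutating toSorted):

  const arr1 = [3, 1, 2];
  const arr2 = arr1.sort((a, b) => a - b);
  console.log(arr1);           // [1, 2, 3] - arr1 itself was mutated
  console.log(arr1 === arr2);  // true - sort returns the very array it sorted

  const arr3 = [3, 1, 2];
  const arr4 = [...arr3].sort((a, b) => a - b);  // copy first to leave arr3 alone
  console.log(arr3, arr4);     // [3, 1, 2] [1, 2, 3]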
> I’ve worked on production JavaScript apps for years and have never needed to touch the JS prototype system.
I think that is similar to “why do we bother teaching χ in school, I've never used it as an adult”, where the answer is a mix of:
1. For the people it is useful to, it is very useful. Those that may later be building frameworks rather than using them for instance, to save them reinventing wheels, or those coming from other languages who might otherwise have more trouble reasoning with JS as they try to use it exactly like they've used something else.
2. For the people it isn't directly useful to, at least a vague understanding of what goes on can be useful if only because you need some knowledge to know if you need to care.
3. For everyone else, perhaps myself included though I'm probably more in group 2, you taking a hit to your learning time is a sacrifice well made to make sure the people who do find it useful get it!
I think I will never understand the Javascript prototype chain.
People always say that it's actually simpler than "traditional", Java-like classes... yet when JS gained those, with ES6, I could immediately "grok" what `class` and `new` were doing. I could never do that with __proto__ and prototype.
An object 'a' having a prototype of 'b' means that if a property is not found on 'a', it will then look it up on 'b'. It's just chaining. To set an object's prototype, you set the special __proto__ property on it.
var a = { foo: 5 };
a.__proto__ = { foo: 10, bar: 2 };
console.log(a.foo, a.bar); // 5, 2
You can also use Object.create() to initialize an object with a given prototype. So the above is equivalent to:
var a = Object.create({ foo: 10, bar: 2 });
a.foo = 5;
console.log(a.foo, a.bar); // 5, 2
The real confusing part is the property called "prototype". This doesn't exist on random objects, it's only available on functions, and it sets the __proto__ of the object created with that function when used as a constructor (that is, when using the "new" operator).
function Foo() {}
Foo.prototype = { foo: 10, bar: 2 };
var a = new Foo();
a.foo = 5;
console.log(a.foo, a.bar); // 5, 2
The "new" operator also sets a.constructor = Foo, and calls the constructor method. You can think of the new operator as syntax sugar performing the below:
function Foo() {}
Foo.prototype = { foo: 10, bar: 2 };
// The three lines below are roughly equivalent to `var a = new Foo();`
var a = {};
a.__proto__ = Foo.prototype;
Foo.call(a);
a.foo = 5;
console.log(a.foo, a.bar); // 5, 2
The reason for all of these different mechanisms is history. The original mechanism was the constructor `prototype`, then in ES5, Object.create() was added. Then, much later on, `__proto__` was standardized (it was a non-standard extension before).
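These days the less underscore-y way to do the same things is Object.getPrototypeOf / Object.setPrototypeOf; a small sketch:

  const proto = { greet() { return 'hi from the prototype'; } };

  const obj = Object.create(proto);                    // create with a given prototype
  console.log(Object.getPrototypeOf(obj) === proto);   // true
  console.log(obj.greet());                            // found via the prototype chain

  const other = {};
  Object.setPrototypeOf(other, proto);                 // re-point an existing object's prototype
  console.log(other.greet());                          // 'hi from the prototype'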
The prototype object model is half-baked in JavaScript.
The other prototype based language I've programmed in was the Io language and there it made sense, just by having a built-in clone utility. See Objects section https://iolanguage.org/tutorial.html
With the lack of a clone operation, in order to define new object types you had to go through hacky stuff, like explicitly setting up .prototype properties.
> The prototype object model is half-baked in JavaScript.
Exactly. Not just half-baked but half-assed too, the entire concept of constructors doesn't belong for starters.
Javascript's prototypal system was never a philosophy, it was just a way to quickly get an object system up and running (which is why delegation has to be hand-rolled, delegating ctors stinks, and most of the native objects can't be derived from at all), essentially a terrible implementation detail people had to deal with out of necessity.
> With the lack of a clone operation, in order to define new objects types you had to go through hacky stuff, like explicitly setting up .prototype properties.
As you link to Io: JS got Io's clone way back (in ES5, I think?) as Object.create. I don't think it's very good, though I think Io's clone is even worse, as it doesn't really do what you'd intuit from reading the word (I think JS's naming is less confusing, though hardly great).
In Self, the "clone" operation would simply do a shallow copy (unless you'd overridden it). So you'd have an object, "clone" it, and you'd get a copy of that with the same slots set to the same value.
A prototype then would simply be an existing "reference" instance you'd copy, and you'd get the same parent slots, meaning you'd get the same parents (what Javascript and Io mis-call prototypes).
Ah, so you mean calling the superclass constructor. To me, the wording ”delegate a constructor” means ”to entirely shift responsibility for constructing an instance to something that isn’t the actual constructor”, which made me think of something else than the normal way extending constructors work.
> yet when JS gained that, with ES6, I could immediately "grok" what `class` and `new` was doing. I could never do that with __proto__ and prototype.
The new ES6 "class" syntax is really syntax sugar on the existing prototype system. "class" does not add new functionality to the language. Anything that can be written with "class" can be written with the older way. So really, you need to understand the prototype system to really understand how "class" works.
But I think you don't actually need to deeply understand how the prototype system works to use the new ES6 class syntax. Your "Java intuition", plus knowing that there could be gotchas in corner cases, should suffice if you don't do anything weird.
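A quick sketch of the sugar: the two definitions below behave essentially the same (modulo details like method enumerability and classes requiring `new`):

  // ES6 class syntax...
  class Dog {
    constructor(name) { this.name = name; }
    speak() { return this.name + ' says woof'; }
  }

  // ...is roughly sugar for the pre-ES6 constructor/prototype pattern:
  function OldDog(name) { this.name = name; }
  OldDog.prototype.speak = function () { return this.name + ' says woof'; };

  console.log(new Dog('Rex').speak());     // "Rex says woof"
  console.log(new OldDog('Rex').speak());  // "Rex says woof"
  console.log(typeof Dog);                 // "function" - a class is still a function underneath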
It depends on your starting point. Java-like classes are simpler to you because you already know them. JS prototypes have a few simple rules you also need to know. I personally think both are simple.
When people talk about something being simple (or complex), they're talking about how many "things" there are in the concept, rather than how easy or intuitive it is to understand (since that's subjective, as pointed out).
Prototypal inheritance is seen as simpler because conceptually it involves fewer "things". Both JS and Java have runtimes with values containing basic data (strings, ints, bools etc) as well as objects. Where they differ is that JS has first-class functions (meaning functions are also values just like basic data and objects), which is what enables prototypal inheritance. And Java has a compile time where it generates static objects (classes) that aren't values and can't be changed; that's what enables classical inheritance.
It's generally considered that conceptually the functional approach is simpler, because in those languages functions adhere to the same rules as objects and other data, you can add new ones, remove them, move them around, copy them into a different variable etc. So all the "things" that apply to working with functions are just the same "things" that work on everything else, there's nothing really extra there, you don't need all these special rules and concepts like "reflection" where you're "reaching into the internals" of the running program or whatever it is in Java, because you don't have any "internals" to reach into. Nothing is happening at some compile time before the code actually runs, and nothing is static or out of bounds.
None of that really has much to do with prototypal inheritance itself. But it's relevant in that you can't just pick which kind of inheritance to use, you need the language constructs that enable them, so they're part of the complexity. The two forms of inheritance are basically the same thing, you just have an object with your functions on it that has a special label saying "if you don't find what you're looking for, look at this object instead". The difference is that in Javascript that label is just a standard property on an object, that can hold another object. Whereas in Java, that label is an "extends" or "inherits" clause on class that's wired up at compile time, and at runtime it's treated as static and outside of the programmer's control.
Just a nitpick, but Java’s classes are values (though immutable) and are available by the getClass method of every object.
But thank you, great summary! Though now I am interested in whether a javascript-like inheritance is pretty much possible with lambdas in java, since instance field resolution does work identically to the javascript logic (though with runtime immutable inheritance chain).
Are you using some bespoke Babel plugin that actually adds Java-like classical inheritance to JavaScript, or are you implying that the standard EcmaScript 6 classes which Babel handles by default are somehow "Java-like" instead of prototype based? Because they are not, they still use the same prototype-based inheritance, even though the syntax kinda looks like Java, and have all the same pros and cons of prototypal inheritance as before.
I kind of know what `class`, `new` and `extends` does, I never know what `__proto__` and `prototype` does without googling it and staring at the code for a long time.
The point is you don't need to know, though. It's called __proto__ precisely because it's something internal that the programmer can of course have some fun with if they so choose, but it is unnecessary for everyday use. Unless you do extreme meta-stuff, but that is very far from normal programming and should be done by few, very sparingly, well targeted and triple-checked.
I mean, for the advanced programmer, things about the internal workings of the ECMAScript spec and of actual runtimes - most such blog posts are about V8, covering Chrome, Edge (new) and node.js - are good to know, such as what an Execution Context is, what the Execution Stack is, what a Scope Chain is and how they are implemented. Or what happens internally when you modify an object's structure by adding or removing properties. Or what happens when you call a function with different types and numbers of parameters, and how/why this destroys internal optimizations. Most programmers will have only a fuzzy understanding of those, if any, and that's fine.
Even before "class" was introduced into Javascript people were constructing "classes" left and right, using the prototype, and many (most?) did not know of or did not use __proto__. The latter is used internally during runtime, it is not something you need to manipulate. Your "interface" is the `prototype` property (see https://stackoverflow.com/a/9959753/544779).
Reading the ECMAScript spec for someone thus far only well-versed in MDN as Javascript documentation feels quite weird, there's some learning curve required, new words, new concepts. But that's because now you opened the hood and are looking at the gears and the engine of the language.
The basics of some event loop internals, such as what microtasks are, are much more important to know. Asynchronous issues catch people more often and are much harder to debug.
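A minimal sketch of the micro- vs. macrotask ordering that tends to trip people up:

  console.log('sync 1');
  setTimeout(() => console.log('timeout (task)'), 0);
  Promise.resolve().then(() => console.log('promise (microtask)'));
  console.log('sync 2');

  // Output:
  //   sync 1
  //   sync 2
  //   promise (microtask)   - microtasks run as soon as the current script finishes
  //   timeout (task)        - tasks wait for a later turn of the event loop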
Compile 3 times to figure out the answer in 30 seconds and never do the wrong one again, saving your employer about 1,000 man hours across the team, which they’ll instead use to interview additional people seeing if they unnecessarily memorized these answers and smugly pretend that “anybody that doesn’t know this basic knowledge is a joke, source: me, I made that up on the spot for my ego” not knowing what the day to day coding challenges are at all from the very people that were hired to “hit the ground running”
On the closure and event loop one, I suspect some people might wonder if you can change the output by adjusting the timing; if an iteration took over 100ms, would it then emit different values? Well, you can try it:
for (var i = 0; i < 3; i++) {
  // busy-wait ~100ms so each iteration takes longer than the timeout delay
  const end = Date.now() + 100;
  while (Date.now() < end);
  setTimeout(() => {
    console.log(i);
  }, 100);
}
The answer is no, it will still log 3 three times, because of how the event loop works: execution is single-threaded, which means that anything you queue up, whether microtasks like promises or tasks like timeouts, has to wait until you finish what you're executing. There's no preemption.
Can someone explain this? Is each iteration of the loop body considered a separate block scope?
Edit: Each time the loop is run, a new anonymous function is created. But in the "let" version, each of those functions is bound to a different version of "i", which means "i" is actually being re-declared as a new and separate variable every time the loop body runs?
It's not just that `let` is block-scoped, it's that a for-let loop scopes the variable inside the loop iteration, so each iteration has a completely different binding.
You can have block scoping and your loops be scoped the other way around still:
r := []func() int{}
for i := range [5]int{} {
r = append(r, func() int { return i })
}
fmt.Println(r[0]()) // 4
r = r[:0]
for i := 0; i < 5; i++ {
r = append(r, func() int { return i })
}
fmt.Println(r[0]()) // 5
You can even have different loop types be scoped differently e.g. in C#, a C-style for loop will update the binding in-place but a foreach loop will not:
var r = new List<Func<int>>();
for(int i=0; i<5; ++i) {
r.Add(() => i);
}
Console.WriteLine(r[0]()); // 5
r.Clear();
foreach(int i in Enumerable.Range(0, 5)) {
r.Add(() => i);
}
Console.WriteLine(r[0]()); // 0
Right, I forgot that it was triggered by var vs let.
> so each iteration has a completely different binding.
That was what confused me about this rule as well: If you write "for (let i = 0; i<n; i++)", this will actually create n+1 independent variables, all named "i" but bound to different scopes.
The variables can even be independently modified:
You can write
for (var i=0; i<3; i++)
setInterval(() => {console.log(i); i+=100}, 1000);
This will print the "expected" output:
3
103
203
303
403
...
i.e. all the closures bound to the same variable. The same output is produced by
let i; for (i=0; i<3; i++)
setInterval(() => {console.log(i); i+=100}, 1000);
However, if you write
for (let i=0; i<3; i++)
setInterval(() => {console.log(i); i+=100}, 1000);
you get:
0
1
2
100
101
102
200
201
202
...
i.e. each closure logs - and increments(!) - its own copy of i.
Neither "for" nor "let" on their own have this behaviour, it's a special language rule which seems to be triggered by using "let" in the first clause of a "for" statement.
I can understand the intention behind the design, but I feel that, in order to make a common usecase less surprising, they made the whole thing actually more surprising.
> I can understand the intention behind the design, but I feel that, in order to make a common usecase less surprising, they made the whole thing actually more surprising.
I don't agree with that, `for let` simply behaves as if the `let` was inside the loop:
for (var _i=0; _i<3; _i++) {
let i = _i;
setInterval(() => {console.log(i); i+=100}, 1000);
}
I agree with xg15 that it’s surprising, however well-intentioned. The general wisdom is that C-style for loops are simple sugar for while loops, that the following two are equivalent:
for (initialiser; condition; incrementer) { body; }
initialiser;
while (condition) {
body;
incrementer;
}
This made perfect sense and was easy to reason about. Add lexical scoping, and it should obviously have just gained an extra level of scope, so that bindings are limited to the loop:
{
initialiser;
while (condition) {
body;
incrementer;
}
}
But instead, the initialiser became awkwardly half inside the loop and half outside the loop: inside as regards lexical bindings, but outside as regards execution (… and lexical bindings in the remainder of the initialiser). That’s wacky. I understand why they did it, and in practice it was probably the right decision, but it’s logically pretty crazy, and much more convoluted for the spec and implementation. https://262.ecma-international.org/6.0/#sec-for-statement-ru... (warning: 4MB of slow HTML) and the following headings show how complicated it makes it. In essence, you end up with this:
{
initialiser;
{
Lexically rebind every name declared in the initialiser.
(Roughly `let i = i;`, if only that shadowed a parent, as it does
in many such languages, rather than raising a ReferenceError.)
while (condition) {
body;
incrementer;
}
}
}
This also reveals problems in your attempted desugaring:
for (var _i=0; _i<3; _i++) {
let i = _i;
setInterval(() => {console.log(i); i+=100}, 1000);
}
The trouble is that the condition and incrementer are not supposed to refer to a var, but rather to the lexical binding - which in your desugaring sits on the next line, inside the body, where they can't see it.
Hoisting. Within the function body, the declaration of the variable, but not its initialization, is hoisted (re-ordered) to the top of the function, much like old C. So the clobbered global definition happens at the function-frame setup step rather than exactly where a sane programmer would expect such insane code to perform it.
A fun guess-the-output to illustrate the implicit/permissive nature of JS:
console.log(parseInt("020" + 020, base=5))
Most languages would throw an error because "020" + 020 = "02016" and we try to parse this in base 5. However, the parseInt function silently stops reading the input if it encounters an invalid character.
Language design nitpicks:
- the octal prefix is kind of difficult to see and could lead to accidental conversions, at least the clearer representation 0o20 is also understood.
- the parseInt function understands the prefix 0x, but not 0b, 0o and only when the base is 16 or not set
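A few illustrative calls (these follow from the spec'd parseInt/Number behaviour):

  parseInt("02016", 5)   // 51 - stops at the '6', which is not a base-5 digit
  parseInt("0x1F")       // 31 - the 0x prefix is honoured when no radix is given (or radix 16)
  parseInt("0b11")       // 0  - parses the leading "0", stops at 'b'
  parseInt("0o20")       // 0  - same story for the 0o prefix
  Number("0b11")         // 3  - Number() does understand 0b/0o/0x literals
  Number("0o20")         // 16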
I suppose it works as a poor man's named argument. You still need to follow the argument order, but the code might be slightly more readable at the expense of creating random variables all over your code. The expression `a = 5` evaluates to 5 after all, which allows things like `a = b = c = d = 0` or the listed `let a = b = 5` which can cause confusion.
My bad, too much Python lately. I wanted to be more explicit about this argument and briefly tested the syntax in a Node console. It works, but... I didn't catch the atrocious side effect. Thanks for pointing out the footgun!
Yes, replace 'var' with 'let' and it ends up correctly scoped. var also gets scoped to the function (or in this case, the window if you just run this loop directly), so you'll even be able to view the value 3 with window.i which is also unexpected. Always use let, I guess.
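Concretely, in a classic (non-module) browser script - modules and Node behave differently - this is what you get:

  for (var i = 0; i < 3; i++) {}
  console.log(window.i);   // 3 - a top-level var becomes a property of the global object

  for (let j = 0; j < 3; j++) {}
  console.log(window.j);   // undefined - let/const never show up on window
  console.log(typeof j);   // "undefined" - and j isn't visible outside the loop at all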
There are generally no backwards-incompatible changes made to JavaScript, on account of how decades-old websites continue to be supported.
The confusion here is that `i` is never snapshotted in time. The variable is captured by the closure, but the only thing captured is the symbol, not the value.
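In other words, a tiny sketch:

  let x = 1;
  const read = () => x;   // the closure captures the binding x, not the value 1
  x = 2;
  console.log(read());    // 2 - it sees whatever x holds at the moment it runs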