The speed shouldn't be the issue. If it is, please start using a library like Lo-Dash [1], a faster version of Underscore.js. Actually, please just do that in general so you won't make mistakes like this.
I'd draw an entirely different lesson from this -- namely, killing time by doing "100 other minor cleanup changes to a bunch of different files" on a 40,000 LOC legacy production system that (apparently?) has no tests sounds like a Really Bad Idea(tm). Don't mess with code you don't understand if you don't have even the basic safety net of decent unit tests in place.
I am scratching my head at how someone can write a blog post correcting code and leave such egregious code as the revision. The last snippet of code is still busted and very cringeworthy.
The proper way to do alternate behavior for the first iteration would be this...
It's O(n) versus the author's O(2n).
if (attributeArray instanceof Array) {
  if (attributeArray.length >= 1) {
    // do something with attributeArray[0]
    for (var i = 1; i < attributeArray.length; ++i) {
      // do something different with attributeArray[i]
    }
  }
}
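For anyone who wants to actually measure this, here's a minimal Node.js sketch of the kind of comparison being argued about. The array size, workload, and function names are all invented for illustration; both shapes compute the same thing.

```javascript
// Hypothetical micro-benchmark (Node.js assumed; array size and the
// per-element work are invented) comparing the two loop shapes.
var arr = [];
for (var n = 0; n < 5000; ++n) arr.push(n + 1);

// Shape A: branch on the index inside every iteration.
function withCheck(a) {
  var sum = 0;
  for (var i = 0; i < a.length; ++i) {
    if (i === 0) sum += a[i] * 2; // alternate behavior for the first element
    else sum += a[i];
  }
  return sum;
}

// Shape B: handle the first element before the loop, then start at 1.
function hoisted(a) {
  var sum = 0;
  if (a.length >= 1) {
    sum += a[0] * 2; // alternate behavior hoisted out of the loop
    for (var i = 1; i < a.length; ++i) sum += a[i];
  }
  return sum;
}

console.time('withCheck');
for (var r = 0; r < 1000; ++r) withCheck(arr);
console.timeEnd('withCheck');

console.time('hoisted');
for (var s = 0; s < 1000; ++s) hoisted(arr);
console.timeEnd('hoisted');
```

The absolute numbers will vary wildly by engine and warm-up, which is rather the point of the surrounding discussion.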
Scientifically, sure; practically, though, it matters when you're working on large data sets and have already exhausted the snappiness you would have liked.
Correct; however, if this "exact runtime bound" is what you are going for, then please use the correct notation. T(N) = 2N would be legitimate, as would \Theta(N) = 2N. Saying that O(F(N)) = 2N is misleading because the definition of big-O notation explicitly discards all constant factors and constant addends.
It's still worth noting, as others have, that there's a huge assumption baked in to your estimate: that all operations are created equal when it comes to execution time.
If we're going to talk "practically" here :) things like branch prediction will ensure that nothing like 2x the execution time will be spent on just the double conditional.
Meanwhile, what you do inside that loop becomes much more important to the time estimate than the conditional (.length is a fast single property lookup, whereas just accessing a value in the array, which is presumably the purpose of this method, will often trigger a bounds check against length, then an offset into the looked-up backing memory, and so on).
So it's not really worth doing for speed. All that said, I think yours reads better anyway.
Not to mention the anti-pattern of doing this in the first place.
When manipulating collections you almost always want .map, .filter, .reduce, .some, .every, or a variation thereof, not a plain old for loop for this sort of thing; even doing this a million or a billion times a second theoretically shouldn't matter. (Explained in: http://stackoverflow.com/a/17253577/1348195)
Also, checking for instanceof Array is a JS anti-pattern to begin with and makes code a lot less generic.
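To make the readability point concrete, here's a small sketch with made-up data (the prices and the markup are invented, not from the article) showing the declarative style next to the equivalent index-managed loop:

```javascript
// Hypothetical data; the numbers are made up for illustration.
var prices = [5, 12, 8, 30, 2];

// Declarative: says what we want, not how to iterate.
var markedUpTotal = prices
  .filter(function (p) { return p > 10; }) // keep prices above 10
  .map(function (p) { return p * 2; })     // double each one
  .reduce(function (sum, p) { return sum + p; }, 0); // → 84

// Equivalent plain for loop: same result, more index bookkeeping.
var total = 0;
for (var i = 0; i < prices.length; ++i) {
  if (prices[i] > 10) total += prices[i] * 2;
}
```

Both produce the same total; the first version just states the intent directly.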
I agree that the Array methods are extremely useful, aside from their ugly performance-sacrificing function scope. In this case, though, when you need alternate behavior for the first index, it is much clearer to read a standard for loop with some setup code before it than a forEach loop with an if-else inside.
People use arrays for things like queues and stacks all the time in JavaScript, and they're _much_ slower than a hand-rolled collection (e.g. http://jsperf.com/deque-vs-array-2 ), so why not micro-optimize that as well?
If you have a very large array of course optimization can/should be considered but that's simply not the average case.
I actually like the slice code for its ease of readability. Functional methods are extremely nice, and I hope they get faster and faster with the optimization of JS JITs.
I'd never heard of Deque [1], that is mighty cool. I agree that premature optimization is not good, but when it can be done cleanly without hurting the readability of the code, I'd say keep it in one's repertoire, to effectively write more performant code, more often, on average.
I see your point, but it's a bit of a stretch to call a style preference an anti-pattern. You even caveat it yourself in your stackoverflow response: "it's also a lot more readable (at least to me)".
Agreed on the hideousness of that 'instanceof Array' check.
Can you elaborate on what's wrong with the final solution from the article? I don't see anything wrong there and I wouldn't expect your snippet to be faster.
Speed isn't the only issue here. In fact, the execution of the looping construct itself is almost meaningless compared to what you're actually doing in it. (Explained here http://stackoverflow.com/a/17253577/1348195 )
While in general I indeed think that people should profile code before they optimize it, I also think that they should avoid unnecessarily pessimising it by writing code that is not exactly idiomatic and is known to have an overhead.
In this particular case I am not sure I would prefer slicing + forEach to writing an old-style for-loop on any relatively hot path. Of course, either profiling or thorough knowledge of the code should be applied before making this decision, to determine whether this code path is hot or not, what the average size of the array is, and so on. The main motivation for this decision would be the fact that slice produces a copy of the array. This is both a hindrance in terms of performance (as VMs right now will not elide the copy operation) and in terms of readability (e.g. when I see such code I have to ask myself: does doOtherFunction require a fresh copy or not? Does it use only the first argument, or all of what forEach passes into it?).
So you can see the choice for me is pretty complicated.
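To illustrate the two styles being weighed here (the data and the `seen` bookkeeping are invented purely so the sketch is self-checking; the original code's handlers aren't shown in the thread):

```javascript
// Sketch with invented data, comparing the two styles.
var attrs = ['id', 'name', 'email'];
var seenA = [];
var seenB = [];

// Style A: slice(1) allocates a fresh (n - 1)-element copy just to skip
// index 0; forEach also passes (value, index, array) to its callback,
// which is the "does it use only the first argument?" worry above.
seenA.push('first:' + attrs[0]);
attrs.slice(1).forEach(function (v) { seenA.push('rest:' + v); });

// Style B: old-style loop; no copy, just an explicit start index.
seenB.push('first:' + attrs[0]);
for (var i = 1; i < attrs.length; ++i) seenB.push('rest:' + attrs[i]);
```

Both visit the same elements in the same order; Style A pays for an extra array allocation to get there.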
It's not worthless when you are writing an app that happens to run in the browser.
I agree 1000-fold that the Array manipulation techniques are extremely easy to read. Keep in mind, though, that you are creating a new copy of elements [1..n] of the array through the slicing, and the overhead of the function call with forEach will matter on large Arrays. It really depends what you are building. Don't call performance optimization worthless.
"Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." - Knuth
Micro benchmarks are not how we optimize - we profile.
Who says the function call actually happens? If the function is short enough, it is a very good candidate for the JIT to inline. If it isn't short, the overhead of calling the function shouldn't matter much anymore.
Checking the index in every iteration of a loop will effectively double the amount of operations performed in every iteration of the loop for this scenario. That is, O(2n) versus O(n)
Not really; things like branch prediction and JIT optimizations will kick in pretty fast. This is why these benchmarks on jsPerf suck: the code gets a different 'fresh' version every time, and JITs (like Crankshaft in V8) don't get to work.
Effectively, you're checking 'the performance of interpreted JavaScript' most of the time.
I bet my bottom dollar that branch prediction is not going to magic away that index dependent if statement in a language like JS, even with Crankshaft in full effect.
The actual percentage depends on the CPU. On mobile the cost would likely be higher (e.g. on my phone it's around 33%); on older CPUs it can be even higher, but people don't usually care about those these days.
I made an assumption, and you're making an assumption too. I'd agree that in any case where it actually matters, you'd be doing a small number of ops in the if statement. The complexity change will depend on what you were doing in the loop compared to the cost of a single if statement. If it were a large number of ops, then yes, the difference would be more minor.
My point is that "O(2n)" is an extremely specific and misleading way to characterize 'extra conditional per iteration'. It doesn't matter what the actual percentage difference is.
It's completely pointless to use such a notation here in lieu of description or benchmarks, but if you insist then use something like O(kn) vs. O((k+1)n)
I'd expected the JS interpreter to optimize that out. Turns out I was wrong, and the version from the article is even 40% slower for a loop that iterates over 5000 elements and does a trivial amount of work:
You're not giving the JIT a chance to warm up at all. Also, you're degrading into 'slow mode' because your array has different types. Basically, it looks like you're doing a lot of bad things to the JIT on purpose :P
Both the different types and the delete (array with holes) are _very_ harmful for performance in v8.
Arrays with holes are called sparse arrays. I googled 'sparse array v8 performance' and didn't find much, but I've previously seen a Google presentation on a slide-sharing site that mentioned how poorly sparse arrays perform compared to normal contiguous arrays.
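A quick sketch of how such holes arise in the first place: `delete` removes the property but not the slot, producing a sparse array, whereas `splice` keeps the array contiguous.

```javascript
var a = [10, 20, 30];
delete a[1];     // removes the property but not the slot: a is now sparse
// a.length is still 3, but index 1 is a hole: (1 in a) === false

var b = [10, 20, 30];
b.splice(1, 1);  // removes the element and closes the gap
// b is [10, 30]: contiguous, no hole, length 2
```

Engines like V8 track whether an array has holes and fall back to slower element handling once it does, which is why the `delete` above is the performance trap.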
That's not how big O notation works. You need to eliminate constant multipliers because you can't make any assumptions about the amount of work each iteration involves. It certainly doesn't correlate directly to machine instructions.
Also, putting 'var' in a loop header is debatable because it gives the false impression that the variable "i"'s scope is confined to the for loop, when in reality it will be alive throughout the containing function. JSLint would bark.
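A tiny sketch of that scoping behavior, with an invented function name:

```javascript
function demo() {
  for (var i = 0; i < 3; ++i) {
    // loop body
  }
  // 'var' is function-scoped, so i is still alive here,
  // despite looking loop-local in the header above.
  return i; // 3, not a ReferenceError
}
```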
That code snippet is also using a faulty technique for identifying an Array. Although it will work for the majority of cases, if you're using an array from a different iframe, the instanceof check will fail since the Array constructors are not the same.
You should be using Array.isArray (ES5-compatible browsers), or Object.prototype.toString.call(obj) === '[object Array]'.
Lodash, Underscore, and jQuery all provide utility methods to do that comparison for you.
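A sketch of a helper combining those two checks (the helper itself is hypothetical; the checks are the ones named above). Both work across iframe boundaries because neither compares constructor identity:

```javascript
function isArray(obj) {
  if (typeof Array.isArray === 'function') {
    return Array.isArray(obj); // ES5 built-in, preferred where available
  }
  // Fallback for pre-ES5 environments: the internal [[Class]] string
  // is 'Array' regardless of which frame the array came from.
  return Object.prototype.toString.call(obj) === '[object Array]';
}
```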
The only problem here was that a for/in loop was being used. Iterating the array, rather than its properties, was clearly what was actually wanted. However, the author uses the opportunity to take a strong stance in favour of always using strict equals, even though it was never a bug here anyway. As far as I can see, the only argument this article produces against double equals is that overzealous developers might accidentally break your code trying to be proactive.
Strict equals is something I only use when necessary in javascript. Despite the "taboo" surrounding double equals, I rarely face situations where its use adds brittleness to the code. Is there some horrible danger that I am just not seeing?
Well in this case, the original author was relying on '0' == 0 for their code to work. Even if you were dead set against using strict equality (which is silly, but whatever), the code is still disingenuous and should have tested index == '0' to make its intent clear. There's no other value that they could have been relying on it to coerce without some other very nasty things going on.
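To spell out the coercion: `for...in` yields property names as strings, so `index == 0` only ever matched via type coercion. A small sketch:

```javascript
var arr = ['a', 'b'];
var firstKey;
for (var index in arr) { firstKey = index; break; }

// for...in yields property names as strings:
// firstKey is '0' (a string), so
//   firstKey == 0   is true  ('0' is coerced to the number 0)
//   firstKey === 0  is false (string vs. number)
```

Writing `index === '0'` (or just not using `for...in` on an array) makes the intent visible instead of hiding it behind coercion.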
> overzealous developers might accidentally break your code trying to be proactive
It's not just people changing your code, it's people trying to read your code (including you, months later).
I'd recommend putting comments around code that looks like it may be wrong, or just avoiding such code, as proposed. For instance, if I must do an assignment within a conditional, I would at least put a comment on the line before saying the assignment was intentional. But it's worth writing the assignment on its own line for clarity.
[1] http://lodash.com/