The fun thing was the Roland Sync. You could sync the TB-303, the TR-909 and all the others with a 5-pole DIN cable. The sync was badly implemented: it lagged, it had latency.
However!
As soon as you cabled them all together, their imperfections added up and they started to groove like nothing that had been heard before.
This I don't understand. DIN sync is raw trigger outputs/inputs like you'd have in a modular synth, in contrast to MIDI, which has to send serial messages over a 31.25 kbaud serial link.
Perhaps this take has something to do with calling a five-pin DIN plug '5-pole'? Something's wrong and backwards here.
Again, I guess this is where we are now? I remember reality, but here we are.
No, this is different from pure CV. Each device has its own (digital) sequencer that can synchronize with others using pulse trains over DIN cable. Lots of places where latency and instability can occur!
DIN sync is not trigger based; it's a full clock protocol invented by Roland that has various different states. The fact is that not all DIN-sync-capable machines implement it correctly, leading to slight differences in synchronisation, even between devices made by Roland.
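For a sense of the timing budget involved: DIN sync runs at 24 pulses per quarter note, so the arithmetic is simple. A rough Python sketch (my own toy numbers, not tied to any specific device):

```python
# DIN sync clocks at 24 pulses per quarter note (PPQN).
PPQN = 24

def pulse_interval_ms(bpm):
    # One quarter note lasts 60000/bpm milliseconds; 24 pulses divide it.
    return 60000.0 / bpm / PPQN

# At 120 BPM a clock pulse arrives roughly every 20.8 ms, so a device
# that reacts even one pulse late is audibly behind by that amount.
```

Which is why small implementation differences between sequencers can become an audible "groove" rather than tight lockstep.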
Interesting, I'd be curious to read a technical analysis of the differences. DIN provides by far the tightest sync we've ever seen in hardware, so I question the idea that lag and latency are a defining feature of the protocol.
Owner of all (ALL!!) the classic Roland x0x boxes here, which are connected together using DIN Sync - the sync was not badly implemented at all - they sync together perfectly.
The sequencers in each of the machines have a bit of nuance, which is where that famous groove comes from!
You might be confusing this with the sometimes hilarious MIDI timing of the 909 and 707.
The problem is rather that most data structure tutorials and books don't even entertain the idea of introducing a purely functional version; they merely present the imperative versions. Coming up with functional versions of data structures can be difficult. Papers about it can be hard to understand and often require you to already know some niche language the researcher used for the paper. Even if you can find a functional implementation of a data structure, there is often no good explanation, and you need to, sort of, reverse engineer it and translate it to the language you are using.
In short, it seems relatively few people have the skills to implement them, and even fewer have the skills to come up with functional versions of ordinary data structures. They are almost completely absent from university lectures as well, as far as I am aware. For AVL trees, for example, I could only find a document from an ETH lecture that is no longer taught. The language is Isabelle, and I need to understand its syntax first before being able to translate it to Scheme.
If anyone has an obscure source for implementations one can learn from and implement oneself in another language, please share.
[1] are the exercises with solutions (!) of a lecture [2] at ETH Zurich, which include AVL trees in Isabelle. The whole list of exercise PDFs is at [3]. I started porting the AVL trees to Scheme [4], but lately have not worked more on this. I also try to make the implementation easy to understand by providing explanatory comments.
Okasaki's is basically the Bible for this stuff. Anyone writing data structure libraries in a functional language will have read this or have it on their to-read list.
I have the book, but it doesn't contain that many data structures. Some of the perhaps most used (AVL trees?) are not in there.
Not all exercises have solutions. I get stuck on some exercises and have a hard time finding solutions I can compare with my implementations in Scheme. Not being a Haskell user (yet), I also have some issues translating Haskell to Scheme. In languages like Haskell, one often controls flow by pattern matching on types; in Scheme, one needs to define structures or records explicitly and use their predicates to check the type. It's all possible, but not exactly great for learning. The road has many sticks and stones to stumble over, so to speak.
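For what it's worth, the core of a persistent AVL insert is fairly compact once nodes are immutable. A hedged sketch in Python (tuples as nodes, names my own; a toy in the Okasaki spirit, not production code):

```python
# A node is None or a tuple (key, left, right, height). Insert never
# mutates; it returns a new tree and shares all untouched subtrees,
# which is what makes the structure purely functional / persistent.

def height(t):
    return t[3] if t else 0

def node(key, left, right):
    return (key, left, right, 1 + max(height(left), height(right)))

def balance(t):
    key, l, r, _ = t
    if height(l) - height(r) > 1:          # left-heavy
        lk, ll, lr, _ = l
        if height(ll) >= height(lr):       # single right rotation
            return node(lk, ll, node(key, lr, r))
        lrk, lrl, lrr, _ = lr              # left-right double rotation
        return node(lrk, node(lk, ll, lrl), node(key, lrr, r))
    if height(r) - height(l) > 1:          # right-heavy (mirror image)
        rk, rl, rr, _ = r
        if height(rr) >= height(rl):       # single left rotation
            return node(rk, node(key, l, rl), rr)
        rlk, rll, rlr, _ = rl              # right-left double rotation
        return node(rlk, node(key, l, rll), node(rk, rlr, rr))
    return t

def insert(t, key):
    if t is None:
        return node(key, None, None)
    k, l, r, _ = t
    if key < k:
        return balance(node(k, insert(l, key), r))
    if key > k:
        return balance(node(k, l, insert(r, key)))
    return t

def inorder(t):
    return inorder(t[1]) + [t[0]] + inorder(t[2]) if t else []
```

The four rotation cases are the whole trick; translating this shape to Scheme mostly means replacing tuple destructuring with explicit accessors.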
You can even build them with basically one line of code by sorting points using Morton / Z-curve order. It's linear time if you use a counting/radix sort.
Edit: lol, downvoted for this post. Never change, HN.
Here's an example from AWS, where lat/long pairs are put into a Z-index, which is used as a DynamoDB sort key, letting you efficiently query for items near a point.
I would like to know more about this, too. Is there code anywhere, ideally with comments? I'd be fine without comments too; I would just like to see the code, possibly with an example usage.
Okay, I was intrigued and I did some digging. Morton / Z-order is all about interleaving the individual bits of the x and y coordinates. You end up grouping by quadrants. Python one liner:
points.sort(key=lambda p: sum(((p[0]>>i&1)<<(2*i))|((p[1]>>i&1)<<(2*i+1)) for i in range(16)))
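Spelled out more readably (same idea as the one-liner; the helper name is mine):

```python
def morton_key(x, y, bits=16):
    # Interleave the bits of x and y: x takes the even bit positions,
    # y the odd ones. Sorting by this key walks the Z-curve.
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key

pts = [(2, 0), (1, 1), (0, 0), (0, 1), (1, 0)]
pts.sort(key=lambda p: morton_key(*p))
# The 2x2 block at the origin sorts together, in the Z pattern,
# before (2, 0) from the neighbouring block.
```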
Yeah good point, they are downsides for sure but it's a simple enough approach and most of all it can be shoved in a database (or b-tree or any 1d-sorted data structure).
And for a z-curve, the order is basically a depth-first traversal of a quadtree.
It’s super easy to click the downvote button by accident on mobile when you meant to upvote. And this UI will never be fixed because this is HN after all.
This is precisely the reason that I do not log in to HN on my phone. My phone is read-only, and if I want to upvote or comment then I have to switch to my laptop. Pretty easy with Firefox because I can send tabs to other devices.
There are a couple of tricks you can do if you fiddle with the bits of a floating-point value using integer arithmetic and binary logic.
That was a thing back in the '90s...
I wonder how hard the performance hit from moving values between integer and float pipeline is nowadays.
Last time I looked into that was the Cortex-A8 (first-iPhone era). Doing that kind of trick cost around 26 cycles (back and forth) due to pipeline stalls back then.
There are basic integer operations in the FP/SIMD units on most CPUs, so there's generally no need to "move back and forth" unless you need to branch on the result of a comparison, use a value as an address, or do some more specialized arithmetic.
On x86 there is sometimes (depending on the specific microarchitecture) an extra cycle of additional latency when using an integer operation on an xmm register last used with a float operation.
I have seen it explained as the integer and float ALUs being physically distant, with the forwarding network needing an extra cycle to transport the operands.
This is correct, but it’s happily pretty rare for it to matter in practice (because the domain bypass penalty is small and does not directly impact throughput, only latency).
(For that matter, though, most modern FP/SIMD units have a direct approximate-reciprocal instruction that is single-cycle throughput or better and much more accurate--generally around 10-12 bits--so there's no need for this sort of thing. See, e.g., FRECPE on ARM NEON and [V]RCPPS on x86.)
These kinds of tricks are still used today. They're not so useful if you need a reciprocal or square root, since CPUs now have dedicated hardware for that, but it's different if you need a _cube_ root or x^(1/2.4).
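The exponent-bit trick generalizes to arbitrary fractional powers: for positive floats, the bit pattern is roughly an affine function of log2(x), so scaling it around the exponent bias approximates x**p. A hedged sketch in pure Python (function names are mine; accuracy is only a few percent, which is the point of a seed estimate):

```python
import struct

def float_to_bits(x):
    # Reinterpret a float32 as its raw 32-bit pattern.
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float(i):
    return struct.unpack("<f", struct.pack("<I", i))[0]

def approx_pow(x, p):
    # For x > 0: bits(x)/2^23 - 127 approximates log2(x), so scaling
    # the bit pattern around the bias (127 << 23) approximates x**p.
    bias = 127 << 23
    i = float_to_bits(x)
    return bits_to_float(int(p * (i - bias)) + bias)
```

With p = 1/3 this is the classic cube-root seed; with p = 1/2.4 it is a usable starting point for sRGB-style gamma, refined afterwards by a Newton step or two.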
The big cores do. They essentially pump division through something like an FMA (fused multiply-add) unit, possibly the same unit that is used for multiplication and addition. That's for the Newton-Raphson steps, or Goldschmidt steps.
In hardware it's much easier to do a LUT-based approximation for the initial estimate rather than the subtraction trick, though.
It's common for CPUs to give 6-8 accurate bits in the approximation. x86 gives 13 accurate bits. Back in 1975, the Cray 1 gave 30 (!) accurate bits in the first approximation, and it didn't even have a division instruction (everything about that machine was big and fast).
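The refinement those FMA-based dividers iterate is easy to demonstrate: the Newton-Raphson step for a reciprocal is r = r * (2 - d*r), which roughly doubles the number of correct bits each time. A quick sketch (my own toy code, not how any particular core wires it up):

```python
def refine_reciprocal(d, r, steps=2):
    # Newton-Raphson on f(r) = 1/r - d. Each step is one multiply-
    # subtract and one multiply, a natural fit for an FMA unit, and
    # roughly doubles the accurate bits in the estimate r ~ 1/d.
    for _ in range(steps):
        r = r * (2.0 - d * r)
    return r

# A crude ~4-bit seed for 1/3 converges fast:
# 0.34 -> 0.3332 -> 0.33333328 after two steps.
```

Starting from an 8-bit hardware estimate, two or three such steps reach full double precision, which is why the initial LUT does not need to be very accurate.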
I've worked in the payment industry and among other things built a payment terminal, so I know a thing or two about it.
1st: The message overhead during communication is not an issue. It is tiny compared to the time it takes for the credit card to do its crypto thing.
2nd: This won't ever be adopted. There is simply way too much working infrastructure out there built on the APDU protocol. And there are so many companies and stakeholders involved that making any change will take years. If they start aligning on a change, it would be something that makes a real difference, not something that saves time on the order of 5 to 20 milliseconds per payment.
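For scale, the per-message framing really is just a handful of bytes. A hedged Python sketch of an ISO 7816-4 SELECT-by-AID command APDU (the AID value here is illustrative only):

```python
# ISO 7816-4 command APDU framing: CLA INS P1 P2 [Lc data] [Le].
# Fixed overhead is ~5-6 bytes, plus a 2-byte status word in the
# response -- small next to the card's cryptographic work.
aid = bytes.fromhex("A0000000041010")  # illustrative application identifier
apdu = (
    bytes([0x00,        # CLA: interindustry class
           0xA4,        # INS: SELECT
           0x04,        # P1: select by DF name (AID)
           0x00,        # P2: first or only occurrence
           len(aid)])   # Lc: length of command data
    + aid
    + bytes([0x00])     # Le: accept any response length
)
```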
Hi there, author here! I think you've highlighted a couple of things worth clarifying in the doc:
1. Apple having just announced it is opening up NFC to developers means that both major mobile platforms can now act as responding devices; widely distributing new NFC protocols to consumer devices has therefore become fast and inexpensive, via an OS update or a third-party app install.
2. Mobile consumer hardware is now fast enough at the application operations (e.g. cryptographic operations) that the roundtrip and metadata overheads of APDU make a meaningful contribution to the total time to complete a transaction. Experiencing this in my development efforts here was the motivation for designing this alternative.
3. A1 is interoperable with APDU infrastructure and can therefore be adopted by terminals immediately: a reader terminal can attempt an A1 initiation and treat any APDU response from a legacy device as a failure, at which point it falls back to its default APDU message exchange.
I will update the doc to clarify these points. What do you think?
Given your experience I'd be interested in your detailed feedback, maybe we could jump on a call soon if you have time?
> 1. Apple having just announced it is opening up NFC
> to developers means that both major mobile
> platforms can now act as responding devices;
> 2. Mobile consumer hardware is sufficiently fast for the application
> operations (eg. Cryptographic operations)
You are right here. It is possible to emulate a card using mobile phones. We've been able to shim/emulate any card for much longer.
The thing is: to connect to the payment system you need a certificate. And you simply don't get it unless you can prove that you have all kinds of security measures in place.
For Android and Apple, the actual payment stuff runs inside an isolated little microcontroller which has been certified and is tamper proof/protected. This little thing is fortified to the point that it will destroy itself when you try to open it up. There are alarm meshes, light sensors and much more inside the chip to detect any intrusion, just to protect the certificate.
If you don't have that security, the payment providers will deny you their certificate, plain and simple.
You can build your own thing using card emulation via apps, but you will take all the risks.
How it works in practice is the following: these tamper-proof microcontrollers can run Java Card code. You can write an applet for them and get it installed (if you have the key from Apple/Google - not easy). Then you have two-way communication: your applet inside the secure world talking to the payment world, and your ordinary mobile app showing things on the screen.
"You can build your own thing using card emulation via apps, but you will take all the risks." right! This is exactly what I've developed. I've built a new NFC application, with its own PKI infrastructure, that's deployed onto users' mobile devices via an app install. It works over APDU, but it would be more efficient if A1 was made available by the two major mobile OS. It would likely shave off >10% of the total time to complete which makes a material difference to the failure rate (ie. customer's lifting away too early or other interference) and therefore improves the overall UX.
So sad.