The HOPL paper has some discussion of the first design for Smalltalk, which I was working on when "the bet" happened and brought Smalltalk-72 to life as the answer to it (Smalltalk-72 used a combination of Lisp and Meta II techniques to win the bet). This made it quite easy to implement, and once Dan Ingalls implemented it, we started using it.
Smalltalk-71 was never implemented (and never had its design finished, so there is less that can be claimed about it). But, germane to this discussion, I really liked Carl Hewitt's PLANNER language, and the entire approach to "pattern directed invocation", as he called it -- this was kind of a superset of the later Prolog, and likely influenced Prolog quite a bit.
The PLANNER ideas could be used as the communications part of an object oriented language, and I thought this would be powerful in general, and also could make a big difference in what children could implement in terms of "reasoning systems" (not just imperative action systems).
For Smalltalk-72 I used a much more programmatic approach (Meta II) to recognize messages (this also allowed new syntax/languages to be defined as object protocols, to win the bet), but this was not done comprehensively enough to really use what was great about PLANNER.
There were a few subsequent attempts to combine objects with reasoning, but none of them that I'm aware of were done with end-users in mind.
I thought the subsequent ACTOR work by Carl and his colleagues produced really important theoretical results and designs, most of which we couldn't pragmatically use in the personal computing and interface work on the rather small Xerox Parc personal computers.
The Greybus project around the UniPro interconnect has a lot of the hallmarks of all this. And support is built right into the Linux kernel; peripherals just show up right away! You don't need libraries at all. https://kernel-recipes.org/en/2015/talks/an-introduction-to-...
You don't need macros in Red (though it's easy to roll your own macro layer, if one so desires [1]); all code transformations can be achieved at runtime. A brief explanation of how this works is given in [2].
As for Red / Rebol vs. Racket content - I wrote an excessive post on that some time ago [3], in response to a similar question.
TL;DR from the language-creation perspective: Racket has a state-of-the-art macro system and all the necessary infrastructure, while Red has a metaDSL (that is, a DSL for creating DSLs) called Parse [4], which is basically PEG on steroids (think OMeta), plus a different take on homoiconicity compared to Lisps, which most of the time renders custom readers and tokenizers unnecessary.
Real-time notifications give a sense of activity, nice for building a community. Reminds me of http://listen.hatnote.com/ showing live wikipedia edits, but more integrated. Some anonymity, or mention of a group, e.g. anonymous users, forum member, admin, would be better. Too much and too specific sharing of activity makes this feel stifling--more like chat with read notifications, less like email or forum where people take time to compose replies.
I run an electronics lab at a university, and I have the following tips:
- don't cheap out on breadboards unless you plan to manually test every single connection regularly before prototyping
- with some breadboards the (red/blue) power rails are not connected all the way through, but only for half of the board; when in doubt, always measure. If this is the case with your board, just bridge the rails once and leave them bridged forever.
- if you prototype projects more often, get yourself assortment kits (e.g. resistors, film capacitors, ceramic capacitors, electrolytic capacitors). Yes, these kits are expensive, but my first private resistor assortment lasted me for a decade, and if you add up a decade's worth of "fuck, that part is missing" it is a no-brainer. I literally never regretted getting those.
- if you constantly find yourself using certain connectors, build a (heavy) breakout box for them. Something that doesn't fly off the table easily. Having your breadboard fly off the table because your guitar is connected to it is bullshit.
- there are limits to breadboarding. If you go high voltage, high current, and/or high frequency, or anything extra sensitive, avoid it
- learn how to translate between schematics and breadboards. Nowadays there are many breadboard illustrations that show you exactly how to connect things. That is neat, but it is hard to reason about circuits that way, and reasoning will be needed once you deviate from what you find. Most interesting circuits will come as schematics anyway, so learning it is worth it
- if you breadboard complex schematics, print out the schematic and tick off (with a pen) every connection between every pin of every component as you make it. It is very easy to miss something, even for professionals. In the end a circuit is basically just nodes (a single pin of a component) and edges (the connections between them). If you get the topology right, the circuit should work. Exceptions: the schematic is wrong (not uncommon; there is a lot of crap on the internet), the parts are wrong (there are many nuances to each part, beyond their basic value), or the circuit is of a kind that is not suited for breadboarding (see above).
- Sometimes people online claim that X works, but it only works because the part they had lying around had a very specific characteristic that they conveniently failed to mention. It is okay to just give up and move on at some point
- Try to simulate schematics before you breadboard them. Change component values to get a basic intuition for the circuit. I like this one: https://www.falstad.com/circuit/circuitjs.html (it has some limitations, e.g. inverter based relaxation oscillators won't oscillate, but for most stuff this is fine)
- don't forget to add bypass caps. Just get a sack of ceramic 100nF capacitors and sprinkle them between plus and ground on the power rails next to wherever you put active components (chips, etc)
useful context is that in most programs a very significant fraction of the instruction mix consists of reads of local variables; another significant fraction consists of passing arguments to subroutines and calling them. so the efficiency of these two operations is usually what is relevant
with dynamic scoping, the current value of a given variable is always in the same location in memory, so you can just fetch it from that constant location; when you enter or exit a scope that binds that variable, you push or pop that value onto a stack. a function pointer is just a (tagged) pointer to a piece of code, typically in machine language but sometimes an s-expression or something
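to make that concrete, here's a tiny c sketch of the shallow-binding idea (the names and layout are purely illustrative, not how any particular lisp implementation does it):

    #include <stdio.h>

    /* shallow binding: each dynamic variable has one fixed value cell.
       entering a scope that binds it saves the old value and installs the
       new one; exiting restores the saved value. */

    static int x_cell = 10;                /* the single global location for x */

    static int read_x(void) { return x_cell; }   /* one constant-address load */

    static void callee(void) {
        /* under dynamic scoping the callee sees whatever binding is current */
        printf("callee sees x = %d\n", read_x());
    }

    static void with_x_bound_to(int new_value) {
        int saved = x_cell;                /* push the old value (here, onto the c stack) */
        x_cell = new_value;                /* install the new binding */
        callee();
        x_cell = saved;                    /* pop: restore on scope exit */
    }

    int main(void) {
        callee();                          /* x = 10 */
        with_x_bound_to(42);               /* x = 42 inside the dynamic extent */
        callee();                          /* back to 10 */
        return 0;
    }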
with lexical scoping, in an environment that supports recursion, the usual implementation is that you have to index off the frame pointer to find the current location of the variable. worse, if your lexical scoping supports nested scopes (which is generally necessary for higher-order programming), you may need to index off the frame pointer to find the closure display pointer, then index off the closure display pointer to find the variable's value. and then, if it's mutable, it probably needs to be separately boxed (rather than just copied into the closure when the closure is created, which would mean changes made to its value by the inner scope wouldn't be visible to the outer scope that originally created it), which means that accessing its value involves indexing off the frame pointer to find the context or closure display pointer, indexing off the closure display pointer to find the pointer to the variable, and then dereferencing that pointer to get the actual value. also, supporting closures in this way means that function pointers aren't just pointers to compiled code; they're a dynamically allocated record containing a pointer to the compiled code and a context pointer, like in pascal, with a corresponding extra indirection (and hidden argument, but that's free) every time you call a function.
there's a tradeoff available where they can just be those two pointers, instead of taking the approach I described earlier where the closure display is a potentially arbitrarily large object; but in that case accessing a variable captured from a surrounding scope can involve following an arbitrarily long chain of context pointers to outer scopes, and it also kind of requires you to heap-allocate your stack frames and not garbage-collect them until the last closure from within them dies, so it's a common approach in pascal (and gcc uses it for nested functions in c, with a little bit of dynamic code generation on the stack to keep its function pointers to a single word) but not in lisps
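here's a rough c sketch of the flat-closure representation described above, including the extra box for a mutable captured variable; again, this is illustrative, not how any particular scheme compiler actually lays things out:

    #include <stdio.h>
    #include <stdlib.h>

    /* a function "pointer" is really a record holding the code pointer plus
       the captured variables; a mutable captured variable gets an extra box
       so inner and outer scopes share it. */

    typedef struct { int *box; } env_t;             /* the closure display */

    typedef struct {
        int (*code)(env_t *env, int arg);           /* compiled code */
        env_t env;                                  /* captured variables */
    } closure_t;

    static int add_and_bump(env_t *env, int arg) {
        *env->box += 1;                             /* mutate the shared box */
        return arg + *env->box;                     /* indirection to read it */
    }

    static closure_t make_counter_adder(int start) {
        int *box = malloc(sizeof *box);             /* heap-allocate the box */
        *box = start;
        closure_t c = { add_and_bump, { box } };
        return c;
    }

    static int call(closure_t *c, int arg) {        /* the hidden env argument */
        return c->code(&c->env, arg);
    }

    int main(void) {
        closure_t c = make_counter_adder(0);
        printf("%d\n", call(&c, 100));              /* 101 */
        printf("%d\n", call(&c, 100));              /* 102: the box is shared state */
        free(c.env.box);
        return 0;
    }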
dybvig's dissertation about how he wrote chez scheme is a good source for explanations on this, and also explains how to implement call/cc with reasonable efficiency. http://agl.cs.unm.edu/~williams/cs491/three-imp.pdf
current cpus and operating systems favor lexical scoping more than those in the late 01970s and early 01980s for a variety of reasons. one is that they are, of course, just much faster and have much more memory. also, though, address sizes are larger, and instruction sizes are smaller, so it's no longer such a slam-dunk trivial thing to just fetch a word from some absolute address; the absolute address doesn't fit inside a single instruction, so you have to compute it by offsetting from a base register, fetch it from memory (perhaps with a pc-relative load instruction), or compose it over the course of two or more instructions. and currently popular operating systems prefer all your code to be position-independent, so it may be inconvenient to put the variable value at a fixed absolute address anyway; if you try, you may find upon disassembling the code that you're actually indexing into a got or something. finally, though i don't know if this figured into the lispers' world, 8-bit cpus like the 6502 and 8080 were especially terrible at indexing, in a way that computers hadn't been since the 01950s and haven't been since, which made the performance gain of absolute addressing more significant
i don't know enough about addressing modes on machines like the vax, the dorado, and the pdp-10 to know what the cost was, but on arm and risc-v (and so i assume mips) you can index off the stack pointer with an immediate constant for free; it doesn't even require an extra clock cycle
a couple of points in favor of the efficiency of lexical scoping: when you have a cache, having all your local variables packed together into a stack frame is better than having them scattered all over the symbol table; popping a stack frame is a constant-time operation (adding a constant to the stack pointer, typically), while restoring n values on exiting a dynamic scope requires n operations; and the space usage of local variables on a call stack only includes the variables currently in scope, rather than all the distinct variable names used anywhere in the program, as is the case with dynamic scoping
in a sense, the way you handle callee-saved registers in assembly language is pretty much the same as how you handle dynamically-scoped variables: on entry to a subroutine that changes r4 or rbp, you save its current value on a stack, and on exit you restore that value, and any value you put there can be seen by your callees. oddly enough, this turns out to be another efficiency advantage for lexical scoping; with dynamic scoping, when you call another function, the language semantics are that it could change any of your local variable values before it returns, so it's not safe for the compiler to cache its value in a cpu register across the call. (unless the variable is assigned to that cpu register for your entire program, that is, which requires whole-program optimizations that run counter to the lisp zeitgeist.) this has become an increasingly big deal as the relative cost of accessing memory instead of a cpu register has steadily increased over the last 40 years
i'm no expert in this stuff, though i did implement a subset of scheme with recursion, lexical scoping, closures, and enough power to run its own compiler (and basically nothing else): http://canonical.org/~kragen/sw/urscheme so i could totally be wrong about any of this stuff
UTF8 is an extremely simple and lightweight text encoding. Check out Plan 9's man page on UTF, it would fit on a t-shirt: https://plan9.io/magic/man2html/6/utf
Unicode is also just a representation for text, and a handful of common operations - you work with arrays of characters, rather than arrays of bytes. It was worth its cost on 1992 hardware; Nintendo DS is over a decade more recent.
I recommend studying libutf in sbase[0]. It's not a single header file solution (although utf.h[1] is an excellent place to start reading), but it does provide a fairly comprehensive implementation. There's also a good introduction to Unicode in Plan 9's C programming guide[2]. Even if you choose to only support runes that fit in a single byte, you gain the ability to tell byte blobs apart from text, which is useful both for reasoning about your program, and for future-proofing it, in case you needed to put places like Łódź or Πάτρα on your map.
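As a rough illustration of what decoding runes involves, here is a hand-rolled C sketch; this is not libutf's actual API, just a demonstration of how the byte patterns map to code points, and it skips the overlong-form and surrogate validation a real decoder needs:

    #include <stdio.h>

    /* Decode one UTF-8 sequence starting at s, write the code point ("rune")
     * to *rune, and return the number of bytes consumed.  Malformed bytes
     * decode to U+FFFD and consume one byte. */
    static int decode_rune(const unsigned char *s, unsigned long *rune) {
        if (s[0] < 0x80) { *rune = s[0]; return 1; }                     /* ASCII */
        if ((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80) {            /* 2 bytes */
            *rune = ((s[0] & 0x1FUL) << 6) | (s[1] & 0x3F);
            return 2;
        }
        if ((s[0] & 0xF0) == 0xE0 && (s[1] & 0xC0) == 0x80
                                  && (s[2] & 0xC0) == 0x80) {            /* 3 bytes */
            *rune = ((s[0] & 0x0FUL) << 12) | ((s[1] & 0x3FUL) << 6) | (s[2] & 0x3F);
            return 3;
        }
        if ((s[0] & 0xF8) == 0xF0 && (s[1] & 0xC0) == 0x80
            && (s[2] & 0xC0) == 0x80 && (s[3] & 0xC0) == 0x80) {         /* 4 bytes */
            *rune = ((s[0] & 0x07UL) << 18) | ((s[1] & 0x3FUL) << 12)
                  | ((s[2] & 0x3FUL) << 6) | (s[3] & 0x3F);
            return 4;
        }
        *rune = 0xFFFD;                                                  /* replacement character */
        return 1;
    }

    int main(void) {
        const unsigned char text[] = "Łódź";        /* multibyte runes */
        unsigned long rune;
        int i = 0;
        while (text[i]) {
            i += decode_rune(&text[i], &rune);
            printf("U+%04lX\n", rune);              /* U+0141 U+00F3 U+0064 U+017A */
        }
        return 0;
    }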
Recent events in ML make me feel about 2/3 vindicated of the claims made in the book. Based on the book's ideas, I began training LLMs based on large corpora in the early 2010s, well before it was "cool". I figured out that LLMs could scale to giga-parameter complexity without overfitting, and that the concepts developed under this training would be reusable for other tasks (I called this the Reusability Hypothesis, to emphasize that it was deeply non-obvious; other terms like "self-supervision" are more common in the literature).
I missed on two related points. Technically, I did not think DNNs would scale up forever; I thought that they would hit some barrier, and the engineers would not be able to debug the problem because of the black-box nature of DNNs. Philosophically, I wanted this work to resemble classical empirical science in that the humans involved should achieve a high degree of knowledge relating to the material. In the case of LLMs, I wanted researchers (including myself) to develop understanding of key concepts in linguistics such as syntax, semantics, morphology, etc.
This style of research actually worked! I built a statistical parser without using any labelled training data! And I did learn a ton about syntax by building these models. One nice insight was that the PCFG is a bad formalism for grammar; I wrote about this here:
Obviously, I fell into the "Bitter Lesson" trap described by Rich Sutton. The DNNs can scale up, and can improve their understanding much faster than a group of human researchers can.
One funny memory is that in 2013 I went to CVPR and told a bunch of CV researchers that they should give up on modeling P(L|I) - label given image - and just model P(I) instead - the probability of an image. They weren't too happy to hear that. I'm not sure that approach has yet taken over the CV world, but based on the overwhelming success of GPT in the NLP world, I'm sure it's just a matter of time.
In hindsight, I regret the emphasis I placed on the keyword "compression". To me, compression is a nice and rigorous way to compare models, with a built-in Occam's principle. But "compression" means many different things to different people. The important idea is that we're modeling very large unlabelled datasets, using the most natural objective metric in this setting.
Some of these folks do sponsored content, usually widgets or toys or something, but they're pretty up front about it. A few of them I follow religiously; for others I pick and choose which videos I'd prefer to watch.
Full disclosure: Principal Software Engineer here on the Scratch backend...
Scratch is not built to be a "teach your kid programming languages" system; it is based on the work and ideas of the Lifelong Kindergarten group at the MIT Media Lab (the director of this group is Professor Mitch Resnick, the LEGO Papert Professor of Learning Research). The Papert part is where the term Mindstorms comes from (https://www.amazon.com/Mindstorms-Children-Computers-Powerfu...) and was used by the Lego Group when branding those products, and our philosophy is heavily influenced by that.
I can say that the https://scratch.mit.edu/statistics/ are real and we have a substantial footprint of backend services and custom software to support it. We handle on the order of 15-20 million comments/month.
The primary design philosophy is:
Passion: You have a strong interest in a subject/problem to solve/explore
Projects: Build something based on your passions, and gain direct, interactive experience with it.
Peers: Share your work with folks who are interested and provide feedback to you
Play: It should be fun!
Note that there is nothing in there about STEM/STEAM nor application development. We build and support Scratch to provide creative tools for anyone to explore computation in a form that is relatable and has a low floor for understanding/entry. Having said that, the complexity of what Scratch can do rises sharply the more you work with it, and the concepts behind "forking" and open source are built in via the remix ability on individual projects.
A lot of design thinking goes into the frontend of Scratch to build on a creativity feedback loop that is not focused on learning Python or any other specific language (or their syntax, i.e. avoiding "why isn't my program working... oh, one too many tabs... or maybe this semi-colon, or maybe this .")
Another part I think is worth raising: the Scratch frontend is a sophisticated virtual machine interpreter that has its own machine code and model, executing in a JavaScript environment in the browser, and it is still open source. Google's Blockly project was based on the ideas of Scratch 1.4, and when we ported Scratch 2 away from being Flash based, we partnered with the Blockly group to fork their code base and create Scratch Blocks.
Based on the TIOBE index, we're usually somewhere in the top 20 most popular "programming languages". _eat it Fortran!_
I want you to only use English words of Germanic origin; avoid using words derived from Latin or Greek. If there's no alternative, make up a new word by calquing the original into Germanic roots.
--- output ---
The sunstead is the sun and all things that go around it. In the middle is the Sun, a big, hot ball of glowing gas. Around the Sun are eight big worldrocks. From nearest to farthest, they are:
Mirkworld - The nearest worldrock to the Sun, small and fast.
Lovegoddessworld - The hottest worldrock, with thick air and no water.
Earth - Our home, the only worldrock known to have life.
Redworld - A red, rocky worldrock with thin air and high mounts.
Greatthundergodworld - The biggest worldrock, with many fellowtravellers and a big red spot.
Harvestgodworld - A gas worldrock with a thin ringstead around it.
Skygodworld - An iceworldrock that lies on its side, with a thin ringstead.
Seagodworld - The farthest iceworldrock, with strong winds and blue clouds.
Besides the worldrocks, there are smaller rockbits and icebits that also go around the Sun. The rockbits are mostly in the Rockbitbelt between the Redworld and Greatthundergodworld. The icebits come from the far edges of the sunstead, in the long way out Icyrockcloud and the faraway Cloud of All.
The sunstead has a flat shape, with most things going around the Sun on near level paths. The worldrocks go around the Sun in a way called nearround paths. The Sun's pull keeps everything in the sunstead together.
Beyond the worldrocks, rockbits, and icebits, the sunstead also has spacegrit, small bits of rock and metal that fly through space. When spacegrit enters Earth's air and burns up, it makes a lightstreak that we can see in the night sky. If a bit of spacegrit reaches Earth's ground, it is called a starstone.
The Sun, which makes up most of the sunstead's weight, sends out a flow of charged bits that spreads far into space. This solar wind forms a wide bubble, the Sunshield, that surrounds the sunstead and helps shield it from high-speed bits coming from outside.
The sunstead's workings are steered by the pull between the Sun and the things that go around it. The worldrocks and fellowtravellers tug on each other, leading to small changes in their paths over time. The Sun's pull is strongest near it, so worldrocks closer to the Sun go around faster than those farther out.
As the Sun ages, it will go through changes that will have big outcomes for the sunstead. In about 5 billion years, the Sun will bloat into a Red Greatstar, swallowing Mirkworld, Lovegoddessworld, and maybe even Earth. Afterward, the Sun will shed its outer layers and shrink into a White Dwarf, a small, hot, dim star.
Lastly, the sunstead is but one of many star systems in our Starset, the Milky Road. The Milky Road is a wide, spiraling set of billions of stars, gas, and dust, and it is just one of countless starsets that make up the great Allthing.
> We've known since the early sixties, but have never come to grips with the implications that there are net negative producing programmers (NNPPs) on almost all projects, who insert enough spoilage to exceed the value of their production. So, it is important to make the bold statement: Taking a poor performer off the team can often be more productive than adding a good one. [6, p. 208] Although important, it is difficult to deal with the NNPP. Most development managers do not handle negative aspects of their programming staff well. This paper discusses how to recognize NNPPs, and remedial actions necessary for project success.
> Researchers have found between a low of 5 to 1 to a high of 100 to 1 ratios in programmer performance. This means that programmers at the same level, with similar backgrounds and comparable salaries, might take 1 to 100 weeks to complete the same tasks. [21, p. 8]
> The ratio of programmer performance that repeatedly appeared in the studies investigated by Bill Curtis in the July/August 1990 issue of American Programmer was 22 to 1. This was both for source lines of code produced and for debugging times - which includes both defect detection rate and defect removal efficiency. [5, pp. 4 - 6] The NNPP also produces a higher instance of defects in the work product. Figure 1 shows the consequences of the NNPPs.
[5] : Curtis, Bill, "Managing the Real Leverage in Software Productivity and Quality", American Programmer, July/August 1990
[6] : DeMarco, Tom, Controlling Software Projects: Management, Measurement & Estimation (New York: Yourdon Press, 1982)
[21] : Shneiderman, Ben, Software Psychology: Human Factors in Computer and Information Systems (Cambridge, MA: Winthrop, 1980)
The mutual benefit corporation is https://lahcommunityfiber.org and I'm now on the board. It's basically the same corporate structure as our water company and so far is treating us well.
Costs are all over the map. My neighborhood ended up being about $12.7k to build out and $155/mo, but we're on a steep hillside and with large parcels in an expensive metro area. Another neighborhood in a flatter area and with slightly smaller (but still quite large) parcels and higher uptake was around $4k.
We were at ~50% uptake. Those that didn't sign up were either outliers with Comcast or weren't really serious internet users. Our alternative is AT&T DSL at 18/0.75 for $50/mo. We're expecting rebates on the initial build as uptake rises and additional nearby infra gets built, as well as a reduction in monthly cost as our backhaul gets better utilized. Getting enough commitment to be able to nail down costs and avoid a cost death spiral as people pull out of the project mostly boils down to peer pressure. We did OK on this in our neighborhood because the DSL alternative was so miserable that there was strong will to continue.
Construction is hard and the contractors were incompetent and constantly screwed things up. Folks suffering that incompetence got pissed. Getting power from PG&E is a miserable sufferfest and won't be resolved for another year. In the mean time we're mooching power from one of the subscribers and paying him back. Luckily we didn't need a lot in the way of permits because it's all private property, including the road.
If we were constructing in the public right of way we'd either need special permission from the town to do so, or we'd need to register as a CLEC with the public utilities commission. In practice we're working on both, but becoming a CLEC is so far away in our future that it won't be viable for any currently-planned expansions. Usually getting special permission from the town is easier for getting started.
> This proof-of-concept would be a breakthrough for healthcare, security, gaming (VR), and a host of other industries.
Similar capability is scheduled for new consumer routers in 2024 via Wi-Fi 7 Sensing / IEEE 802.11bf. Hundreds of previous papers include terms like these:
human-to-human interaction recognition
device-free human activity recognition
occupant activity recognition in smart offices
emotion sensing via wireless channel data
CSI learning for gait biometric sensing
sleep monitoring from afar
human breath status via commodity wifi
device-free crowd sensing
- FinServ has had harmonized data for years, in the form of horizontally integrated data solutions from Bloomberg (B-PIPE), GS (Marquee), S&P, and LSEG.
- A number of data marketplaces have popped up to facilitate easy access to datasets (free and paid) through standardized data sharing methods, including Snowflake Marketplace, AWS Data Exchange, Google Cloud Marketplace, Databricks Marketplace, and Datarade.
You'll want to explore the space and get an idea of what the current State of the Art is, so that you can understand how you can best contribute.
The point of OpenAI's safety efforts is not to reduce harm; the point is to be blandly inoffensive to low-to-moderate-information observers, so as to blunt the ability of anyone concerned with harm to marshal resistance. Serious efforts to reduce harm across the whole scope of GPT-4's subject areas (i.e., everything), or to reduce its scope to a domain in which meaningful harm mitigation would be more tractable, would slow things down too much. OpenAI's strategic drive is to move forward as fast as they can while keeping control as centralized as possible; they see it as an AGI arms race where winning is paramount, and commercial dominance of the earlier steps, along with building public support for tight control along the way, is similarly central.
Too many cooks in the kitchen. There's a ton of competing ideas to 'fix' packaging in python, but no will or desire from the core to step up and own/shepherd a solution.
There's a recent podcast with most of the folks important to the ecosystem, and they pretty quickly get to the realization that yes, it sucks for users, but it's not their domain and no one really owns or can fix Python packaging: https://talkpython.fm/episodes/show/406/reimagining-pythons-...
Cheap, light, flexible, yet robust circuit boards are critical for wearable electronics, among other applications. In the future, those electronics might be printed on flexible circuits made out of bacterial cultures used to make the popular fermented black tea drink called kombucha, according to a recent paper posted to the arXiv preprint server.
As we've reported previously, making kombucha merely requires combining tea and sugar with a kombucha culture known as a SCOBY (symbiotic culture of bacteria and yeast), aka the "mother"—also known as a tea mushroom, tea fungus, or a Manchurian mushroom. It's akin to a sourdough starter. A SCOBY is a firm, gel-like collection of cellulose fiber (biofilm), courtesy of the active bacteria in the culture creating the perfect breeding ground for the yeast and bacteria to flourish. Dissolve the sugar in non-chlorinated boiling water, then steep some tea leaves of your choice in the hot sugar-water before discarding them.
Once the tea cools, add the SCOBY and pour the whole thing into a sterilized beaker or jar. Then cover the beaker or jar with a paper towel or cheesecloth to keep out insects, let it sit for two to three weeks, and voila! You've got your own home-brewed kombucha. A new "daughter" SCOBY will be floating right at the top of the liquid (technically known in this form as a pellicle).
Beyond the popularity of the beverage, kombucha cultures hold promise as a useful biomaterial. For instance, in 2016, an Iowa State professor of apparel, merchandising, and design named Young-A Lee gained attention for her proof-of-concept research in using dried SCOBY as a sustainable leather substitute for biodegradable SCOBY-based clothing, shoes, or handbags. In 2021, scientists at Massachusetts Institute of Technology and Imperial College London created new kinds of tough "living materials" that could one day be used as biosensors, helping purify water or detect damage to "smart" packing materials. Experiments last year by researchers at Montana Technological University (MTU) and Arizona State University (ASU) showed that membranes grown from kombucha cultures were better at preventing the formation of biofilms—a significant challenge in water filtration—than current commercial membranes.
“Nowadays kombucha is emerging as a promising candidate to produce sustainable textiles to be used as eco-friendly bio wearables,” co-author Andrew Adamatzky, of the University of the West of England in Bristol, told New Scientist. “We will see that dried—and hopefully living—kombucha mats will be incorporated in smart wearables that extend the functionality of clothes and gadgets. We propose to develop smart eco-wearables which are a convergence of dead and alive biological matter.”
Adamatzky previously co-authored a 2021 paper demonstrating that living kombucha mats showed dynamic electrical activity and stimulating responses, as well as a paper last year describing the development of a bacterial reactive glove to serve as a living electronic sensing device. Inspired by the potential of kombucha mats for wearable electronics, he and his latest co-authors have now demonstrated that it's possible to print electronic circuits onto dried SCOBY mats.
The team used commercially sourced kombucha bacteria to grow their mats, then air-dried the cultures on plastic or paper at room temperature. The mats don't tear easily and are not easily destroyed, even when immersed in water for several days. One of the test mats even survived oven temperatures up to 200° C (392° F), although the mats will burn when exposed to an open flame. Adamatzky et al. were able to print conductive polymer circuits onto the dried kombucha mats with an aerosol jet printer and also successfully tested an alternative method of 3D printing a circuit out of a conductive polyester/copper mix. They could even attach small LEDs to the circuits with an epoxy adhesive spiked with silver, which were still functioning after repeatedly being bent and stretched.
According to Adamatzky et al., unlike the living kombucha mats he worked with previously, the dried SCOBY mats are non-conductive, confining the electrical current to the printed circuit. The mats are also lighter, cheaper, and more flexible than the ceramic or plastic alternatives. Potential applications include wearable heart rate monitors, for instance, and other kombucha-based devices. "Future research will be concerned with printing advanced functional circuits, capable for detecting—and maybe recognizing—mechanical, optical, and chemical stimuli," the authors concluded.
I've worked on a lot of sophisticated systems in around a dozen languages: data pipelines, robotics, customized WYSIWYG editors, distributed systems, append-only logs with replicated data types. While I have occasionally found remote profiling useful for optimizing hot code paths, I have always found that trace-level logs and structured logging with aggregation and indexing are superior to interactive debugging in most applications.
Essentially - println debugging is superior to interactive debugging, especially for intermittent bugs, data races, deadlocks, resource leakage, modeling and serialization bugs, binary or source code incompatibility bugs, and bugs that are catastrophic and rare or pathological in nature.
The problem with interactive debugging is that you typically have a view into only a single process of a system, and usually only a single concurrent execution thread of that process.
When you have structured logging and aggregation and indexed log search, you have a much broader view of the systems under investigation and can see the conditions causing the bug with a much higher frequency than with interactive debugging.
For any non-trivial bug in a sufficiently large program or input, you need intuition about the root cause just to reproduce it quickly enough to inspect the program in the debugger. This obviously doesn't scale well. It's much better to have a detailed, structured log mode built into your systems that can be enabled on demand - via environment variable, remote procedure call, request header, data packet, or command flag during deployment. This allows you to observe the system behavior and incur the cost of the detailed logging on demand, much as you would when connecting via remote debugging, but on a system-of-systems scale, increasing the probability of capturing the bug's root cause compared with stepping forward and backward through interactive execution.
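As a toy sketch of that "enable detailed logging on demand" idea (C purely for illustration; the environment variable name and field layout are made up):

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Trace-level structured logging that is compiled in but only emitted
     * when an environment variable requests it, so the cost is paid on
     * demand rather than on every run. */
    static int trace_enabled(void) {
        static int cached = -1;
        if (cached < 0) {
            const char *v = getenv("APP_TRACE");     /* hypothetical variable name */
            cached = (v != NULL && *v == '1');
        }
        return cached;
    }

    /* Emit one key=value event; a real system would ship these to an
     * aggregator with indexes on the fields instead of stderr. */
    static void trace_event(const char *event, const char *request_id, long value) {
        if (!trace_enabled())
            return;
        fprintf(stderr, "ts=%ld event=%s request_id=%s value=%ld\n",
                (long)time(NULL), event, request_id, value);
    }

    int main(void) {
        trace_event("cache_lookup", "req-42", 7);    /* no-op unless APP_TRACE=1 */
        trace_event("cache_store", "req-42", 7);
        return 0;
    }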
Additionally, care should be taken to program in a value-centric way when you can get away with it performance- or space-wise. Essentially - use FP principles of immutability, programs as values, and composition to build your programs where you can. Values have a useful property: they are serializable state that can be inspected, or substituted in place of references to the value, without changing the behavior of the program at runtime. This often simplifies debugging even in complex side-effecting concurrent algorithms.
When you can't, comprehensive logging of arguments and shared-state changes, as events that can be shared asynchronously amongst many members of the development, stakeholder, and operations teams, helps to narrow the surface area of the system you are debugging.
Formal methods during modeling, design, and development can help to increase constraints and enable more comprehensive test suites, but property-based testing with randomly generated valid inputs also serves as a sort of automated preemptive debugger. Things like chance in the JS world, quickcheck/hedgehog in Haskell and their ports or siblings in other languages, like hypothesis in Python, fall into this category of testing tools.
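A minimal hand-rolled illustration of the property-based idea follows; real tools like quickcheck or hypothesis add generator combinators and shrinking of failing inputs, which this sketch omits:

    #include <assert.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Property under test: reversing a buffer twice yields the original.
     * Instead of hand-picked cases, generate many random valid inputs and
     * check the property holds for all of them. */
    static void reverse(unsigned char *buf, size_t n) {
        for (size_t i = 0, j = n ? n - 1 : 0; i < j; i++, j--) {
            unsigned char t = buf[i];
            buf[i] = buf[j];
            buf[j] = t;
        }
    }

    int main(void) {
        srand(12345);                              /* fixed seed for reproducibility */
        for (int trial = 0; trial < 1000; trial++) {
            size_t n = (size_t)(rand() % 64);      /* random size, including 0 */
            unsigned char original[64], copy[64];
            for (size_t i = 0; i < n; i++)
                original[i] = (unsigned char)(rand() % 256);
            memcpy(copy, original, n);
            reverse(copy, n);
            reverse(copy, n);
            assert(memcmp(copy, original, n) == 0);   /* the property */
        }
        printf("1000 randomized cases passed\n");
        return 0;
    }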
Interactive debugging is the tool of last resort, and still serves a useful purpose when all other mitigation and system state capture methods have failed, but I personally find myself believing that if a bug's root cause hasn't been discovered by this stage, I have made some set of mistakes in the design of the system, either in the application of its logical rules or in the basic assumptions, from interpretation of business requirements to the choices of abstractions/tradeoffs made in development. It's usually a bad day when I have to resort to interactive debugging.
Software is so huge that it would take you a lifetime of programming from different perspectives to get a grip on what it really is. So we are all doomed to experience POSIX through whatever programming experience we end up getting deep in.
I feel the underlying problem with most software is that it's just too damn complex, so you can't fit enough of it in your head to design it how you think it should go. An average person can't go "Oh, I think the kernel should be able to do this" and then go whip it up and have an experiment running in a little bit. That's an esoteric corner full of tons of specialized and arcane knowledge that, truth be told, is completely invented. And half of that invention is workarounds for other bad inventions.
I dunno enough to just pronounce doom on POSIX, but I do feel like the rickety C way of doing things (everything centered around intricately-compiled machine code executables, incredibly dainty fragile eggshells that shatter and spill their entire guts of complexity on the world) underpins a ton of the problem.
The number of years you would need to just read, let alone grok all the hundreds of millions of lines of code that run on our system is just beyond human lifetimes now.
The problem with x86 in particular is that there is tons of cruft. You can get lost for days reading about obsolete functionality.
Here's my general workflow for optimizing functions in HFT:
Write a function in C++ and compile it. Look at the annotated disassembly and try to improve it using intrinsics, particularly vector intrinsics, and rdtsc times.
Then compile with "-ftree-vectorize -march=native" and compare what the compiler did to what you did. Look up the instructions it used, and check for redundancies, bad ordering, and register misuse/underuse in the compiler output.
Then see if you can improve that.
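As a rough sketch of the kind of comparison this workflow produces, here is a scalar baseline next to a hand-vectorized SSE candidate with rdtsc timing; this assumes an x86 target, and real HFT kernels and measurement methodology will be considerably more careful than this:

    #include <stdio.h>
    #include <immintrin.h>   /* SSE intrinsics */
    #include <x86intrin.h>   /* __rdtsc() on GCC/Clang */

    /* Baseline: what you'd hand to the compiler with -ftree-vectorize -march=native. */
    static float sum_scalar(const float *a, int n) {
        float s = 0.0f;
        for (int i = 0; i < n; i++)
            s += a[i];
        return s;
    }

    /* Hand-vectorized candidate: four lanes of partial sums per iteration. */
    static float sum_sse(const float *a, int n) {
        __m128 acc = _mm_setzero_ps();
        int i = 0;
        for (; i + 4 <= n; i += 4)
            acc = _mm_add_ps(acc, _mm_loadu_ps(&a[i]));
        float lanes[4];
        _mm_storeu_ps(lanes, acc);
        float s = lanes[0] + lanes[1] + lanes[2] + lanes[3];
        for (; i < n; i++)                 /* scalar tail */
            s += a[i];
        return s;
    }

    int main(void) {
        enum { N = 4096 };
        static float data[N];
        for (int i = 0; i < N; i++)
            data[i] = (float)i * 0.25f;

        unsigned long long t0 = __rdtsc();
        volatile float r1 = sum_scalar(data, N);
        unsigned long long t1 = __rdtsc();
        volatile float r2 = sum_sse(data, N);
        unsigned long long t2 = __rdtsc();

        /* results may differ slightly due to float summation order */
        printf("scalar: %.1f in %llu cycles\n", (double)r1, t1 - t0);
        printf("sse:    %.1f in %llu cycles\n", (double)r2, t2 - t1);
        return 0;
    }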
But all that being said, note that in general this kind of cycle-counting micro-optimization is often overshadowed by instruction/data cacheline loads. It's rare that you have a few kilobytes of data that you will constantly iterate over with the same function. Most learning resources and optimizing compilers seem to ignore this fact.
For any "How do I...X?" question involving connecting two pieces of software together in a different way, the first thing I try to do is look at the API(s) that connects them. In the case of a program and the OS, it's the system calls, device drivers, and standard libraries.
There have been efforts to provide the capability of running a program without an OS before, but any such effort is going to need to provide the system calls and standard libraries used by the program, and the infrastructure to support it (device drivers, management, etc.). At that point it becomes a mini-OS.
An example is Erlang on Xen. Xen is more often run with a guest OS running inside it, and then the program runs within that guest OS. The http://erlangonxen.org/ folks made Ling (https://github.com/cloudozer/ling), software that enables an Erlang BEAM VM to run directly on Xen, and thereby run a single Erlang program on Ling.
(Not a language or Unicode expert, the following likely has important mistakes.)
> Off the top of my head, I don't know of a terminal that actually implements the entire (very complex) set of Unicode text rendering behaviors
There are at least two reasons for this:
First, nobody actually seems to know how bidirectional text should interact with terminal control sequences, or indeed how it should be typeset on a terminal in the first place (what is the primary direction? where are the reordering boundaries?). There is the pre-Unicode bidirectional support mode (BDSM, I kid you not) in ECMA-48[1] and TR/53[2], which AFAIK nobody implements nor cares about; there are terminal emulators endorsed by bidi-language users[3], which AFAIK nobody has written down the behaviour of; there is the Freedesktop bidi terminal spec[4], which is a draft and AFAIK nobody implements yet either but at least some people care about; finally, there are bidi-language users who say that spec is a mistake[5].
Second, aside from bidi and a smattering of other things such as emoji, there is no detailed “Unicode text rendering behaviour”, only standards specific to font formats—the most recent among them being OpenType, which is dubiously compatible across implementations, decently documented only through painstaking reverse engineering (sometimes in words[6], sometimes only in Freetype library code), and generally full of snakes[7]. And it has no notion of a monospace font—only of a (proportional) font where all Lat/Cyr/Grk characters just happen to have the same advance.
AFAICT that is not negligence or an oversight, but rather a concession to the fact that there are scripts which don’t really have a notion of monospace in the typographic tradition and in fact are written such that it’s extremely unclear what monospace would even mean—certainly not one or two cells per codepoint (e.g. Burmese or Tibetan; apparently there are Arabic monospace fonts[8] but I’ve no idea how the hell they work). Not coincidentally, those are the scripts where you really, really need that shaper, otherwise nothing looks anywhere close to correct.
[This post could have been titled “Contra Muratori on Unicode in terminal emulators”.]
Personally, I found the link between sexual assault and tracking to be expected: they're the two scenarios in which consent even gets mentioned or discussed in daily life. When it comes to sex, it's critical and mentioned all the time in media and internet posts; in tracking, it's in the forms of "I consent" buttons on every other badly designed web page. In how many other places will you find a consent form? Medical procedures, perhaps?
Obviously, Google's data hunger is worlds less harmful than sexual assault, but I think the difference in how consent is treated is actually relevant for the discussion.
When it comes to sex, it has taken many decades (or even centuries) of hard work, but at this point society is finally starting to get the message across: you don't have consent if it's not given explicitly. Society is still in the process of getting everyone on the same page here (sadly) but progress is being made.
While compiler data collection isn't as important, the idea that consent means "you haven't tried saying no enough" is completely absurd. This is why there is no good analogy for this kind of consent: every comparison with how a normal person deals with consent will make the opt-out suggestion seem horrible. Most actions happening to a person without consent are, or should be, illegal, with perhaps lawful arrests and the judicial system being the exception.
If we, as a society, really do value explicit consent, we shouldn't mix and match what is and isn't consent based on what serves us. Consent is important in software too, even if some companies or developers don't see it that way.
Google devs especially should know this. People I know have felt a real sense of violation when they found out how intensely Google tracks its users. I've seen people get creeped and grossed out by their phones because they were asked to review a restaurant they were near, realising that Google knew exactly where they were and how long, and that the tracking had gone on for much longer before that. I, myself, felt a little violated when I found out how much data Microsoft's dotnet tool has been sending out after getting "consent" during an automated install and first use, despite knowing the data can't track me down as a person.
The impact may be incomparable, but developers seem to severely underestimate the emotional impact their blatant disregard for other people's privacy can really have just because they don't feel the same way. Perhaps another way the comparison is more apt than it would seem at first glance.
A big reason 1-on-1 mentoring is so much better than almost any other form of education is that the teacher is able to build a mental model of what the student knows. You can get close to this with a tight pedagogical track where the teacher can say "You've all taken classes X and Y, so you should already be comfortable with the concepts I'm going to talk about." But Software Engineering is a young discipline and the topics discussed in TFA are nebulous, opinion-laden, and specific to individual companies, tasks, and languages. So precisely-specified pedagogy is not possible, and the only remaining option is 1-on-1 mentoring. Every other option (online tutorials, videos, presentations) doesn't allow the teacher to build a mental model of what the student knows and adapt their lesson based on that. It's not really about being right next to the person or being available right now, as TFA says. It's about being able to understand what the student already knows, which lessons are sinking in and which aren't, and why.
Then comes the extremely hard part for the teacher: having the deep self-awareness to be able to separate out what really matters from what is simply their opinion, or what they're comfortable with. When a company tech lead is mentoring juniors, it doesn't matter as much, because her preferences can basically become mandates without causing any real issue; that probably increases the overall consistency of the codebase. But if you want to set up a website where experienced mentors share real grains of wisdom without the chaff of personal preference, prepare for a lot of religious wars.
There's a reason the master/apprentice model has persisted for thousands of years. It doesn't sound like TFA is advocating for some mass-scale solution, but rather just pointing out how valuable mentoring is and how it should be a priority for senior developers if a company wants a healthy engineering culture. Totally agreed. We're probably not ready for a mass-scale solution to this problem yet.
Optical mice actually aren't good at precise relative positioning because of the way the lens is set up: if they are moved slightly closer to the surface they 'see' more movement.
You would imagine that you could use some kind of registration pattern(s) to solve this, but sadly nearly all mouse sensors are made by Avago, and Avago runs their own algorithm on the raw sensor data and won't let you have the full data stream (you can only receive it via a debug API which only lets you get the data out super slowly). The Avago ROMs are built into the sensor and run a little 8080 CPU with a neat DSP for processing the image data, but sadly the system has too little RAM for any exploit to run your own code on it.
Much of the confusion in this thread comes down to the fact that in the US, the standard shower valve is a 'pressure-balanced' valve, which allows you to select a flow-ratio between hot and cold feeds, but does not in itself maintain any particular temperature - it just ensures that in the event of pressure changes on one or other feed, the overall mix is maintained (preventing temperature spikes if someone flushes a toilet).
Whereas in many other parts of the world, the standard shower valve is a 'thermostatic mixing valve', which allows you to select a temperature, and balances the inputs to achieve that desired temperature mark.
Note that if you go to, say, HomeDepot.com, and look at shower faucets, you will not find thermostatic mixing valves. Only pressure balanced. The only thermostatic component in conventional american shower valves is an anti-scald-cutoff. Thermostatic valves do exist, but they're not the ones that are bundled into showers and that therefore get employed by the average landlord who is grabbing the cheap standard parts off the shelf. Thermostatic is a premium option, and not one that even crops up in the search result filters for showers.
So most Americans have simply no idea that thermostatic valves are a thing.
Whereas if you go to, say, diy.com (which is the store of B&Q, an equivalent store in the UK), and look up showers, you will find all the basic 'mixer showers' have thermostatic valves.