Permissive Licenses Are Gifts

Harrison Ainsworth, who is always interesting on software matters, has a recent note arguing permissive software licenses are unethical. I disagree. Such software is a non-toxic gift, like free cake.

All open source licenses allow access to source code and some kind of copying and modification. Permissive licenses, such as the MIT license, impose almost no restrictions beyond keeping the copyright notice. You may copy, alter and share the source code and anything built with it as you please, including making other things that are not shared back to the community. This is in contrast to “copyleft” licenses like the GNU General Public License (GPL). If GPL code is changed or built into other things, and the result is distributed, the source of that result must also be released.

There might be times when a gift is unethical. Giving someone an animal they can’t look after is unfair to the animal. Giving something leaking toxic chemicals could hurt people. Some gifts create an obligation of work.

Open source software doesn’t seem to be in any of these categories. People may adapt and contribute back if they wish – or choose not to use it at all. With a permissive license, people not in a position to contribute back today can still use the tool, and perhaps be able to contribute back tomorrow. Many developers working for conservative corporations or governments were and are in that position.

The exception would be open source code embedding adware and trojans, like Clipgrab (which claims to be GPL but took down its open repository years ago), or malicious NPM packages, or cryptominer embeds. These are all the equivalent of gifting your sister a nice bucket of plutonium slurry for her birthday. But this is not most open source software, and it’s not what Ainsworth has in mind.

Ainsworth says the permissive license boils down to “I should share, but others should not.” But it’s really “I should share, but those who cannot need not”. I’m sure this basic argument is already known to Ainsworth, and so his primary point is about consequences: permissive licenses create an ecosystem of freeloading corporations. But the alternative, if everyone were a GPL purist, is likely not a world of free GPL software; it’s a world of worse software, with fewer good tools available, and corporations and other rich bad actors continuing to skive anyway. The corporations that contribute back to open source are mostly the more technically sophisticated ones, the ones that understand software development dynamics, though a Google or Facebook may mix the two together and often skirt the freeloading line.

Python is an interesting test case for permissive licenses. Would Python be as dominant in data science today if it hadn’t been given away, with no license headaches? I doubt it would, or that data science would really exist in its current productive form at all. This continued to foster openness in the rest of the Python ecosystem. Likewise for Maven, the Apache web server, and so on. It is good that both the GPL and permissive licenses exist. The famous GPL ratchet, that forces openness on software using GPL libraries, is a good thing. The two work together to some degree, permissive licenses softening a reuse culture up, copyleft forcing it open. There was, and still is, a problem of basic literacy for open source values in many companies. This way we did not all need to hold our breath waiting for them to learn.

Is it unethical to take free gifts, over and again, make money off the use of them, and never give back? Sure, and that freeloading should be socially embarrassing and bad for business. For software and content developers: it’s your gift to the world. Leave the strings unattached if you want.

VII.1 Reuse

子曰,述而不作,信而好古,窃比于我老彭. – 论语 七:一

The Master said, ‘I transmit but do not innovate; I am truthful in what I say and devoted to antiquity. I venture to compare myself to your Old P’eng.’ – Analects VII.1 (Lau)

Before contemplating the process implications of this radically static statement, let’s note that from the perspective of designed code itself, it is always true. Code transforms and transmits information. This is the garbage-in garbage-out principle. Designed code (not genetic or evolved code) does not innovate.

Backups of the user directory for the Analects’ source control repository are, alas, lost to antiquity, and though many sophisticated data recovery techniques have been tried, with some success, none have yielded the identity of Old P’eng. Our ignorance of him highlights our relationship with Confucius and with any classical tradition. To us, Confucius founded a philosophical school, but in his own words he merely continued a tradition that we can see indirectly, if at all.

Scholarly consensus is that Confucius is deliberately overstating his lack of innovation for reasons of rhetoric or modesty (see eg DC Lau, AC Graham, or just Wikipedia on this verse). Nevertheless the verse is considered pivotal in understanding Confucius’ traditionalism and conservatism in a time of extraordinary violence and social change.

Existing solutions are useful in at least two ways. 

Firstly they may capture unintuitive theoretical results in accessible ways. Many algorithm design and data structure results are now in this category, such as sorting algorithms and efficient concurrent maps (eg the Java 6 implementation of java.util.concurrent.ConcurrentHashMap, whose reads mostly avoid locking). The formal scientific characterization of such solutions in terms of, say, algorithmic complexity and performance benchmarks makes computer theoretic literacy crucial. Programmers will be unlikely to understand the derivation by reading the code, so they must be able to read the documentation.

Secondly they may capture highly specific details of the environment and robust solutions to managing it. This will include successful workarounds for under-specified elements of protocols, or flat-out incorrect but popular implementations. Any user of say Ruby on Rails or Tomcat takes advantage of this kind of reuse. Consider too the domain specific details and tolerances of a fly-by-wire control system for a particular make and model of plane.

These two kinds of reuse may be contrived to lie on a spectrum, but I’ve chosen to distinguish them here for their correspondence to two different categories of knowledge – logos and metis. In classical Greek epistemology logos is theoretic universal knowledge and metis is hard won cunning, “feel”, or craft knowledge (as an aspect of techne, craft knowledge and theory). James C. Scott describes the Greek hero Odysseus, surgeons and maritime pilots as all relying on metis (Seeing Like A State). Scott also makes the connection between traditional knowledge – which is particular and tied to a society and geography – and common law conservatism in the tradition of Edmund Burke and Michael Oakeshott.

Confucius is claimed as a kind of Burkean conservative, for instance, by James Kalb. Both Confucius and Burke grew up in societies with small literate elites and large impoverished peasantries. They share a sense of the worth of settled convention, the importance of teaching and the literary canon, a paternalistic affection for hereditary power, and a sympathy for the welfare of everyday people. Neither is a reactionary; both welcome improvement at a humane pace (IX.3).

Seeing Burke and Confucius as similar is not mainstream and deserves a dedicated analysis of its own. (My searches revealed more extant work linking both of them individually to Wittgenstein than to each other, but pointers are always welcome.) In a comprehensive entry for Burke in the Stanford Encyclopedia of Philosophy, Ian Harris argues that despite being more often claimed by the right wing, he does not have a clear modern partisan successor. Nevertheless, distinguished scholars like DC Lau or AC Graham stay well clear of Western political comparisons, while happily comparing classical Chinese figures with Western philosophers. 

Unusually, a software library, and all the hard won craft knowledge that comes with it, can be imported into another codebase with extraordinary ease when compared to other forms of craft knowledge. A pilot is of little advantage outside his home port, and Ruby on Rails is of little use for 3D rendering, but in software we can copy the pilot and use him on innumerable ships entering that port. We can also ultimately read the source code to Ruby on Rails and determine how it tolerates the idiosyncrasies of particular browsers and servers. This is because all code is built on a formal information substrate – the computational medium. (This is Harrison Ainsworth’s term and his note on reuse provided a number of the connections in this post.)

Not all craft knowledge of a codebase is encapsulated in the codebase. There are particularities of the install, workaround scripts, configuration, scheduled jobs and so on, but these are ultimately digital artifacts easily included within a slightly broader view of what a codebase is (this latter is a premise of DevOps and for anyone serious about a controlled environment). More problematically, there are conventions of use, design choices, oral traditions of “check here when you change there”, and so on. At the limit, all codebases are incomplete. They depend on co-texts, results and knowledge of the domain that need not be encoded. An air traffic control system does not need a textbook description of Bernoulli’s Principle.

Burke and Scott argue that in an established society important, non-obvious, traditional knowledge is captured in social conventions and established practice, and the practice cannot be simplified without a loss of valuable situational knowledge. Scott additionally points out that such an environment is very difficult for an outsider to navigate and there are strong motivations for central political power to apply simplifications to it.

Yet highly particular, ‘local’ code that requires hands on experience and knowledge of accompanying conventions most frequently has another name in software development: bad. Or: spaghetti. Or: legacy. The sentiment is well captured in Qi’s koan on fear, even if it does riff off an opposing classical Chinese tradition. (In Confucian terms we might note the building is not harmonious.)

In No Silver Bullet, Brooks distinguishes accidental and inherent complexity (his own word for the latter is essential), with the latter being an attribute of the underlying problem rather than any specific software or hardware implementation. Complexity due to poor or improvable design is always accidental; that due to the problem domain is by definition inherent.

An aesthetic sense of good or poor design becomes crucial when pursuing aggressive reuse (VII.14). Without it you will simply perpetuate junk.

Having argued the link between conservatism and software reuse, it is worth being a little more precise about flavours of conservatism. William F. Buckley famously described it as that which “stands athwart history, yelling Stop, at a time when no one is inclined to do so, or to have much patience with those who so urge it.” Despite its partisan origins, this is a good start, as it illustrates that certain threads of environmentalism and the idea of heritage listing fall easily under the same banner. (It is also useful to think of contemporary US Democrats defending Franklin D. Roosevelt’s New Deal, or opposition to changes to Britain’s NHS in this frame.)

In its purest form, this can be “return to a golden age” conservatism. There’s certainly an argument that Confucius would have been happy with a reversion to the society of the Western Zhou. We should again temper our interpretation by wondering how much is rhetoric covering adaptation of tradition to new times. In software, certainly, simply reactionary approaches are of little use. Brooks and the founders of eXtreme Programming have both noted that a more effective strategy is to embrace change. Oakeshott argues in On Being Conservative that settings of widespread and enthusiastic change are in particular need of an awareness of the value of what exists now. A traditionalist most often defends the present versus the future, not the past versus the present. This conservative disposition’s usefulness to software is more apparent if taken as an analytic tool rather than an inherent aspect of personality. After all, the greenfield doesn’t exist (see X.18), and any project that pretends to be a greenfield is an interesting lie.

Conservative thought in this vein usually emphasizes working within a tradition and a community – in software we would say platform. This also suggests interesting contours for the breadth of possible reuse; and there are other verses, such as XVI.11, where that might be explored. What is immediately apparent is the narrowness and fragility of an entirely in-house platform due to the smallness of its developer community; and the need for a shared jargon (XIII.3) and perhaps a canon (XVI.13).

Given the corpus of extant code in the form of libraries, to adore antiquity is to know your platform, including its innards, not just thoughtless rote quoting via copy and paste. At this moment in software, to reuse and extend is a greater service than extraneous self-involvement masquerading as innovation.

If you can easily find some code and copy it, you get the result at zero cost. That is an efficiency that cannot be beaten: no amount of programming tool and technique improvements can ever do that. So we want to maximise reuse. – Fred Brooks, No Silver Bullet

XII.11 Let the prince be a prince 

Duke Ching of Ch’i asked Confucius about government. Confucius answered, ‘Let the ruler be a ruler, the subject a subject, the father a father, the son a son.’ The Duke said, ‘Splendid! Truly, if the ruler be not a ruler, the subject not a subject, the father not a father, the son not a son, then even if there be grain, would I get to eat it?’ — Analects XII.11 (Lau)

齐景公问政于孔子。孔子对曰,君,君,臣,臣,父,父,子,子。公曰,善哉,信如君不君,臣不臣,父不父,子不子,虽有粟,吾得而食诸。- 论语,十二:十一

This is one of The Analects’ clearest statements of the feudal and patriarchal social order that would later get the name Confucianism. Detach it, for a moment, from that overwhelming cultural context, and it’s also an expression of the separation of design concerns. The two can be contrasted. Every political pundit is a social engineer. They either advocate improvement to the design of the state, or argue a change will break the existing system.

Mencius (孟子) expanded on this sentiment for one of the earliest recorded defenses of the division of labour (Book 3 part 1 chapter 4, 3-4). Labour specialization works because humans have limits on the complexity of a task they can undertake, and are not cloneable or particularly fungible. 

Software, by contrast, is highly specialized, but also cloneable at near zero cost. Software complexity has different boundaries. There are physical limits inherent to what Harrison Ainsworth calls engineering in a computational material. These are physical characteristics of algorithmic complexity or computability – limits on how fast a particular problem can be solved, if it can be solved at all. 

There are, by contrast, few physical limits to the conceptual complexity of a software component. Measures like cyclomatic complexity – roughly, the number of subtasks, variables and choices in a method – are judged high at values still orders of magnitude short of the physical limits imposed by compilers and interpreters. (I once worked on a system where other team members had, in their wisdom, exceeded the limit for the size of a single Java method in a long list of simple business transformation rules. Pushed by the very essence of the language to refactor, they proceeded to – what else? – push the remaining rules into longMethod2().)

The limits which measures like cyclomatic complexity indicate are human limits. They mark the soft edges of a space where humans can effectively create, manage, or even understand software. There are different ways of describing coding conventions, but they all seek to indicate a limit beyond which code becomes illegible.
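To make the measure concrete, here is a minimal Python sketch that approximates cyclomatic complexity by counting decision points in a syntax tree and adding one. The function name `approx_cyclomatic_complexity`, the sample function, and the choice of which node types count as decisions are all assumptions made for illustration; real analysis tools apply their own, more careful rules.

```python
import ast

# Node types treated as decision points in this sketch (an assumption;
# real tools use their own rules for what counts as a branch).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                 ast.IfExp, ast.comprehension)


def approx_cyclomatic_complexity(source: str) -> int:
    """Return 1 + the number of decision points found in `source`."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds one extra path per operator.
            decisions += len(node.values) - 1
    return 1 + decisions


# A hypothetical function to measure.
SAMPLE = '''
def classify(n):
    if n < 0 and n % 2:
        return "negative odd"
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    return "other"
'''

print(approx_cyclomatic_complexity(SAMPLE))
```

The number itself means nothing to the interpreter, which would run a method scoring in the thousands without complaint. The threshold a team picks is a statement about its readers.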

Legibility is the term James C. Scott uses to describe the social engineering needs of a nascent or established state (Seeing Like A State). The mechanics of a working state require internal legibility. Those working for it must be able to measure and understand their environment in mutually compatible terms which also promote the success of the government. This is why feudal states have such a profusion of titles which become the name of the person (not Bob – The Duke of Marlborough). It is also why courtly dress has such systematic rules. This is particularly visible in the bureaucratic feudal states of historical East Asia, eg feudal Korea, but also in the Vatican, or in the badges at the postmodern World Economic Forum. These codes serve the dual purpose of defining the interfaces of the state and of making the role of the person instantly legible to one familiar with the system, all while tempting people with the markings of social status.

Marking lexemes by colour and shape according to their role is exactly what IDE pretty printing achieves. This is also the intent behind decoupling, encapsulation, and well-named entities (name oriented software). It makes the role of a component, from lexical to method, class and class pattern levels, readily legible to humans who must maintain and extend the system.
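A small sketch of the first step of that marking, using Python's standard `tokenize` module to attach a role to each lexeme. The function name `lexeme_roles` and the separate `KEYWORD` role are assumptions added for illustration; an editor would go on to map each role to a colour.

```python
import io
import keyword
import tokenize

# Layout tokens carry no role a reader would want coloured.
_LAYOUT = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
           tokenize.DEDENT, tokenize.ENDMARKER}


def lexeme_roles(source: str):
    """Return (role, text) pairs for each meaningful token in `source`."""
    roles = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _LAYOUT:
            continue
        role = tokenize.tok_name[tok.type]
        # The tokenizer reports keywords as NAME; split them out here.
        if tok.type == tokenize.NAME and keyword.iskeyword(tok.string):
            role = "KEYWORD"
        roles.append((role, tok.string))
    return roles


for role, text in lexeme_roles("total = price * 2  # doubled\n"):
    print(f"{role:8} {text}")
```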

This strictness of role works well for machines made of non-sentient digital components. For systems where components are sentient meat, there are inevitable side effects. This is, perhaps, the core ethical dilemma Confucius concerns himself with: the demands of The State and The Way (道).

FUNCTIONS SHOULD DO ONE THING. THEY SHOULD DO IT WELL. THEY SHOULD DO IT ONLY. — Robert Martin, Clean Code

Clean Sweep

Software engineering isn’t philosophy, as fun as both of them are. There are certainly intersections, as HXA7241 (Harrison Ainsworth) recently described:

The single core idea (to be rather bold and sweeping) in philosophy is the distinction of necessary and contingent: ‘necessary’ being what is always true, what is known logically; ‘contingent’ being everything else, that may or may not be known or true according to circumstance.

The single core idea in software engineering is abstraction: which is the fusion of a fixed part with a varying part. And this maps exactly to necessary and contingent. An abstraction says that within its context a particular thing is necessary – the fixed part – but also that the rest is contingent – the varying part. (A single bit number is always a number – by definition, but it might be 0 or 1 – completely by circumstance.)

It is indeed a sweeping generalization, beautiful in its bold wrongness. Thinking mathematics was a science, Wittgenstein once said, was like mistaking the broom for the furniture. Similarly, when you pick up the broom to clean the room – when you put the toolset to use – the confusion disappears.

Now Ainsworth’s assertion is closer to Wittgenstein than the math / science analogy implies, because he is saying that both philosophy and software engineering are toolsets of a kind. (Elsewhere, in an interesting take I might well return to, he describes software engineering instead as engineering in a computational medium.) In another post he defines software engineering as entirely concerned with how the software works: “It neither changes what is wanted, nor what can possibly be computed.” But it does change what is wanted – the articulation of a possibility in software changes its future iterations through the evolution of human understanding of that possibility. It is less like a broom and more like a paintbrush. Or a Japanese fan. Picking it up changes the room.

Or a dodgy second-hand chainsaw, which only works when you hold it at a fifteen degree angle and rev the crap out of the engine. The machine-nature of useful semi-broken software – or software engineering – seems to strain the very limits of the metaphor. “Software is clarity,” Ainsworth writes. I guess he hasn’t used Microsoft Word.

A Program Is Articulate

Rearing its head out of Helen’s corner of the Twitter-sphere around the occasion of the great Austrian’s 122nd birthday (and sixty years since his death) comes the Tractatus Digito-Philosophicus, a recasting of Wittgenstein’s landmark first book into software terms.

2.0122 […] (It is impossible for words to appear in two different roles: by themselves, and in programs.)

There are several appealing elements to this self-described “odd venture”. One is that the translation is to a degree automatic, based on a simple search and replace table found at the end. It is logical positivism via sed. Another is that the Tractatus was produced during and soon after the period in which Wittgenstein worked as an actual engineer – first as an aviation research engineer at Manchester University, and later supervising technicians in a supply depot in World War I. He was not temperamentally very well suited to engineering work. Biographers have traditionally downplayed this as an intellectual influence, though Susan Sterrett explores interesting parallels and possible influences around the idea of engineering models in the well-titled and readable Wittgenstein Flies A Kite.
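The mechanical part of such a recasting can be sketched in a few lines of Python. The two-entry table below is hypothetical and is not the table from the Tractatus Digito-Philosophicus; the input sentence is an assumed English rendering of Tractatus 3.141, and `recast` is a name invented for this sketch.

```python
import re

# Hypothetical two-entry table, for illustration only; the real
# table in the Tractatus Digito-Philosophicus is not reproduced here.
REPLACEMENTS = {
    "proposition": "program",
    "words": "instructions",
}


def recast(text: str, table: dict) -> str:
    """Replace whole words found in `table`, keeping initial capitals."""
    alternatives = "|".join(re.escape(word) for word in table)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)

    def swap(match):
        word = match.group(0)
        replacement = table[word.lower()]
        # Preserve sentence-initial capitalisation.
        return replacement.capitalize() if word[0].isupper() else replacement

    return pattern.sub(swap, text)


print(recast("A proposition is not a blend of words.", REPLACEMENTS))
```

Matching on word boundaries means a plural such as "propositions" is left alone unless it has its own entry, which is one reason a table like this only gets the translation part of the way.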

The Tractatus Digito has the virtue of poetry (metaphor, simile, and so on) in presenting the same information from a different perspective and so firing different connections in the brain. But it’s more systematic than poetry as well. It’s not just a martial arts metaphor, as rhetorically useful as they can be. To contrast with an example close to hand, attempting to describe software in Confucian terms is a project fuelled as much by juxtaposition and analogy as correspondence. The mapping to that world will always be a partial one.

Ainsworth, rather, has noticed what every undergraduate programmer should know: that programs are sequences of logical propositions. So Wittgenstein is necessarily writing about software, or perhaps more specifically, because there is no social dimension, about programs. Our thinking about software is intertwined with its origins in the 1920s. This partial recasting is valuable in the same way a Turing Machine simulator is valuable. Sure, some of the resulting sentences don’t really make sense. Yet bringing registers and sorting algorithms into the book that invented truth tables feels less like visiting a foreign land, and more like hearing a friend talk excitedly on their return to the old family home.

3.141

A program is not a blend of instructions. – (Just as a theme in music is not a blend of notes.)

A program is articulate.