Seeing Like A Facebook

The insistence on a single, unique, legal identity by Facebook and Google continues a historical pattern of expansion of power through control of the information environment. Consider the historical introduction of surnames:

Customary naming practices are enormously rich. Among some peoples, it is not uncommon to have different names during different stages of life (infancy, childhood, adulthood) and in some cases after death; added to those are names used for joking, rituals, and mourning and names used for interactions with same-sex friends or with in-laws. […]  To the question “What is your name?” which has a more unambiguous answer in the contemporary West, the only plausible answer is “It depends”.
For the insider who grows up using these naming practices, they are both legible and clarifying.
 — James C. Scott, Seeing Like A State

It’s all rather reminiscent of the namespace of open internets since they emerged in the 80s, including BBS, blogs, IRC, message boards, slashcode, newsgroups and even extending the lineage to the pseudonym-friendly Twitter. You can tell Twitter has this heritage by its joke and impersonation accounts, sometimes created in ill spirit, but mostly in a slyly mocking one. CheeseburgerBrown’s autobiography of his pseudonyms captures the spirit of it.

Practically any structured scheme you might use to capture this richness of possible real world names will fail, as Patrick McKenzie amusingly demonstrates in his list of falsehoods programmers believe about names.
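A minimal sketch of how such a scheme fails, assuming the common "given names, then one surname" schema. The helper names split_name and try_split are hypothetical and exist only for illustration; the sample names are my own choices, not McKenzie's.

```python
def split_name(full_name: str) -> tuple[str, str]:
    """Naive schema (hypothetical helper): the last token is the surname,
    everything before it is the given name."""
    parts = full_name.strip().split()
    if len(parts) < 2:
        raise ValueError(f"schema requires at least two name parts: {full_name!r}")
    return " ".join(parts[:-1]), parts[-1]


def try_split(full_name: str) -> str:
    """Report what the schema makes of a name, or why it rejects it."""
    try:
        given, surname = split_name(full_name)
        return f"given={given!r} surname={surname!r}"
    except ValueError as err:
        return f"rejected: {err}"


if __name__ == "__main__":
    samples = [
        "Ada Lovelace",            # fits the schema
        "Sukarno",                 # a single name: rejected outright
        "Mao Zedong",              # family name comes first: fields swapped
        "Gabriel García Márquez",  # two surnames: one is filed as a given name
    ]
    for name in samples:
        print(f"{name}: {try_split(name)}")
```

Each patch for one of these cases (an optional surname, a name-order flag, a second surname field) is itself another simplification that some other naming practice will break.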

Scott goes on to show how consistent surnames made information on people much easier for the state to access and organize – more legible. This in turn made efficient taxation, conscription and corvée labour possible for the feudal state, as well as fine-grained legal title to land. It established an information environment on which later institutions such as the stock market, income tax and the welfare state (medical, unemployment cover, universal education) rely. Indeed, mass democracy relies on the idea of a uniquely identifiable citizen who votes once. Exceptions, where they exist, are limited in their design impact by their rarity. The introduction of national ID cards and car registration plates is part of that same legibility project, enforcing unique identifiers. For more commercial reasons but with much the same effect, public transport smartcards, mobile phones and number plates, when combined with modern computing, bring mass surveillance within technical reach.

The transition to simplified names was not self-emerging or gentle but was aggressively pursued by premodern and colonial states. In the course of a wide survey Scott gives a striking example from the Philippines:

Filipinos were instructed by the decree of November 21, 1849, to take on permanent Hispanic surnames. The author of the decree was Governor (and Lieutenant General) Narciso Claveria y Zaldua, a meticulous administrator as determined to rationalise names as he had been determined to rationalise existing law, provincial boundaries, and the calendar. He had observed, as his decree states, that Filipinos generally lacked individual surnames, which might “distinguish them by families,” and that their practice of adopting baptismal names from a small group of saints’ names resulted in great “confusion”. The remedy was the catalogo, a compendium not only of personal names but also of nouns and adjectives drawn from flora, fauna, minerals, geography and the arts and intended to be used by the authorities in assigning permanent, inherited surnames. […] In practice, each town was given a number of pages from an alphabetized catalogo, producing whole towns with surnames of the same letter. In situations where there has been little in-migration in the past 150 years, the traces of this administrative exercise are still perfectly visible across the landscape.
[…]
For a utilitarian state builder of Claveria’s temper, however, the ultimate goal was a complete and legible list of subjects and taxpayers. […] Schoolteachers were ordered to forbid their students to address or even know one another by any other name except the officially inscribed family name. More efficacious, perhaps, given the minuscule school enrolment, was the proviso that forbade priests and military and civil officials from accepting any document, application, petition or deed that did not use the official surnames.

The ultimate consequences of these simplification projects can be good or bad, but they are all expansions of centralized power, often unnecessary, and dangerous without counterbalancing elements. Mass democracy could eventually use the mechanism of citizen registration to empower individuals and restrain the government, but this was in some sense historically reactive: it came after the expansion of the state at the expense of more local interests.

The existence of Farmville aside, Google and Facebook probably don’t intend to press people into involuntary labour. People are still choosing to click that cow no matter how much gamification gets them there. The interest in unique identities is for selling a maximally valued demographic bundle to advertisers. Even with multitudes of names and identities, we usually funnel back to one shared income and set of assets backed by a legal name.

Any power grab of this nature will encounter resistance. This might be placing oneself outside the system of control (deleting accounts), or it might be finding ways to use the system without ceding everything it asks for, like Jamais Cascio lying to Facebook.

The great target of Scott’s book is not historical states so much as the high modernist mega-projects so characteristic of the twentieth century, and their ongoing intellectual temptations today. He is particularly devastating when describing the comprehensive miseries possible when high modernist central planning combines with the unconstrained political power of a totalitarian state.

Again, it would be incorrect and unfair to describe any of the big software players today as being high modernist, let alone totalitarian. IBM in its mainframe and KLOC heyday was part of that high modernist moment, but today even the restrictive and aesthetically austere Apple has succeeded mainly by fostering creative uses of its platform by its users. The pressures of consumer capitalism being what they are, though, the motivation to forcibly simplify identity to a single point is hard for a state or a corporation to resist. Centralization has a self-perpetuating momentum to it, which good technocratic intentions tend to reinforce, even when these firms have a philosophical background in open systems. With the combined marvels of smartphones, clouds, electronic billing and social networks, I am reminded of Le Corbusier’s words. These software platforms are becoming machines for living.

VII.1 Reuse

子曰,述而不作,信而好古,窃比于我老彭. – 论语 七:一

The Master said, ‘I transmit but do not innovate; I am truthful in what I say and devoted to antiquity. I venture to compare myself to your Old P’eng.’ – Analects VII.1 (Lau)

Before contemplating the process implications of this radically static statement, let’s note that from the perspective of designed code itself, it is always true. Code transforms and transmits information. This is the garbage-in garbage-out principle. Designed code (not genetic or evolved code) does not innovate.

Backups of the user directory for the Analects’ source control repository are, alas, lost to antiquity, and though many sophisticated data recovery techniques have been tried, with some success, none have yielded the identity of Old P’eng. Our ignorance of him highlights our relationship with Confucius and with any classical tradition. To us, Confucius founded a philosophical school, but in his own words he merely continued a tradition that we can see indirectly, if at all.

Scholarly consensus is that Confucius is deliberately overstating his lack of innovation for reasons of rhetoric or modesty (see e.g. DC Lau, AC Graham, or just Wikipedia on this verse). Nevertheless the verse is considered pivotal in understanding Confucius’ traditionalism and conservatism in a time of extraordinary violence and social change.

Existing solutions are useful in at least two ways. 

Firstly, they may capture unintuitive theoretical results in accessible ways. Many algorithm design and data structure results are now in this category, such as sorting algorithms and efficient concurrent maps (e.g. the Java 6 implementation of java.util.concurrent.ConcurrentHashMap, with its largely lock-free reads). The formal scientific characterization of such solutions in terms of, say, algorithmic complexity and performance benchmarks makes theoretical computer science literacy crucial. Programmers are unlikely to understand the derivation by reading the code, so they must be able to read the documentation.
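A small sketch of this first kind of reuse, in Python rather than Java for brevity. It leans on the standard library's sorted and bisect instead of hand-rolled sorting and binary search; the complexity figures in the comments come from the documented behaviour of those functions, not from reading their source. The contains helper and the sample data are hypothetical.

```python
import bisect


def contains(sorted_items: list[int], target: int) -> bool:
    """Membership test on an already sorted list, reusing the library's
    binary search: O(log n) comparisons."""
    i = bisect.bisect_left(sorted_items, target)
    return i < len(sorted_items) and sorted_items[i] == target


# sorted() is documented as a stable sort; CPython's implementation is
# Timsort, O(n log n) in the worst case.
readings = sorted([42, 7, 19, 3, 88, 61])

# insort keeps the list ordered: an O(log n) search followed by an
# O(n) insertion, as the bisect documentation notes.
bisect.insort(readings, 50)

if __name__ == "__main__":
    print(readings)
    print(contains(readings, 50), contains(readings, 51))
```

The derivation of Timsort is nowhere in these few lines; what the caller needs is the documented contract (stability, ordering precondition, cost).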

Secondly they may capture highly specific details of the environment and robust solutions to managing it. This will include successful workarounds for under-specified elements of protocols, or flat-out incorrect but popular implementations. Any user of say Ruby on Rails or Tomcat takes advantage of this kind of reuse. Consider too the domain specific details and tolerances of a fly-by-wire control system for a particular make and model of plane.

These two kinds of reuse may be contrived to lie on a spectrum, but I’ve chosen to distinguish them here for their correspondence to two different categories of knowledge – logos and metis. In classical Greek epistemology logos is theoretic universal knowledge and metis is hard won cunning, “feel”, or craft knowledge (as an aspect of techne, craft knowledge and theory). James C. Scott describes the Greek hero Odysseus, surgeons and maritime pilots as all relying on metis (Seeing Like A State). Scott also makes the connection between traditional knowledge – which is particular and tied to a society and geography – and common law conservatism in the tradition of Edmund Burke and Michael Oakeshott.

Confucius is claimed as a kind of Burkean conservative, for instance, by James Kalb. Both Confucius and Burke grew up in societies with small literate elites and large impoverished peasantries. They share a sense of the worth of settled convention, the importance of teaching and the literary canon, a paternalistic affection for hereditary power, and a sympathy for the welfare of everyday people. Neither is a reactionary; both welcome improvement at a humane pace (IX.3).

Seeing Burke and Confucius as similar is not mainstream and deserves a dedicated analysis of its own. (My searches revealed more extant work linking both of them individually to Wittgenstein than to each other, but pointers are always welcome.) In a comprehensive entry for Burke in the Stanford Encyclopedia of Philosophy, Ian Harris argues that despite being more often claimed by the right wing, he does not have a clear modern partisan successor. Nevertheless, distinguished scholars like DC Lau or AC Graham stay well clear of Western political comparisons, while happily comparing classical Chinese figures with Western philosophers. 

Unusually, a software library, and all the hard won craft knowledge that comes with it, can be imported into another codebase with extraordinary ease when compared to other forms of craft knowledge. A pilot is of little advantage outside his home port, and Ruby on Rails is of little use for 3D rendering, but in software we can copy the pilot and use him on innumerable ships entering that port. We can also ultimately read the source code to Ruby on Rails and determine how it tolerates the idiosyncrasies of particular browsers and servers. This is because all code is built on a formal information substrate – the computational medium. (This is Harrison Ainsworth’s term and his note on reuse provided a number of the connections in this post.)

Not all craft knowledge of a codebase is encapsulated in the codebase. There are particularities of the install, workaround scripts, configuration, scheduled jobs and so on, but these are ultimately digital artifacts easily included within a slightly broader view of what a codebase is (this latter is a premise of DevOps and for anyone serious about a controlled environment). More problematically, there are conventions of use, design choices, oral traditions of “check here when you change there”, and so on. At the limit, all codebases are incomplete. They depend on co-texts, results and knowledge of the domain that need not be encoded. An air traffic control system does not need a textbook description of Bernoulli’s Principle.

Burke and Scott argue that in an established society important, non-obvious, traditional knowledge is captured in social conventions and established practice, and the practice cannot be simplified without a loss of valuable situational knowledge. Scott additionally points out that such an environment is very difficult for an outsider to navigate and there are strong motivations for central political power to apply simplifications to it.

Yet highly particular, ‘local’ code that requires hands on experience and knowledge of accompanying conventions most frequently has another name in software development: bad. Or: spaghetti. Or: legacy. The sentiment is well captured in Qi’s koan on fear, even if it does riff off an opposing classical Chinese tradition. (In Confucian terms we might note the building is not harmonious.)

In No Silver Bullet, Brooks distinguishes accidental and inherent complexity (essential, in his own terms), with the latter being an attribute of the underlying problem rather than any specific software or hardware implementation. Complexity due to poor or improvable design is always accidental; that due to the problem domain is by definition inherent.

An aesthetic sense of good or poor design becomes crucial when pursuing aggressive reuse (VII.14). Without it you will simply perpetuate junk.

Having argued the link between conservatism and software reuse, it is worth being a little more precise about flavours of conservatism. William F. Buckley famously described it as that which “stands athwart history, yelling Stop, at a time when no one is inclined to do so, or to have much patience with those who so urge it.” Despite its partisan origins, this is a good start, as it illustrates that certain threads of environmentalism and the idea of heritage listing fall easily under the same banner. (It is also useful to think of contemporary US Democrats defending Franklin D. Roosevelt’s New Deal, or opposition to changes to Britain’s NHS in this frame.)

In its purest form, this can be “return to a golden age” conservatism. There’s certainly an argument that Confucius would have been happy with a reversion to the society of the early (Western) Zhou. We should again temper our interpretation by wondering how much is rhetoric covering adaptation of tradition to new times. In software, certainly, simply reactionary approaches are of little use. Brooks and the founders of eXtreme Programming have both noted that a more effective strategy is to embrace change. Oakeshott argues in On Being Conservative that settings of widespread and enthusiastic change are in particular need of an awareness of the value of what exists now. A traditionalist most often defends the present versus the future, not the past versus the present. This conservative disposition’s usefulness to software is more apparent if taken as an analytic tool rather than an inherent aspect of personality. After all, the greenfield doesn’t exist (see X.18), and any project that pretends to be a greenfield is an interesting lie.

Conservative thought in this vein usually emphasizes working within a tradition and a community – in software we would say platform. This also suggests interesting contours for the breadth of possible reuse; and there are other verses, such as XVI.11, where that might be explored. What is immediately apparent is the narrowness and fragility of an entirely in-house platform due to the smallness of its developer community; and the need for a shared jargon (XIII.3) and perhaps a canon (XVI.13).

Given the corpus of extant code in the form of libraries, to adore antiquity is to know your platform, including its innards, not just thoughtless rote quoting via copy and paste. At this moment in software, to reuse and extend is a greater service than extraneous self-involvement masquerading as innovation.

If you can easily find some code and copy it, you get the result at zero cost. That is an efficiency that cannot be beaten: no amount of programming tool and technique improvements can ever do that. So we want to maximise reuse. – Fred Brooks, No Silver Bullet

XII.11 Let the prince be a prince 

Duke Ching of Ch’i asked Confucius about government. Confucius answered, ‘Let the ruler be a ruler, the subject a subject, the father a father, the son a son.’ The Duke said, ‘Splendid! Truly, if the ruler be not a ruler, the subject not a subject, the father not a father, the son not a son, then even if there be grain, would I get to eat it?’ — Analects XII.11 (Lau)

齐景公问政于孔子。孔子对曰,君,君,臣,臣,父,父,子,子。公曰,善哉,信如君不君,臣不臣,父不父,子不子,虽有粟,吾得而食诸。- 论语,十二:十一

This is one of the Analects’ clearest statements of the feudal and patriarchal social order that would later get the name Confucianism. Detach it, for a moment, from that overwhelming cultural context, and it’s also an expression of the separation of design concerns. The two can be contrasted. Every political pundit is a social engineer. They either advocate improvement to the design of the state, or argue a change will break the existing system.

Mencius (孟子) expanded on this sentiment in one of the earliest recorded defenses of the division of labour (Book 3 part 1 chapter 4, 3-4). Labour specialization works because humans have limits on the complexity of a task they can undertake, and are not cloneable or particularly fungible.

Software, by contrast, is highly specialized, but also cloneable at near zero cost. Software complexity has different boundaries. There are physical limits inherent to what Harrison Ainsworth calls engineering in a computational material. These are physical characteristics of algorithmic complexity or computability – limits on how fast a particular problem can be solved, if it can be solved at all. 

There are, by contrast, few physical limits to the conceptual complexity of a software component. Measures like cyclomatic complexity – roughly, the number of independent paths through a method, driven by its branches and choices – are considered high at values still orders of magnitude short of the physical limits imposed by compilers and interpreters. (I once worked on a system where other team members had, in their wisdom, exceeded the limit for the size of a single Java method in a long list of simple business transformation rules. Pushed by the very essence of the language to refactor, they proceeded to – what else? – push the remaining rules into longMethod2().)

The limits which measures like cyclomatic complexity indicate are human limits. They mark the soft edges of a space where humans can effectively create, manage, or even understand software. There are different ways of describing coding conventions, but they all seek to indicate a limit beyond which code becomes illegible.
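To make the measure concrete, here is a rough sketch that approximates cyclomatic complexity for Python source using the standard ast module. It counts one path plus one for each branching construct, which is a simplification of McCabe's graph-based definition; the function name, the choice of which nodes count, and the two sample snippets are assumptions for illustration, and real tools count more cases.

```python
import ast

# Constructs that add a decision point. This list is a simplification:
# it ignores comprehensions, match statements and other cases.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity of a whole source snippet:
    1 + number of decision points."""
    score = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' short-circuits twice: two extra paths.
            score += len(node.values) - 1
    return score


SIMPLE = '''
def label(n):
    return "even" if n % 2 == 0 else "odd"
'''

TANGLED = '''
def classify(n, strict):
    if n < 0 and strict:
        raise ValueError(n)
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    while n > 100:
        n //= 10
    return "prime-ish"
'''

if __name__ == "__main__":
    print(cyclomatic_complexity(SIMPLE))   # one conditional expression
    print(cyclomatic_complexity(TANGLED))  # if, and, for, if, while
```

A commonly cited convention, following McCabe, treats scores above about 10 as a warning sign. The parser would accept a score in the thousands without complaint; the threshold exists for the reader.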

Legibility is the term James C. Scott uses to describe the social engineering needs of a nascent or established state (Seeing Like A State). The mechanics of a working state require internal legibility. Those working for it must be able to measure and understand their environment in mutually compatible terms which also promote the success of the government. This is why feudal states have such a profusion of titles which become the name of the person (not Bob – The Duke of Marlborough). It is also why courtly dress has such systematic rules. This is seen particularly in bureaucratic feudal states, historically in East Asia (e.g. feudal Korea), but also in the Vatican, or in the badges at the postmodern World Economic Forum. These codes serve the dual purpose of defining the interfaces of the state and of making the role of the person instantly legible to one familiar with the system, all while tempting people with the markings of social status.

Marking lexemes by colour and shape according to their role is exactly what IDE syntax highlighting and pretty printing achieve. This is also the intent behind decoupling, encapsulation, and well-named entities (name oriented software). It makes the role of a component, from lexical to method, class and class pattern levels, readily legible to humans who must maintain and extend the system.
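As a sketch of the lexical end of that scale, the snippet below uses Python's standard tokenize module to tag each lexeme in a line of source with its role, which is the information a highlighter would then map to a colour. The lexeme_roles function and the role labels are hypothetical; an actual IDE uses its own lexer and a richer set of categories.

```python
import io
import keyword
import tokenize


def lexeme_roles(source: str) -> list[tuple[str, str]]:
    """Pair each lexeme with a coarse role label (hypothetical labels)."""
    roles = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            # The tokenizer reports keywords and identifiers alike as NAME.
            role = "keyword" if keyword.iskeyword(tok.string) else "name"
        elif tok.type == tokenize.NUMBER:
            role = "number"
        elif tok.type == tokenize.STRING:
            role = "string"
        elif tok.type == tokenize.OP:
            role = "operator"
        elif tok.type == tokenize.COMMENT:
            role = "comment"
        else:
            continue  # layout tokens (newlines, indents, end marker)
        roles.append((tok.string, role))
    return roles


if __name__ == "__main__":
    for lexeme, role in lexeme_roles("total = price * 2  # gross\n"):
        print(f"{role:>8}  {lexeme}")
```

The machine needs none of these labels to run the line; they are there so that a person can see at a glance which word is a title and which is merely a name.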

This strictness of role works well for machines made of non-sentient digital components. For systems where components are sentient meat, there are inevitable side effects. This is, perhaps, the core ethical dilemma Confucius concerns himself with: the demands of The State and The Way (道).

FUNCTIONS SHOULD DO ONE THING. THEY SHOULD DO IT WELL. THEY SHOULD DO IT ONLY. — Robert Martin, Clean Code