Remarks On The Foundations of Computer Science: Barry Brown
Barry Brown
Computing Science, University of Glasgow, Glasgow, G12 8QQ, United Kingdom
barry@dcs.gla.ac.uk
Abstract. Research in a number of fields has underlined the social nature of technical
work. Yet there are still few enquiries into the social aspects of computer science, in
particular how objects such as proofs, programming languages and the like interact with
and depend upon the work of computer scientists, programmers and users for their
existence. Following ethnomethodology’s hybrid program, this paper conceptually studies
the technical practice of computer science. Drawing an analogy with studies of the
nature of proofs, programming code is described as a lived object, one which is crafted
and depends upon the reading and writing of individuals for its sense. Generally, three
key aspects of this lived work in computer science are described: first, that it establishes
‘accountable objects’ – technical concepts which can be established through reading
instructions to demonstrate their status and facticity. Second, in these descriptions there
is a prospective orientation to their use - the future lived work with a technical concept is
of crucial importance in its evaluation and development. Lastly, these technical concepts
depend upon each other to form chains of generality – with objects connecting to other
objects of more or less specificity. I chart the course of one chain from type theory, to
programming language, to program, to system, and finally to use.
Introduction
Nearly twenty years after the publication of ‘Plans and situated action’,
ethnomethodology (EM) has had no inconsiderable influence in the study of
technology, and in informing its design, although mainly within academic work.
This adoption of EM within the CSCW (computer supported collaborative work)
and HCI (human computer interaction) research fields has done much to support
and revitalise EM within its home discipline of sociology. This paper reflects on
the relationship between computer science (CS) and ethnomethodology, asking
how this relationship has developed and how it might be furthered. I argue that
EM has seldom taken seriously the details of computer science itself, and the
technical practices of computer science more broadly. With a few notable
exceptions there have been very few attempts by ethnomethodologists to address
the core themes and problems of computer science. EM has instead been
developed as a way of contributing to end user system design.
While not denying the importance of designing systems for end users,
computer science is not just about designing the parts of systems which end users
directly see. This focus on the ‘end result’ ignores many of the sources of
inspiration and innovation within computer science, and the wide-ranging nature of
problems with which computer science grapples. Indeed, innovations from
computer science – and its fundamental contribution to society – have in many
cases come from how it has enabled computers to do new things that were
previously impossible, be that in the form of information retrieval algorithms for
web search engines, or architectures of mobile computing. That is to say,
innovation in computer science is not just about rearranging what was previously
possible into new user-focused configurations, but about making technology do
new things which it could not do before (things which of course may, or may not,
be of use). EM has focused on the ‘last 10%’ of technology innovation, ignoring
to a large extent work that makes the technologies work.
An understanding of the technical concepts of CS could contribute to a better
understanding of how such concepts develop, and also of the nature of technology and
why it develops in certain ways. This paper presents an attempt to reconsider this
relationship, asking how EM might look at and understand some of the ‘technical
concepts’ which computer science produces. In particular, the paper takes
discussion of scientific proofs and sees what we can learn from these discussions
for understanding the sort of concepts which are the currency of computer
science. This is developed in an analysis of authored computer code. Lastly, the
paper explores the relevance of these points for technical concepts in computer
science more generally.
Hybrid studies
One key theme in recent ethnomethodology has been the ‘hybridisation’ of EM
with whatever discipline or practice it studies. Garfinkel argues that, to be
close enough to whatever is being studied, EM must get to the point where it is
not just describing as an outsider, or re-representing whatever that activity or
discipline is. Instead, it should be able to teach and instruct others in whatever
that activity is, developing a set of techniques and methods which are uniquely
adequate for the particular practice being studied. Then EM will be not simply a
variant on sociology, but will instead show, through the working through of local
problems, the nature of what it is studying. A number of Garfinkel’s
students have followed this route - exploring how EM may be hybridised into
fields such as ethno-biology, ethno-mathematics and the like.
Yet it is also worth reflecting that EM’s most successful engagement, that with
technology design, has been the one where it has specifically not hybridised itself
with the host discipline. A community and a strength have been built precisely from
not simply disappearing into a hybrid form. Yet the lack of deep engagement
between EM and CS may have come at some cost. Crabtree, for example, {ref}
talks about ‘boiling pot hybrids’ where the engagement between the concepts of
EM and computer science is superficial. Dourish and Button have outlined
technomethodology {ref}, as an attempt to integrate key findings of
ethnomethodology into computer science more seriously. These papers challenge
the ways in which ethnomethodology has been taken as a design manifesto and
challenge EM and CS to develop further a hybridisation.
It is a goal of this paper to explore the potential for this hybridisation - for how
EM and CS might become more entangled. The hybridisation I explore, however, is not one
which seeks an engagement with design. Crabtree’s description of hybridisation
specifically focuses on the applications of technology:
“The hybrid programme requires that we subject the objects of computer
science – computer-based systems, applications and devices – to
ethnomethodological study and employ those studies to inform the ongoing
development of design practice. Distinctively, ethnomethodology does not
study the objects of design as technical objects – as, for example, mathematical
objects, or models, or structures of technical components”
That is to say, the hybridisation is not made with technology’s production, or
computer science per se. Instead, the hybridisation is with designs and design
itself as a practice which is in dialogue with use to produce its systems,
applications and devices. This sort of ethnomethodological hybrid work focuses on
design conversations and on technical objects as they are transformed in use – their
successes (as such) and revisions.
To my mind, this hybrid design leaves neglected the technical details at the
core of technology - ‘the objects of design as technical objects’. For example, the
program code which makes a prototype system work is an essential part of the
design process, and while it may not be immediately presented to the users, its
shortcomings can be immediately apparent at points of failure or breakdown. A
central observation of science studies was that in studying science we cannot just
look at scientific papers or textbooks, we need to go and look at the practice of
science. Similarly, to understand technology one needs to look at what is
involved in its construction, rather than just its end products in terms of
technology in use.
In Button and Dourish’s discussion of hybrid CS there is more of a focus on
design than on what we would describe as the deep technical core of CS. The focus
of much of their discussion is on what CS could learn, and how its designs could
be informed by, ethnomethodology. However, one promising argument they
make concerns the importance of accountability in the use of technology,
exploring how a computer ‘accounts for itself’ in an interaction with a user. The
example they give, of a file copy which gets stuck at 40% is revealing of how
systems often account for themselves in inappropriate ways. Yet still the focus
here is on what ethnomethodology can say to CS, less what we can learn about
CS itself.
To attack this problem, in this paper I borrow from research done on the nature
of scientific proof, particularly studies which have been done in science and
technology studies. These studies show, in different ways, the importance of the
social for scientific proof, and scientific concepts more generally. It is through
social action that scientific proofs are formed and practically engaged (whatever
their epistemological status). While this point may be seen as a technical
observation concerning the nature of science, I attempt to apply this observation
to computer science. My aim is to understand something about the core objects
of computer science – the base elements of its technical discourse – functions,
types, programs and the like.
To outline the argument: we will draw a number of lessons from sociology of
science, in particular work which has explored the nature of scientific proof.
Studies of proof have shown how proofs depend for their existence on ‘lived
work’ - they depend upon the reading, writing and working out of scientists. It is
not just the written document which establishes a proof, but that the document can
be read by a competent scientist as establishing the proof’s claim.
Proofs are something of a hard case for developing an understanding of what
science does; showing the ‘something else’ involved in proofs breaks open the
study of what are usually thought of as the purely technical parts of a science.
Drawing these points onto CS, it is not just that proofs in CS depend upon lived
work, but the whole range of technical concepts in computer science are not
purely technical objects, but instead are pairs depending upon the work of
reading, writing and calculation. Exploring this through some developments in the
design of programming languages helps explain the value of recent innovations.
1. First, proofs consist of two parts which are paired – the formal description of
the proof and the lived work of showing that the proof works. Both are
indispensable for the establishing of a proof.
2. Second, it follows that what proofs establish are accountable objects. The
objects they create can be seen by others to be sensible objects which can be
talked about and shared without problem, appearing reasonable within the
practices of a discipline, and so on. The pulsar which is discovered is
accountable in that the proof can be followed to find the pulsar. The proof of
the four colour problem can be followed to show that four colours suffice – it
is accountable.
3. Lastly, in the formulation and authoring of a proof there is an orientation to
its future use, be that simply its reading by other researchers or, more broadly,
that it is an ‘interesting discovery’ which sheds light on other problems.
To computer science
Proofs play an active role in computer science, particularly theoretical
computer science, and it would be reasonably straightforward to develop an
argument that proofs in CS have similar characteristics to the proofs described
above. However, rather than address CS proofs, my interest is more in what we
can learn about the ‘technical concepts’ which are core to CS. That is to say, do these
points apply not only to proofs, as technical products of science, but also to
the technical products of CS - types, objects, programs, functions and the like?
Can we draw a useful analogy between proofs and computer code, for example?
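Consider, for illustration, three such fragments. The original listings are not reproduced here, so the following are plausible stand-ins of my own choosing rather than the paper’s examples – here a Python function, a JavaScript function and a Java class:

```
add = lambda x, y: x + y        # Python: a bare function; also works on strings, lists...
add(1, 1)

const add = (x, y) => x + y;    // JavaScript: likewise a bare function, not numbers only
add(1, 1);

class Adder {                   // Java: a class with a method, over ints only
    int add(int x, int y) { return x + y; }
}
new Adder().add(1, 1);
```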
Each of these lines of code is obviously different, yet with some equivalence -
they all compute the sum 1+1. For each of these representations it is possible that,
when executed on a computer they will cause the computer to do the same thing
(perhaps even exactly the same thing). In this practical sense then these programs
are equivalent. Yet these programs are not equivalent to a programmer - they
mean very different things and would be used in very different ways. The first
two functions can work over more than numbers, whereas the Java example will
only work over numbers. The Java program creates a class with a method,
whereas the first two define only functions. In this sense programs are much
more than their execution on particular machines - they are objects themselves
with important properties for those who read and write them. Programming code
is much more than simply its execution on particular computers, it is an object
with importance in its own right.
Going further, following the first observation above concerning proofs, we can
assert that program code is a paired object - it has a formal description in terms of
the lines of the code, and the lived work of reading and writing program code. As with
proofs this might seem to be an unusual assertion – computer programs surely
exist on their own without the need for a pair. Yet consider the activities involved
in the writing, authoring and running of computer programs. Programs
themselves are written in languages which are both human and machine readable.
They are an intermediate representation which is executed by the computer in a
more or less complex execution process, one which also involves humans.
Computer programs rely for their meaning not only on other computer
programs which execute them, but on humans who read and write that code. The
status of the code as executable is central, but also code that is comprehensible by
others. The lifeworlds of programmers are full of discussions of code, of how to do
different things and predominantly how to link code into the complex ecology of
computational systems that already exists. Code – as with proofs - is a living
social object. Indeed, in the execution of code there are usually interactions
between user(s) and system. Even ‘batch’ processes have as an end result some
output to a user. The execution of code is therefore something which makes sense
in the interactions it produces with users. This underlines the paired nature of
program code – both in its creation and execution code is a social object, the
formal code coexists with and depends upon lived work for its sense. Program code is
thus a lebenswelt pair: it consists of two mutually dependent parts – the formal
notation of the code itself, and the lived work of reading, writing and running it.
Computer programs can also be seen as ‘accountable objects’, in that they are
seen as programs because of, and gain their sense from, the ways in which code
makes sense to programmers. What makes a computer program a valid computer
program is that it makes some sort of sense as a valid set of instructions which has
some sort of purpose. While computers will happily execute random instructions,
those instructions are not a computer program - something which a human has
written and could potentially read - but purposeless nonsense. Valid computer programs are
programs because of their authorship and potential execution by a computer.
Programs depend upon the writing and reading of authors to be established as
valid programs, as well as their execution by a user on a computer.
The four colour proof’s status as a valid proof depended upon mathematicians
accepting the output of that program as showing that each of the different graphs
was reducible. In turn, the status of a computer program depends upon that code
being potentially readable by a programmer. Computer programs are
‘accountable’ in that they contain within them instructions for their valid
execution and comprehension by both users and programmers. It is both valid
execution and comprehension which makes valid programs.
Program code is not random text, not only because it will compile or execute,
but because it can be read by a programmer as doing something. The code
extracts above are thus lines of code not simply because they are syntactically
correct, in known languages, and would compile on a computer (although these
things are important), but because they are comprehensible and understandable
to those reading them.
Following our remarks on the status of proof, the future readability and
comprehensibility of computer programs becomes of interest. Obviously, this is a
key practical concern of programmers - and not only for the benefit of other
programmers – programmers need to be able to read their code as they write it. As
they author their code line by line, they need to write code which gets its purpose
both from previous lines of code and from those not yet written. The creation of
programs line by line thus has an orientation to future lines of code that will be
written - other objects, modules and the like which will connect with and use this
code.
More broadly, code may be used or read by others. All code that is written thus
has some sort of orientation to what comes next, to the future reading and writing
of code. Code is not written simply to be executed; in
practice it will be read many times, if only by the programmer who wrote it. As
with proofs then, code has an orientation to its future use.
Literate programming
These three points – the paired nature of code, its accountability and its future
orientation – can be seen in the discussion and championing of ‘literate
programming’ by Donald Knuth. Knuth argued that computer programs are read
not only by computers during their execution, but also by other future
programmers. He proposed a change to the traditional process of compilation,
whereby a computer program is converted from a human readable form to a form
more directly executable by a computer. Rather than compilation being a one-way
process from program code to executable, he argued that there should be two outputs
of compilation – a human readable document, along with the executable code.
Knuth argued that programmers were like authors, writing text which was to be
later understood by others.
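In the spirit of Knuth’s WEB/CWEB system, for example, a single source file interleaves typeset documentation with code chunks; a minimal sketch of such a file (the routine shown is invented purely for illustration) might look like:

```
@ This chunk computes the running total used by
the report-formatting routine below.

@c
int add(int x, int y)
{
    return x + y;
}
```

One tool (weave) turns such a file into a typeset document for human readers, while another (tangle) extracts the executable code – the two outputs of compilation described above.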
Readability
One observation that we can draw, and one promising approach for designing
programming styles or languages, concerns the accountable nature of code. Approaches
which attempt to make code more accountable - its function clearer - are likely to
be more promising than those which simply attempt to add descriptions to it.
This corresponds very much with readability as a goal of programming
language design. However, while there is a body of work which explores
techniques such as program visualisations, text presentation and such,
comprehensibility has been somewhat neglected. While innovations in
programming languages have often been powerful because they support increased
readability, they are usually presented as improving code’s power, type correctness
and so on.
One example of this is the introduction of generics into the Java programming
language. This recent addition to Java allows programmers to be more expressive
about the types that they are describing. Previously, if one was doing something
like creating a list of things, it was not possible to specify the type of the things
that the list contained. So if one wanted to create a list of pictures, that would be
created using much the same line of code as that for a list of numbers, and there was
little way of differentiating between them. Using generic types in Java, however, it
is now possible to specify more clearly what you are defining. Previously, creating a
list of numbers would be specified so:
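A minimal sketch of such a pre-generics declaration follows; the identifier myList is taken from the discussion, while the surrounding class is my addition, only there to make the fragment self-contained:

```java
import java.util.LinkedList;
import java.util.List;

public class RawListExample {
    public static void main(String[] args) {
        // A raw list: the declaration says nothing about the element type
        List myList = new LinkedList();
        myList.add(Integer.valueOf(1));   // an Integer...
        myList.add("two");                // ...but a String is accepted just as happily
        System.out.println(myList);
    }
}
```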
This gives little idea that myList is a list of numbers: the declared type is
simply List, which says nothing about what the list contains.
With Generics this becomes:
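Again a minimal self-contained sketch, with the wrapping class my own addition:

```java
import java.util.LinkedList;
import java.util.List;

public class GenericListExample {
    public static void main(String[] args) {
        // The declaration now states that this is a list of Integers
        List<Integer> myList = new LinkedList<Integer>();
        myList.add(1);            // autoboxed to Integer
        // myList.add("two");     // would now be rejected at compile time
        System.out.println(myList);
    }
}
```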
It should be pointed out that these additions to Java do little to increase the
efficiency of the language. Instead, they support code which is more obvious in
its intent. Generics also help the programmer when they are writing
the code in that they prevent the programmer writing code which will assign a
non-integer to a list of integers. In this way it further preserves the type safety of
Java programs.
Indeed, more broadly the accountability of program code goes some way to
explaining the popularity of open source software, and the use of open source
software in other projects. Since code is freely available, if programmers want
to make use of a particular package they have access to a complete description of
what the program does and how it does it – the program code itself. They can use
this both to understand the program if it does not act as expected, or alternatively
to change the code to do as they require. In many ways this goes against the
principles of encapsulation, that code should only present a specific interface to
the world, since it allows programmers access to the internal code of a program
module. Yet this access to code supports the accountability of program code
further.
The NX bit
A similar analysis can be made of a number of other technical concepts used in
computer science. Broadly, looking at technical concepts in this way makes us
ask, what is the social part that makes up the pair of the technical concept?
Moreover, it focuses our attention on the social work which makes that thing what
it is - its accountability.
While there is not space here to do an extended analysis, a short analysis can
be sketched of another technical object from computer science – the ‘NX’ or
no-execute bit. In a computer’s memory this is a way of designating
areas of memory which do not contain program code. These are areas of
memory which contain, for example, pictures or data which it would not be
suitable to execute. Unfortunately, many viruses use bugs in program code to
trick the computer into trying to execute data from, for example, a web page
which is held in memory. This can be used to get the machine to run damaging
code.
The NX bit goes some way to stopping this by marking memory as “non-executable”.
This marking is applied to the whole of memory by default, with program
code then marked as “executable”. This means that programs will work as expected
until there is a problem, either deliberate (as with a virus) or accidental. The NX bit
will prevent the non-code memory region from being executed, and the operating
system will attempt to repair or close the offending program.
Now, as with the description of program code as a technical object I can sketch
out a similar description of the NX bit. Again, at first glance the NX bit may seem
be a purely technical creation. However, in practice what is or is not set as NX is
under the control of programmers, users and organisations. It is their practices of
coding and organising what they do that will influence what individual computers
do. It is this lived work again that changes the invisible operations of millions of
computer chips into the NX bit. Since it is people who ultimately decide and
control - through the code they write - which parts of memory are designated as
executable or not, the NX bit also subtly changes who has control over the
execution of the machine. The specific aim is to prevent hackers from writing
code which they can execute on your machine without your permission - to deny
others the power to execute programs on your computer. The NX bit is thus
involved in the question of who has control over your machine and who can get it
to execute the code they want. Indeed, in future, as copyright protection is
controversially incorporated into standard processors, who has the power to
control what code is executed on machines could become a contested area.
What might seem a technical object thus becomes very much a social concern.
Conclusion
This paper draws some lessons from the study of proofs, in the sociology of
science, to produce a re-description of the technical details of computer science.
The argument, broadly stated, is that the many technical parts of computer science
- programs, types and the like - while usually considered as purely technical
creations, are lebenswelt (or lifeworld) pairs - they depend for their existence as
much on the lived work of computer scientists, programmers and users as they do
on their particular formalisms. It is through the combination of lived
work and formal presentation that these concepts gain their stability and
reliability. A line of code is a line of code not simply because it is in a formal
notation but because it makes sense to a programmer within the context of a
program.
In some senses this is not a surprising point to make: for a notation like the
lambda calculus it is difficult to dispute the human elements in its creation
(although the nature of computation it depicts may be more controversial).
Program code depends for its existence on programmers who author and read that
code. Yet a key finding we drew from Garfinkel et al was that many technical
concepts are accountable objects. That is to say that they depend on both the
formal presentation and the lived work around them for their status as what they
are. That is to say, program code is not just random lines of maths, because it is
authored by and readable to other programmers. The lived work around program
code is not an optional extra: the code depends upon it for its existence. The very fact
that a programmer can read the code, and make some sort of sense of what that
program does, establishes the code as a program.
More broadly, the range of technical concepts which are the currency of
computer science - types, functions and the like - are also paired in this way, and
depend upon human activity around those technical concepts to make them what
they are. Moreover, they are arranged and used in such a way that their use
makes sense to the computer scientists who read and reason with them.
Drawing from the sociology of science, I have sketched in this paper an
argument which draws a number of findings about the nature of proofs, and
applied them more broadly to technical concepts used in computer science. In
particular, applying this analysis to program code underlined the importance of
readability in programming languages.