Realization is Universal*

by
J. A. GOGUEN†
Mathematical Sciences Department
IBM T. J. Watson Research Center
Yorktown Heights, New York 10598

ABSTRACT

Behavior is left adjoint to minimal realization, as functors between certain categories
of machines and behaviors. This gives a succinct characterization of minimal realization
valid for discrete as well as linear machines, with or without finiteness condition.
Realization theory is therefore expressed in contemporary algebra in a way which
reveals its inner structure and suggests generalizations. An adjunction between
regular sets and finite state acceptors follows as a corollary.

It is well known that the Nerode construction gives a canonical realization,
but the sense in which "canonical" is to be understood seems not to have been
stated sufficiently strongly or generally. This paper gives conditions which
characterize minimal realization uniquely up to isomorphism. The very same
characterization statements are valid for all categories of machines one could
expect, for example, discrete machines and linear machines, with or without
finiteness condition. The characterizing conditions are universal, in the sense of
categorical algebra.
If $M$ is a machine, let $EM$ denote its external behavior. Then the universal
statement is that a machine $N$ is a minimal realization of a behavior $f$ if and
only if $N$ is reachable, $EN = f$, and every morphism $EM \to f$ surjective on
input alphabets, with $M$ reachable, is $Eg$ for a unique machine morphism
$g: M \to N$. An equivalent statement is that minimal realization is right adjoint
to behavior, as functors between categories of machines and behaviors to be
defined later. In mathematics, adjoint situations have the connotation of
fundamental structural relationships; see MacLane [8]. A by-product of this is
the fact that the minimal or reduced machines form a reflexive subcategory.
These results embed realization theory in contemporary algebra in a way which
reveals inner structure and suggests many rich generalizations; see Goguen [6].
In system theory two common approaches are the state variable [1] and
input-output relation [1% We wish to suggest, as a philosophical principle

* Soli Deo gloria.


† Written on leave from the Committee on Information Sciences, University of Chicago.
Address during 1972-73: Department of Computer Science, UCLA, Los Angeles, Calif. 90024.
arising from the present work, that these approaches are not opposed, but rather
dual, or more precisely, adjoint. This principle yields theorems in various specific
situations more complex than treated here; see [5]. Our last section also shows
the adjointness of regular sets and finite state acceptors.
The reader is assumed sufficiently familiar with automaton and control
theories not to require much by way of motivation for our definitions, which
are most often equivalent to the standard ones. In case of need, Arbib [1],
Arbib and Zeiger [2], and Kalman [7] can be consulted. The reader who is
only somewhat familiar with categorical algebra may consult the Appendix
to this paper, which contains all the definitions and results which we use. Since
limitations on space preclude our giving very many examples or proofs, a
reader with no categorical background may wish to consult some textbook
before reading this paper. MacLane [8] is especially recommended, and MacLane
and Birkhoff [9], Chapter 15, contains a fairly elementary exposition of basic
concepts.
The following is written explicitly for arbitrary discrete machines. The
small changes that have to be made to treat finite discrete machines and linear
machines (also called sampled data systems) are indicated in Sections (O)
and (P) below; these results are an improvement of Arbib and Zeiger's [2]
simplification of Kalman's [7] work.

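(A) A machine $M$ is a sextuple $\langle X, S, Y, \delta, \lambda, \sigma \rangle$, where $X$, $S$, $Y$ are sets
(called the input, state, and output sets or alphabets, respectively), $\delta: X \times S \to S$,
$\lambda: S \to Y$, and $\sigma \in S$ (called the transition and output functions, and initial state,
respectively). A morphism $M \to M'$ of machines is a triple $\langle a, b, c \rangle$ of maps,
$a: X \to X'$, $b: S \to S'$, and $c: Y \to Y'$, such that¹ $b(\sigma) = \sigma'$ and the diagrams

$$
\begin{array}{ccc}
X \times S & \xrightarrow{\ \delta\ } & S \\
{\scriptstyle a \times b}\downarrow & & \downarrow{\scriptstyle b} \\
X' \times S' & \xrightarrow{\ \delta'\ } & S'
\end{array}
\qquad\text{and}\qquad
\begin{array}{ccc}
S & \xrightarrow{\ \lambda\ } & Y \\
{\scriptstyle b}\downarrow & & \downarrow{\scriptstyle c} \\
S' & \xrightarrow{\ \lambda'\ } & Y'
\end{array}
$$

commute; that is, $b\,\delta(x, s) = \delta'(a(x), b(s))$ and $c\,\lambda(s) = \lambda'(b(s))$ for all $x \in X$, $s \in S$.
We shall call $a$, $b$, and $c$ respectively the first, second, and third, or
input, state, and output, components of the morphism $\langle a, b, c \rangle$. It is easy to
verify that machines and their morphisms constitute a category² under component-wise
composition, which we denote $\mathbf{Mach}$.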

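1 For consistency, one might wish to define the initial state by a map $\sigma: 1 \to S$, where $1$
is a one-point set, and if $*$ is that point, $\sigma(*) \in S$ is the initial state. For then the condition
$b(\sigma) = \sigma'$ can be expressed by commutativity of the diagram

$$
\begin{array}{ccc}
1 & \xrightarrow{\ \sigma\ } & S \\
 & {\scriptstyle \sigma'}\searrow & \downarrow{\scriptstyle b} \\
 & & S'
\end{array}
$$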


2 This category is complete (or in the finite case, finitely complete) as it is easily seen to
have equalizers and products (or finite products).
(B) Given $M$, define $\delta^+: X^* \to S$ recursively by $\delta^+(\Lambda) = \sigma$ and $\delta^+(xw) =
\delta(x, \delta^+(w))$, for $x \in X$ and $w \in X^*$, where $X^*$ is the free monoid generated by
$X$ and $\Lambda$ is the empty word. If $\langle a, b, c \rangle: M \to M'$, then the diagram

$$
\begin{array}{ccc}
X^* & \xrightarrow{\ \delta^+\ } & S \\
{\scriptstyle a^*}\downarrow & & \downarrow{\scriptstyle b} \\
X'^* & \xrightarrow{\ \delta'^+\ } & S'
\end{array}
$$

commutes, where $a^*: X^* \to X'^*$ is the letter-wise extension of $a: X \to X'$ to
words, defined by $a^*(\Lambda) = \Lambda$ and $a^*(xw) = a(x)a^*(w)$. For $b\delta^+(\Lambda) = b(\sigma) = \sigma'$,
while $\delta'^+a^*(\Lambda) = \delta'^+(\Lambda) = \sigma'$; now assuming $b\delta^+(w) = \delta'^+a^*(w)$ it follows that
$b\delta^+(xw) = \delta'^+a^*(xw)$, since $b\delta^+(xw) = b\delta(x, \delta^+(w)) = \delta'(a(x), b\delta^+(w))$ by the
first diagram of (A), and $\delta'^+a^*(xw) = \delta'^+(a(x)a^*(w)) = \delta'(a(x), \delta'^+(a^*(w)))$.
We say that $M$ is reachable if $\delta^+$ is surjective, and we denote the full subcategory
of reachable machines by $\mathbf{Mach}_r$. Denote the subcategory³ of $\mathbf{Mach}_r$
all of whose morphisms have surjective input component by just $\mathbf{M}$. This is the
category of machines we will find most useful.
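
The definitions in (A) and (B) can be made concrete for finite machines. The following is only a minimal sketch under stated assumptions: the class `Machine`, the helper names, and the example machine `parity` are hypothetical illustrations and do not come from the paper; the output set Y is left implicit as the range of `lam`.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Hashable


@dataclass(frozen=True)
class Machine:
    """A discrete machine <X, S, Y, delta, lam, sigma>; Y is left implicit."""
    inputs: frozenset     # X, the input alphabet
    states: frozenset     # S, the state set
    delta: Callable       # transition function, delta(x, s) -> next state
    lam: Callable         # output function, lam(s) -> output
    sigma: Hashable       # initial state


def delta_plus(m: Machine, word) -> Hashable:
    """delta+(Lambda) = sigma and delta+(xw) = delta(x, delta+(w)).

    Under this recursion the LAST letter of the word is consumed first.
    """
    s = m.sigma
    for x in reversed(word):
        s = m.delta(x, s)
    return s


def reachable_states(m: Machine) -> set:
    """Image of delta+, found by breadth-first search from sigma."""
    seen = {m.sigma}
    queue = deque([m.sigma])
    while queue:
        s = queue.popleft()
        for x in m.inputs:
            t = m.delta(x, s)
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen


def is_reachable(m: Machine) -> bool:
    """M is reachable when delta+ is surjective onto S."""
    return reachable_states(m) == set(m.states)


# Hypothetical example: a parity counter with one extra, unreachable state.
parity = Machine(
    inputs=frozenset({0, 1}),
    states=frozenset({0, 1, "dead"}),
    delta=lambda x, s: s if s == "dead" else (s + x) % 2,
    lam=lambda s: s,
    sigma=0,
)
```

Here `is_reachable(parity)` is false because no word drives the machine into the state `"dead"`; deleting that state gives an object of the category of reachable machines.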

(C) A behavior is a function $f: X^* \to Y$, where $X$ and $Y$ are sets; a morphism
$f \to f'$ of behaviors is a pair $\langle a, c \rangle$, where $a: X \to X'$ and $c: Y \to Y'$ are such that
the diagram

$$
\begin{array}{ccc}
X^* & \xrightarrow{\ f\ } & Y \\
{\scriptstyle a^*}\downarrow & & \downarrow{\scriptstyle c} \\
X'^* & \xrightarrow{\ f'\ } & Y'
\end{array}
$$

commutes. Behaviors and their morphisms constitute a category under component-wise
composition, denoted $\mathbf{Beh}$. Let $\mathbf{B}$ denote the subcategory⁴ whose
morphisms have surjective input component.

(D) Given $M$, call $EM = \lambda\delta^+: X^* \to Y$ the external behavior⁵ of $M$, and
given $\langle a, b, c \rangle: M \to M'$, let $E\langle a, b, c \rangle = \langle a, c \rangle$. Then $E\langle a, b, c \rangle$ is a
morphism of behaviors; for commutativity of the diagram in (C) follows from
concatenation of the square in (B) with the second square in (A). Moreover,
$E$ is a functor $\mathbf{Mach} \to \mathbf{Beh}$, since preservation of composition follows from
concatenation of squares in $\mathbf{Beh}$. Because $a$ is surjective in $\langle a, c \rangle$ if it is surjective
in $\langle a, b, c \rangle$, $E$ is also a functor $\mathbf{M} \to \mathbf{B}$. We say that $M$ realizes $f$ if $EM = f$.

3 It can be shown that $\mathbf{Mach}_r$ is complete (or finitely complete in the finite case), but that
$\mathbf{M}$ has neither products nor equalizers. These facts are not used in this paper.
4 It can be shown that $\mathbf{Beh}$ is complete (or finitely complete with the finiteness condition),
but $\mathbf{B}$ has neither equalizers nor products.
5 In some ways it would be more satisfying to define the external behavior to be the input
sequence to output sequence function $\bar f: X^* \to Y^*$, defined by the concatenation
$\bar f(x_0 \cdots x_n) = f(x_0) f(x_0 x_1) \cdots f(x_0 \cdots x_n)$, where $f = \lambda\delta^+$. For this would be the
input-output component of the behavior of the system diagram for the machine, in the general sense of Goguen [4].
However, it is easy to show that $f$ and $\bar f$ determine each other uniquely, and $f$ will be technically
more convenient here.
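(E) Given $f: X^* \to Y$, let $Nf$ denote its Nerode realization $\langle X, S_f, Y, \delta_f, \lambda_f, \sigma_f \rangle$,
where $S_f = X^*/{\equiv_f}$, with $w \equiv_f v$ if and only if $f(uw) = f(uv)$ for all
$u \in X^*$; $\delta_f(x, [w]) = [xw]$, where $[w]$ denotes the $\equiv_f$-equivalence class of $w \in X^*$;
$\lambda_f([w]) = f(w)$; and $\sigma_f = [\Lambda]$. The Nerode relation $\equiv_f$ is easily seen to be
reflexive, transitive, and symmetric; write $[w]_f$ for the $\equiv_f$-class of $w$ if there is
a danger of confusion, and $[w]$ otherwise. Let us check that $\lambda_f$ is well-defined:
that $w \equiv_f v$ implies $f(w) = f(v)$ follows from setting $u = \Lambda$ in the definition of $\equiv_f$.
Given $\langle a, c \rangle: f \to f'$, define $N\langle a, c \rangle = \langle a, b, c \rangle: Nf \to Nf'$ by $b[w]_f =
[a^*(w)]_{f'}$. This is well-defined provided $a$ is surjective: for $w \equiv_f v$ if and only if
$f(uw) = f(uv)$ for all $u \in X^*$, and $a^*(w) \equiv_{f'} a^*(v)$ if and only if $f'(u'a^*(w)) =
f'(u'a^*(v))$ for all $u' \in X'^*$, and since $a$ is surjective, any $u' \in X'^*$ can be written
$a^*(u)$ for some $u \in X^*$, and then we are reduced to showing $f'(a^*(u)a^*(w)) =
f'(a^*(u)a^*(v))$, which follows from the square in (C).
We now show that $N\langle a, c \rangle = \langle a, b, c \rangle$ is a morphism of machines; that
is, we show that $b(\sigma_f) = \sigma_{f'}$, and that the diagrams

$$
\begin{array}{ccc}
X \times S_f & \xrightarrow{\ \delta_f\ } & S_f \\
{\scriptstyle a \times b}\downarrow & & \downarrow{\scriptstyle b} \\
X' \times S_{f'} & \xrightarrow{\ \delta_{f'}\ } & S_{f'}
\end{array}
\qquad\text{and}\qquad
\begin{array}{ccc}
S_f & \xrightarrow{\ \lambda_f\ } & Y \\
{\scriptstyle b}\downarrow & & \downarrow{\scriptstyle c} \\
S_{f'} & \xrightarrow{\ \lambda_{f'}\ } & Y'
\end{array}
$$

commute. First, $b(\sigma_f) = [a^*(\Lambda)]_{f'} = [\Lambda]_{f'} = \sigma_{f'}$. Next, $b\delta_f(x, [w]_f) = b([xw]_f) =
[a^*(xw)]_{f'}$, while $\delta_{f'}(a(x), b([w]_f)) = \delta_{f'}(a(x), [a^*(w)]_{f'}) = [a(x)a^*(w)]_{f'} =
[a^*(xw)]_{f'}$. Finally, $c\lambda_f([w]_f) = cf(w)$, while $\lambda_{f'}(b([w]_f)) = \lambda_{f'}([a^*(w)]_{f'}) = f'a^*(w)$,
and $cf(w) = f'a^*(w)$ by the square of (C).
It is now easy to show that $N: \mathbf{B} \to \mathbf{M}$ is a functor by verifying that
composition is preserved and noting that $Nf$ is always reachable.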

(F) Given $f: X^* \to Y$, we show that $\delta_f^+(w) = [w]_f$ in $Nf$. First, $\delta_f^+(\Lambda) =
\sigma_f = [\Lambda]_f$. Next, under the inductive assumption that $\delta_f^+(w) = [w]_f$ for all
$w \in X^*$ of length $n \ge 0$, we show the same for length $n + 1$. For $\delta_f^+(xw) = \delta_f(x,
\delta_f^+(w)) = \delta_f(x, [w]_f) = [xw]_f$ for $x \in X$, $w \in X^*$ of length $n$.
Since each element of $S_f$ is of the form $[w]_f$, it follows that $Nf$ is reachable.
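
The Nerode realization of (E) and the identity of (F) can be explored numerically. The following is only a bounded sketch, not the construction itself: the relation $w \equiv_f v$ quantifies over all $u \in X^*$, whereas the code tests prefixes $u$ only up to a length `test_len` and samples words only up to `word_len`, so it recovers $Nf$ only when those bounds happen to suffice. All function names and the example behavior `f` are hypothetical.

```python
from itertools import product


def words(alphabet, max_len):
    """All words over `alphabet` of length 0..max_len, as tuples."""
    letters = sorted(alphabet)
    for n in range(max_len + 1):
        yield from product(letters, repeat=n)


def signature(f, w, alphabet, test_len):
    """Values f(uw) for every test prefix u with |u| <= test_len.

    Equal signatures approximate w =_f v, which quantifies over ALL u.
    """
    return tuple(f(u + w) for u in words(alphabet, test_len))


def approx_nerode(f, alphabet, word_len, test_len):
    """Bounded sketch of Nf: returns (states, delta_f, lam_f, sigma_f)."""
    rep = {}  # signature (standing in for a class [w]) -> representative w
    for w in words(alphabet, word_len):
        rep.setdefault(signature(f, w, alphabet, test_len), w)

    def delta_f(x, s):          # delta_f(x, [w]) = [xw]
        return signature(f, (x,) + rep[s], alphabet, test_len)

    def lam_f(s):               # lam_f([w]) = f(w)
        return f(rep[s])

    sigma_f = signature(f, (), alphabet, test_len)   # [Lambda]
    return set(rep), delta_f, lam_f, sigma_f


def run(delta, sigma, word):
    """delta+ of (B): the last letter of the word is consumed first."""
    s = sigma
    for x in reversed(word):
        s = delta(x, s)
    return s


# Hypothetical behavior: 1 when the number of 1s is divisible by 3, else 0.
def f(w):
    return int(sum(w) % 3 == 0)


states, delta_f, lam_f, sigma_f = approx_nerode(f, {0, 1}, word_len=3, test_len=2)
```

For this particular `f` the chosen bounds separate the three residue classes, so the sketch yields three states, and running a word $w$ from `sigma_f` lands in the (approximated) class of $w$, as (F) asserts for the exact construction.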

(G) $EN = \mathbf{B}$, the identity functor⁶ on $\mathbf{B}$.

Proof. Let $f: X^* \to Y$. Then $Nf = \langle X, S_f, Y, \delta_f, \lambda_f, \sigma_f \rangle$, and by (F),
$(ENf)(w) = \lambda_f\delta_f^+(w) = \lambda_f([w]) = f(w)$; i.e., $ENf = f$, proving (G) for objects. But
$EN\langle a, c \rangle = E\langle a, b, c \rangle = \langle a, c \rangle$, so (G) also holds for morphisms, provided
the domain and codomain behaviors of $EN\langle a, c \rangle$ are the same as for $\langle a, c \rangle$.
But that follows from $ENf = f$, proved above.
(H) Given $M$, let $\eta_M: M \to NEM$ denote $\langle X, q_M, Y \rangle$, in which $X$ and $Y$ are
used to indicate the identity functions, and where $q_M(s) = [w]_f$ for $s \in S$, with $f = EM$ and
$w \in X^*$ chosen such that $\delta^+(w) = s$; this is possible because of reachability.

6 In this paper we often identify objects with their identity morphisms. Extended to the
category of categories, in which functors are morphisms and categories objects, this gives the
notation used in (G); it is convenient in later calculations. See also the Appendix.
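First, $q_M$ is well defined. For if $\delta^+(w) = \delta^+(v) = s$, then $w \equiv_f v$, since for every
$u \in X^*$ the state $\delta^+(uw)$ depends on $w$ only through $\delta^+(w)$, so that $f(uw) =
\lambda\delta^+(uw) = \lambda\delta^+(uv) = f(uv)$. Denote $q_M$ by just $q$ if confusion is unlikely.
We now show that $\langle X, q, Y \rangle$ is a machine morphism. First, $q(\sigma) = [\Lambda]_f =
\sigma_f$; it remains to show that the diagrams

$$
\begin{array}{ccc}
X \times S & \xrightarrow{\ \delta\ } & S \\
{\scriptstyle X \times q}\downarrow & & \downarrow{\scriptstyle q} \\
X \times S_f & \xrightarrow{\ \delta_f\ } & S_f
\end{array}
\qquad\text{and}\qquad
\begin{array}{ccc}
S & \xrightarrow{\ \lambda\ } & Y \\
{\scriptstyle q}\downarrow & & \downarrow{\scriptstyle Y} \\
S_f & \xrightarrow{\ \lambda_f\ } & Y
\end{array}
$$

commute. Let $s \in S$ and assume $\delta^+(w) = s$. Then for $x \in X$, $q(\delta(x, s)) = [xw]$
since $\delta^+(xw) = \delta(x, s)$, while $\delta_f(x, q(s)) = \delta_f(x, [w]) = [xw]$ also. Finally,
$\lambda_f q(s) = \lambda_f([w]) = f(w)$, while $\lambda(s) = \lambda\delta^+(w) = f(w)$.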
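(I) $\eta$ is a natural transformation $\eta: \mathbf{M} \Rightarrow NE$, where $\mathbf{M}$ denotes the identity
functor.
Proof. Let $\langle a, b, c \rangle: M \to M'$. We show that

$$
\begin{array}{ccc}
M & \xrightarrow{\ \eta_M\ } & NEM \\
{\scriptstyle \langle a, b, c \rangle}\downarrow & & \downarrow{\scriptstyle NE\langle a, b, c \rangle} \\
M' & \xrightarrow{\ \eta_{M'}\ } & NEM'
\end{array}
$$

commutes, by showing commutativity for each component. Since $N$ and $E$ do
not change the input and output components, it suffices to check the state
component. Letting $NE\langle a, b, c \rangle = N\langle a, c \rangle = \langle a, b', c \rangle$, we must show that the diagram

$$
\begin{array}{ccc}
S & \xrightarrow{\ q_M\ } & S_f \\
{\scriptstyle b}\downarrow & & \downarrow{\scriptstyle b'} \\
S' & \xrightarrow{\ q_{M'}\ } & S_{f'}
\end{array}
$$

commutes, where $f = EM$ and $f' = EM'$. Let $s \in S$, and assume $\delta^+(w) = s$. Let $s' = b(s)$. Then $\delta'^+(a^*(w)) =
s'$, by the square in (B), so that $q_M(s) = [w]_f$ and $q_{M'}(s') = [a^*(w)]_{f'}$. We therefore
must show that $b'([w]_f) = [a^*(w)]_{f'}$; but this is already the definition of the
state component $b'$ of $N\langle a, c \rangle$ given in (E).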
(J) $E\eta_M = EM$, the identity morphism in $\mathbf{B}$ on $EM$.
Proof. Since $\eta_M: M \to NEM$ in $\mathbf{M}$, $E\eta_M: EM \to ENEM$ in $\mathbf{B}$. But by (G)
$EN(EM) = EM$, so that $E\eta_M: EM \to EM$. Since $\eta_M = \langle X, q, Y \rangle$, $E\eta_M =
\langle X, Y \rangle$, the identity morphism on $EM: X^* \to Y$.
This result says reduction for reachable machines preserves behavior.

7 The morphism $\eta_M: M \to NEM$ is the mapping of the machine $M$ onto its minimal
realization. Thus $NEM$ should be isomorphic to the usual quotient machine constructed from
$M$. This will follow from later results.
(K) $\eta_{Nf} = Nf$, the identity morphism on $Nf$ in $\mathbf{M}$.
Proof. Given $f: X^* \to Y$ in $\mathbf{B}$, $Nf = \langle X, S_f, Y, \delta_f, \lambda_f, \sigma_f \rangle$, and $\eta_{Nf}: Nf \to
NENf = Nf$, again by (G). Since $\eta_{Nf} = \langle X, q_{Nf}, Y \rangle$ by (H), we need only show
that $q_{Nf} = S_f$, the identity on $S_f$. Every state in $S_f$ is of the form $[w]_f$. By (F)
$\delta_f^+(w) = [w]_f$, so $q_{Nf}([w]_f) = [w]_f$, by its definition the $\equiv_f$-class of some
$w \in X^*$ which $\delta_f^+$ moves to $[w]_f$. But this just says $q_{Nf} = S_f$, as desired.
This result says that applying $NE$ to a Nerode machine has no effect; that is,
Nerode machines are already reduced (this is meant intuitively for now; but the term
"reduced" is defined in Section (M)).

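(L) $E$ is left adjoint to $N$, as functors between $\mathbf{M}$ and $\mathbf{B}$ (this is written
$E \dashv N$).
Proof. It is known (see Theorem 4 of the Appendix) that it suffices to give
natural transformations $\eta: \mathbf{M} \Rightarrow NE$ and $\varepsilon: EN \Rightarrow \mathbf{B}$ such that the triangular
identity diagrams

$$
E \xrightarrow{\ E\eta\ } ENE \xrightarrow{\ \varepsilon E\ } E
\qquad\text{and}\qquad
N \xrightarrow{\ \eta N\ } NEN \xrightarrow{\ N\varepsilon\ } N
$$

commute, the composites being the identities on $E$ and on $N$ respectively. We take $\eta$ as defined
in (H), and let $\varepsilon$ be the identity; recall again that $EN = \mathbf{B}$ by (G).
To verify the first triangle on a reachable machine $M$, we must have
commutativity of the diagram

$$
EM \xrightarrow{\ (E\eta)_M\ } ENEM \xrightarrow{\ (\varepsilon E)_M\ } EM,
$$

with composite the identity $EM$, where $(E\eta)_M = E\eta_M = EM$ by (J), $ENEM = EM$ by (G), and $(\varepsilon E)_M = \varepsilon_{EM} =
EM$ since $\varepsilon_f$ is always the identity. Thus the condition is just $(EM)(EM) =
EM$, that is, that the composite of identities is the identity.
For a behavior $f$, the second triangular identity becomes commutativity of

$$
Nf \xrightarrow{\ (\eta N)_f\ } NENf \xrightarrow{\ (N\varepsilon)_f\ } Nf,
$$

with composite the identity $Nf$, where $(\eta N)_f = \eta_{Nf} = Nf$ by (K), $NENf = Nf$ by (G), and $(N\varepsilon)_f = N(\varepsilon_f) = Nf$ since
$\varepsilon_f$ is the identity on $f$. We are therefore reduced to the identity $(Nf)(Nf) = Nf$
on the identity $Nf$.
Thus (J) and (K) in effect are the triangular identities. There is also a direct
proof of (L) by verifying the universal property, as sketched in [6] for the case
treated there.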
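(M) A machine $M$ is reduced⁸ if no two distinct states are equivalent under the
relation $s \sim s'$ if and only if $f(w, s) = f(w, s')$ for all $w \in X^*$, where $f(w, s) =
\lambda\delta^+(w, s)$, with $\delta^+(w, s)$ defined by $\delta^+(\Lambda, s) = s$ and $\delta^+(xw, s) = \delta(x, \delta^+(w, s))$.
In this section we show that the full subcategory $\mathbf{R}$ of $\mathbf{M}$ whose objects are
the reduced machines is a reflexive subcategory (this definition is reviewed in
the Appendix) of $\mathbf{M}$: that is, we show that for each reachable machine $M$
there is a reduced reachable machine $RM$ and a morphism $\eta_M: M \to RM$ in $\mathbf{M}$
such that any $g: M \to M'$ with $M'$ reachable and reduced factors as $h\eta_M$
for some unique $h: RM \to M'$. Such a reflection $RM$ of $M$ is called a reduction
of $M$.

$$
\begin{array}{ccc}
M & \xrightarrow{\ \eta_M\ } & RM \\
 & {\scriptstyle g}\searrow & \downarrow{\scriptstyle \exists!\, h} \\
 & & M'
\end{array}
$$

We show first that the image subcategory $N\mathbf{B}$ of $N$ in $\mathbf{M}$ is reflexive. This
follows from the fact that for any $M$ in $\mathbf{M}$, $\eta_M: M \to NEM$ is a universal arrow
from $M$ to $N$ (this fact follows from the adjointness $E \dashv N$ by Theorem 2 of the
Appendix). In fact, the adjunction gives us that any $M \to Nf$ with $f$ in $\mathbf{B}$ factors
as $(Nh)\eta_M$ for a unique $h: EM \to f$ in $\mathbf{B}$; i.e., any $g: M \to M'$ with $M'$ in $N\mathbf{B}$
factors as $(Nh)\eta_M$ for a unique $Nh: NEM \to M'$ in $N\mathbf{B}$.

$$
\begin{array}{ccc}
M & \xrightarrow{\ \eta_M\ } & NEM \\
 & \searrow & \downarrow{\scriptstyle \exists!\, Nh} \\
 & & Nf
\end{array}
$$

We next show that $N\mathbf{B}$ is an equivalent subcategory of $\mathbf{R}$. Since both $\mathbf{R}$
and $N\mathbf{B}$ are full subcategories of $\mathbf{M}$, and since each Nerode machine is reduced,
$N\mathbf{B}$ is a full subcategory of $\mathbf{R}$; moreover, an inclusion functor is certainly faithful,
so it remains to check that each $M$ in $\mathbf{R}$ is isomorphic to some machine in $N\mathbf{B}$.
We show that if $M$ is reduced, then $\eta_M: M \to NEM$ is an isomorphism; for this
it suffices to show that $q_M: S \to S_f$ is an isomorphism, because if $a$ and $d$ are
isomorphisms, then the left diagram commutes if and only if the right diagram
does,

$$
\begin{array}{ccc}
A & \longrightarrow & B \\
{\scriptstyle a}\downarrow & & \downarrow{\scriptstyle d} \\
C & \longrightarrow & D
\end{array}
\qquad\qquad
\begin{array}{ccc}
A & \longrightarrow & B \\
{\scriptstyle a^{-1}}\uparrow & & \uparrow{\scriptstyle d^{-1}} \\
C & \longrightarrow & D
\end{array}
$$

proving that $\langle X, q_M^{-1}, Y \rangle$, the inverse to $\eta_M$, is a machine morphism.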

8 In control and linear system theories, the term observable is more likely to be used than
the automaton-theoretic synonym reduced. Moreover, observable is more likely to be defined
To show that $q_M$ is bijective, we show it is injective and surjective (this
special trick will also work in the categories of finite sets, sets, linear spaces, and
finite-dimensional linear spaces). First, we show surjectivity. Recall that
$q_M(s) = [w]_f$, where $w \in X^*$ is such that $\delta^+(w) = s$, and letting $[w]_f$ be an
arbitrary element of $S_f$, we see that $q_M(\delta^+(w)) = [w]_f$. Now let $s \neq s'$ and
suppose that $\delta^+(w) = s$ and $\delta^+(w') = s'$; to prove that $q_M$ is injective we show
that $[w]_f \neq [w']_f$. Because $M$ is reduced, $s \not\sim s'$, which means that $\lambda\delta^+(v, s) \neq
\lambda\delta^+(v, s')$ for some $v \in X^*$. But $\delta^+(v, s) = \delta^+(vw)$ and $\delta^+(v, s') = \delta^+(vw')$, so
that $f(vw) = \lambda\delta^+(v, s) \neq \lambda\delta^+(v, s') = f(vw')$, so that $w \not\equiv_f w'$.
Thus we have shown that $N\mathbf{B} \subseteq \mathbf{R} \subseteq \mathbf{M}$, and that $N\mathbf{B}$ is an equivalent
subcategory of $\mathbf{R}$. But $N\mathbf{B}$ is isomorphic to $\mathbf{B}$, so that $E \dashv N$ implies that $N\mathbf{B}$ is
a reflexive subcategory of $\mathbf{M}$. Therefore, by Lemma 1 in the Appendix, $\mathbf{R}$ is a
reflexive subcategory of $\mathbf{M}$, as claimed.
From this we obtain the usual result that a machine $M$ is a minimal realization
of its behavior if and only if it is reachable and reduced (or observable, in
the control-theoretic language).
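
For a finite machine, "reachable and reduced" can be computed directly: discard unreachable states, then identify equivalent states in the sense of (M). The sketch below uses the classical partition-refinement (Moore-style) procedure, which is a swapped-in standard technique rather than the Nerode construction of (E); the dictionary encoding, the function names, and the example machine are all hypothetical.

```python
def minimal_realization(inputs, delta, lam, sigma):
    """Reachable part of a finite machine, quotiented by state equivalence.

    inputs: tuple of letters; delta: dict (x, s) -> s; lam: dict s -> y.
    Returns (delta_q, lam_q, sigma_q, q), with q the state map s -> class id.
    """
    # Step 1: keep only states of the form delta+(w).
    reach = [sigma]
    seen = {sigma}
    for s in reach:                      # the list grows while we scan it
        for x in inputs:
            t = delta[(x, s)]
            if t not in seen:
                seen.add(t)
                reach.append(t)

    # Step 2: start from "same output" and refine until successors agree too.
    block = {s: lam[s] for s in reach}
    while True:
        sig = {s: (block[s], tuple(block[delta[(x, s)]] for x in inputs))
               for s in reach}
        names = {}
        finer = {s: names.setdefault(sig[s], len(names)) for s in reach}
        if len(names) == len(set(block.values())):
            break                        # no class was split: partition stable
        block = finer

    q = finer                            # state component of M -> quotient
    delta_q = {(x, q[s]): q[delta[(x, s)]] for s in reach for x in inputs}
    lam_q = {q[s]: lam[s] for s in reach}
    return delta_q, lam_q, q[sigma], q


def behavior(delta, lam, sigma, word):
    """lam(delta+(word)), reading the last letter of the word first."""
    s = sigma
    for x in reversed(word):
        s = delta[(x, s)]
    return lam[s]


# Hypothetical example: a mod-4 counter observed mod 2, plus an unreachable state.
X = (0, 1)
delta = {(x, s): (s + x) % 4 for x in X for s in range(4)}
delta.update({(x, "dead"): "dead" for x in X})
lam = {0: 0, 1: 1, 2: 0, 3: 1, "dead": 0}
delta_q, lam_q, sigma_q, q = minimal_realization(X, delta, lam, 0)
```

The returned map `q` plays the role of $q_M$ in (H): it sends each reachable state to its class, and the quotient has the same external behavior as the original machine, here with two states instead of five.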
(N) We are now prepared to take a somewhat more general point of view.
The functor N gives a particular reduced realization for each behavior, and its
adjointness to E expresses its minimality. But there are other constructions for
minimal realization (e.g., via the Myhill semigroup of the machine) and so we
assert that a minimal realization functor is exactly a right adjoint to the behavior
functor E. We first give a universal arrow style characterization of the minimal
realization of a single behavior.
In fact, Theorem 2 of the Appendix allows us to assert, from the adjunction
$E \dashv N$, that $M = Nf$ is characterized up to isomorphism by the condition:
there is a morphism $\varepsilon: EM \to f$ such that every $EM' \to f$ which is surjective
on inputs and has $M'$ reachable can be factored as $\varepsilon(Eg)$ for a unique $g: M' \to
M$, surjective on inputs.

$$
\begin{array}{ccccc}
M & & EM & \xrightarrow{\ \varepsilon\ } & f \\
{\scriptstyle \exists!\, g}\uparrow & & {\scriptstyle Eg}\uparrow & \nearrow & \\
M' & & EM' & &
\end{array}
$$

Notice that $\varepsilon$ is not required to be an isomorphism, but we know that if $\varepsilon$
satisfies this condition, it will be an isomorphism. Moreover, the condition
need be satisfied only for a rather small class of $M'$ and $EM' \to f$. In this sense,
the condition is rather weak, and the characterization result rather strong. In
case we want not isomorphism but actual equality on the input and output
alphabets, we require $\varepsilon$ to be an equality, and we get the following simple
criterion: $M$ is a realization of $f$ isomorphic to $Nf$, i.e., $M$ is a minimal realization
of $f$, if and only if every morphism $EM' \to f$ surjective on inputs with $M'$
reachable is of the form $Eg$ for a unique $g: M' \to M$.

by the condition that the function $F: S \to Y^{X^*}$ be injective, where $Y^{X^*}$ is the set of all functions
$f: X^* \to Y$ (i.e., behaviors) and $F(s) = \lambda\delta^+(\cdot, s): X^* \to Y$, with $\delta^+(w, s)$ as defined just a little
later in the text (for example, [2] and [7] proceed this way). The above definition of observable
is easily seen to be equivalent to our definition of reduced.
Now suppose $L: \mathbf{B} \to \mathbf{M}$ is a functor such that $Lf$ is reduced and $ELf = f$.
We will show that $Lf$ satisfies the condition of the previous paragraph; then by
Theorem 2 of the Appendix again, $E \dashv L$; this is our assertion that minimal
realization functors are right adjoints of $E$. First, $\eta_{Lf}: Lf \to NELf$ is an
isomorphism since $Lf$ is reduced, and its codomain is $Nf$ since $ELf = f$. But then the
universal property for $Nf$ given in the previous paragraph also holds for $Lf$:
every morphism $EM \to f$ surjective on inputs with $M$ reachable is of the form
$Eg$ for a unique $g: M \to Lf$ surjective on inputs. For if also $g': M \to Lf$ in $\mathbf{M}$
such that $g \neq g'$ but $Eg = Eg'$, then $\eta_{Lf}g \neq \eta_{Lf}g': M \to Nf$ in $\mathbf{M}$, but $E(\eta_{Lf}g)
= E(\eta_{Lf}g'): EM \to f$, contradicting the condition for $Nf$; thus we have uniqueness
of $g$. For existence, we compose the unique morphism arising from $Nf$ with
the inverse of $\eta_{Lf}$.
(O) As already mentioned, the extension from arbitrary discrete machines
to finite machines is immediate. If $\mathbf{M}_F$ denotes the full subcategory of $\mathbf{M}$ whose
objects are machines with finite input, state, and output sets, then we take
$\mathbf{B}_F = E\mathbf{M}_F$, which is the full subcategory of $\mathbf{B}$ whose objects are behaviors of
finite machines. We then have an adjunction between $E$ and $N$ as functors
between $\mathbf{B}_F$ and $\mathbf{M}_F$ just as before. Of course, it is of interest to characterize $\mathbf{B}_F$
independently of the notion of machine, but this is already well known (see [1]).
Certain special subcategories are of particular interest in the finite discrete
case. Let $2$ denote the set $\{0, 1\}$. Then a function $f: X^* \to 2$ determines a subset
$f^{-1}(1)$ of $X^*$, and a subset $L$ of $X^*$ determines a function $f: X^* \to 2$ by $f(w) = 1$
if and only if $w \in L$; thus such behaviors are in bijective correspondence with subsets $L$ of $X^*$,
which are also called languages over the alphabet $X$. Languages arising from
finite machines are called regular sets. Let $\mathbf{M}_F^2$ be the subcategory of $\mathbf{M}_F$ whose
objects all have output set $Y = 2$ and whose morphisms $\langle a, b, c \rangle$ are such that $c = 2$,
the identity map of $2$. This is the category of finite acceptors. Let $\mathbf{B}_F^2 = E\mathbf{M}_F^2$,
the category of regular sets. Then, very much as before, $E$ and $N$ are adjoint
functors between $\mathbf{M}_F^2$ and $\mathbf{B}_F^2$; that is, finite acceptors and regular sets are in an
adjoint relationship. The characterizations of regular sets independently of
machines are too well known to bear repetition here (see [1]).
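
The correspondence between behaviors $f: X^* \to 2$ and languages can be illustrated with a tiny acceptor. This is a hedged sketch only: the acceptor, the alphabet, and every name below are hypothetical, and languages are truncated to words of bounded length so that they can be listed.

```python
from itertools import product

ALPHABET = ("a", "b")


# Hypothetical finite acceptor: the state remembers the most recently consumed letter.
def delta(x, s):
    return x


def lam(s):
    return 1 if s == "a" else 0      # output set Y = 2 = {0, 1}


SIGMA = "start"


def behavior(w):
    """f = lam . delta+, with delta+(xw) = delta(x, delta+(w))."""
    s = SIGMA
    for x in reversed(w):            # the last letter of w is consumed first
        s = delta(x, s)
    return lam(s)


def language(f, max_len):
    """The part of f^{-1}(1) consisting of words of length <= max_len."""
    return {w for n in range(max_len + 1)
            for w in product(ALPHABET, repeat=n) if f(w) == 1}


def characteristic(L):
    """The function X* -> 2 determined by a subset L of X*."""
    return lambda w: 1 if w in L else 0
```

Because the recursion for $\delta^+$ consumes the front letter of a word last, this acceptor accepts exactly the words that begin with `"a"`. Passing `language(behavior, 2)` through `characteristic` and back through `language` returns the same set, which is the bijection described above, restricted to short words.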
(P) This section is concerned with the case of linear machines, more
specifically, the case where X, S, Y are left k-modules, where k is an arbitrary
ring. Although things are quite analogous to the discrete case, it seems best to
be generous with details, in part because Give'on and Zalcstein [3] have claimed
that the simple dynamic treatment in Arbib and Zeiger [2] of the linear case
(k a field) cannot be made rigorous. We show that a dynamic treatment of the
linear case can be obtained from the preceding discrete treatment with quite
modest changes, yielding the same concise and sharp results, without the
awkward machinery (k-linear bundle monoids) or restrictions (k is commutative
with unit) of [3]. Rings of matrices and of polynomials in non-commuting
variables are important and non-commutative.
Let $k$ be a ring. A $k$-linear machine $M$ is a quintuple $\langle X, S, Y, \delta, \lambda \rangle$, where
$X$, $S$, $Y$ are left $k$-modules ($k$-linear spaces if $k$ is a field), and $\delta: X \times S \to S$
and $\lambda: S \to Y$ are $k$-linear maps. We can omit specification of an initial state
because we will always take it to be the zero in $S$. One obtains a category
$\mathbf{Mach}_k$ of $k$-linear machines, with the machines as objects and morphisms
$\langle a, b, c \rangle$ exactly as in the discrete case (i.e., such that the same diagrams
commute), except that each $a$, $b$, $c$ is required to be $k$-linear. Hereafter we shall
frequently fail to mention $k$ explicitly. Let $\mathbf{Mach}_F$ be the full subcategory with
objects machines such that $X$, $S$, and $Y$ are finitely generated.
Instead⁹ of $X^*$, which does not have a natural $k$-linear structure, we use
$X^+ = \coprod_{t \ge 0} X$, the countable copower of $X$ (i.e., the countable coproduct of
$X$ with itself). It is well known that $X^+$ can be taken to be the space of all
sequences $\mathbf{x} = (x_t)$ such that $x_t \in X$ for $t \in \mathbb{N}$ and all but a finite number of $x_t$
are zero, where $\mathbb{N}$ is the set of non-negative integers.
Define an operation $\cdot$ of $X^*$ on $X^+$ by: (1) $w \cdot \mathbf{x} \in X^+$ for $w \in X^*$, $\mathbf{x} \in X^+$;
(2) $(w \cdot \mathbf{x})_t = w_t$ for $0 \le t < |w|$; and (3) $(w \cdot \mathbf{x})_{t+|w|} = x_t$ for $t \in \mathbb{N}$. Thus, $w \cdot \mathbf{x}$
is $w$ concatenated onto the front of $\mathbf{x}$. In particular $\Lambda \cdot \mathbf{x} = \mathbf{x}$. This operation
cannot be linear because $X^*$ is not linear, but it is very convenient in giving a
development of the linear case parallel to the discrete. Notice that each $\mathbf{x} \in X^+$
is of the form¹⁰ $w \cdot 0$ for some (indeed, many) $w \in X^*$, because $\mathbf{x}$ has only
finitely many initial non-zero components.
Define $\delta^+: X^+ \to S$ by $\delta^+(0) = 0$ and $\delta^+(x \cdot \mathbf{x}) = \delta(x, \delta^+(\mathbf{x}))$ for $x \in X$,
$\mathbf{x} \in X^+$; this gives an inductive definition since each $\mathbf{x} = w \cdot 0$, some $w \in X^*$.
Then $\delta^+$ is $k$-linear. For convenience of notation in this proof only, let us write
$w$ for $w \cdot 0$. We begin by proving homogeneity; that is, for $a \in k$, we show
$\delta^+(aw) = a\delta^+(w)$. This is true for $w = \Lambda$, since then both sides are zero.
Assume the equation for all $w \in X^*$ of some length $|w| = n \ge 0$, and let $|w| = n$,
$x \in X$, and $a \in k$. Then $\delta^+(a(xw)) = \delta^+(ax \cdot aw) = \delta(ax, \delta^+(aw)) = \delta(ax, a\delta^+(w))
= a\delta(x, \delta^+(w)) = a\delta^+(xw)$. We now prove additivity, that is, for $w, w' \in X^*$,
we show $\delta^+(w + w') = \delta^+(w) + \delta^+(w')$. This is true if $|w| = |w'| = 0$, i.e.,
$w = w' = \Lambda$, since both sides are zero. Assume the equation for all $w, w' \in X^*$
with $|w|, |w'| < n$, for some $n > 0$, and let $x, x' \in X$. Then $\delta^+(xw + x'w') =
\delta^+((x + x') \cdot (w + w')) = \delta(x + x', \delta^+(w + w')) = \delta(x + x', \delta^+(w) + \delta^+(w')) = \delta(x,
\delta^+(w)) + \delta(x', \delta^+(w')) = \delta^+(xw) + \delta^+(x'w')$. This works because each $\mathbf{x} \in X^+$ is of the form $w \cdot 0$,
and $w$ can be padded out on the right with as many zeros as one wishes. The
commutative square condition in (B) holds for $k$-linear machines (as usual, $+$
replaces $*$).
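
The linearity of $\delta^+$ and its insensitivity to padding can be checked numerically on one instance. The sketch below makes several assumptions that are not in the text: $k$ is taken to be the ring of integers, inputs are scalars, states are integer pairs, the linear transition is written as $\delta(x, s) = Fs + Gx$ for hypothetical matrices `F` and `G`, and finitely supported sequences are represented as tuples.

```python
def make_delta(F, G):
    """k-linear transition delta(x, s) = F s + G x over k = Z (scalar inputs)."""
    n = len(G)

    def delta(x, s):
        return tuple(sum(F[i][j] * s[j] for j in range(n)) + G[i] * x
                     for i in range(n))
    return delta


def delta_plus(delta, seq, zero):
    """delta+(0) = 0 and delta+(x . xs) = delta(x, delta+(xs))."""
    s = zero
    for x in reversed(seq):          # the front component is applied last
        s = delta(x, s)
    return s


def pad(seq, n):
    """Pad a finitely supported sequence with zeros on the right."""
    return tuple(seq) + (0,) * (n - len(seq))


# Hypothetical instance.
F = [[0, 1], [1, 1]]
G = [1, 0]
delta = make_delta(F, G)
ZERO = (0, 0)

x, y, a = (1, 2, 3), (4, 5), 3
n = max(len(x), len(y))
combo = tuple(a * xi + yi for xi, yi in zip(pad(x, n), pad(y, n)))
lhs = delta_plus(delta, combo, ZERO)                      # delta+(a x + y)
rhs = tuple(a * u + v for u, v in zip(delta_plus(delta, x, ZERO),
                                      delta_plus(delta, y, ZERO)))
```

Here `lhs` and `rhs` agree, which is the combined homogeneity and additivity statement for this instance, and `delta_plus` returns the same state for `x` and for `pad(x, 6)` because $\delta(0, 0) = 0$.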
We can now define a machine $M$ to be reachable if $\delta^+$ is surjective, and
define the category $\mathbf{M}_k$ as $\mathbf{M}$ before, with objects reachable machines and
morphisms $\langle a, b, c \rangle$ such that $a$ is surjective. The external behavior of a machine
$M$ is $EM = \lambda\delta^+: X^+ \to Y$, linear because both $\lambda$ and $\delta^+$ are. We have a category
$\mathbf{Beh}$ of behaviors $f: X^+ \to Y$, for $X$, $Y$ left $k$-modules, and morphisms
$\langle a, c \rangle: f \to f'$ with $a: X \to X'$, $c: Y \to Y'$ such that the same diagram as before
commutes. We also have the subcategory $\mathbf{B}$ of $\mathbf{Beh}$ all of whose morphisms have
surjective input component. Then $E: \mathbf{Mach} \to \mathbf{Beh}$ is a functor as before, which
gives $E: \mathbf{M} \to \mathbf{B}$ when restricted, the behavior functor. We seek a right adjoint.

9 Our $X^+$ is isomorphic to the $X^*/P$ used by Arbib and Zeiger [2], but we are thinking of
time as running from the past to now at $t = 0$. Unfortunately, Arbib and Zeiger sometimes
confusingly identify $P$-classes in $X^*$ with their representatives. Kalman's [7] time runs into
both the past and future, since he uses a space isomorphic to $(\coprod_{t=0}^{\infty} X) \times (\prod_{t=1}^{\infty} X)$ to replace $X^*$.
10 The equivalence relation generated by the surjection $X^* \to X^+$ given by $w \mapsto w \cdot 0$
is exactly the $P$ mentioned in footnote 9. Notice in particular that $w0^n \cdot 0 = w \cdot 0$, for all $n \in \mathbb{N}$.
In fact, $wPw'$ if and only if $w0^n = w'$ or $w = w'0^n$, some $n \in \mathbb{N}$.
Given $f: X^+ \to Y$ linear, define the equivalence relation $\equiv_f$ on $X^+$ by
$\mathbf{x} \equiv_f \mathbf{x}'$ if $f(w \cdot \mathbf{x}) = f(w \cdot \mathbf{x}')$ for all $w \in X^*$. We show this is a linear congruence
relation. To prove homogeneity, note that $\mathbf{x} \equiv_f \mathbf{x}'$ implies $a\mathbf{x} \equiv_f a\mathbf{x}'$, for
$a \in k$. For $f(w \cdot (a\mathbf{x})) = f(w \cdot 0 + 0^{|w|} \cdot a\mathbf{x}) = f(w \cdot 0) + af(0^{|w|} \cdot \mathbf{x})$, and similarly
$f(w \cdot (a\mathbf{x}')) = f(w \cdot 0) + af(0^{|w|} \cdot \mathbf{x}')$, but $\mathbf{x} \equiv_f \mathbf{x}'$ implies $af(0^{|w|} \cdot \mathbf{x}) = af(0^{|w|} \cdot \mathbf{x}')$,
and therefore $f(w \cdot (a\mathbf{x})) = f(w \cdot (a\mathbf{x}'))$. For additivity, note that $\mathbf{x} \equiv_f \mathbf{x}'$ and
$\mathbf{y} \equiv_f \mathbf{y}'$ implies $\mathbf{x} + \mathbf{y} \equiv_f \mathbf{x}' + \mathbf{y}'$. For $f(w \cdot (\mathbf{x} + \mathbf{y})) = f(w \cdot \mathbf{x} + 0^{|w|} \cdot \mathbf{y}) = f(w \cdot \mathbf{x})
+ f(0^{|w|} \cdot \mathbf{y}) = f(w \cdot \mathbf{x}') + f(0^{|w|} \cdot \mathbf{y}') = f(w \cdot (\mathbf{x}' + \mathbf{y}'))$, as desired.
Given $f: X^+ \to Y$, set $S_f = X^+/{\equiv_f}$, another left $k$-module, and define
$\delta_f(x, [\mathbf{x}]_f) = [x \cdot \mathbf{x}]_f$, $\lambda_f([\mathbf{x}]_f) = f(\mathbf{x})$, and verify that $Nf = \langle X, S_f, Y, \delta_f, \lambda_f \rangle$ is
a linear machine. We show things are well defined as in the discrete case, and
linearity follows mainly from the linearity of $\equiv_f$. This is the pattern for all the
rest: replace $*$ by $+$ and proceed as before, occasionally verifying linearity.
For example, it must be checked that $\eta_M: M \to NEM$ is linear, i.e., that $q_M:
S \to S_f$ is linear, where $f = EM$. This is straightforward. We define $\delta^+(w, s)$
for $w \in X^*$, $s \in S$, exactly as in (M), $\delta^+(\Lambda, s) = s$ and $\delta^+(xw, s) = \delta(x, \delta^+(w, s))$,
and we prove by induction that $\delta^+(w \cdot \mathbf{x}) = \delta^+(w, \delta^+(\mathbf{x}))$. In particular, $\delta^+(w \cdot 0)
= \delta^+(w, 0)$. We then define for $s, s' \in S$, $s \sim s'$ if $\lambda\delta^+(w, s) = \lambda\delta^+(w, s')$ for
all $w \in X^*$, and we call a machine reduced if no two distinct states are equivalent.
Everything goes as before, and we obtain not only adjointness, but reflectivity
of reduced machines, etc.
The finiteness modification is similar to the discrete case. Let $\mathbf{M}_F$ be the
intersection of the subcategories $\mathbf{M}$ and $\mathbf{Mach}_F$ of $\mathbf{Mach}$, and let $\mathbf{B}_F = E\mathbf{M}_F$,
the full subcategory of $\mathbf{B}$ whose objects are behaviors of finitely generated
machines. The restrictions $E$ and $N$ are again adjoint. Once again, machine-independent
characterization of $\mathbf{B}_F$ is desirable, but in this case it does not seem
to be well known. Indeed, the usual references (even [7] and [2]) do not sufficiently
emphasize the fact that not every linear function $f: X^+ \to Y$ can be
realized with a finite-dimensional state space. At this writing I am unsure
whether the characterization needed for the case we are considering actually
appears in print, but it turns out that Nerode, in the paper [12] from which the
famous equivalence relation was taken, considered a related problem. Some
interesting related results appear in Rouchaleau [13].

(Q) This paper suggests additional research. The definition of machine given
in Section (A), as modified in footnote 1, can easily be extended to give a notion
of a discrete-time machine in a category with products and terminal object. For
example, a strong argument can be made that affine maps are more natural than
linear, and that machines in affine categories ought to be studied. The question
of what additional assumptions must be made on the category for our con-
structions to be valid, and in particular, what to use for X*, is treated in [6].
Other work will consider continuous time, and other more general types of
machine, as well as other results on languages.

APPENDIX

This appendix gives results and definitions from category theory in the form
needed in the body of the paper. The reader who wants more detail or motiva-
tion should consult the standard references [9], [11], and especially the excellent
[8]. We prove two easy lemmas which do not seem to be in the literature.
Definition. A category C consists of: a class ICJ whose elements are called
objects; for each A, Be-IC] a class C(A, B) whose elements are called morphisms
from A to B; and for each A, B, C c IC], a mapping
o : C(B, C) x C(A, B) --~ C(A, C)
called composition, with o ( f g) denoted fg, such that:
(1) For each A s[C[ there is a morphism A s C(A, A) called the identity at
A such that Ag = g and fA = f, whenever these compositions are defined;
(2) For each A, B, C, D s]C!, h ~ C(A, B), g s C(B, C), f e C(A, B), the
associative law f(gh) = (fg)h holds.
The following conventions are used: f s C(A, B) is indicated by f : A -~-B
or A I-~B; write C = UA,BEIcIC(A, B); identify A s C I with A c C(A, A), so
that ICI _ C; call A the domain and B the codomain o f f : A ~ B. A morphism
f : A --+ B in C is called an isomorphism if there is a morphism g: B --+ A in C
such that g f = A and f g = B; write g = f - 1, as it is uniquely determined if it
exists.
The standard examples are: Sets, whose objects are sets and morphisms are
set mappings; Fin, whose objects are finite sets, and morphisms as in Sets; Link,
with objects k-linear spaces (k-modules), and morphisms k-linear maps, for k
a field (or ring) Flink, whose objects are finitely generated k-linear, spaces,
and morphisms as in Lint. Other examples occur in the body of the paper.
A category B is a subcategory of C if B ⊆ C, |B| ⊆ |C|, and composition in
B agrees with composition in C. A subcategory B of C is full if for each B,
B' ∈ |B|, B(B, B') = C(B, B'). Thus to specify a full subcategory B of C, one
needs only to give the objects |B|.
Each category C has an opposite category C^op, defined by |C^op| = |C|, C^op(A,
B) = C(B, A), and for f ∈ C^op(B, C), g ∈ C^op(A, B), fg ∈ C^op(A, C) defined to be
gf ∈ C(C, A). Given f ∈ C(C, B), it is often convenient to write f^op: B → C for the
corresponding morphism in C^op(B, C).
T ∈ |C| is terminal if for each C ∈ |C|, C(C, T) has cardinality one; and I ∈ |C|
is initial if for each C ∈ |C|, C(I, C) has cardinality one.
Definition. A function F: B → C between categories is called a functor if
F(fg) = (Ff)(Fg) whenever fg is defined, and F|B| ⊆ |C|.
Notice we write Ff rather than F(f), but F(fg) for clarity. In general, we try
to avoid the proliferation of parentheses. An inclusion B ⊆ C of a subcategory
is a functor. A functor F: B → C is said to be full if FB(B, B') = C(FB, FB')
for all B, B' ∈ |B|, and is said to be faithful if F restricted to B(B, B') is injective
for each pair B, B' ∈ |B|. An inclusion functor is always faithful, and is full if and
only if the subcategory is full.
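As an informal illustration of the functor condition F(fg) = (Ff)(Fg), the following Python sketch models set mappings as Python functions and takes for F the assignment X ↦ X* (finite strings over X), acting on a mapping letter by letter. The names compose, star and identity are hypothetical, chosen only for this example; composition follows the convention above that fg means "g first, then f".

```python
def compose(f, g):
    """Return fg in the paper's convention: apply g first, then f."""
    return lambda x: f(g(x))


def star(f):
    """Hypothetical action on morphisms of X |-> X*: apply f to each letter."""
    return lambda word: tuple(f(x) for x in word)


def identity(x):
    """Identity mapping on any set."""
    return x


# Two composable set mappings (illustrative choices, not from the paper).
g = lambda n: n + 1      # g: int -> int
f = lambda n: str(n)     # f: int -> str

word = (0, 1, 2)

# Functor condition on this sample: star(fg) agrees with star(f) star(g).
lhs = star(compose(f, g))(word)
rhs = compose(star(f), star(g))(word)

# Identities are preserved as well: star(identity) fixes the word.
same = star(identity)(word)
```

A check on one sample word is only evidence, not a proof; the general statement follows because mapping letter by letter commutes with composition.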
Functors F: B → C and G: A → B can be composed to give a functor
FG: A → C. There is an identity functor A: A → A, defined by Af = f for all
f ∈ A. This gives rise to a "meta-category" Cat with categories as objects and
functors as morphisms.

Definition. A natural transformation η from F to G, where F, G: B → C are
functors, is a collection {η_B | B ∈ |B|} of morphisms η_B: FB → GB in C such that
for each f: B → B' in B, the diagram

          η_B
    FB --------> GB
    |            |
 Ff |            | Gf
    v            v
    FB' -------> GB'
          η_B'

commutes. Write η: F ⇒ G or η: F ⇒ G: B → C. If each η_B is an isomorphism,
η is called a natural equivalence, or natural isomorphism, of functors F, G.
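The naturality square can be checked on small samples. In the sketch below, which is an illustration under assumed names rather than anything from the paper, F is the identity functor on sets, G is X ↦ X*, and the candidate transformation has components unit: X → X* sending a letter to the one-letter string. A second example, reverse, is a transformation from X ↦ X* to itself.

```python
def star(f):
    """Hypothetical action on morphisms of X |-> X*: apply f to each letter."""
    return lambda word: tuple(f(x) for x in word)


def unit(x):
    """Hypothetical component eta_X: X -> X*, a letter as a one-letter string."""
    return (x,)


def reverse(word):
    """Hypothetical component rho_X: X* -> X*, reversing a string."""
    return tuple(reversed(word))


f = lambda n: n * n   # an arbitrary set mapping f: int -> int

letters = [0, 1, 2, 3]
words = [(), (1,), (1, 2, 3), (3, 3, 0)]

# Square for unit: (star f) after eta equals eta after f.
unit_is_natural = all(star(f)(unit(x)) == unit(f(x)) for x in letters)

# Square for reverse: (star f) after rho equals rho after (star f).
reverse_is_natural = all(
    star(f)(reverse(w)) == reverse(star(f)(w)) for w in words
)
```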
If F, G, H: B → C, η: G ⇒ H, and φ: F ⇒ G, define the composite
transformation ηφ: F ⇒ H to be the family {η_B φ_B | B ∈ |B|}. It can be directly verified
that ηφ is a natural transformation, that this notion of composition is associative,
and that the identity transformations F: F ⇒ F, defined by F_B = FB: FB → FB,
function as identities. Therefore there is a category whose objects are functors
B → C and whose morphisms are natural transformations. It is denoted by [B, C] or
C^B, and is called a functor category.
If η: G ⇒ H: B → C and F: A → B, then ηF: GF ⇒ HF: A → C is given by
(ηF)_A = η_{FA}. Similarly, if η: G ⇒ H: B → C and F: C → D, then Fη: FG ⇒ FH:
B → D is given by (Fη)_B = Fη_B.

Definition. A functor F: A → B is left adjoint to a functor G: B → A if there
exists a natural isomorphism

φ_{AB}: B(FA, B) → A(A, GB).

One also says G is a right adjoint to F, and writes F ⊣ G, or F ⊣ G: A → B,
and one calls φ the adjunction.
In this definition B(·, ·) and A(·, ·) are discrete class-valued functors with
domains B^op × B and A^op × A, respectively, so that B(F·, ·) and A(·, G·) are
both discrete class-valued functors with domain A^op × B, and φ is a
transformation between them.
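A familiar instance of this definition, offered here only as a hedged sketch, is the free monoid construction: X ↦ X* is left adjoint to the functor sending a monoid to its underlying set. The isomorphism φ restricts a monoid homomorphism h: X* → M to the letters, and its inverse extends a set mapping f: X → M multiplicatively. The function names and the choice of M as the integers under addition are assumptions made for the example.

```python
def phi(h):
    """Send a homomorphism h: X* -> M to the set mapping x |-> h((x,))."""
    return lambda x: h((x,))


def phi_inverse(f, unit_element, multiply):
    """Extend a set mapping f: X -> M to a homomorphism X* -> M."""
    def h(word):
        result = unit_element
        for x in word:
            result = multiply(result, f(x))
        return result
    return h


# Target monoid M = (integers, +, 0); alphabet X = {'a', 'b'} (illustrative).
weights = {'a': 2, 'b': 5}
f = lambda x: weights[x]

h = phi_inverse(f, 0, lambda m, n: m + n)

# h respects the monoid structure on these samples:
# concatenation of strings goes to addition, the empty string to 0.
u, v = ('a', 'b'), ('b', 'b', 'a')
is_hom = h(u + v) == h(u) + h(v) and h(()) == 0

# Round trip: phi(phi_inverse(f)) agrees with f on every letter.
round_trip = all(phi(h)(x) == f(x) for x in weights)
```

The other round trip, that a homomorphism is recovered from its values on letters, holds because every string is a product of one-letter strings.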

THEOREM 1. Any two left adjoints to G: B → A are naturally isomorphic.
Any two right adjoints to F: A → B are naturally isomorphic.
Given F: A → B and B ∈ |B|, let F/B be the category with objects pairs
(A, b: FA → B) where A ∈ |A| and b ∈ B, and with morphisms (A, b: FA → B)
→ (A', b': FA' → B) taken as morphisms a ∈ A(A, A') such that the diagram

         Fa
    FA -------> FA'
      \        /
     b \      / b'
        v    v
          B

commutes in B. A terminal object in F/B is called a universal arrow from F to B.
Dually, given G: B → A and A ∈ |A|, A/G has objects (a: A → GB, B) and
morphisms b: B → B' such that the diagram

          A
       a / \ a'
        v   v
    GB -------> GB'
         Gb

commutes; and an initial object in A/G is called a (co-)universal arrow from A
to G. Categories such as F/B and A/G are called comma categories.

THEOREM 2. F: A → B has a right adjoint if and only if each B ∈ |B| has a
universal arrow from F; and then the value GB of the adjoint G at B is the object
A occurring in the universal arrow (A, FA → B) from F to B. Dually, G: B → A
has a left adjoint if and only if each A ∈ |A| has a universal arrow to G; and the
value FA of the adjoint F at A is the object B in the co-universal arrow (A → GB,
B).
Actually, Theorem 1 follows from Theorem 2 because of the uniqueness up
to isomorphism of terminal and initial objects, and Theorem 2 can be proved
fairly directly.

THEOREM 3. Given F ⊣ G: A → B, the universal arrows η_A: A → GFA
constitute a natural transformation η: A ⇒ GF called the unit of the adjunction,
and the universal arrows ε_B: FGB → B constitute a natural transformation
ε: FG ⇒ B called the co-unit of the adjunction. Moreover, F, G, η, ε satisfy the
triangular identities given by commutativity of the diagrams

         FGF                       GFG
    Fη ↗     ↘ εF             ηG ↗     ↘ Gε
     F -------> F              G -------> G
          F                         G
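Written as equations rather than diagrams, the triangular identities take the following standard form. The paper's convention that an object also denotes its identity is used, so F and G on the right-hand sides stand for identity transformations; the componentwise version is spelled out for A ∈ |A| and B ∈ |B|.

```latex
\[
  (\varepsilon F)(F\eta) = F, \qquad (G\varepsilon)(\eta G) = G,
\]
that is, componentwise,
\[
  \varepsilon_{FA}\,(F\eta_A) = FA, \qquad (G\varepsilon_B)\,\eta_{GB} = GB .
\]
```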

THEOREM 4. If F: A → B, G: B → A, η: A ⇒ GF, and ε: FG ⇒ B satisfy
the triangular identities then F ⊣ G, and η, ε are the unit and co-unit of the
adjunction.
Two categories A and B are called equivalent if there is a functor F: A → B
which is full, faithful, and such that each B ∈ |B| is isomorphic to FA for some
A ∈ |A|. A is an equivalent subcategory of B if the inclusion functor satisfies these
conditions. Equivalently, A and B are equivalent if and only if there are functors
F: A → B and G: B → A such that FG and GF are both naturally isomorphic
to the identity. Yet another equivalent condition is that there is an adjunction
between A and B whose unit and co-unit are both isomorphisms.
A subcategory A of C is called reflective in C if the inclusion functor has a
left adjoint, called a reflector. Equivalently, A ⊆ C is reflective if and only if
for each C ∈ |C| there is an object RC in A and a morphism η_C: C → RC in C
such that every morphism g: C → A in C, with A an object of A, has the form
g = fη_C for a unique f: RC → A in A. This equivalence follows from Theorem 2.

         η_C
    C --------> RC
      \         |
     g \        | f
        v       v
            A
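A small example of a reflective subcategory, given as an illustration and not taken from the paper, comes from regarding an ordered set as a category with exactly one morphism x → y when x ≤ y. The integers then form a full subcategory of the reals, and the ceiling function is a reflector: the unit is x ≤ ⌈x⌉, and every x ≤ n with n an integer factors through ⌈x⌉ ≤ n. Uniqueness of the factorization is automatic, since there is at most one morphism between any two objects. The sample values below are arbitrary.

```python
import math


def reflector(x):
    """Hypothetical reflector R: reals -> integers, the ceiling function."""
    return math.ceil(x)


reals = [-1.5, 0.0, 0.25, 2.0, 3.7]
integers = range(-3, 6)

# The unit eta_x: x -> Rx exists for every sample x, i.e. x <= ceil(x).
unit_exists = all(x <= reflector(x) for x in reals)

# Universal property: a morphism x -> n exists exactly when Rx -> n does.
universal = all(
    (reflector(x) <= n) == (x <= n) for x in reals for n in integers
)
```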

The following trivial lemma is used in Section (M).

LEMMA 1. Suppose A is an equivalent subcategory of a subcategory B of
C, and that A is a reflective subcategory of C. Then B is a reflective subcategory
of C.
Proof. Let C ∈ |C|. Then we have η_C: C → RC, with RC ∈ |A|, universal
from C to the inclusion. Since A ⊆ B, of course RC ∈ |B|, and we have to show
η_C is still universal from C to the inclusion of B in C. So let g: C → B, B ∈ |B|, and
let A be an object of A isomorphic to B, j: B → A, using equivalence. Then there
is a unique f: RC → A such that fη_C = jg. We conclude that there is a unique
f': RC → B in B such that f'η_C = g, namely f' = j⁻¹f, using fullness. For if
also f'': RC → B satisfies f''η_C = g, then (jf'')η_C = jg, so that jf'' = jf', or
f'' = f', since j is an isomorphism.

         η_C
    C --------> RC
      \         |
     g \        | f'
        v       v
            B

A much simpler but more sophisticated proof uses the composition of adjoint
situations (see [8]).
The following fact is useful in considering various restrictions of the main
adjunctions in the body of this paper.

LEMMA 2. If F: A → B, G: B → A, F ⊣ G, and if A_0 and B_0 are full
subcategories of A and B (respectively) such that the restriction of F to A_0 factors
through B_0 and the restriction of G to B_0 factors through A_0, yielding F_0: A_0 → B_0
and G_0: B_0 → A_0 (respectively), then F_0 ⊣ G_0.
Proof. The restriction of the natural isomorphism φ on A^op × B to A_0^op × B_0
yields the desired natural isomorphism φ_0, noting that B(Fa, b) = B_0(F_0 a, b) and
A(a, Gb) = A_0(a, G_0 b), for a ∈ A_0^op and b ∈ B_0.

REFERENCES

[1] M. A. ARBIB, Theories of Abstract Automata, Prentice-Hall, Englewood Cliffs, New Jersey,
1969.
[2] M. A. ARBIB and H. P. ZEIGER, On the relevance of abstract algebra to control theory,
Automatica 5 (1969), 589-606.
[3] Y. GIVE'ON and Y. ZALCSTEIN, Algebraic structures in linear systems theory, J. Computer
and System Sciences 4 (1970), 539-556.
[4] J. A. GOGUEN, "Mathematical Representation of Hierarchically Organized Systems", in
Global Systems Dynamics (E. Attinger, editor), S. Karger, Basel, 1970.
[5] J. A. GOGUEN, "Systems and Minimal Realization", in Proc. 1971 IEEE Conference on
Decision and Control, Miami, Florida, 1972.
[6] J. A. GOGUEN, Minimal realization of machines in closed categories, to appear in Bull.
Amer. Math. Soc.
[7] R. E. KALMAN, "Introduction to the Algebraic Theory of Linear Dynamical Systems",
Lecture Notes in Operations Research and Mathematical Economics, Vol. 11, Springer
Verlag, Berlin, 1969.
[8] S. MACLANE, Categories for the Working Mathematician, Springer Verlag, Berlin, 1972.
[9] S. MACLANE and G. BIRKHOFF, Algebra, MacMillan, New York, 1967.
[10] M. B. MESAROVIĆ, "Foundations for a General Systems Theory", in Views on General
Systems, Proceedings of the Second Systems Symposium at Case Institute of
Technology, J. Wiley, New York, 1964.
[11] B. MITCHELL, Theory of Categories, Academic Press, New York, 1965.
[12] A. NERODE, Linear automaton transformations, Proc. Amer. Math. Soc. 9 (1958), 541-544.
[13] Y. ROUCHALEAU, Finite-dimensional, constant, discrete-time, linear dynamical systems
over some classes of commutative rings, Ph.D. Dissertation, Operations Research
Dept., Stanford University, 1972.

(Received 25 January 1971, and in revised form 12 June 1972)
