Bertrand Russell and Numbers:
Introduction to Mathematical
Philosophy : Chapters 1-7
Bradley M. Cardona
May 9, 2023
Submitted to the Department of Mathematics in partial fulfillment of the
requirements for the degree of Bachelor of Science.
First Reader: Dr. Anthony Lo Bello
Second Reader: Dr. Brent Carswell
I hereby recognize and pledge to fulfill my responsibilities, as defined in the
Honor Code, and to maintain the integrity of both myself and the College
community as a whole.
Pledge:
—————————————————————–
Bradley M. Cardona
Acknowledgements
My undergraduate years would not have been possible without the guid-
ance, mentorship, and support from many individuals.
I am indebted, first and foremost, to my advisor, Professor Anthony Lo
Bello, for his unwavering support, encouragement, and patience—not to men-
tion playful humor—throughout this senior project process. His insightful
feedback and guidance were invaluable in directing and shaping my research,
which only helped to elevate my already-lofty respect for Bertrand Russell.
I owe a special word of thanks to Professor Bradley Hersh, with whom I
spent the Summer of 2021 studying the genetics of fruit flies, and to Professor
Caryn Werner, with whom I spent the Summer of 2022 exploring systems of
algebraic curves in the projective plane.
I wish to extend many thanks to the several other mathematics professors
under whom I studied during my time at Allegheny; namely, Professor Brent
Carswell, Professor Harald Ellers, Professor Tamara Lakins, and Professor
Rachel Weir, each of whose time, expertise, and valuable insights I will al-
ways be greatly appreciative of.
I also owe a great many thanks to my friends for their support, encour-
agement, and valuable feedback throughout my studies. For her constant
support and ineffable friendship, I am forever indebted to Isabella James.
Finally, I would like to express my heartfelt gratitude to my family—my
mother, Adrienne, and my dear siblings, Donna and Xavier—and loved ones,
without whom none of this would be possible.
Abstract
We investigate Introduction to Mathematical Philosophy by Bertrand
Russell, first published in 1919. This book is an accessible introduc-
tion to what Russell and Alfred North Whitehead wrote in Principia
Mathematica, the famous three-volume work on the foundations of
mathematics.
Contents
1 Bertrand Russell: Mathematician, Philosopher, Humanitarian, and Writer
2 Part I: Russell’s Introduction to Mathematical Philosophy
2.1 Chapter 1: The Series of Natural Numbers
2.2 Chapter 2: Definition of Number
2.3 Chapter 3: Finitude and Mathematical Induction
2.4 Chapter 4: The Definition of Order
2.5 Chapter 5: Kinds of Relations
2.6 Chapter 6: Similarity of Relations
2.7 Chapter 7: Rational, Real, and Complex Numbers
3 Part II: The Modern Approach
3.1 Outline of the Set Theory Needed for the Study of Numbers
4 Conclusion
References
1 Bertrand Russell: Mathematician, Philoso-
pher, Humanitarian, and Writer
Born into a family of the British aristocracy in Monmouthshire, United King-
dom, Bertrand Russell (1872–1970) is rightly hailed as a polymath of the
twentieth century. A mathematician, philosopher, humanitarian, and writer,
his influence spanned (and today still spans) a breadth of subjects.
As a mathematician, Russell is well-known for having discovered a paradox
in set theory, aptly named Russell’s Paradox (see Chapter 3 below).
This paradox was immediately recognized by mathematicians as a significant
obstacle to naive set theory (the early version of set theory in which
Russell found his paradox), and was one of several paradoxes that impelled
them to create axiomatic set theory.
Russell is most celebrated in mathematics for having co-written, along-
side fellow British mathematician Alfred North Whitehead, a three-volume
mathematical text entitled Principia Mathematica. Written as a defense of
logicism (the idea that mathematics is reducible to logic), this weighty tome
was an impetus for research in the foundations of mathematics, leading eventually
to the development of modern mathematical logic. Because it is
densely worded and heavy with symbolism, this work is sometimes mocked as
having probably been read in its entirety by no one. It takes 762 pages, for
instance, before Russell and Whitehead prove definitively that $1 + 1 = 2$.
Being someone who instinctively championed his own beliefs, Bertrand
Russell was also a humanitarian, frequently criticizing political causes that
he thought wrong. A crystal-clear instance in which he displayed his con-
trarian disposition occurred during World War I. During a lecture in 1918,
Russell publicly denounced the idea of the United States entering the war
on the United Kingdom’s side, thereby earning himself a six months’ stay at
Brixton Prison. (It was at Brixton, in fact, where Russell wrote Introduction
to Mathematical Philosophy, the primary book on which this senior project
is based.) Though not a complete pacifist, Russell, after the Second World
War, fervently advocated for the total abolition of nuclear weapons.
In the world of philosophy, Russell is no doubt best known for having
written A History of Western Philosophy, in which he spans the timeline
from the philosophy of the early Greeks to that of John Dewey (1859–1952).
Russell won the Nobel Prize in Literature in 1950. Though this
book has since been heavily critiqued by scholars of philosophy, its
effect on me has remained profound. One chapter that
struck me deeply two years ago concerned the philosophy of Socrates, in
which Russell explains the Socratic definition of philosophers as those who
are “lovers of the vision of truth.” Believing then that mathematics is the
means by which one can get closest to truth, I decided to switch my major
to mathematics.
In the following pages I present, in Part I, a detailed summary, chapter by
chapter, of Russell’s ideas about number in his Introduction to Mathematical
Philosophy. Then, in Part II, I give an outline of the modern treatment of
the topic following Robert Stoll and Paul Halmos. For the sake of brevity I
omit the proofs, which are standard and can be easily found in the sources
indicated. The purpose here is to show the influence of Russell in the way the
modern treatment of the subject is organized. In both parts I add whatever
comments and critical observations I find necessary or useful. Finally, in
the conclusion, I summarize what I have discovered and indicate the most
obvious instances of Russell’s influence.
2 Part I: Russell’s Introduction to Mathe-
matical Philosophy
2.1 Chapter 1: The Series of Natural Numbers
In Chapter 1 of Introduction to Mathematical Philosophy, Russell explains
the aim of the book, in addition to why Giuseppe Peano’s treatment of the
natural numbers is not as complete as it first seems.
The familiar way of studying mathematics is in the “constructive” man-
ner: from natural numbers and integers to rationals and real numbers; from
addition and multiplication to differentiation and integration, and so forth
to higher mathematics. In this book, Russell begins from the opposite di-
rection. Rather than first building on top of the natural numbers, Russell
attempts to reduce mathematics to its logical components. Although it is
tempting to define or deduce from an initial assumption, Russell urges us to
“ask instead what more general principle can be found, in terms of which
what was our starting point can be defined or deduced” [3, 1]. Two approaches
are therefore required to expand the scope of our logical abilities, one
“to take us backward to the logical foundations of the things that we are
inclined to take for granted in mathematics,” the other “to take us forward
to the higher mathematics” [3, 2].
Russell begins the “backward” approach by discussing the notable work of
the Italian mathematician Giuseppe Peano, who published his famous Peano
axioms in the treatise Arithmetices principia, nova methodo exposita, Turin:
Bocca Brothers, 1889. In this publication, Peano attempted to show that
the entire theory of natural numbers could be derived from three “primitive
ideas” (undefined terms) and five “primitive propositions” (postulates).
Peano’s three primitive ideas are: 0, number, and successor. (Peano actually
started with 1, but we start with 0 because Russell does.) By “successor,”
he means the number that follows another number. By “number,”
he means the “class” (or, more commonly, the “set”) of natural numbers.
Peano’s five primitive propositions can be stated as follows:
1. 0 is a number.
2. The successor of any number is a number.
3. No two numbers have the same successor.
4. 0 is not the successor of any number.
5. Any property which belongs to 0, and also to the successor of every
number which has the property, belongs to all numbers.
Russell explains why Peano’s treatment of the natural numbers, complete as
it first seems, is not the last word; his conclusion is quoted at the end of
this section.
To begin, he observes that Peano’s three primitive ideas are capable of an
infinite number of different concrete interpretations. Consider, for example,
the three following cases.
First, since “0” is not strictly defined, “0” can be taken as any other
natural number, allowing us to arbitrarily pick the first natural number. For
instance, let “0” (the number we commonly think of as the first natural num-
ber) be taken to mean 42, and let “number” be taken to mean all numbers
from 42 onward in the series of natural numbers. Here, we find that all five
of Peano’s propositions are satisfied.
Second, since “number” is not strictly defined, let “number” mean what
we normally call “even numbers”, and let “successor” be what results from
adding two to it. Here, all five of Peano’s propositions are satisfied still.
Third, since “successor” is not strictly defined, let “0” mean the number
1, let “number” mean the set $\{1, \frac{1}{3}, \frac{1}{9}, \frac{1}{27}, \dots\}$, and let “successor” mean a
“third.” Again, all five of Peano’s propositions are satisfied.
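The three reinterpretations above can be checked mechanically on a finite stretch of each series. The following Python sketch is only an illustration under stated assumptions: the helper name check_peano_prefix, the choice of 50 steps, and the contrasting “clock” example (counting modulo 7) are hypothetical and are not taken from Russell. Because only a finite prefix is inspected, the check is a sanity test rather than a proof, and proposition 5 is not tested at all, since “number” is taken here to mean exactly the terms reachable from “0.”

```python
from fractions import Fraction

def check_peano_prefix(zero, successor, steps=50):
    """Test propositions 3 and 4 on the terms reached from `zero` in `steps` steps."""
    terms = {zero}
    current = zero
    for _ in range(steps):
        current = successor(current)
        terms.add(current)
    # Map every collected term to its successor.
    successors = {t: successor(t) for t in terms}
    return {
        # Proposition 3: distinct terms have distinct successors.
        "no two numbers share a successor": len(set(successors.values())) == len(terms),
        # Proposition 4: the chosen "0" is never produced as a successor.
        "0 is not a successor": zero not in successors.values(),
    }

# The three interpretations discussed in the text.
interpretations = {
    "from 42 onward": (42, lambda n: n + 1),
    "even numbers": (0, lambda n: n + 2),
    "powers of one third": (Fraction(1), lambda q: q / 3),
}
results = {name: check_peano_prefix(z, s) for name, (z, s) in interpretations.items()}

# Hypothetical contrast: counting on a 7-hour clock is not a progression,
# because 0 comes around again as the successor of 6.
clock = check_peano_prefix(0, lambda n: (n + 1) % 7)
```

Each of the three interpretations passes both checks, whereas the clock example fails proposition 4, which is one way of seeing why a progression must have a beginning that is never revisited.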
It is clear from these examples that “0” and “number” and “successor” are
ideas that each have many different concrete interpretations, casting much
doubt on whether Peano’s five propositions should be taken as definite arithmetic
truth. Russell refines this point by providing a generalization, showing that,
given any series that is “endless, contains no repetitions, has a beginning,
and has no terms that cannot be reached from the beginning in a finite number
of steps” (a series he calls a progression), we will have a set of terms
satisfying Peano’s axioms [3, 8]. Peano’s five propositions, therefore, cannot
be definitive arithmetically, since “each different progression will give rise to
a different interpretation of all the propositions of traditional pure mathematics;
[and] all these possible interpretations will be equally true” [3, 9].
Although Peano’s system assumes that we know what is meant by “0”
and “number” and “successor”, Russell has shown that this is not so. It
is true that this discovery might not impair pure mathematics, but it most
certainly will impair arithmetic in daily life. Believing then that mathe-
matics should lead us to pragmatic conclusions in addition to theoretical
ones, Russell rightly notes, “We want ‘0’ and ‘number’ and ‘successor’ to
have meanings which will give us the right allowance of fingers and eyes and
noses” [3, 9]. We thus do not yet have an adequate basis for arithmetic: we
do not know if there are any definite sets of terms verifying Peano’s axioms;
moreover, we do not have numbers that can be used for counting common
objects, which requires that they have a definite meaning.
2.2 Chapter 2: Definition of Number
The second chapter of Russell’s Introduction to Mathematical Philosophy is
dedicated to the definition of number.
In 1884, the German logician Friedrich Ludwig Gottlob Frege published
The Foundations of Arithmetic (German: Die Grundlagen der Arithmetik),
which investigates the philosophical foundations of arithmetic. Although this
publication was largely ignored by his contemporaries, Russell believed that
the correct definition of number was contained therein.
It is not uncommon, when attempting to define “number,” to mistak-
enly define “plurality,” which is altogether something different. A key detail
about plurality is that it is not an instance of number, but of some particular
number. For example, a pair of women is an instance of the number 2, and
the number 2 is an instance of number; the pair, however, is not an instance
of number. That is, the number 2 is not the pair comprised of Bella and
Lainee; rather, it is something that all pairs have in common, and which
distinguishes them from other sets.
A set may be defined in two ways: (1) by enumeration or (2) by a
defining property. A set would be defined by enumeration—that is, “by
extension”—if we were to say “This set consists of Bella and Lainee.” And a
set would be defined by a defining property—i.e., by intension—if we were
to say “students of Allegheny College” or “blonde-haired women.” Of these
two types of definitions, the one by intension, as emphasized by Russell, is
logically more fundamental. This is for two reasons; namely, that (I) “the extensional
definition can always be reduced to an intensional one”; and that
(II) “the intensional one often cannot even theoretically be reduced to the
extensional one” [3, 24]. It is clear that (I) must be true, since the enumeration
of the set consisting of Bella and Lainee can be reduced to the defining
property “x is Bella or x is Lainee,” where x is contained in the set. (In other
words, this defining property is true for two x’s, namely, Bella and Lainee.)
Moreover, (II) must also be true, since a set may be impossible to enumerate.
Russell gives three reasons why it is important that a definition by in-
tension is logically more fundamental than one by extension. First, numbers
themselves form an infinite set, and hence cannot be defined by enumeration.
Second, the sets having a given number of elements themselves presumably
form an infinite set. Third, we want to define ‘number’ in such a way that
we can speak of the number of elements in an infinite set; and it necessarily
follows that such a number must be defined by intension [3, 13].
A set is often interchangeable with a defining property of it. One difference
between the two, however, is that there is only one set having a given
set of elements, whereas there are always many different defining properties
by which a given set may be defined. Knowing that defining properties are
never unique is useful, since any defining property can be used in place of
the set whenever uniqueness is not important.
A family of sets is a set whose elements are themselves sets. When
deciding whether two sets should belong to the same family of sets, our first
reaction might be to put them in the same family of sets if they have the same
number of elements. But this way of thinking is incorrect. Although we are
all used to the operation of counting, counting in itself is, logically speaking,
a complex operation. Furthermore, counting the number of elements in a set
is only possible when the set itself is finite. When we define number, then,
we cannot assume that all numbers are finite—and even if we did, we still
could not use counting to define numbers, since numbers themselves are used
in counting.
Hence, we must invoke the concept of one-to-one relations. “A relation is
said to be one-to-one,” says Russell, “when, if $x$ has the relation in question
to $y$, no other element $x'$ has the same relation to $y$, and $x$ does not
have the same relation to any term other than $y$” [3, 15]. Using this relation
allows us to discover whether two sets have the same number of elements,
even when we do not know what that number is. Consider, for instance, a
world in which there is no polygamy or polyandry; in such a world there must
necessarily be a one-to-one relation of husband and wife. This implies that
the number of husbands must be equal to the number of wives, even though
the exact number of husbands and wives is unknown.
The set of elements that have a given relation to something (i.e., the set
of input values for which a function is defined) is called the domain of the
relation; hence, husbands are the domain of the “husband to wife” relation.
Conversely, the “wife to husband” relation is called the converse of the
“husband to wife” relation. The converse domain (range) of a relation
is the domain of its converse; the set of wives, therefore, is the range of the
“husband to wife” relation. Using these definitions, we may say that one set
is similar to another when there is a one-to-one relation in which the one
set is the domain, and the other is the range.
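The definitions of domain, converse, converse domain, and similarity can be made concrete by modelling a relation as a set of ordered pairs. The sketch below is a minimal illustration, not Russell’s own formalism: the function names and the three married couples are hypothetical, and the representation only works for finite relations.

```python
def domain(relation):
    """All terms that have the relation to something."""
    return {x for x, _ in relation}

def converse(relation):
    """The relation read in the opposite direction."""
    return {(y, x) for x, y in relation}

def converse_domain(relation):
    """The domain of the converse (the range of the relation)."""
    return domain(converse(relation))

def is_one_to_one(relation):
    """No term is related to two terms, and no term is related to by two terms."""
    firsts = [x for x, _ in relation]
    seconds = [y for _, y in relation]
    return len(set(firsts)) == len(firsts) and len(set(seconds)) == len(seconds)

def correlates(relation, a, b):
    """True if `relation` is one-to-one with domain `a` and converse domain `b`."""
    return (is_one_to_one(relation)
            and domain(relation) == set(a)
            and converse_domain(relation) == set(b))

# Hypothetical "husband to wife" relation in a world without polygamy or polyandry.
husband_to_wife = {("Ali", "Bea"), ("Carl", "Dana"), ("Ed", "Fay")}
husbands = domain(husband_to_wife)
wives = converse_domain(husband_to_wife)

# Adding a second wife for Ali destroys the one-to-one property.
polygamous = husband_to_wife | {("Ali", "Dana")}
```

Here correlates(husband_to_wife, husbands, wives) holds, so the set of husbands is similar to the set of wives without either set having been counted.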
The similarity relation is reflexive (“every set is similar to itself”), sym-
metrical (“if a set α is similar to a set β, then β is similar to α”), and
transitive (“if α is similar to β, and β is similar to γ, then α is similar to
γ”) [3, 16].
A key detail that Russell points out is that the act of counting is only
applicable to finite sets, and “depends upon and assumes the fact that two
[sets] that are similar have the same number of [elements]” [3, 17]. (If we
were to count 20 elements, for example, we would simply be showing that
the set of these elements is similar to the set of numbers 1 to 20.) Hence, the
notion of similarity is “logically presupposed” in the operation of counting,
and the notion of similarity is for several reasons “logically simpler” than
the operation of counting. For one, the notion of similarity does not require
an order. (Above it was established that the number of husbands must be
equal to the number of wives, even though the exact number of husbands
and wives is unknown.) Moreover, the notion of similarity does not require
that the sets which are similar should be finite. By way of illustration, if we
had the natural numbers (excluding 0) on the one hand, and their respective
reciprocals on the other hand, it is clear that we could map 2 to $\frac{1}{2}$, 3 to $\frac{1}{3}$, 4
to $\frac{1}{4}$, and so on, thus showing that these two sets are similar.
We can consequently use the notion of similarity to decide when two sets
should belong to the same family of sets. Regardless of the number of ele-
ments a set may have, the sets that are similar to it will have the same number
of elements. We may thus use similarity as a definition of “having the same
number of elements.” Naturally we might think that the set of couples (say)
is something different from the number 2. But, as Russell reassuringly adds,
“there is no doubt about the set of couples: it is indubitable and not difficult
to define, whereas the number 2, in any other sense, is a metaphysical entity
about which we can never feel sure that it exists or that we have tracked it
down” [3, 18]. It is therefore more sensible to use the set of couples, which
we are sure of, than to use a slippery definition of the number 2. Accordingly,
we may state the following definition: “The number of a set is the set of
all those sets that are similar to it” [3, 18]. It follows from this definition
that the set of all couples will itself be the number 2. And we can thus say
that “a number is anything which is the number of some set” [3, 19].
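In a small finite universe, the definition of the number of a set can be carried out literally. The following Python sketch assumes a hypothetical three-element universe and hypothetical function names; it tests similarity by searching for a one-to-one pairing rather than by comparing counts, although Python’s own machinery of course counts internally, so this is an illustration of the definition and not a foundation for it.

```python
from itertools import combinations, permutations

def subsets(universe):
    """Every subset of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def similar(a, b):
    """Search for a one-to-one pairing that uses up all of `a` and all of `b`."""
    a, b = sorted(a), sorted(b)
    # Each arrangement pairs the i-th element of `a` with its i-th entry.
    for arrangement in permutations(b, len(a)):
        if set(arrangement) == set(b):
            return True
    return False

def number_of(s, universe):
    """The number of a set: the set of all sets in the universe similar to it."""
    return frozenset(t for t in subsets(universe) if similar(s, t))

universe = {"a", "b", "c"}
two = number_of(frozenset({"a", "b"}), universe)
zero = number_of(frozenset(), universe)
```

In this universe, two is exactly the set of all couples, namely {a, b}, {a, c}, and {b, c}, and zero is the set whose only element is the empty set.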
Hence, the main result of this chapter is that the investigation into the
meaning of number, that is, into the question “What is a number?”, leads to
the development of the theory of sets, relations, and equivalence relations.
2.3 Chapter 3: Finitude and Mathematical Induction
It was established in Chapter 1 that the theory of natural numbers can be
derived if we know what is meant by “0,” “number,” and “successor.” But
the natural numbers can actually be defined even if we only know what is
meant by “0” and “successor.” To explain how this can be done, Russell dif-
ferentiates “finite” from “infinite,” demonstrating why the method by which
it is done cannot be applied in the case of the infinite.
Russell begins by stating that if we start with 0 and proceed stepwise
from each number to its successor, then it is clear we can reach any specific
number. For example, to reach the number 3 from 0, we could say “1 is the
successor of 0, 2 is the successor of 1, and 3 is the successor of 2.” However,
this is not enough to prove the general proposition that all such numbers
can be reached in this way [3, 20]. Therefore, we must see whether there
is another way by which this proposition can be proved. In answering this
question, Russell first considers the numbers that can be reached using “0”
and “successor.” If we say something such as “1 is the successor of 0, 2 is the
successor of 1, and so on,” it is tempting to say that “and so on” means that
the process of proceeding to the successor may be repeated a finite number of
times. But this definition assumes that we know what is meant by a “finite
number”—which has not yet been defined.
The answer to this problem, says Russell, lies in mathematical induction.
Although mathematical induction was previously presented as a principle,
Russell now shows that it is in fact a definition. To do so, he provides several
new definitions.
A hereditary property in the natural number series is a property such
that, whenever it belongs to a number $n$, it also belongs to $n + 1$, the successor
of $n$.
A hereditary set is a set such that, whenever $n$ is an element of that
set, so is $n + 1$.
An inductive property is a hereditary property which belongs to 0.
An inductive set is a hereditary set of which 0 is an element.
The posterity of a given natural number with respect to the rela-
tion ‘immediate predecessor’ (which is the converse of ‘successor’)
is the set of all those elements that belong to every hereditary set to which
the given number belongs [3, 22]. The ‘posterity of 0,’ for instance, is the set
which consists of those elements which belong to every inductive set. (Notice
that ‘0’ is an element of every inductive set, and is thus an element of the
‘posterity of 0.’)
The ‘posterity of 0’ then is ‘the set of those elements (including 0) that
can be reached from 0 by successive steps from next to next.’ However, Rus-
sell outlines the distinction between these two sets thus: “the notion of ‘the
set of elements that can be reached from 0 by successive steps from next to
next’ is vague, though it seems as if it conveyed a definite meaning; on the
other hand, ‘the posterity of 0’ is precise and explicit just where the other
idea is hazy” [3, 22]. Russell thus defines the “natural numbers” as the
posterity of 0 with respect to the relation ‘immediate predecessor.’
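The definition of posterity can be followed word for word in a finite setting: collect every hereditary set that contains the given term and keep only what they all share. The Python sketch below does this by brute force. It rests on loudly stated assumptions: the universe is a hypothetical finite one in which the successor series is cut off at 4, two stray elements “p” and “q” are added to show what gets excluded, and all names are illustrative.

```python
from itertools import combinations

def all_subsets(universe):
    """Yield every subset of a finite universe as a frozenset."""
    items = list(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_hereditary(candidate, successor_of):
    """A set is hereditary if it contains the successor of each of its elements."""
    return all(successor_of[n] in candidate for n in candidate if n in successor_of)

def posterity(start, universe, successor_of):
    """Intersection of all hereditary subsets of the universe that contain `start`."""
    result = set(universe)
    for candidate in all_subsets(universe):
        if start in candidate and is_hereditary(candidate, successor_of):
            result &= candidate
    return result

# Hypothetical finite universe: 0..4 under "successor", plus two unrelated strays.
universe = {0, 1, 2, 3, 4, "p", "q"}
successor_of = {0: 1, 1: 2, 2: 3, 3: 4, "p": "q"}

posterity_of_zero = posterity(0, universe, successor_of)
posterity_of_two = posterity(2, universe, successor_of)
```

The posterity of 0 comes out as the terms 0 through 4 and nothing else: the strays belong to some hereditary sets containing 0 but not to all of them, which is exactly how the definition excludes them without ever using the phrase “and so on.”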
Russell has thus defined one of Peano’s three primitive ideas in terms of
the other two. Specifically, he has arrived at “number” using only “0” and
“successor.” (Since “posterity of 0” is what we meant when we spoke
of “the elements that can be reached from 0 by successive steps from next
to next,” it might be more specific to say that we have arrived at “number”
using only “0” and “immediate predecessor.”) Additionally, two of Peano’s
primitive propositions, namely, the one asserting that 0 is a number (proposition 1)
and the one asserting mathematical induction (proposition 5), have
become unnecessary, since they result from the definition of “number.”
Using postulates 2 to 4 of Peano’s postulates, with the relation “imme-
diate predecessor” in place of the relation “successor,” we may summarize
the preceding discussion as follows. Russell has emended Peano’s Postulates
to read: There exists a set $X$, a relation $P$ (“immediate predecessor”)
defined in $X$, and an element $0 \in X$ such that
• If $y \in X$ and $y \neq 0$, there exists some $x \in X$ for which $x P y$.
• $x P z$ and $y P z \implies x = y$, where $x, y, z \in X$.
• There is no $x \in X$ such that $x P 0$.
The natural numbers, symbolized by $\mathbb{N}$, are the posterity of 0 with respect to $P$.
Russell has thus established that Peano’s primitive idea of “number” can
be defined in terms of the other two primitive ideas, namely, “0” and “suc-
cessor,” both of which can be defined by the general definition of number.
By the general definition of number, we can say “0” is the number of
elements in a set that has no elements (this is often called the “empty set”),
which is the set of all sets that are similar to the empty set; that is, the
set whose only element is the empty set. (Hence, “0” is the set whose only
element is the empty set.)
Using this same definition, we can also define “successor.” Given any
number n, let A be a set that has n elements, and let x be an element that is
not an element of A. Then the set consisting of A with x added on will have
$n + 1$ elements. We thus can state the following definition: “The successor
of the number of elements in the set A is the number of elements in the set
consisting of A together with x, where x is any element not belonging to the
set” [3, 23]. In modern terminology, this defines successor as an operation
in the set of cardinal numbers.
Above we established that two of Peano’s primitive propositions—namely,
propositions 1 and 5—become unnecessary, since they result from the defi-
nition of “number.” This leaves us to prove the three remaining primitive
propositions; specifically, that (2) the successor of any number is a number;
that (3) no two numbers have the same successor; and that (4) 0 is not the
successor of any number. According to Russell, (2) and (4) are easily proven;
however, proving (3) is difficult if we assume that the total number of things
in the universe is finite.
Let us first suppose that the total number of things in the universe is not
finite. Then, for any two numbers $a$ and $b$, neither of which is the
total number of things in the universe, it is easy to prove that we cannot
have $a + 1 = b + 1$ unless we have $a = b$. Hence, proving (3) poses no problem.
Let us now suppose that the total number of things in the universe is
finite. If the number of things in the universe were (say) 10, then there would
be no set of 11 things, and the number 11 would be the empty set. (This is to
be contrasted with the number 0; the number 0 would be the set containing
the empty set, whereas the number 11 would be the empty set.) Likewise,
there would be no set of 12 things, and the number 12 would also be the
empty set. Hence, the successor of 10 would be the same as the successor of
11, but 10 is clearly not the same as 11. Hence, proving (3) poses a problem.
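The difficulty with proposition (3) in a finite universe can be reproduced on a small scale. The Python sketch below assumes a hypothetical universe of only three things (instead of the ten used above) and builds the numbers from Russell’s definitions of “0” and “successor”; the function and variable names are illustrative only.

```python
def successor(number, universe):
    """Successor of a number: all sets formed by adding one new element
    of the universe to a set belonging to the number."""
    return frozenset(a | {x} for a in number for x in universe if x not in a)

# Hypothetical universe containing exactly three things.
universe = frozenset({"a", "b", "c"})

# "0" is the set whose only element is the empty set.
zero = frozenset({frozenset()})

# Build 0, 1, 2, 3, 4, 5 by repeatedly taking successors.
numbers = [zero]
for _ in range(5):
    numbers.append(successor(numbers[-1], universe))
```

With three things in the universe, numbers[3] contains the single set {a, b, c}, while numbers[4] and numbers[5] are both the empty set. The numbers 3 and 4 are therefore different but have the same successor, which is the small-scale analogue of 10 and 11 sharing a successor in a universe of ten things.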
We now know then that if we assume the number of things in the uni-
verse to be not finite, then we can define Peano’s three primitive ideas, in
addition to proving his five primitive propositions, “by means of primitive
ideas and propositions belonging to logic” [3, 25]. “It follows,” Russell says
satisfyingly, “that all pure mathematics, in so far as it is deducible from the
theory of the natural numbers, is only a prolongation of logic” [3, 25].
We have shown that the process of mathematical induction can be used
to define the natural numbers; it will be useful to recognize, however, that
this type of induction is generalizable. Recall that the natural numbers were
defined as the posterity of 0 with respect to the relation of a number to its
“immediate predecessor” (the converse of “successor”). If $N$ is the “immediate
predecessor” relation, then clearly any number $a$ will have the relation
$N$ to $a + 1$. A property is “hereditary with respect to $N$,” or $N$-hereditary,
if, whenever the property belongs to a number $a$, it also belongs to $a + 1$.
Moreover, a number $b$ will be said to belong to the “posterity of $a$ with respect
to the relation $N$” if $b$ has every $N$-hereditary property belonging to $a$.
These definitions can be generalized to any other relation. Thus if R is
any relation whatsoever, we can state the following definitions.
An R-hereditary property is a property such that, if it belongs to a
term c, and c has the relation R to d, then it belongs to d. This is made
more precise by the next definition, since “property” has not been defined.
An R-hereditary set is a set whose defining property is R-hereditary.
(Recall that a “defining property” of a set is a property shared between
all elements of that set.) That is, $A$ is an $R$-hereditary set if $x \in A$ and
$x R y \implies y \in A$.
An element c is an R-ancestor of the term d if d has every R-hereditary
property that c has, provided c is a term which has the relation R to something
or to which something has the relation R. (Such a definition helps us
to avoid the situation where c has the relation to nothing and where noth-
ing has the relation to c, in which case we do not want to say that c is an
R-ancestor of d.)
The R-posterity of c is the set of all terms to which c is an R-ancestor.
From the foregoing definitions, it is clear that if an element is the ancestor
of anything, then it is its own ancestor and belongs to its own posterity.
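For a finite relation, the R-posterity of a term is the smallest R-hereditary set containing it, and this can be found by repeatedly following the relation from the starting term. The Python sketch below takes that shortcut instead of quantifying over all hereditary sets, which is an assumption about the finite case rather than Frege’s or Russell’s own formulation; the family tree and all names are hypothetical.

```python
def field(relation):
    """Every term that has the relation to something or to which something has it."""
    return {term for pair in relation for term in pair}

def r_posterity(c, relation):
    """Smallest R-hereditary set containing c; empty if c lies outside the field of R."""
    if c not in field(relation):
        return set()
    reached = {c}
    frontier = [c]
    while frontier:
        x = frontier.pop()
        # Follow the relation one step from x to anything not yet reached.
        for a, b in relation:
            if a == x and b not in reached:
                reached.add(b)
                frontier.append(b)
    return reached

def is_r_ancestor(c, d, relation):
    """c is an R-ancestor of d when d belongs to the R-posterity of c."""
    return d in r_posterity(c, relation)

# Hypothetical "parent" relation: (parent, child) pairs.
parent = {("Ann", "Bob"), ("Bob", "Cy"), ("Bob", "Di"), ("Eve", "Fay")}
```

With this relation, Ann is an ancestor of Cy and of Di, Ann is her own ancestor (as noted above), and Ann is not an ancestor of Fay. A term such as “Zed,” which stands in the parent relation to nothing, has an empty posterity, matching the proviso in the definition of R-ancestor.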
Let us now take R to be the relation “parent.” It is of note that, prior
to Frege developing his generalized theory of induction, no one could define
“ancestor” precisely in terms of “parent.” It would have involved the as-
sumption that the number of things in consideration is finite. For instance,
suppose we were given the following series:
A, B, C, ..., X, Y, Z.
A beginner’s definition of “ancestor” in terms of “parent” would “naturally
say that A is an ancestor of Z if, between A and Z, there are a certain num-
ber of people, B,C . . . , of whom B is a child of A, each is a parent of the
next, until the last, who is a parent of Z” [3, 26]. But this definition is not
satisfactory unless we add that the number of intermediate terms is finite.
This series begins with a series of letters with no end, and then ends with
a series of letters with no beginning. Is C an ancestor of X? It will be
so, according to the beginner’s definition of ancestor suggested above. (The
beginner’s definition allows us to have a series with an “infinite” number of
intermediate terms.) However, it will not be so according to any definition
which will give the idea of “finite” that we would like to define. For this rea-
son, it is essential that the number of intermediaries (between any two terms
in a series) is “finite.” But, as we saw, “finite” can be defined by means of
mathematical induction.
Using Frege’s generalized theory of induction, therefore, we can now define
“ancestor” precisely in terms of “parent,” as one instance of the general
ancestral relation. It is clear that
it is simpler to define the ancestral relation generally than to define it
first only for the case of the relation of $n$ to $n + 1$ and then extend
it to other cases (as in the case of mathematical induction).
Hence, we now understand that mathematical induction is a definition, not
a principle. There are some numbers to which mathematical induction can
be applied (for example, the natural numbers), and there are other numbers
to which it cannot be applied (for example, the infinite cardinal numbers). If the
“natural numbers” are defined as those numbers that possess all inductive
properties, it follows at once, as a matter of definition, that proofs by
induction apply to every natural number.
Mathematical induction enables us to differentiate the “finite” from the
“infinite,” and might be stated simply in the following way: “What can be
inferred from next to next can be inferred from first to last” [3, 27]. But
this statement is true only when the number of intermediate steps between
first and last is finite. To elucidate the argument from “next to next,” and
its connection with the idea of “finite,” Russell uses a perceptive analogy
of the jerks of a goods train: “When a train is very long, it is a very long
time before its last truck moves. If the train were infinitely long, there would
be an infinite succession of jerks, and the time would never come when the
whole train would be in motion. Nevertheless, if there were a series of trucks
no longer than the series of [natural] numbers . . . every truck would begin
to move sooner or later if the engine persevered, though there would always
be other trucks further back which had not yet begun to move” [3, 28].
2.4 Chapter 4: The Definition of Order
In Chapter 4 of Introduction to Mathematical Philosophy, Russell seeks a
definition of order.
When we think of the natural numbers, it is not uncommon to think of
them in terms of their order of magnitude (0, 1, 2, 3, . . . ), but they are
actually capable of an infinite number of other arrangements. Although one
order—for instance, the order of magnitude—might be more familiar, others
are equally valid. But whichever order we may choose, the resulting order
will be one which the elements of the set certainly have, whether we choose
to notice it or not.
It is important to recognize that order lies in a relation among the ele-
ments of the set, in respect of which some appear as “earlier” and some as
“later.” If a set has many orders, then there are many relations among the
elements of that set. But are there certain properties a relation must have
in order to give rise to an order?
With respect to an ordering relation, we must be able to say, of two el-
ements in a set, that one “precedes” and the other “follows.” In order to
use “precedes” and “follows” in the way in which we should normally under-
stand them, we require that such a relation is asymmetrical, transitive, and
connected [3, 31].
An asymmetrical relation is one such that, if x precedes y, then y must
not also precede x. For example, the relation “taller” is asymmetrical: if x
is taller than y, then y is not taller than x.
A transitive relation is one such that, if x precedes y and y precedes z,
then x precedes z. The relation “taller” is also transitive: if x is taller than
y and y is taller than z, then x is taller than z. It should be noted that
some relations are asymmetrical but not transitive, while other relations are
transitive but not asymmetrical. An example of the former case is the
relation “father,” and an example of the latter case is the relation
“sameness of height.”
A connected relation is one such that, given any two different elements of
the set which is to be ordered, one must precede and the other follow. For
instance, of any two different integers, one is smaller and the other
greater; but of any two complex numbers this is not true [3, 32].
Russell claims that whenever an order exists, some relation having these
three properties can be found generating it [3, 32]. To demonstrate why this
must be true, Russell introduces a few definitions:
A relation is an aliorelative, or “is contained in (or implies) diversity,”
if no term has this relation to itself. The relation “greater” is an aliorelative;
the relation “equal” is not.
If a given relation holds between x and y and between y and z, then the
square of that relation is the one which holds between x and z. For instance,
if the relation “father” holds between x and y and between y and z, then the
square of that relation is “grandfather,” since x is the grandfather of z.
“The domain of a relation consists of all those terms that have the rela-
tion to something or other, and the converse domain consists of all those
terms to which something or other has the relation” [3, 32].
“The field of a relation consists of its domain and converse domain to-
gether” [3, 32].
“One relation is said to contain or be implied by another if it holds
whenever the other holds” [3, 32].
An asymmetrical relation is the same thing as a relation whose square
is an aliorelative. (Take the asymmetrical relation “father.” The square
of this relation is “grandfather,” and no term is the grandfather of itself.)
An asymmetrical relation is always an aliorelative. (The relation “father”
is aliorelative, since no term is the father of itself.) But an aliorelative is
often not asymmetrical. (The aliorelative relation “is a sibling of” is not
asymmetrical. If Donna is a sibling of Xavier, for instance, then it does not
follow that Xavier is not a sibling of Donna.)
A transitive relation is one which is implied by its square. For example, the
relation “ancestor” is transitive (an ancestor’s ancestor is an ancestor), but
the relation “father” is not.
“A relation is ‘connected’ when, given any two different terms in its field,
the relation holds between the first and the second or between the second
and the first (not excluding the possibility that both may happen, though
both cannot happen if the relation is asymmetrical.)” [3, 33].
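In symbols, the definitions above may be summarized as follows. This is only a sketch in modern notation: writing $xRy$ for “x has the relation R to y,” $R^2$ for the square of $R$, and $C(R)$ for the field of $R$ are conventions assumed here for illustration and are not taken from the source.

```latex
\begin{align*}
R \text{ is an aliorelative} &\iff \forall x\; \neg (xRx),\\
x R^2 z &\iff \exists y\; (xRy \wedge yRz),\\
R \text{ is asymmetrical} &\iff \forall x\, \forall y\; \bigl(xRy \rightarrow \neg (yRx)\bigr)
  \iff R^2 \text{ is an aliorelative},\\
R \text{ is transitive} &\iff \forall x\, \forall z\; \bigl(xR^2z \rightarrow xRz\bigr),\\
R \text{ is connected} &\iff \forall x, y \in C(R)\; \bigl(x \neq y \rightarrow (xRy \vee yRx)\bigr).
\end{align*}
```

The equivalence in the third line holds because $xRy$ together with $yRx$ is exactly what would make $x$ stand in the relation $R^2$ to itself.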
The three properties of being (1) aliorelative, (2) transitive, and (3) con-
nected, are mutually independent, since a relation may have two without
having the third. For example:
• The relation “ancestor” satisfies (1) and (2), but not (3). (The field of
the “ancestor” relation is all people, but it is not uncommon for person
1 not to be the ancestor of person 2, and, moreover, for person 2 not
to be the ancestor of person 1.)
• The relation “less than or equal to,” among numbers, satisfies (2) and
(3), but not (1).
• The relation “greater or less,” among numbers, satisfies (1) and (3),
but not (2).
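The independence of the three properties can be checked mechanically on a small finite sample. The following is a minimal sketch, assuming that a relation is represented as a set of ordered pairs; the function names, the sample 1, . . . , 12, and the use of “is a proper divisor of” as a numerical stand-in for “ancestor” are all choices made here for illustration.

```python
from itertools import product

def field(rel):
    """Domain and converse domain of a relation, taken together."""
    return {x for (x, _) in rel} | {y for (_, y) in rel}

def is_aliorelative(rel):
    """(1) No term has the relation to itself."""
    return all(x != y for (x, y) in rel)

def is_transitive(rel):
    """(2) The relation is implied by its square."""
    return all((x, z) in rel for (x, y) in rel for (w, z) in rel if y == w)

def is_connected(rel):
    """(3) Any two different terms of the field are related one way or the other."""
    terms = field(rel)
    return all((x, y) in rel or (y, x) in rel
               for x in terms for y in terms if x != y)

def is_serial(rel):
    """A serial relation is aliorelative, transitive, and connected."""
    return is_aliorelative(rel) and is_transitive(rel) and is_connected(rel)

numbers = range(1, 13)  # hypothetical finite sample of numbers
pairs = list(product(numbers, repeat=2))

less = {(x, y) for (x, y) in pairs if x < y}
less_or_equal = {(x, y) for (x, y) in pairs if x <= y}
greater_or_less = {(x, y) for (x, y) in pairs if x != y}
# Numerical stand-in for "ancestor": x is a proper divisor of y.
proper_divisor = {(x, y) for (x, y) in pairs if x != y and y % x == 0}

for name, rel in [("less", less), ("less or equal", less_or_equal),
                  ("greater or less", greater_or_less),
                  ("proper divisor", proper_divisor)]:
    print(name, is_aliorelative(rel), is_transitive(rel), is_connected(rel))
```

On this sample only “less” has all three properties; each of the other relations lacks exactly one of them, which is the pattern described in the list above.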
A serial relation is aliorelative, transitive, and connected; or, equiv-
alently, asymmetrical, transitive, and connected. (Recall an asymmetrical
relation is always an aliorelative.)
A series is the same thing as a serial relation.
Russell notes briefly that a series is the serial relation itself and not the
field of a serial relation. It would be a mistake to consider the field of the
relation as the series, as a field can have multiple series with different order-
ing relations. The serial relation determines both the field and the order,
making it the series, but the field cannot be considered the series.
If P is a serial relation, then the phrase “x precedes y” refers to the
relation between x and y, written as xPy. The relation P , then, emphasizing
once more what was said above, must abide by three properties:
1. x cannot precede itself (P is aliorelative).
2. If x precedes y and y precedes z, then x must precede z (P is transitive).
3. If x and y are in the field of P , then either x precedes y or y precedes
x (P is connected).
These three properties ensure that the characteristics of a series will also
be present in the ordering relation, and vice versa.
The definition is purely logical and applies to any serial relation. Al-
though a serial relation always exists where there is a series, it may not
always be the most natural relation to consider as the generator of the se-
ries. For example, in the case of the natural number series, the relation of
“immediate succession” between consecutive numbers is asymmetrical but
not transitive or connected. (Hence, the relation of “immediate succession”
is not serial.) However, from immediate succession we can derive the “an-
cestral” relation (considered in Chapter 3) by mathematical induction; and
this relation is the same as the relation “less than or equal to” among the
natural numbers. And the relation “less than,” excluding “equal to,” is what
is needed to generate the series of natural numbers. This relation is defined
as “m is less than n” when n possesses every hereditary property
possessed by the successor of m. This relation is asymmetrical, transi-
tive, and connected and orders the natural numbers. This order is known as
the “natural order” or “order of magnitude.”
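The passage from immediate succession to “less than” can be imitated on a finite initial segment of the natural numbers. The sketch below is only an approximation under stated assumptions: the cut-off LIMIT is hypothetical, and “possessing every hereditary property possessed by k” is modelled as membership in the smallest set that contains k and is closed under the step from n to n + 1.

```python
def posterity(start, successors, universe):
    """Smallest subset of `universe` that contains `start` and is closed
    under the step relation given by `successors`."""
    reached, frontier = set(), [start]
    while frontier:
        term = frontier.pop()
        if term in universe and term not in reached:
            reached.add(term)
            frontier.extend(successors(term))
    return reached

LIMIT = 8                       # hypothetical cut-off; the real series has no end
universe = set(range(LIMIT))

def successor(n):
    """Immediate succession: the step from n to n + 1."""
    return [n + 1]

# The ancestral relation: n has every hereditary property that m has.
less_or_equal = {(m, n) for m in universe
                 for n in posterity(m, successor, universe)}

# "m is less than n": n has every hereditary property that m + 1 has.
less = {(m, n) for m in universe
        for n in posterity(m + 1, successor, universe)}
```

Within the chosen segment, the two computed relations coincide with the usual “less than or equal to” and “less than.”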
The generation of series by means of relations resembling that of n to
n + 1 is very common. The generation of a series can be understood as the
passing from one term to the next, as long as there is a next, or back to the
one before, as long as there is one before [3, 35].
The proper posterity of x with respect to R is the set of all terms
that possess every R-hereditary property possessed by every term to which
x has the relation R. (This definition is slightly different from that of the
R-posterity of x,† so as to account for cases where there may be many terms
to which x has the relation R. For example, there may be many children to
whom one father has the relation “father of.”)
A term x is a proper ancestor of a term y with respect to R (or
a proper R-ancestor of y) if y belongs to the proper posterity of x with
respect to R.
†The R-posterity of x is the set of all terms of which x is an R-ancestor
(defined above, in the discussion of Chapter 3), i.e., the set of all terms
that have every R-hereditary property that x has.
For the generation of series by the relation R between consecutive terms
to be possible, the relation “proper R-ancestor” must be an aliorelative, tran-
sitive, and connected. This relation will always be transitive, but it may not
always be aliorelative or connected, which would prevent the generation of a
series.
For instance, let R be the relation of sitting on someone’s left at a round
table at which there are twelve people. Then the proper R-posterity of each
sitting person consists of everyone who can be reached by going around the
table from left to right. Specifically, the proper R-posterity of each person
includes everyone at the table, including the person himself. In such a case,
though the relation “proper R-ancestor” is connected (given any two people
sitting at the table, there must be one person that is the proper R-ancestor
of the other), and the relation R itself is aliorelative (no person is seated to
the left of himself), a series is not generated because the relation “proper R-
ancestor” is not an aliorelative (each person belongs to the proper R-posterity
of himself) [3, 36].
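The round-table example is small enough to compute directly. The following sketch assumes, purely for illustration, that the twelve people are numbered 0 to 11 and that one R-step passes from person p to person p + 1 (modulo 12); the names are hypothetical.

```python
SEATS = 12  # twelve people, numbered 0..11 round the table (an assumption)

def neighbour(person):
    """One R-step: pass from a person to the next person round the table."""
    return (person + 1) % SEATS

def proper_posterity(person):
    """Everyone who can be reached from `person` by one or more R-steps."""
    reached = set()
    current = neighbour(person)
    while current not in reached:
        reached.add(current)
        current = neighbour(current)
    return reached

# (x, y) is included when x is a proper R-ancestor of y.
proper_ancestor = {(x, y) for x in range(SEATS) for y in proper_posterity(x)}

r_is_aliorelative = all(neighbour(p) != p for p in range(SEATS))
ancestor_is_connected = all(
    (x, y) in proper_ancestor or (y, x) in proper_ancestor
    for x in range(SEATS) for y in range(SEATS) if x != y)
ancestor_is_aliorelative = all(x != y for (x, y) in proper_ancestor)
```

Here R itself is aliorelative and “proper R-ancestor” is connected, but “proper R-ancestor” is not aliorelative, because going all the way round the table brings each person back to himself.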
The question of when series can be generated by ancestral relations de-
rived from relations of consecutiveness is important. If the relation R is a
one-to-one (or many-to-one) relation, then the “proper R-ancestor” must be
connected, and all that remains is to ensure that it is aliorelative.
There are several ways to generate series, but all of them require the
identification of a serial relation. For example, let us consider the three-term
relation “between,” which allows for the ordering of points in a straight line.
To define the relation “between,” we first need to consider three points
on a straight line in ordinary space. There must be one of these points that
lies between the other two. This is not true for points on a closed curve, like
a circle, as we can travel from one point to another without passing through
the third. The relation “between” is thus unique to open series (as opposed
to cyclic series) and allows us to arrange points in a line in an ordered fashion.
Suppose that we have two points a, b, so that the line (ab) consists
of three parts (besides a and b themselves):
1. Points between a and b.
2. Points x such that a is between x and b.
3. Points y such that b is between a and y.
To ensure that the relation “between” can arrange the points on the line in
a meaningful way, we need to make certain assumptions. These assumptions
are:
1. If anything is between a and b, then a and b cannot be the same point.
2. Anything between a and b must also be between b and a.
3. Anything between a and b cannot be identical to either a or b.
4. If x is between a and b, then anything between a and x must also be
between a and b.
5. If x is between a and b and b is between x and y, then b must be
between a and y.
6. If x and y are between a and b, then they must be the same or x must
be between a and y or between y and b.
7. If b is between a and x and also between a and y, then x and y must
be the same or x must be between b and y or y must be between b and
x.
Therefore, the concept of order can be generated by means of a three-term
relation, such as the “between” relation. To use the “between” relation
effectively to arrange points on a straight line, these seven assumptions
must be made, so that the relation is meaningful and can be used to order
the points in a definite manner.
Russell observes that any three-term relation which verifies these
properties gives rise to series. By way of illustration, Russell considers the
relation “to the left of.” If a is to the left of b, then the points on the
line (ab) are classified as follows:
1. Those between which and b lies a, which we will call those to the left
of a.
2. The point a itself.
3. Those between a and b.
4. The point b itself.
5. Those between which and a lies b, which we will call those to the right
of b.
The definition of “to the left of” is given as follows: For two points x, y
on a line (ab), x is said to be to the left of y if one of the following cases
holds:
1. Both x and y are to the left of a, and y is between x and a.
2. x is to the left of a, and y is a or b or between a and b or to the right
of b.
3. x is a and y is between a and b or is b or is to the right of b.
4. Both x and y are between a and b, and y is between x and b.
5. x is between a and b, and y is b or to the right of b.
6. x is b and y is to the right of b.
7. Both x and y are to the right of b, and x is between b and y.
From the seven properties that were assigned to the relation “between,”
says Russell, “it can be deduced that the relation ‘to the left of,’ as above
defined, is a serial relation as we defined the term” [3, 43].
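The seven cases above can be transcribed almost word for word. The sketch below assumes, for illustration only, that points on the line are modelled as numbers and that “m is between p and q” means that m lies strictly between them; Russell’s construction itself presupposes no such model, and the function names are hypothetical.

```python
def between(m, p, q):
    """m is between p and q (points modelled as numbers: an assumption)."""
    return min(p, q) < m < max(p, q)

def make_left_of(a, b):
    """Build the relation "to the left of" from `between` and two fixed points."""
    def region(x):
        if x == a:
            return "a"
        if x == b:
            return "b"
        if between(x, a, b):
            return "mid"      # between a and b
        if between(a, x, b):
            return "left"     # a lies between x and b
        return "right"        # b lies between a and x

    def left_of(x, y):
        rx, ry = region(x), region(y)
        if rx == "left":      # cases 1 and 2
            return between(y, x, a) if ry == "left" else True
        if rx == "a":         # case 3
            return ry in {"mid", "b", "right"}
        if rx == "mid":       # cases 4 and 5
            return between(y, x, b) if ry == "mid" else ry in {"b", "right"}
        if rx == "b":         # case 6
            return ry == "right"
        return ry == "right" and between(x, b, y)   # case 7

    return left_of

left_of = make_left_of(a=0, b=10)
points = [-5, -2, 0, 3, 7, 10, 12, 20]   # hypothetical sample of points
```

With a = 0 and b = 10, the relation produced agrees with the ordinary “less than” on the sample points, so on this sample it is asymmetrical, transitive, and connected.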
Cyclic order, such as that of the points on a circle, cannot be generated
by three-term relations of “between.” In fact, a relation of four terms, called
“separation of couples,” is needed to generate cyclic order. Given any four
points on a circle (e.g., a, b, x, and y), it is possible to separate them
into two couples, say (a, b) and (x, y), such that in order to get from “a to
b one must pass through either x or y, and in order to get from x to y one
must pass through either a or b” [3, 43]. This relation can generate a cyclic
order, but the process is more complicated than generating an open order
from “between.”
2.5 Chapter 5: Kinds of Relations
In Chapter 5 of Introduction to Mathematical Philosophy, Russell discusses
the significance of the different types of relations.
Russell begins by emphasizing the importance of having a clear under-
standing of the various kinds of relations and their properties, as some prop-
erties may only be relevant for specific types of relations. One relation he
considers is the serial relation, whose three properties, as discussed in Chap-
ter 4, are asymmetry, transitiveness, and connexity.
Asymmetry refers to the property of a relation that is incompatible with
its converse; that is, if x → y, then y ↛ x. Russell explains that it is
sometimes possible
to separate a symmetrical relation into two asymmetrical relations. Consider
the symmetrical relation “spouse.” If we assume the spouse of a male is always
female and the spouse of a female is always male, then the relation “spouse”
can be separated into two asymmetrical relations, as follows:
1. By limiting the domain of “spouse” to males or by limiting the converse
domain of “spouse” to females, we obtain the relation “husband.”
2. By limiting the domain of “spouse” to females or by limiting the converse
domain of “spouse” to males, we obtain the relation “wife.”
The symmetrical relation “spouse” can be separated into two asymmetri-
cal relations because there are two mutually exclusive sets, namely, “males”
and “females,” such that, whenever the relation “spouse” holds between two
people, one person is a member of “males” and one person is a member of
“females.” Hence, the relation “spouse” with its domain confined to “males”
will be asymmetrical, and so will the relation when its domain is confined
to “females.” But such cases are rare. If we have a series of more than two
terms, for instance, then all terms, except “the first and last (if these exist),
belong both to the domain and to the converse domain of the generating
relation, so that a relation like husband, where the domain and converse
domain do not overlap, is excluded” [3, 43].
Russell then discusses the important question of how to construct rela-
tions that have certain useful properties by using operations on relations that
only have rudimentary versions of these properties. It is relatively easy, for
instance, to construct transitiveness and connexity in many cases when the
original relation does not have these properties. For example, if R is any
relation whatsoever, the “ancestral relation derived from R by generalized
induction is transitive, and if R is a many-one relation, the ancestral rela-
tion will be connected if it is confined to the posterity of a given term” [3, 43].
However, it is much more difficult to construct asymmetry. The method
used to derive the relation “husband” from the relation “spouse,” as men-
tioned above, cannot be used in the cases where the domain and converse
domain overlap, as with “greater,” “before,” or “to the right of”
[3, 43]. In these cases, a symmetrical relation can be obtained by adding
the original relation and its converse, but it is not possible to go back to the
original asymmetrical relation without the help of some asymmetrical rela-
tion. For example, the “greater” relation can be combined with its converse
(the “less” relation) to form the “greater or less”—i.e., “unequal”—relation,
which is symmetrical, but there is nothing in this relation to indicate that it
is the sum of two asymmetrical relations.
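Both constructions discussed above can be tried on small finite relations. In this sketch relations are sets of ordered pairs; the four names and the sample of numbers are hypothetical data introduced only for illustration.

```python
def converse(rel):
    """The same relation taken in the opposite sense."""
    return {(y, x) for (x, y) in rel}

def is_symmetrical(rel):
    return rel == converse(rel)

def is_asymmetrical(rel):
    """No pair of terms is related in both directions."""
    return not (rel & converse(rel))

# Hypothetical couples; a pair (x, y) reads "x is a spouse of y".
spouse = {("Adam", "Beth"), ("Beth", "Adam"),
          ("Carl", "Dina"), ("Dina", "Carl")}
males = {"Adam", "Carl"}

husband = {(x, y) for (x, y) in spouse if x in males}       # domain limited to males
wife = {(x, y) for (x, y) in spouse if x not in males}      # domain limited to females

# Adding "greater" to its converse gives the symmetrical "greater or less".
numbers = range(5)
greater = {(x, y) for x in numbers for y in numbers if x > y}
unequal = greater | converse(greater)
```

The split of “spouse” needs nothing beyond the two exclusive sets, whereas the set of pairs making up “unequal” carries no record of which half came from “greater” and which from “less.”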
From a classification perspective, asymmetry is a more important charac-
teristic than being aliorelative. Asymmetrical relations are aliorelative, but
the reverse is not true. (The aliorelative relation “unequal,” for example, is
symmetrical.)
Russell then notes that it is possible to replace relational propositions
with predicates so long as the relations are symmetrical. Symmetrical
relations that are not aliorelative, if they are transitive, may be regarded
as asserting a common predicate; whereas symmetrical relations that are
aliorelative may be regarded as asserting incompatible predicates [3, 44].
For example, the relation “similarity between sets,” used to define “num-
bers” in Chapter 2, is symmetrical and transitive yet not aliorelative. It is
possible, although less simple, to regard the “number” of a collection as a
predicate of the collection. In this case, two similar sets will have the same
numerical predicate, while two sets that are not similar will have different
numerical predicates. This method of replacing relations with predicates is
not possible when the relations are asymmetrical, because “both sameness
and difference of predicates are symmetrical” [3, 44]. Hence, asymmetrical
relations are, according to Russell, “the most characteristically relational of
relations, and the most important to the philosopher who wishes to study
the ultimate logical nature of relations” [3, 45].
Russell next provides a comprehensive overview of one-many relations. A
one-many relation is a relation that at most one term can have to a given
term. (Hence, one-one relations are a subset of one-many relations.)
Examples of one-many relations include “father,” “mother,” and “square of.”
Relations like “parent” and “square root of” are not one-many. (“Parent” is
many-many, since a person has two parents and may have several children;
and “square root of” is many-one, since a number may have two square roots.)
In theory, all relations can be converted into one-many relations. For
instance, consider the “less” relation among the natural numbers. For any
number greater than 1, there will not be just one number that has the “less”
relation to it, but a whole set of numbers that are less than it. This set is
known as the proper ancestry of the number. The proper ancestry of the
number 2, for example, is the set of numbers {0, 1}. The relation of a
proper ancestry to its number is one-many (indeed one-one; recall that a
one-one relation is a one-many relation), since each number determines
a single set of numbers as constituting its proper ancestry. Therefore, says
Russell, “the relation less than can be replaced by being a member of the
proper ancestry of ” [3, 45]. Sticking with the previous example, then, we
may write
0 is a member of the proper ancestry of 2.
1 is a member of the proper ancestry of 2.
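The replacement of “less than” by membership in a proper ancestry can be written out directly. This is a minimal sketch for the natural numbers; the function names are hypothetical.

```python
def proper_ancestry(n):
    """The single set of natural numbers that are less than n."""
    return frozenset(range(n))

def less_than(m, n):
    """'m is less than n', restated as membership in the proper ancestry of n."""
    return m in proper_ancestry(n)

# Each number determines exactly one such set, and different numbers
# determine different sets.
ancestries = {n: proper_ancestry(n) for n in range(10)}
```

For instance, proper_ancestry(2) is the set containing 0 and 1, so both less_than(0, 2) and less_than(1, 2) hold.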
According to Russell, though, this reduction of a relation to a one-many
relation does not provide a technical simplification and is not considered
a philosophical analysis due to the notion that sets are “logical fictions.”
Therefore, one-many relations will continue to be regarded as a special type
of relation.
The concept of one-many relations is present in all phrases of the form
“the so-and-so of such-and-such.” For instance, “the mother of John Stuart
Mill” describes a person by means of a one-many relation to a specific term.
As a person cannot have more than one mother, the phrase “the mother of
John Stuart Mill” refers to a specific person, even if her identity is unknown.
It is worth noting that all mathematical functions arise from one-many
relations; terms such as the “sine of x” are described through a one-many
relation (in this case, “sine”) to a given term x, similar to “the mother of x”:
\[ x \xrightarrow{\ \text{sine}\ } \text{``sine of $x$''}; \]
\[ x \xrightarrow{\ \text{mother}\ } \text{``mother of $x$''}. \]
These functions are known as descriptive functions, which can be rep-
resented as “the term having the relation R to x” or simply “the R of x,”
where R represents any one-many relation [3, 46]:
\[ x \xrightarrow{\ R\ } \text{``$R$ of $x$''}. \]
The use of “the R of x” as a descriptive term requires that x is a term to
which something has the relation R, and that only one term has the relation
R to x, because the use of “the” implies uniqueness. For example, we can
talk about “the father of x” if x refers to a human being except Adam and
Eve, but not if x refers to a table or chair or any other object without a father.
Therefore, the existence of “the R of x” is determined by there being
only one term with the relation R to x. This occurs when x is part of the
converse domain of R, but not otherwise. In mathematical terms, x is the
“argument” of the function, and the term with the relation R to x—i.e., “the
R of x”—is the “value” of the function for the argument x. For a one-many
relation R, the range of possible arguments for the function is the converse
domain of R, and the range of possible values is the domain.
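A descriptive function can be sketched as a lookup that succeeds only when exactly one term has the relation to the argument. In the sketch below a pair (a, b) reads “a has the relation to b”; the helper names (in particular the function `the`) and the sample range are hypothetical.

```python
def domain(rel):
    """All terms that have the relation to something."""
    return {a for (a, _) in rel}

def converse_domain(rel):
    """All terms to which something has the relation."""
    return {b for (_, b) in rel}

def the(rel, x):
    """Return "the R of x": the one term that has the relation `rel` to x."""
    candidates = {a for (a, b) in rel if b == x}
    if len(candidates) != 1:
        raise ValueError(
            f"'the R of {x!r}' is not defined: {len(candidates)} candidate terms")
    return candidates.pop()

# (a, b) reads "a is the square of b": a one-many relation.
square_of = {(n * n, n) for n in range(-3, 4)}
# (a, b) reads "a is a square root of b": not one-many, since both
# 2 and -2 are square roots of 4.
square_root_of = {(n, n * n) for n in range(-3, 4)}

value = the(square_of, -3)   # the argument -3 belongs to the converse domain
```

The possible arguments of `the(square_of, ...)` are the members of the converse domain (here the integers from -3 to 3), and the possible values are the members of the domain (0, 1, 4, and 9). A call such as `the(square_root_of, 4)` raises an error, because “the” square root of 4 is not unique.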
Important concepts in relation logic, such as converse, domain, converse
domain, and field, are examples of descriptive functions. Russell introduces
more examples as the discussion continues.
Above it was noted that one-one relations are a subset of one-many
relations. One-one relations are especially important, and their formal
definition can be derived from that of one-many relations. On the one hand,
one-one relations are defined as relations that are both one-many and
many-one, i.e., “one-many relations which are also the converses of one-many
relations” [3, 46]. One-many relations, on the other hand, can be defined as
those such that, if x has the relation to y, then there is no other term that
has that same relation to y. Or, they can be defined as relations such that,
given two different terms x and x′, the terms to which x has the given
relation and those to which x′ has the given relation have no member in
common.
The relative product of two relations, R and S, is a relation that holds
between x and z when there is an intermediate term y, such that x has the
relation R to y and y has the relation S to z:
\[ x \xrightarrow{\ R\ } y \xrightarrow{\ S\ } z. \]
In the case of one-many relations, the relative product of the relation and
its converse implies identity. For instance, if we take R to be the one-many
relation “father” and we take S to be its converse (the relation of child to
father), then x is the father of y and y is a child of z, so it follows that
x must be identical to z. For one-one relations, the relative product of the
relation and its converse, as well as the relative product of the converse
and the relation, implies identity.
When a relation R exists, it is helpful to think of y as being reached from
x through an “R-step” or “R-vector.” And in the same manner, x can be
reached from y through a “backward R-step.” For one-many relations, then,
an R-step followed by a backward R-step should bring you back to your
starting point. However, this is not always the case for other relations, such
as the relation of child to parent or grandchild to grandparent.
It should be noted that the relative product of two relations is not in
general commutative, meaning the relative product of R and S need not be the
same as the relative product of S and R. For example, the relative product
of parent and sister is aunt, but the relative product of sister and parent
is parent.
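The relative product is easy to compute for finite relations. In the following sketch a pair (x, y) reads “x has the relation to y,” and the relative product of R and S relates x to z when x has R to some y and y has S to z, as defined above. The small family and the two arithmetical relations are hypothetical data chosen only to illustrate the points just made.

```python
def converse(rel):
    return {(y, x) for (x, y) in rel}

def relative_product(r, s):
    """x is related to z when x has r to some y and y has s to z."""
    return {(x, z) for (x, y) in r for (w, z) in s if y == w}

# (x, y) reads "x is the father of y"; a one-many relation.
father = {("Abe", "Ben"), ("Abe", "Cal"), ("Ben", "Dan")}
child_of_father = converse(father)

# The square of "father" (on this data, paternal grandfather).
grandfather = relative_product(father, father)

# An R-step followed by a backward R-step returns to the starting point...
step_then_back = relative_product(father, child_of_father)
# ...but a backward R-step followed by an R-step may reach a sibling.
back_then_step = relative_product(child_of_father, father)

# Non-commutativity, shown with two arithmetical relations.
successor = {(n, n + 1) for n in range(5)}
double = {(n, 2 * n) for n in range(10)}
successor_then_double = relative_product(successor, double)
double_then_successor = relative_product(double, successor)
```

On this data `step_then_back` contains only pairs of the form (x, x), whereas `back_then_step` also contains ("Ben", "Cal"); and the two products of `successor` and `double` are different relations.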
One-one relations establish a correspondence between two sets, term by
term. This means that every term in one set has a corresponding term in
the other set. The concept is easiest to understand when the two sets have
no overlapping members, such as the set of husbands and the set of wives.
In this case, it is clear which term represents the referent (“the term from
which the relation goes”) and which represents the relatum (“the term to
which the relation goes”) [3, 48]. For example, if x and y represent husband
and wife, respectively, then with respect to the relation “husband,” x is the
referent and y is the relatum, but with respect to the relation “wife,” y is
the referent and x is the relatum.
Relations can have a sense, which refers to the direction in which the
relation goes. The sense of a relation that goes from x to y is opposite to the
sense of the corresponding relation from y to x. This concept of a relation
having a sense is fundamental and helps explain why order can be created
through relation. The set of all possible referents in a relation is referred to
as its domain, and the set of all possible relata is its converse domain.
However, it is not uncommon for the domain and converse domain of a
one-one relation to overlap. For example, adding 1 to each of the first 10
positive integers (excluding 0) yields the integers 2 through 11, which are
the same 10 integers but with 1 removed from the beginning and 11
added to the end. This relation of n to n + 1 is a one-one relation. Another
example is the relation between the first 10 positive integers and their
doubles, which yields the integers 2, 4, 6, . . . , 20, five of which
(2, 4, 6, 8, 10) belong to the original 10. The relation between a number
and its double is also a one-one relation.
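These two correlations can be verified by direct computation. The sketch below represents each relation as a set of pairs (n, correlate of n); the helper names are hypothetical.

```python
def domain(rel):
    return {x for (x, _) in rel}

def converse_domain(rel):
    return {y for (_, y) in rel}

def is_one_one(rel):
    """No two pairs share a first term, and no two pairs share a second term."""
    return len(domain(rel)) == len(rel) == len(converse_domain(rel))

first_ten = set(range(1, 11))

plus_one = {(n, n + 1) for n in first_ten}     # the relation of n to n + 1
double = {(n, 2 * n) for n in first_ten}       # the relation of n to its double

overlap_plus_one = domain(plus_one) & converse_domain(plus_one)
overlap_double = domain(double) & converse_domain(double)
```

Both relations are one-one. For the relation of n to n + 1 the domain and converse domain share the integers 2 through 10, and for doubling they share 2, 4, 6, 8, and 10.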
An especially interesting case occurs when the converse domain is only
a part of the domain. For example, consider the relation of n to n + 1, where
the domain is all the natural numbers n, instead of just the first 10 positive
integers. If we arrange two rows of numbers, such that the numbers in the
domain are in the top row and the numbers in the converse domain are in the
bottom row, we have:
1, 2, 3, 4, 5, . . . , n, . . .
2, 3, 4, 5, 6, . . . , n + 1, . . .
In this case, all natural numbers are in the top row, but only
some are in the bottom row. These types of relations, where the converse
domain is a proper part (i.e., a part but not the whole) of the domain, are
explored later by Russell when dealing with the concept of infinity.
Another type of relation is called a “permutation”: a one-one relation
whose domain and converse domain are identical. For example, the six
possible arrangements of (a, b, c) illustrate permutations. Each arrangement
can be transformed into another by a correlation. For example, (a, b, c) can
be transformed to (c, b, a), if a is correlated with c, b is correlated with
itself, and c is correlated with a. The combination of two permutations
results in another permutation, and the permutations of a given set form a
group.
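The claims about the arrangements of (a, b, c) can be checked by enumeration. In this sketch each arrangement is read as a correlation from the original order (a, b, c), stored as a dictionary; the names `combine`, `identity`, and `reverse` are hypothetical.

```python
from itertools import permutations

items = ("a", "b", "c")

# Each arrangement, read as a correlation from the original order (a, b, c).
correlations = [dict(zip(items, arrangement))
                for arrangement in permutations(items)]

def combine(first, second):
    """Apply the correlation `first`, then the correlation `second`."""
    return {x: second[first[x]] for x in first}

identity = dict(zip(items, items))
reverse = dict(zip(items, ("c", "b", "a")))   # a -> c, b -> b, c -> a

closed = all(combine(p, q) in correlations
             for p in correlations for q in correlations)
has_inverses = all(any(combine(p, q) == identity for q in correlations)
                   for p in correlations)
```

There are six correlations; combining any two gives another of the six, the identity is among them, and each can be undone by another, which is what is meant by saying that they form a group.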
These different types of correlations are important in different contexts.
The uses of one-one correlations are especially important, and will be ex-
plored in the next chapter.
2.6 Chapter 6: Similarity of Relations
In Chapter 2, Russell defined two sets to be similar if there is a
one-to-one correlation between them, in which case they have the same number
of terms. In this chapter, Russell seeks to define a comparable relation
between relations called “likeness.”
To define likeness, Russell employs the notion of correlation, assuming
that “the domain of one relation can be correlated with the domain of the
other, and the converse domain with the converse domain” [3, 52]. However,
this is not enough for the desired resemblance between the two relations.
What is desired is that whenever one relation holds between two terms, the
other relation should hold between the correlates of those terms. (By “cor-
relate,” what Russell means is that if x has some relation R to y, then, with
respect to R, the correlate of x is y, and the correlate of y is x.)
Russell uses the example of a map to illustrate the concept of “likeness”
between relations. If we say one place is north of another, then the place on
the map corresponding to the one is above the place on the map correspond-
ing to the other. Hence, writes Russell, the “space-relations in the map have
‘likeness’ to the space-relations in the country mapped” [3, 53]. And it is
this connection between relations that he wishes to define.
In defining “likeness,” Russell imposes a constraint on the types of re-
lations he will consider. Specifically, he considers only those relations that
have fields, i.e., those that allow for the creation of a single set by combining
the domain and converse domain. Russell uses the example of the relation
“domain” to illustrate a case where this constraint does not hold, as it has all
sets as its domain (every set is the domain of some relation) and all relations
as its converse domain (every relation has a domain). However, sets and
relations cannot be combined to create a single set since they are of different
logical types. Russell does not delve into the complex topic of types, but
emphasizes that it is important to recognize when we are avoiding it.
This raises the question: When does a relation have a field? Without
being pedantic, Russell asserts that a relation only has a “field” if its domain
and converse domain belong to the same logical type; in other words, if it
is homogeneous. To provide a general sense of what he means by “logical
types,” Russell says that “individuals, [sets] of individuals, relations between
individuals, relations between [sets], and relations of [sets] to individuals,
and so on, are different types” [3, 53]. The concept of likeness, which
has yet to be defined, is not particularly useful, according to Russell, when
applied to relations that are not homogeneous. Therefore, when defining
likeness, he simplifies the task by referring to the “field” of one of the
relations involved. (In other words, he restricts himself to relations that
are homogeneous.) While this restriction limits the generality of the
definition, Russell notes that it is not of any practical significance, and
once mentioned, need not be remembered [3, 53].
Two relations P and Q have likeness if there exists a one-to-one relation S
that has the field of P as its domain and the field of Q as its converse
domain, and is such that, whenever one term has the relation P to another,
the correlate of the one has the relation Q to the correlate of the other,
and vice versa. Figure 1 makes this clearer.
This definition can be simplified by introducing the concept of a corre-
lator. We say that S is a correlator of P and Q if S is a one-to-one relation
that “has the field of Q as its converse domain, and is such that P is the
relative product of S and Q and the converse of S” [3, 54]. “Two relations
P and Q,” therefore, “are said to be similar, or to have likeness, if there
exists at least one correlator of P and Q” [3, 54].
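For finite relations, the condition that S be a correlator of P and Q can be tested directly. The sketch below follows the quoted definition (S one-to-one, with the field of Q as its converse domain, and P equal to the relative product of S, Q, and the converse of S); the two three-term series and the helper names are hypothetical examples.

```python
def converse(rel):
    return {(y, x) for (x, y) in rel}

def field(rel):
    return {x for (x, _) in rel} | {y for (_, y) in rel}

def converse_domain(rel):
    return {y for (_, y) in rel}

def relative_product(r, s):
    """x is related to z when x has r to some y and y has s to z."""
    return {(x, z) for (x, y) in r for (w, z) in s if y == w}

def is_one_one(rel):
    firsts = {x for (x, _) in rel}
    seconds = {y for (_, y) in rel}
    return len(firsts) == len(rel) == len(seconds)

def is_correlator(s, p, q):
    """S is one-one, has the field of Q as converse domain, and P = S | Q | conv(S)."""
    return (is_one_one(s)
            and converse_domain(s) == field(q)
            and p == relative_product(relative_product(s, q), converse(s)))

p = {(1, 2), (1, 3), (2, 3)}                  # "less than" on 1, 2, 3
q = {("a", "b"), ("a", "c"), ("b", "c")}      # alphabetical order on a, b, c
s = {(1, "a"), (2, "b"), (3, "c")}            # a correlator of p and q
s_reversed = {(1, "c"), (2, "b"), (3, "a")}   # one-one, but not a correlator
```

Here `is_correlator(s, p, q)` holds, so p and q have likeness, while `s_reversed` fails the test because it carries the order of q into the reverse of p.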
Russell explains that relations that have likeness share all properties that
are independent of the terms in their fields. For instance, if one relation is
transitive, then the other is also transitive; and the same holds for other
general properties of relations. Statements involving the actual terms in the
field of a relation may not hold when applied to a similar relation; however,
these statements can always be translated into analogous statements that do
hold.
We are thus led to a problem in mathematical philosophy related to the
interpretation of statements, where we may know the grammar and syntax
of the statement, but not the vocabulary. The problem is stated as follows:
What are the possible meanings of a statement whose vocabulary is unknown
but whose grammar and syntax are known, and what are the meanings of
the unknown words that would make it true? This question is significant
because it reflects the state of our knowledge of nature, where we have a better
understanding of the form of nature than the matter, and we only know that
there is likely some interpretation of the terms used in scientific propositions
that will make them approximately true. This question will be answered in
a later chapter; for now, Russell says that we must further investigate the
subject of likeness.
Figure 1: Likeness between two relations P and Q. (In the diagram, P relates
x to y, Q relates s to w, and S correlates x with s and y with w.)
Russell notes that the properties of similar relations are identical
except for those dependent on the specific terms in their fields. To group
these similar relations, the term relation-number of a given relation is
introduced, which refers to the set of all relations similar to a given relation.
(This definition reflects that of a “number” of a set in Chapter 2, which is the
set of sets similar to that set.) More broadly, we may say that the relation-
numbers are the set of all those sets of relations that are relation-numbers
of various relations; in other words, a relation-number is a set of relations
consisting of all those relations that are similar to one element of the set.
By establishing this terminology, says Russell, we can create a system for
grouping and studying similar relations in a more structured way.
To avoid confusion with relation-numbers, Russell now uses cardinal
number in place of the “number” of a set. Hence, the cardinal number of a
set is the set of all sets that are similar to that set [3, 56].
The relation-numbers are applicable to series. Two series are considered
equally long if they have the same relation-number. If we have two finite
series whose fields have the same cardinal number of terms, then they will
have the same relation-number. Hence, in the case of finite series, “there is
a parallelism between cardinal and relation-numbers” [3, 56]. Russell uses
the term “serial numbers” to refer to relation-numbers that are applicable to
series. Therefore, a finite serial number can be determined when the cardinal
number of terms in the field of a series having that serial number is known.
“If n is a finite cardinal number,” says Russell, “the relation-number of a se-
ries which has n terms is called the ordinal number n” [3, 57]. But when
the cardinal number of terms in the field of a series is infinite, the relation-
number of the series cannot be determined by the cardinal number alone. In
fact, an infinite number of relation-numbers can exist for one infinite cardinal
number. This is because the “length” or relation-number of an infinite series
can vary without a change in the cardinal number of terms. For example,
consider N and Z. (The cardinal number of both sets is ℵ0,
but they‡ have different relation-numbers because a correlator from N to Z
would have to map 0, which has no predecessor, to an integer that has no
predecessor, and Z contains no such integer.) In contrast, for finite
series, the relation-number is uniquely determined by the cardinal number of
terms in the field.
Arithmetic operations can be defined for relation-numbers just as they
are for cardinal numbers. Russell considers the sum of two non-overlapping
series. To define the sum of their relation-numbers as the relation-number
of the sum of the two series, we must first order the series by placing one before
the other. Let P and Q be the generating relations of the two series. In the
sum of P and Q, with P preceding Q, every element of the field of P comes
before every element of the field of Q. Therefore, the serial relation that we
need to define as the sum of P and Q is not solely “P or Q,” but “P or Q
or the relation of any [element] of the field of P to any [element] of the field
of Q” [3, 57]. Assuming that P and Q do not overlap, this relation is serial,
but “P or Q” is not serial because it is not connected.§
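The difference between “P or Q” and the full sum can be seen on a small example. The sketch below assumes finite series stored as sets of ordered pairs; the function names and the two sample series are hypothetical choices made for illustration.

```python
def field(rel):
    """All terms that occur in some ordered pair of the relation."""
    return {t for pair in rel for t in pair}

def serial_sum(p, q):
    """P or Q or the relation of any term of the field of P
    to any term of the field of Q (P is placed before Q)."""
    fp, fq = field(p), field(q)
    if fp & fq:
        raise ValueError("the two series must not overlap")
    return set(p) | set(q) | {(x, y) for x in fp for y in fq}

def is_connected(rel, terms):
    """Given any two different terms, the relation holds one way or the other."""
    return all((x, y) in rel or (y, x) in rel
               for x in terms for y in terms if x != y)

# Hypothetical sample series.
P = {(1, 2), (1, 3), (2, 3)}      # the series 1, 2, 3
Q = {("a", "b")}                  # the series a, b
terms = field(P) | field(Q)
```

Here `is_connected(serial_sum(P, Q), terms)` holds, while `is_connected(P | Q, terms)` fails because, for instance, neither (3, "a") nor ("a", 3) belongs to “P or Q.”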
‡The successor relation in N and the successor relation in Z.
§A “connected” relation, as discussed in Chapter 4, is a relation such that, given any
two different terms in its field, the relation holds between the first and the second or
between the second and the first, not excluding the possibility that both may happen,
though both cannot happen if the relation is asymmetrical.
Series are not the only application of the idea of likeness. Russell has
already mentioned maps, but extends our thoughts to geometry generally.
When the system of relations applied by geometry to one set of terms can be
brought fully into relations of likeness with a system applied to another set
of terms, the resulting geometries are indistinguishable from a mathematical
standpoint, i.e., “all the propositions are the same, except for the fact that
they are applied in one case to one set of terms and in the other to another”
[3, 58].
That said, Russell says that a mathematician need not be preoccupied
with the specific nature or essence of points, lines, and planes, even when
engaging in applied mathematics. While there is empirical evidence support-
ing some aspects of geometry that are not definitional, there is no empiri-
cal evidence regarding the true nature of a “point.” A point should satisfy
our axioms as closely as possible, but need not necessarily be “very small”
or “without parts” [3, 59]. As long as a logical structure, no matter how
complex, can be constructed from empirical material that satisfies our ge-
ometrical axioms, it may legitimately be called a “point.” This illustrates
the general principle that what is important in mathematics, and to a large
extent in physical science, is not the intrinsic nature of our terms, but rather
the logical nature of their interrelations.
We can describe two similar relations as having the same “structure.”
For mathematical purposes, what matters about a relation is not its intrinsic
nature, but the instances in which it holds. Just as a set can be defined by
different but co-extensive concepts—Russell gives the example of “man” and
“featherless biped”—two relations that are conceptually distinct can hold in
the same set of instances. An “instance” in which a relation holds is a pair
of terms with an order, such that one term comes first and the other second,
and the first term has the relation in question to the second. If we consider
the relation “mother,” for instance, we can define its “extension” as the set
of all ordered pairs (x, y) in which x is the mother of y. Mathematically
speaking, the only thing that matters about the relation mother is that it
defines this set of ordered pairs. In general, we can say that the “extension”
of a relation is the set of ordered pairs (x, y) in which x has the relation in
question to y.
Russell takes a further step in the process of abstraction and examines
what is meant by “structure.” If we are given a sufficiently simple relation,
we can create a map of it. For example, we can consider a relation whose
extension includes the following ordered couples: ab, ac, ad, bc, ce, dc, de,
where a, b, c, d, e are five terms, regardless of what they are. We can make
a “map” of this relation (see Figure 2) by placing five points on a plane and
connecting them with arrows.
Figure 2: A map of the relation whose ordered couples are ab, ac, ad, bc, ce,
dc, and de.
The map reveals the “structure” of the relation. It is evident that the
“structure” of the relation does not depend on the specific terms that com-
pose the field of the relation. The field can be altered without altering the
structure, and the structure can be altered without altering the field.
(Replacing a with f is an example of the former case; adding the ordered
couple ae is an example of the latter.) Generally speaking, then,
the “structure” of a relation is independent of the specific terms in its field.
If the same map will do for two relations (equivalently, if either can serve
as a map of the other), they have likeness and hence the same relation-number.
The relation-number is thus what is meant by the vague term “structure.”
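The two alterations just described can be tested on the relation of Figure 2. The following is a rough sketch that searches by brute force for a correlator between two finite relations; the function `are_similar` and the variable names are assumptions introduced for this illustration, and the search is only practical for very small fields.

```python
from itertools import permutations

def field(rel):
    """Sorted list of the terms occurring in the relation."""
    return sorted({t for pair in rel for t in pair})

def are_similar(p, q):
    """Try every one-one correlation of the two fields and report
    whether one of them carries the couples of p exactly onto those of q."""
    fp, fq = field(p), field(q)
    if len(fp) != len(fq) or len(p) != len(q):
        return False
    for image in permutations(fq):
        s = dict(zip(fp, image))              # candidate correlator
        if {(s[x], s[y]) for x, y in p} == set(q):
            return True
    return False

# The relation of Figure 2, written as ordered couples.
R = {("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"),
     ("c", "e"), ("d", "c"), ("d", "e")}

# Replacing a with f alters the field but not the structure.
R_renamed = {tuple("f" if t == "a" else t for t in pair) for pair in R}

# Adding the couple ae alters the structure but not the field.
R_extended = R | {("a", "e")}
```

Under these assumptions, `are_similar(R, R_renamed)` holds although the fields differ, and `are_similar(R, R_extended)` fails although the fields coincide.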
Russell concludes this chapter by emphasizing the importance of under-
standing structure in philosophy to avoid speculation. The dismissal of objec-
tive counterparts to subjective concepts such as space, time, and phenomena
suggests limited knowledge of the objective world. However, if objective
counterparts exist, they must have the same structure as the phenomenal
world; in other words, all true propositions about phenomena must also be
true for the objective world, differing only in irrelevant individuality. While
some philosophers avoid asserting objective counterparts, others are reserved
on the subject to prevent excessive convergence between the real and phe-
nomenal worlds. For these reasons, the notion of structure or relation-number
is significant in many ways.
2.7 Chapter 7: Rational, Real, and Complex Numbers
Russell has so far provided definitions for cardinal numbers and relation-
numbers, which has allowed us to define “ordinal” numbers. But these def-
initions are not sufficient to cover other types of numbers, such as negative,
fractional, irrational, and complex numbers. Hence, in this chapter, Russell
provides logical definitions for each of these types of numbers.
To begin, Russell notes that one of the reasons for the delay in discover-
ing accurate definitions of number extensions is the mistaken belief that each
type of extension includes the “previous sorts as special cases” [3, 63]. For
instance, it was previously believed that positive integers could be identified
with signless integers, and that a fraction with a denominator of 1 could be
identified with its numerator. Similarly, rational numbers were thought to
belong to the set of real numbers, and the complex numbers were thought to
include real numbers with an imaginary part of zero. Russell, however, be-
lieves all these suppositions are incorrect and must be discarded for accurate
definitions to be established.
Russell starts with positive and negative integers. He notes that $+1$
and $-1$ must both be relations, and must be each other’s converses. Specifically,
$+1$ is the relation of n to $n + 1$, while $-1$ is the relation of $n + 1$ to n.
Likewise, $+m$ is the relation of n to $n + m$, while $-m$ is the relation of $n + m$
to n, where m is any natural number. Russell emphasizes the distinction
between $+m$ and m: $+m$ is a one-to-one relation so long as n is a cardinal
number (finite or infinite), and m is a natural cardinal number. That is, $+m$
is distinct from m in that it is a relation, not a set of sets.
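A small sketch may make the distinction concrete. It assumes the relation is restricted to finitely many values of n so that it can be listed as a set of ordered pairs; the function names and the cut-off `limit` are hypothetical.

```python
def plus(m, limit):
    """The relation +m restricted to n = 0, ..., limit - 1: the pairs (n, n + m)."""
    return {(n, n + m) for n in range(limit)}

def converse(rel):
    """The relation with every ordered pair reversed."""
    return {(y, x) for x, y in rel}

def minus(m, limit):
    """The relation -m, taken as the converse of +m."""
    return converse(plus(m, limit))

plus_three = plus(3, 5)      # contains (0, 3), (1, 4), ..., (4, 7)
minus_three = minus(3, 5)    # contains (3, 0), (4, 1), ..., (7, 4)
```

In this representation `plus_three` is a set of ordered pairs, which is a different kind of object from the number 3 itself.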
Russell then discusses fractions. Although Russell’s close friend Dr.
Alfred North Whitehead developed a theory of fractions for their application
to measurement, an easier method can be adopted for defining fractions that
have the required mathematical properties. Russell defines the fraction $\frac{m}{n}$ as
the relation that holds between two natural numbers x and y when $xn = ym$.
(Russell defines fractions this way so that $\frac{x}{y} = \frac{m}{n}$.) For clarity, we can define
$\frac{m}{n}$ as the relation
\[ x \xrightarrow{\;\frac{m}{n}\;} y, \quad \text{when } xn = ym. \]
This definition allows us to prove that $\frac{m}{n}$ is a one-to-one relation if neither
m nor n is zero. It thus follows that $\frac{m}{1}$ is the relation that holds between two
integers x and y when $x = my$. That is,
\[ x \xrightarrow{\;\frac{m}{1}\;} y, \quad \text{when } x = ym. \]
However, like $+m$, the fraction $\frac{m}{1}$ is distinct from the natural cardinal
number m, since “a relation and a [set] of [sets] are objects of utterly different
types” [3, 64]. Note that Russell has thus defined what we call positive
fractions.
Russell also mentions that the relation $\frac{0}{n}$ is always the same, regardless
of the natural number n, and may be called the “zero of rational numbers.”
That is,
\[ x \xrightarrow{\;\frac{0}{n}\;} y, \quad \text{when } xn = 0, \text{ since } y \cdot 0 = 0. \]
However, $\frac{0}{n}$ is not identical to the cardinal number 0, since they are of
different types.
Conversely, the relation $\frac{m}{0}$ is always the same, regardless of the natural
number m, and may be called “the infinity of rationals.” That is,
\[ x \xrightarrow{\;\frac{m}{0}\;} y, \quad \text{when } 0 = ym, \text{ since } x \cdot 0 = 0. \]
However, this type of infinity is different from the “Cantorian infinite,”
which Russell discusses in the next chapter. While the infinity of rationals is
not too important and could be dispensed with if necessary, the Cantorian
infinite, according to Russell, “opens the way to whole new realms of math-
ematics and philosophy” [3, 65].
Russell notes that among fractions, zero and infinity are unique in that
they are not one-one. Zero is one-many (since y can be any natural number),
whereas infinity is many-one (since x can be any natural number).
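These properties can be observed directly by listing the relation over a small range. The sketch below assumes the natural numbers are cut off below a hypothetical bound `limit`; the function names are illustrative.

```python
def ratio(m, n, limit=7):
    """The relation m/n on the natural numbers below `limit`:
    x has the relation to y when x * n == y * m."""
    return {(x, y) for x in range(limit) for y in range(limit) if x * n == y * m}

def is_one_one(rel):
    """True if no first term and no second term occurs in two pairs."""
    firsts = [x for x, _ in rel]
    seconds = [y for _, y in rel]
    return len(set(firsts)) == len(firsts) and len(set(seconds)) == len(seconds)

two_thirds = ratio(2, 3)     # the pairs (0, 0), (2, 3), (4, 6)
zero = ratio(0, 1)           # relates x = 0 to every y: one-many
infinity = ratio(1, 0)       # relates every x to y = 0: many-one
```

On this range `ratio(0, 1)` and `ratio(0, 5)` list the same pairs, as do `ratio(1, 0)` and `ratio(4, 0)`, and neither zero nor infinity is one-one, whereas `two_thirds` is.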
Russell next defines greater and less among fractions. Given two fractions,
$\frac{a}{b}$ and $\frac{c}{d}$, we say that $\frac{a}{b}$ is less than $\frac{c}{d}$ when ad is less than cb. That
is,
\[ \frac{a}{b} \xrightarrow{\;\text{less than}\;} \frac{c}{d}, \quad \text{when } ad \text{ is less than } cb. \]
The relation “less than” is serial, and therefore the fractions form a series
in order of magnitude. The smallest term in this series is zero and the largest
is infinity. However, if we omit zero and infinity from the series, there is no
longer a smallest or largest fraction.
Any fraction other than zero and infinity can be shown to have a smaller
and a larger fraction. Moreover, there are always other fractions between
any two fractions. For example, if $\frac{a}{b}$ and $\frac{c}{d}$ are two fractions, and $\frac{a}{b}$ is
less than $\frac{c}{d}$, then $\frac{a + c}{b + d}$ will be greater than $\frac{a}{b}$ and less than $\frac{c}{d}$. This means
that the series of fractions is “compact”: there are always other terms between
any two, so no two terms are consecutive. The fractions in order
of magnitude, therefore, form a compact series, which is “generated purely
logically without any appeal to space, time, or any other empirical datum”
[3, 66].
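The in-between fraction can be checked with integer arithmetic alone, using the definition of “less than” given earlier. This is a minimal sketch assuming positive numerators and denominators; the function names and the sample fractions are illustrative.

```python
def less_than(a, b, c, d):
    """a/b is less than c/d when a*d is less than c*b (all terms positive)."""
    return a * d < c * b

def between(a, b, c, d):
    """Numerator and denominator of the fraction (a + c)/(b + d)."""
    return a + c, b + d

a, b, c, d = 1, 2, 2, 3          # the fractions 1/2 and 2/3
m, n = between(a, b, c, d)       # the fraction 3/5
```

Repeating the construction on 1/2 and 3/5 gives 4/7, and so on, which is why no two fractions are ever consecutive.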
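Positive and negative fractions can be defined in a way similar to positive
and negative integers. The sum of two fractions, $\frac{a}{b}$ and $\frac{c}{d}$, is defined as $\frac{ad + cb}{bd}$.
We define $+\frac{c}{d}$ as the relation of $\frac{a}{b}$ to $\frac{a}{b} + \frac{c}{d}$, where $\frac{a}{b}$ is any fraction; and $-\frac{c}{d}$
is the converse of $+\frac{c}{d}$. That is,
\[ \frac{a}{b} \;\underset{-\frac{c}{d}}{\overset{+\frac{c}{d}}{\rightleftarrows}}\; \frac{a}{b} + \frac{c}{d}. \]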
There are other possible ways to define positive and negative fractions, but
this method is a clear adaptation of the way positive and negative integers
are defined.
Russell next introduces the concept of real numbers as an extension
of the idea of number, which includes irrational numbers. He references the
discovery of incommensurables by Pythagoras—for example, the lengths
of the diagonal and the side of a square are incommensurable, because they
cannot be expressed as a ratio of two integers—and how, through geometry,
this led to the idea of irrational numbers. He then discusses the proof in
Euclid’s tenth book (Book X, Proposition 10) that there is no fraction whose
square is 2. The inability to express the length of the diagonal of a square
with a one-inch side in terms of a fraction, he argues, seems like a challenge
to arithmetic by nature itself. The scope of this problem is not limited to
geometry; it also encompasses algebra when solving equations.
Russell then explains that it is possible to find fractions whose squares
approach closer and closer to 2. We can construct an ascending series
of fractions whose squares are all less than 2, but differing from 2 in their
“later fractions” by less than any assigned amount. For instance, if we choose
one-trillionth as the assigned amount, then after a certain term, say the
fifteenth, all terms in the series will have squares that differ from 2 by less
than this amount. Using the standard arithmetic rule to extract $\sqrt{2}$, an
infinite decimal can be obtained that satisfies these conditions. Similarly, it
is possible to construct a descending series of fractions whose squares are
all greater than 2, but differing from 2 in their “later fractions” by less than
any assigned amount.
However, this method does not lead to $\sqrt{2}$. By way of illustration, Russell
divides all fractions into two sets: those whose squares are less than 2, and
those whose squares are greater than 2. The former set has no maximum,
and the latter set has no minimum. Hence, all fractions can be divided into
two sets such that “all the terms in one set are less than all in the other,
there is no maximum to the one set, and there is no minimum to the other”
[3, 68]. This implies that there is nothing between these two sets, where
$\sqrt{2}$ should be located. Thus, although a tight “cordon” (as Russell calls it)
has been drawn, it has not captured $\sqrt{2}$.
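The ascending and descending series can be produced with exact rational arithmetic. The sketch below uses decimal truncations, which is one possible choice and not necessarily the series Russell has in mind; the function names are assumptions.

```python
from fractions import Fraction
from math import isqrt

def lower(k):
    """Largest fraction with denominator 10**k whose square is less than 2."""
    return Fraction(isqrt(2 * 10 ** (2 * k)), 10 ** k)

def upper(k):
    """The next fraction with denominator 10**k; its square is greater than 2."""
    return lower(k) + Fraction(1, 10 ** k)

ascending = [lower(k) for k in range(8)]     # 1, 1.4, 1.41, 1.414, ...
descending = [upper(k) for k in range(8)]    # 2, 1.5, 1.42, 1.415, ...
```

Every term of `ascending` has a square less than 2 and every term of `descending` has a square greater than 2, while the k-th terms differ by only 1/10^k; no term of either series has a square equal to 2.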
The foregoing method of dividing all the terms of a series into two sets,
where one entirely precedes the other, is known as a Dedekind cut. With
respect to what happens at the point of section, there are four cases:
1. The lower section has a maximum, the upper section a minimum.
2. The lower section has a maximum, the upper section no minimum.
3. The lower section has no maximum, the upper section a minimum.
4. The lower section has no maximum, the upper section no minimum.
Case 1 only occurs in series with consecutive terms (like the integers) and
can be neglected. In case 2, the maximum of the lower section is the lower
limit of the upper section. In case 3, the minimum of the upper section is the
upper limit of the lower section. In case 4, there is a Dedekind gap; that
is, neither the lower section nor the upper section has a limit or a last term.
In this case, Russell says there is an irrational section, “since sections of
the series of fractions have ‘gaps’ that correspond to irrationals” [3, 70].
The delay in developing the true theory of irrationals, says Russell, was
due to a mistaken belief that series of fractions must have limits. Therefore,
the term “limit” must be defined thoroughly.
According to Russell, a set S has an upper limit L with respect to a
relation R if three conditions are satisfied:
1. S has no maximum in R. (See definition of “maximum” below.)
2. Every element of S that belongs to the field of R precedes L. By
“precedes,” Russell means “has the relation R to.”
3. Every element of the field of R that is before L precedes some element
of S.
A term m is considered a maximum of a set S with respect to a relation
R if m is an element of S and of the field of R, and does not have the relation
R to any other element of S [3, 70].
Russell emphasizes that these definitions do not require the terms to
which they are applied to be quantitative. In a series of moments of time ar-
ranged by earlier and later, for example, the “maximum” (if it exists) would
be the last moment. However, if arranged by later and earlier, the “maxi-
mum” (if it exists) would be the first moment.
The minimum of a set S with respect to R is defined as the maximum of
S with respect to the converse of R, while the lower limit with respect to
R is the upper limit with respect to the converse of R [3, 70].
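For a finite relation, the definitions of maximum and minimum can be applied literally. The following is a small sketch that reuses the example of moments of time; the five numbered moments and the helper names are hypothetical.

```python
def field(rel):
    """All terms that occur in some ordered pair of the relation."""
    return {t for pair in rel for t in pair}

def converse(rel):
    """The relation with every ordered pair reversed."""
    return {(y, x) for x, y in rel}

def maxima(s, rel):
    """Elements of s that lie in the field of rel and do not
    have the relation rel to any other element of s."""
    f = field(rel)
    return {m for m in s
            if m in f and not any((m, other) in rel for other in s if other != m)}

def minima(s, rel):
    """A minimum with respect to rel is a maximum with respect to its converse."""
    return maxima(s, converse(rel))

# Hypothetical moments 0..4 arranged by "earlier than".
earlier = {(x, y) for x in range(5) for y in range(5) if x < y}
later = converse(earlier)
moments = {1, 2, 3}
```

Arranged by `earlier`, the maximum of `moments` is the last moment, 3; arranged by `later`, the maximum is the first moment, 1, which is also the minimum with respect to `earlier`.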
Although the ideas of limit and maximum do not require the relation with
respect to which they are defined to be serial, they are most often applied in
serial or quasi-serial cases.
Another important idea is the “upper boundary,” which is the “maximum
or upper limit” of a set of terms chosen from a series. That is, the upper
boundary of a set is the maximum (“the last element”) if it has one, and if
not, it is the upper limit (“the first term after all of them”), if it exists [3,
70]. If there is neither a maximum nor an upper limit, then there is no upper
boundary. Similarly, the lower boundary is the minimum or lower limit.
Regarding the four kinds of Dedekind cuts, Russell notes that in the first
three cases, each section has a boundary (either upper or lower), while in
case 4, neither section has a boundary. Moreover, he notes that if the lower
section has an upper boundary, then the upper section has a lower boundary.
In cases 2 and 3, the two boundaries are identical, while in case 1, they are
consecutive terms of the series.
Russell defines a series as Dedekindian if every section of the series has
a boundary, whether upper or lower. The series of fractions in order of
magnitude has “gaps” (for example, at $\sqrt{2}$); hence, this series is not Dedekindian.
Russell surmises that people have been influenced by spatial imagination and
have believed that series must have limits “in cases where it seems odd if they
do not” [3, 71]. For example, some people allowed themselves to “postulate”
an irrational limit to fill the Dedekind gap when they realized there was no
rational limit to the fractions whose square is less than 2. Dedekind himself,
in fact, wrote the axiom that “the gap must always be filled,” meaning that
every section has a boundary. Therefore, series that satisfy this axiom are
called “Dedekindian.” There are an infinite number of series, however, for
which this axiom is not verified.
Comparing the advantages of the method of “postulating” to those of
theft, Russell wisely suggests using “honest toil” to find a precise definition
of an irrational Dedekind cut. To do this, he says, we must rid ourselves
of the idea “that an irrational must be the limit of a set of fractions” [3,
71]. Instead, Russell proposes defining a new kind of number called “real
numbers,” which will include both rational and irrational numbers. Rational
numbers correspond to fractions, “in the same kind of way in which $\frac{n}{1}$
corresponds to the integer n; but they are not the same as fractions” [3, 72].
To represent an irrational number, we may use an irrational cut, which is
represented by its lower section. In order to define rational numbers, then,
Russell suggests confining ourselves to cuts in which the lower section has no
maximum—such cuts he calls segments. Since no segment has a maximum,
we care only whether it has an upper limit; if it has an upper limit, then, by
definition, it has an upper boundary.
On the one hand, a segment that corresponds to a fraction has an upper
boundary (the fraction itself), since its upper limit is the fraction itself.
Such a segment consists of all fractions less than its upper boundary. For
example, consider the fraction $\frac{2}{3}$. The corresponding segment would be the
set of all fractions that are less than $\frac{2}{3}$. This segment would include fractions
like $\frac{1}{2}$, $\frac{1}{4}$, and $\frac{6}{10}$, but not fractions like $\frac{2}{3}$, $\frac{7}{6}$, or 1.
On the other hand, a segment that corresponds to an irrational number
has no upper boundary, since it has no upper limit. (Recall from above that
we must rid ourselves of the idea “that an irrational must be the limit of a
set of fractions.”)
Segments, whether they have a boundary or not, are such that, of any two
associated with one series, one must be part of the other; hence, all segments
can be arranged in a series by the relation of whole and part. A series
with Dedekind gaps, where there are “segments without boundaries” (e.g.,
the irrational numbers), “will give rise to more segments than it has terms,
since each term will define one segment having that term for boundary, and
then the ‘segments without boundaries’ will be extra” [3, 72]. For example,
consider the series
3, 3.1, 3.14, 3.141, 3.1415, 3.14159, . . .
If x is a term in this series, then x will define one segment having x as
its boundary. However, there will be a gap at the area of a circle of radius
1, thus giving rise to segments without boundaries that will be extra. Rus-
sell has thus reached a point where he can define “real number,” “irrational
number,” and “rational real number.”
A real number is “a segment of the series of fractions in order of mag-
nitude” [3, 72]. This definition is equivalent to “a segment of the series of
fractions which has (or does not have) an upper boundary.”
An irrational number is “a segment of the series of fractions that has
no [upper] boundary” [3, 72].
A rational real number is “a segment of the series of fractions which
has [an upper] boundary” and thus consists of all fractions less than a
certain fraction, which corresponds to the rational real number. For example,
the real number 1 is the set of all proper fractions [3, 72].
We might think intuitively that an irrational number is the limit of a set
of fractions, but Russell declares that “it is actually the limit of the
corresponding set of rational real numbers in the series of segments ordered by
whole and part” [3, 73]. For instance, $\sqrt{2}$, although it is not a segment of
the series of fractions that has an upper limit, is the upper limit of all those
segments of the series of fractions that correspond to fractions whose square
is less than 2. In other words, says Russell, “$\sqrt{2}$ is the segment consisting of
all those [fractions] whose square is less than 2” [3, 73].
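Since a segment is an infinite set of fractions, one way to experiment with these definitions is to represent a segment by its membership test. The sketch below makes that assumption and restricts attention to positive fractions; the function names are illustrative.

```python
from fractions import Fraction

def segment_of(r):
    """The rational real number corresponding to the fraction r:
    the membership test for all positive fractions less than r."""
    return lambda q: 0 < q < r

def sqrt2_segment(q):
    """The irrational real number sqrt(2): membership test for
    all positive fractions whose square is less than 2."""
    return q > 0 and q * q < 2

two_thirds = segment_of(Fraction(2, 3))
one = segment_of(Fraction(1))
```

For example, `two_thirds(Fraction(1, 2))` holds but `two_thirds(Fraction(2, 3))` does not, since a segment never contains its own boundary; likewise `sqrt2_segment(Fraction(7, 5))` holds and `sqrt2_segment(Fraction(3, 2))` does not.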
Russell notes that it is easy to prove that the series of segments of any
series is Dedekindian. Given any set of segments, their boundary will be
their “logical sum,” i.e., the set of all those terms that belong to at least one
segment of the set.
Russell’s definition of real numbers is an example of construction as
opposed to postulation; the definition of the cardinal numbers was another
example of construction.
This method has the advantage of requiring no new assumptions and allows
us to “proceed deductively from the original apparatus of logic” [3, 73].
The above definition allows us to easily define addition and multiplica-
tion. Given two real numbers u and v, each of which is a set of fractions, we
take any element of u and any element of v and add them together according
to the rule for the addition of fractions. The set of all such sums that can
be obtained by varying the selected elements of u and v forms a new set of
fractions, which is easily proved to be a segment of the series of fractions. This
new set is defined as the sum of u and v. Hence, the arithmetical sum of
two real numbers is “the set of the arithmetical sums of an element of the
one and an element of the other chosen in all possible ways” [3, 73].
According to Russell, the arithmetical product of two real numbers can
be defined in a similar way to the arithmetical sum, except that we instead
multiply an element of one by an element of the other in all possible ways.
The set of fractions generated by this process is defined as the product of
the two real numbers. Russell notes, however, that in these definitions,
the series of fractions is defined so as to exclude 0 and infinity.
Complex numbers involve the square root of a negative number; hence,
the letter i is used to represent the square root of $-1$, and any number
involving the square root of a negative number can be expressed in the form
$x + yi$, where x and y are real numbers. The “real part” of such a number is
x, while the “imaginary part” is yi.
Complex numbers are less important in geometry than in algebra and
analysis. In the latter two cases, they are required for the extraction of roots
and the solution of equations. If we were operating in the complex numbers,
for instance, the equation $x^2 + 1 = 0$ would have two roots; but if we were
confined to real numbers, it would have no roots. Although the creation of
such “non-physical” numbers might at first seem disingenuous, Russell notes
that “every generalization of number has presented itself as needed for some
simple problem” [3, 74]. For example, negative numbers were needed so that
subtraction could be possible, and fractions were needed for division. But
“extensions of number are not created by the mere need for them: they are
created by the definition” [3, 75]. Hence, Russell claims we must now turn
our attention to the complex numbers.
Russell claims that a complex number can be defined “as an ordered
couple of real numbers” [3, 75]. By defining complex numbers as ordered
couples of real numbers, we can ensure that two real numbers are needed to
determine a complex number and that two complex numbers are only equal
if their corresponding couples of real numbers are equal. Further properties
can be achieved by defining addition and multiplication rules as follows:
\[ \text{Addition:} \quad (x + yi) + (x' + y'i) = (x + x') + (y + y')i \]
\[ \text{Multiplication:} \quad (x + yi)(x' + y'i) = (xx' - yy') + (xy' + x'y)i \]
Thus, given two ordered couples of real numbers $(x, y)$ and $(x', y')$, their
sum is the couple $(x + x', y + y')$, and their product is the couple
$(xx' - yy', xy' + x'y)$. These definitions ensure that the ordered couples have the
desired properties. For example, the product of $(0, y)$ and $(0, y')$ is $(-yy', 0)$.
Therefore, the square of $(0, 1)$ is $(-1, 0)$. The couples with an imaginary part
of zero can be identified with real numbers; although this “is an error in
theory,” it is “a convenience in practice” [3, 76]. Thus, $(0, 1)$ is represented
by i, and $(-1, 0)$ by $-1$. These multiplication rules ensure that the square
of i is $-1$, as desired. Hence, these definitions serve all necessary purposes.
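The two rules translate directly into operations on ordered couples. The following is a brief sketch in which couples are Python tuples of ordinary numbers standing in for real numbers; the function names `add` and `mul` are assumptions.

```python
def add(u, v):
    """Sum of the couples (x, y) and (x1, y1): (x + x1, y + y1)."""
    (x, y), (x1, y1) = u, v
    return (x + x1, y + y1)

def mul(u, v):
    """Product of the couples (x, y) and (x1, y1): (x*x1 - y*y1, x*y1 + x1*y)."""
    (x, y), (x1, y1) = u, v
    return (x * x1 - y * y1, x * y1 + x1 * y)

i = (0, 1)                 # the couple represented by i
i_squared = mul(i, i)      # the couple (-1, 0), represented by -1
```

Note that `i_squared` is the couple (-1, 0) and not the real number -1 itself, which is the distinction between theory and convenience drawn above.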
Russell concludes this chapter by providing some other use cases of complex
numbers. Complex numbers can be geometrically interpreted in the
plane, and complex numbers of higher orders have uses in geometry. A complex
number of order n can be defined as a one-many relation whose domain
consists of certain real numbers, and whose converse domain consists
of the integers from 1 to n. This is usually denoted by the notation
\[ (x_1, x_2, x_3, \ldots, x_n), \]
where “the suffixes denote correlation with the integers used as suffixes, and
the correlation is one-many, not necessarily one-one, because $x_r$ and $x_s$ may
be equal when r and s are not equal” [3, 76]. This definition, if accompanied
by a suitable multiplication rule, serves all purposes for which complex
numbers of higher orders are needed.
Russell has thus completed his review on number extensions not involving
infinity.
3 Part II: The Modern Approach
3.1 Outline of the Set Theory needed for the Study of
Numbers
Chapter 1: Sets and Relations
1.1: Cantor’s Concept of a Set
Set and element of a set are primitive or undefined terms, and attempts to
define them are futile, like Euclid’s attempt to define point, line, and plane.
Such attempts, such as those of Cantor, characterize what is called Naive or
Intuitive Set Theory.
We say a set S is any collection of definite, distinguishable objects of our
intuition or of our intellect to be conceived as a whole [4, 2].
Moreover, the objects in a set S are called the elements or members
of S [4, 2].
Assumptions about sets are called axioms, which are statements that are
taken to be true without proof.
1.2: The Basis of Intuitive Set Theory
Axiom. [4, 4]. The intuitive principle of extension states that two sets
are equal if and only if they have the same elements.
This intuitive principle of extension is called by Halmos the Axiom of
Extension [2, 2].
Definition. [4, 5]. The set $\{x\}$, a so-called unit set, is the set whose sole
member is x.
Definition. [4, 5]. A collection of sets is a set whose elements are them-
selves sets.
Axiom. [4, 6]. The intuitive principle of abstraction states that a
formula $P(x)$ defines a set A by the convention that the elements of A are
exactly those objects a such that $P(a)$ is a true statement.
Halmos adopts a restricted form of this principle, the Axiom of
Specification. This axiom states that to every set A and to every condition
$S(x)$ there corresponds a set B whose elements are exactly those elements x
of A for which $S(x)$ holds [2, 6].
Russell’s Paradox is a well-known paradox in set theory that arises from
the unrestricted Intuitive Principle of Abstraction. Using
Russell’s terminology, we define a set A to be normal if $A \notin A$, a set A to
be abnormal if $A \in A$, and the set a to be the set of all normal sets. Russell’s
paradox arises from the question: Is the set a normal or abnormal?
If a is normal, then it belongs to a, the set of all normal sets, so $a \in a$ and
a is abnormal, a contradiction. Conversely, if a is abnormal, then $a \in a$, so a
is one of the normal sets and is therefore normal, again a contradiction.
Russell’s paradox is a consequence of a fundamental issue in naive set
theory, which is that the unrestricted ability to form sets based on arbitrary
properties leads to paradoxes. These paradoxes demonstrate the need for
axiomatic set theory, which provides a rigorous and formal framework for
avoiding such paradoxes. Such an apparatus was formulated by Zermelo as
follows.
Zermelo’s Axioms for Set Theory: There exist sets (denoted A, B,
C, . . . ) and there exist elements of sets (denoted a, b, c, . . . ) satisfying the
following postulates:
Axiom of Extension: If for all $x$, $x \in A$ implies $x \in B$, and $x \in B$ implies $x \in A$, then $A = B$.
Existence of the Null Set: There exists a set $\emptyset$ with no elements.
Axiom of Pairing: If $a$ and $b$ are elements of sets $A$ and $B$ respectively, there exists the set $\{a, b\}$.
Axiom of Infinity: If $A$ and $B$ are sets and $x \in B$ but $x \notin A$, then there exists the set $A \cup \{x\}$.
Axiom of Complements: If $A$ and $B$ are sets, then there exists the set $A - B$ consisting of all elements of $A$ that are not also elements of $B$.
Axiom of Union: If $A$ and $B$ are sets, there exists the set $A \cup B$ of all elements that are either elements of $A$ or of $B$ or of both $A$ and $B$. The strong form of this axiom is: If $\mathcal{B}$ is any collection of sets, there exists the set $\bigcup \mathcal{B}$ consisting of all elements that belong to some member of $\mathcal{B}$.
Axiom of Intersection: If $A$ and $B$ are sets, there exists the set $A \cap B$ of all elements that are elements both of $A$ and of $B$. The strong form of this axiom is: If $\mathcal{B}$ is any collection of sets, there exists the set $\bigcap \mathcal{B}$ consisting of all elements that belong to every member of $\mathcal{B}$.
Axiom of the Power Set: If $X$ is a set, there exists the set $\mathcal{P}(X)$, the set of all subsets of $X$.
Axiom of Separation (or Abstraction or Specification): If $X$ is a set, and if $\varphi$ is a property that an element of $X$ either has or does not have, then there exists the subset of $X$ consisting of all elements of $X$ that have property $\varphi$.
Axiom of the Cartesian Product: If $A$ and $B$ are sets, there exists the set $A \times B$ consisting of all elements $\{\{a\}, \{a, b\}\}$ where $a$ is an element of $A$ and $b$ is an element of $B$. The strong form of this axiom follows from the Axiom of Choice. (See below.)
Axiom of Foundation: Every non-empty set $A$ has an element $x$ such that $A \cap x = \emptyset$. (This axiom rules out the existence of “the set of all sets.”)
Once one has defined what a function f from a set A to a set B is, one
may state the Axioms of Replacement and of Choice.
Axiom of Replacement: If $A$ and $B$ are sets and $f$ is a function from $A$ to $B$, then there exists the set $f(A)$.
Axiom of Choice: If $\mathcal{B}$ is any collection of nonempty sets, then there exists a set $B$ consisting of one element each chosen in any way from every set in $\mathcal{B}$.
1.3: Inclusion
Definition. [4, 10]. If $A$ and $B$ are sets, then $A$ is included in $B$, symbolized by
$A \subseteq B$,
if and only if each element of $A$ is an element of $B$. This is synonymous with saying $A$ is a subset of $B$. Moreover, this is the same as saying $B$ includes $A$, symbolized by
$B \supseteq A$.
Hence, $A \subseteq B$ and $B \supseteq A$ each means that, for all $x$, if $x \in A$, then $x \in B$.
Definition. [4, 10]. The set $A$ is properly included in $B$, symbolized by
$A \subset B$
(or, alternatively, $A$ is a proper subset of $B$, and $B$ properly includes $A$), if and only if $A \subseteq B$ and $A \neq B$.
Definition. [4, 11]. The intuitive principle of extension implies that there can be only one set with no elements. We call this set the empty set and symbolize it by
$\emptyset$.
The existence of the empty set is really an axiom.
Axiom: There exists the “empty set” $\emptyset$ containing no elements.
Axiom. [4, 11]. The set of all subsets of a set $A$ is the power set of $A$, symbolized by
$\mathcal{P}(A)$.
Thus, $\mathcal{P}(A)$ is an abbreviation for
$\{B \mid B \subseteq A\}$.
The existence of the power set is called by Halmos the Axiom of Powers. This axiom states that there exists a collection of sets that contains among its elements all the subsets of a given set [2, 10].
Axiom. [2, 9]. Axiom of Pairing. For any two sets there exists a set that
they both belong to.
1.4: Operations on Sets
Axiom. [4, 12]. The union (sum, join) of the sets $A$ and $B$, symbolized by $A \cup B$ and read “$A$ union $B$” or “$A$ cup $B$,” is the set of all objects which are elements of either $A$ or $B$; that is,
$A \cup B = \{x \mid x \in A \text{ or } x \in B\}$.
The existence of the union is called by Halmos the Axiom of Unions. This
axiom states that for every collection of sets there exists a set that contains
all the elements that belong to at least one set of the given collection [2, 12].
Zermelo’s Axiom. If $A$ is a set and $x \notin A$, then there exists a set $A \cup \{x\}$.
This postulate allowed Zermelo to present an alternative definition of the
natural numbers to that given by Peano. Zermelo’s definition of the natural
numbers is:
• 0 is the empty set $\emptyset$.
• 1 is the set $\{\emptyset\}$.
• 2 is the set $\{\emptyset, \{\emptyset\}\}$.
• 3 is the set $\{\emptyset, \{\emptyset\}, \{\emptyset, \{\emptyset\}\}\}$.
• etc.
The problem Russell sees with this definition is the “etc.” (the symbol $\vdots$), which means “we repeat this process indefinitely.”
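The construction listed above can be sketched in Python with immutable `frozenset` values standing in for sets. This is only an illustration: the helper names `successor` and `numeral` are hypothetical, and the loop inside `numeral` already presupposes the ordinary integers, which is the very circularity behind Russell's objection to “etc.”

```python
def successor(n: frozenset) -> frozenset:
    """Return n ∪ {n}: adjoin the set n to itself as one new element."""
    return n | frozenset([n])


def numeral(k: int) -> frozenset:
    """Build the set standing for k by applying `successor` k times to the empty set."""
    n = frozenset()
    for _ in range(k):  # "repeat this process": the step Russell questions
        n = successor(n)
    return n


zero, one, two, three = (numeral(k) for k in range(4))
```

In this sketch each numeral has exactly as many elements as the number it represents, and each earlier numeral is an element of every later one.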
Definition. [4, 13]. The intersection of the sets $A$ and $B$, symbolized by $A \cap B$ and read “$A$ intersection $B$,” is the set of all objects which are elements of both $A$ and $B$; that is,
$A \cap B = \{x \mid x \in A \text{ and } x \in B\}$.
The existence of the intersection is called by Halmos the Axiom of Inter-
sections. This axiom states that for every collection of sets there exists a set
that contains all the elements that belong to all sets of the given collection.
Definition. [4, 13]. Two sets $A$ and $B$ are disjoint if and only if $A \cap B = \emptyset$.
Definition. [4, 13]. Two sets $A$ and $B$ intersect if and only if $A \cap B \neq \emptyset$.
Definition. [4, 13]. A collection of sets is a disjoint collection if and only if each distinct pair of its element sets is disjoint.
Definition. [4, 13]. A partition of a set $X$ is a disjoint collection $\alpha$ of nonempty and distinct subsets of $X$ such that each element of $X$ is an element of some (and, hence, exactly one) element of $\alpha$.
Definition. [4, 13]. The relative complement of $A$ with respect to a set $X$ is $X \cap \overline{A}$; this is usually shortened to $X - A$, read “$X$ minus $A$.” Thus,
$X - A = \{x \in X \mid x \notin A\}$,
that is, the set of the elements of $X$ which are not elements of $A$.
The existence of the complement of $A$ in $X$ is really an axiom:
Axiom: If $A$ is a set and $A \subseteq X$, then there exists the complement of $A$ in $X$ containing all elements of $X$ not in $A$.
Definition. [4, 13]. The symmetric difference of sets $A$ and $B$, symbolized by $A + B$, is defined as follows:
$A + B = (A - B) \cup (B - A)$.
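The operations defined in this section can be tried out on small finite sets. The following is a minimal sketch using Python's built-in `set` type; the particular sets `X`, `A`, and `B` are arbitrary choices made for illustration.

```python
X = {1, 2, 3, 4, 5}   # an ambient set containing A and B
A = {1, 2, 3}
B = {3, 4}

union = A | B                              # A ∪ B
intersection = A & B                       # A ∩ B
relative_complement = X - A                # X − A
symmetric_difference = (A - B) | (B - A)   # A + B, as defined above
```

Python also offers `A ^ B` as a built-in symmetric difference, which agrees with the definition used here.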
Definition. [4, 13]. If all sets under consideration in a certain discussion are
subsets of a set U , then U is called the universal set (for that discussion).
1.5: The Algebra of Sets
Definition. [4, 17]. The basic ingredients of the algebra of sets are various identities: equations which are true whatever the universal set $U$ and no matter what particular subsets the letters (other than $U$ and $\emptyset$) represent.
Theorem (5.1). [4, 17-18] For any subsets $A$, $B$, $C$ of a set $U$ the following equations are identities. Here, $\overline{A}$ is an abbreviation for $U - A$.
$A \cup (B \cup C) = (A \cup B) \cup C$. (1)
$A \cap (B \cap C) = (A \cap B) \cap C$. (1')
$A \cup B = B \cup A$. (2)
$A \cap B = B \cap A$. (2')
$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$. (3)
$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$. (3')
$A \cup \emptyset = A$. (4)
$A \cap U = A$. (4')
$A \cup \overline{A} = U$. (5)
$A \cap \overline{A} = \emptyset$. (5')
Identities 1 and 1' are referred to as the associative laws for union and intersection, respectively, and identities 2 and 2' as the commutative laws for these operations. Identities 3 and 3' are the distributive laws for unions and intersection, respectively. Notice that each member of a pair is obtainable from the other member by interchanging $\cup$ and $\cap$ and, simultaneously, $\emptyset$ and $U$.
Definition. [4, 18]. An equation, or an expression, or a statement within the framework of the algebra of sets obtained from another by interchanging $\cup$ and $\cap$ along with $\emptyset$ and $U$ throughout is the dual of the original.
Definition. [4, 19]. Accepting the fact that every theorem of the algebra of sets is deducible from 1-5 and 1'-5', we then obtain the principle of duality for the algebra of sets: If $T$ is any theorem expressed in terms of $\cup$, $\cap$, and $\overline{\phantom{A}}$ (complementation), then the dual of $T$ is also a theorem.
Theorem (5.2). [4, 19-20]. For all subsets A and B of a set U , the following
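statements are valid. Here, $\overline{A}$ is an abbreviation for $U - A$.
If, for all $A$, $A \cup B = A$, then $B = \emptyset$. (6)
If, for all $A$, $A \cap B = A$, then $B = U$. (6')
If $A \cup B = U$ and $A \cap B = \emptyset$, then $B = \overline{A}$. (7, 7')
$\overline{\overline{A}} = A$. (8, 8')
$\overline{\emptyset} = U$. (9)
$\overline{U} = \emptyset$. (9')
$A \cup A = A$. (10)
$A \cap A = A$. (10')
$A \cup U = U$. (11)
$A \cap \emptyset = \emptyset$. (11')
$A \cup (A \cap B) = A$. (12)
$A \cap (A \cup B) = A$. (12')
$\overline{A \cup B} = \overline{A} \cap \overline{B}$. (13)
$\overline{A \cap B} = \overline{A} \cup \overline{B}$. (13')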
10 and 10' are the idempotent laws, 12 and 12' are the absorption laws, and 13 and 13' the DeMorgan laws. The identities 7, 7' and 8, 8' are each numbered twice to emphasize that each is unchanged by the operation which converts it into its dual; such formulas are called self-dual.
Theorem (5.3). [4, 20]. The following statements about sets $A$ and $B$ are equivalent to one another.
1. $A \subseteq B$.
2. $A \cap B = A$.
3. $A \cup B = B$.
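Identities of this kind can be checked exhaustively on a small universal set. The sketch below is not a proof; it merely tests the absorption laws, the DeMorgan laws, and the equivalences of Theorem 5.3 over every pair of subsets of a three-element $U$. The helper names `power_set` and `complement` are hypothetical.

```python
from itertools import combinations


def power_set(u):
    """Return all subsets of u as a list of frozensets."""
    items = list(u)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


U = frozenset({1, 2, 3})
subsets = power_set(U)


def complement(a):
    """Complement relative to the universal set U."""
    return U - a


pairs = [(a, b) for a in subsets for b in subsets]

absorption_holds = all(
    (a | (a & b)) == a and (a & (a | b)) == a for a, b in pairs
)
de_morgan_holds = all(
    complement(a | b) == (complement(a) & complement(b))
    and complement(a & b) == (complement(a) | complement(b))
    for a, b in pairs
)
# Theorem 5.3: the three statements are true or false together.
theorem_5_3_holds = all(
    (a <= b) == ((a & b) == a) and ((a & b) == a) == ((a | b) == b)
    for a, b in pairs
)
```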
1.6: Relations
Definition. [4, 24]. The ordered pair of $x$ and $y$, symbolized by
$(x, y)$,
is the set
$\{\{x\}, \{x, y\}\}$,
that is, the two-element set one of whose elements, $\{x, y\}$, is the unordered pair involved, and the other, $\{x\}$, determines which element of this unordered pair is to be considered as being “first.”
Definition. [4, 24]. We call $x$ the first coordinate and $y$ the second coordinate of the ordered pair $(x, y)$.
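The set-theoretic definition of the ordered pair can be modelled directly with nested `frozenset` values. In this sketch the function names `ordered_pair` and `first` are hypothetical; `first` relies on the observation that the first coordinate is the one object belonging to every element of the pair.

```python
def ordered_pair(x, y) -> frozenset:
    """Return the set {{x}, {x, y}} representing (x, y)."""
    return frozenset([frozenset([x]), frozenset([x, y])])


def first(p: frozenset):
    """Recover the first coordinate: the element common to all members of p."""
    return next(iter(frozenset.intersection(*p)))
```

Note that when $x = y$ the pair collapses to $\{\{x\}\}$, since $\{x, x\} = \{x\}$.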
Definition. [4, 25]. The ordered triple of $x$, $y$, $z$, symbolized by $(x, y, z)$, is defined to be the ordered pair $((x, y), z)$.
Definition. [4, 25]. Assuming that ordered $(n-1)$-tuples have been defined, we take the ordered $n$-tuple of $x_1, x_2, \cdots, x_n$, symbolized by $(x_1, x_2, \cdots, x_n)$, to be $((x_1, x_2, \cdots, x_{n-1}), x_n)$.
Definition. [4, 25]. A binary relation is a set of ordered pairs, that is, a set each of whose elements is an ordered pair.
Definition. [4, 25]. If $R$ is a relation, we write $(x, y) \in R$ and $xRy$ interchangeably, and we say that $x$ is $R$-related to $y$.
Definition. [4, 26]. If $R$ is a relation, then the domain of $R$ is
$\{x \mid \text{for some } y, (x, y) \in R\}$.
Definition. [4, 26]. If $R$ is a relation, then the range of $R$ is
$\{y \mid \text{for some } x, (x, y) \in R\}$.
Definition. [4, 26]. The Cartesian product, denoted $X \times Y$, is the set of all pairs $(x, y)$, such that $x$ is an element of some fixed set $X$ and $y$ is an element of some fixed set $Y$. Thus,
$X \times Y = \{(x, y) \mid x \in X \text{ and } y \in Y\}$.
Definition. [4, 26]. If $R$ is a relation and $R \subseteq X \times Y$, then $R$ is referred to as a relation from $X$ to $Y$.
Definition. [4, 26]. A relation from $Z$ to $Z$ will be called a relation in $Z$.
Definition. [4, 26]. If $X$ is a set, then $X \times X$ is a relation in $X$ which we shall call the universal relation in $X$; this is a suggestive name, since, for each pair $x$, $y$ of elements in $X$, we have $x(X \times X)y$. At the other extreme is the void relation in $X$, consisting of the empty set. Intermediate is the identity relation in $X$, symbolized by $\iota$ or $\iota_X$, which is $\{(x, x) \mid x \in X\}$.
Definition. [4, 27]. If $R$ is a relation and $A$ is a set, then $R(A)$ is defined by
$R(A) = \{y \mid \text{for some } x \text{ in } A, xRy\}$.
This set is suggestively called the set of $R$-relatives of elements of $A$.
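As an illustration of the preceding definitions, a finite relation can be stored as a Python set of 2-tuples (Python's native tuples are used here instead of the set-theoretic encoding of pairs). The function names and the sample relation `R` are hypothetical.

```python
def domain(R):
    """All x such that (x, y) is in R for some y."""
    return {x for (x, _) in R}


def range_(R):
    """All y such that (x, y) is in R for some x."""
    return {y for (_, y) in R}


def relatives(R, A):
    """R(A): all y that are R-related to some x in A."""
    return {y for (x, y) in R if x in A}


R = {(1, "a"), (1, "b"), (2, "b"), (3, "c")}
```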
1.7: Equivalence Relations
Definition. [4, 29]. A relation R in a set X is reflexive if xRx for each x
in X.
Definition. [4, 29]. A relation R in a set X is symmetric if xRy implies
yRx.
Definition. [4, 29]. A relation R in a set X is transitive if xRy and yRz
implies xRz.
Definition. [4, 29]. A relation in a set is an equivalence relation if it is
reflexive, symmetric, and transitive.
Definition. [4, 30]. If R is an equivalence relation on the set X, then a
subset A of X is an equivalence class (R-equivalence class) if and only if
there is a member x of A such that A is equal to the set of all y for which
xRy.
Theorem (7.1). [4, 30]. Let R be an equivalence relation on X. Then the
collection of distinct R-equivalence classes is a partition of X. Conversely,
if P is a partition of X, and a relation R is defined by aRb if and only if
there exists $A$ in $P$ such that $a, b \in A$, then $R$ is an equivalence relation on
X. Moreover, if an equivalence relation R determines the partition P of X,
then the equivalence relation defined by P is equal to R. Conversely, if a
partition P of X determines the equivalence relation R, then the partition
of X defined by R is equal to P .
Definition. [4, 29]. The relation of congruence modulo $n$ in $\mathbb{Z}$ is defined for a non-zero integer $n$ as follows: $x$ is congruent to $y$, symbolized $x \equiv y \pmod{n}$, if and only if $n$ divides $x - y$. This relation is an equivalence relation on $\mathbb{Z}$.
Definition. [4, 31]. A class of congruent numbers is often called a residue
class modulo n.
Definition. [4, 31]. If $R$ is an equivalence relation on $X$, we shall denote the partition of $X$ induced by $R$ by $X/R$ (read “$X$ modulo $R$”) and call it the quotient set of $X$ by $R$.
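The passage from an equivalence relation to its quotient set can be illustrated on a finite stand-in for $\mathbb{Z}$. The sketch below assumes that the supplied relation really is an equivalence relation (it does not check this), and the name `quotient_set` and the sample range are hypothetical choices.

```python
def quotient_set(X, related):
    """Group the elements of X into classes of `related`, assumed to be an equivalence relation."""
    classes = []
    for x in X:
        for cls in classes:
            # By transitivity it suffices to compare x with one representative.
            if related(x, next(iter(cls))):
                cls.add(x)
                break
        else:
            classes.append({x})
    return [frozenset(c) for c in classes]


X = range(-6, 7)   # finite stand-in for the integers
n = 3
classes = quotient_set(X, lambda x, y: (x - y) % n == 0)
```

For congruence modulo 3 this yields three residue classes, which are pairwise disjoint and together exhaust the sample range, in line with Theorem 7.1.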
Theorem (7.2). [4, 32]. A relation $R$ is an equivalence relation if and only if there exists a disjoint collection $P$ of nonempty sets such that
$R = \{(x, y) \mid \text{for some } C \text{ in } P, (x, y) \in C \times C\}$.
1.8: Functions
Definition. [4, 34]. A function is a relation such that no two distinct elements have the same first coordinate. Hence, $f$ is a function if and only if it satisfies the following conditions:
1. The members of $f$ are ordered pairs.
2. If $(x, y)$ and $(x, z)$ are elements of $f$, then $y = z$.
Synonyms for the word “function” are numerous and include transforma-
tion, map or mapping, correspondence, and operator.
Definition. [4, 34]. If $f$ is a function and $(x, y) \in f$, so that $xfy$, then $x$ is an argument of $f$.
Definition. [4, 34]. If $f$ is a function and $(x, y) \in f$, so that $xfy$, there is a great variety of terminology for $y$; for example, the value of $f$ at $x$, the image of $x$ under $f$, the element into which $f$ carries $x$. There are also various symbols for $y$: $xf$, $f(x)$ (or, more simply, $fx$), $x^f$.
Definition. [4, 35]. A function $f$ with range $R_f$ is into $Y$ if and only if $R_f \subseteq Y$, and $f$ is onto $Y$ if and only if $R_f = Y$.
Definition. [4, 35]. As corresponding terminology for the domain of a function, we shall say that $f$ is on $X$ when the domain of $f$ is $X$. The symbols
$f : X \to Y$ and $X \xrightarrow{f} Y$
are commonly used to signify that $f$ is a function on the set $X$ into the set $Y$.
Definition. [4, 36]. If $f : X \to Y$, and if $A \subseteq X$, then $f \cap (A \times Y)$ is a function on $A$ into $Y$ (called the restriction of $f$ to $A$ and abbreviated $f|A$). Explicitly, $f|A$ is the function on $A$ such that $(f|A)(a) = f(a)$ for $a$ in $A$.
Definition. [4, 36]. Complementary to the definition of a restriction, the function $f$ is an extension of a function $g$ if and only if $g \subseteq f$.
Definition. [4, 36]. We denote the identity map on $X$ as $i_X$.
Definition. [4, 36]. If $A \subseteq X$, then $i_X|A = i_A$. If $i_X|A$ is considered as a function on $A$ into $X$, then it is the injection mapping on $A$ into $X$.
Definition. [4, 36]. A function is called one-to-one if it maps distinct elements onto distinct elements. Symbolically, a function $f$ is one-to-one if and only if
$x_1 \neq x_2$ implies $f(x_1) \neq f(x_2)$.
That is,
$f(x_1) = f(x_2)$ implies $x_1 = x_2$.
Because of the symmetrical situation that a one-to-one map on $X$ onto $Y$ portrays, it is often called a one-to-one correspondence between $X$ and $Y$.
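Since a function is here just a set of ordered pairs, both defining conditions can be tested mechanically on finite examples. The following sketch stores a relation as a Python set of 2-tuples; the names `is_function` and `is_one_to_one` and the sample relations are hypothetical.

```python
def is_function(f) -> bool:
    """True when no two distinct pairs of f share a first coordinate."""
    firsts = [x for (x, _) in f]
    return len(firsts) == len(set(firsts))


def is_one_to_one(f) -> bool:
    """True when f is a function and no two distinct pairs share a second coordinate."""
    seconds = [y for (_, y) in f]
    return is_function(f) and len(seconds) == len(set(seconds))


square = {(x, x * x) for x in range(-2, 3)}   # a function, but not one-to-one
double = {(x, 2 * x) for x in range(-2, 3)}   # a one-to-one function
not_a_function = {(1, 2), (1, 3)}             # 1 is paired with two values
```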
Definition. [4, 36]. Introducing the notation $X^n$ for the set of all $n$-tuples $(x_1, x_2, \cdots, x_n)$, where each $x_i$ is a member of the set $X$, a function whose domain is $X^n$ and whose range is included in $X$ is an $n$-ary operation in $X$. In place of “1-ary” we shall say “unary”; in place of “2-ary,” we shall say “binary.”
1.9: Composition and Inversion for Functions
Definition. [4, 38-39]. The composite of functions $f$ and $g$, symbolized
$g \circ f$,
is the set
$\{(x, z) \mid \text{there is a } y \text{ such that } xfy \text{ and } ygz\}$.
This relation is a function, and this operation for functions is called (functional) composition.
Definition. [4, 41]. If $f$ is one-to-one, the function resulting from $f$ by interchanging the coordinates of members of $f$ is called the inverse function of $f$, symbolized
$f^{-1}$.
This operation, which is defined only for one-to-one functions, is called (functional) inversion.
Definition. [4, 42]. A set of the form $f^{-1}[A]$ we call the inverse or counter image of $A$ under $f$.
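Composition, inversion, and inverse images can be written almost verbatim as set comprehensions over finite sets of pairs. This is a sketch only; the names `compose`, `inverse`, and `inverse_image` and the sample functions `f` and `g` are hypothetical.

```python
def compose(g, f):
    """g ∘ f: all (x, z) for which some y satisfies (x, y) in f and (y, z) in g."""
    return {(x, z) for (x, y) in f for (y2, z) in g if y == y2}


def inverse(f):
    """Interchange the coordinates of every member of f."""
    return {(y, x) for (x, y) in f}


def inverse_image(f, A):
    """All arguments x whose value under f lies in A."""
    return {x for (x, y) in f if y in A}


f = {(1, "a"), (2, "b"), (3, "c")}
g = {("a", 10), ("b", 20), ("c", 30)}
```

For a one-to-one `f`, composing `inverse(f)` with `f` gives the identity relation on the domain of `f`.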
1.10: Operations for Collections of Sets
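Axiom. [4, 43]. Let $\mathcal{A}$ be a collection of sets. The union of $\mathcal{A}$ is the set of all objects $x$ such that $x$ belongs to at least one set of the collection $\mathcal{A}$. That is, it is
$\{x \mid x \in X \text{ for some } X \text{ in } \mathcal{A}\}$.
This set is symbolized by
$\bigcup \mathcal{A}$ or $\bigcup \{X \mid X \in \mathcal{A}\}$ or $\bigcup_{X \in \mathcal{A}} X$.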
As in the case of the union of two sets, the existence of the union of an
arbitrary number of sets (possibly infinite) is really an axiom.
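Axiom. [4, 44]. The intersection of a nonempty collection $\mathcal{A}$ of sets is the set of all objects $x$ such that $x$ belongs to every set of the collection $\mathcal{A}$. That is, it is
$\{x \mid x \in X \text{ for all } X \text{ in } \mathcal{A}\}$.
This set is symbolized by
$\bigcap \mathcal{A}$ or $\bigcap \{X \mid X \in \mathcal{A}\}$ or $\bigcap_{X \in \mathcal{A}} X$.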
Similarly to the situation with regard to arbitrary unions, the existence of
the intersection of an arbitrary, possibly infinite, collection of sets is really
an axiom.
Definition. [4, 45]. Let $y$ be a function on a set $I$ into a set $Y$. Let us call an element $i$ of the domain $I$ an index; $I$ itself an index set; the range of $y$ an indexed set; and the function $y$ itself a family. We shall denote the value of $y$ at $i$ by $y_i$ and call $y_i$ the $i$th coordinate of the family. Thereby, we may write
$y = \{(i, y_i) \in I \times Y \mid i \in I\}$.
Once we shall have defined the natural numbers we will be able to make
the following definition:
Definition. [4, 45]. A sequence is a family on the set of positive (or, nonnegative) integers into a set $Y$. That is, a sequence is a function for which $\{1, 2, \cdots, n, \cdots\}$ or $\{0, 1, \cdots, n, \cdots\}$ serves as an index set.
Definition. [4, 45]. By the phrase “a family $\{A_i\}$ of subsets of $U$” we shall understand a function $A$ on some set $I$ of indices into $\mathcal{P}(U)$. The union of the range of such a family is called the union of the family $\{A_i\}$ or the union of the sets $A_i$. The standard notation for it is
$\bigcup \{A_i \mid i \in I\}$ or $\bigcup_{i \in I} A_i$ or $\bigcup_i A_i$.
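Theorem (10.1). [4, 46]. Let $\{A_i\}$ with $i \in I$ be a family of subsets of $U$ and let $B \subseteq U$. Then
$B \cap \bigcup_i A_i = \bigcup_i (B \cap A_i)$ and $B \cup \bigcap_i A_i = \bigcap_i (B \cup A_i)$. (1)
$U - \bigcup_i A_i = \bigcap_i (U - A_i)$ and $U - \bigcap_i A_i = \bigcup_i (U - A_i)$. (2)
If $J$ is a subset of $I$, then
$\bigcup_{j \in J} A_j \subseteq \bigcup_{i \in I} A_i$ and $\bigcap_{j \in J} A_j \supseteq \bigcap_{i \in I} A_i$. (3)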
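The generalized distributive and DeMorgan laws of this theorem can be spot-checked on one small family. The sketch below is an illustration on arbitrary sample data, not a proof; the index names, the helpers `big_union` and `big_intersection`, and the sets chosen are all hypothetical.

```python
from functools import reduce

U = frozenset(range(10))
family = {  # a family {A_i} with hypothetical index set I = {"i", "j", "k"}
    "i": frozenset({0, 1, 2, 3}),
    "j": frozenset({2, 3, 4, 5}),
    "k": frozenset({3, 5, 7}),
}
B = frozenset({1, 3, 5, 9})


def big_union(sets):
    """Union of a list of subsets of U."""
    return reduce(frozenset.union, sets, frozenset())


def big_intersection(sets):
    """Intersection of a list of subsets of U (taken relative to U)."""
    return reduce(frozenset.intersection, sets, U)


members = list(family.values())
law_1a = (B & big_union(members)) == big_union([B & a for a in members])
law_1b = (B | big_intersection(members)) == big_intersection([B | a for a in members])
law_2a = (U - big_union(members)) == big_intersection([U - a for a in members])
law_2b = (U - big_intersection(members)) == big_union([U - a for a in members])

# Statement (3) for the subfamily indexed by J = {"i", "j"}.
sub = [family["i"], family["j"]]
law_3 = big_union(sub) <= big_union(members) and big_intersection(sub) >= big_intersection(members)
```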
Axiom. [4, 47]. If $\{A_i\}$ with $i \in I$ is a family of sets, then the Cartesian product of the family, in symbols
$\bigtimes \{A_i \mid i \in I\}$ or $\bigtimes_{i \in I} A_i$ or $\bigtimes_i A_i$,
is the set of all families $\{a_i\}$ with domain $I$ and such that $a_i \in A_i$ for each $i \in I$.
The existence of the Cartesian product of an arbitrary (possibly infinite)
collection of sets is called by Halmos the Axiom of Choice. This axiom
states that the Cartesian product of a non-empty family of non-empty sets
is non-empty [2, 59].
Definition. [4, 47]. Let $\{A_i\}$ with $i \in I$ be a family of sets and let $A$ be its Cartesian product. If $J$ is a subset of $I$, then there is a natural correspondence of the elements of $A$ with those of $\bigtimes_{i \in J} A_i$. To formulate this explicitly, we use the fact that an element $a$ of $A$ is a family $\{a_i\}$ with $I$ as domain. Then the element $b$, let us say, of $\bigtimes_{i \in J} A_i$ which is the natural correspondent of $a$ is the restriction of $a$ to $J$. We shall write $b_i$ for $a_i$ when $i \in J$. The function on $A$ whose value at $a$ is $b$ is called the projection on $A$ onto $\bigtimes_{i \in J} A_i$. If $J = \{j\}$ and $p_j$ is the projection on $A$ onto $A_j$, then $p_j(a) = a_j$, which is called the $j$-coordinate of $a$.
Definition. [4, 48]. A relation $R$ in $X$ is antisymmetric if and only if for each $x$ and $y$ in $X$ the validity of $xRy$ and $yRx$ implies that $x = y$.
Definition. [4, 48]. A partial ordering in a set $X$ is a reflexive, antisymmetric, and transitive relation in $X$. Note that it is customary to designate partial orderings by the symbol $\leq$.
Definition. [4, 49]. A relation $R$ partially orders a set $Y$ if and only if $R \cap (Y \times Y)$ is a partial ordering in $Y$.
Definition. [4, 49]. If the relation $\leq$ partially orders $X$, and $x$ and $y$ are elements of $X$, it may or may not be the case that $x \leq y$. If it is not, we write $x \nleq y$. Additionally, we abbreviate “$x \leq y$ and $x \neq y$” to “$x < y$” and say $x$ is less than $y$, or $x$ precedes $y$, or $y$ is greater than $x$. When it is convenient, we use $y \geq x$ and $y > x$ as alternatives for $x \leq y$ and $x < y$, respectively.
Definition. [4, 49]. A relation $R$ in $X$ is irreflexive (Russell’s “aliorelative”) if and only if for no $x$ in $X$ is $xRx$.
Definition. [4, 50]. A relation $R$ is a total (or simple or linear) ordering if and only if it is a partial ordering such that $xRy$ or $yRx$ whenever $x$ and $y$ are distinct members of the domain (which is equal to the range) of $R$.
Definition. [4, 50]. A relation $R$ totally orders a set $Y$ if and only if $R \cap (Y \times Y)$ is a total ordering in $Y$.
Definition. [4, 50]. A partially ordered set is an ordered pair $(X, \leq)$ such that $\leq$ partially orders $X$.
Definition. [4, 50]. A totally ordered set or chain is an ordered pair $(X, \leq)$ such that $\leq$ totally orders $X$.
Definition. [4, 52]. A function $f : X \to X'$ is order-preserving (isotone) relative to an ordering $\leq$ for $X$ and an ordering $\leq'$ for $X'$ if and only if $x \leq y$ implies $f(x) \leq' f(y)$.
Definition. [4, 52]. An isomorphism between the partially ordered sets $(X, \leq)$ and $(X', \leq')$ is a one-to-one correspondence between $X$ and $X'$ such that both it and its inverse are order-preserving. (Note that if we do not have an inverse function that can reverse an order-preserving function between $X$ and $X'$, then we call such a function a homomorphism.) If such a correspondence exists, then one partially ordered set is an isomorphic image of the other, or, more simply, the two partially ordered sets are isomorphic.
Definition. [4, 53]. The least element of a set $X$ relative to a partial ordering $\leq$ is a $y$ in $X$ such that $y \leq x$ for all $x$ in $X$. If such an element exists, then it is unique, so one should speak of the least element of $X$.
Definition. [4, 53]. A minimal element of a set $X$ relative to $\leq$ is a $y$ in $X$, such that for no $x$ in $X$ is $x < y$. Such an element is not necessarily unique.
Definition. [4, 53]. The greatest element of a set $X$ relative to $\leq$ is a $y$ in $X$ such that $x \leq y$ for all $x$ in $X$. If such an element exists, then it is unique, so one should speak of the greatest element of $X$.
Definition. [4, 53]. A maximal element of a set $X$ relative to $\leq$ is a $y$ in $X$, such that for no $x$ in $X$ is $x > y$. Such an element is not necessarily unique.
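The difference between a least element and a minimal element shows up already in a small partially ordered set. The sketch below uses divisibility on the hypothetical sample set $\{2, 3, 4, 6, 12\}$; the function names are likewise hypothetical.

```python
def divides(a: int, b: int) -> bool:
    """The partial ordering used here: a ≤ b means a divides b."""
    return b % a == 0


def minimal_elements(X, leq):
    """All y in X such that no x in X lies strictly below y."""
    return {y for y in X if not any(leq(x, y) and x != y for x in X)}


def least_element(X, leq):
    """The y in X lying below every x in X, or None when no such y exists."""
    candidates = [y for y in X if all(leq(y, x) for x in X)]
    return candidates[0] if candidates else None


X = {2, 3, 4, 6, 12}


def reverse(a, b):
    """The reversed ordering, which turns minimal into maximal and least into greatest."""
    return divides(b, a)
```

Here 2 and 3 are both minimal but neither is least, whereas 12 is the unique maximal element and is also the greatest element.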
Definition. [4, 53]. A partially-ordered set $(X, \leq)$ is well-ordered if and only if each nonempty subset has a least element.
Definition. [4, 53]. If $(X, \leq)$ is a partially ordered set and $A \subseteq X$, then an element $x$ in $X$ is an upper bound of $A$ if and only if $a \leq x$ for all $a$ in $A$. Similarly, an element $x$ in $X$ is a lower bound of $A$ if and only if $x \leq a$ for all $a$ in $A$. A set may have many upper and lower bounds.
Lemma. [2, 62]. Zorn’s Lemma. If X is a partially ordered set such that
every chain in X has an upper bound, then X contains a maximal element.
Definition. [4, 53]. An element $x$ in $X$ is a least upper bound or supremum for $A$, denoted $\sup A$, if and only if $x$ is an upper bound for $A$ and $x \leq y$ for any upper bound $y$ for $A$. Hence, a supremum is an upper bound that is a lower bound for the set of all upper bounds.
Definition. [4, 53]. An element $x$ in $X$ is a greatest lower bound or infimum for $A$, denoted $\inf A$, if and only if $x$ is a lower bound for $A$ and $x \geq w$ for any lower bound $w$ for $A$. Hence, an infimum is a lower bound that is an upper bound for the set of all lower bounds.
Chapter 2: The Natural Number Sequence
and Its Generalizations
2.1: The Natural Number Sequence
The rigorous study of the natural numbers began with Peano. The central
importance of his postulates is evident from the fact that Russell begins his
Introduction to Mathematical Philosophy by discussing them.
Stoll’s naive approach to the natural numbers system N is as follows.
Definition. [4, 57]. The successor function, denoted ${}'$ (prime), is a primitive function, used in generating the natural numbers. We start with an initial object $0$ (zero) and say that the successor of any object $n$ already generated is another uniquely determined object $n'$. Therefore, the natural numbers $\mathbb{N}$ appear as a set of objects
$0, 0', (0')', ((0')')', (((0')')')', \cdots$
or, more simply,
$0, 0', 0'', 0''', \cdots$.
The transition to the usual notation is made upon introducing
$0, 1, 2, 3, \cdots$
to stand for
$0, 0', 0'', 0''', \cdots$,
and then employing decimal notation. The remainder of the description of this function can be expressed in two properties:
1. ${}'$ is a one-to-one mapping on $\mathbb{N}$ into $\mathbb{N} \setminus \{0\}$.
2. if $M$ is a subset of $\mathbb{N}$ such that $0 \in M$ and $m' \in M$ whenever $m \in M$, then $M = \mathbb{N}$.
Property (2) is the basis for the principle of induction (or the principle of weak induction).
This approach is cleaned up a bit by the following formulation.
Modern Formulation of Peano’s Postulates:
There exists a non-empty set $\mathbb{N}$, called the set of natural numbers.
• Postulate 1: There exists a natural number $0 \in \mathbb{N}$.†
• Postulate 2: There exists a function $S$, called the successor function, from $\mathbb{N}$ to $\mathbb{N} \setminus \{0\}$.
• Postulate 3: The function $S$ is one-to-one, that is, $S(m) = S(n) \implies m = n$.
• Postulate 4: There is no $m \in \mathbb{N}$ such that $S(m) = 0$.
• Postulate 5: The Principle of Mathematical Induction: If $M \subseteq \mathbb{N}$, $0 \in M$, and $n \in M \implies S(n) \in M$, then $M = \mathbb{N}$.
†Although Peano actually used the natural number 1 as the unique non-successor, we
will, for the sake of consistency, use the natural number 0.
Postulates (1) through (4) give us a series of successors, resulting in an
endless series of continually new numbers. And by the mathematical induc-
tion of (5), every number belongs to the series.
From these postulates it was customary to define the addition and mul-
tiplication of natural numbers, as follows.
Definition. Addition, $+$, is the function from $\mathbb{N} \times \mathbb{N}$ into $\mathbb{N}$ defined by
$\forall m, n \in \mathbb{N}: \begin{cases} m + 0 = m, \\ m + S(n) = S(m + n). \end{cases}$
Definition. Multiplication, $\cdot$, is the function from $\mathbb{N} \times \mathbb{N}$ into $\mathbb{N}$ defined by
$\forall m, n \in \mathbb{N}: \begin{cases} m \cdot 0 = 0, \\ m \cdot S(n) = m + (m \cdot n). \end{cases}$
Russell saw various problems with these definitions. For example, are addition and multiplication both well-defined? That is, does a given expression actually denote the number we want it to denote? How do we know, for instance, that $5 + 12 = 10 + 7$?
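The two recursive definitions above translate directly into code. The sketch below models $\mathbb{N}$ by Python's non-negative `int` values and $S$ by adding one, and it uses `n - 1` to recover the predecessor of a non-zero number; these are all assumptions made for illustration, and the function names are hypothetical. Evaluating particular sums does not settle Russell's question of whether the recursion defines a function at all, which is what the Recursion Theorem later addresses.

```python
def S(n: int) -> int:
    """Successor, modelled on Python integers."""
    return n + 1


def add(m: int, n: int) -> int:
    """m + 0 = m;  m + S(n) = S(m + n)."""
    return m if n == 0 else S(add(m, n - 1))


def mul(m: int, n: int) -> int:
    """m · 0 = 0;  m · S(n) = m + (m · n)."""
    return 0 if n == 0 else add(m, mul(m, n - 1))
```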
As a result of Russell’s criticism of the Peano Postulate method of deriv-
ing the natural numbers, the following approach was developed.
Definition. [4, 58]. A triple $(X, g, x_0)$, where $X$ is a set, $g$ is a function on $X$ into $X$ (in other words, a unary operation in $X$), and $x_0$ is an element of $X$, is a unary system.
Definition. [4, 58]. An integral system is a unary system $(X, g, x_0)$ such that
1. $g$ is a one-to-one mapping on $X$ into $X - \{x_0\}$, and
2. if $Y$ is a subset of $X$ such that $x_0 \in Y$ and $yg \in Y$ whenever $y \in Y$, then $Y = X$.
The existence of such an integral system is guaranteed by the Axiom of
Infinity, which Zermelo was the first to recognize as necessary. This axiom
states that there exists a set containing 0 and containing the successor of
each of its elements [2, 44].
Hence, the natural number system of Peano’s Postulates listed above may be summarized by the assertion that $(\mathbb{N}, S, 0)$ is an integral system.
Definition. [4, 59]. Two integral systems $(X, s, x_0)$ and $(Y, t, y_0)$ are isomorphic if there exists a one-to-one correspondence $f$ between $X$ and $Y$ with $f(x_0) = y_0$ and $f(xs) = (fx)t$ for all $x$ in $X$. This means that the elements of $X$ can be paired with those of $Y$ in such a way that successors of corresponding elements correspond.
Definition. [4, 59]. Let $(X, g, x_0)$ be a unary system. The set of descendents of $x_0$ under $g$ (in symbols, $D_g x_0$) is the intersection of all subsets $A$ of $X$ such that $x_0 \in A$ and $xg \in A$ whenever $x \in A$.
Lemma (1.1). [4, 59]. Let $(X, g, x_0)$ be a unary system. Then $D_g x_0$ is the smallest subset of $X$ which contains $x_0$ and which is closed under $g$. Alternatively, $x \in D_g x_0$ if and only if $x = x_0$ or there exists a $y$ in $D_g x_0$ such that $x = yg$.
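For a finite unary system the set of descendents can be computed by starting from $x_0$ and repeatedly applying $g$ until nothing new appears, which matches the alternative characterization in Lemma 1.1. The sketch below represents $g$ as a Python dictionary; the name `descendants` and the sample systems are hypothetical.

```python
def descendants(g: dict, x0):
    """Smallest set containing x0 and closed under g, for a finite unary system."""
    D = {x0}
    frontier = [x0]
    while frontier:
        x = frontier.pop()
        y = g[x]           # the image of x under g
        if y not in D:
            D.add(y)
            frontier.append(y)
    return D


X = set(range(8))
g = {x: (x + 2) % 8 for x in X}   # descendents of 0 are the even residues
h = {x: (x + 3) % 8 for x in X}   # every element is a descendent of 0
```

Neither sample system is an integral system, since in both of them $x_0$ is itself the image of some element; they serve only to illustrate the closure.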
Lemma (1.2). [4, 59]. Let $(X, s, x_0)$ be an integral system and $(Y, t, y_0)$ be a unary system. Define
$s \triangledown t : X \times Y \to X \times Y$ with $(x, y)(s \triangledown t) = (xs, yt)$.
Then $(X \times Y, s \triangledown t, (x_0, y_0))$ is a unary system. If $f$ is the set of descendents of $(x_0, y_0)$ under $s \triangledown t$, then
1. $f$ is a function on $X$ into $Y$,
2. $fx_0 = y_0$ and $f(xs) = (fx)t$ for all $x$ in $X$, and
3. $f$ is uniquely determined by the properties in (2).
This result is called by Halmos the Recursion Theorem. It enables us
to proceed to the non-naive definitions of the addition and multiplication of
natural numbers.
Theorem (1.1). [4, 61]. Any two integral systems are isomorphic.
We now consider the particular integral system $(\mathbb{N}, S, 0)$ of Peano’s Postulates.
Theorem (1.2). [4, 61]. Let $B$ be a nonempty set, $c$ be an element of $B$, and $g$ be a function on $\mathbb{N} \times B$ into $B$. Then there exists exactly one function $k : \mathbb{N} \to B$ such that
$k(0) = c$ and $k(S(n)) = g(n, k(n))$.
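Definition by recursion, as licensed by this theorem, can be sketched as a higher-order function that turns the data $c$ and $g$ into the function $k$. Python integers stand in for $\mathbb{N}$ here, and the names `define_by_recursion`, `factorial`, and `add_to` are hypothetical.

```python
def define_by_recursion(c, g):
    """Return the function k with k(0) = c and k(S(n)) = g(n, k(n))."""
    def k(n: int):
        value = c
        for i in range(n):
            value = g(i, value)   # passes from k(i) to k(S(i))
        return value
    return k


# k(0) = 1 and k(S(n)) = (n + 1) * k(n) gives the factorial function.
factorial = define_by_recursion(1, lambda n, prev: (n + 1) * prev)


def add_to(n: int):
    """The map m ↦ m + n, with k(0) = n and k(S(m)) = S(k(m))."""
    return define_by_recursion(n, lambda m, prev: prev + 1)
```

The `add_to` example recurses on the first argument, in the manner of the addition function $\alpha$ introduced in Theorem 1.4.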
Theorem (1.3). [4, 63]. The relation $\leq$ well-orders $\mathbb{N}$.
Theorem (1.4). [4, 63]. For the integral system $(\mathbb{N}, S, 0)$ there exists exactly one function $\alpha : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ such that
1. for each $n$ in $\mathbb{N}$, $\alpha(0, n) = n$, and
2. for all $m$ and $n$ in $\mathbb{N}$, $\alpha(S(m), n) = S(\alpha(m, n))$.
This function is addition in $\mathbb{N}$. We shall henceforth write $m + n$ instead of $\alpha(m, n)$.
Theorem (1.5). [4, 64]. Addition in $\mathbb{N}$ has the following properties.
1. Associativity. For all $m$, $n$, and $p$ in $\mathbb{N}$,
$m + (n + p) = (m + n) + p$.
2. Commutativity. For all $m$ and $n$ in $\mathbb{N}$, $m + n = n + m$.
3. Cancellation laws. For all $m$, $n$, and $p$ in $\mathbb{N}$, $p + m = p + n$ implies $m = n$ and $m + p = n + p$ implies $m = n$.
4. For all $m$ and $n$ in $\mathbb{N}$, $m \leq n$ if and only if there exists $p$ in $\mathbb{N}$ such that $p + m = n$.
5. For all $m$, $n$, and $p$ in $\mathbb{N}$, $m < n$ if and only if $p + m < p + n$.
6. For all $m$ and $n$ in $\mathbb{N}$, $m + n = 0$ implies $m = 0$ and $n = 0$.
Theorem (1.6). [4, 66]. For the integral system $(\mathbb{N}, S, 0)$ there is exactly one function $\mu : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ such that
1. for each $n$ in $\mathbb{N}$, $\mu(0, n) = 0$, and
2. for all $m$ and $n$ in $\mathbb{N}$, $\mu(S(m), n) = \mu(m, n) + n$.
This function is multiplication in $\mathbb{N}$. We shall henceforth write $mn$ instead of $\mu(m, n)$.
Theorem (1.7). [4, 66]. Multiplication in $\mathbb{N}$ has the following properties.
1. Associativity. For all $m$, $n$, and $p$ in $\mathbb{N}$,
$m(np) = (mn)p$.
2. Commutativity. For all $m$ and $n$ in $\mathbb{N}$, $mn = nm$.
3. Cancellation laws. For all $m$, $n$, and $p$ in $\mathbb{N}$, $p \neq 0$ and $pm = pn$ or $mp = np$ imply $m = n$.
4. Distributivity over addition. For all $m$, $n$, and $p$ in $\mathbb{N}$, $m(n + p) = mn + mp$ and $(n + p)m = nm + pm$.
5. For all $m$, $n$, and $p$ in $\mathbb{N}$, $p \neq 0$ implies that $m < n$ if and only if $pm < pn$.
6. For all $m$ and $n$ in $\mathbb{N}$, $mn = 0$ implies $m = 0$ or $n = 0$ or, what is equivalent, if $m \neq 0$ and $n \neq 0$, then $mn \neq 0$.
Theorem. [4, 68]. Let $(X, s, x_0)$ and $(X^*, s^*, x_0^*)$ be integral systems. Let $+$, $\cdot$, and $\leq$ be the addition, the multiplication, and the ordering relation, respectively, in $X$ which satisfy the earlier definitions. Let $+^*$, $\cdot^*$, and $\leq^*$ be the corresponding relations in $X^*$. Then there exists a one-to-one mapping $f$ on $X$ onto $X^*$ which preserves each of these relations in the following sense:
1. $f(x + y) = f(x) +^* f(y)$,
2. $f(x \cdot y) = f(x) \cdot^* f(y)$,
3. $x \leq y$ if and only if $f(x) \leq^* f(y)$.
2.3: Cardinal Numbers
Definition. [4, 79]. Two sets are similar or equinumerous, symbolized
$A \sim B$,
if and only if there exists a one-to-one correspondence between $A$ and $B$.
Definition. [4, 80]. Recall that the set of all subsets of a set $A$ is the power set of $A$, symbolized by $\mathcal{P}(A)$. Similarity is an equivalence relation on $\mathcal{P}(U)$ and a cardinal number is an equivalence class of this relation. If $A \in \mathcal{P}(U)$, then the cardinal number of $A$, symbolized
$\overline{\overline{A}}$ or card $A$,
is the cardinal number having $A$ as an element. For example, let $U = \{1, 2, 3\}$, so that
$\mathcal{P}(U) = \{\{\}, \{1\}, \{2\}, \{3\}, \{1, 2\}, \{2, 3\}, \{1, 3\}, \{1, 2, 3\}\}$.
Let $A = \{1\}$. The cardinal numbers obtained from $\mathcal{P}(U)$ are
$\{\{\}\}$, $\{\{1\}, \{2\}, \{3\}\}$, $\{\{1, 2\}, \{2, 3\}, \{1, 3\}\}$, and $\{\{1, 2, 3\}\}$.
Since the cardinal number $\{\{1\}, \{2\}, \{3\}\}$ has $A$ as an element, it follows that $\{\{1\}, \{2\}, \{3\}\}$ is the cardinal number of $A$.
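The example above can be reproduced by grouping the subsets of $U$ into similarity classes. In this sketch the test for similarity simply compares sizes with `len`, which is legitimate for finite sets but of course presupposes the numbers being defined; it is a convenience for illustration, and the names `power_set` and `similar` are hypothetical.

```python
from itertools import combinations


def power_set(u):
    """Return all subsets of u as a list of frozensets."""
    items = list(u)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def similar(a, b) -> bool:
    """For finite sets, a one-to-one correspondence exists exactly when the sizes agree."""
    return len(a) == len(b)


U = {1, 2, 3}
A = frozenset({1})

# Partition P(U) into equivalence classes under similarity.
classes = []
for s in power_set(U):
    for cls in classes:
        if similar(s, next(iter(cls))):
            cls.add(s)
            break
    else:
        classes.append({s})

card_A = next(cls for cls in classes if A in cls)   # the class having A as an element
```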
This definition is slightly different from that of Frege and Russell, who
identified the cardinal number $\overline{\overline{M}}$ with the set of all sets similar to M. Both
definitions are satisfactory, however, since it can be successfully argued that
it is irrelevant to know in mathematics what cardinal numbers are, so long
as cardinal numbers have the property $\overline{\overline{A}} = \overline{\overline{B}}$ if and only if $A \sim B$.
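The example with U = {1, 2, 3} can be checked mechanically. The following is a minimal sketch, not taken from Stoll: the function names power_set, similarity_classes, and card are illustrative assumptions, and the sketch relies on the fact that two finite sets are similar exactly when they have the same number of elements.

```python
from itertools import combinations

def power_set(universe):
    """Return all subsets of `universe` as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def similarity_classes(universe):
    """Partition P(universe) into classes of mutually similar subsets.

    Assumption: for finite sets, a one-to-one correspondence exists
    exactly when the sets have equally many elements, so len() decides
    similarity here.
    """
    classes = {}
    for subset in power_set(universe):
        classes.setdefault(len(subset), set()).add(subset)
    return [frozenset(classes[size]) for size in sorted(classes)]

def card(a, universe):
    """Return the cardinal number of `a`: the similarity class containing it."""
    a = frozenset(a)
    for cls in similarity_classes(universe):
        if a in cls:
            return cls
    raise ValueError("a is not a subset of the universe")

U = {1, 2, 3}
cardinals = similarity_classes(U)   # four classes, one per size 0..3
card_A = card({1}, U)               # the class of all one-element subsets
```

Here card_A is the class containing {1}, {2}, and {3}, in agreement with the text.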
Definition. [4, 81]. To compare cardinals, we define the notion of “domination”
for sets. If A and B are sets such that A is similar to a subset of B,
we write $A \preceq B$, and say that A is dominated by B or that B dominates
A.
Theorem (3.1). [4, 81]. If $A \preceq B$ and $B \preceq A$, then $A \sim B$. This is known
as the Schröder–Bernstein Theorem.
Definition. [4, 82]. We define $A \prec B$ for sets A and B to mean that
$A \preceq B$ and not $B \preceq A$ (abbreviating “it is not the case that $B \preceq A$” to “not
$B \preceq A$”).
Lemma (3.1). [4, 82]. For sets A and B, $A \preceq B$ if and only if either $A \sim B$
or $A \prec B$.
Lemma (3.2). [4, 82]. For cardinal numbers a and b, $a < b$ if and only if
there exist respective representatives A and B such that $A \prec B$.
Definition. [4, 83]. The natural numbers in the role of cardinal numbers
are the finite cardinals, and sets which have these cardinals are finite sets.
Theorem (3.2). [4, 84]. For each natural number n, the finite cardinal n
is the cardinal of the set of natural numbers which precede n in the natural
ordering.
Theorem (3.3). [4, 84]. For each natural number n, if $\overline{\overline{A}} = n$, then A is not
similar to a proper subset of itself.
Theorem (3.4). [4, 85]. In their new role as cardinals, the natural numbers
are subject to the ordering of cardinal numbers generally, as defined above,
following Cantor; this ordering we write temporarily as $<_c$. In their original
role as members of N, the natural numbers possess the familiar ordering,
which we write as $<_N$. The natural ordering and the cardinal ordering agree
on N. That is, for all natural numbers p and q,
$q <_N p$ if and only if $q <_c p$.
Definition. [4, 85]. A non-finite cardinal is an infinite or transfinite
cardinal.
Definition. [4, 85]. If the cardinal number of a set is infinite, then the set
is called infinite.
Definition. [4, 85]. The cardinal number of the set of natural numbers is
symbolized by
ℵ0.
Theorem (3.5). [4, 85]. If n is a finite cardinal, then $n < \aleph_0$.
Theorem (3.6). [4, 86]. For every set A, $A \prec \mathcal{P}(A)$ or, in other words,
$\overline{\overline{A}} < \overline{\overline{\mathcal{P}(A)}}$.
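The key step behind Theorem 3.6 is usually the diagonal argument: for any mapping f on A into the power set of A, the set D of all x in A with x not in f(x) is not a value of f, so f is not onto. The following sketch checks this exhaustively for a small finite set. It is an illustration under stated assumptions, not the proof in the source; the names power_set, diagonal_set, and no_map_is_onto are hypothetical.

```python
from itertools import combinations, product

def power_set(a):
    """Return all subsets of `a` as frozensets."""
    items = sorted(a)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal_set(a, f):
    """Return D = {x in a : x not in f(x)} for a map f from a into P(a)."""
    return frozenset(x for x in a if x not in f[x])

def no_map_is_onto(a):
    """Check every f : a -> P(a) and confirm D is never in the range of f."""
    elements = sorted(a)
    subsets = power_set(a)
    for images in product(subsets, repeat=len(elements)):
        f = dict(zip(elements, images))   # one candidate mapping
        if diagonal_set(a, f) in f.values():
            return False
    return True

A = {0, 1, 2}
result = no_map_is_onto(A)   # examines all 8**3 = 512 mappings
```

Since x ↦ {x} is a one-to-one mapping on A into the power set of A, the power set dominates A; the check above illustrates why the two are never similar.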
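The Cantor Paradox. [4, 128]. This paradox is derived from the set
defined by the formula
x is a set.
Let C be the set defined by this formula. Then C is the set of all sets. By
Theorem 3.6, $\overline{\overline{\mathcal{P}(C)}} > \overline{\overline{C}}$. Also, since C is the set of all sets and $\mathcal{P}(C)$ is
a set (the set whose members are the subsets of C), $\mathcal{P}(C) \subseteq C$. Hence,
$\overline{\overline{\mathcal{P}(C)}} \le \overline{\overline{C}}$ or, in other words, it is false that $\overline{\overline{\mathcal{P}(C)}} > \overline{\overline{C}}$. Thus, both
the statement “$\overline{\overline{\mathcal{P}(C)}} > \overline{\overline{C}}$” and its negation have been derived. This is
a contradiction.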
2.4: Countable Sets
Definition. [4, 87]. A set is denumerable if and only if it has cardinal
number ℵ0.
Definition. [4, 87]. A set is countable if and only if it is either finite or
denumerable.
Definition. [4, 87]. An enumeration of a denumerable set A is a specific
one-to-one correspondence between N and A.
Theorem (4.1). [4, 89]. A subset of a countable set is countable.
Theorem (4.2). [4, 89]. If the domain of a function is countable, then its
range is also countable.
Theorem (4.3). [4, 90]. $N \times N$ is denumerable.
Corollary. If X is a denumerable set, then so is $X \times X$. More generally, if
n is a natural number, then $X^{n+1}$ is denumerable.
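One standard way to exhibit an enumeration of N × N is the Cantor pairing function, which walks through the pairs along the diagonals m + n = 0, 1, 2, and so on. The sketch below is offered as an illustration only; it is not claimed to be the enumeration used in the source, and the names pair and unpair are assumptions.

```python
from math import isqrt

def pair(m, n):
    """Cantor pairing: a one-to-one map on N x N onto N."""
    return (m + n) * (m + n + 1) // 2 + n

def unpair(k):
    """Inverse of pair: recover (m, n) from k."""
    # w is the largest integer with w(w+1)/2 <= k, i.e. the diagonal m + n.
    w = (isqrt(8 * k + 1) - 1) // 2
    n = k - w * (w + 1) // 2
    return (w - n, n)

# The first few pairs in the enumeration.
first_pairs = [unpair(k) for k in range(6)]
```

Because unpair inverts pair, the mapping k ↦ unpair(k) is a one-to-one correspondence between N and N × N, which is what denumerability of N × N requires.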
Theorem (4.4). [4, 91]. If A is a nonempty finite collection of denumerable
sets, then $\bigcup A$ is denumerable. If A is a nonempty finite collection of
countable sets, then $\bigcup A$ is countable.
Theorem (4.5). [4, 92]. If A is a countable collection of countable sets, then
$\bigcup A$ is countable.
Definition. [4, 92]. We denote card $\mathcal{P}(N)$ by
$\aleph$.
Definition. [4, 92]. To say that a set is uncountable means that it is
infinite and non-denumerable.
Theorem (4.6). [4, 93]. $2^N$ is an uncountable set.
Proof. If $a_1, a_2, \ldots, a_n, \ldots$ is any enumeration of elements from $2^N$, then
an element a of $2^N$ can be constructed that does not correspond to any $a_n$ in
the enumeration. Consider the following enumeration of elements from $2^N$:
$a_1 = (0, 0, 0, 0, 0, 0, 0, \ldots)$
$a_2 = (1, 1, 1, 1, 1, 1, 1, \ldots)$
$a_3 = (0, 1, 0, 1, 0, 1, 0, \ldots)$
$a_4 = (1, 0, 1, 0, 1, 0, 1, \ldots)$
$a_5 = (1, 1, 0, 1, 0, 1, 1, \ldots)$
$a_6 = (0, 0, 1, 1, 0, 1, 1, \ldots)$
$a_7 = (1, 0, 0, 0, 1, 0, 0, \ldots)$
$\vdots$
By definition, the complement of 0 is 1 and the complement of 1 is 0.
To construct a sequence a, we choose its first digit to be the complement
of the first digit of $a_1$. Similarly, we choose the second digit of a to be the
complement of the second digit of $a_2$, the third digit to be the complement
of the third digit of $a_3$, and so on. In general, for every n, the nth digit of a
is chosen to be the complement of the nth digit of $a_n$. For the example
given above, this procedure yields the following, where the nth digit of each
$a_n$ is set in brackets:
$a_1 = ([0], 0, 0, 0, 0, 0, 0, \ldots)$
$a_2 = (1, [1], 1, 1, 1, 1, 1, \ldots)$
$a_3 = (0, 1, [0], 1, 0, 1, 0, \ldots)$
$a_4 = (1, 0, 1, [0], 1, 0, 1, \ldots)$
$a_5 = (1, 1, 0, 1, [0], 1, 1, \ldots)$
$a_6 = (0, 0, 1, 1, 0, [1], 1, \ldots)$
$a_7 = (1, 0, 0, 0, 1, 0, [0], \ldots)$
$\vdots$
$a = (1, 0, 1, 1, 1, 0, 1, \ldots)$
The procedure guarantees that a is an element of $2^N$ that differs from
every $a_n$, since their nth digits differ (as highlighted in the example). Since
the enumeration was arbitrary, no enumeration contains every element of $2^N$;
hence $2^N$ is an uncountable set.
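The diagonal construction in the proof can be replayed on the seven displayed rows. The sketch below works only with the finite prefixes shown in the example (an assumption forced by the fact that a program cannot hold an infinite sequence); the name diagonal_complement is illustrative.

```python
def diagonal_complement(rows):
    """Flip the n-th digit of the n-th row, for each row in turn."""
    return tuple(1 - rows[n][n] for n in range(len(rows)))

# Seven-digit prefixes of a_1, ..., a_7 from the example.
rows = [
    (0, 0, 0, 0, 0, 0, 0),
    (1, 1, 1, 1, 1, 1, 1),
    (0, 1, 0, 1, 0, 1, 0),
    (1, 0, 1, 0, 1, 0, 1),
    (1, 1, 0, 1, 0, 1, 1),
    (0, 0, 1, 1, 0, 1, 1),
    (1, 0, 0, 0, 1, 0, 0),
]
a = diagonal_complement(rows)   # prefix of the constructed sequence a
```

The resulting prefix is (1, 0, 1, 1, 1, 0, 1), and it differs from the nth row in the nth position for every n from 1 to 7.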
Definition. [4, 93-94]. The question of whether ℵ is the smallest cardinal
greater than ℵ0 is known as the continuum problem.
Definition. [4, 94]. It has been discovered that a number of theorems, some
of them important, can be based on the hypothesis that the answer to the
continuum problem is in the affirmative. This conjecture is known as the
continuum hypothesis.
2.5: Cardinal Arithmetic
Definition. [4, 95]. The sum, $u + v$, of the cardinal numbers u and v is
$\overline{\overline{A \cup B}}$, where A and B are disjoint representatives of u and v, respectively.
Theorem (5.1). [4, 95]. For cardinal numbers u, v, and w,
1. $u + v = v + u$,
2. $u + (v + w) = (u + v) + w$,
3. $u \le v$ implies $u + w \le v + w$.
Definition. [4, 95]. The product, uv, of the cardinal numbers u and v is
$\overline{\overline{A \times B}}$, where A and B are representatives of u and v, respectively.
Theorem (5.2). [4, 96]. For cardinal numbers u, v, and w,
1. $uv = vu$,
2. $u(vw) = (uv)w$,
3. $u \le v$ implies $uw \le vw$,
4. $(u + v)w = uw + vw$.
Definition. [4, 97]. If u and v are cardinals, the vth power of u, in symbols
$u^v$, is card $A^B$, where A and B are representatives of u and v, respectively.
Theorem (5.3). [4, 97]. For cardinal numbers u, v, and w,
1. $u^v u^w = u^{v+w}$,
2. $(uv)^w = u^w v^w$,
3. $(u^v)^w = u^{vw}$,
4. $u^1 = u$ and $1^u = 1$,
5. $u \le v$ implies $w^u \le w^v$,
6. $u \le v$ implies $u^w \le v^w$.
2.6: Order Types
Definition. [4, 98]. Two chains X and Y are called ordinally similar,
symbolized
$X \approx Y$,
if and only if they are isomorphic ordered sets.
Definition. [4, 99]. An equivalence class under ordinal similarity is called
an order type.
Definition. [4, 100]. Let A and B be disjoint sets of order types $\alpha$ and
$\beta$. Then the sum, $\alpha + \beta$, of $\alpha$ and $\beta$ is the order type of $A \cup B$, totally
ordered as follows. Pairs in A and pairs in B are ordered according to the
total orderings of A and B, respectively, and each a in A precedes each b in
B.
Definition. [4, 100]. The product, $\alpha\beta$, of $\alpha$ and $\beta$ is the order type of
$A \times B$ ordered by
$(a, b) < (a', b')$ if and only if $b < b'$, or $b = b'$ and $a < a'$.
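The ordering that defines the product compares second coordinates first and uses first coordinates only to break ties. For finite chains this can be made concrete with a sort; the sketch below is illustrative, and the name product_order is an assumption.

```python
from itertools import product

def product_order(A, B):
    """List A x B so that (a, b) precedes (a', b') when b < b',
    or when b = b' and a < a'."""
    return sorted(product(A, B), key=lambda pair: (pair[1], pair[0]))

A = [0, 1]       # a chain of order type 2
B = [0, 1, 2]    # a chain of order type 3
chain = product_order(A, B)
```

The result lists one copy of A for each element of B, laid end to end in the order of B, which is the picture usually attached to the product of two order types.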
2.7: Well-ordered Sets and Ordinal Numbers
Definition. [4, 103]. The principle of proof by transfinite induction is
as follows, where, as earlier, $P(x)$ stands for “the element x has the property
P.” If $P(x_0)$, where $x_0$ is the first element of the well-ordered set X, and if
for all z in X, $P(y)$ for all $y < z$ implies $P(z)$, then $P(x)$ for all x in X.
Definition. [4, 103]. If A is a well-ordered set and if $x \in A$, then
$\{a \in A \mid a < x\}$ is called the initial segment determined by x; this is denoted
by $A_x$.
Definition. [4, 103]. If B is an arbitrary nonempty set, then by a sequence
of type x in B we shall mean a function on $A_x$ into B.
Definition. [4, 103]. The principle of definition by transfinite induction
may be stated as follows. Let A be a well-ordered set having $a_0$ as its least
element, let B be a set, and let c be a member of B. If h is a function whose
range is included in B and whose domain is the set J of all sequences j of
type x in B for some $x \neq a_0$, then there exists exactly one function $k : A \to B$
such that
$k(a_0) = c$ and $k(x) = h(k|A_x)$,
for each x in A other than $a_0$.
Theorem (7.1). [4, 103]. If A is a well-ordered set and f is an isomorphism
of A into itself, then $a \le f(a)$ for each a in A.
Theorem (7.2). [4, 104]. A well-ordered set is not ordinally similar to any
of its initial segments.
Corollary. [4, 104]. If A is a well-ordered set and if $A_x \approx A_y$, then $x = y$.
Theorem (7.3). [4, 104]. If A and B are ordinally similar well-ordered sets,
then there exists exactly one isomorphism between them.
Theorem (7.4). [4, 104]. If A and B are well-ordered sets, then exactly one
of the following holds: A is ordinally similar to B, A is ordinally similar to
an initial segment of B, or B is ordinally similar to an initial segment of A.
Corollary. [4, 105]. For well-ordered sets A and B, exactly one of $\overline{\overline{A}} = \overline{\overline{B}}$,
$\overline{\overline{A}} < \overline{\overline{B}}$, $\overline{\overline{B}} < \overline{\overline{A}}$ holds. In other words, any two cardinal numbers which have
well-ordered sets as representatives are comparable.
Definition. [4, 105]. The order type of a well-ordered set is called an ordi-
nal number, or simply an ordinal.
Definition. [4, 105]. The ordinals which are not natural numbers are called
transfinite ordinals.
Definition. [4, 105]. If α and β are ordinals, we shall say that α is less
than β, symbolized
$\alpha < \beta$,
if and only if there exists a representative of α which is ordinally similar to
an initial segment of a representative of β.
Theorem (7.5). [4, 106]. The set $s(\alpha)$ of all ordinals less than the ordinal
α is a well-ordered set of ordinal number α.
Theorem (7.6). [4, 106]. Any set of ordinals is well-ordered.
Theorem (7.7). [4, 106]. If ∆ is any set of ordinals, then there exist
ordinals greater than any ordinal of ∆. Indeed, there exists a smallest such
ordinal.
Theorem (7.8). [4, 107]. If α and β are ordinals and $\beta > 0$, then $\alpha + \beta > \alpha$.
Definition. [4, 107]. Ordinals having a predecessor are ordinal numbers
of the first kind.
Definition. [4, 107]. Ordinals having no predecessor are ordinal numbers
of the second kind.
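Theorem (7.9). [4, 107]. Let α and β be ordinals with $\alpha < \beta$. Then there
exists exactly one ordinal $\gamma > 0$ such that $\alpha + \gamma = \beta$.
Theorem (7.10). [4, 109]. For ordinal numbers α, β, and γ,
1. $\alpha < \beta$ implies $\gamma + \alpha < \gamma + \beta$ and conversely;
2. $\alpha < \beta$ implies $\alpha + \gamma \le \beta + \gamma$; conversely, $\alpha + \gamma < \beta + \gamma$ implies $\alpha < \beta$;
3. $\alpha < \beta$ and $\gamma > 0$ imply $\gamma\alpha < \gamma\beta$; conversely, $\gamma\alpha < \gamma\beta$ implies $\alpha < \beta$;
4. $\alpha < \beta$ implies $\alpha\gamma \le \beta\gamma$; conversely, $\alpha\gamma < \beta\gamma$ implies $\alpha < \beta$;
5. $\gamma + \alpha = \gamma + \beta$ implies $\alpha = \beta$;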
6. $\gamma\alpha = \gamma\beta$ and $\gamma > 0$ imply $\alpha = \beta$.
Theorem (7.11). [4, 109]. If α and β are ordinals and $\beta > 0$, then α has a
unique representation in the form
$\alpha = \beta\xi + \gamma$ where $0 \le \gamma < \beta$.
2.10: Some Theorems Equivalent to the Axiom of Choice
Theorem (10.3). [4, 125]. (Hartogs). The axiom of choice is equivalent to
the assertion that any two cardinal numbers are comparable.
Chapter 3: The Extension of the Natural Num-
bers to the Real Numbers
3.1: The System of Natural Numbers
Definition. [4, 132]. A binary operation $\star$ has the cancellation property
if and only if each of $x \star z = y \star z$ and $z \star x = z \star y$ implies that $x = y$.
3.2: Differences
Definition. [4, 133]. By a difference we shall mean an ordered pair $(m, n)$.
In the set $N \times N$ of all differences we introduce the relation $\sim_d$ (the subscript
is for “difference”) by defining
$(m, n) \sim_d (p, q)$ if and only if $m + q = p + n$.
Lemma (2.1). [4, 133]. $\sim_d$ is an equivalence relation on $N \times N$.
Definition. [4, 133]. We shall call a difference $(m, n)$ positive if and only
if $m > n$.
Lemma (2.3). [4, 133]. If x, y, u, and v are differences and $x \sim_d u$ and
$y \sim_d v$, then $x + y \sim_d u + v$.
Lemma (2.4). [4, 133]. Addition of differences is associative and commutative.
The sum of two positive differences is a positive difference. Further,
addition is cancellable with respect to $\sim_d$.
Lemma (2.5). [4, 133]. If x and y are differences, then there exists a difference
z such that $z + x \sim_d y$.
Definition. [4, 133]. Another binary operation in $N \times N$, which we call
multiplication and symbolize by $\cdot$, is defined for differences by
$(m, n) \cdot (p, q) = (mp + nq, mq + np)$.
Usually we shall write “xy” or “$x \cdot y$” for a product of differences.
Lemma (2.6). [4, 134]. If x, y, u, and v are differences and $x \sim_d u$ and
$y \sim_d v$, then $xy \sim_d uv$.
Lemma (2.7). [4, 134]. Multiplication of differences is associative and commutative,
and distributes over addition. The product of two positive differences
is a positive difference. Further, multiplication is cancellable with
respect to $\sim_d$ for differences other than those of the form $(m, m)$.
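Lemmas 2.3 and 2.6 say that addition and multiplication of differences respect the relation $\sim_d$, so that they may later be transferred to equivalence classes. The sketch below checks both lemmas over a small range of pairs. It assumes that the sum of differences is the componentwise sum $(m, n) + (p, q) = (m + p, n + q)$, a definition that is not displayed in this section; the function names are illustrative.

```python
from itertools import product

def equiv(x, y):
    """(m, n) ~d (p, q) if and only if m + q = p + n."""
    (m, n), (p, q) = x, y
    return m + q == p + n

def add(x, y):
    """Assumed componentwise sum: (m, n) + (p, q) = (m + p, n + q)."""
    (m, n), (p, q) = x, y
    return (m + p, n + q)

def mul(x, y):
    """(m, n) . (p, q) = (mp + nq, mq + np), as in the definition above."""
    (m, n), (p, q) = x, y
    return (m * p + n * q, m * q + n * p)

def well_defined(limit):
    """Check Lemmas 2.3 and 2.6 for all differences with entries below limit."""
    diffs = list(product(range(limit), repeat=2))
    for x, u in product(diffs, repeat=2):
        if not equiv(x, u):
            continue
        for y, v in product(diffs, repeat=2):
            if equiv(y, v) and not (equiv(add(x, y), add(u, v))
                                    and equiv(mul(x, y), mul(u, v))):
                return False
    return True

ok = well_defined(4)
sample = mul((2, 5), (4, 1))   # represents (-3) times 3
```

For instance, (2, 5) and (0, 3) are equivalent, as are (4, 1) and (3, 0); their products (13, 22) and (0, 9) are again equivalent, both standing for the integer later written $-9_i$.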
3.3: Integers
Definition. [4, 134]. Recalling Lemma 2.1, we define an integer to be a
$\sim_d$-equivalence class. We shall write
$[x]_i$
for the equivalence class determined by the difference x (the new subscript
is for “integer”). The set of integers will be symbolized by Z.
Definition. [4, 134]. We shall call an integer positive if and only if one
of its members is a positive difference. The set of positive integers will be
symbolized by $Z^+$.
Definition. [4, 134]. Consider the relation on $Z \times Z$ into Z:
$\{(([x]_i, [y]_i), [x + y]_i) \mid x \text{ and } y \text{ are differences}\}$.
We call this operation addition and symbolize it by +. Hence,
$[x]_i + [y]_i = [x + y]_i$.
Lemma (3.1). [4, 134]. Addition of integers is associative and commutative,
and has the cancellation property. Further, the sum of two positive integers
is a positive integer.
Lemma (3.2). [4, 135]. If x and y are integers, then there exists exactly one
integer z such that $z + x = y$.
Definition. [4, 135]. Let x be an integer. The negative of x, denoted $-x$,
is an integer such that
$(-x) + x = x + (-x) = [(0, 0)]_i$.
Definition. [4, ]. Recall that Z is the set of integers. Consider the relation
on $Z \times Z$ into Z:
$\{(([x]_i, [y]_i), [xy]_i) \mid x \text{ and } y \text{ are differences}\}$.
We call this operation multiplication and symbolize it by $\cdot$. Hence,
$[x]_i \cdot [y]_i = [xy]_i$.
Lemma (3.3). [4, 135]. Multiplication is associative and commutative, distributes
over addition, and has the cancellation property if $(0, 0)$ is not a
member of the factor to be cancelled. Further, the product of two positive
integers is a positive integer.
Definition. $Z_0$ is the set of integers of the form $[(n, 0)]_i$.
Definition. [4, 135]. Theorem 2.1.8 implies that the mapping f on N into
Z such that $f(n) = [(n, 0)]_i$ is one-to-one, onto $Z_0$, and preserves addition,
multiplication, and less than. We summarize these properties of f by calling
it an order-isomorphism of N onto $Z_0$ and indicate the relationship of $Z_0$
to N by referring to $Z_0$ as an order-isomorphic image of N (or, saying
that $Z_0$ is order-isomorphic to N).
Definition. [4, 136]. The order-isomorphism of N onto $Z_0$ suggests that we
call the members of $Z_0$ the integers which correspond to the natural numbers
and adopt “$0_i$,” “$1_i$,” “$2_i$,” $\ldots$ as names for them. Since the remaining integers
(that is, the members of $Z - Z_0$) have the form $[(0, m)]_i$ with $m \in N - \{0\}$,
and since
$[(0, m)]_i = -[(m, 0)]_i = -m_i$,
we acquire “$-1_i$,” “$-2_i$,” $\ldots$ as names for the so-called negative integers.
Theorem (3.1). [4, 136]. The operations of addition and multiplication for
integers, together with $0_i$, $1_i$, and the set $Z^+$ of positive integers, have the
following properties for all integers x, y, and z.
1. $x + (y + z) = (x + y) + z$.
2. $x + y = y + x$.
3. $0_i + x = x$.
4. There exists an integer z such that $z + x = 0_i$.
5. $x(yz) = (xy)z$.
6. $xy = yx$.
7. $1_i x = x$.
8. $x(y + z) = xy + xz$.
9. $xz = yz$ and $z \neq 0_i$ imply that $x = y$.
10. $0_i \neq 1_i$.
11. $x, y \in Z^+$ imply that $x + y \in Z^+$.
12. $x, y \in Z^+$ imply that $xy \in Z^+$.
13. Exactly one of $x \in Z^+$, $x = 0_i$, $-x \in Z^+$ holds.
14. If $<_i$ is defined by $x <_i y$ if and only if $y - x \in Z^+$, then $<_i$ totally
orders Z and well-orders $\{0_i\} \cup Z^+$.
3.4: Rational Numbers
Definition. [4, 138]. An ordered pair $(a, b)$ with $b \neq 0_i$ is called a quotient.
The quotient $(a, b)$ will be written as
$\frac{a}{b}$.
Definition. [4, 138]. The relation $\sim_q$ is introduced into the set of all quotients
by defining
$\frac{a}{b} \sim_q \frac{c}{d}$ if and only if $ad = bc$.
This is an equivalence relation on the set of all quotients and has the further
property that
$\frac{ac}{bc} \sim_q \frac{a}{b}$ if $c \neq 0_i$.
Definition. [4, 138]. A quotient $\frac{a}{b}$ is positive if and only if ab is a positive
integer.
Definition. [4, 138]. We introduce addition and multiplication into the
set of quotients by way of the following definitions:
$\frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd}$,
$\frac{a}{b} \cdot \frac{c}{d} = \frac{ac}{bd}$.
Since $b \neq 0_i$ and $d \neq 0_i$ imply that $bd \neq 0_i$, these are operations in the set of
quotients.
Lemma (4.1). [4, 138]. If x, y, u, and v are quotients and $x \sim_q u$ and $y \sim_q v$,
then $x + y \sim_q u + v$, $xy \sim_q uv$ and, if x is positive, then u is positive.
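Lemma 4.1 can be illustrated on concrete quotients. In the sketch below, Python integers stand in for the integers of Section 3.3 (an assumption made for brevity), a quotient is a pair (a, b) with b nonzero, and the function names are illustrative.

```python
def q_equiv(x, y):
    """a/b ~q c/d if and only if ad = bc."""
    (a, b), (c, d) = x, y
    return a * d == b * c

def q_add(x, y):
    """a/b + c/d = (ad + bc)/(bd)."""
    (a, b), (c, d) = x, y
    return (a * d + b * c, b * d)

def q_mul(x, y):
    """(a/b)(c/d) = (ac)/(bd)."""
    (a, b), (c, d) = x, y
    return (a * c, b * d)

def q_positive(x):
    """A quotient a/b is positive if and only if ab is a positive integer."""
    a, b = x
    return a * b > 0

x, u = (1, 2), (-3, -6)   # two representatives of one half
y, v = (2, 3), (4, 6)     # two representatives of two thirds
sum_xy, sum_uv = q_add(x, y), q_add(u, v)
prod_xy, prod_uv = q_mul(x, y), q_mul(u, v)
```

Here the sums (7, 6) and (-42, -36) are equivalent, the products (2, 6) and (-12, -36) are equivalent, and x and u are both positive, as the lemma predicts for this choice of representatives.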
Definition. [4, 138]. We define a rational number to be a $\sim_q$-equivalence
class. The rational number having the quotient x as a representative we write
as
$[x]_s$.
The letter s stands for “rational”; the letter r is reserved for the real numbers.
The set of rational numbers is symbolized by Q.
Definition. [4, 139]. We say $[x]_s$ is positive if and only if it contains a
quotient y such that y is positive. The set of positive rationals we symbolize
by $Q^+$.
Theorem (4.1). [4, 139]. The operations of addition and multiplication for
rational numbers, together with $0_s$, $1_s$, and the set $Q^+$ of positive rationals,
have the following properties for all rationals x, y, and z.
1. $x + (y + z) = (x + y) + z$.
2. $x + y = y + x$.
3. $0_s + x = x$.
4. There exists a z such that $z + x = 0_s$.
5. $x(yz) = (xy)z$.
6. $xy = yx$.
7. $1_s x = x$.
8. If $x \neq 0_s$, there exists a z such that $zx = 1_s$.
9. $x(y + z) = xy + xz$.
10. $1_s \neq 0_s$.
11. $x, y \in Q^+$ imply that $x + y \in Q^+$.
12. $x, y \in Q^+$ imply that $xy \in Q^+$.
13. Exactly one of $x \in Q^+$, $x = 0_s$, $-x \in Q^+$ holds.
14. If P is the intersection of all subsets of $Q^+$ which contain $1_s$ and are
closed under addition, then, for each $x \in Q^+$, there exist $a, b \in P$ such
that $xb = a$.
Definition. [4, 140]. For each $x \neq 0_s$ the solution of $zx = 1_s$ is unique.
This solution is called the inverse of x and is symbolized by $x^{-1}$.
Theorem (4.2). [4, 141]. Between any two distinct rational numbers there
is another rational number.
Theorem (4.3). [4, 141]. (Archimedean property). If r and s are positive
rational numbers, then there exists a positive integer n (properly, a positive
integral rational number n) such that $nr > s$.
3.5: Cauchy Sequences of Rational Numbers
Definition. [4, 143]. A Cauchy sequence of rational numbers is a
sequence x of rational numbers such that for every positive rational number
ϵ there exists a positive integer N such that for every $m, n > N$,
$|x_n - x_m| < \epsilon$.
Definition. [4, 144]. The operations of addition and multiplication for
sequences of rational numbers are defined in the following way:
$x + y = u$ where $u_n = x_n + y_n$,
$xy = v$ where $v_n = x_n y_n$.
Lemma (5.1). [4, 145]. If x is a Cauchy sequence of rational numbers, then
there exists a positive rational number δ such that for every n,
$|x_n| < \delta$.
Lemma (5.2). [4, 145]. If x and y are Cauchy sequences of rational numbers,
then $x + y$ and $xy$ are Cauchy sequences of rational numbers.
Definition. [4, 146]. If x and y are Cauchy sequences of rational numbers,
then
$x \sim_c y$
if and only if for every positive rational number ϵ there is an integer N such
that for every $n > N$,
$|x_n - y_n| < \epsilon$.
Lemma (5.3). [4, 146]. The relation $\sim_c$ is an equivalence relation on the
set of all Cauchy sequences of rational numbers.
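The two conditions just defined can be tried out on exact rational arithmetic. The sketch below is an illustration under stated assumptions: it uses the sequence $x_0 = 1$, $x_{n+1} = (x_n + 2/x_n)/2$ as a sample sequence of rationals (this sequence does not appear in the source), and it inspects only finitely many terms, so it gives evidence for the Cauchy condition and for $\sim_c$ rather than a proof. All function names are hypothetical.

```python
from fractions import Fraction

def newton_sqrt2(count):
    """First `count` terms of x_0 = 1, x_{n+1} = (x_n + 2/x_n)/2, all rational."""
    terms, x = [], Fraction(1)
    for _ in range(count):
        terms.append(x)
        x = (x + 2 / x) / 2
    return terms

def tail_is_close(x, eps, N):
    """Check |x_n - x_m| < eps for all available indices m, n > N."""
    tail = x[N + 1:]
    return all(abs(p - q) < eps for p in tail for q in tail)

def equivalent_after(x, y, eps, N):
    """Check |x_n - y_n| < eps for every available index n > N."""
    return all(abs(a - b) < eps for a, b in zip(x[N + 1:], y[N + 1:]))

x = newton_sqrt2(12)
# y differs from x by 1/(n + 1) at index n, so the gap shrinks to zero.
y = [t + Fraction(1, n + 1) for n, t in enumerate(x)]

cauchy_ok = tail_is_close(x, Fraction(1, 100), 1)
equiv_ok = equivalent_after(x, y, Fraction(1, 10), 9)
```

With ϵ = 1/10 the index N = 9 works for the pair x, y because the gap at index n is exactly 1/(n + 1), while N = 8 fails at n = 9, where the gap equals 1/10.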
Definition. [4, 146]. If x is a Cauchy sequence of rational numbers, then x
is called positive if and only if there is a positive rational number ϵ and an
integer N such that for every $n > N$,
$x_n > \epsilon$.
Lemma (5.4). [4, 146]. If x, y, u, and v are Cauchy sequences of rational
numbers and $x \sim_c u$ and $y \sim_c v$, then $x + y \sim_c u + v$, $xy \sim_c uv$ and, if x is
positive, then u is positive.
Lemma (5.5). [4, 147]. The sum and product of two positive Cauchy sequences
are positive Cauchy sequences. Further, if x is any Cauchy sequence,
then exactly one of the following holds: x is positive, $x \sim_c 0_c$, $-x$ is positive.
Lemma (5.6). [4, 148]. If the Cauchy sequence x is not equivalent to $0_c$,
then there is a Cauchy sequence z such that $zx \sim_c 1_c$.
3.6: Real Numbers
Definition. [4, 149]. We define a real number as a $\sim_c$-equivalence class of
Cauchy sequences of rational numbers. The real number having the Cauchy
sequence x as a representative we write as
$[x]_r$.
The letter r stands for the real numbers. The set of real numbers is symbolized
by R.
Definition. [4, 149]. A real number is positive if and only if it contains a
positive Cauchy sequence.
Theorem (6.1). [4, 150]. The operations of addition and multiplication
for real numbers, together with $0_r$, $1_r$, and the set of positive reals, have
properties (1) through (13) listed in Theorem 4.1.
Theorem (6.2). [4, 151]. Between any two distinct real numbers there is
a rational real number. Precisely, if x and y are distinct real numbers, then
there exists a rational real number z such that if $x < y$, then $x < z < y$ while
if $y < x$, then $y < z < x$.
Theorem (6.3). [4, 152]. (Archimedean property). If x and y are positive
real numbers, then there exists a positive integer n (properly, a real
number n which corresponds to a rational which, in turn, corresponds to a
positive integer) such that $nx > y$.
Theorem (6.4). [4, 152]. A nonempty set of real numbers which has an
upper bound has a least upper bound.
3.7: Further Properties of the Real Number System
Definition. [4, 154]. A Cauchy sequence of real numbers is a sequence
x of real numbers such that for every positive real number ϵ there exists a
positive integer N such that for every $m, n > N$,
$|x_n - x_m| < \epsilon$.
Definition. [4, 155]. The real number y is a limit of the sequence x of real
numbers if and only if for every positive real number ϵ there exists a positive
integer N such that for every $n > N$,
$|x_n - y| < \epsilon$.
Lemma (7.1). [4, 155]. A sequence of real numbers has at most one limit.
Lemma (7.2). [4, 155]. Let a be a sequence of rational numbers and let x
be the sequence of real numbers such that for every n, $x_n = (a_n)_r$, the real
number corresponding to $a_n$. Then x is a Cauchy sequence if and only if a
is a Cauchy sequence. Further, if a is a Cauchy sequence and y is the real
number which it defines, then $\lim x_n = y$.
Theorem (7.1). [4, 156]. (Cauchy convergence principle). A sequence of
real numbers has a limit if and only if it is a Cauchy sequence.
Definition. [5, 160]. The famous method of defining the real numbers, by
Dedekind cuts, is outlined as follows. A cut of the rational numbers is an
ordered pair $(A, B)$ of sets such that
1. A and B are both non-empty,
2. $A \cup B$ is the set of rationals,
3. if $x \in A$ and $y \in B$, then $x < y$.
A is called the lower class and B the upper class, since every element of
A precedes every element of B. A real number is then simply a cut of the
rationals.
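A cut can be described by a membership test for its lower class. The sketch below uses the cut usually associated with the square root of 2, an example chosen here for illustration and not taken from the source: a rational q belongs to the lower class when q < 0 or q·q < 2. The three conditions are then checked on a finite sample of rationals, which illustrates the definition but proves nothing about all rationals; the names in_lower and in_upper are assumptions.

```python
from fractions import Fraction

def in_lower(q):
    """Lower class A of the sample cut: q < 0 or q*q < 2."""
    return q < 0 or q * q < 2

def in_upper(q):
    """Upper class B: every rational that is not in A."""
    return not in_lower(q)

# A finite sample of rationals n/d with small numerators and denominators.
samples = [Fraction(n, d) for n in range(-30, 31) for d in range(1, 11)]
lower = [q for q in samples if in_lower(q)]
upper = [q for q in samples if in_upper(q)]

both_nonempty = bool(lower) and bool(upper)             # condition 1
covers_sample = len(lower) + len(upper) == len(samples) # condition 2
lower_precedes_upper = max(lower) < min(upper)          # condition 3
```

Defining B as the complement of A makes condition 2 automatic, so only conditions 1 and 3 carry real content for a cut specified in this way.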
4 Conclusion
We conclude by discussing the material in Chapters 1 to 7 of Russell’s Intro-
duction to Mathematical Philosophy (Part I) that has since been adopted by
modern mathematics (Part II).
Russell introduces the concept of similarity (a “one-to-one relation”) and
uses it to define the notion of a finite set. Specifically, if there exists a one-
to-one relation between two sets, then they are said to have the same size or
“cardinality.” These concepts are also adopted in Robert Stoll’s “Set Theory
and Logic.” In Chapter 2 of Stoll’s book, the concept of a one-one corre-
spondence is used to define the notion of equinumerosity between sets [4,
79]. Moreover, Stoll uses a one-one correspondence to compare the sizes of
infinite sets. In particular, he introduces the concept of countability, which
classifies sets that are either finite or of the same cardinality as the set of
natural numbers [4, 87].
Russell also focuses on the concept of finitude and its relation to mathe-
matical induction. Specifically, Russell says that mathematical induction is,
above all else, the essential characteristic that distinguishes the finite from
the infinite [3, 27]. It is thus unsurprising that the principle of mathematical
induction is treated by Stoll as a way to define the natural numbers, which
in the role of cardinal numbers are the finite cardinals [4, 83].
Russell also discusses the concept of order and how it is defined. Keeping
in mind that a relation is aliorelative (and hence asymmetric) if and only if
it is both antisymmetric and irreflexive, and that Russell’s “connected” and
“aliorelative” are synonymous with Stoll’s “comparable” and “irreflexive,” re-
spectively, we may say that Russell’s “serial” relation (which is aliorelative,
transitive, and connected) [3, 34] is, for all intents and purposes, equiva-
lent to Stoll’s “totally ordered” relation (which is reflexive, antisymmetric,
transitive, and comparable) [4, 50]. (The true equivalent of Russell’s serial
relation, not treated directly by Stoll, is a “strict totally ordered” relation,
which is irreflexive, antisymmetric, transitive, and comparable).
It is worth noting that Stoll’s definition of a “totally ordered” relation is
more flexible than that of Russell’s “serial” relation, in two ways. For one,
we could remove the “comparable” property from Stoll’s definition and be
left with the definition of a partially-ordered relation. Second, we could add
the property that “each non-empty subset of the set has a least element”
to Stoll’s definition and have a well-ordered relation. Such additions and
deletions are difficult with Russell’s definition, owing to the fact that the
“aliorelative” relation encompasses both antisymmetry and irreflexivity.
Russell also touches on the concept of “similarity between relations,”
defining it as the existence of a one-to-one relation between the elements of
one relation and the elements of the other. Specifically, if there is a one-to-one
correspondence between the domain of the first relation and the domain of
the second relation, and a one-to-one correspondence between the co-domain
of the first relation and the co-domain of the second relation, and so on for
all the terms of the relations, then the two relations are similar. In such a
case the two relations “do not depend upon the actual terms in their fields”
[3, 54]. This idea closely resembles that of an isomorphism [4, 52].
Russell then talks about the development of the different number systems.
Russell and Stoll both use the natural numbers as a starting point for
the construction of the real numbers. Russell defines the “natural numbers”
as the posterity of 0 with respect to the relation “immediate predecessor”
(which is the converse of “successor”) [3, 27]. He then generalizes this
definition, defining the natural numbers to be those “to which proofs by
mathematical induction can be applied, i.e., as those that possess all inductive
properties” [3, 27]. Stoll (as noted above) incorporates the principle
of mathematical induction to define the natural numbers, but asserts more
formally that $(N, S, 0)$ is an integral system [4, 58].
Russell and Stoll then move to positive and negative integers. Russell
says that, if m is any natural number, $+m$ will be the relation of $n + m$ to
n (for any n), and $-m$ will be the relation of n to $n + m$ [3, 63]. On the
other hand, Stoll calls an integer positive if and only if one of its elements is
a positive difference. And if x is any integer, then there exists exactly one
integer, symbolized by $-x$ (the “negative of x”), such that
$(-x) + x = x + (-x) = [(0, 0)]_i$ [4, 134-135].
Russell defines the fraction m/n as being that relation which holds between two
inductive numbers x, y, when $xn = ym$. Similarly, Stoll introduces the relation
$\sim_q$ into the set of quotients by defining
$\frac{a}{b} \sim_q \frac{c}{d}$ if and only if $ad = bc$.
Finally, both Russell and Stoll arrive at the real numbers. Russell uses
Dedekind cuts, which involve dividing all the terms of a series into two sets,
of which the one “wholly precedes” the other. Russell confines himself to cuts
in which the lower section has no maximum; these no-maximum sections are
called “segments.” This allows him to define a “real number” as a segment
of the series of ratios in order of magnitude. Stoll introduces a relation,
symbolized by $\sim_c$, in the set of all Cauchy sequences of rational numbers. If
x and y are Cauchy sequences of rational numbers, then
$x \sim_c y$
if and only if for every positive rational number ϵ there is an integer N such
that for every $n > N$,
$|x_n - y_n| < \epsilon$.
This allows Stoll to define a real number as a $\sim_c$-equivalence class of Cauchy
sequences of rational numbers.
References
[1] Frege, Gottlob, The Foundations of Arithmetic, translated by J. L.
Austin, second revised edition, Harper Torchbooks/The Science Library,
Harper and Brothers, New York, 1960.
[2] Halmos, Paul R., Naive Set Theory, The University Series in Under-
graduate Mathematics, D. Van Nostrand Co., Inc. Princeton, 1960.
[3] Russell, Bertrand A. W., Introduction to Mathematical Philosophy, Sec-
ond Ed., Dover-Hill, 1993.
[4] Stoll, Robert R., Set Theory and Logic, First Ed., W. H. Freeman and
Company, 1963.
[5] Suppes, Patrick, Axiomatic Set Theory, D. Van Nostrand Co., Inc.
Princeton, 1965.