Fundamentals of Linear Algebra
and Optimization

Jean Gallier and Jocelyn Quaintance


Department of Computer and Information Science
University of Pennsylvania
Philadelphia, PA 19104, USA
e-mail: jean@cis.upenn.edu

© Jean Gallier

October 22, 2017


Figure 1: Cover page from Bourbaki, Fascicule VI, Livre II, Algèbre, 1962

Figure 2: Page 156 from Bourbaki, Fascicule VI, Livre II, Algèbre, 1962

Contents

1 Vector Spaces, Bases, Linear Maps 11


1.1 Motivations: Linear Combinations, Linear Independence, Rank . . . . . . . 11
1.2 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.3 Linear Independence, Subspaces . . . . . . . . . . . . . . . . . . . . . . . . 27
1.4 Bases of a Vector Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
1.5 Linear Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

2 Matrices and Linear Maps 47


2.1 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.2 Haar Basis Vectors and a Glimpse at Wavelets . . . . . . . . . . . . . . . . 63
2.3 The Effect of a Change of Bases on Matrices . . . . . . . . . . . . . . . . . 80
2.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

3 Direct Sums, Affine Maps, The Dual Space, Duality 85


3.1 Direct Products, Sums, and Direct Sums . . . . . . . . . . . . . . . . . . . . 85
3.2 Affine Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
3.3 The Dual Space E ∗ and Linear Forms . . . . . . . . . . . . . . . . . . . . . 102
3.4 Hyperplanes and Linear Forms . . . . . . . . . . . . . . . . . . . . . . . . . 120
3.5 Transpose of a Linear Map and of a Matrix . . . . . . . . . . . . . . . . . . 121
3.6 The Four Fundamental Subspaces . . . . . . . . . . . . . . . . . . . . . . . 128
3.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130

4 Gaussian Elimination, LU, Cholesky, Echelon Form 133


4.1 Motivating Example: Curve Interpolation . . . . . . . . . . . . . . . . . . . 133
4.2 Gaussian Elimination and LU-Factorization . . . . . . . . . . . . . . . . . . 137
4.3 Gaussian Elimination of Tridiagonal Matrices . . . . . . . . . . . . . . . . . 163
4.4 SPD Matrices and the Cholesky Decomposition . . . . . . . . . . . . . . . . 166
4.5 Reduced Row Echelon Form . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
4.6 Transvections and Dilatations . . . . . . . . . . . . . . . . . . . . . . . . . . 188
4.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

5 Determinants 195
5.1 Permutations, Signature of a Permutation . . . . . . . . . . . . . . . . . . . 195


5.2 Alternating Multilinear Maps . . . . . . . . . . . . . . . . . . . . . . . . . . 199


5.3 Definition of a Determinant . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
5.4 Inverse Matrices and Determinants . . . . . . . . . . . . . . . . . . . . . . . 210
5.5 Systems of Linear Equations and Determinants . . . . . . . . . . . . . . . . 213
5.6 Determinant of a Linear Map . . . . . . . . . . . . . . . . . . . . . . . . . . 214
5.7 The Cayley–Hamilton Theorem . . . . . . . . . . . . . . . . . . . . . . . . . 215
5.8 Permanents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
5.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
5.10 Further Readings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224

6 Vector Norms and Matrix Norms 225


6.1 Normed Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
6.2 Matrix Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
6.3 Condition Numbers of Matrices . . . . . . . . . . . . . . . . . . . . . . . . . 244
6.4 An Application of Norms: Inconsistent Linear Systems . . . . . . . . . . . . 253
6.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254

7 Eigenvectors and Eigenvalues 257


7.1 Eigenvectors and Eigenvalues of a Linear Map . . . . . . . . . . . . . . . . . 257
7.2 Reduction to Upper Triangular Form . . . . . . . . . . . . . . . . . . . . . . 264
7.3 Location of Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
7.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271

8 Iterative Methods for Solving Linear Systems 273


8.1 Convergence of Sequences of Vectors and Matrices . . . . . . . . . . . . . . 273
8.2 Convergence of Iterative Methods . . . . . . . . . . . . . . . . . . . . . . . . 276
8.3 Methods of Jacobi, Gauss-Seidel, and Relaxation . . . . . . . . . . . . . . . 278
8.4 Convergence of the Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
8.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290

9 Euclidean Spaces 291


9.1 Inner Products, Euclidean Spaces . . . . . . . . . . . . . . . . . . . . . . . . 291
9.2 Orthogonality, Duality, Adjoint of a Linear Map . . . . . . . . . . . . . . . 299
9.3 Linear Isometries (Orthogonal Transformations) . . . . . . . . . . . . . . . . 311
9.4 The Orthogonal Group, Orthogonal Matrices . . . . . . . . . . . . . . . . . 314
9.5 QR-Decomposition for Invertible Matrices . . . . . . . . . . . . . . . . . . . 316
9.6 Some Applications of Euclidean Geometry . . . . . . . . . . . . . . . . . . . 320
9.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321

10 QR-Decomposition for Arbitrary Matrices 323


10.1 Orthogonal Reflections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
10.2 QR-Decomposition Using Householder Matrices . . . . . . . . . . . . . . . . 327
10.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331

11 Hermitian Spaces 333


11.1 Hermitian Spaces, Pre-Hilbert Spaces . . . . . . . . . . . . . . . . . . . . . 333
11.2 Orthogonality, Duality, Adjoint of a Linear Map . . . . . . . . . . . . . . . 342
11.3 Linear Isometries (Also Called Unitary Transformations) . . . . . . . . . . . 347
11.4 The Unitary Group, Unitary Matrices . . . . . . . . . . . . . . . . . . . . . 349
11.5 Orthogonal Projections and Involutions . . . . . . . . . . . . . . . . . . . . 352
11.6 Dual Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
11.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358

12 Spectral Theorems 361


12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
12.2 Normal Linear Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
12.3 Self-Adjoint and Other Special Linear Maps . . . . . . . . . . . . . . . . . . 370
12.4 Normal and Other Special Matrices . . . . . . . . . . . . . . . . . . . . . . . 377
12.5 Conditioning of Eigenvalue Problems . . . . . . . . . . . . . . . . . . . . . . 380
12.6 Rayleigh Ratios and the Courant-Fischer Theorem . . . . . . . . . . . . . . 383
12.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391

13 Introduction to The Finite Elements Method 393


13.1 A One-Dimensional Problem: Bending of a Beam . . . . . . . . . . . . . . . 393
13.2 A Two-Dimensional Problem: An Elastic Membrane . . . . . . . . . . . . . 404
13.3 Time-Dependent Boundary Problems . . . . . . . . . . . . . . . . . . . . . . 407

14 Singular Value Decomposition and Polar Form 415


14.1 Singular Value Decomposition for Square Matrices . . . . . . . . . . . . . . 415
14.2 Singular Value Decomposition for Rectangular Matrices . . . . . . . . . . . 423
14.3 Ky Fan Norms and Schatten Norms . . . . . . . . . . . . . . . . . . . . . . 426
14.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427

15 Applications of SVD and Pseudo-Inverses 429


15.1 Least Squares Problems and the Pseudo-Inverse . . . . . . . . . . . . . . . . 429
15.2 Properties of the Pseudo-Inverse . . . . . . . . . . . . . . . . . . . . . . . . 434
15.3 Data Compression and SVD . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
15.4 Principal Components Analysis (PCA) . . . . . . . . . . . . . . . . . . . . . 440
15.5 Best Affine Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
15.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450

16 Annihilating Polynomials; Primary Decomposition 453


16.1 Annihilating Polynomials and the Minimal Polynomial . . . . . . . . . . . . 453
16.2 Minimal Polynomials of Diagonalizable Linear Maps . . . . . . . . . . . . . 459
16.3 The Primary Decomposition Theorem . . . . . . . . . . . . . . . . . . . . . 465
16.4 Nilpotent Linear Maps and Jordan Form . . . . . . . . . . . . . . . . . . . . 471

17 Topology 477
17.1 Metric Spaces and Normed Vector Spaces . . . . . . . . . . . . . . . . . . . 477
17.2 Topological Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483
17.3 Continuous Functions, Limits . . . . . . . . . . . . . . . . . . . . . . . . . . 492
17.4 Continuous Linear and Multilinear Maps . . . . . . . . . . . . . . . . . . . . 499
17.5 The Contraction Mapping Theorem . . . . . . . . . . . . . . . . . . . . . . 504
17.6 Further Readings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505
17.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505

18 Differential Calculus 507


18.1 Directional Derivatives, Total Derivatives . . . . . . . . . . . . . . . . . . . 507
18.2 Jacobian Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 520
18.3 The Implicit and The Inverse Function Theorems . . . . . . . . . . . . . . . 528
18.4 Second-Order and Higher-Order Derivatives . . . . . . . . . . . . . . . . . . 533
18.5 Taylor’s Formula, Faà di Bruno’s Formula . . . . . . . . . . . . . . . . . . . 538
18.6 Further Readings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 542
18.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 542

19 Quadratic Optimization Problems 545


19.1 Quadratic Optimization: The Positive Definite Case . . . . . . . . . . . . . 545
19.2 Quadratic Optimization: The General Case . . . . . . . . . . . . . . . . . . 553
19.3 Maximizing a Quadratic Function on the Unit Sphere . . . . . . . . . . . . 557
19.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 562

20 Schur Complements and Applications 565


20.1 Schur Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 565
20.2 SPD Matrices and Schur Complements . . . . . . . . . . . . . . . . . . . . . 567
20.3 SP Semidefinite Matrices and Schur Complements . . . . . . . . . . . . . . 569

21 Convex Sets, Cones, H-Polyhedra 571


21.1 What is Linear Programming? . . . . . . . . . . . . . . . . . . . . . . . . . 571
21.2 Affine Subsets, Convex Sets, Hyperplanes, Half-Spaces . . . . . . . . . . . . 573
21.3 Cones, Polyhedral Cones, and H-Polyhedra . . . . . . . . . . . . . . . . . . 576

22 Linear Programs 583


22.1 Linear Programs, Feasible Solutions, Optimal Solutions . . . . . . . . . . . 583
22.2 Basic Feasible Solutions and Vertices . . . . . . . . . . . . . . . . . . . . . . 589

23 The Simplex Algorithm 597


23.1 The Idea Behind the Simplex Algorithm . . . . . . . . . . . . . . . . . . . . 597
23.2 The Simplex Algorithm in General . . . . . . . . . . . . . . . . . . . . . . . 606
23.3 How to Perform a Pivoting Step Efficiently . . . . . . . . . . . . . . . . . . . 613
23.4 The Simplex Algorithm Using Tableaux . . . . . . . . . . . . . . . . . . . . 616

23.5 Computational Efficiency of the Simplex Method . . . . . . . . . . . . . . . 626

24 Linear Programming and Duality 629


24.1 Variants of the Farkas Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . 629
24.2 The Duality Theorem in Linear Programming . . . . . . . . . . . . . . . . . 634
24.3 Complementary Slackness Conditions . . . . . . . . . . . . . . . . . . . . . 643
24.4 Duality for Linear Programs in Standard Form . . . . . . . . . . . . . . . . 644
24.5 The Dual Simplex Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 647
24.6 The Primal-Dual Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 653

25 Extrema of Real-Valued Functions 665


25.1 Local Extrema and Lagrange Multipliers . . . . . . . . . . . . . . . . . . . . 665
25.2 Using Second Derivatives to Find Extrema . . . . . . . . . . . . . . . . . . . 675
25.3 Using Convexity to Find Extrema . . . . . . . . . . . . . . . . . . . . . . . 678
25.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 688

26 Newton’s Method and Its Generalizations 689


26.1 Newton’s Method for Real Functions of a Real Argument . . . . . . . . . . 689
26.2 Generalizations of Newton’s Method . . . . . . . . . . . . . . . . . . . . . . 690
26.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 696

27 Basics of Hilbert Spaces 697


27.1 The Projection Lemma, Duality . . . . . . . . . . . . . . . . . . . . . . . . 697
27.2 Farkas–Minkowski Lemma in Hilbert Spaces . . . . . . . . . . . . . . . . . . 714

28 General Results of Optimization Theory 717


28.1 Existence of Solutions of an Optimization Problem . . . . . . . . . . . . . . 717
28.2 Gradient Descent Methods for Unconstrained Problems . . . . . . . . . . . 731
28.3 Conjugate Gradient Methods for Unconstrained Problems . . . . . . . . . . 747
28.4 Gradient Projection for Constrained Optimization . . . . . . . . . . . . . . 757
28.5 Penalty Methods for Constrained Optimization . . . . . . . . . . . . . . . . 760
28.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 762

29 Introduction to Nonlinear Optimization 763


29.1 The Cone of Feasible Directions . . . . . . . . . . . . . . . . . . . . . . . . . 763
29.2 The Karush–Kuhn–Tucker Conditions . . . . . . . . . . . . . . . . . . . . . 776
29.3 Hard Margin Support Vector Machine . . . . . . . . . . . . . . . . . . . . . 788
29.4 Lagrangian Duality and Saddle Points . . . . . . . . . . . . . . . . . . . . . 798
29.5 Uzawa’s Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 814
29.6 Handling Equality Constraints Explicitly . . . . . . . . . . . . . . . . . . . . 820
29.7 Conjugate Function and Legendre Dual Function . . . . . . . . . . . . . . . 827
29.8 Some Techniques to Obtain a More Useful Dual Program . . . . . . . . . . 837
29.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 846

30 Soft Margin Support Vector Machines 849


30.1 Soft Margin Support Vector Machines; (SVMs1 ) . . . . . . . . . . . . . . . . 850
30.2 Soft Margin Support Vector Machines; (SVMs2 ) . . . . . . . . . . . . . . . . 859
30.3 Soft Margin Support Vector Machines; (SVMs20 ) . . . . . . . . . . . . . . . 865
30.4 Soft Margin SVM; (SVMs3 ) . . . . . . . . . . . . . . . . . . . . . . . . . . . 879
30.5 Soft Margin Support Vector Machines; (SVMs4 ) . . . . . . . . . . . . . . . . 882
30.6 Soft Margin SVM; (SVMs5 ) . . . . . . . . . . . . . . . . . . . . . . . . . . . 889
30.7 Summary and Comparison of the SVM Methods . . . . . . . . . . . . . . . 891

31 Total Orthogonal Families in Hilbert Spaces 903


31.1 Total Orthogonal Families, Fourier Coefficients . . . . . . . . . . . . . . . . 903
31.2 The Hilbert Space l2 (K) and the Riesz-Fischer Theorem . . . . . . . . . . . 911

Bibliography 920
Chapter 1

Vector Spaces, Bases, Linear Maps

1.1 Motivations: Linear Combinations, Linear Independence and Rank
Consider the problem of solving the following system of three linear equations in the three
variables x1 , x2 , x3 ∈ R:

x1 + 2x2 − x3 = 1
2x1 + x2 + x3 = 2
x1 − 2x2 − 2x3 = 3.

One way to approach this problem is to introduce the “vectors” u, v, w, and b, given by
$$
u = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}, \quad
v = \begin{pmatrix} 2 \\ 1 \\ -2 \end{pmatrix}, \quad
w = \begin{pmatrix} -1 \\ 1 \\ -2 \end{pmatrix}, \quad
b = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix},
$$

and to write our linear system as

x1 u + x2 v + x3 w = b.

In the above equation, we used implicitly the fact that a vector z can be multiplied by a
scalar λ ∈ R, where
$$
\lambda z = \lambda \begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}
= \begin{pmatrix} \lambda z_1 \\ \lambda z_2 \\ \lambda z_3 \end{pmatrix},
$$
and two vectors y and z can be added, where
$$
y + z = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} + \begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}
= \begin{pmatrix} y_1 + z_1 \\ y_2 + z_2 \\ y_3 + z_3 \end{pmatrix}.
$$


The set of all vectors with three components is denoted by R3×1 . The reason for using
the notation R3×1 rather than the more conventional notation R3 is that the elements of
R3×1 are column vectors; they consist of three rows and a single column, which explains the
superscript 3 × 1. On the other hand, R3 = R × R × R consists of all triples of the form
(x1 , x2 , x3 ), with x1 , x2 , x3 ∈ R, and these are row vectors. However, there is an obvious
bijection between R3×1 and R3 and they are usually identified. For the sake of clarity, in this
introduction, we will denote the set of column vectors with n components by Rn×1 .
An expression such as
x1 u + x2 v + x3 w
where u, v, w are vectors and the xi s are scalars (in R) is called a linear combination. Using
this notion, the problem of solving our linear system

x1 u + x2 v + x3 w = b.

is equivalent to determining whether b can be expressed as a linear combination of u, v, w.


Now, if the vectors u, v, w are linearly independent, which means that there is no triple
(x1 , x2 , x3 ) ≠ (0, 0, 0) such that

x1 u + x2 v + x3 w = 03 ,

it can be shown that every vector in R3×1 can be written as a linear combination of u, v, w.
Here, 03 is the zero vector
$$
0_3 = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.
$$
It is customary to abuse notation and to write 0 instead of 03 . This rarely causes a problem
because in most cases, whether 0 denotes the scalar zero or the zero vector can be inferred
from the context.
In fact, every vector z ∈ R3×1 can be written in a unique way as a linear combination

z = x1 u + x2 v + x3 w.

This is because if
z = x1 u + x2 v + x3 w = y1 u + y2 v + y3 w,
then by using our (linear!) operations on vectors, we get

(y1 − x1 )u + (y2 − x2 )v + (y3 − x3 )w = 0,

which implies that


y1 − x1 = y2 − x2 = y3 − x3 = 0,
by linear independence. Thus,

y 1 = x1 , y2 = x2 , y3 = x3 ,

which shows that z has a unique expression as a linear combination, as claimed. Then, our
equation
x1 u + x2 v + x3 w = b
has a unique solution, and indeed, we can check that

x1 = 1.4
x2 = −0.4
x3 = −0.4

is the solution.
But then, how do we determine that some vectors are linearly independent?
One answer is to compute the determinant det(u, v, w), and to check that it is nonzero.
In our case,
$$
\det(u, v, w) = \begin{vmatrix} 1 & 2 & -1 \\ 2 & 1 & 1 \\ 1 & -2 & -2 \end{vmatrix} = 15,
$$
which confirms that u, v, w are linearly independent.
Other methods consist of computing an LU-decomposition or a QR-decomposition, or an
SVD of the matrix consisting of the three columns u, v, w,
$$
A = \begin{pmatrix} u & v & w \end{pmatrix}
= \begin{pmatrix} 1 & 2 & -1 \\ 2 & 1 & 1 \\ 1 & -2 & -2 \end{pmatrix}.
$$

If we form the vector of unknowns
$$
x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix},
$$
then our linear combination x1 u + x2 v + x3 w can be written in matrix form as
$$
x_1 u + x_2 v + x_3 w =
\begin{pmatrix} 1 & 2 & -1 \\ 2 & 1 & 1 \\ 1 & -2 & -2 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix},
$$

so our linear system is expressed by
$$
\begin{pmatrix} 1 & 2 & -1 \\ 2 & 1 & 1 \\ 1 & -2 & -2 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix},
$$

or more concisely as
Ax = b.
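
As a quick numerical aside (not part of the original text), the following Python/NumPy sketch builds the matrix A and the vector b above, checks that det(A) = 15, and solves Ax = b; NumPy is an assumption here, since the book itself uses no software.

```python
import numpy as np

# Columns of A are the vectors u, v, w from the example above.
A = np.array([[1.0,  2.0, -1.0],
              [2.0,  1.0,  1.0],
              [1.0, -2.0, -2.0]])
b = np.array([1.0, 2.0, 3.0])

# A nonzero determinant confirms that u, v, w are linearly independent.
print(np.linalg.det(A))       # approximately 15.0

# Since A is invertible, Ax = b has the unique solution (1.4, -0.4, -0.4).
x = np.linalg.solve(A, b)
print(x)                      # [ 1.4 -0.4 -0.4]
```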

Now, what if the vectors u, v, w are linearly dependent? For example, if we consider the
vectors
$$
u = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}, \quad
v = \begin{pmatrix} 2 \\ 1 \\ -1 \end{pmatrix}, \quad
w = \begin{pmatrix} -1 \\ 1 \\ 2 \end{pmatrix},
$$
we see that
u − v = w,
a nontrivial linear dependence. It can be verified that u and v are still linearly independent.
Now, for our problem
x1 u + x2 v + x3 w = b
to have a solution, it must be the case that b can be expressed as a linear combination of u and
v. However, it turns out that u, v, b are linearly independent (because det(u, v, b) = −6),
so b cannot be expressed as a linear combination of u and v and thus, our system has no
solution.
If we change the vector b to
$$
b = \begin{pmatrix} 3 \\ 3 \\ 0 \end{pmatrix},
$$
then
b = u + v,
and so the system
x1 u + x2 v + x3 w = b
has the solution
x1 = 1, x2 = 1, x3 = 0.
Actually, since w = u − v, the above system is equivalent to

(x1 + x3 )u + (x2 − x3 )v = b,

and because u and v are linearly independent, the unique solution in x1 + x3 and x2 − x3 is

x1 + x3 = 1
x2 − x3 = 1,

which yields an infinite number of solutions parameterized by x3 , namely

x1 = 1 − x 3
x2 = 1 + x3 .
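
For this linearly dependent example, a short NumPy check (again an editorial sketch under the assumption that NumPy is available) confirms the situation: the coefficient matrix and the augmented matrix both have rank 2, so solutions exist but are not unique, and every solution differs from a particular one by a multiple of (−1, 1, 1).

```python
import numpy as np

# Columns are the dependent vectors u, v, w (with w = u - v) and b = u + v = (3, 3, 0).
A = np.array([[1.0,  2.0, -1.0],
              [2.0,  1.0,  1.0],
              [1.0, -1.0,  2.0]])
b = np.array([3.0, 3.0, 0.0])

# rank(A) = rank([A | b]) = 2 < 3: the system is solvable, with infinitely many solutions.
print(np.linalg.matrix_rank(A))                        # 2
print(np.linalg.matrix_rank(np.column_stack([A, b])))  # 2

# One particular (minimum-norm) solution; adding t * (-1, 1, 1) gives all the others.
x_particular, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x_particular)
```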

In summary, a 3 × 3 linear system may have a unique solution, no solution, or an infinite
number of solutions, depending on the linear independence (and dependence) of the vectors
u, v, w, b. This situation can be generalized to any n × n system, and even to any n × m
system (n equations in m variables), as we will see later.
The point of view where our linear system is expressed in matrix form as Ax = b stresses
the fact that the map x ↦ Ax is a linear transformation. This means that

A(λx) = λ(Ax)

for all x ∈ R3×1 and all λ ∈ R and that

A(u + v) = Au + Av,

for all u, v ∈ R3×1 . We can view the matrix A as a way of expressing a linear map from R3×1
to R3×1 and solving the system Ax = b amounts to determining whether b belongs to the
image of this linear map.
Yet another fruitful way of interpreting the resolution of the system Ax = b is to view
this problem as an intersection problem. Indeed, each of the equations

x1 + 2x2 − x3 = 1
2x1 + x2 + x3 = 2
x1 − 2x2 − 2x3 = 3

defines a subset of R3 which is actually a plane. The first equation

x1 + 2x2 − x3 = 1

defines the plane H1 passing through the three points (1, 0, 0), (0, 1/2, 0), (0, 0, −1), on the
coordinate axes, the second equation

2x1 + x2 + x3 = 2

defines the plane H2 passing through the three points (1, 0, 0), (0, 2, 0), (0, 0, 2), on the coor-
dinate axes, and the third equation

x1 − 2x2 − 2x3 = 3

defines the plane H3 passing through the three points (3, 0, 0), (0, −3/2, 0), (0, 0, −3/2), on
the coordinate axes. The intersection Hi ∩ Hj of any two distinct planes Hi and Hj is
a line, and the intersection H1 ∩ H2 ∩ H3 of the three planes consists of the single point
(1.4, −0.4, −0.4). Under this interpretation, observe that we are focusing on the rows of the
matrix A, rather than on its columns, as in the previous interpretations.
Another great example of a real-world problem where linear algebra proves to be very
effective is the problem of data compression, that is, of representing a very large data set
using a much smaller amount of storage.

Typically the data set is represented as an m × n matrix A where each row corresponds
to an n-dimensional data point and typically, m ≥ n. In most applications, the data are not
independent so the rank of A is a lot smaller than min{m, n}, and the goal of low-rank
decomposition is to factor A as the product of two matrices B and C, where B is an m × k
matrix and C is a k × n matrix, with k ≪ min{m, n} (here, ≪ means “much smaller than”):
$$
\underbrace{A}_{m \times n} \;=\; \underbrace{B}_{m \times k}\,\underbrace{C}_{k \times n}.
$$
Now, it is generally too costly to find an exact factorization as above, so we look for a
low-rank matrix A′ which is a “good” approximation of A. In order to make this statement
precise, we need to define a mechanism to determine how close two matrices are. This can
be done using matrix norms, a notion discussed in Chapter 6. The norm of a matrix A is a
nonnegative real number ‖A‖ which behaves a lot like the absolute value |x| of a real number
x. Then, our goal is to find some low-rank matrix A′ that minimizes the norm
$$
\|A - A'\|^2,
$$
over all matrices A′ of rank at most k, for some given k ≪ min{m, n}.
Some advantages of a low-rank approximation are:

1. Fewer elements are required to represent A; namely, k(m + n) instead of mn. Thus
less storage and fewer operations are needed to reconstruct A.

2. Often, the process for obtaining the decomposition exposes the underlying structure of
the data. Thus, it may turn out that “most” of the significant data are concentrated
along some directions called principal directions.

Low-rank decompositions of a set of data have a multitude of applications in engineering,


including computer science (especially computer vision), statistics, and machine learning.
As we will see later in Chapter 15, the singular value decomposition (SVD) provides a very
satisfactory solution to the low-rank approximation problem. Still, in many cases, the data
sets are so large that another ingredient is needed: randomization. However, as a first step,
linear algebra often yields a good initial solution.
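
As an illustration of the preceding discussion (an editorial sketch; NumPy and the random test matrix are assumptions, not material from the book), a rank-k approximation can be obtained by truncating the SVD, which is the classical answer to the minimization problem above; the factors B and C then have the shapes m × k and k × n described earlier.

```python
import numpy as np

# A random matrix standing in for the m x n data matrix A discussed above.
rng = np.random.default_rng(0)
m, n, k = 100, 30, 5          # k is assumed much smaller than min(m, n)
A = rng.standard_normal((m, n))

# Truncated SVD: keep only the k largest singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
B = U[:, :k] * s[:k]          # m x k (left singular vectors scaled by singular values)
C = Vt[:k, :]                 # k x n
A_k = B @ C                   # rank-k approximation of A

print(np.linalg.matrix_rank(A_k))   # 5
print(np.linalg.norm(A - A_k))      # approximation error in the Frobenius norm
```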
We will now be more precise as to what kinds of operations are allowed on vectors. In
the early 1900s, the notion of a vector space emerged as a convenient and unifying framework
for working with “linear” objects and we will discuss this notion in the next few sections.

1.2 Vector Spaces


A (real) vector space is a set E together with two operations, + : E ×E → E and · : R×E →
E, called addition and scalar multiplication, that satisfy some simple properties. First of all,
E under addition has to be a commutative (or abelian) group, a notion that we review next.

However, keep in mind that vector spaces are not just algebraic
objects; they are also geometric objects.
Definition 1.1. A group is a set G equipped with a binary operation · : G × G → G that
associates an element a · b ∈ G to every pair of elements a, b ∈ G, and having the following
properties: · is associative, has an identity element e ∈ G, and every element in G is invertible
(w.r.t. ·). More explicitly, this means that the following equations hold for all a, b, c ∈ G:
(G1) a · (b · c) = (a · b) · c. (associativity);
(G2) a · e = e · a = a. (identity);
(G3) For every a ∈ G, there is some a−1 ∈ G such that a · a−1 = a−1 · a = e. (inverse).
A group G is abelian (or commutative) if
a · b = b · a for all a, b ∈ G.

A set M together with an operation · : M × M → M and an element e satisfying only


conditions (G1) and (G2) is called a monoid . For example, the set N = {0, 1, . . . , n, . . .} of
natural numbers is a (commutative) monoid under addition. However, it is not a group.
Some examples of groups are given below.
Example 1.1.
1. The set Z = {. . . , −n, . . . , −1, 0, 1, . . . , n, . . .} of integers is a group under addition,
with identity element 0. However, Z∗ = Z − {0} is not a group under multiplication.
2. The set Q of rational numbers (fractions p/q with p, q ∈ Z and q ≠ 0) is a group
under addition, with identity element 0. The set Q∗ = Q − {0} is also a group under
multiplication, with identity element 1.
3. Similarly, the sets R of real numbers and C of complex numbers are groups under
addition (with identity element 0), and R∗ = R − {0} and C∗ = C − {0} are groups
under multiplication (with identity element 1).
4. The sets Rn and Cn of n-tuples of real or complex numbers are groups under compo-
nentwise addition:
(x1 , . . . , xn ) + (y1 , . . . , yn ) = (x1 + y1 , . . . , xn + yn ),
with identity element (0, . . . , 0). All these groups are abelian.

5. Given any nonempty set S, the set of bijections f : S → S, also called permutations
of S, is a group under function composition (i.e., the multiplication of f and g is the
composition g ◦ f ), with identity element the identity function idS . This group is not
abelian as soon as S has more than two elements.

6. The set of n × n matrices with real (or complex) coefficients is a group under addition
of matrices, with identity element the null matrix. It is denoted by Mn (R) (or Mn (C)).

7. The set R[X] of all polynomials in one variable with real coefficients is a group under
addition of polynomials.

8. The set of n × n invertible matrices with real (or complex) coefficients is a group under
matrix multiplication, with identity element the identity matrix In . This group is
called the general linear group and is usually denoted by GL(n, R) (or GL(n, C)).

9. The set of n × n invertible matrices with real (or complex) coefficients and determinant
+1 is a group under matrix multiplication, with identity element the identity matrix
In . This group is called the special linear group and is usually denoted by SL(n, R)
(or SL(n, C)).

10. The set of n × n invertible matrices with real coefficients such that RR⊤ = In and of
determinant +1 is a group called the special orthogonal group and is usually denoted
by SO(n) (where R⊤ is the transpose of the matrix R, i.e., the rows of R⊤ are the
columns of R). It corresponds to the rotations in Rn .

11. Given an open interval (a, b), the set C(a, b) of continuous functions f : (a, b) → R is a
group under the operation f + g defined such that

(f + g)(x) = f (x) + g(x)

for all x ∈ (a, b).

It is customary to denote the operation of an abelian group G by +, in which case the


inverse a−1 of an element a ∈ G is denoted by −a.
The identity element of a group is unique. In fact, we can prove a more general fact:
Fact 1. If a binary operation · : M × M → M is associative and if e′ ∈ M is a left identity
and e″ ∈ M is a right identity, which means that

e′ · a = a for all a ∈ M (G2l)

and
a · e″ = a for all a ∈ M, (G2r)
then e′ = e″.

Proof. If we let a = e″ in equation (G2l), we get

e′ · e″ = e″,

and if we let a = e′ in equation (G2r), we get

e′ · e″ = e′,

and thus
e′ = e′ · e″ = e″,
as claimed.

Fact 1 implies that the identity element of a monoid is unique, and since every group is
a monoid, the identity element of a group is unique. Furthermore, every element in a group
has a unique inverse. This is a consequence of a slightly more general fact:
Fact 2. In a monoid M with identity element e, if some element a ∈ M has some left inverse
a′ ∈ M and some right inverse a″ ∈ M , which means that

a′ · a = e (G3l)

and
a · a″ = e, (G3r)
then a′ = a″.
Proof. Using (G3l) and the fact that e is an identity element, we have

(a′ · a) · a″ = e · a″ = a″.

Similarly, using (G3r) and the fact that e is an identity element, we have

a′ · (a · a″) = a′ · e = a′.

However, since M is a monoid, the operation · is associative, so

a′ = a′ · (a · a″) = (a′ · a) · a″ = a″,

as claimed.

Remark: Axioms (G2) and (G3) can be weakened a bit by requiring only (G2r) (the exis-
tence of a right identity) and (G3r) (the existence of a right inverse for every element) (or
(G2l) and (G3l)). It is a good exercise to prove that the group axioms (G2) and (G3) follow
from (G2r) and (G3r).
Before defining vector spaces, we need to discuss a strategic choice which, depending
on how it is settled, may reduce or increase headaches in dealing with notions such as linear
combinations and linear dependence (or independence). The issue has to do with using sets
of vectors versus sequences of vectors.
Our experience tells us that it is preferable to use sequences of vectors; even better,
indexed families of vectors. (We are not alone in having opted for sequences over sets, and
we are in good company; for example, Artin [6], Axler [8], and Lang [60] use sequences.
Nevertheless, some prominent authors such as Lax [64] use sets. We leave it to the reader
to conduct a survey on this issue.)
Given a set A, recall that a sequence is an ordered n-tuple (a1 , . . . , an ) ∈ An of elements
from A, for some natural number n. The elements of a sequence need not be distinct and
the order is important. For example, (a1 , a2 , a1 ) and (a2 , a1 , a1 ) are two distinct sequences
in A3 . Their underlying set is {a1 , a2 }.
What we just defined are finite sequences, which can also be viewed as functions from
{1, 2, . . . , n} to the set A; the ith element of the sequence (a1 , . . . , an ) is the image of i under
the function. This viewpoint is fruitful, because it allows us to define (countably) infinite
sequences as functions s : N → A. But then, why limit ourselves to ordered sets such as
{1, . . . , n} or N as index sets?
The main role of the index set is to tag each element uniquely, and the order of the
tags is not crucial, although convenient. Thus, it is natural to define an I-indexed family of
elements of A, for short a family, as a function a : I → A where I is any set viewed as an
index set. Since the function a is determined by its graph

{(i, a(i)) | i ∈ I},

the family a can be viewed as the set of pairs a = {(i, a(i)) | i ∈ I}. For notational
simplicity, we write ai instead of a(i), and denote the family a = {(i, a(i)) | i ∈ I} by (ai )i∈I .
For example, if I = {r, g, b, y} and A = N, the set of pairs

a = {(r, 2), (g, 3), (b, 2), (y, 11)}

is an indexed family. The element 2 appears twice in the family with the two distinct tags
r and b.
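
To make the notion concrete (a small editorial aside; Python is not used in the book), an I-indexed family is exactly what a dictionary represents: a function from tags to values, in which the same value may appear under several distinct tags.

```python
# The family a = {(r, 2), (g, 3), (b, 2), (y, 11)} as a Python dict: tags map to elements of A = N.
family = {"r": 2, "g": 3, "b": 2, "y": 11}

# The element 2 occurs under the two distinct tags "r" and "b".
print(sorted(family.values()))   # [2, 2, 3, 11] -- multiplicities are kept
print(set(family.values()))      # the underlying set {2, 3, 11} forgets multiplicity
```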
When the index set I is totally ordered, a family (ai )i∈I is often called an I-sequence.
Interestingly, sets can be viewed as special cases of families. Indeed, a set A can be viewed
as the A-indexed family {(a, a) | a ∈ A} corresponding to the identity function.

Remark: An indexed family should not be confused with a multiset. Given any set A, a
multiset is similar to a set, except that elements of A may occur more than once. For
example, if A = {a, b, c, d}, then {a, a, a, b, c, c, d, d} is a multiset. Each element appears
with a certain multiplicity, but the order of the elements does not matter. For example, a
has multiplicity 3. Formally, a multiset is a function s : A → N, or equivalently a set of pairs
{(a, i) | a ∈ A}. Thus, a multiset is an A-indexed family of elements from N, but not an
N-indexed family, since distinct elements may have the same multiplicity (such as c and d in
the example above). An indexed family is a generalization of a sequence, but a multiset is a
generalization of a set.
We also need to take care of an annoying technicality, which is to define sums of the
form $\sum_{i \in I} a_i$, where I is any finite index set and (ai )i∈I is a family of elements in some set
A equipped with a binary operation + : A × A → A which is associative (axiom (G1)) and
commutative. This will come up when we define linear combinations.
The issue is that the binary operation + only tells us how to compute a1 + a2 for two
elements of A, but it does not tell us what is the sum of three or more elements. For example,
how should a1 + a2 + a3 be defined?
What we have to do is to define a1 + a2 + a3 by using a sequence of steps each involving two
elements, and there are two possible ways to do this: a1 + (a2 + a3 ) and (a1 + a2 ) + a3 . If our
operation + is not associative, these are different values. If it is associative, then a1 + (a2 + a3 ) =
(a1 + a2 ) + a3 , but then there are still six possible permutations of the indices 1, 2, 3, and if
+ is not commutative, these values are generally different. If our operation is commutative,
then all six permutations have the same value. Thus, if + is associative and commutative,
it seems intuitively clear that a sum of the form $\sum_{i \in I} a_i$ does not depend on the order of the
operations used to compute it.
This is indeed the case, but a rigorous proof requires induction, and such a proof is
surprisingly involved. Readers may accept without proof the fact that sums of the form
$\sum_{i \in I} a_i$ are indeed well defined, and jump directly to Definition 1.2. For those who want to
see the gory details, here we go.
First, we define sums $\sum_{i \in I} a_i$, where I is a finite sequence of distinct natural numbers,
say I = (i1 , . . . , im ). If I = (i1 , . . . , im ) with m ≥ 2, we denote the sequence (i2 , . . . , im ) by
I − {i1 }. We proceed by induction on the size m of I. Let
$$
\sum_{i \in I} a_i = a_{i_1}, \quad \text{if } m = 1,
$$
$$
\sum_{i \in I} a_i = a_{i_1} + \Bigl( \sum_{i \in I - \{i_1\}} a_i \Bigr), \quad \text{if } m > 1.
$$

For example, if I = (1, 2, 3, 4), we have
$$
\sum_{i \in I} a_i = a_1 + (a_2 + (a_3 + a_4)).
$$

If the operation + is not associative, the grouping of the terms matters. For instance, in
general
a1 + (a2 + (a3 + a4 )) ≠ (a1 + a2 ) + (a3 + a4 ).
However, if the operation + is associative, the sum $\sum_{i \in I} a_i$ should not depend on the grouping
of the elements in I, as long as their order is preserved. For example, if I = (1, 2, 3, 4, 5),
J1 = (1, 2), and J2 = (3, 4, 5), we expect that
$$
\sum_{i \in I} a_i = \Bigl( \sum_{j \in J_1} a_j \Bigr) + \Bigl( \sum_{j \in J_2} a_j \Bigr).
$$

This is indeed the case, as we have the following proposition.

Proposition 1.1. Given any nonempty set A equipped with an associative binary operation
+ : A × A → A, for any nonempty finite sequence I of distinct natural numbers and for
any partition of I into p nonempty sequences Ik1 , . . . , Ikp , for some nonempty sequence K =
(k1 , . . . , kp ) of distinct natural numbers such that ki < kj implies that α < β for all α ∈ Iki
and all β ∈ Ikj , for every sequence (ai )i∈I of elements in A, we have
$$
\sum_{\alpha \in I} a_\alpha = \sum_{k \in K} \Bigl( \sum_{\alpha \in I_k} a_\alpha \Bigr).
$$

Proof. We proceed by induction on the size n of I.


If n = 1, then we must have p = 1 and Ik1 = I, so the proposition holds trivially.
Next, assume n > 1. If p = 1, then Ik1 = I and the formula is trivial, so assume that
p ≥ 2 and write J = (k2 , . . . , kp ). There are two cases.
Case 1. The sequence Ik1 has a single element, say β, which is the first element of I.
In this case, write C for the sequence obtained from I by deleting its first element β. By
definition,
$$
\sum_{\alpha \in I} a_\alpha = a_\beta + \Bigl( \sum_{\alpha \in C} a_\alpha \Bigr),
$$
and
$$
\sum_{k \in K} \Bigl( \sum_{\alpha \in I_k} a_\alpha \Bigr)
= a_\beta + \Bigl( \sum_{j \in J} \Bigl( \sum_{\alpha \in I_j} a_\alpha \Bigr) \Bigr).
$$
Since |C| = n − 1, by the induction hypothesis, we have
$$
\sum_{\alpha \in C} a_\alpha = \sum_{j \in J} \Bigl( \sum_{\alpha \in I_j} a_\alpha \Bigr),
$$
which yields our identity.


Case 2. The sequence Ik1 has at least two elements. In this case, let β be the first element
of I (and thus of Ik1 ), let I′ be the sequence obtained from I by deleting its first element β,
let I′k1 be the sequence obtained from Ik1 by deleting its first element β, and let I′ki = Iki for
i = 2, . . . , p. Recall that J = (k2 , . . . , kp ) and K = (k1 , . . . , kp ). The sequence I′ has n − 1
elements, so by the induction hypothesis applied to I′ and the I′ki , we get
$$
\sum_{\alpha \in I'} a_\alpha
= \sum_{k \in K} \Bigl( \sum_{\alpha \in I'_k} a_\alpha \Bigr)
= \Bigl( \sum_{\alpha \in I'_{k_1}} a_\alpha \Bigr) + \sum_{j \in J} \Bigl( \sum_{\alpha \in I_j} a_\alpha \Bigr).
$$

If we add the lefthand side to aβ , by definition we get
$$
\sum_{\alpha \in I} a_\alpha.
$$
If we add the righthand side to aβ , using associativity and the definition of an indexed sum,
we get
$$
\begin{aligned}
a_\beta + \Bigl( \Bigl( \sum_{\alpha \in I'_{k_1}} a_\alpha \Bigr) + \sum_{j \in J} \Bigl( \sum_{\alpha \in I_j} a_\alpha \Bigr) \Bigr)
&= \Bigl( a_\beta + \sum_{\alpha \in I'_{k_1}} a_\alpha \Bigr) + \sum_{j \in J} \Bigl( \sum_{\alpha \in I_j} a_\alpha \Bigr) \\
&= \Bigl( \sum_{\alpha \in I_{k_1}} a_\alpha \Bigr) + \sum_{j \in J} \Bigl( \sum_{\alpha \in I_j} a_\alpha \Bigr) \\
&= \sum_{k \in K} \Bigl( \sum_{\alpha \in I_k} a_\alpha \Bigr),
\end{aligned}
$$
as claimed.
If I = (1, . . . , n), we also write $\sum_{i=1}^{n} a_i$ instead of $\sum_{i \in I} a_i$. Since + is associative, Proposition 1.1 shows that the sum $\sum_{i=1}^{n} a_i$ is independent of the grouping of its elements, which
justifies the use of the notation a1 + · · · + an (without any parentheses).
If we also assume that our associative binary operation on A is commutative, then we
can show that the sum $\sum_{i \in I} a_i$ does not depend on the ordering of the index set I.

Proposition 1.2. Given any nonempty set A equipped with an associative and commutative
binary operation + : A × A → A, for any two nonempty finite sequences I and J of distinct
natural numbers such that J is a permutation of I (in other words, the underlying sets of I
and J are identical), for every sequence (ai )i∈I of elements in A, we have
$$
\sum_{\alpha \in I} a_\alpha = \sum_{\alpha \in J} a_\alpha.
$$

Proof. We proceed by induction on the number p of elements in I. If p = 1, we have I = J
and the proposition holds trivially.
If p > 1, to simplify notation, assume that I = (1, . . . , p) and that J is a permutation
(i1 , . . . , ip ) of I. First, assume that 2 ≤ i1 ≤ p − 1, let J′ be the sequence obtained from J by
deleting i1 , let I′ be the sequence obtained from I by deleting i1 , and let P = (1, 2, . . . , i1 − 1) and
Q = (i1 + 1, . . . , p − 1, p). Observe that the sequence I′ is the concatenation of the sequences
P and Q. By the induction hypothesis applied to J′ and I′ , and then by Proposition 1.1
applied to I′ and its partition (P, Q), we have
$$
\sum_{\alpha \in J'} a_\alpha = \sum_{\alpha \in I'} a_\alpha
= \Bigl( \sum_{i=1}^{i_1 - 1} a_i \Bigr) + \Bigl( \sum_{i = i_1 + 1}^{p} a_i \Bigr).
$$

If we add the lefthand side to ai1 , by definition we get
$$
\sum_{\alpha \in J} a_\alpha.
$$
If we add the righthand side to ai1 , we get
$$
a_{i_1} + \Bigl( \Bigl( \sum_{i=1}^{i_1 - 1} a_i \Bigr) + \Bigl( \sum_{i = i_1 + 1}^{p} a_i \Bigr) \Bigr).
$$

Using associativity, we get
$$
a_{i_1} + \Bigl( \Bigl( \sum_{i=1}^{i_1 - 1} a_i \Bigr) + \Bigl( \sum_{i = i_1 + 1}^{p} a_i \Bigr) \Bigr)
= \Bigl( a_{i_1} + \sum_{i=1}^{i_1 - 1} a_i \Bigr) + \Bigl( \sum_{i = i_1 + 1}^{p} a_i \Bigr),
$$
then using associativity and commutativity several times (more rigorously, using induction
on i1 − 1), we get
$$
\Bigl( a_{i_1} + \sum_{i=1}^{i_1 - 1} a_i \Bigr) + \Bigl( \sum_{i = i_1 + 1}^{p} a_i \Bigr)
= \Bigl( \sum_{i=1}^{i_1 - 1} a_i \Bigr) + \Bigl( a_{i_1} + \sum_{i = i_1 + 1}^{p} a_i \Bigr)
= \sum_{i=1}^{p} a_i,
$$
as claimed.
The cases where i1 = 1 or i1 = p are treated similarly, but in a simpler manner since
either P = () or Q = () (where () denotes the empty sequence).
Having done all this, we can now make sense of sums of the form $\sum_{i \in I} a_i$, for any finite
indexed set I and any family a = (ai )i∈I of elements in A, where A is a set equipped with a
binary operation + which is associative and commutative.
Indeed, since I is finite, it is in bijection with the set {1, . . . , n} for some n ∈ N, and any
total ordering ≼ on I corresponds to a permutation I≼ of {1, . . . , n} (where we identify a
permutation with its image). For any total ordering ≼ on I, we define $\sum_{i \in I, \preceq} a_i$ as
$$
\sum_{i \in I, \preceq} a_i = \sum_{j \in I_{\preceq}} a_j.
$$
Then, for any other total ordering ≼′ on I, we have
$$
\sum_{i \in I, \preceq'} a_i = \sum_{j \in I_{\preceq'}} a_j,
$$
and since I≼ and I≼′ are different permutations of {1, . . . , n}, by Proposition 1.2, we have
$$
\sum_{j \in I_{\preceq}} a_j = \sum_{j \in I_{\preceq'}} a_j.
$$
Therefore, the sum $\sum_{i \in I, \preceq} a_i$ does not depend on the total ordering on I. We define the sum
$\sum_{i \in I} a_i$ as the common value $\sum_{i \in I, \preceq} a_i$ for all total orderings ≼ of I.

Vector spaces are defined as follows.

Definition 1.2. A real vector space is a set E (of vectors) together with two operations
+ : E × E → E (called vector addition)¹ and · : R × E → E (called scalar multiplication)
satisfying the following conditions for all α, β ∈ R and all u, v ∈ E:

(V0) E is an abelian group w.r.t. +, with identity element 0;²

(V1) α · (u + v) = (α · u) + (α · v);

(V2) (α + β) · u = (α · u) + (β · u);

(V3) (α ∗ β) · u = α · (β · u);

(V4) 1 · u = u.

In (V3), ∗ denotes multiplication in R.

Given α ∈ R and v ∈ E, the element α · v is also denoted by αv. The field R is often
called the field of scalars.
In Definition 1.2, the field R may be replaced by the field of complex numbers C, in
which case we have a complex vector space. It is even possible to replace R by the field of
rational numbers Q or by any other field K (for example Z/pZ, where p is a prime number),
in which case we have a K-vector space (in (V3), ∗ denotes multiplication in the field K).
In most cases, the field K will be the field R of reals.
From (V0), a vector space always contains the null vector 0, and thus is nonempty.
From (V1), we get α · 0 = 0, and α · (−v) = −(α · v). From (V2), we get 0 · v = 0, and
(−α) · v = −(α · v).
¹The symbol + is overloaded, since it denotes both addition in the field R and addition of vectors in E.
It is usually clear from the context which + is intended.
²The symbol 0 is also overloaded, since it represents both the zero in R (a scalar) and the identity element
of E (the zero vector). Confusion rarely arises, but one may prefer using 0 for the zero vector.

Another important consequence of the axioms is the following fact: For any u ∈ E and
any λ ∈ R, if λ ≠ 0 and λ · u = 0, then u = 0.
Indeed, since λ ≠ 0, it has a multiplicative inverse λ−1 , so from λ · u = 0, we get

λ−1 · (λ · u) = λ−1 · 0.

However, we just observed that λ−1 · 0 = 0, and from (V3) and (V4), we have

λ−1 · (λ · u) = (λ−1 λ) · u = 1 · u = u,

and we deduce that u = 0.

Remark: One may wonder whether axiom (V4) is really needed. Could it be derived from
the other axioms? The answer is no. For example, one can take E = Rn and define
· : R × Rn → Rn by
λ · (x1 , . . . , xn ) = (0, . . . , 0)
for all (x1 , . . . , xn ) ∈ Rn and all λ ∈ R. Axioms (V0)–(V3) are all satisfied, but (V4) fails.
Less trivial examples can be given using the notion of a basis, which has not been defined
yet.
The field R itself can be viewed as a vector space over itself, addition of vectors being
addition in the field, and multiplication by a scalar being multiplication in the field.
Example 1.2.
1. The fields R and C are vector spaces over R.

2. The groups Rn and Cn are vector spaces over R, and Cn is a vector space over C.

3. The ring R[X]n of polynomials of degree at most n with real coefficients is a vector
space over R, and the ring C[X]n of polynomials of degree at most n with complex
coefficients is a vector space over C.

4. The ring R[X] of all polynomials with real coefficients is a vector space over R, and
the ring C[X] of all polynomials with complex coefficients is a vector space over C.

5. The ring of n × n matrices Mn (R) is a vector space over R.

6. The ring of m × n matrices Mm,n (R) is a vector space over R.

7. The ring C(a, b) of continuous functions f : (a, b) → R is a vector space over R.

Let E be a vector space. We would like to define the important notions of linear combi-
nation and linear independence. These notions can be defined for sets of vectors in E, but
it will turn out to be more convenient (in fact, necessary) to define them for families (vi )i∈I ,
where I is any arbitrary index set.

1.3 Linear Independence, Subspaces


One of the most useful properties of vector spaces is that they possess bases. What this
means is that in every vector space, E, there is some set of vectors, {e1 , . . . , en }, such that
every vector v ∈ E can be written as a linear combination,
v = λ1 e1 + · · · + λn en ,
of the ei , for some scalars, λ1 , . . . , λn ∈ R. Furthermore, the n-tuple, (λ1 , . . . , λn ), as above
is unique.
This description is fine when E has a finite basis, {e1 , . . . , en }, but this is not always the
case! For example, the vector space of real polynomials, R[X], does not have a finite basis
but instead it has an infinite basis, namely
1, X, X 2 , . . . , X n , . . .
For simplicity, in this chapter, we will restrict our attention to vector spaces that have a
finite basis (we say that they are finite-dimensional ).
Given a set A, recall that an I-indexed family (ai )i∈I of elements of A (for short, a family)
is a function a : I → A, or equivalently a set of pairs {(i, ai ) | i ∈ I}. We agree that when
I = ∅, (ai )i∈I = ∅. A family (ai )i∈I is finite if I is finite.
Remark: When considering a family (ai )i∈I , there is no reason to assume that I is ordered.
The crucial point is that every element of the family is uniquely indexed by an element of
I. Thus, unless specified otherwise, we do not assume that the elements of an index set are
ordered.
Given two disjoint sets I and J, the union of two families (ui )i∈I and (vj )j∈J , denoted as
(ui )i∈I ∪ (vj )j∈J , is the family (wk )k∈(I∪J) defined such that wk = uk if k ∈ I, and wk = vk
if k ∈ J. Given a family (ui )i∈I and any element v, we denote by (ui )i∈I ∪k (v) the family
(wi )i∈I∪{k} defined such that wi = ui if i ∈ I, and wk = v, where k is any index such that
k ∉ I. Given a family (ui )i∈I , a subfamily of (ui )i∈I is a family (uj )j∈J where J is any subset
of I.
In this chapter, unless specified otherwise, it is assumed that all families of scalars are
finite (i.e., their index set is finite).
Definition 1.3. Let E be a vector space. A vector v ∈ E is a linear combination of a family
(ui )i∈I of elements of E iff there is a family (λi )i∈I of scalars in R such that
$$
v = \sum_{i \in I} \lambda_i u_i.
$$
When I = ∅, we stipulate that v = 0. (By Proposition 1.2, sums of the form $\sum_{i \in I} \lambda_i u_i$ are
well defined.) We say that a family (ui )i∈I is linearly independent iff for every family (λi )i∈I
of scalars in R,
$$
\sum_{i \in I} \lambda_i u_i = 0 \quad \text{implies that} \quad \lambda_i = 0 \text{ for all } i \in I.
$$

Equivalently, a family (ui )i∈I is linearly dependent iff there is some family (λi )i∈I of scalars
in R such that
$$
\sum_{i \in I} \lambda_i u_i = 0 \quad \text{and} \quad \lambda_j \neq 0 \text{ for some } j \in I.
$$

We agree that when I = ∅, the family ∅ is linearly independent.

Observe that defining linear combinations for families of vectors rather than for sets of
vectors has the advantage that the vectors being combined need not be distinct. For example,
for I = {1, 2, 3} and the families (u, v, u) and (λ1 , λ2 , λ1 ), the linear combination
$$
\sum_{i \in I} \lambda_i u_i = \lambda_1 u + \lambda_2 v + \lambda_1 u
$$

makes sense. Using sets of vectors in the definition of a linear combination does not allow
such linear combinations; this is too restrictive.
Unravelling Definition 1.3, a family (ui )i∈I is linearly dependent iff either I consists of a
single element, say i, and ui = 0, or |I| ≥ 2 and some uj in the family can be expressed as
a linear combination of the other vectors in the family. Indeed, in the second case, there is
some family (λi )i∈I of scalars in R such that
$$
\sum_{i \in I} \lambda_i u_i = 0 \quad \text{and} \quad \lambda_j \neq 0 \text{ for some } j \in I,
$$
and since |I| ≥ 2, the set I − {j} is nonempty and we get
$$
u_j = -\lambda_j^{-1} \sum_{i \in (I - \{j\})} \lambda_i u_i.
$$

Observe that one of the reasons for defining linear dependence for families of vectors
rather than for sets of vectors is that our definition allows multiple occurrences of a vector.
This is important because a matrix may contain identical columns, and we would like to say
that these columns are linearly dependent. The definition of linear dependence for sets does
not allow us to do that.
The above also shows that a family (ui )i∈I is linearly independent iff either I = ∅, or I
consists of a single element i and ui ≠ 0, or |I| ≥ 2 and no vector uj in the family can be
expressed as a linear combination of the other vectors in the family.
When I is nonempty, if the family (ui )i∈I is linearly independent, note that ui ≠ 0 for
all i ∈ I. Otherwise, if ui = 0 for some i ∈ I, then we get a nontrivial linear dependence
$\sum_{i \in I} \lambda_i u_i = 0$ by picking any nonzero λi and letting λk = 0 for all k ∈ I with k ≠ i, since
λi 0 = 0. If |I| ≥ 2, we must also have ui ≠ uj for all i, j ∈ I with i ≠ j, since otherwise we
get a nontrivial linear dependence by picking λi = λ and λj = −λ for any nonzero λ, and
letting λk = 0 for all k ∈ I with k ≠ i, j.

Thus, the definition of linear independence implies that a nontrivial linearly independent
family is actually a set. This explains why certain authors choose to define linear indepen-
dence for sets of vectors. The problem with this approach is that linear dependence, which
is the logical negation of linear independence, is then only defined for sets of vectors. How-
ever, as we pointed out earlier, it is really desirable to define linear dependence for families
allowing multiple occurrences of the same vector.

Example 1.3.

1. Any two distinct scalars λ, µ ≠ 0 in R are linearly dependent.

2. In R3 , the vectors (1, 0, 0), (0, 1, 0), and (0, 0, 1) are linearly independent.

3. In R4 , the vectors (1, 1, 1, 1), (0, 1, 1, 1), (0, 0, 1, 1), and (0, 0, 0, 1) are linearly indepen-
dent.

4. In R2 , the vectors u = (1, 1), v = (0, 1) and w = (2, 3) are linearly dependent, since

w = 2u + v.
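
A quick numerical check of items 2–4 (an editorial sketch assuming NumPy, which the book does not use): a finite family of vectors in Rⁿ is linearly independent exactly when the matrix having those vectors as its rows has rank equal to the number of vectors.

```python
import numpy as np

# Each family is stacked as the rows of a matrix.
fam_r3 = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])                         # item 2
fam_r4 = np.array([[1, 1, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1], [0, 0, 0, 1]])  # item 3
fam_r2 = np.array([[1, 1], [0, 1], [2, 3]])                                  # item 4: u, v, w = 2u + v

print(np.linalg.matrix_rank(fam_r3))  # 3 -> linearly independent
print(np.linalg.matrix_rank(fam_r4))  # 4 -> linearly independent
print(np.linalg.matrix_rank(fam_r2))  # 2 < 3 -> linearly dependent
```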

When I is finite, we often assume that it is the set I = {1, 2, . . . , n}. In this case, we
denote the family (ui )i∈I as (u1 , . . . , un ).
The notion of a subspace of a vector space is defined as follows.

Definition 1.4. Given a vector space E, a subset F of E is a linear subspace (or subspace)
of E iff F is nonempty and λu + µv ∈ F for all u, v ∈ F , and all λ, µ ∈ R.

It is easy to see that a subspace F of E is indeed a vector space, since the restriction
of + : E × E → E to F × F is indeed a function + : F × F → F , and the restriction of
· : R × E → E to R × F is indeed a function · : R × F → F .
It is also easy to see that any intersection of subspaces is a subspace.
Since F is nonempty, if we pick any vector u ∈ F and if we let λ = µ = 0, then
λu + µu = 0u + 0u = 0, so every subspace contains the vector 0. For any nonempty finite
index set I, one can show by induction on the cardinality of I that if (ui )i∈I is any family of
vectors ui ∈ F and (λi )i∈I is any family of scalars, then $\sum_{i \in I} \lambda_i u_i \in F$.
The subspace {0} will be denoted by (0), or even 0 (with a mild abuse of notation).

Example 1.4.

1. In R2 , the set of vectors u = (x, y) such that

x+y =0

is a subspace.
Another random document with
no related content on Scribd: