Download as pdf or txt
Download as pdf or txt
You are on page 1of 2

CS 50: Software Design and Implementation

Lecture 23
How to Build Your Own C Library
You have been using libraraies in this course but we haven’t discussed them in detail or discussed more advanced uses of this programming and development feature; that is, how to
build you own C library. Why would you want to do this? And, how would you do this. In a general sense libraries are collections of precompiled functions that have been written to be
reused by other programmers. In your case, imagine, three different programmers were designing and coding up TinySearch. Once they completed the overall design then they might
come to the conclusion that there are a number of common functions that could be reused by each component being coded up: for example all the code in the lab6/src/util directory.
These common functions in util could be built as a library that each programmer could link as a static library (lib.a) or better built as a shared object (lib.so) that is dyanmically shared.
We will discuss these various static and dynamic options and as part of Lab6 you will create a static libray from the functions in the util directory and link to that library at compile time.
This will require you to build a library and the change your makefile to build the crawler, indexer and query engine so that these objects use that library. Understanding and being able to
build a library is another one of those skills you need to get your cs50 hackers badge.

Goals

We plan to learn the following from today’s lecture:

What are different types of libraries and their uses.


Example of linking to libm.a
Building a static library: libtseutil.a
Listing and linking to libtseutil.a

The Basics - Libraries

Libaries consist of a set of related functions to perform a common task; for example, the standard C library, ‘libc.a’, is automatically linked into your programs by the “gcc” compiler and
can be found at /usr/lib/libc.a. Standard system libraries are usually found in /lib and /usr/lib/ directories. Check out those directories. By default the gcc compiler or more specifically its
linker needs to be directed to which libraries to search other than the standard C library - which is included by default.

There are a number of conventions for naming libraries and telling the compiler where to find them that we will discuss in this lecture. A libray filename always starts with lib. The last
part of the name determines what type of library it is:

.a: static, traditional libraries. Applications link to these libraries of object code.

.so: dynamically linked shared object libraries. These libraries can either be linked in at runtime but statically aware or loaded during execution by the dynamic link loader.

The way to view a static library is that it is linked by the linker and included in the execution code. So if 10 applications linked in a static library it would mean that each application’s
resulting binary would include the referenced library in its program binary. This leads to large executable files. To address this people use shared libraries. These libraries contain the
same references to those found in static ones but the code for those functions are not directly included in the resulting executable. Rather, shared libraries access a single copy of the
libray that is shared by all the 10 applications while all excuting at the same time. There is some operating system magic to make this happen safely but it is a foundation on modern
computing.

As in the case of stdio.h the library has to provide a header file that is included in the source code that defines prototype functions and variables that the library prpovides. Many times
there are functions not included in the standard C library that you need to get access to. For example, take the example code below: it uses functions provided in the the math library
(i.e., libm.a) and defined in math.h which is included as a header file in the code below.

The trig.c code comes from another nice book on Linux Programming by Masters and Blum trig.c

/*
* Professional Linux Programming - Trig Functions
* Author: Jon Masters <jcm@jonmasters.org>
*/

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define MAX_INPUT 25

int main(int argc, char **argv)


{
char input[MAX_INPUT];
double angle;

printf("Give me an angle (in radians) ==> ");


if (!fgets(input, MAX_INPUT, stdin)) {
perror("an error occurred.\n");
}

angle = strtod(input, NULL);

printf("sin(%e) = %e\n", angle, sin(angle));


printf("cos(%e) = %e\n", angle, cos(angle));
printf("tan(%e) = %e\n", angle, tan(angle));

return 0;
}

The trig program relies on the math library libm.a so it is necessary to tell the linker explicity to include that library. Note, that on Linux machines that the math library is included as part
of the GLIBC (libc.a) package so you don’t really need to do this on Linux. But the point is that you can’t rely on that for many C functions that are provided by external functions.
Therefore, it’s you should tell gcc which library to include as below:

mygcc -o trig -lm trig.c

The -lm option tells gcc to search the system provided math library (libm). It will look in /lib and /usr/lib to find the library as discussed above. the “m” is the name of the library libm.a
The “-l” option is the library name: -llibrary. gcc searches for the library named library when linking, which is actually a file named liblibrary.a.

The following is a tutorial on building libraries with focus on the Tiny Search Engine. This tutorial is largely adapted from http://www.yolinux.com/TUTORIALS/LibraryArchives-
StaticAndDynamic.html).

We will not cover building shared dynamic libraries in the lecture below but if interested check out the tutorial that provides a nice set of simple examples.

Building a Static C Library

Your crawler code is probably factored into two components, namely crawler and util, for which the former contains the main control logic of the crawler and the latter is more of a
general purpose toolbox that you utilize in your crawler and in other components of your Tiny Search Engine. So, would it be cool if you can make your util into a C library and just link
to it when you need to use functions from it, instead of having to copy over all the util C files as you work on different components of your search engine? This short tutorial is going to
show you just how to do that.

We will be using the crawler sample solution code as our example to walkthrough how to build its util into a static C library and how to link to it to create the crawler binary executable.

The crawler code is divided into two directories: crawler and util, we want to build the util into a C library.

Compiling .c to .o

Navigate into util directory and compile all the .c source files into .o object files by supplying the -c flag:
$ gcc -Wall -c *.c

You should see a number of .o object files created corresponding to the .c source files.

Creating a Library File

Next, we issue:

$ ar -cvq libtseutil.a *.o


q - dictionary.o
q - file.o
q - hash.o
q - html.o

which takes all the .o object files generated from previous step and packages them into a single .a static library file, named lib[xxxxx].a. In our case, since we are making the util
library for the tiny search engine (tse), we named it libtseutil.a. Notice, that the name must start with lib and end with the .a extension. (ar is the Linux archiver utility, man it to learn
about it and its options.)

The ar command does the heavy lifting and – creates and maintains library archives – take a look at the man pages to determine the detail meaning of the flags; in brief:

-c Whenever an archive is created, an informational message to that effect is written to standard error.
If the -c option is specified, ar creates the archive silently.

-v Provide verbose output.

-q Quickly append the specified files to the archive. If the archive does not exist a new archive file is created.

We have successfully built our own C library file and we are ready to link it to build our crawler binary. But before that, we can also take an optional step to view what files our library
contains.

Listing Files in a Library

You can do

$ ar -t libtseutil.a

to see what files this library includes. For example, following is the output, indicating the four .o files included in this library file.

$ ar -t libtseutil.a
dictionary.o
file.o
hash.o
html.o

Linking with a Static C Library

Now we navigate to the crawler directory and build our executable binary. There are two ways of doing this; we can do

$ gcc -o crawler crawler.c list.c ../util/libtseutil.a

in which we directly specify the path to the the library file; or we can do
$ gcc -o crawler crawler.c list.c -L../util/ -ltseutil

in which we specify the directory path containing the library file with the -L flag and specify that we want to link to our TinySearchEngine util library with the switch -ltseutil. Pay close
attention to the spelling of this switch and the name we gave to our library file: this switch is basically cooked up by just cutting off the lib prefix and .a extension from the library name
and sticking a -l before what’s left.

What you have to do for Lab6

In lab6 we want you to create a static library as above and link it into the three main components if needed. The following steps serve to outline a general approach to do this:

First, your code structure should contain a single util folder, and three other folders corresponding to the control blocks of the three components of the search engine.
Navigate into the util folder, and create a Makefile that uses ar and gcc to create the static library that you want to use for all the three components of the search engine. And of
course also include things like clean in this Makefile.
Next, move into each of the component folders; for each component, create a Makefile that treats the util static library as a part of the dependency chain when building each
component; just cd to util folder and call make there. For example, in your indexer Makefile, you could have something like
$(UTILLIB): $(UTILC) $(UTILH)
cd $(UTILDIR); make;

where UTILLIB might point to the name of the static library, UTIL[HC] the header and source files in your util directory. You don’t have to worry about cd back into your component
directory from util directory because every single line in Makefile runs in a separate process.
Note that, when specifying the source files in the Makefile for any of the components, you do NOT want to include the util C source files; that’s the whole point of making our own
static library, isn’t it?

An example snippet of the Makefile for crawler could be as follows:

This is beyond a hint. You work it out.

# Filename: Makefile
# Description: The make file is to build up the crawler.
# Warning: See how to use CFLAGS1 to build file.c
CC=gcc
CFLAGS= -Wall -pedantic -std=c99
SOURCES=./list.h ./list.c ./crawler.c ./crawler.h
CFILES=./list.c ./crawler.c

UTILDIR=../util/
UTILFLAG=-ltseutil
UTILLIB=$(UTILDIR)libtseutil.a
UTILC=$(UTILDIR)hash.c $(UTILDIR)html.c $(UTILDIR)file.c $(UTILDIR)dictionary.c
UTILH=$(UTILC:.c=.h)

crawler:$(SOURCES) $(UTILDIR)header.h $(UTILLIB)


$(CC) $(CFLAGS) -o $@ $(CFILES) -L$(UTILDIR) $(UTILFLAG)

$(UTILLIB): $(UTILC) $(UTILH)


cd $(UTILDIR); make;

NOTE: Once you have made a library you can not debug it even if the debug flags are set in the Makefile. For example, the symbols are not available for the functions in
tseutil – so you can inspect data or set break points, etc. The best thing is to make sure your util code is debugged before creating libtseutil.a. Don’t create the library
while you have bugs.

Note to lecturer: run example in l23/util. Then go to lab6/src/util and lab6/src/crawler and run and discuss the Makefile.

You might also like