03 CS107 Practice Midterm

CS107 Summer 2012
Handout 03 July 19th, 2012
CS107 Practice Midterm Exam

Midterm Exam: Wednesday, July 25th, 2012 Nvidia Auditorium 7:00 p.m. until 9:00 p.m.
Open book/notes
You may bring textbooks, notes, handouts, code printouts, etc. to refer to during the exam. No computers, phones, PDAs, or other electronic devices may be used. The exam will include a list of relevant stdlib function prototypes. Don't let the "open-book" nature delude you into not preparing. There will not be time to learn or relearn the material during the exam. You must come prepared to answer questions, referring to your notes only for the occasional detail.

Material
The exam will concentrate on material covered in the labs and assignments. For the midterm, this means questions that delve into strings, pointers, arrays, function pointers, low-level memory manipulation, and data representation (e.g. bits, ints, floats). (IA32 assembly will not be tested on the midterm.) When evaluating your C code, we will not be picky about minor syntax errors or oversights (missing braces around a block of clearly indented code, forgetting to declare an integer variable, omitting #includes, etc.), but there are subtleties that matter immensely. For example, an int** is just one character different from an int*, yet there is a world of difference between the two! The &s, *s, []s, and typecasts are things that really matter.

Practice
You're unlikely to do well on a test if you don't understand the core concepts, but the converse isn't guaranteed: understanding the concepts doesn't ensure a good score. Some students who are quite accomplished in practice don't manage to demonstrate that same proficiency in the exam setting. Writing code on paper under time constraints is different from working at the computer, and most students need practice to adapt their skills. We recommend you sit down with the problems and write out solutions in longhand. This is much more valuable than a passive review of the problem alongside its solution, where it is too easy to conclude "ah, yes, I would have done that" only to find yourself sad during the real exam when there is no provided solution to agree with!

The rest of this handout is based on the midterm given last winter in CS107, so you can consider the questions fairly representative in terms of format, difficulty, and content. The one key difference is that those questions were written for a 180-minute exam, and your exam will be written to take about 75 minutes, even though I'm giving you two hours to take it. I'll also confess that the second problem was a little longer than it needed to be.
Problem 1: The Science of Word Recognition

a) Assume you've been given the following pair of double-specific functions.

    void SwapDoubles(double *one, double *two)
    {
        double temp = *one;
        *one = *two;
        *two = temp;
    }

    void ShuffleDoubleArray(double array[], int n)
    {
        for (int i = 0; i < n; i++) {
            int j = RandomInteger(i, n - 1);
            SwapDoubles(&array[i], &array[j]);
        }
    }

ShuffleDoubleArray accepts an array of doubles and replaces it with a random permutation of those same doubles. [We should assume, of course, that RandomInteger(a, b) returns a random integer between a and b inclusive.] Your task is to leverage the above and the generic Swap we wrote together in lecture to implement a generic Shuffle algorithm that works for any base type, not just doubles. Assume that the base address of the entire array is expressed as base, the number of elements is given as n, and that the size of the figures in the array is given by elemSize.

    void Swap(void *one, void *two, int elemSize)
    {
        char temp[elemSize];
        memcpy(temp, one, elemSize);
        memcpy(one, two, elemSize);
        memcpy(two, temp, elemSize);
    }

    void Shuffle(void *base, int n, int elemSize)
    {

b) Those who study word recognition long ago made the interesting discovery that people are generally able to read a sentence even if the letters in each word are shuffled, provided the first and last letter of each word are in the proper place. That means that:

    You slhoud hvae ltlite ploerbm rdnaeig tihs eevn thguoh the lrttees are suheflfd

Using your freshly minted Shuffle algorithm, along with the set of standard C string functions, write the JumbleSentence routine, which assumes a simple sentence of space-delimited words resides in the sentence buffer, and jumbles all internal characters of each word. For simplicity, assume the sentence is comprised of purely alphabetic words [no hyphens, no apostrophes, etc.] separated by single space characters, that there are no leading or trailing spaces, and that there aren't any punctuation marks [so no periods or exclamation marks] at the end. Ensure that the first and last letters of each word stay in place. You should assume the entire sentence is terminated by a '\0'.

    void JumbleSentence(char *sentence)
    {
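
As a warm-up that stops short of a full answer, the sketch below exercises the generic Swap from part (a) on a struct type. Swap is copied from the problem statement; NthElemAddress, struct Point, and SwapEndsOfThreePoints are hypothetical names introduced only for this illustration. The point it makes is that element addresses in a void * array have to be computed in bytes.

```c
#include <assert.h>
#include <string.h>

/* Generic Swap, copied from the problem statement. */
void Swap(void *one, void *two, int elemSize)
{
    char temp[elemSize];            /* C99 variable-length scratch buffer */
    memcpy(temp, one, elemSize);
    memcpy(one, two, elemSize);
    memcpy(two, temp, elemSize);
}

/* Hypothetical helper (not in the handout): address of element `index`
   in an array whose elements are each `elemSize` bytes wide. Arithmetic
   on void * is not standard C, so the offset is computed through char *. */
void *NthElemAddress(void *base, int index, int elemSize)
{
    assert(index >= 0 && elemSize > 0);
    return (char *)base + index * elemSize;
}

/* Hypothetical element type, used only for this illustration. */
struct Point { int x; int y; };

/* Swaps the first and last of three Points through the generic interface. */
void SwapEndsOfThreePoints(struct Point pts[3])
{
    Swap(NthElemAddress(pts, 0, sizeof(struct Point)),
         NthElemAddress(pts, 2, sizeof(struct Point)),
         sizeof(struct Point));
}
```

Calling Swap(&a, &b, sizeof(double)) on two doubles works the same way; only the element size changes.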
Problem 2: The CLexicon

One of the most common data structures around is the lexicon, which is a dictionary that stores words, but doesn't bother to store any of their definitions. While many applications might need more than that, some applications only need to know if a given string is a meaningful, properly spelled word in the English language. For this exam problem, you'll be implementing a CLexicon ADT to store an arbitrarily large collection of purely alphabetic words, but you'll be pinned to storing them in a way that I lay out for you very, very clearly. Here is the condensed interface file:

    typedef struct CLexiconImplementation CLexicon;
    typedef void (*CLexiconMapFunction)(const char *word, void *auxData);

    CLexicon *CLexiconCreate();
    bool CLexiconContains(CLexicon *cl, const char *word);
    void CLexiconAdd(CLexicon *cl, const char *word);
    void CLexiconMap(CLexicon *cl, CLexiconMapFunction mapfn, void *auxData);
    void CLexiconDispose(CLexicon *cl);

Here's the struct definition used by the implementations:

    struct CLexiconImplementation {
        CVector *buckets[676];
    };

You're given the complete struct CLexiconImplementation definition, but it'll be up to you to implement the five functions. Here's the design you must adhere to:

Each word is stored in one of 676 CVectors. In particular, words beginning with "aa" are stored in the first CVector, words beginning with "ab" are stored in the second CVector, and so forth. The first 26 CVectors all store words beginning with the character 'a', so that the 27th CVector stores words beginning with "ba", the 28th CVector stores words beginning with "bb", and so forth. For simplicity, we'll assume that all words are of length two or more; specifically, you needn't worry about the empty string or single-character words. You should also assume that all words are purely alphabetic, lowercase, and free of punctuation marks.

Because the first two characters are effectively captured by a CVector index, they aren't explicitly stored anywhere. Only letters 3 and beyond are ever copied into a CVector. "abacus", for example, is stored in a CLexicon provided the 2nd CVector [storing words that begin with "ab"] contains the suffix "acus". "ab" plus "acus" makes "abacus". Woo.

Because the size of the elements stored in these CVectors must be fixed at creation time, you're going to devote exactly eight bytes to the storage of each suffix, even if the suffix is considerably longer than that. A large fraction of words in the lexicon would probably be less than 10 letters long, and since the first two characters can be inferred from the index of the CVector storing the suffix, the remaining characters can potentially reside directly in the CVector.
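
The bucket numbering described above amounts to reading a word's first two letters as a two-digit base-26 number. The sketch below illustrates that arithmetic in both directions. BucketIndex and BucketPrefix are hypothetical names (the handout supplies its own CLexiconHashToBucket later in this problem), indices are zero-based, and the code assumes lowercase letters that are contiguous in the character set, as they are in ASCII.

```c
#include <assert.h>

/* Hypothetical helper: maps the first two lowercase letters of a word to a
   bucket index in [0, 676), treating them as a two-digit base-26 number. */
int BucketIndex(const char *word)
{
    assert(word[0] >= 'a' && word[0] <= 'z');
    assert(word[1] >= 'a' && word[1] <= 'z');
    return (word[0] - 'a') * 26 + (word[1] - 'a');
}

/* Hypothetical inverse: recovers the two-letter prefix that a bucket index
   stands for, writing it as a null-terminated string into prefix[3]. */
void BucketPrefix(int index, char prefix[3])
{
    assert(index >= 0 && index < 676);
    prefix[0] = (char)('a' + index / 26);
    prefix[1] = (char)('a' + index % 26);
    prefix[2] = '\0';
}
```

With zero-based indices, "aa" maps to 0, "ab" to 1, "ba" to 26, and "zz" to 675, which lines up with the first, second, 27th, and last CVectors.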
Larger words can't be compactly stored this way. For larger words, the eight bytes need to be used differently. Here's the final heuristic:

o The first of the eight bytes tells us whether the remaining seven are enough to store the entire suffix. This first byte will store a 0 [equivalently, a '\0'] when the suffix of the word being stored is 7 characters or fewer. The suffix should itself be terminated with a '\0' if its length is less than 7, but suffixes of length 7 should not store the '\0', since there won't be any room for it.
o Should the first byte store anything nonzero, the remaining seven bytes are divided up. Bytes 2, 3, and 4 store the first three characters of the suffix, and bytes 5 through 8 store the address of a dynamically allocated character array large enough to store the rest of them.
o When analyzing the suffix, it's the implementation's responsibility to check this first byte to see if the suffix is fully stored in the remaining seven bytes, or if the suffix is broken up into two separate arrays.

Here are a few examples:

The word "abacus" would take up eight bytes in the 2nd [or the "ab"] CVector. The suffix would be stored as follows:

    0 'a' 'c' 'u' 's' '\0' (remaining two bytes unused)

The first byte stores a zero, because the "acus" suffix can be fully stored in the remaining seven bytes. The leading 0 informs the implementation that everything resides in the eight-byte chunk.

The word "polyphony" would take up eight bytes in the "po" CVector. The suffix would be stored as follows:

    0 'l' 'y' 'p' 'h' 'o' 'n' 'y'

Again, the "lyphony" suffix can be wedged into the seven-byte chunk. The only difference here is that the '\0' can't be stored. This shouldn't limit your implementation, as it should just realize that at most seven characters can be accommodated.

The word "onomatopoeia" is a mighty big one. The "on" CVector would contain eight bytes on behalf of "omatopoeia", but those eight bytes would look like this:

    1 'o' 'm' 'a' [4-byte address] -> 't' 'o' 'p' 'o' 'e' 'i' 'a' '\0'

Note the 1 in the very first byte. That's the signal that the last four bytes point to dynamically allocated space [allocated by the implementation, of course] to store the part of the suffix that just couldn't fit in the eight primary bytes.
The dynamically allocated portion is always null-terminated and is always exactly the size it needs to be to store the rest of the characters.

Some relevant ANSI C functions you'll want to make use of:

    int strcmp(const char *s1, const char *s2);
    int strncmp(const char *s1, const char *s2, int len);
    void strcpy(char *dst, const char *src);
    void strncpy(char *dst, const char *src, int len);

strcmp compares the two null-terminated C strings and returns a negative number if the first is lexicographically less than the second, a positive number if the first is lexicographically greater than the second, and 0 if they exactly match. strncmp does the same thing, except that it compares at most len characters. strcpy copies the string src to dst [including the terminating '\0' character]. The strncpy function copies no more than len characters from src into dst, appending '\0' characters if src is less than len characters long, and not terminating dst otherwise.

a) Implement the CLexiconCreate and CLexiconDispose functions. CLexiconCreate dynamically allocates a CLexiconImplementation struct, initializes all 676 CVector *s within to address properly constructed but otherwise empty CVectors, and then returns the address of the struct. CLexiconDispose brings down all of the resources contributing to the CLexicon being destroyed.

    CLexicon *CLexiconCreate()
    {

    void CLexiconDispose(CLexicon *cl)
    {

b) Implement the CLexiconContains function. You should leverage CVectorSearch to detect whether the word is present, but you shouldn't assume the suffixes are otherwise sorted in any way. Take the time to create helper functions if you foresee them being useful in the context of inserting and mapping over words. [You will need to write a comparison function and pass it to CVectorSearch. You should assume the key CVectorSearch is looking for is always passed as the first of the two parameters to the comparison function.]

    static int CLexiconHashToBucket(const char *str)
    {
        int msb = str[0] - 'a';    // more significant byte
        int lsb = str[1] - 'a';    // less significant byte
        return msb * 26 + lsb;     // base 26 number in range [0, 676)
    }

    bool CLexiconContains(CLexicon *cl, const char *str)
    {

c) Next, implement the CLexiconAdd function, which ensures that the specified word is present in the CLexicon. You should call CLexiconContains directly, and if the word is already present, you should just return without doing anything. If the word is missing, you should append the correct eight-byte figure to the end of the relevant CVector. You should not worry about sorting anything.

    void CLexiconAdd(CLexicon *cl, const char *word)
    {

d) Finally, implement the CLexiconMap function. You should assume for the sake of convenience that no word in the lexicon is ever longer than 64 characters. You will need to manually reconstruct all of the strings on the fly so that they can be passed to the mapping routine, but the reconstruction should not make use of any dynamically allocated memory.

    void CLexiconMap(CLexicon *cl, CLexiconMapFunction mapfn, void *auxData)
    {
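
The strncpy behavior described in this problem (pad with '\0' when the source is short, leave the destination unterminated when it is not) is what the short-form record relies on. The sketch below packs a suffix of at most seven characters into an eight-byte chunk and reads it back. PackShortSuffix, UnpackShortSuffix, and the two constants are hypothetical names, and the long form with its embedded pointer is deliberately left out, so this is an illustration of the layout and not a solution. Note that the standard library declares strncpy as returning char * and taking a size_t length; the handout's prototypes are simplified.

```c
#include <assert.h>
#include <string.h>

enum { kChunkSize = 8, kInlineCapacity = 7 };

/* Hypothetical helper: stores a suffix of at most 7 characters in the short
   form described above. Byte 0 is the flag (0 means "fully inline"); bytes
   1..7 hold the characters. strncpy pads with '\0' when the suffix is shorter
   than 7 and writes no terminator when it is exactly 7 long. */
void PackShortSuffix(char chunk[kChunkSize], const char *suffix)
{
    assert(strlen(suffix) <= kInlineCapacity);
    chunk[0] = 0;
    strncpy(chunk + 1, suffix, kInlineCapacity);
}

/* Hypothetical helper: copies the inline suffix out of a short-form chunk
   into out, which must have room for 8 bytes. The terminator is always
   written explicitly because a 7-character suffix has none in the chunk. */
void UnpackShortSuffix(const char chunk[kChunkSize], char out[kChunkSize])
{
    assert(chunk[0] == 0);
    memcpy(out, chunk + 1, kInlineCapacity);
    out[kInlineCapacity] = '\0';
}
```

Packing "acus" leaves zero bytes after the 's', while packing "lyphony" fills all seven character slots, so any code reading the chunk has to bound itself to seven characters instead of trusting a terminator.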
Problem 3: Bits, bytes, and numbers

a) Write a function that, given a float f, returns the float nearest to f, i.e., the immediate neighbor of f on the number line of representable floating point values. The function may return either the left or right neighbor. The function should work correctly for normalized or denormalized inputs; you may ignore exceptional inputs (e.g., infinity and NaN).

    float Neighbor(float f)
    {

b) The Mystery function takes an integer and returns a bool computed as below:

    bool Mystery(int n)
    {
        return (n & (n-1)) == 0;
    }

Characterize the result returned by Mystery for the inputs listed in the chart below.

    If n is...      Mystery(n) returns...
    positive
    negative
    zero
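
For part (a), it can help to look at a float's raw bit pattern before writing anything. The sketch below is not a solution; it only shows one well-defined way to move between a float and its 32-bit representation, assuming float is a 32-bit IEEE 754 single-precision value. All function names here are hypothetical. It uses memcpy instead of a pointer cast to stay clear of aliasing problems.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the raw 32-bit pattern of f. */
uint32_t FloatToBits(float f)
{
    uint32_t bits;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Hypothetical helper: reinterprets a 32-bit pattern as a float. */
float BitsToFloat(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Field extraction for IEEE 754 single precision:
   1 sign bit, 8 exponent bits, 23 fraction bits. */
uint32_t SignBit(float f)      { return FloatToBits(f) >> 31; }
uint32_t ExponentBits(float f) { return (FloatToBits(f) >> 23) & 0xFFu; }
uint32_t FractionBits(float f) { return FloatToBits(f) & 0x7FFFFFu; }
```

For instance, 1.0f has the bit pattern 0x3F800000: sign 0, exponent field 127, fraction 0.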
