An instance of a problem is a particular case of the problem, with specific values of the parameters of the problem. Each set of specific values of the parameters defines an instance.
Consider, for example, the problem of finding the average of two numbers, say a and b. An instance of this problem consists of finding the average of 123 and 9876. Another instance consists of find the average of 22222 and 3434.
An algorithm is correct if it fulfills its promises, that is, if it does what it promises to do. In other words, an algorithm is correct if it solves the problem that it is supposed to solve.
Elegant is more or less the same as simple, clean, beautiful, economical. An elegant program has no superfluous code and no unnecessary variables. An elegant program does not treat special instances of the problem in separate.
A very clever, intricate, convoluted. tangled, obscure, ugly, confused program is inelegant.
Experience shows that elegante programs are, in general, efficient, and efficient programs are, usually, elegant.
An algorithm is efficient if does not waste time. Roughly speaking, an algorithm is efficient if it is faster than others algorithms for the same problem.
Though this definition
is adequate for informal use,
it hides some difficulties:
faster than othersshould include not only the already known algorithms for the problem, but also the ones not yet discovered.
The following definitions refer to the time consumption
(i.e., the response time
)
of algorithms.
In all the definitions,
n is the size of an instance of the problem that the algorithm solves,
i.e.,
the parameter that measures the size of the input to the algorithm.
(If the input is a graph, for example,
then n is the
the number of vertices plus the number of edges of the graph.)
An algorithm is
These terms refer, usually, to the time consumption in the worst case.
A data type is a set of values equipped with a set of operations. Some operations transform values into other values; others associate numbers to values or sets of values. Here are some basic data types:
The operations on these values are the comparisons ( <, <=, ==, >, >= ) and the arithmetic operations ( +, -, *, / ). Each arithmetic operation receives two ints and returns an int. The definitions of these operations are a little different from the ones used in mathematics, since we can have overflow in some cases and truncation in others. For example, the value of 9 / 2 is 4 (not 4.5). Another example: if INT_MIN and INT_MAX are −32768 and +32767 then 32767 + 1 is −32768, not 32768.
The usual operations on these values are indicated by <, <=, ==, >, >=, + and -. In an ideal implementation of the data type, the value of the expression 127 + 2 would be −127 rather than 129. In C, however, the value of the expression x + y below is 129, since the expression is evaluated in int arithmetic.
char x, y, z; x = 127; y = 2; z = x + y;
But the value of z is −127.
real numbers) is more difficult to describe.
You can define your own data types using typedef. The following data type, for example, is very useful to represent boolean values:
typedef enum {FALSE, TRUE} boolean;
(But this data type is already defined in the stdbool.h interface.) Another useful data type is string (a sequence of bytes):
typedef char *string;
Here is a more specialized example: the data type point represents the coordinates of a point on the plane:
typedef struct { double x; double y; } point;
The operations on values of this type are derived from the operations on doubles. You can also add your own operations, like the operation that produces the Euclidean distance between two points, for example.
realnumbers
Computers are not aware of real numbers in the mathematical sense. Computers only know a small set of rational numbers.
The world of computing
loosely calls reals
the numbers of type
float and
double
(such as 0.31415926 × 101,
for example),
and all these numbers are rational.
These numbers are represented in
floating point notation.
A priority queue is any data type equipped with two operations:
It is easy to implement a priority queue in such a way than one of the operations is fast. It is more challenging to come up with an implementation in which both operations are fast.
The definition just given is that of a max
priority queue.
It is not difficult to adapt the definition
to that of a min
priority queue.
Do not mistake the heap data structure with the area of the computer memory used for dynamic memory allocation. The two concepts are independent.
Natural numbers (zero, one, two, three etc.) can be written in binary (base 2) notation, octal (base 8) notation, decimal (base 10) notation, and hexadecimal (base 16) notation.
In binary notation, the digits are 0 and 1. In octal notation, the digits are 0 1 .. 7. In decimal, the digits are 0 1 .. 9. And in hexadecimal, the digits are 0 1 .. 9 A B .. F.
For example, the number seventy-nine
is represented by
79
in decimal notation
since seventy-nine is
7×10 + 9.
The same number is represented by
1001111
in binary notation,
since seventy-nine is equal to 1×26 + 1×23 + 1×22 + 1×21 + 1×20.
The same number is represented by
117
in octal notation
(since seventy-nine is equal to 1×8² + 1×8¹ + 7),
and by 4F
in hexadecimal notation
(since seventy-nine is equal to 4×16¹ + F).
The C language uses the following convention:
For example, 0117 is the octal representation and 0x4F is the hexadecimal representation of seventy-nine.
Diacritics are the marks over letters (like á, ã, é, etc.) and the cedilla under a c to indicate accents.
The C99 standard of the C language allows letters with diacritics in names of variables and functions. But the letters with diacritics must be represented (in the source file of your program) by means of its Unicode numbers. For example, if you wish the name of a variable to be piñacolada, you must type pi\u00F1acolada. This complication makes the resource somewhat useless in practice…
In the case of character constants, however, the things are more friendly: letters with diacritics can be typed in directly (provided your languages environment is appropriate). For example,
int pinacolada = 0; // piñacolada char message[14] = "values in €"; printf( "%s\n", message);
produces
values in €
(Note that the string "values in €" occupies 14 bytes since € uses 3 bytes in UTF-8 encoding.)
Until about 2004, the bytes 10000000 to 11111111 were used to represent letters with diacritics and some others characters. This correspondence is defined by the ISO-LATIN-1 table, also known as ISO-8859-1 table.
With the popularization of the UTF-8 code, the ISO-LATIN-1 table was deprecated. Hence, the bytes whose first bit is 1 (unsigned chars 128 to 255) no longer represent characters.
The UTF-8 code represents each character by a sequence of 1 to 4 bytes. The coding scheme was carefully designed to allow for the unambiguous decoding of a sequence of bytes and to be compatible with the ASCII code.
The first 128 characters on the Unicode list (numbers U+0000 to U+007F) are represented by 1 byte each. The 1920 next characters (numbers U+0080 to U+07FF) are encoded in 2 bytes. And so on.
The table below shows the UTF-8 code structure. In the left column, we have the intervals of Unicode numbers, in hexadecimal notation. In the right column, in binary notation, the intervals of the corresponding code bytes:
Unicode numbers byte 1 byte 2 byte 3 byte 4 00000000 .. 0000007F 0xxxxxxx 00000080 .. 000007FF 110xxxxx 10xxxxxx 00000800 .. 0000FFFF 1110xxxx 10xxxxxx 10xxxxxx 00010000 .. 0010FFFF 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
Now, the same table written in decimal notation:
0 .. 127 000..127 128 .. 2047 192..223 128..191 2048 .. 65535 224..239 128..191 128..191 65536 .. 1114111 240..247 128..191 128..191 128..191
Finally, the same table written in hexadecimal notation:
0 .. 7F 00..7F 80 .. 7FF C0..DF 80..BF 800 .. FFFF E0..EF 80..BF 80..BF 10000 .. 10FFFF F0..F7 80..BF 80..BF 80..BF
First, a very rough draft of the algorithm:
// This function decides whether a string s contains a // well-formed sequence of parentheses and brackets. wellformed (string s) { for each element c of s do { if c ≡ ')' then if top of stack has '(' then pop stack else if c ≡ ']' then if top of stack has '[' then pop stack else push c on stack } if stack empty then returnyeselse returnno}
Now, a draft that is closer to actual code:
wellformed (string s) { allocate space for a stack of bytes for each byte c in s do { if c ≡ ')' then if stack nonempty and top of stack is '(' then pop else returnnoelse if c ≡ ']' then if stack nonempty and top of stack is '[' then pop else returnnoelse push c } if stack empty then returnyeselse returnno}
Here is a rough pseudocode draft in of the infixToPostfix function:
// This function receives an infix expression inf // and returns the corresponding postfix expression. infixToPostfix (string inf) { for each character c of inf do { depending on the value of c push c onto stack or add c to the postfix expression or transfer characters from stack to postfix } add '\0' to the postfix expression return postfix }
Now, a more precise and complete draft:
infixToPostfix (string inf) { allocate space for postfix string post allocate space for stack of bytes // first byte of inf is a '(' push first byte of inf onto stack for each byte c of inf do { if c ≡ '(' then push c else if c ≡ ')' then while top of stack is different from '(' pop x put x in post pop x else if c is '+' or '-' then while top of stack is different from '(' pop x put x in post push c else if c is '*' or '/' then while top of stack is neither '(' nor '+' nor '-' pop x put x in post push c else put c in post } put '\0' in post return post }
Here is um draft of the distances algorithm:
// Receives matrix A of interconnections // between 6 cities. Returns array d such that // d[i] is the distance from city c to city i. distances (matrix A[0..5][0..5], int c) { d[c] = 0 put c into the queue while queue is not empty { let i be the first city in the queue for each neighbor j of i do if d[j] not yet defined then d[j] = d[i] + 1 put j into queue remove i from queue } return d }
Here is a draft, in pseudocode, of the sieve function:
sieve (array v[1..m]) {
p = 1
while 2p ≤ m do
let c be the most valuable child of p
if v[p] ≥ v[c] then stop
else interchange v[p] and v[c]
else p = c // p moves down the heap
}
Here is a draft, in pseudocode, of the heapsort function:
// Rearranges v[1..n] in increasing order. heapsort (v[1..n]) { transform v[1..n] into a heap for m = n, n−1, ... , 2 do interchange v[1] and v[m] sieve (v[1..m-1]) }
Problem-solving skills are almost unanimously
the most important qualification
that employers look for […]
more than programming languages proficiency,
debugging, and system design.
What should I do when faced with a new computational problem?
To properly understand a problem, write its statement on a piece of papel, draw a diagram, explain the problem to a skeptical colleague. Try to solve some simple instances of the problem. Try to solve extreme instances, like the very small and the ones that have no solution.
(This note was inspired in the article How to think like a programmer — lessons in problem solving, by Richard Reis.)
Informatica gaat net zo min over computers
als astronomie over telescopen.
Computer science is no more about computers
than astronomy is about telescopes.
— Edsger W. Dijkstra
Edsger W. Dijkstra had strong and original opinions about the teaching of programming and computing. Here are some excerpts from his manuscript EWD1036 (On the cruelty of really teaching computing science):
So, if I look into my foggy crystal ball at the future of computing science education, I overwhelmingly see the depressing picture of 'Business as usual'. The universities will continue to lack the courage to teach hard science, they will continue to misguide the students, and each next stage of infantilization of the curriculum will be hailed as educational progress.
We could, for instance, begin with cleaning up our language by no longer calling a bug a bug but by calling it an error. It is much more honest because it squarely puts the blame where it belongs, viz. with the programmer who made the error. The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation. The nice thing of this simple change of vocabulary is that it has such a profound effect: while, before, a program with only one bug used to be 'almost correct', afterwards a program with an error is just 'wrong'.
The statement that a given program meets a certain specification amounts to a statement about all computations that could take place under control of that given program. And since this set of computations is defined by the given program, [we must] deal with all computations possible under control of a given program by ignoring them and working with the program. We must learn to work with program texts while (timerarily) ignoring that they admit the interpretation of executable code.
Needless to say that, all by itself, a program is no more than half a conjecture. The other half of the conjecture is the functional specification the program is supposed to satisfy. The programmer's task is to present such complete conjectures as proven theorems.
Right from the beginning, and all through the [introductory programming] course, we [should] stress that the programmer's task is not just to write down a program, but that his main task is to give a formal proof that the program he proposes meets the equally formal functional specification. [...] Finally, in order to drive home the message that this introductory programming course is primarily a course in formal mathematics, we see to it that the programming language in question has not been implemented on campus so that students are protected from the temptation to test their programs. And this concludes the sketch of my proposal for an introductory programming course for freshmen.
E. W. Dijkstra Archive
collection of manuscripts
of Edsger W. Dijkstra
See also Dijkstra's interview Discipline in Thought in 2001.