Programming Languages
What is C/C++?
C is a high-level programming language developed by Dennis Ritchie at Bell Labs in the early 1970s. Although originally designed as a systems programming language, C has proved to be a powerful and flexible language that can be used for a variety of applications, from business programs to engineering. C is a particularly popular language for personal computers because it is relatively small and requires less memory than other languages.
The first major program written in C was the UNIX operating system, and for many years C was considered to be inextricably linked with UNIX.
Although it is a high-level language, C is much closer to assembly language than are most other high-level languages. This closeness to the underlying machine language allows C programmers to write very efficient code.
C++ is an object-oriented programming (OOP) language derived from C.
One of the principal advantages of object-oriented programming techniques over procedural programming techniques is that they enable programmers to create modules that do not need to be changed when a new type of object is added. A programmer can simply create a new object that inherits many of its features from existing objects. This makes object-oriented programs easier to modify.
Tips
IO of binary files
To make sure that there is no CR/LF translation on non-Unix computers, you have to use the following lines to open streams to files with binary data.
ofstream os("output.flt", ios::out | ios::binary);
For Visual C++, when using fstream.h, also use the flag ios::nocreate. Otherwise, opening a non-existing file for reading succeeds without any complaint. (This is not necessary when using fstream.)
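As a rough sketch of the tip above, the following writes a few floats to a binary file and reads them back. The helper names write_floats and read_floats are made up for this illustration, and the sketch assumes the standard <fstream> header rather than the older fstream.h.

```cpp
#include <cassert>
#include <fstream>
#include <vector>

// Hypothetical helper: writes raw floats to 'path'.
// ios::binary switches off CR/LF translation on platforms that perform it.
bool write_floats(const char* path, const std::vector<float>& data) {
    assert(path != nullptr);
    std::ofstream os(path, std::ios::out | std::ios::binary);
    if (!os) return false;
    os.write(reinterpret_cast<const char*>(data.data()),
             static_cast<std::streamsize>(data.size() * sizeof(float)));
    return os.good();
}

// Hypothetical helper: reads floats back until the stream runs out.
std::vector<float> read_floats(const char* path) {
    assert(path != nullptr);
    std::ifstream is(path, std::ios::in | std::ios::binary);
    std::vector<float> data;
    float value;
    while (is.read(reinterpret_cast<char*>(&value), sizeof value)) {
        data.push_back(value);
    }
    return data;
}
```

The file written this way is only portable between machines that share the same float representation and byte order.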
When are destructors called for local variables
Non-static (or 'automatic') variables are 'destructed' automatically when they go out of scope. Scope is a fairly complicated thing, and I'm not going to repeat the definition here. Roughly speaking, the scope ends when you encounter the } that closes the group containing the declaration of the variable. See also the use of {} and how scope is defined in the for() statement.
Variables are destructed (by the compiler) by calling the appropriate destructor of their class. If the objects allocate memory (and hence the destructor should free that memory), this means that you recover the memory allocated.
class array
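A minimal sketch of this behaviour, using a hypothetical Buffer class as a stand-in for the array class (the counter 'live' exists only to make the destructor call observable):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class that owns heap memory and counts live instances.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new double[n]) { ++live; }
    ~Buffer() { delete[] data_; --live; }   // frees the memory again
    Buffer(const Buffer&) = delete;         // copying disabled to keep the sketch short
    Buffer& operator=(const Buffer&) = delete;
    static int live;                        // number of Buffer objects currently alive
private:
    double* data_;
};
int Buffer::live = 0;

// Stores the live count seen inside the group in 'inside' and
// returns the live count seen after the group has been left.
int live_after_block(int& inside) {
    {
        Buffer b(1000);
        inside = Buffer::live;   // b exists here
    }                            // b's destructor runs at this brace
    return Buffer::live;
}
```

Calling live_after_block should report one live object inside the group and none after it, without any explicit delete in the calling code.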
Use {} to keep things local
Use of the grouping construct {} enables you to declare variables local to that group. When leaving the group, all local variables are destructed. This has the advantage that the reader of the code knows (s)he shouldn't worry about these variables to understand the rest of the code.
In a way this can be understood as if every use of {} is like a function call (with local variables declared in the function). Of course, you don't have the overhead of stack manipulations and jumps involved in a proper function call.
// recommended usage
This tip is just an extension of the 'avoid global variables' credo.
As always, this can be abused, as in the following piece of code, where the outer variable 'a' is hidden by a local 'a', resulting in not very readable code.
// not very readable code
Scope of variables declared in for()
The new ANSI C++ standard specifies that variables declared as in for(int i=1; ...) have a scope local to the for statement. Unfortunately, older compilers (for instance Visual C++ 5.0) use the older concept that the scope is the enclosing group. Below I list 2 possible problems, and their recommended solutions:
you want to use the variable after the for() statement
you have to declare the variable outside of the for() statement.
int i;
you want to have multiple for() loops with the same variables.
Put the for statement in its own group. You could also declare the variable outside of the 'for', but it makes it slightly trickier for an optimizing compiler (and a human) to know what you intend.
{
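Both solutions can be sketched as follows. The function names are invented for the example; the code compiles the same way under the old and the new scoping rule.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Problem 1: the index is needed after the loop, so it is declared before the for().
std::size_t index_of_first_negative(const std::vector<int>& v) {
    std::size_t i;
    for (i = 0; i < v.size(); ++i) {
        if (v[i] < 0) break;
    }
    return i;   // equals v.size() when no element is negative
}

// Problem 2: two loops reuse the name i, so each for() sits in its own group.
int sum_and_max(const std::vector<int>& v, int& max_out) {
    int sum = 0;
    {
        for (std::size_t i = 0; i < v.size(); ++i) sum += v[i];
    }
    {
        for (std::size_t i = 0; i < v.size(); ++i) {
            if (i == 0 || v[i] > max_out) max_out = v[i];
        }
    }
    return sum;
}
```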
Binding a Reference to an Rvalue
Rvalues and lvalues are a fundamental concept of C++ programming. In essence, an rvalue is an expression that cannot appear on the left-hand side of an assignment expression. By contrast, an lvalue refers to an object (in its wider sense), or a chunk of memory, to which you can write a value. References can be bound to both rvalues and lvalues. However, due to the language's restrictions regarding rvalues, you have to be aware of the restrictions on binding references to rvalues, too. Binding a reference to an rvalue is allowed as long as the reference is bound to a const type. The rationale behind this rule is straightforward: you can't change an rvalue, and only a reference to const ensures that the program doesn't modify an rvalue through its reference. In the following example, the function f() takes a reference to const int:
void f(const int & i);
The program passes the rvalue 2 as an argument to f(). At runtime, C++ creates a temporary object of type int with the value 2 and binds it to the reference i. The temporary and its reference exist from the moment f() is invoked until it returns; they are destroyed immediately afterwards. Note that had we declared the reference i without the const qualifier, the function f() could have modified its argument, thereby causing undefined behavior. For this reason, you may bind an rvalue only to a reference to const.
struct A{};
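Here is a small sketch of the rule. The functions twice, make_name and name_length are hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Takes a reference to const int, so an rvalue such as 2 is an acceptable argument.
int twice(const int& i) { return 2 * i; }

// Returns a temporary std::string (an rvalue at the call site).
std::string make_name() { return std::string("temp"); }

std::size_t name_length() {
    // Binding the temporary to a reference to const keeps it alive
    // for as long as 'ref' is in scope.
    const std::string& ref = make_name();
    return ref.size();
}

// int& bad = 2;   // would not compile: a non-const reference cannot bind to an rvalue
```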
Comma-Separated Expressions
Comma-separated expressions were inherited from C. It's likely that you use such expressions in for- and while-loops rather often. Yet, the language rules in this regard are far from being intuitive. First, let's see what a comma separated expression is.
An expression may consist of one or more sub-expressions separated by commas. For example:
if(++x, --y, cin.good()) /*three expressions*/
The if condition contains three expressions separated by commas. C++ ensures that each of the expressions is evaluated and its side effects take place. However, the value of an entire comma-separated expression is only the result of the rightmost expression. Therefore, the if condition above evaluates as true only if cin.good() returns true. Here's another example of a comma expression:
int j=10;
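The evaluation rule can be checked with a short sketch. Both function names are made up for this example.

```cpp
#include <cassert>

// All three sub-expressions are evaluated left to right,
// but only the rightmost one supplies the value of the whole expression.
int rightmost(int& x, int& y) {
    return (++x, --y, x * 10);
}

// Typical loop usage: two variables are updated in one for() statement.
int steps_until_meet(int lo, int hi) {
    int steps = 0;
    for (; lo < hi; ++lo, --hi) ++steps;
    return steps;
}
```

Note that the commas separating function arguments or declarators are not comma operators; only a comma inside an expression behaves this way.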
Calling a Function Before Program's Startup
Certain applications need to invoke startup functions that run before the main program starts. For example, polling, billing, and logger functions must be invoked before the actual program begins. The easiest way to achieve this is by calling these functions from a constructor of a global object. Because global objects are conceptually constructed before the program's outset, these functions will run before main() starts. For example:
class Logger
The global object log is constructed before main() starts. During its construction, log invokes the function activate_log(). Thus, when main() starts, it can read data from the log file.
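The idea can be sketched as shown below. To keep it self-contained, a hypothetical in-memory list of lines (log_lines) replaces the log file, and the global object is called app_logger here because the name log can clash with the math function of the same name on some compilers.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical in-memory log standing in for a log file.
std::vector<std::string>& log_lines() {
    static std::vector<std::string> lines;   // constructed on first use
    return lines;
}

void activate_log() {
    log_lines().push_back("log activated");
}

class Logger {
public:
    Logger() { activate_log(); }   // the startup work happens in the constructor
};

Logger app_logger;   // global object: its constructor runs before main() starts
```

The order in which global objects of different source files are constructed is not specified, so a startup function called this way should not rely on globals defined in other files.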
Hiding the Cumbersome Syntax of Pointers to Functions
void (*p[10]) (void (*)());
p is an "array of 10 pointers to a function returning void and taking a pointer to another function that returns void and takes no arguments." The cumbersome syntax is nearly indecipherable, isn't it? You can simplify this declaration considerably by using typedefs. First, declare a typedef for "pointer to a function returning void and taking no arguments" as follows:
typedef void (*pfv)();
Next, declare another typedef for "pointer to a function returning void and taking a pfv":
typedef void (*pf_taking_pfv) (pfv);
Now declaring an array of 10 such pointers is a breeze:
pf_taking_pfv p[10]; /*equivalent to
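To see the typedefs at work, here is a small sketch that fills two slots of such an array and calls through them. The functions bump, call_once, call_twice and run_table are invented for the illustration.

```cpp
#include <cassert>

typedef void (*pfv)();                 // pointer to a function returning void, no arguments
typedef void (*pf_taking_pfv)(pfv);    // pointer to a function returning void, taking a pfv

int g_calls = 0;
void bump() { ++g_calls; }

void call_once(pfv f) { f(); }
void call_twice(pfv f) { f(); f(); }

// Calls every non-null entry of the table and returns how often bump() ran.
int run_table() {
    pf_taking_pfv p[10] = {};          // all ten entries start out null
    p[0] = call_once;
    p[1] = call_twice;
    g_calls = 0;
    for (int i = 0; i < 10; ++i) {
        if (p[i]) p[i](bump);
    }
    return g_calls;
}
```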
All About Pointers to Members
A class can have two general categories of members: function members and data members. Likewise, there are two categories of pointers to members: pointers to member functions and pointers to data members. The latter are less common because in general, classes do not have public data members. However, when using legacy C code that contains structs or classes that happen to have public data members, pointers to data members are useful.
Pointers to members are one of the most intricate syntactic constructs in C++, and yet, they are a very powerful feature too. They enable you to invoke a member function of an object without having to know the name of that function. This is very handy when implementing callbacks. Similarly, you can use a pointer to data member to examine and alter the value of a data member without knowing its name.
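A minimal sketch of both kinds of pointers to members follows. The struct Point and the helpers apply and set_field are hypothetical and serve only to show the syntax.

```cpp
#include <cassert>

struct Point {
    int x;
    int y;
    int sum() const { return x + y; }
    int product() const { return x * y; }
};

typedef int (Point::*PointFn)() const;   // pointer to a const member function of Point
typedef int Point::*PointField;          // pointer to an int data member of Point

// Invokes whichever member function 'fn' designates, without naming it here.
int apply(const Point& p, PointFn fn) {
    return (p.*fn)();
}

// Alters whichever data member 'field' designates.
void set_field(Point& p, PointField field, int value) {
    p.*field = value;
}
```

With a pointer to an object instead of a reference, the ->* operator takes the place of .* in these expressions.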
Avoiding Memory Fragmentation
Often, applications that are free from memory leaks but frequently allocate and deallocate dynamic memory show gradual performance degradation if they are kept running for long periods. Finally, they crash. Why is this? Recurrent allocation and deallocation of dynamic memory causes heap fragmentation, especially if the application allocates small memory chunks. A fragmented heap might have many free blocks, but these blocks are small and non-contiguous. To demonstrate this, look at the following scheme that represents the system's heap. Zeros indicate free memory blocks and ones indicate memory blocks that are in use:
100101010000101010110
The above heap is highly fragmented. Allocating a memory block that contains five units (i.e., five contiguous zeros) will fail, although the system has 12 free units in total. This is because the free memory isn't contiguous. On the other hand, the following heap has less free memory but it's not fragmented:
1111111111000000
What can you do to avoid heap fragmentation? First, use dynamic memory as little as possible. In most cases, you can use static or automatic storage or use STL containers. Secondly, try to allocate and de-allocate large chunks rather than small ones. For example, instead of allocating a single object, allocate an array of objects at once. As a last resort, use a custom memory pool.
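As an illustration of the last resort, here is a very small fixed-size pool. It is only a sketch under simplifying assumptions: the class name FixedPool and its interface are invented, the pool holds objects of one type, it is not thread-safe, and objects still alive when the pool goes away are not destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical pool: one contiguous block with room for N objects of type T.
template <typename T, std::size_t N>
class FixedPool {
public:
    FixedPool() : free_count_(N) {
        for (std::size_t i = 0; i < N; ++i) free_slots_[i] = N - 1 - i;
    }
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // Constructs a copy of 'value' in a free slot; returns nullptr when the pool is full.
    T* create(const T& value) {
        if (free_count_ == 0) return nullptr;
        std::size_t slot = free_slots_[--free_count_];
        return new (storage_ + slot * sizeof(T)) T(value);
    }

    // Destroys an object obtained from create() and marks its slot as free again.
    void destroy(T* p) {
        assert(p != nullptr);
        p->~T();
        std::size_t offset =
            static_cast<std::size_t>(reinterpret_cast<unsigned char*>(p) - storage_);
        free_slots_[free_count_++] = offset / sizeof(T);
    }

    std::size_t available() const { return free_count_; }

private:
    alignas(T) unsigned char storage_[N * sizeof(T)];   // raw memory for N objects
    std::size_t free_slots_[N];                         // stack of free slot indices
    std::size_t free_count_;
};
```

Because every slot has the same size and lives in one block, releasing and reusing slots cannot fragment the general heap.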
Optimizing Class Member Alignment
The size of a class can be changed simply by playing with the order of its members' declaration:
struct A
On my machine, sizeof (A) equals 12. This result might seem surprising because the total size of A's members is only 6 bytes: 1+4+1 bytes. Where did the remaining 6 bytes come from? The compiler inserted 3 padding bytes after each bool member to make it align on a four-byte boundary. You can reduce A's size by reorganizing its data members as follows:
struct B
This time, the compiler inserted only 2 padding bytes after the member c. Because b occupies four bytes, it naturally aligns on a word boundary without necessitating additional padding bytes.
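The effect is easy to measure on your own machine. The two structs below are hypothetical and follow the member orders described in the text; the exact sizes depend on the compiler and platform (12 and 8 bytes are typical where int is four bytes).

```cpp
#include <cassert>
#include <cstddef>

// The int sits between the two bools, so padding is needed after each bool.
struct Scattered {
    bool a;
    int  b;
    bool c;
};

// The two bools are adjacent, so they can share one padded word.
struct Grouped {
    bool a;
    bool c;
    int  b;
};

// Number of bytes saved per object by the reordering on this platform.
std::size_t bytes_saved() {
    return sizeof(Scattered) - sizeof(Grouped);
}
```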
Eliminating Temporary Objects
C++ creates temporary objects "behind your back" in several contexts. The overhead of a temporary can be significant because both its constructor and destructor are invoked. You can prevent the creation of a temporary object in most cases, though. In the following example, a temporary is created:
Complex x, y, z;
The expression y+z; results in a temporary object of type Complex that stores the result of the addition. The temporary is then assigned to x and destroyed subsequently. The generation of the temporary object can be avoided in two ways:
Complex y,z;
In the example above, the result of adding y and z is constructed directly into the object x, thereby eliminating the intermediary temporary. Alternatively, you can use += instead of + to get the same effect:
/* instead of x = y+z; */
Although the += version is less elegant, it costs only two member function calls: assignment operator and operator +=. In contrast, the use of + results in three member function calls: a constructor call for the temporary, a copy constructor call for x, and a destructor call for the temporary.
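To make the difference visible, the sketch below uses a hypothetical Complex class that counts its constructor calls. How many objects operator+ creates depends on the compiler's optimizations, so the sketch only relies on "at least one" for + and "none" for +=.

```cpp
#include <cassert>

// Hypothetical Complex class instrumented with a constructor counter.
struct Complex {
    double re, im;
    static int constructed;   // counts every constructor call
    Complex(double r = 0, double i = 0) : re(r), im(i) { ++constructed; }
    Complex(const Complex& o) : re(o.re), im(o.im) { ++constructed; }
    Complex& operator=(const Complex& o) { re = o.re; im = o.im; return *this; }
    Complex& operator+=(const Complex& o) { re += o.re; im += o.im; return *this; }
};
int Complex::constructed = 0;

Complex operator+(const Complex& a, const Complex& b) {
    Complex result(a);   // every call creates at least this one object
    result += b;
    return result;
}

// Number of objects constructed while executing x = y + z.
int objects_created_by_plus(Complex& x, const Complex& y, const Complex& z) {
    int before = Complex::constructed;
    x = y + z;
    return Complex::constructed - before;
}

// Number of objects constructed while executing x = y; x += z.
int objects_created_by_plus_assign(Complex& x, const Complex& y, const Complex& z) {
    int before = Complex::constructed;
    x = y;
    x += z;
    return Complex::constructed - before;
}
```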
Storing Dynamically Allocated Objects in STL Containers
Suppose you need to store objects of different types in the same container. Usually, you do this by storing pointers to dynamically allocated objects. However, instead of using named pointers, insert the elements to the container as follows:
class Base {};
This way you ensure that the stored objects can only be accessed through their container. Remember to delete the allocated objects as follows:
delete v[0];
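A fuller sketch of the same pattern is given below. The classes and the helpers fill and delete_all are hypothetical; the counter 'alive' is there only to show that every object is released, and Base gets a virtual destructor so that deleting a Derived through a Base pointer is safe.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Base {
    static int alive;              // number of objects currently allocated
    Base() { ++alive; }
    virtual ~Base() { --alive; }   // virtual: objects are deleted through Base*
    virtual int id() const { return 0; }
};
int Base::alive = 0;

struct Derived : Base {
    int id() const override { return 1; }
};

// The objects are created directly in the push_back call, so no named pointer exists.
void fill(std::vector<Base*>& v) {
    v.push_back(new Base);
    v.push_back(new Derived);
}

// Deletes every stored object and empties the container.
void delete_all(std::vector<Base*>& v) {
    for (std::size_t i = 0; i < v.size(); ++i) delete v[i];
    v.clear();
}
```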
Arguments Into Functions
There are three techniques used to pass variables into a C or C++ function:
• Pass by value (C/C++)
• Pass a pointer by value (C/C++)
• Pass by reference (C++ only)
Pass by value
The function receives a copy of the variable. This local copy exists only within the function. Any changes to the variable made in the function are not passed back to the calling routine. The advantages of passing by value are simplicity and the guarantee that the variable in the calling routine will be unchanged after return from the function. There are two main disadvantages. First, it is inefficient to make a copy of a variable, particularly if it is large, such as a big structure or class object. Second, since the variable in the calling routine will not be modified even if that's what is desired, the only way to pass information back to the calling routine is via the return value of the function. Only one value may be passed back this way.
Example:
void IncreaseMe(int theInt);
What is printed? Answer: i is 5.
Since i was passed by value, the local copy in the function gets incremented. The copy in main remained unmodified and still will be five.
Passing a pointer by value
A pointer to the variable is passed to the function. The pointer can then be manipulated to change the value of the variable in the calling routine. It is interesting to note that the pointer itself is passed by value. The function cannot change the pointer itself since it gets a local copy of the pointer. However, the function can change the contents of memory, the variable, to which the pointer refers. The advantages of passing by pointer are that any changes to variables will be passed back to the calling routine and that multiple variables can be changed.
Example:
void IncreaseMe2(int *theInt);
What is printed? Answer: i is 6.
Since a pointer to i was passed, the function could reach the variable in the calling routine. By properly dereferencing the pointer, i was incremented to 6. Note that while i was changed, the pointer to i, pt, was not.
Passing by reference (C++ only)
C++ provides this third way to pass variables to a function. A reference in C++ is an alias to a variable. Any changes made to the reference will also be made to the original variable. When variables are passed into a function by reference, the modifications made by the function will be seen in the calling routine. References in C++ allow passing by reference like pointers do, but without the complicated notation. Since no local copies of the variables are made in the function, this technique is efficient. Additionally, passing multiple references into the function can modify multiple variables.
Example:
void IncreaseMe(int &theInt);
What is printed? Answer: i is 6.
Since i was passed by reference, the variable in the calling routine is modified. Also, note that the main program is identical to the main program in the passing by value example. The compiler does the hard work. The programmer does not need to be concerned with the complicated notation and the explicit referencing and dereferencing seen when using pointers.
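The three techniques can be put side by side in one short sketch. The function names below are made up for this comparison and differ from the IncreaseMe examples above.

```cpp
#include <cassert>

// Pass by value: only the local copy is incremented.
void increase_by_value(int n) { ++n; }

// Pass a pointer by value: the pointed-to variable is incremented.
void increase_by_pointer(int* n) {
    assert(n != nullptr);
    ++*n;
}

// Pass by reference (C++ only): the caller's variable is incremented.
void increase_by_reference(int& n) { ++n; }
```

Starting with i equal to 5, calling increase_by_value(i) leaves it at 5, increase_by_pointer(&i) then makes it 6, and increase_by_reference(i) makes it 7.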
The Use of Braces in C/C++
C and C++ allow two styles of single-statement if statements, one with braces and one without.
if (myPocketChange == 1000) {
Which should you use? Generally, although it involves a little more typing, always use the form with braces. Why? With nested if statements or misaligned indentation, very subtle, hard-to-find bugs can result. Here's how:
myEggs = 10;
Here's the output:
Prepare the Omelet!!
Surprised? Before you go on an egg hunt to find the missing two eggs realize that the compiler doesn't care about indentation. Only the first print statement is conditional. The second print statement is not part of the "if" statement. With proper indentation, the same code is:
myEggs = 10;
Show compile duration
Simply add '/Y3' to the command line of VC (in the shortcut). You'll now get reports on how long a compile took in your Build window.
Teach VC to intelligently expand classes/structs in debugger
Isn't it neat how VC's debugger knows how to intelligently expand CStrings, CPoints, POINTS, and a lot of other stuff? Well, you can teach it to handle your own structs and classes. Just edit autoexp.dat in sharedide/bin (in the directory where Visual Studio is installed.) The format of that file is fairly complicated, so I suggest just copying the examples already in the file.
Add user defined keywords for syntax highlighting
Why can you set a color for user defined keywords in Tools > Options > Format? Where do you set the keywords? Easy, just create usertype.dat in the sharedide/bin directory that Visual Studio is installed in. Put your keywords in that file, one per line.
How to use .cc file extensions for C++
Make the following modifications to the registry:
HKEY_CURRENT_USER\Software\Microsoft\DevStudio\6.0\Text Editor\Tabs/Language Settings\C/C++
FileExtensions=cpp;cxx;c;h;hxx;hpp;inl;tlh;tli;rc;rc2;cc;cp
HKEY_USERS\S-1-5-21-1219703950-274334628-1532313055-1335\Software\Microsoft\DevStudio\6.0\Build System\Components\Platforms\Win32 (x86)\Tools\32-bit C/C++ Compiler for 80x86
Input_Spec=*.c;*.cpp;*.cxx;*.cc;*.cp
HKEY_USERS\S-1-5-21-1219703950-274334628-1532313055-1335\Software\Microsoft\DevStudio\6.0\Build System\Components\Tools\
Input_Spec=*.c;*.cpp;*.cxx;*.cc;*.cp
Add the flag "/Tp" to the compiler settings for the project.
Removing the "docking" capability from the menus
In Tools > Options..., in the Workspace tab, turn on "Use screen reader compatible menus". Your menus will lose the gripper (the double line on the left edge indicating dockability), and will stay nailed down like they should. Unfortunately this also removes the icons from the menus.
Hard code a debugger breakpoint
If you need to insert a hard breakpoint in your code (perhaps because you need to attach to a process), simply add the following line to your code.
__asm int 3;
Tracking GDI resource leaks
Plenty of tools exist to help track down memory leaks. You've got the debug heap, Rational Purify for Windows, HeapAgent, and other tools. But there aren't any good tools to help track GDI resource leaks. A resource leak can crash the system under Windows 95 or Windows 98, and can ruin performance on any Windows operating system.
Memory Values
Check this page for information on "Funny" Memory Values. In particular:
If you're using the debug heap, memory is initialized and cleared with special values. Typically MFC automatically adds something like the following to your .cpp files to enable it:
#ifdef _DEBUG
You can find information on using the debug heap here. Microsoft defines some of the magic values here. While using the debug heap, you'll see the values:
Value       Usage
0xCDCDCDCD  Allocated in heap, but not initialized
0xDDDDDDDD  Released heap memory.
0xFDFDFDFD  "NoMansLand" fences automatically placed at boundary of heap memory. Should never be overwritten. If you do overwrite one, you're probably walking off the end of an array.
0xCCCCCCCC  Allocated on stack, but not initialized
Display GetLastError's value and message
You can display the value GetLastError() will return by putting "@err" in your watch window. You can see the error message associated with that value by putting "@err,hr" in your watch window. If you've placed an HRESULT in a variable, adding ",hr" to the variable name in the watch window will display the associated text.
Display pointer as an array
If you expand a pointer and you only get a single item, just add ",n" to the entry in the watch window where n is the number of elements to expand. For example, if you have a foo * pFoo pointing to an array of ten elements, put pFoo,10 in your watch window to see all of the elements. This can be useful to view parts of a large array. If pFoo points to an array of 5,000 elements, you might use (pFoo + 2000),10 to see elements 2000 through 2009.
Debug checked casts
If you want maximum safety, you should always use dynamic_cast. However, if you feel you must optimize away those costs, use this version of checked_cast. It will ASSERT on a bad cast in Debug builds, but not do the slightly more expensive dynamic_cast in Release builds.
// checked_cast - Uses fast static_cast in Release build,
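Since the listing above is cut off, here is one possible way to write such a cast. It is a sketch and not the original code: it uses the standard assert macro (switched off when NDEBUG is defined) instead of MFC's ASSERT, and the Shape and Circle types are hypothetical.

```cpp
#include <cassert>

// Verifies the cast with dynamic_cast in Debug builds,
// and is a plain static_cast when NDEBUG is defined (Release builds).
template <typename To, typename From>
To checked_cast(From* p) {
    assert(p == nullptr || dynamic_cast<To>(p) != nullptr);
    return static_cast<To>(p);
}

// Hypothetical polymorphic types used to try the cast.
struct Shape {
    virtual ~Shape() {}
};

struct Circle : Shape {
    int radius;
    Circle() : radius(3) {}
};
```

Usage looks like an ordinary cast, for example Circle* c = checked_cast<Circle*>(shape_ptr); the source type must be polymorphic for the Debug check to compile.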
Avoiding Stepping Into Things
It's often useful to avoid stepping into some common code like constructors or overloaded operators. autoexp.dat provides this capability. Add a section called "[ExecutionControl]". Add keys where the key is the function name and the value is "NoStepInto". You can specify an asterisk (*) as a wildcard after the first set of colons for a namespace or class.
autoexp.dat is only read on Visual Studio's start up.
To ignore the function myfunctionname, and all calls to the class CFoo:
[ExecutionControl]
myfunctionname=NoStepInto
CFoo::*=NoStepInto
To ignore construction and assignment of MFC CStrings (notice the extra = in CString::operator=):
[ExecutionControl]
CString::CString=NoStepInto
CString::operator==NoStepInto
To ignore all ATL calls:
[ExecutionControl]
ATL::*=NoStepInto
Naming Threads
Use "SetThreadName". The name is limited to 9 characters. SetThreadName raises an exception, which the debugger catches and uses to name the thread. The name will appear in the Debug > Threads dialog.
#define MS_VC_EXCEPTION 0x406d1388