Reusable library design

What Makes for a Highly Useable Library?

In the previous section I talked about all kinds of things that can go wrong with dynamic libraries. Unfortunately, some of these problems can even occur with static libraries, such as the point about two libraries vying for the same resource. (What difference does it make if the code is linked into an application or sitting in a dynamic library? If two copies of the code are running, and both vie for the same resource, you will still have problems.)

Therefore, in the following sections I talk about some approaches you can take to make sure the library you create behaves well on the user’s system. And, of course, this being a book about usability, I’m talking about libraries that are highly useable. (Would you expect anything different?)

A Unique Naming Convention

Choosing a unique name for your libraries helps make sure they don’t clash with other libraries. This section applies primarily to dynamic libraries, although if you’re creating a library (either static or dynamic) to be used by other developers, you want to make sure your names don’t clash then, either.

It would be crazy for me to try to give you a naming convention and then tell you to stick with it. However, what I can do is offer suggestions. First, the days of 8.3 filenames on Microsoft systems are a thing of the past. For some time even after Windows NT and Windows 95 both came along, some network servers still had trouble with long filenames. That’s not a concern anymore. You now have plenty of room to work.

Therefore, with all that extra space in a filename, consider using a naming convention that includes your company or product name, or an abbreviation of your company or product name. For example, instead of this name

comm.dll 

for the communication portion of your software, try something like this

abccomm.dll 

where abc is the name of your company. And if you want to embed version numbers (something I talk about in a later section, “Proper Handling of Versions”), you might do something like this

abccomm03.dll 

for version 3, for example. Or, you might even make a name that’s especially clear to end users, such as

CompanyName_CommunicationsLibrary_v03.dll 

Thus, remember:


RULE 

Pick a name that you’re reasonably sure will be unique.
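If you build your filenames in a script or an installer, the convention above can be captured in a tiny helper. This is only a sketch: the function name makeLibraryName and the two-digit, zero-padded version field are my own assumptions, chosen to reproduce the abccomm03.dll example.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not part of any SDK): builds a name such as
// "abccomm03.dll" from a company prefix, a component name, and a
// major version number padded to two digits.
std::string makeLibraryName(const std::string& companyPrefix,
                            const std::string& component,
                            int majorVersion) {
    std::ostringstream name;
    name << companyPrefix << component
         << std::setw(2) << std::setfill('0') << majorVersion  // 3 becomes "03"
         << ".dll";
    return name.str();
}
```

Calling makeLibraryName("abc", "comm", 3) yields abccomm03.dll. Keeping the rule in one place means every library you ship follows the same pattern.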

Loading the Dynamic Library Now or Later?

On most modern operating systems, you can choose whether to have the operating system load a dynamic library automatically when your application starts or to have your program load the dynamic library itself, manually.

You can read elsewhere about the technical details of what takes place when the operating system’s loader loads a dynamic library for you as it loads your application. The process involves loading the library into memory, obtaining the memory addresses of all the exported functions, and then loading your application, filling in the memory addresses of the function calls. Although complex, this process is easy for the programmer, because you don’t have to do anything at all; the operating system takes care of the hard work.

But, like everything in life, you get what you pay for. With this minimal work comes minimal power. If you want the loader to load the libraries for you, you’re limited in where those libraries can be located on the hard drive: They can be in the same directory as the executable, or they can be on the system path. (At least, that’s the case for Windows. Unix uses a very different approach, which I’ll describe shortly in this section.)

If you want to load your dynamic libraries from places other than the same directory as the program or the system path, you can instead manually load the libraries, specifying the full path and filename for the library. But that’s just the beginning; you then have to manually obtain the memory address of each function in the library you wish to call and then call the function through a pointer. If you attempt to just call the function directly, the build will fail at the link step, because the linker will be unable to find the function you’re calling.

To locate the functions in your library at runtime, you first load the library with LoadLibrary and then use the GetProcAddress function; both are part of the Win32 API. These functions are well documented, so I won’t cover them in detail here.
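As a rough sketch only (not a substitute for the documentation), manual loading on Windows boils down to three calls: LoadLibrary to load the file from a full path, GetProcAddress to fetch a function’s address by name, and FreeLibrary to unload. The wrapper names loadLibraryAt, findFunction, and unloadLibrary are my own inventions for illustration. The non-Windows branch, which uses dlopen, dlsym, and dlclose, is there only so the sketch builds on other systems.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
using LibraryHandle = HMODULE;
#else
#include <dlfcn.h>
using LibraryHandle = void*;
#endif

// Loads a library from an explicit path; returns a null handle on failure.
LibraryHandle loadLibraryAt(const std::string& fullPath) {
#ifdef _WIN32
    return LoadLibraryA(fullPath.c_str());
#else
    return dlopen(fullPath.c_str(), RTLD_NOW);
#endif
}

// Looks up an exported function by name and casts it to the caller's
// function-pointer type. The caller must supply the correct signature;
// neither operating system checks it. Returns null if the name is missing.
template <typename FunctionPointer>
FunctionPointer findFunction(LibraryHandle library, const char* name) {
#ifdef _WIN32
    return reinterpret_cast<FunctionPointer>(GetProcAddress(library, name));
#else
    return reinterpret_cast<FunctionPointer>(dlsym(library, name));
#endif
}

// Releases the library once the program no longer needs its functions.
void unloadLibrary(LibraryHandle library) {
#ifdef _WIN32
    FreeLibrary(library);
#else
    dlclose(library);
#endif
}

// Usage (the path and the exported name SendPacket are made up):
//   typedef int (*SendPacketFn)(const char*);
//   LibraryHandle lib = loadLibraryAt("D:\\Me\\Libraries\\abccomm.dll");
//   if (lib) {
//       SendPacketFn send = findFunction<SendPacketFn>(lib, "SendPacket");
//       if (send) send("hello");
//       unloadLibrary(lib);
//   }
```

Notice that every call goes through a pointer and every step can fail at runtime, which is exactly the extra work the loader would otherwise do for you.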

In the sections that follow, I talk about different issues surrounding dynamic libraries, and for many of them I compare the pros and cons of letting the loader load the library versus loading the library manually.

Proper Handling of Versions

Microsoft Windows allows you to embed version information in your DLLs. It doesn’t work. Yes, you can embed the information, and yes, you can hope that the installation program you’re using is smart enough to look at the versions, and you can hope that other developers have used installation programs that look at version numbers. And you can also hope that if a user installs your program first and then installs somebody else’s program that uses one of the DLLs that your program installed, that other program’s installer will be smart enough to not overwrite your file if the version is older.

At least you can hope. And while you’re at it, you can hope that all disease, famine, and war will end.

Or, better, you can become an activist and try to help end disease, famine, war, and DLL problems. I can’t give you many tips on the famine thing, but I can give you some tips on dynamic libraries, whether on Windows, Unix, or any other system, regardless of whether the system supposedly supports versioning information.

But in case you’re curious, here’s the rundown of versioning information on Windows. First I’ll give you the old version approach, and then I’ll talk briefly about the way Windows does versioning under the newer .NET architecture. And then after that I’ll show you how you can implement your own versioning system, regardless of operating system.

The Old Windows Versioning System

Although the older versioning system on Windows is pretty much useless in a practical sense, I do recommend using it if for nothing more than to embed version information in your files in case anybody (including some wiser installation programs) looks at the files.

When you link together a binary executable on Windows (whether the executable is an EXE file or a DLL file), you can include resources in the final file. Such resources can include bitmaps, menus, dialog boxes, icons, string constants (for international programs, you can embed translated strings in place of your native language), and yes, version information.

Figure 10.1 shows an example of a project in Microsoft Visual C++ 6.0 where I’m setting the version information. You can see from this figure that the version information resource includes:

  • File version

  • Product version

  • Operating system

  • File type (which can be VFT_APP for application, and other names for such types as DLL, device driver, virtual device driver, and font file)

Figure 10.1: Visual C++ lets you create a version resource for your executables.

In addition, you can enter the following information:

  • Comments: Some general comments by you

  • Company name: The name of your company

  • File description: A description of what the program does

  • File version (this is just a copy of the file version in the upper part of the screen)

  • Internal name (such as Competition Killer 1.0, I suppose)

  • Legal copyright: Such as Copyright (c) 2003 Me, Inc.

  • Legal trademarks: Such as The Me logo is a trademark of Me, Inc.

  • Some various build numbers and product versions; the product version is a copy of the product version in the upper part of the screen.

You can treat every bit of information in this version screen as strictly comments. You can put whatever you want in them. What is the information used for? Very little. The biggest problem is that two versions of the same DLL will usually have the same filename, and Windows doesn’t allow two files in the same directory to have the same filename, even if the two files have different version information.

Some smarter installation programs use the FileVersion field, and some system utility programs display all the version information. But beyond these uses, this information really isn’t valuable. However, if you are using Windows, I still recommend that you create a version information resource. The reason is that some power users (such as me) will use various spy tools to see which executables are running on our computers. Such spy tools will list information from the version information resource, if the information is present; otherwise the tools will list just the filename. And if I see something like this

ea.dll 

(or ea.exe, or ea.vxd, and so on), I’ll be suspicious and will begin searching the Web under the assumption that I’m looking at a virus. But if I see with the entry an entire description of the program, then I’ll feel a little bit better. (Of course, the virus makers could lie and claim their file comes from Microsoft, for example. However, if I see no information at all, I automatically assume virus.)
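For the installers that do consult the FileVersion field, the decision they need to make is simple: replace the installed file only if the incoming one is strictly newer. Here is a minimal sketch of that comparison, assuming the version arrives as a four-part string such as 1.0.1.0. The names parseFileVersion and shouldOverwrite are hypothetical, and on Windows the real numbers would come from the version APIs (GetFileVersionInfo and VerQueryValue) rather than from a string.

```cpp
#include <array>
#include <cstdio>
#include <string>

// A four-part file version such as 1.0.1.0.
using FileVersion = std::array<unsigned, 4>;

// Parses "a.b.c.d"; any parts that are missing are treated as zero.
FileVersion parseFileVersion(const std::string& text) {
    FileVersion v = {{0, 0, 0, 0}};
    std::sscanf(text.c_str(), "%u.%u.%u.%u", &v[0], &v[1], &v[2], &v[3]);
    return v;
}

// True only when the incoming file is strictly newer than the installed one.
// std::array compares element by element, so 1.10 correctly beats 1.9.
bool shouldOverwrite(const std::string& installed, const std::string& incoming) {
    return parseFileVersion(incoming) > parseFileVersion(installed);
}
```

The comparison is numeric, part by part. Comparing the strings directly would rank 1.9 above 1.10, which is one of the ways a careless installer can clobber a newer file.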

The New Windows .NET Versioning System

The Windows .NET architecture takes a very different approach to versioning. You can put your files in the Global Assembly Cache (GAC), which is really just a directory on the computer, usually C:\WINDOWS\assembly\GAC. But you don’t just dump your files off inside this directory. Instead, you have to either make use of the installer tool that ships with Visual Studio.NET or use the gacutil.exe program that also ships with Visual Studio.NET.

Now remember, in .NET, your DLLs are called assemblies. (Technically, an assembly is a type of DLL that contains additional information used by .NET.) I briefly mentioned the GAC in Chapter 8, “Under the Hood,” in the section, “Mucking with the System Directory: Keep Out!” But here are more details: The GAC includes a separate directory for each assembly. (That’s why you don’t dump your assembly into the GAC’s directory; you need your own subdirectory.) Under each subdirectory lives a separate subdirectory for each version of the assembly. The subdirectory is named for the version. Finally, inside each version subdirectory goes your actual DLL file containing the assembly.

The Visual Studio.NET installer tool is the tool you use to create an installer for your end users. The end users use your installer to install your product, and you can set up the installer to insert assemblies into the GAC. Or, if you’re installing a product manually (such as on a developer computer), you can use the gacutil.exe program by simply typing

gacutil /i myfile.dll 

(The online help entry for gacutil describes all the command-line parameters, such as one for listing the contents of the GAC.)

In order to put your assembly in the global cache, you need to make sure your assembly is a strong-named assembly. A strong-named assembly is simply an assembly that has a public key and digital signature attached to it, combined with information about the assembly, including its name and version number. (There’s that version stuff I’ve been talking about.) I won’t show you the steps for creating a strong-named assembly; instead, open up the Visual Studio.NET Combined Collection (that’s the name for the online help) and type strong-named into the index. There you’ll find all the gory, er, I mean helpful details. After you create a strong-named assembly (even if strongly named sounds better), you can insert the assembly into the Global Assembly Cache.

But you don’t have to always put your DLLs in the GAC. In fact, in general you should not put them in the GAC. Put them in the GAC only if you expect them to be used by multiple applications. For your own private assemblies that will be used only by your program, simply put them in the same directory as your executable file.

Now just to make sure I’m not blowing smoke here, I created a sample project in Visual C++.NET. The project is a Managed C++ Class Library. I then created a key pair (which is required in order to create a strong-named assembly); I typed the key pair name into the Key File Attribute section of the AssemblyInfo.cpp file; I then included a version number in this same file. The version number I specified is 1.0.1. (By convention, that means version 1.0, build 1.) I built the library and then opened up the .NET command-line prompt (which is available from the Start menu under Visual Studio.NET tools), and from the directory containing the built DLL I typed

gacutil /i UsableAssembly.dll 

Then, I returned to Visual C++.NET, changed the version number to 2.0.1, and rebuilt the project. I went back to the command-line prompt and typed the same gacutil command.

Next, I looked at the GAC from the command prompt. Here’s what I saw:

C:\WINDOWS\assembly\GAC\UsableAssembly>dir /s 
Volume in drive C has no label.
Volume Serial Number is 9090-6698

Directory of C:\WINDOWS\assembly\GAC\UsableAssembly

09/28/2003 12:48 AM <DIR> .
09/28/2003 12:48 AM <DIR> ..
09/28/2003 12:48 AM <DIR> 1.0.1.0__9aed7bce1e438dd5
09/28/2003 12:48 AM <DIR> 2.0.1.0__9aed7bce1e438dd5
0 File(s) 0 bytes

Directory of C:\WINDOWS\assembly\GAC\UsableAssembly\1.0.1.0__9aed7bce1e438dd5
09/28/2003 12:48 AM <DIR> .
09/28/2003 12:48 AM <DIR> ..
09/28/2003 12:48 AM 252 __AssemblyInfo__.ini
09/28/2003 12:48 AM 122,880 UsableAssembly.dll
2 File(s) 123,132 bytes

Directory of C:\WINDOWS\assembly\GAC\UsableAssembly\2.0.1.0__9aed7bce1e438dd5

09/28/2003 12:48 AM <DIR> .
09/28/2003 12:48 AM <DIR> ..
09/28/2003 12:48 AM 252 __AssemblyInfo__.ini
09/28/2003 12:48 AM 122,880 UsableAssembly.dll
2 File(s) 123,132 bytes

Total Files Listed:
4 File(s) 246,264 bytes
8 Dir(s) 9,818,222,592 bytes free

C:\WINDOWS\assembly\GAC\UsableAssembly>

Inside the GAC directory I can see a UsableAssembly directory. (That’s what I called my assembly.) Under that directory, I see two version subdirectories, each containing a different version of the DLL. (The name of each version subdirectory is the version number followed by some hex numbers that are related to the public key portion of the key pair.)
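If you ever need to inspect these subdirectory names in code, they split cleanly on the underscores. The sketch below assumes the pattern is version, then an underscore, then a culture name (empty here, which is why two underscores sit side by side), then another underscore and the hex digits. The culture field is my reading of the layout rather than something shown in the listing, and GacEntry and parseGacDirectoryName are names I made up.

```cpp
#include <string>

// The pieces of a GAC version-directory name such as
// "1.0.1.0__9aed7bce1e438dd5".
struct GacEntry {
    std::string version;         // e.g. "1.0.1.0"
    std::string culture;         // assumed field; empty in the listing above
    std::string publicKeyToken;  // hex digits derived from the public key
};

// Splits the name at its first two underscores.
// Returns false if the name does not contain two underscores.
bool parseGacDirectoryName(const std::string& name, GacEntry& out) {
    const std::string::size_type first = name.find('_');
    if (first == std::string::npos) return false;
    const std::string::size_type second = name.find('_', first + 1);
    if (second == std::string::npos) return false;
    out.version = name.substr(0, first);
    out.culture = name.substr(first + 1, second - first - 1);
    out.publicKeyToken = name.substr(second + 1);
    return true;
}
```

Treat this as a way to read the listing, not as a supported interface; the layout of the GAC is an implementation detail, and gacutil is the proper tool for querying it.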

For the most part, this system works and is much better than the previous Windows versioning system. The idea is that you can have multiple versions of your DLLs on a single computer, and the .NET system will help each application locate the version it needs.

The Unix and Linux Versioning System

On a Unix system, life is very different from Windows. The various modern breeds of Unix automatically support versions of shared libraries. For example, here’s a list of some files I found on a Linux system in the /lib directory:

lrwxrwxrwx 1 root root     12 Jul 17 11:09 libdb.so -> libdb-3.1.so*
lrwxrwxrwx 1 root root     15 Jul 17 11:09 libdb.so.2 -> libdb1.so.2.1.3*
lrwxrwxrwx 1 root root     11 Jul 17 11:09 libdb.so.3 -> libdb2.so.3*
-rwxr-xr-x 1 root root 525905 Oct 12  2002 libdb-3.1.so*
lrwxrwxrwx 1 root root     15 Jul 17 11:09 libdb1.so.2 -> libdb1.so.2.1.3*
-rwxr-xr-x 1 root root  62620 Oct 12  2002 libdb1.so.2.1.3*
-rwxr-xr-x 1 root root 289204 Oct 12  2002 libdb2.so.3*

At the top of this listing are three dynamic library files, libdb.so, libdb.so.2, and libdb.so.3. At runtime, a program can request to link dynamically to version 2 of this library (specified as libdb.so.2), version 3 of this library (specified as libdb.so.3), or whatever is the current version (specified as libdb.so).

But all three of these are symbolic links to other files. The first, libdb.so, is a link to the current version, libdb-3.1.so. The libdb.so.2 file is a link to libdb1.so.2.1.3. And the libdb.so.3 file is a link to libdb2.so.3. The files these links link to, in turn, are the actual shared libraries, not links.

This versioning system on Unix allows the application developer to choose whether the application should just always load the current version of a library (such as libdb.so) or to always choose a particular version of a library (such as libdb.so.2).

But this system has a strange caveat: Notice the heavy use of symbolic links. The reason is that the versioning system doesn’t allow for minor versions. You cannot, for example, choose version 2.1.3 instead of version 2.1.2. Instead, you can choose only version 2, which is the major version. Or you can choose version 3, but nothing in between. If you choose version 2, you’ll get whatever minor version is currently on the system.
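To make the major-versus-minor distinction concrete, here is a small sketch that pulls the version numbers out of a shared library filename. The names SharedLibraryName, parseSharedLibraryName, and matchesMajor are hypothetical, and the parsing assumes the simple name.so.major.minor.patch pattern seen in the listing.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

struct SharedLibraryName {
    std::string base;          // e.g. "libdb1"
    std::vector<int> version;  // e.g. {2, 1, 3}; empty if there is no suffix
};

// Splits "libdb1.so.2.1.3" into its base name and numeric version parts.
// Returns false if the filename contains no ".so" marker.
bool parseSharedLibraryName(const std::string& fileName, SharedLibraryName& out) {
    const std::string marker = ".so";
    const std::string::size_type pos = fileName.find(marker);
    if (pos == std::string::npos) return false;
    out.base = fileName.substr(0, pos);
    out.version.clear();
    std::string::size_type i = pos + marker.size();
    // Each pass consumes one ".number" group after the marker.
    while (i < fileName.size() && fileName[i] == '.') {
        const std::string::size_type next = fileName.find('.', i + 1);
        const std::string part = fileName.substr(
            i + 1, next == std::string::npos ? std::string::npos : next - i - 1);
        out.version.push_back(std::atoi(part.c_str()));
        i = next;  // npos ends the loop
    }
    return true;
}

// An application that asks for "version 2" can only match on the first number.
bool matchesMajor(const SharedLibraryName& library, int major) {
    return !library.version.empty() && library.version[0] == major;
}
```

A request for major version 2 matches libdb1.so.2.1.3 and would match a hypothetical libdb1.so.2.1.2 equally well, which is the caveat in a nutshell: the application can’t tell the two apart.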

Implementing Your Own Versioning System on Windows

If you want to implement your own versioning system, you have several options for doing so. Remember, our goal is to make the application as useable as possible. Therefore, as I list these possibilities, I present them in the light of usability. Here are three possible approaches; you can probably think of some others:

  • Putting the dynamic library files in the same directory as the executable

  • Using the Registry (when on a Windows system)

  • Using a common area in conjunction with version information

The first item is by far the simplest: Just drop the dynamic libraries in the same directory as the executable. If different programs are on the system that require different versions of the same library, then each program will have its own version right in its own directory.

In the past, this might not have been a very good idea, since hard drive space came at a premium. If you had a 500MB hard drive, and a DLL took up, say, 1MB, and 15 programs each needed their own copy of the DLL, then the 15 copies would take up 15MB. But today that’s not even an issue. What’s 15MB in a world of 80GB hard drives? Therefore, this is a viable option.

But on the other hand, suppose the user decides to clean out her hard drive, and she stumbles across the file called abccomm.dll. She then does a full search of the hard drive and finds 15 copies of this file scattered all about the system. I know that I, personally, would be a little distraught: Why are there 15 copies all over the place? Are they really necessary? And worse, I might be rather troubled if I find that they all have different timestamps and sizes!

But this approach also has another potential problem: What if this library is used by many different developers, and some of the applications put the library in their own directory, and at least one dumps the file into the system directory? What will happen? Well, by default, Windows will first look in the same directory as the executable. (I tested this out and found it to be true; I created two different versions of a DLL and put one in an application directory and one in the c:\windows\system32 directory, and the application located the one in its own directory.) Therefore, if some rogue developer decides to put all his dynamic libraries in the system directory, your application will be safe if you put the dynamic libraries in your application’s own directory.

Another option on a Windows system is to use the Registry. Remember, you have two ways to load a dynamic library: You can let the Windows loader load the library when it loads your application, or you can have your program load the library manually. If you’re willing to load the library manually, you can make use of the Registry to locate the library. The idea is simple. During installation, save a key in the Registry that holds the name of the directory containing the library.

Using this Registry approach, you can build your own versioning system. Here’s how to do this: For saving the location of dynamic libraries, you will want to use the HKEY_LOCAL_MACHINE tree. (That’s as opposed to HKEY_CURRENT_USER, since the location of the library will be the same regardless of which user is logged in.) Under this key is the Software key; under Software you create a key for your company name. Under your company name you have some choices: You can create a product key and put the library information there. Or, you can create a key specifically for the libraries. I recommend this approach; that way, you can have multiple applications (that you wrote) that use the same library but only one copy of the library and only one key pointing to the library.

Suppose, then, that your company is named Me, Inc. Your library is mecomm.dll, and your two applications are AllMe9000 and MeMeMe2000. Here’s how you might arrange the keys in the Registry:

HKEY_LOCAL_MACHINE
    Software
        Me, Inc.
            AllMe9000
                options
            MeMeMe2000
                options
            LibraryLocations
                mecomm.dll
                    1.0
                    2.0

The 1.0 key would have a string value containing the path to version 1.0 of the mecomm.dll library. (Remember, each key can have a set of named values as well as a default value. You can either have the default value for the 1.0 key contain the path or you might, for example, have a value named Path hold the path.)

Then, each application can simply read the key HKEY_LOCAL_MACHINE\Software\Me, Inc.\LibraryLocations\mecomm.dll\1.0 to obtain the path to version 1.0 of the library and then load the library manually. Or, the application can drill down to 2.0 to locate version 2.0 of the library.
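Here is a minimal sketch of that lookup. It assumes the path is stored in a value named Path under the version key (one of the two options mentioned earlier). The function names libraryLocationKey and readLibraryPath are hypothetical, and the Registry calls are compiled only on Windows, where the program also needs to link against advapi32.

```cpp
#include <string>

// Builds the subkey, relative to HKEY_LOCAL_MACHINE, that holds the
// location of one version of one library.
std::string libraryLocationKey(const std::string& company,
                               const std::string& library,
                               const std::string& version) {
    return "Software\\" + company + "\\LibraryLocations\\" + library + "\\" + version;
}

#ifdef _WIN32
#include <windows.h>

// Reads the string value named "Path" under the given subkey.
// Returns an empty string if the key or the value is missing.
std::string readLibraryPath(const std::string& subKey) {
    HKEY key;
    if (RegOpenKeyExA(HKEY_LOCAL_MACHINE, subKey.c_str(), 0, KEY_READ, &key)
            != ERROR_SUCCESS)
        return "";
    char buffer[MAX_PATH] = {0};
    DWORD size = sizeof(buffer) - 1;  // keep room for a terminating null
    DWORD type = 0;
    const LONG result = RegQueryValueExA(key, "Path", NULL, &type,
                                         reinterpret_cast<LPBYTE>(buffer), &size);
    RegCloseKey(key);
    if (result != ERROR_SUCCESS || type != REG_SZ) return "";
    return std::string(buffer);
}
#endif
```

An application would call readLibraryPath(libraryLocationKey("Me, Inc.", "mecomm.dll", "1.0")) and, if the result isn’t empty, hand that directory plus the filename to its manual loading code.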

This Registry approach has an important benefit: You don’t need to mess with the system path! Personally, I am upset when an application installs itself and then adds its location to the system path.

The third option for rolling your own version system deals with using a common area in conjunction with version information. By this I simply mean you create a directory that contains several subdirectories, one for each version. Each version subdirectory contains a different version of the DLL.

This is the same way .NET handles its Global Assembly Cache, and in a sense this method mirrors the Registry approach. On Windows, you would typically create a directory for your products under the Program Files directory. For example, if you again have a company called Me, Inc., and you have two products, AllMe9000 and MeMeMe2000, and finally a library used by both applications called mecomm.dll, then you might create a directory structure like this:

Program Files
    MeInc
        AllMe9000
        MeMeMe2000
        Libraries
            1.0
            2.0

In the 1.0 directory you would place the 1.0 versions of your libraries. In the 2.0 directory, you would place the 2.0 versions. Or, if your libraries carry version numbers that have nothing to do with one another, you might give each library its own set of version subdirectories, like this:

Program Files
    MeInc
        AllMe9000
        MeMeMe2000
        Libraries
            mecomm.dll
                1.0
                2.0
            another.dll
                6.5
                7.2

Note 

I chose to leave out the comma, space, and period in the name Me, Inc., although you’re perfectly allowed to use these characters in directory names. On Windows, however, if you end a directory name with a period, that period won’t show up. If you create a directory called abcdef., then the final dot won’t make it into the directory name.

In the mecomm.dll\1.0 directory you would have mecomm.dll version 1.0, and in the mecomm.dll\2.0 directory you would have version 2.0 of the library. In the another.dll\6.5 directory you would have version 6.5 of another.dll, and in the another.dll\7.2 directory you would have version 7.2 of the library.
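With the second layout, the full path to a particular version follows mechanically from the root directory, the library name, and the version. The sketch below shows one way to assemble it; joinPath and versionedLibraryPath are hypothetical names, and the root directory is passed in rather than assumed.

```cpp
#include <string>

// Joins two path pieces with exactly one separator between them.
std::string joinPath(const std::string& left, const std::string& right,
                     char separator = '\\') {
    if (left.empty()) return right;
    if (left[left.size() - 1] == separator) return left + right;
    return left + separator + right;
}

// Builds <root>\Libraries\<library>\<version>\<library>, matching the
// per-library layout shown above.
std::string versionedLibraryPath(const std::string& productRoot,
                                 const std::string& library,
                                 const std::string& version) {
    std::string path = joinPath(productRoot, "Libraries");
    path = joinPath(path, library);
    path = joinPath(path, version);
    return joinPath(path, library);
}
```

The result is a full path and filename, which is exactly what the manual loading functions need.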

Finally, remember this:


Warning 

If you create a directory structure or a Registry structure pointing to the various versions of the libraries, the system loader will not be able to locate the libraries (as I describe in the following section, “Placing Libraries in the Correct Locations on Windows”). Your program must, then, manually load the libraries.

Placing Libraries in the Correct Locations on Windows

If you don’t want to manually load your dynamic libraries, then you must rely on the system loader to load the libraries for you. Of course, doing so is much easier, because you don’t need to locate the functions inside the library. But with this approach comes the following limitation: If you rely on the loader to load your libraries, your libraries must be either in the same directory as the application file or somewhere in the system path.

From a usability standpoint, this means the only legitimate place to put your files is in the same directory as the application file, if you’re going to rely on the automatic loading of the libraries. Why? Here’s why:


RULE 

Don’t mess with the user’s system path.

As a user, I have two main reasons why I don’t want applications messing with my system path:

  • The system path is too easy for me, the user, to change, and that could cause your program to break.

  • I don’t want to see my system path getting longer, and longer, and longer.

Imagine if some user who was too smart for his own good decided to clean out the system path. (That’s easy to do, remember!) Then, suddenly, your application doesn’t run. Guess what comes next: support calls!

But modifying the system path has another problem: If you modify the system path for the current user, then the path won’t be modified for another user, and your program won’t run for that other user.

Of course, system restores and all kinds of other system-modifying tasks could alter the system path. Thus, as I just mentioned, if you’re letting the system loader load your libraries for you, the only viable place to put them is in the same directory as the application. Then you don’t need to touch the system path.

But with this come two related issues that I want to bring to your attention; these issues apply more to the manual loading of dynamic libraries. First:


RULE 

Don’t hard-code any paths into your product!

And second:


RULE 

Stay away from environment variables.

The first of these should go without saying. Don’t hard-code the string c:\Program Files\MeInc\ into your program. Users might decide to install your product elsewhere.
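One way to avoid the hard-coded string is to ask the operating system where your executable actually lives and build paths from that. What follows is a sketch under assumptions: directoryOf, libraryBesideExecutable, and currentExecutablePath are names I invented, and the Windows-only part relies on GetModuleFileName, which returns the full path of the running program.

```cpp
#include <string>

// Returns the directory part of a full path, including the trailing
// separator, or an empty string if the path has no directory part.
std::string directoryOf(const std::string& fullPath) {
    const std::string::size_type slash = fullPath.find_last_of("\\/");
    return slash == std::string::npos ? std::string()
                                      : fullPath.substr(0, slash + 1);
}

// Full path of a library that sits next to the executable.
std::string libraryBesideExecutable(const std::string& executablePath,
                                    const std::string& libraryName) {
    return directoryOf(executablePath) + libraryName;
}

#ifdef _WIN32
#include <windows.h>

// Asks Windows for the full path of the running executable.
// Returns an empty string if the call fails.
std::string currentExecutablePath() {
    char buffer[MAX_PATH];
    const DWORD length = GetModuleFileNameA(NULL, buffer, MAX_PATH);
    return std::string(buffer, length);
}
#endif
```

Wherever the user installs the product, libraryBesideExecutable(currentExecutablePath(), "mecomm.dll") follows along, and no drive letter ever appears in your source code.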

But what about environment variables? One tempting way to avoid hard-coding paths might be to set an environment variable, say, MeFiles, to the value c:\Program Files\MeInc\. And yes, this would work. But this has the same problems as modifying the system path. First, I, as a user, get upset when I discover that a whole bunch of programs all deemed it necessary to create a bunch of environment variables on my system. Second, when you switch to a different user, depending on how these variables are stored, they may go away. And third, some too-smart-for-his-own-good user might delete the variables, resulting in a support call.

Therefore, here are your primary options for deciding where to put your dynamic libraries:

  • Put automatically loaded libraries in the same directory as the application’s executable file.

  • If you want to put the files elsewhere, load them manually.

Finally, if you feel the need for an environment variable storing, for example, the root directory of your products, here’s a viable alternative: Store this directory in the Registry. From there, you can construct your directories by appending the subdirectory names to this root path. For example, if your root path is c:\Program Files\MeInc\, then you can append the string Libraries\mecomm.dll\2.0 to locate version 2.0 of the mecomm.dll file. (But I’m still not totally pleased with this approach, because this again requires hard-coding certain information into your program.)

What is my favorite choice as a user? I prefer that the software you write use the Registry approach that I describe in “Implementing Your Own Versioning System on Windows” in this chapter. That, of course, requires that you manually load your libraries. And if you refuse to manually load your libraries, then put the libraries in the same directory as the executable file.

Properly Using Resources in a Multitasking Environment (Think: Mutexes!)

If you write a library (either static or dynamic) that might be used by more than one application, and one that accesses resources of any kind (whether it’s a file, a hardware device, or whatever), then you will want to make sure your library can survive being run in a multitasking environment.

For example, if your library writes to a single log file, what happens if two programs use the library simultaneously? If you aren’t careful, you’ll probably end up with intertwined text. And what if you have two programs using your library to simultaneously read from a port? Who knows what exactly will happen, but it will be messy.


RULE 

Always expect that your library will be running in a multitasking environment. Code it as such.

The proper solution for sharing resources is to use mutexes. Remember, a mutex is a synchronization object that only one process or thread can own at any given time. Another process or thread can ask the operating system for ownership of the mutex, and that process or thread will block (that is, wait without running) until the current owner lets go of the mutex. Think of a mutex as a single key to access a device or other resource, and only one process can hold the key at any given time. Only when a process is finished with the key can another process take ownership of the key. Windows includes special functions for using mutexes, and Unix systems offer mutex functions through the Posix threads library.

Here’s an example of a class that makes use of a mutex for writing to a log file. This class has a hard-coded log filename (generally a bad idea, but I wanted to keep this example simple). This class would go inside the library:

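#include <windows.h>
#include <fstream>
using namespace std;

class LogFile {
protected:
    HANDLE hMutex;   // named mutex shared by every process using the library
    ofstream *f;     // the log file, opened in append mode
public:
    LogFile() : hMutex(NULL), f(NULL) {}
    void Open();
    void WriteLogLine(const char *line);
    void Close();
};

void LogFile::Open() {
    // A named mutex is visible across processes; if another process has
    // already created "MyIncLogFile", this call returns a handle to it.
    hMutex = CreateMutexA(NULL, FALSE, "MyIncLogFile");
    f = new ofstream("c:/temp/myfile.dat", ios_base::app);
}

void LogFile::WriteLogLine(const char *line) {
    // Wait until no other process or thread owns the mutex.
    WaitForSingleObject(hMutex, INFINITE);
    *f << line << endl;   // endl flushes the line before the mutex is released
    ReleaseMutex(hMutex);
}

void LogFile::Close() {
    delete f;             // closes the log file
    f = NULL;
    CloseHandle(hMutex);
    hMutex = NULL;
}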

As you can see, I’m using the Win32 mutex functions and data structures; if you want, you can instead use the Posix versions. Here’s a sample main that uses this library:

#include <cstdio>

int main(int argc, char* argv[])
{
    LogFile log;
    log.Open();
    char buf[200];
    for (int i=0; i<100000; i++) {
        sprintf(buf, "Hi there everybody, my number is %d", i);
        log.WriteLogLine(buf);
    }
    log.Close();   // release the file and the mutex handle
    return 0;
}

This program writes out 100,000 lines to a log file. I compiled this program as console1.exe, and then I wrote a batch file that launches two instances of the program simultaneously. Here’s the batch file:

start console1.exe 
start console1.exe

When I ran this program without mutexes, I ended up with lines that were mixed together, as in

Hi there everybody, my number Hi there everybody, my number is 10.

But when I used the mutexes, this intertwined output no longer occurred.
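The same idea carries over when the competing writers are threads inside one program instead of separate processes. The following is a sketch of that in-process case using the standard C++ std::mutex, which is a swap from the named Win32 mutex above: it has no name and can’t be shared between processes. ThreadSafeLog is a hypothetical class, and it writes to whatever stream you hand it.

```cpp
#include <mutex>
#include <ostream>
#include <sstream>   // for std::ostringstream in the usage example below
#include <string>

// In-process counterpart of LogFile: serializes writers that are threads
// within a single program. It does NOT protect against other processes.
class ThreadSafeLog {
public:
    explicit ThreadSafeLog(std::ostream& destination) : out_(destination) {}

    void WriteLogLine(const std::string& line) {
        // lock_guard takes the mutex here and releases it when the function
        // exits, even if the stream throws, so the lock cannot be left held.
        std::lock_guard<std::mutex> hold(mutex_);
        out_ << line << '\n';
        out_.flush();
    }

private:
    std::mutex mutex_;
    std::ostream& out_;
};

// Usage:
//   std::ostringstream sink;
//   ThreadSafeLog log(sink);
//   log.WriteLogLine("Hi there everybody");
```

The guard object is the design choice worth noticing. With a paired wait and release, an early return or an exception between the two leaves the mutex locked; tying the release to the end of the function removes that whole class of bug.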
