Wednesday, June 14, 2006

Win32: Mappings for Unicode support

I have spent some time over the past few days playing with the Win32 API again, a year after first looking at it. I must learn how to manage processes under Windows as part of my SoC project, Boost.Process, and this involves native Windows programming with the Win32 API.

After creating a fresh C++ Win32 console application project from Visual Studio 2005, I noticed that the template code had a _tmain function rather than a main one. I did not pay much attention to it until I looked at some code examples that deal with the CreateProcess call: they use weirdly named functions such as _tcsdup and types such as _TCHAR instead of the traditional strdup and char respectively. I could not resist finding out why.

Spending some time searching and reading the MSDN documentation answered my question. These functions and types are wrappers around the standard objects: the functions and types they really point to depend on whether you define the _UNICODE macro during the build or not.

As you can easily guess, defining _UNICODE maps those routines and types to entities that can handle Unicode strings, effectively making your application Unicode-aware. Conversely, if you do not define the symbol, the application remains SBCS/MBCS compatible (the choice between these two depends on another macro, _MBCS). And because all these redirections are handled by the preprocessor, there is no run-time overhead.

For example, the _tmain function is mapped to the traditional main subroutine when _UNICODE is undefined, and to wmain otherwise. The latter takes wide-character argv and envp pointers, in contrast to the former.
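To make the mechanism concrete, here is a minimal, portable sketch of how such a preprocessor mapping can be built. It does not use the real tchar.h header: the names MY_UNICODE, my_tchar, MY_T, my_tcslen and generic_length are all made up for illustration and merely play the roles of _UNICODE, _TCHAR, _T and _tcslen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical stand-ins for the generic-text names found in <tchar.h>.
// Defining MY_UNICODE at build time flips every mapping at once,
// which is the role _UNICODE plays for the real header.
#ifdef MY_UNICODE
typedef wchar_t my_tchar;        // wide characters
#define MY_T(s) L##s             // turns "abc" into L"abc"
#define my_tcslen std::wcslen    // wide-character strlen
#else
typedef char my_tchar;           // narrow characters
#define MY_T(s) s                // leaves the literal untouched
#define my_tcslen std::strlen
#endif

// Written once against the generic names; it compiles in either mode
// and the choice costs nothing at run time.
inline std::size_t generic_length(const my_tchar* text) {
    assert(text != nullptr);     // precondition: a valid string
    return my_tcslen(text);
}
```

Calling generic_length(MY_T("hello")) compiles unchanged whether or not MY_UNICODE is defined; only the character type behind the scenes differs.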

I do not know to what extent this macro is supported by the standard libraries, although I bet almost everything supports it; I have seen many other functions taking advantage of this redirection. In the specific domain I am analyzing, there are two implementations of CreateProcess: CreateProcessW, the Unicode version, and CreateProcessA, the ANSI one.
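The A/W pairing can be sketched in the same portable way. The functions command_bytes_a and command_bytes_w below are hypothetical and only mimic the naming scheme of CreateProcessA and CreateProcessW; MY_UNICODE and MY_TEXT are likewise made up. One detail worth noting: the Windows headers select between the A and W variants based on the UNICODE macro (no leading underscore), while the C run-time mappings use _UNICODE, so the two are normally defined together.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical "ANSI" variant: bytes needed to store a narrow command
// line, including its terminator.
inline std::size_t command_bytes_a(const char* cmd) {
    assert(cmd != nullptr);
    return (std::strlen(cmd) + 1) * sizeof(char);
}

// Hypothetical "Unicode" variant: same computation for wide characters.
inline std::size_t command_bytes_w(const wchar_t* cmd) {
    assert(cmd != nullptr);
    return (std::wcslen(cmd) + 1) * sizeof(wchar_t);
}

// The unsuffixed name is only a macro that picks one of the two real
// functions at preprocessing time.
#ifdef MY_UNICODE
#define command_bytes command_bytes_w
#define MY_TEXT(s) L##s
#else
#define command_bytes command_bytes_a
#define MY_TEXT(s) s
#endif
```

Code that calls command_bytes(MY_TEXT("cmd.exe")) never names a suffix, yet both variants remain directly callable when a program needs one explicitly.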

OK, my knowledge about internationalization is very limited, and I do not know if this feature is very useful or not, but it seems quite interesting at the very least.

See Routine Mappings (CRT) and main: Program Startup (C++) for more details.

Edit (17:24): Changed MFC references to Win32. Thanks to Jason for pointing out the difference between the two in one of the comments. I am in fact investigating the latter.

5 comments:

Anonymous said...

> and this involves native Windows programming with the MFC.

What does MFC have to do with your work? Maybe you are confusing the Win32 API with MFC? If not, will your implementation depend on a non-free library?

Julio M. Merino Vidal said...

If I understand you correctly, you want to know why I need MFC in Boost.Process, is that right?

One of the goals of Boost.Process is to be portable, supporting POSIX systems and Windows at a minimum.

In order to implement the required functionality in Windows, I must use the native API, not some POSIX compatibility layer around it.

Please note that I've always assumed that Win32 API = MFC, but maybe I am wrong and I should change the term.

Jason said...

Thankfully, MFC is not the name of the standard Windows API. That's Win32, and it's in C only. MFC is the horrendous C++ wrapper library that has been causing library dependency problems ever since they allowed ISVs access to the source.

If you really are using any part of MFC in your Windows code, I'd implore you to look for the Win32 alternative. (which MFC is calling anyway) :)

Anonymous said...

> Win32 API = MFC, but maybe I am wrong and I should change the term.

MFC is a wrapper around the Win32 API; it also contains some widgets and so on.

To put it in "Unix language":
posix ~= win32 api
qt ~= mfc,

so when you say something like
I want to use "fork", so I have to use Qt,
or I'll use "exec", so I have to use gtkmm,
this sounds like nonsense to me.

Julio M. Merino Vidal said...

Thanks to both for the clarification. I have fixed the article accordingly.