Václav Haisman edited this page Feb 28, 2023 · 1 revision

Log4cplus uses the term "UNICODE" in at least two distinct senses:

  1. the Unicode standard as defined by the Unicode Consortium

  2. the compiler's and/or C++ standard library's support for strings of wchar_t and their manipulation

wchar_t support

Log4cplus aims to be portable and to have as few third-party dependencies as possible. To meet this goal, it has to use the facilities offered by the operating systems and standard libraries it runs on. To offer the best possible support for national characters, it has to support wchar_t and rely on the wchar_t support (especially on Windows) provided by the operating system and the standard C and C++ libraries.

This approach to portability has some limitations. One of them is the lack of support for C++ locales in various operating systems and standard C++ libraries. Some standard C++ libraries support no locales other than "C" and "POSIX". This usually means that wchar_t <-> char conversion using the std::codecvt<> facet is impossible. On such deficient platforms, log4cplus can use either standard C locale support or iconv() (through libiconv or a built-in implementation).
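As an illustration of the "standard C locale support" fallback mentioned above, the following is a minimal sketch of a wchar_t to char conversion built on std::wcsrtombs. The helper name narrow_with_c_locale is hypothetical and is not log4cplus API; the sketch assumes the desired LC_CTYPE locale has already been selected (for example with std::setlocale) and that the input contains no embedded null characters.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper (not part of log4cplus): convert a wide string to a
// multibyte string using the current C locale (LC_CTYPE category).
// Returns false if a character cannot be represented in that locale.
bool narrow_with_c_locale(const std::wstring& in, std::string& out)
{
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* src = in.c_str();

    // First pass: a null destination only computes the required length.
    std::size_t len = std::wcsrtombs(nullptr, &src, 0, &state);
    if (len == static_cast<std::size_t>(-1))
        return false;                  // unrepresentable character

    // Second pass: convert into a buffer with room for the terminator.
    std::vector<char> buf(len + 1);
    state = std::mbstate_t();
    src = in.c_str();
    len = std::wcsrtombs(buf.data(), &src, buf.size(), &state);
    if (len == static_cast<std::size_t>(-1))
        return false;

    out.assign(buf.data(), len);
    return true;
}
```

In the default "C" locale this is only expected to succeed for plain ASCII input; whether other characters convert depends on the locale in effect.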

Unicode and file appenders

Another limitation related to Unicode support is the inability to write, using FileAppender, wchar_t messages containing national characters that do not map to any code point in a single-byte code page. This is a problem mainly on Windows. Linux and other Unix-like systems can avoid it because they do not need wchar_t interfaces to build Unicode-aware applications; they usually (as of 2012) use UTF-8 based locales. With proper C++ locale setup in the client application, national characters pass into log files unharmed. Applications on these systems that choose to use wchar_t strings face the same problem, however.

Unix-like platforms

To support output of non-ASCII characters in wchar_t messages on Unix-like platforms, it is necessary to use a UTF-8 based locale (e.g., en_US.UTF-8) and either to set up the global locale with a std::codecvt<> facet or to imbue individual FileAppenders with that facet. The following code can be used to obtain such a std::locale instance and install it as the global locale:

#include <cwchar>   // std::mbstate_t
#include <locale>

std::locale::global (     // set global locale
    std::locale (         // using std::locale constructed from
        std::locale (),   // global locale
                          // and codecvt facet from user locale
        new std::codecvt_byname<wchar_t, char, std::mbstate_t>("")));
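The alternative of imbuing an individual output with the facet, instead of changing the global locale, can be sketched with a plain std::wofstream standing in for a FileAppender. The helper names with_named_codecvt and write_wide_line are hypothetical and exist only for this illustration; the sketch assumes the named locale is installed on the system, since constructing the facet for an unknown locale name typically throws std::runtime_error.

```cpp
#include <cwchar>
#include <fstream>
#include <locale>
#include <string>

// Hypothetical helper: keep everything from `base`, but take the
// wchar_t <-> char conversion from the locale called `name`.
// An empty name ("") selects the locale from the user's environment.
std::locale with_named_codecvt(const std::locale& base, const char* name)
{
    return std::locale(
        base, new std::codecvt_byname<wchar_t, char, std::mbstate_t>(name));
}

// Hypothetical helper: write one wide-character line to `path`, converting
// through the codecvt facet of `loc` only for this stream.
bool write_wide_line(const std::string& path, const std::wstring& line,
                     const std::locale& loc)
{
    std::wofstream out;
    out.imbue(loc);                   // imbue before opening the file
    out.open(path.c_str(),
             std::ios::out | std::ios::trunc | std::ios::binary);
    if (!out)
        return false;
    out << line << L'\n';
    out.flush();
    return out.good();                // false if conversion or I/O failed
}
```

A call such as write_wide_line("app.log", message, with_named_codecvt(std::locale(), "en_US.UTF-8")) would then leave the global locale untouched; the file name and locale name here are only examples.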

Windows

Windows does not support UTF-8 based locales. The above approach yields a std::locale instance that converts wchar_t strings to the current process's code page. Such a locale cannot convert Unicode code points outside that code page. This is true at least of the std::codecvt facet implemented in Visual Studio 2010. Instead, with Visual Studio 2010 and later, it is possible to use the std::codecvt_utf8 facet:

#include <codecvt>
#include <locale>
#include <log4cplus/tchar.h>   // log4cplus::tchar (wchar_t in UNICODE builds)

std::locale::global (     // set global locale
    std::locale (         // using std::locale constructed from
        std::locale (),   // global locale
                          // and codecvt_utf8 facet
        new std::codecvt_utf8<log4cplus::tchar, 0x10FFFF,
            static_cast<std::codecvt_mode>(std::consume_header
                | std::little_endian)>));
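
To see what the std::codecvt_utf8 facet does independently of any stream or code page, the conversion can be exercised in memory. The sketch below uses std::wstring_convert for that purpose; the helper name to_utf8 is hypothetical and is not log4cplus API. Note that both std::codecvt_utf8 and std::wstring_convert are deprecated since C++17, so this is only an illustration of the facet's behavior.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: encode a wide string as UTF-8 bytes using
// std::codecvt_utf8, regardless of the process code page or global locale.
// Throws std::range_error if the input cannot be converted.
std::string to_utf8(const std::wstring& in)
{
    std::wstring_convert<std::codecvt_utf8<wchar_t> > conv;
    return conv.to_bytes(in);
}
```

For example, to_utf8(L"\u00e9") yields the two bytes 0xC3 0xA9. Where wchar_t is 16 bits wide, as on Windows, std::codecvt_utf8<wchar_t> treats the input as UCS-2, so code points outside the Basic Multilingual Plane would need std::codecvt_utf8_utf16 instead.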