
Wide character: Difference between revisions

From Wikipedia, the free encyclopedia
added links for header files wchar.h and wctype.h
No edit summary


In the [[C standard library]], the header files <[[Wchar.h|wchar.h]]> and <[[Wctype.h|wctype.h]]> declare the types and functions for handling wide characters.
==Functions==

C's [[stdlib.h]] declares several functions for converting between wchar_t values and multibyte characters:

* wctomb() - wide char to multibyte char <ref>[http://www.cplusplus.com/reference/clibrary/cstdlib/mbtowc/ C++ Resources Network - wctomb], accessed 15 December 2009</ref>
* mbtowc() - multibyte char to wide char <ref>[http://www.cplusplus.com/reference/clibrary/cstdlib/mbtowc/ C++ Resources Network - mbtowc], accessed 15 December 2009</ref>
* wcstombs() - wide-char string to multibyte-char string <ref>[http://www.cplusplus.com/reference/clibrary/cstdlib/mbtowc/ C++ Resources Network - wcstombs], accessed 15 December 2009</ref>
* mbstowcs() - multibyte-char string to wide-char string <ref>[http://www.cplusplus.com/reference/clibrary/cstdlib/mbtowc/ C++ Resources Network - mbstowcs], accessed 15 December 2009</ref>
* mblen() - number of bytes in a multibyte char <ref>[http://www.cplusplus.com/reference/clibrary/cstdlib/mbtowc/ C++ Resources Network - mblen], accessed 15 December 2009</ref>


==External links==

Revision as of 18:56, 15 December 2009

Wide character is a computer programming term for a character data type that is wider than the traditional 8-bit character and can therefore represent a larger character repertoire. The term is loosely defined, and it is not the same thing as Unicode.

wchar_t is a data type in ANSI/ISO C, ANSI/ISO C++, and some other programming languages that is intended to represent wide characters.

The Unicode standard 4.0 says that

"ANSI/ISO C leaves the semantics of the wide character set to the specific implementation but requires that the characters from the portable C execution set correspond to their wide character equivalents by zero extension."

and that

"The width of wchar_t is compiler-specific and can be as small as 8 bits. Consequently, programs that need to be portable across any C or C++ compiler should not use wchar_t for storing Unicode text. The wchar_t type is intended for storing compiler-defined wide characters, which may be Unicode characters in some compilers."

Under Win32, wchar_t is 16 bits wide and holds a UTF-16 code unit. On Unix-like systems, wchar_t is commonly 32 bits wide and holds a UTF-32 code unit.
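
Because the width differs between platforms, a program can query it rather than assume it. The following is a minimal sketch in C; the helper names wchar_bits and wchar_can_hold_any_code_point are hypothetical and are not part of any standard library.

```c
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* size_t */
#include <wchar.h>   /* wchar_t, WCHAR_MAX */

/* Hypothetical helper: width of wchar_t in bits on this implementation. */
size_t wchar_bits(void)
{
    return sizeof(wchar_t) * CHAR_BIT;
}

/* Hypothetical helper: nonzero if one wchar_t has room for every Unicode
   code point (U+0000 to U+10FFFF). This only checks the numeric range;
   it says nothing about which encoding the implementation uses. */
int wchar_can_hold_any_code_point(void)
{
    return WCHAR_MAX >= 0x10FFFF;
}
```

The values returned depend on the compiler and platform, which is the portability concern raised in the quotation above.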

In the C standard library, the header files <wchar.h> and <wctype.h> declare the types and functions for handling wide characters.
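
As an illustration of the two headers, the sketch below uses wcslen from <wchar.h> and iswalpha and towupper from <wctype.h>. The helper names count_wide_letters and upcase_wide are hypothetical, and the classification results depend on the current locale.

```c
#include <stddef.h>  /* size_t */
#include <wchar.h>   /* wchar_t, wint_t, wcslen */
#include <wctype.h>  /* iswalpha, towupper */

/* Hypothetical helper: count the alphabetic characters in a wide string. */
size_t count_wide_letters(const wchar_t *s)
{
    size_t n = 0;
    size_t len = wcslen(s);
    for (size_t i = 0; i < len; ++i) {
        if (iswalpha((wint_t)s[i]))
            ++n;
    }
    return n;
}

/* Hypothetical helper: convert a wide string to upper case in place. */
void upcase_wide(wchar_t *s)
{
    for (; *s != L'\0'; ++s)
        *s = (wchar_t)towupper((wint_t)*s);
}
```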

Functions

C's stdlib.h declares several functions for converting between wchar_t values and multibyte characters:

  • wctomb() - wide char to multibyte char [1]
  • mbtowc() - multibyte char to wide char [2]
  • wcstombs() - wide-char string to multibyte-char string [3]
  • mbstowcs() - multibyte-char string to wide-char string [4]
  • mblen() - number of bytes in a multibyte char [5]
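
The functions listed above can be combined roughly as follows. This is a minimal sketch, assuming the input is valid in the current locale's multibyte encoding (the conversions depend on the LC_CTYPE locale category); the wrapper names to_wide, to_multibyte and count_chars are hypothetical.

```c
#include <stddef.h>  /* size_t */
#include <stdlib.h>  /* mbstowcs, wcstombs, mblen, MB_CUR_MAX */
#include <wchar.h>   /* wchar_t */

/* Hypothetical wrapper around mbstowcs(): returns the number of wide
   characters written (excluding the terminator), or (size_t)-1 if the
   input is invalid or dst has no room for the terminator. */
size_t to_wide(wchar_t *dst, size_t dst_len, const char *src)
{
    size_t n = mbstowcs(dst, src, dst_len);
    if (n == (size_t)-1 || n >= dst_len)
        return (size_t)-1;
    return n;
}

/* Hypothetical wrapper around wcstombs(): returns the number of bytes
   written (excluding the terminator), or (size_t)-1 on failure. */
size_t to_multibyte(char *dst, size_t dst_len, const wchar_t *src)
{
    size_t n = wcstombs(dst, src, dst_len);
    if (n == (size_t)-1 || n >= dst_len)
        return (size_t)-1;
    return n;
}

/* Hypothetical helper: count the characters (not bytes) in a multibyte
   string by stepping through it with mblen(). */
size_t count_chars(const char *s)
{
    size_t count = 0;
    (void)mblen(NULL, 0);              /* reset any shift state */
    while (*s != '\0') {
        int len = mblen(s, MB_CUR_MAX);
        if (len <= 0)
            return (size_t)-1;         /* invalid multibyte sequence */
        s += len;
        ++count;
    }
    return count;
}
```

The wrappers treat a result equal to the buffer length as a failure because in that case mbstowcs() and wcstombs() do not write a terminating null character.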


  1. ^ C++ Resources Network - wctomb, accessed 15 December 2009
  2. ^ C++ Resources Network - mbtowc, accessed 15 December 2009
  3. ^ C++ Resources Network - wcstombs, accessed 15 December 2009
  4. ^ C++ Resources Network - mbstowcs, accessed 15 December 2009
  5. ^ C++ Resources Network - mblen, accessed 15 December 2009