National Language Codes, iconv

2015. 11. 12. 11:40

National Language Support Guide and Reference

Using the iconv Command

Any converter installed in the system can be used through the iconv command, which uses the iconv library. The iconv command acts as a filter for converting from one code set to another. For example, the following command filters data from PC Code (IBM-850) to ISO8859-1:

cat File | iconv -f IBM-850 -t ISO8859-1 | tftp -p - host /tmp/fo

The iconv command converts the encoding of characters read from either standard input or the specified file and then writes the results to standard output.

Understanding libiconv

The iconv application programming interface (API) consists of the following subroutines that accomplish conversion:

iconv_open: Performs the initialization required to convert characters from the code set specified by the FromCode parameter to the code set specified by the ToCode parameter. The strings specified are dependent on the converters installed in the system. If initialization is successful, the converter descriptor, iconv_t, is returned in its initial state.
iconv: Invokes the converter function using the descriptor obtained from the iconv_open subroutine. The inbuf parameter points to the first character in the input buffer, and the inbytesleft parameter indicates the number of bytes to the end of the buffer being converted. The outbuf parameter points to the first available byte in the output buffer, and the outbytesleft parameter indicates the number of available bytes to the end of the buffer.
For state-dependent encoding, the subroutine is placed in its initial state by a call for which the inbuf value is a null pointer. Subsequent calls with the inbuf parameter as something other than a null pointer cause the internal state of the function to be altered as necessary.
iconv_close: Closes the conversion descriptor specified by the cd variable and makes it usable again.
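
The three subroutines are normally used together: open, convert, close. The following is a minimal sketch rather than code from the guide. convert_buffer is a hypothetical helper, the POSIX prototype of iconv (char ** input pointer) is assumed, and the code set names passed in must match converters that are actually installed.

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper: convert inlen bytes at in from code set `from`
 * to code set `to`, writing at most outcap bytes to out.
 * Returns the number of bytes written, or (size_t)-1 on any failure.
 */
size_t convert_buffer(const char *to, const char *from,
                      const char *in, size_t inlen,
                      char *out, size_t outcap)
{
    /* iconv_open takes the target code set first, then the source. */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;          /* no matching converter installed */

    char *ip = (char *)in;          /* iconv reads but does not modify input */
    char *op = out;
    size_t ileft = inlen, oleft = outcap;

    size_t r = iconv(cd, &ip, &ileft, &op, &oleft);
    if (r != (size_t)-1)
        /* A null inbuf emits any pending shift sequence and resets the state. */
        r = iconv(cd, NULL, NULL, &op, &oleft);

    iconv_close(cd);
    return (r == (size_t)-1) ? (size_t)-1 : outcap - oleft;
}
```

A caller that needs to distinguish invalid input from a full output buffer would inspect errno (EILSEQ, EINVAL, E2BIG) instead of collapsing every failure into one return value.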

In a network environment, the following factors determine how data should be converted:

Code sets of the sender and the receiver
Communication protocol (8-bit or 7-bit data)

The following table outlines the conversion methods and recommends how to convert data in different situations. See the Interchange Converters—7-bit and the Interchange Converters—8-bit for more information.

Outline of Methods and Recommended Choices
	Communication with system using the same code set		Communication with system using different code set (or receiver's code set is unknown)
	Protocol		Protocol
Method to choose	7-bit only	8-bit	7-bit only	8-bit
as is	Not valid	Best choice	Not valid	Not valid if remote code set is unknown
fold7	OK	OK	Best choice	OK
fold8	Not valid	OK	Not valid	Best choice
uucode	Best choice	OK	Not valid	Not valid

If the sender uses the same code set as the receiver, the following possibilities exist:

When protocol allows 8-bit data, the data can be sent without conversions.

When protocol allows only 7-bit data, the 8-bit code points must be mapped to 7-bit values. Use the iconv interface and one of the following methods:

uucode	Provides the same mapping as the uuencode and uudecode commands. This is the recommended method. For more information, see Interchange Converters—uucode.
7–bit	Converts internal code sets using 7-bit data. This method passes ASCII without any change. For more information, see Interchange Converters—7-bit.

If the sender uses a code set different from the receiver, there are two possibilities:

When protocol allows only 7-bit data, use the fold7 method.

When protocol allows 8-bit data and you know the receiver's code set, use the iconv interface to convert the data. If you do not know the receiver's code set, use the following method:

8–bit

Converts internal code sets to standard interchange formats. The 8-bit data is transmitted and the information is preserved so that the receiver can reconstruct the data in its code set. For more information, see Interchange Converters—8-bit.
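
The recommendations above reduce to a small decision function. This is only an illustrative sketch: choose_method and its parameters are hypothetical names, and the returned string "iconv" stands for a direct conversion from the sender's code set to the receiver's.

```c
/*
 * Hypothetical helper encoding the recommended choices:
 *   same code set,      8-bit protocol -> send as is
 *   same code set,      7-bit protocol -> uucode
 *   different code set, 7-bit protocol -> fold7
 *   different code set, 8-bit protocol -> direct iconv conversion if the
 *                                         receiver's code set is known,
 *                                         otherwise fold8
 */
const char *choose_method(int same_code_set, int protocol_is_8bit,
                          int receiver_known)
{
    if (same_code_set)
        return protocol_is_8bit ? "as is" : "uucode";
    if (!protocol_is_8bit)
        return "fold7";
    return receiver_known ? "iconv" : "fold8";
}
```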

Using the iconv_open Subroutine

The following examples illustrate how to use the iconv_open subroutine in different situations:

When the sender and receiver use the same code sets, and if the protocol allows 8-bit data, you can send data without converting it. If the protocol allows only 7-bit data, do the following:
```
Sender:
 cd = iconv_open("uucode", nl_langinfo(CODESET));
Receiver:
 cd = iconv_open(nl_langinfo(CODESET), "uucode"); 
```

When the sender and receiver use different code sets, and if the protocol allows 8-bit data and the receiver's code set is unknown, do the following:

Sender:
 cd = iconv_open("fold8", nl_langinfo(CODESET));
Receiver:
 cd = iconv_open(nl_langinfo(CODESET),"fold8" );

If the protocol allows only 7-bit data, do the following:

Sender:
 cd = iconv_open("fold7", nl_langinfo(CODESET));
Receiver:
 cd = iconv_open(nl_langinfo(CODESET), "fold7" );
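
The sender and receiver calls differ only in the order of the two arguments, so they can be wrapped in one function. The sketch below is an assumption-laden illustration: open_interchange is a hypothetical name, and the caller is expected to have called setlocale(LC_ALL, "") so that nl_langinfo(CODESET) reports the locale's code set.

```c
#include <iconv.h>
#include <langinfo.h>

/*
 * Hypothetical wrapper: open a descriptor between the current locale's
 * code set and an interchange format such as "fold7", "fold8", or "uucode".
 *   sending != 0: locale code set -> interchange format (sender side)
 *   sending == 0: interchange format -> locale code set (receiver side)
 * Returns (iconv_t)-1 if no matching converter is installed.
 */
iconv_t open_interchange(const char *format, int sending)
{
    const char *local = nl_langinfo(CODESET);

    /* iconv_open takes the target code set first, then the source. */
    return sending ? iconv_open(format, local)
                   : iconv_open(local, format);
}
```

The same wrapper works with any installed code set name in place of the interchange format, which makes it easy to try on systems that do not ship the fold7, fold8, or uucode converters.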

The iconv_open subroutine uses the LOCPATH environment variable to search for a converter whose name is in the following form:

iconv/FromCodeSet_ToCodeSet

The FromCodeSet string represents the sender's code set, and the ToCodeSet string represents the receiver's code set. The underscore character separates the two strings.

Note:

All setuid and setgid programs ignore the LOCPATH environment variable.

Because the iconv converter is a loadable object module, a different object is required when running in the 64-bit environment. In the 64-bit environment, the iconv_open routine uses the LOCPATH environment variable to search for a converter whose name is in the following form:

iconv/FromCodeSet_ToCodeSet__64

The iconv library automatically chooses whether to load the standard converter object or the 64-bit converter object. If the iconv_open subroutine does not find the converter, it uses the from,to pair to search for a file that defines a table-driven conversion. The file contains a conversion table created by the genxlt command.

The iconvTable converter uses the LOCPATH environment variable to search for a file whose name is in the following form:

iconvTable/FromCodeSet_ToCodeSet

If the converter is found, it performs a load operation and is initialized. The converter descriptor, iconv_t, is returned in its initial state.
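
The naming rules above can be sketched as a helper that builds the file name looked up under a single LOCPATH directory. converter_name is a hypothetical function for illustration only; how the library walks multiple LOCPATH entries is not described here and is not modeled.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build the name looked up under one LOCPATH directory.
 * kind is "iconv" (converter program) or "iconvTable" (conversion table).
 * is64 appends the "__64" suffix described for 64-bit converter objects.
 * Returns 0 on success, -1 if buf is too small.
 */
int converter_name(char *buf, size_t cap, const char *dir, const char *kind,
                   const char *from, const char *to, int is64)
{
    int n = snprintf(buf, cap, "%s/%s/%s_%s%s",
                     dir, kind, from, to, is64 ? "__64" : "");
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

For example, calling it with "/usr/lib/nls/loc", "iconv", "IBM-850", "ISO8859-1", and is64 set to 0 yields /usr/lib/nls/loc/iconv/IBM-850_ISO8859-1.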

Converter Programs versus Tables

Converter programs are executable functions that convert data according to a set of rules. Converter tables are single-byte conversion tables that perform stateless conversions. Programs and tables are in separate directories, as follows:

/usr/lib/nls/loc/iconv	Converter programs
/usr/lib/nls/loc/iconvTable	Converter tables

After a converter program is compiled and linked with the libiconv.a library, the program is placed in the /usr/lib/nls/loc/iconv directory.

To build a table converter, build a source converter table file. Use the genxlt command to compile translation tables into a format understood by the table converter. The output file is then placed in the /usr/lib/nls/loc/iconvTable directory.

Unicode and Universal Converters

Unicode (or UCS-2) conversion tables are found in:

$LOCPATH/uconvTable/*CodeSet*

The $LOCPATH/uconv/UCSTBL converter program is used to perform the conversion to and from UCS-2 using the iconv utilities.

A Universal converter program is provided that can be used to convert between any two code sets whose conversions to and from UCS-2 are defined. Given the following uconv tables:

X     -> UCS-2
UCS-2 -> Y

a universal conversion can be defined that maps the following:

X -> UCS-2 -> Y

by use of the $LOCPATH/iconv/Universal_UCS_Conv.

Universal UCS Converter

UCS-2 is a universal 16-bit encoding that can be used as an interchange medium to provide conversion capability between virtually any code sets. The conversion can be accomplished using the Universal UCS Converter, which converts between any two code sets XXX and YYY as follows:

XXX <-> UCS-2 <-> YYY

The XXX and YYY conversions must be included in the supported List of UCS-2 Interchange Converters, and must be installed on the system.

The universal converter is installed as the file /usr/lib/nls/loc/iconv/Universal_UCS_Conv.
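
The pivot idea can be made visible by performing the two steps by hand. The sketch below is not how the Universal UCS Converter is implemented; it only illustrates X -> UCS-2 -> Y with two ordinary descriptors. The helper names are hypothetical, and "UCS-2BE" (big-endian UCS-2) is an assumed converter name that may be spelled differently on a given system.

```c
#include <iconv.h>
#include <stddef.h>

/* One conversion step; returns bytes written or (size_t)-1 on failure. */
static size_t pivot_step(const char *to, const char *from,
                         char *in, size_t inlen, char *out, size_t outcap)
{
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *ip = in, *op = out;
    size_t ileft = inlen, oleft = outcap;
    size_t r = iconv(cd, &ip, &ileft, &op, &oleft);

    iconv_close(cd);
    return (r == (size_t)-1) ? (size_t)-1 : outcap - oleft;
}

/*
 * Hypothetical illustration of X -> UCS-2 -> Y with an explicit pivot buffer.
 * The fixed-size pivot limits this sketch to short inputs.
 */
size_t convert_via_ucs2(const char *to, const char *from,
                        char *in, size_t inlen, char *out, size_t outcap)
{
    char pivot[512];    /* UCS-2 uses 2 bytes per character */
    size_t n = pivot_step("UCS-2BE", from, in, inlen, pivot, sizeof pivot);
    if (n == (size_t)-1)
        return (size_t)-1;
    return pivot_step(to, "UCS-2BE", pivot, n, out, outcap);
}
```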

The conversion between multibyte and wide character code depends on the current locale setting. Do not exchange wide character codes between two processes, unless you have knowledge that each locale that might be used handles wide character codes in a consistent fashion. Most locales for this operating system use the Unicode character value as a wide character code, except locales based on IBM-eucTW codesets.

Using Converters

The iconv interface is a set of the following subroutines used to open, perform, and close conversions: iconv_open, iconv, and iconv_close.

Code Set Conversion Filter Example

The following example shows how you can use these subroutines to create a code set conversion filter that accepts the ToCode and FromCode parameters as input arguments:

#include <stdio.h>
#include <stdlib.h>
#include <nl_types.h>
#include <iconv.h>
#include <string.h>
#include <errno.h>
#include <locale.h>

/* Classify the result r of the most recent iconv() call. */
#define ICONV_DONE()  (r != (size_t)-1)
#define ICONV_INVAL() ((r == (size_t)-1) && (errno == EILSEQ))
#define ICONV_OVER()  ((r == (size_t)-1) && (errno == E2BIG))
#define ICONV_TRUNC() ((r == (size_t)-1) && (errno == EINVAL))

/* Message numbers in the message catalog. */
#define USAGE  1
#define ERROR  2
#define INCOMP 3

char ibuf[BUFSIZ], obuf[BUFSIZ];

int main(int argc, char **argv)
{
    size_t  ileft, oleft, r;
    nl_catd catd;
    iconv_t cd;
    int over, trunc;
    char *ip, *op;

    setlocale(LC_ALL, "");
    catd = catopen(argv[0], 0);
    if (argc != 3) {
        fputs(catgets(catd, NL_SETD, USAGE, "usage: conv fromcode tocode\n"),
              stderr);
        exit(1);
    }

    /* iconv_open takes the target code set first, then the source. */
    cd = iconv_open(argv[2], argv[1]);
    if (cd == (iconv_t)-1) {
        perror("iconv_open");
        exit(1);
    }

    ileft = 0;
    while (!feof(stdin) && !ferror(stdin)) {
        /*
         * After the next operation, ibuf will
         * contain new data plus any truncated
         * data left from the previous read.
         */
        ileft += fread(ibuf + ileft, 1, BUFSIZ - ileft, stdin);
        do {
            ip = ibuf;
            op = obuf;
            oleft = BUFSIZ;
            r = iconv(cd, &ip, &ileft, &op, &oleft);
            if (ICONV_INVAL()) {
                fputs(catgets(catd, NL_SETD, ERROR, "invalid input\n"), stderr);
                exit(2);
            }
            /* Record the outcome before fwrite can change errno. */
            over = ICONV_OVER();
            trunc = ICONV_TRUNC();
            fwrite(obuf, 1, BUFSIZ - oleft, stdout);
            if (trunc || over)
                /*
                 * Data remaining in buffer: move it to the
                 * beginning. The regions may overlap, so
                 * memmove is used rather than memcpy.
                 */
                memmove(ibuf, ip, ileft);
            /*
             * Loop until all characters in the input
             * buffer have been converted.
             */
        } while (over);
    }
    if (ileft != 0) {
        /*
         * This can only happen if the last call
         * to iconv() returned ICONV_TRUNC, meaning
         * the last data in the input stream was
         * incomplete.
         */
        fputs(catgets(catd, NL_SETD, INCOMP, "input incomplete\n"), stderr);
        exit(3);
    }
    iconv_close(cd);
    exit(0);
}
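
The truncated-input case in the filter above is the part most easily gotten wrong, so it may help to see it in isolation. The sketch below is a hypothetical helper, not part of the original example: feed_chunk converts one chunk and keeps the bytes of an incomplete trailing character in a small carry buffer so they can be prepended to the next chunk. The fixed buffer sizes are arbitrary assumptions.

```c
#include <errno.h>
#include <iconv.h>
#include <string.h>

/* Hypothetical carry buffer for the bytes of an incomplete character. */
struct carry {
    char   buf[16];
    size_t len;
};

/*
 * Convert carry + chunk with an already opened descriptor cd.
 * Returns the number of bytes written to out, or (size_t)-1 on error.
 * An incomplete trailing character (errno == EINVAL) is not an error:
 * its bytes are saved in c for the next call.
 */
size_t feed_chunk(iconv_t cd, struct carry *c, const char *chunk, size_t n,
                  char *out, size_t outcap)
{
    char in[256];
    if (c->len + n > sizeof in)
        return (size_t)-1;                 /* chunk too large for this sketch */
    memcpy(in, c->buf, c->len);
    memcpy(in + c->len, chunk, n);

    char *ip = in, *op = out;
    size_t ileft = c->len + n, oleft = outcap;
    size_t r = iconv(cd, &ip, &ileft, &op, &oleft);

    if (r == (size_t)-1 && errno != EINVAL)
        return (size_t)-1;                 /* EILSEQ or E2BIG */
    if (ileft > sizeof c->buf)
        return (size_t)-1;
    memcpy(c->buf, ip, ileft);             /* ileft is 0 after a full conversion */
    c->len = ileft;
    return outcap - oleft;
}
```

If the carry buffer is still non-empty after the last chunk, the input ended in the middle of a character, which corresponds to the "input incomplete" case in the filter.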

Naming Converters

Code set names are in the form CodesetRegistry-CodesetEncoding where:

CodesetRegistry	Identifies the registration authority for the encoding. The CodesetRegistry must be made of characters from the portable code set (usually A-Z and 0-9).
CodesetEncoding	Identifies the coded character set defined by the registered authority.

The from,to variable used by the iconv command and iconv_open subroutine identifies a file whose name should be in the form /usr/lib/nls/loc/iconv/%f_%t or /usr/lib/nls/loc/iconvTable/%f_%t, where:

%f	Represents the FromCode set name
%t	Represents the ToCode set name
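
Going the other way, a converter file name can be split back into its two code set names. This is a hypothetical sketch that assumes the FromCode name contains no underscore, so the first underscore is the separator; names that do not follow the %f_%t pattern are not handled.

```c
#include <string.h>

/*
 * Hypothetical helper: split a converter file name of the form %f_%t at the
 * first underscore. Returns 0 on success, -1 if there is no separator,
 * a part is empty, or a part does not fit in its buffer.
 */
int split_converter_name(const char *name, char *from, size_t fromcap,
                         char *to, size_t tocap)
{
    const char *sep = strchr(name, '_');
    if (sep == NULL)
        return -1;

    size_t flen = (size_t)(sep - name);
    size_t tlen = strlen(sep + 1);
    if (flen == 0 || tlen == 0 || flen >= fromcap || tlen >= tocap)
        return -1;

    memcpy(from, name, flen);
    from[flen] = '\0';
    memcpy(to, sep + 1, tlen + 1);   /* copies the terminating NUL as well */
    return 0;
}
```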

List of Converters

Converters change data from one code set to another. The sets of converters supported with the iconv library are listed in the following sections. All converters shipped with the BOS Runtime Environment are located in the /usr/lib/nls/loc/iconv/* or /usr/lib/nls/loc/iconvTable/* directory.

These directories also contain private converters; that is, they are used by other converters. However, users and programs should only depend on the converters in the following lists.

Any converter shipped with the BOS Runtime Environment and not listed here should be considered private and subject to change or deletion. Converters supplied by other products can be placed in the /usr/lib/nls/loc/iconv/* or /usr/lib/nls/loc/iconvTable/* directory.

Programmers are encouraged to use registered code set names or code set names associated with an application. The X Consortium maintains a registry of code set names for reference. See Code Sets for National Language Support for more information about code sets.

PC, ISO, and EBCDIC Code Set Converters

These converters provide conversion between PC, ISO, and EBCDIC single-byte stateless code sets. The following types of conversions are supported: PC to/from ISO, PC to/from EBCDIC, and ISO to/from EBCDIC.

Conversion is provided between compatible code sets such as Latin-1 to Latin-1 and Greek to Greek. However, conversion between different EBCDIC national code sets is not supported. For information about converting between incompatible character sets, refer to the Interchange Converters—7-bit and the Interchange Converters—8-bit.

Conversion tables in the iconvTable directory are created by the genxlt command.

Compatible Code Set Names

The following table lists code set names that are compatible. Each line defines to/from strings that may be used when requesting a converter.

Note:

The PC and ISO code sets are ASCII-based.

Code Set Compatibility
Character Set	Languages	PC	ISO	EBCDIC
Latin-1	U.S. English, Portuguese, Canadian French	N/A	ISO8859-1	IBM-037
Latin-1	Danish, Norwegian	N/A	ISO8859-1	IBM-277
Latin-1	Finnish, Swedish	N/A	ISO8859-1	IBM-278
Latin-1	Italian	N/A	ISO8859-1	IBM-280
Latin-1	Japanese	N/A	ISO8859-1	IBM-281
Latin-1	Spanish	N/A	ISO8859-1	IBM-284
Latin-1	U.K. English	N/A	ISO8859-1	IBM-285
Latin-1	German	N/A	ISO8859-1	IBM-273
Latin-1	French	N/A	ISO8859-1	IBM-297
Latin-1	Belgian, Swiss German	N/A	ISO8859-1	IBM-500
Latin-2	Croatian, Czechoslovakian, Hungarian, Polish, Romanian, Serbian Latin, Slovak, Slovene	IBM-852	ISO8859-2	IBM-870
Cyrillic	Bulgarian, Macedonian, Serbian Cyrillic, Russian	IBM-855	ISO8859-5	IBM-880 IBM-1025
Cyrillic	Russian	IBM-866	ISO8859-5	IBM-1025
Hebrew	Hebrew	IBM-856 IBM-862	ISO8859-8	IBM-424 IBM-803
Turkish	Turkish	IBM-857	ISO8859-9	IBM-1026
Arabic	Arabic	IBM-864 IBM-1046	ISO8859-6	IBM-420
Greek	Greek	IBM-869	ISO8859-7	IBM-875
Baltic	Lithuanian, Latvian, Estonian	IBM-921 IBM-922	ISO8859-4	IBM-1112 IBM-1122

Note:

A character that exists in the source code set but does not exist in the target code set is converted to a converter-defined substitute character.

Files

The following table describes the iconvTable converters found in the /usr/lib/nls/loc/iconvTable directory:

iconvTable Converters
Converter Table	Description	Language
IBM-037_IBM-850	IBM-037 to IBM-850	U.S. English, Portuguese, Canadian-French
IBM-273_IBM-850	IBM-273 to IBM-850	German
IBM-277_IBM-850	IBM-277 to IBM-850	Danish, Norwegian
IBM-278_IBM-850	IBM-278 to IBM-850	Finnish, Swedish
IBM-280_IBM-850	IBM-280 to IBM-850	Italian
IBM-281_IBM-850	IBM-281 to IBM-850	Japanese-Latin
IBM-284_IBM-850	IBM-284 to IBM-850	Spanish
IBM-285_IBM-850	IBM-285 to IBM-850	U.K. English
IBM-297_IBM-850	IBM-297 to IBM-850	French
IBM-420_IBM-1046	IBM-420 to IBM-1046	Arabic
IBM-424_IBM-856	IBM-424 to IBM-856	Hebrew
IBM-424_IBM-862	IBM-424 to IBM-862	Hebrew
IBM-500_IBM-850	IBM-500 to IBM-850	Belgian, Swiss German
IBM-803_IBM-856	IBM-803 to IBM-856	Hebrew
IBM-803_IBM-862	IBM-803 to IBM-862	Hebrew
IBM-850_IBM-037	IBM-850 to IBM-037	U.S. English, Portuguese, Canadian-French
IBM-850_IBM-273	IBM-850 to IBM-273	German
IBM-850_IBM-277	IBM-850 to IBM-277	Danish, Norwegian
IBM-850_IBM-278	IBM-850 to IBM-278	Finnish, Swedish
IBM-850_IBM-280	IBM-850 to IBM-280	Italian
IBM-850_IBM-281	IBM-850 to IBM-281	Japanese-Latin
IBM-850_IBM-284	IBM-850 to IBM-284	Spanish
IBM-850_IBM-285	IBM-850 to IBM-285	U.K. English
IBM-850_IBM-297	IBM-850 to IBM-297	French
IBM-850_IBM-500	IBM-850 to IBM-500	Belgian, Swiss German
IBM-856_IBM-424	IBM-856 to IBM-424	Hebrew
IBM-856_IBM-803	IBM-856 to IBM-803	Hebrew
IBM-856_IBM-862	IBM-856 to IBM-862	Hebrew
IBM-862_IBM-424	IBM-862 to IBM-424	Hebrew
IBM-862_IBM-803	IBM-862 to IBM-803	Hebrew
IBM-862_IBM-856	IBM-862 to IBM-856	Hebrew
IBM-864_IBM-1046	IBM-864 to IBM-1046	Arabic
IBM-921_IBM-1112	IBM-921 to IBM-1112	Lithuanian, Latvian
IBM-922_IBM-1122	IBM-922 to IBM-1122	Estonian
IBM-1112_IBM-921	IBM-1112 to IBM-921	Lithuanian, Latvian
IBM-1122_IBM-922	IBM-1122 to IBM-922	Estonian
IBM-1046_IBM-420	IBM-1046 to IBM-420	Arabic
IBM-1046_IBM-864	IBM-1046 to IBM-864	Arabic
IBM-037_ISO8859-1	IBM-037 to ISO8859-1	U.S. English, Portuguese, Canadian French
IBM-273_ISO8859-1	IBM-273 to ISO8859-1	German
IBM-277_ISO8859-1	IBM-277 to ISO8859-1	Danish, Norwegian
IBM-278_ISO8859-1	IBM-278 to ISO8859-1	Finnish, Swedish
IBM-280_ISO8859-1	IBM-280 to ISO8859-1	Italian
IBM-281_ISO8859-1	IBM-281 to ISO8859-1	Japanese-Latin
IBM-284_ISO8859-1	IBM-284 to ISO8859-1	Spanish
IBM-285_ISO8859-1	IBM-285 to ISO8859-1	U.K. English
IBM-297_ISO8859-1	IBM-297 to ISO8859-1	French
IBM-420_ISO8859-6	IBM-420 to ISO8859-6	Arabic
IBM-424_ISO8859-8	IBM-424 to ISO8859-8	Hebrew
IBM-500_ISO8859-1	IBM-500 to ISO8859-1	Belgian, Swiss German
IBM-803_ISO8859-8	IBM-803 to ISO8859-8	Hebrew
IBM-852_ISO8859-2	IBM-852 to ISO8859-2	Croatian, Czechoslovakian, Hungarian, Polish, Romanian, Serbian Latin, Slovak, Slovene
IBM-855_ISO8859-5	IBM-855 to ISO8859-5	Bulgarian, Macedonian, Serbian Cyrillic, Russian
IBM-866_ISO8859-5	IBM-866 to ISO8859-5	Russian
IBM-869_ISO8859-7	IBM-869 to ISO8859-7	Greek
IBM-875_ISO8859-7	IBM-875 to ISO8859-7	Greek
IBM-870_ISO8859-2	IBM-870 to ISO8859-2	Croatian, Czechoslovakian, Hungarian, Polish, Romanian, Serbian Latin, Slovak, Slovene
IBM-880_ISO8859-5	IBM-880 to ISO8859-5	Bulgarian, Macedonian, Serbian Cyrillic, Russian
IBM-1025_ISO8859-5	IBM-1025 to ISO8859-5	Bulgarian, Macedonian, Serbian Cyrillic, Russian
IBM-857_ISO8859-9	IBM-857 to ISO8859-9	Turkish
IBM-1026_ISO8859-9	IBM-1026 to ISO8859-9	Turkish
IBM-850_ISO8859-1	IBM-850 to ISO8859-1	Latin
IBM-856_ISO8859-8	IBM-856 to ISO8859-8	Hebrew
IBM-862_ISO8859-8	IBM-862 to ISO8859-8	Hebrew
IBM-864_ISO8859-6	IBM-864 to ISO8859-6	Arabic
IBM-1046_ISO8859-6	IBM-1046 to ISO8859-6	Arabic
ISO8859-1_IBM-850	ISO8859-1 to IBM-850	Latin
ISO8859-6_IBM-864	ISO8859-6 to IBM-864	Arabic
ISO8859-6_IBM-1046	ISO8859-6 to IBM-1046	Arabic
ISO8859-8_IBM-856	ISO8859-8 to IBM-856	Hebrew
ISO8859-8_IBM-862	ISO8859-8 to IBM-862	Hebrew
ISO8859-1_IBM-037	ISO8859-1 to IBM-037	U.S. English, Portuguese, Canadian French
ISO8859-1_IBM-273	ISO8859-1 to IBM-273	German
ISO8859-1_IBM-277	ISO8859-1 to IBM-277	Danish, Norwegian
ISO8859-1_IBM-278	ISO8859-1 to IBM-278	Finnish, Swedish
ISO8859-1_IBM-280	ISO8859-1 to IBM-280	Italian
ISO8859-1_IBM-281	ISO8859-1 to IBM-281	Japanese-Latin
ISO8859-1_IBM-284	ISO8859-1 to IBM-284	Spanish
ISO8859-1_IBM-285	ISO8859-1 to IBM-285	U.K. English
ISO8859-1_IBM-297	ISO8859-1 to IBM-297	French
ISO8859-1_IBM-500	ISO8859-1 to IBM-500	Belgian, Swiss German
ISO8859-2_IBM-852	ISO8859-2 to IBM-852	Croatian, Czechoslovakian, Hungarian, Polish, Romanian, Serbian Latin, Slovak, Slovene
ISO8859-2_IBM-870	ISO8859-2 to IBM-870	Croatian, Czechoslovakian, Hungarian, Polish, Romanian, Serbian Latin, Slovak, Slovene
ISO8859-5_IBM-855	ISO8859-5 to IBM-855	Bulgarian, Macedonian, Serbian Cyrillic, Russian
ISO8859-5_IBM-880	ISO8859-5 to IBM-880	Bulgarian, Macedonian, Serbian Cyrillic, Russian
ISO8859-5_IBM-1025	ISO8859-5 to IBM-1025	Bulgarian, Macedonian, Serbian Cyrillic, Russian
ISO8859-6_IBM-420	ISO8859-6 to IBM-420	Arabic
ISO8859-5_IBM-866	ISO8859-5 to IBM-866	Russian
ISO8859-7_IBM-869	ISO8859-7 to IBM-869	Greek
ISO8859-7_IBM-875	ISO8859-7 to IBM-875	Greek
ISO8859-8_IBM-424	ISO8859-8 to IBM-424	Hebrew
ISO8859-8_IBM-803	ISO8859-8 to IBM-803	Hebrew
ISO8859-9_IBM-857	ISO8859-9 to IBM-857	Turkish
ISO8859-9_IBM-1026	ISO8859-9 to IBM-1026	Turkish

Multibyte Code Set Converters

Multibyte code-set converters convert characters among the following code sets:

PC multibyte code sets
EUC multibyte code sets (ISO-based)
EBCDIC multibyte code sets

The following table lists code set names that are compatible. Each line defines to/from strings that may be used when requesting a converter.

Code Set Compatibility
Language	PC	ISO	EBCDIC
Japanese	IBM-932	IBM-eucJP	IBM-930, IBM-939
Japanese (MS compatible)	IBM-943	IBM-eucJP	IBM-930, IBM-939
Korean	IBM-934	IBM-eucKR	IBM-933
Traditional Chinese	IBM-938, big-5	IBM-eucTW	IBM-937
Simplified Chinese	IBM-1381	IBM-eucCN	IBM-935

Conversions between Simplified and Traditional Chinese are provided (IBM-eucTW <—> IBM-eucCN and big5 <—> IBM-eucCN).
UTF-8 is an additional code set. See UTF-8 Interchange Converters for more information.

Files

The following list describes the Multibyte Code Set converters that are found in the /usr/lib/nls/loc/iconv directory.

Converter	Description
IBM-eucJP_IBM-932	IBM-eucJP to IBM-932
IBM-eucJP_IBM-943	IBM-eucJP to IBM-943
IBM-eucJP_IBM-930	IBM-eucJP to IBM-930
IBM-eucCN_IBM-936(PC5550)	IBM-eucCN to IBM-936(PC5550)
IBM-eucCN_IBM-935	IBM-eucCN to IBM-935
IBM-eucJP_IBM-939	IBM-eucJP to IBM-939
IBM-eucCN_IBM-1381	IBM-eucCN to IBM-1381
IBM-943_IBM-932	IBM-943 to IBM-932
IBM-932_IBM-943	IBM-932 to IBM-943
IBM-930_IBM-932	IBM-930 to IBM-932
IBM-930_IBM-943	IBM-930 to IBM-943
IBM-930_IBM-eucJP	IBM-930 to IBM-eucJP
IBM-932_IBM-eucJP	IBM-932 to IBM-eucJP
IBM-932_IBM-930	IBM-932 to IBM-930
IBM-943_IBM-eucJP	IBM-943 to IBM-eucJP
IBM-943_IBM-930	IBM-943 to IBM-930
IBM-936(PC5550)_IBM-935	IBM-936(PC5550) to IBM-935
IBM-936_IBM-935	IBM-936 to IBM-935
IBM-932_IBM-939	IBM-932 to IBM-939
IBM-939_IBM-932	IBM-939 to IBM-932
IBM-943_IBM-939	IBM-943 to IBM-939
IBM-939_IBM-943	IBM-939 to IBM-943
IBM-935_IBM-936(PC5550)	IBM-935 to IBM-936(PC5550)
IBM-935_IBM-936	IBM-935 to IBM-936
IBM-1381_IBM-935	IBM-1381 to IBM-935
IBM-935_IBM-1381	IBM-935 to IBM-1381
IBM-935_IBM-eucCN	IBM-935 to IBM-eucCN
IBM-936(PC5550)_IBM-eucCN	IBM-936(PC5550) to IBM-eucCN
IBM-eucTW_IBM-eucCN	IBM-eucTW to IBM-eucCN
big5_IBM-eucCN	big5 to IBM-eucCN
IBM-1381_IBM-eucCN	IBM-1381 to IBM-eucCN
IBM-939_IBM-eucJP	IBM-939 to IBM-eucJP
IBM-eucKR_IBM-934	IBM-eucKR to IBM-934
IBM-934_IBM-eucKR	IBM-934 to IBM-eucKR
IBM-eucKR_IBM-933	IBM-eucKR to IBM-933
IBM-933_IBM-eucKR	IBM-933 to IBM-eucKR
IBM-eucTW_IBM-937	IBM-eucTW to IBM-937
IBM-938_IBM-937	IBM-938 to IBM-937
big-5_IBM-937	big-5 to IBM-937
IBM-eucCN_IBM-eucTW	IBM-eucCN to IBM-eucTW
IBM-937_IBM-eucTW	IBM-937 to IBM-eucTW
IBM-937_IBM-938	IBM-937 to IBM-938
IBM-eucTW_IBM-938	IBM-eucTW to IBM-938
IBM-eucCN_big5	IBM-eucCN to big5
IBM-eucTW_big-5	IBM-eucTW to big-5
IBM-937_big-5	IBM-937 to big-5
CNS11643.1992-3_IBM-eucTW	CNS11643.1992-3 to IBM-eucTW
CNS11643.1992-3-GL_IBM-eucTW	CNS11643.1992-3-GL to IBM-eucTW
CNS11643.1992-3-GR_IBM-eucTW	CNS11643.1992-3-GR to IBM-eucTW
CNS11643.1992-4_IBM-eucTW	CNS11643.1992-4 to IBM-eucTW
CNS11643.1992-4-GL_IBM-eucTW	CNS11643.1992-4-GL to IBM-eucTW
CNS11643.1992-4-GR_IBM-eucTW	CNS11643.1992-4-GR to IBM-eucTW
IBM-eucTW_CNS11643.1992-3	IBM-eucTW to CNS11643.1992-3
IBM-eucTW_CNS11643.1992-3-GL	IBM-eucTW to CNS11643.1992-3-GL
IBM-eucTW_CNS11643.1992-3-GR	IBM-eucTW to CNS11643.1992-3-GR
IBM-eucTW_CNS11643.1992-4	IBM-eucTW to CNS11643.1992-4
IBM-eucTW_CNS11643.1992-4-GL	IBM-eucTW to CNS11643.1992-4-GL
IBM-eucTW_CNS11643.1992-4-GR	IBM-eucTW to CNS11643.1992-4-GR
IBM-eucCN_GB2312.1980-1	IBM-eucCN to GB2312.1980-1
IBM-eucCN_GB2312.1980-1-GL	IBM-eucCN to GB2312.1980-1-GL
IBM-eucCN_GB2312.1980-1-GR	IBM-eucCN to GB2312.1980-1-GR
IBM-937_csic	IBM-937 to csic
csic_IBM-937	csic to IBM-937
IBM-938_csic	IBM-938 to csic
csic_IBM-938	csic to IBM-938
IBM-eucTW_ccdc	IBM-eucTW to ccdc
ccdc_IBM-eucTW	ccdc to IBM-eucTW
IBM-eucTW_cns	IBM-eucTW to cns
cns_IBM-eucTW	cns to IBM-eucTW
IBM-eucTW_csic	IBM-eucTW to csic
csic_IBM-eucTW	csic to IBM-eucTW
IBM-eucTW_sops	IBM-eucTW to sops
sops_IBM-eucTW	sops to IBM-eucTW
IBM-eucTW_tca	IBM-eucTW to tca
tca_IBM-eucTW	tca to IBM-eucTW
big5_cns	big5 to cns
cns_big5	cns to big5
big5_csic	big5 to csic
csic_big5	csic to big5
big5_ttc	big5 to ttc
ttc_big5	ttc to big5
big5_ttcmin	big5 to ttcmin
ttcmin_big5	ttcmin to big5
big5_unicode	big5 to unicode
unicode_big5	unicode to big5
big5_wang	big5 to wang
wang_big5	wang to big5
ccdc_csic	ccdc to csic
csic_ccdc	csic to ccdc
csic_sops	csic to sops
sops_csic	sops to csic
CNS11643.1986-1_big5	CNS11643.1986-1 to big5
big5_CNS11643.1986-1	big5 to CNS11643.1986-1
CNS11643.1986-1-GR_big5	CNS11643.1986-1-GR to big5
big5_CNS11643.1986-1-GR	big5 to CNS11643.1986-1-GR
CNS11643.1986-2_big5	CNS11643.1986-2 to big5
big5_CNS11643.1986-2	big5 to CNS11643.1986-2
CNS11643.1986-2-GR_big5	CNS11643.1986-2-GR to big5
big5_CNS11643.1986-2-GR	big5 to CNS11643.1986-2-GR
CNS11643.CT-GR_big5	CNS11643.CT-GR to big5
big5_CNS11643.CT-GR	big5 to CNS11643.CT-GR
IBM-sbdTW-GR_big5	IBM-sbdTW-GR to big5
big5_IBM-sbdTW-GR	big5 to IBM-sbdTW-GR
IBM-sbdTW.CT-GR_big5	IBM-sbdTW.CT-GR to big5
big5_IBM-sbdTW.CT-GR	big5 to IBM-sbdTW.CT-GR
IBM-sbdTW_big5	IBM-sbdTW to big5
big5_IBM-sbdTW	big5 to IBM-sbdTW
IBM-udcTW-GR_big5	IBM-udcTW-GR to big5
big5_IBM-udcTW-GR	big5 to IBM-udcTW-GR
IBM-udcTW.CT-GR_big5	IBM-udcTW.CT-GR to big5
big5_IBM-udcTW.CT-GR	big5 to IBM-udcTW.CT-GR
ISO8859-1_big5	ISO8859-1 to big5
big5_ISO8859-1	big5 to ISO8859-1
big5_ASCII-GR	big5 to ASCII-GR
ASCII-GR_big5	ASCII-GR to big5
GBK_big5	GBK to big5
big5_GBK	big5 to GBK
GBK_IBM-eucTW	GBK to IBM-eucTW
IBM-eucTW_GBK	IBM-eucTW to GBK
CNS11643.1986-1_GBK	CNS11643.1986-1 to GBK
GBK_CNS11643.1986-1	GBK to CNS11643.1986-1
CNS11643.1986-2_GBK	CNS11643.1986-2 to GBK
GBK_CNS11643.1986-2	GBK to CNS11643.1986-2
CNS11643.1986-1-GR_GBK	CNS11643.1986-1-GR to GBK
GBK_CNS11643.1986-1-GR	GBK to CNS11643.1986-1-GR
CNS11643.1986-2-GR_GBK	CNS11643.1986-2-GR to GBK
GBK_CNS11643.1986-2-GR	GBK to CNS11643.1986-2-GR
CNS11643.1986-1-GL_GBK	CNS11643.1986-1-GL to GBK
GBK_CNS11643.1986-1-GL	GBK to CNS11643.1986-1-GL
CNS11643.1986-2-GL_GBK	CNS11643.1986-2-GL to GBK
GBK_CNS11643.1986-2-GL	GBK to CNS11643.1986-2-GL
CNS11643.CT-GR_GBK	CNS11643.CT-GR to GBK
GBK_CNS11643.CT-GR	GBK to CNS11643.CT-GR
GB2312.1980.CT-GR_GBK	GB2312.1980.CT-GR to GBK
GBK_GB2312.1980.CT-GR	GBK to GB2312.1980.CT-GR
GB2312.1980-0_GBK	GB2312.1980-0 to GBK
GBK_GB2312.1980-0	GBK to GB2312.1980-0
GB2312.1980-0-GR_GBK	GB2312.1980-0-GR to GBK
GBK_GB2312.1980-0-GR	GBK to GB2312.1980-0-GR
GB2312.1980-0-GL_GBK	GB2312.1980-0-GL to GBK
GBK_GB2312.1980-0-GL	GBK to GB2312.1980-0-GL
ASCII-GR_GBK	ASCII-GR to GBK
GBK_ASCII-GR	GBK to ASCII-GR
ISO8859-1_GBK	ISO8859-1 to GBK
GBK_ISO8859-1	GBK to ISO8859-1
IBM-eucCN_GBK	IBM-eucCN to GBK
GBK_IBM-eucCN	GBK to IBM-eucCN

Interchange Converters—7-bit

This converter provides conversion between internal code and 7-bit standard interchange formats (fold7). The fold7 name identifies encodings that can be used to pass text data through 7-bit mail protocols. The encodings are based on ISO2022. For more information about fold7, see Understanding libiconv.

The fold7 converters convert characters from a code set to a canonical 7-bit encoding that identifies each character. This type of conversion is useful in networks where clients communicate with different code sets but use the same character sets. For example:

IBM-850 <-> ISO8859-1	Common Latin characters
IBM-932 <-> IBM-eucJP	Common Japanese characters

The following escape sequences designate standard code sets:

Escape Sequence	Standard Code Set
01/11 02/04 04/00	GL JIS X0208.1978-0.
01/11 02/04 02/08 04/01	GL left half of GB2312.1980-0.
01/11 02/08 04/02	GL 7-bit ASCII or left half of ISO8859-1.
01/11 02/14 04/01	GL right half of ISO8859-1.
01/11 02/14 04/02	GL right half of ISO8859-2.
01/11 02/14 04/03	GL right half of ISO8859-3.
01/11 02/14 04/04	GL right half of ISO8859-4.
01/11 02/14 04/06	GL right half of ISO8859-7.
01/11 02/14 04/07	GL right half of ISO8859-6.
01/11 02/14 04/08	GL right half of ISO8859-8.
01/11 02/14 04/12	GL right half of ISO8859-5.
01/11 02/14 04/13	GL right half of ISO8859-9.
01/11 02/08 04/09	GL right half of JIS X0201.1976-0.
01/11 02/08 04/10	GL left half of JIS X0201.1976.
01/11 02/04 04/02	GL JIS X0208.1983-0.
01/11 02/04 02/08 04/02	GL JIS X0208.1983-0.
01/11 02/04 02/08 04/00	GL JISX0208.1978-0.
01/11 02/05 02/15 03/01 M L 06/09 06/02 06/13 02/13 03/08 03/05 03/00 00/02	GL right half of IBM-850 unique characters. Characters common to ISO8859-1 do not use this escape sequence.
01/11 02/05 02/15 03/02 M L 06/09 06/02 06/13 02/13 07/05 06/04 06/03 04/10 05/00 00/02	GL Japanese (IBM-udcJP) user-definable characters.
01/11 02/04 02/08 04/03	GL KSC5601-1987.
01/11 02/04 02/09 03/00	GL CNS11643-1986-1.
01/11 02/04 02/10 03/01	GL CNS11643-1986-2.
01/11 02/05 02/15 03/00 M L 05/05 05/04 04/06 02/13 03/07 00/02	UCS-2 encoded as base64; used only for those characters not encoded by any of the other 7-bit escape sequences listed above.

When converting from a code set to fold7, the escape sequence used to designate the code set is chosen according to the order listed. For example, the JISX0208.1983-0 characters use 01/11 02/04 04/02 as the designation.
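
The escape sequences in the table are written in the column/row notation used by ISO 2022, where each pair cc/rr denotes the byte cc * 16 + rr. For instance, 01/11 is 0x1B (ESC), 02/04 is 0x24 ('$'), and 04/02 is 0x42 ('B'). The following sketch shows the arithmetic; decode_colrow is a hypothetical helper, and it deliberately stops at tokens such as M and L that are not cc/rr pairs.

```c
#include <stdio.h>

/*
 * Hypothetical helper: decode "cc/rr" pairs (column/row, both decimal) into
 * bytes, where byte = column * 16 + row. Stops at the first token that is
 * not a valid cc/rr pair. Returns the number of bytes stored in out.
 */
size_t decode_colrow(const char *s, unsigned char *out, size_t cap)
{
    size_t n = 0;
    unsigned col, row;
    int used;

    while (n < cap && sscanf(s, " %u/%u%n", &col, &row, &used) == 2) {
        if (col > 15 || row > 15)
            break;                      /* not a valid column/row pair */
        out[n++] = (unsigned char)(col * 16 + row);
        s += used;                      /* advance past the consumed pair */
    }
    return n;
}
```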

Files

The following list describes the fold7 converters that are found in the /usr/lib/nls/loc/iconv directory:

Converter	Description
fold7_IBM-850	Interchange format to IBM-850
fold7_IBM-921	Interchange format to IBM-921
fold7_IBM-922	Interchange format to IBM-922
fold7_IBM-932	Interchange format to IBM-932
fold7_IBM-943	Interchange format to IBM-943
fold7_IBM-1124	Interchange format to IBM-1124
fold7_IBM-1129	Interchange format to IBM-1129
fold7_IBM-eucCN	Interchange format to IBM-eucCN
fold7_IBM-eucJP	Interchange format to IBM-eucJP
fold7_IBM-eucKR	Interchange format to IBM-eucKR
fold7_IBM-eucTW	Interchange format to IBM-eucTW
fold7_ISO8859-1	Interchange format to ISO8859-1
fold7_ISO8859-2	Interchange format to ISO8859-2
fold7_ISO8859-3	Interchange format to ISO8859-3
fold7_ISO8859-4	Interchange format to ISO8859-4
fold7_ISO8859-5	Interchange format to ISO8859-5
fold7_ISO8859-6	Interchange format to ISO8859-6
fold7_ISO8859-7	Interchange format to ISO8859-7
fold7_ISO8859-8	Interchange format to ISO8859-8
fold7_ISO8859-9	Interchange format to ISO8859-9
fold7_TIS-620	Interchange format to TIS-620
fold7_UTF-8	Interchange format to UTF-8
fold7_big5	Interchange format to big5
fold7_GBK	Interchange format to GBK
IBM-921_fold7	IBM-921 to interchange format
IBM-922_fold7	IBM-922 to interchange format
IBM-850_fold7	IBM-850 to interchange format
IBM-932_fold7	IBM-932 to interchange format
IBM-943_fold7	IBM-943 to interchange format
IBM-1124_fold7	IBM-1124 to interchange format
IBM-1129_fold7	IBM-1129 to interchange format
IBM-eucCN_fold7	IBM-eucCN to interchange format
IBM-eucJP_fold7	IBM-eucJP to interchange format
IBM-eucKR_fold7	IBM-eucKR to interchange format
IBM-eucTW_fold7	IBM-eucTW to interchange format
ISO8859-1_fold7	ISO8859-1 to interchange format
ISO8859-2_fold7	ISO8859-2 to interchange format
ISO8859-3_fold7	ISO8859-3 to interchange format
ISO8859-4_fold7	ISO8859-4 to interchange format
ISO8859-5_fold7	ISO8859-5 to interchange format
ISO8859-6_fold7	ISO8859-6 to interchange format
ISO8859-7_fold7	ISO8859-7 to interchange format
ISO8859-8_fold7	ISO8859-8 to interchange format
ISO8859-9_fold7	ISO8859-9 to interchange format
TIS-620_fold7	TIS-620 to interchange format
UTF-8_fold7	UTF-8 to interchange format
big5_fold7	big5 to interchange format
GBK_fold7	GBK to interchange format

Interchange Converters—8-bit

This converter provides conversions between internal code and 8-bit standard interchange formats (fold8). The fold8 name identifies encodings that can be used to pass text data through 8-bit mail protocols. The encodings are based on ISO2022. For more information about fold8, see Understanding libiconv.

The fold8 converters convert characters from a specific code set encoding to a canonical 8-bit encoding that identifies each character. This type of conversion is useful in networks where clients communicate with different code sets but use the same character sets. For example:

IBM-850 <-> ISO8859-1	Common Latin characters
IBM-932 <-> IBM-eucJP	Common Japanese characters

The following escape sequences designate standard code sets.

Escape Sequence	Standard Code Set
01/11 02/04 02/09 04/01	GR right half of GB2312.1980-0.
01/11 02/13 04/01	GR right half of ISO8859-1.
01/11 02/13 04/02	GR right half of ISO8859-2.
01/11 02/13 04/03	GR right half of ISO8859-3.
01/11 02/13 04/04	GR right half of ISO8859-4.
01/11 02/13 04/06	GR right half of ISO8859-7.
01/11 02/13 04/07	GR right half of ISO8859-6.
01/11 02/13 04/08	GR right half of ISO8859-8.
01/11 02/13 04/12	GR right half of ISO8859-5.
01/11 02/13 04/13	GR right half of ISO8859-9.
01/11 02/09 04/09	GR right half of JIS X0201.1976-1.
01/11 02/04 02/09 04/02	GR JIS X0208.1983-1.
01/11 02/04 02/09 04/00	GR JISX0208.1978-1.
01/11 02/09 04/02	GR 7-bit ASCII or left half of ISO8859-1.
01/11 02/05 02/15 03/01 M L 04/09 04/02 04/13 02/13 03/08 03/05 03/00 00/02	GR right half of IBM-850 unique characters. Characters common to ISO8859-1 should not use this escape sequence.
01/11 02/05 02/15 03/02 M L 04/09 04/02 04/13 02/13 07/05 06/04 06/03 04/10 05/00 00/02	GR right half of Japanese user-definable characters.
01/11 02/08 04/02	GL 7-bit ASCII or left half of ISO8859-1.
01/11 02/14 04/01	GL right half of ISO8859-1.
01/11 02/14 04/02	GL right half of ISO8859-2.
01/11 02/14 04/03	GL right half of ISO8859-3.
01/11 02/14 04/04	GL right half of ISO8859-4.
01/11 02/14 04/06	GL right half of ISO8859-7.
01/11 02/14 04/07	GL right half of ISO8859-6.
01/11 02/14 04/08	GL right half of ISO8859-8.
01/11 02/14 04/12	GL right half of ISO8859-5.
01/11 02/14 04/13	GL right half of ISO8859-9.
01/11 02/08 04/09	GL right half of JIS X0201.1976-0.
01/11 02/08 04/10	GL left half of JIS X0201.1976.
01/11 02/04 02/08 04/02	GL JIS X0208.1983-0.
01/11 02/04 04/02	GL JIS X0208.1983-0.
01/11 02/04 04/00	GL JIS X0208.1978-0.
01/11 02/05 02/15 03/01 M L 06/09 06/02 06/13 02/13 03/08 03/05 03/00 00/02	GL right half of IBM-850 unique characters. Characters common to ISO8859-1 do not use this escape sequence.
01/11 02/05 02/15 03/02 M L 06/09 06/02 06/13 02/13 07/05 06/04 06/03 04/10 05/00 00/02	GL Japanese (IBM-udcJP) user-definable characters.
01/11 02/04 02/09 04/03	GR KSC5601-1987.
01/11 02/04 02/09 03/00	GR CNS11643-1986-1.
01/11 02/04 02/10 03/01	GR CNS11643-1986-2.
01/11 02/05 02/15 03/02 M L 04/09 04/02 04/13 02/13 07/05 06/04 06/03 05/05 05/08 00/02	GR right half of Traditional Chinese user-definable characters.
01/11 02/05 02/15 03/02 M L 04/09 04/02 04/13 02/13 07/03 06/02 06/04 05/05 05/08 00/02	GR right half of IBM-850 unique symbols.
01/11 02/04 02/08 04/03	GL KSC5601-1987.
01/11 02/05 02/15 03/02 M L 06/09 06/02 06/13 02/13 07/05 06/04 06/03 05/05 05/08 00/02	GL Traditional Chinese (IBM-udcTW) user-definable characters.
01/11 02/05 02/15 03/02 M L 06/09 06/02 06/13 02/13 07/03 06/02 06/04 05/05 05/08 00/02	GL Traditional Chinese IBM-850 unique symbols (IBM-shdTW) user-definable characters.
01/11 02/05 02/15 03/00 M L 05/05 05/04 04/06 02/13 03/08 00/02	UCS-2 encoded as UTF-8; used only for those characters not encoded by any of the escape sequences listed above.

When converting from a code set to fold8, the escape sequence used to designate the code set is chosen according to the order listed. For example, the JISX0208.1983-0 characters use 01/11 02/04 02/08 04/02 as the designation.
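
The escape sequences in the table are written in column/row notation, where each pair names a byte whose value is column × 16 + row. The following sketch, with the hypothetical helper name `sequence_to_bytes`, turns such a sequence into the actual bytes. It handles only the numeric pairs, not the M and L placeholders that appear in the longer sequences.

```python
def sequence_to_bytes(notation: str) -> bytes:
    """Convert column/row tokens such as '01/11 02/13 04/01' to bytes."""
    out = bytearray()
    for token in notation.split():
        column, row = token.split("/")
        out.append(int(column) * 16 + int(row))  # byte value = column * 16 + row
    return bytes(out)

# GR right half of ISO8859-1
print(sequence_to_bytes("01/11 02/13 04/01"))        # b'\x1b-A'
# Designation quoted in the text for JISX0208.1983-0
print(sequence_to_bytes("01/11 02/04 02/08 04/02"))  # b'\x1b$(B'
```

Each sequence starts with 01/11, which is the ESC byte (0x1B).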

Files

The following list describes the fold8 converters found in the /usr/lib/nls/loc/iconv directory:

Converter	Description
fold8_IBM-850	Interchange format to IBM-850
fold8_IBM-921	Interchange format to IBM-921
fold8_IBM-922	Interchange format to IBM-922
fold8_IBM-932	Interchange format to IBM-932
fold8_IBM-943	Interchange format to IBM-943
fold8_IBM-1124	Interchange format to IBM-1124
fold8_IBM-1129	Interchange format to IBM-1129
fold8_IBM-eucCN	Interchange format to IBM-eucCN
fold8_IBM-eucJP	Interchange format to IBM-eucJP
fold8_IBM-eucKR	Interchange format to IBM-eucKR
fold8_IBM-eucTW	Interchange format to IBM-eucTW
fold8_ISO8859-1	Interchange format to ISO8859-1
fold8_ISO8859-2	Interchange format to ISO8859-2
fold8_ISO8859-3	Interchange format to ISO8859-3
fold8_ISO8859-4	Interchange format to ISO8859-4
fold8_ISO8859-5	Interchange format to ISO8859-5
fold8_ISO8859-6	Interchange format to ISO8859-6
fold8_ISO8859-7	Interchange format to ISO8859-7
fold8_ISO8859-8	Interchange format to ISO8859-8
fold8_ISO8859-9	Interchange format to ISO8859-9
fold8_TIS-620	Interchange format to TIS-620
fold8_UTF-8	Interchange format to UTF-8
fold8_big5	Interchange format to big5
fold8_GBK	Interchange format to GBK
IBM-921_fold8	IBM-921 to interchange format
IBM-922_fold8	IBM-922 to interchange format
IBM-850_fold8	IBM-850 to interchange format
IBM-932_fold8	IBM-932 to interchange format
IBM-943_fold8	IBM-943 to interchange format
IBM-1124_fold8	IBM-1124 to interchange format
IBM-1129_fold8	IBM-1129 to interchange format
IBM-eucCN_fold8	IBM-eucCN to interchange format
IBM-eucJP_fold8	IBM-eucJP to interchange format
IBM-eucKR_fold8	IBM-eucKR to interchange format
IBM-eucTW_fold8	IBM-eucTW to interchange format
ISO8859-1_fold8	ISO8859-1 to interchange format
ISO8859-2_fold8	ISO8859-2 to interchange format
ISO8859-3_fold8	ISO8859-3 to interchange format
ISO8859-4_fold8	ISO8859-4 to interchange format
ISO8859-5_fold8	ISO8859-5 to interchange format
ISO8859-6_fold8	ISO8859-6 to interchange format
ISO8859-7_fold8	ISO8859-7 to interchange format
ISO8859-8_fold8	ISO8859-8 to interchange format
ISO8859-9_fold8	ISO8859-9 to interchange format
TIS-620_fold8	TIS-620 to interchange format
UTF-8_fold8	UTF-8 to interchange format
big5_fold8	big5 to interchange format
GBK_fold8	GBK to interchange format

Interchange Converters—Compound Text

Compound text interchange converters convert between compound text and internal code sets.

Compound text is an interchange encoding defined by the X Consortium. It is used to communicate text between X clients. Compound text is based on ISO2022 and can encode most character sets using standard escape sequences. It also provides extensions for encoding private character sets. The supported code sets provide a converter to and from compound text. The name used to identify the compound text encoding is ct.

The following escape sequences are used to designate standard code sets in the order listed below.

01/11 02/05 02/15 03/01 M L 04/09 04/02 04/13 02/13 03/08 03/05 03/00 00/02: GR right half of IBM-850 unique characters. Characters common to ISO8859-1 should not use this escape sequence.
01/11 02/05 02/15 03/02 M L 04/09 04/02 04/13 02/13 07/05 06/04 06/03 04/10 05/00 00/02: GR right half of Japanese user-definable characters.
01/11 02/05 02/15 03/01 M L 06/09 06/02 06/13 02/13 03/08 03/05 03/00 00/02: GL right half of IBM-850 unique characters. Characters common to ISO8859-1 do not use this escape sequence.
01/11 02/05 02/15 03/02 M L 06/09 06/02 06/13 02/13 07/05 06/04 06/03 04/10 05/00 00/02: GL Japanese (IBM-udcJP) user-definable characters.
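
The long sequences above embed the name of the private code set as ASCII bytes, terminated by 00/02 (STX). The sketch below extracts that name. It assumes that M and L are placeholders for two length octets, as in the extended-segment format of the X Consortium Compound Text encoding; that reading, and the helper name `private_encoding_name`, are assumptions rather than something the text states.

```python
def private_encoding_name(notation: str) -> str:
    """Return the code set name embedded in an extended escape sequence."""
    tokens = notation.split()
    start = tokens.index("L") + 1   # the name starts after the M L pair
    end = tokens.index("00/02")     # 00/02 (STX) terminates the name
    codes = []
    for token in tokens[start:end]:
        column, row = token.split("/")
        codes.append(int(column) * 16 + int(row))
    return bytes(codes).decode("ascii")

gr_850 = "01/11 02/05 02/15 03/01 M L 04/09 04/02 04/13 02/13 03/08 03/05 03/00 00/02"
gl_udc = ("01/11 02/05 02/15 03/02 M L 06/09 06/02 06/13 02/13 "
          "07/05 06/04 06/03 04/10 05/00 00/02")
print(private_encoding_name(gr_850))  # IBM-850
print(private_encoding_name(gl_udc))  # ibm-udcJP
```

In these entries the GR and GL variants differ only in the case of the leading letters of the name (IBM versus ibm).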

Files

The following list describes the compound text converters that are found in the /usr/lib/nls/loc/iconv directory:

Converter	Description
ct_IBM-850	Interchange format to IBM-850
ct_IBM-921	Interchange format to IBM-921
ct_IBM-922	Interchange format to IBM-922
ct_IBM-932	Interchange format to IBM-932
ct_IBM-943	Interchange format to IBM-943
ct_IBM-1124	Interchange format to IBM-1124
ct_IBM-1129	Interchange format to IBM-1129
ct_IBM-eucCN	Interchange format to IBM-eucCN
ct_IBM-eucJP	Interchange format to IBM-eucJP
ct_IBM-eucKR	Interchange format to IBM-eucKR
ct_IBM-eucTW	Interchange format to IBM-eucTW
ct_ISO8859-1	Interchange format to ISO8859-1
ct_ISO8859-2	Interchange format to ISO8859-2
ct_ISO8859-3	Interchange format to ISO8859-3
ct_ISO8859-4	Interchange format to ISO8859-4
ct_ISO8859-5	Interchange format to ISO8859-5
ct_ISO8859-6	Interchange format to ISO8859-6
ct_ISO8859-7	Interchange format to ISO8859-7
ct_ISO8859-8	Interchange format to ISO8859-8
ct_ISO8859-9	Interchange format to ISO8859-9
ct_TIS-620	Interchange format to TIS-620
ct_big5	Interchange format to big5
ct_GBK	Interchange format to GBK
IBM-850_ct	IBM-850 to interchange format
IBM-921_ct	IBM-921 to interchange format
IBM-922_ct	IBM-922 to interchange format
IBM-932_ct	IBM-932 to interchange format
IBM-943_ct	IBM-943 to interchange format
IBM-1124_ct	IBM-1124 to interchange format
IBM-1129_ct	IBM-1129 to interchange format
IBM-eucCN_ct	IBM-eucCN to interchange format
IBM-eucJP_ct	IBM-eucJP to interchange format
IBM-eucKR_ct	IBM-eucKR to interchange format
IBM-eucTW_ct	IBM-eucTW to interchange format
ISO8859-1_ct	ISO8859-1 to interchange format
ISO8859-2_ct	ISO8859-2 to interchange format
ISO8859-3_ct	ISO8859-3 to interchange format
ISO8859-4_ct	ISO8859-4 to interchange format
ISO8859-5_ct	ISO8859-5 to interchange format
ISO8859-6_ct	ISO8859-6 to interchange format
ISO8859-7_ct	ISO8859-7 to interchange format
ISO8859-8_ct	ISO8859-8 to interchange format
ISO8859-9_ct	ISO8859-9 to interchange format
TIS-620_ct	TIS-620 to interchange format
big5_ct	big5 to interchange format
GBK_ct	GBK to interchange format

Interchange Converters—uucode

This converter provides the same mapping as the uuencode and uudecode commands.

During conversion from uucode, 62 bytes at a time (including the new-line character that trails the record) are converted, generating 45 bytes in outbuf.
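
The 62-to-45 ratio follows from the standard uuencode line layout: one length character, 60 encoded characters (45 bytes × 4/3), and a new-line character. The sketch below checks this with Python's `binascii` module, which implements the standard uuencode line format. That it matches the AIX uucode converter byte for byte is an assumption based on the statement that the mapping is the same as the uuencode command.

```python
import binascii

payload = bytes(range(45))           # one full record of 45 data bytes
record = binascii.b2a_uu(payload)    # length char + 60 encoded chars + new-line
print(len(record))                   # 62

decoded = binascii.a2b_uu(record)    # convert one record back to binary
print(len(decoded))                  # 45
print(decoded == payload)            # True
```

Only single records are handled here; the begin and end lines that the uuencode command writes around the data are not part of this sketch.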

Files

The following list describes the uucode converters found in the /usr/lib/nls/loc/iconv directory:

Converter	Description
IBM-850_uucode	IBM-850 to uucode
IBM-921_uucode	IBM-921 to uucode
IBM-922_uucode	IBM-922 to uucode
IBM-932_uucode	IBM-932 to uucode
IBM-943_uucode	IBM-943 to uucode
IBM-1124_uucode	IBM-1124 to uucode
IBM-1129_uucode	IBM-1129 to uucode
IBM-eucJP_uucode	IBM-eucJP to uucode
IBM-eucKR_uucode	IBM-eucKR to uucode
IBM-eucTW_uucode	IBM-eucTW to uucode
IBM-eucCN_uucode	IBM-eucCN to uucode
ISO8859-1_uucode	ISO8859-1 to uucode
ISO8859-2_uucode	ISO8859-2 to uucode
ISO8859-3_uucode	ISO8859-3 to uucode
ISO8859-4_uucode	ISO8859-4 to uucode
ISO8859-5_uucode	ISO8859-5 to uucode
ISO8859-6_uucode	ISO8859-6 to uucode
ISO8859-7_uucode	ISO8859-7 to uucode
ISO8859-8_uucode	ISO8859-8 to uucode
ISO8859-9_uucode	ISO8859-9 to uucode
TIS-620_uucode	TIS-620 to uucode
big5_uucode	big5 to uucode
GBK_uucode	GBK to uucode
uucode_IBM-850	uucode to IBM-850
uucode_IBM-921	uucode to IBM-921
uucode_IBM-922	uucode to IBM-922
uucode_IBM-932	uucode to IBM-932
uucode_IBM-943	uucode to IBM-943
uucode_IBM-1124	uucode to IBM-1124
uucode_IBM-1129	uucode to IBM-1129
uucode_IBM-eucCN	uucode to IBM-eucCN
uucode_IBM-eucJP	uucode to IBM-eucJP
uucode_IBM-eucKR	uucode to IBM-eucKR
uucode_IBM-eucTW	uucode to IBM-eucTW
uucode_ISO8859-1	uucode to ISO8859-1
uucode_ISO8859-2	uucode to ISO8859-2
uucode_ISO8859-3	uucode to ISO8859-3
uucode_ISO8859-4	uucode to ISO8859-4
uucode_ISO8859-5	uucode to ISO8859-5
uucode_ISO8859-6	uucode to ISO8859-6
uucode_ISO8859-7	uucode to ISO8859-7
uucode_ISO8859-8	uucode to ISO8859-8
uucode_ISO8859-9	uucode to ISO8859-9
uucode_TIS-620	uucode to TIS-620
uucode_big5	uucode to big5
uucode_GBK	uucode to GBK

UCS-2 Interchange Converters

UCS-2 uses a universal 16-bit encoding. Conversions for each code set are provided in both directions, between the code set and UCS-2. For more information, see Code Sets for National Language Support.
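
To make the fixed 16-bit width concrete, the sketch below encodes a few characters as two bytes each. Python has no codec named UCS-2, so `utf-16-be` is used as a stand-in, restricted to characters that fit in 16 bits; the big-endian byte order and the helper name `to_ucs2` are assumptions.

```python
def to_ucs2(text: str) -> bytes:
    """Encode text as big-endian UCS-2; reject characters that need more than 16 bits."""
    for ch in text:
        if ord(ch) > 0xFFFF:
            raise ValueError(f"U+{ord(ch):04X} does not fit in 16 bits")
    return text.encode("utf-16-be")

encoded = to_ucs2("Aé한")
print(len(encoded))   # 6: three characters, two bytes each
print(encoded.hex())  # 004100e9d55c
```

Every character occupies exactly two bytes regardless of script, which is what makes UCS-2 convenient as a common intermediate form between code sets.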

UCS-2 converters are found in the /usr/lib/nls/loc/uconvTable and /usr/lib/nls/loc/uconv directories. The uconvdef command is used to generate new converters or to customize existing UCS-2 converters.

Converter	Description
ISO8859-1	UCS-2 <—> ISO Latin-1
ISO8859-2	UCS-2 <—> ISO Latin-2
ISO8859-3	UCS-2 <—> ISO Latin-3
ISO8859-4	UCS-2 <—> ISO Baltic
ISO8859-5	UCS-2 <—> ISO Cyrillic
ISO8859-6	UCS-2 <—> ISO Arabic
ISO8859-7	UCS-2 <—> ISO Greek
ISO8859-8	UCS-2 <—> ISO Hebrew
ISO8859-9	UCS-2 <—> ISO Turkish
JISX0201.1976-0	UCS-2 <—> Japanese JISX0201-0
JISX0208.1983-0	UCS-2 <—> Japanese JISX0208-0
CNS11643.1986-1	UCS-2 <—> Chinese CNS11643-1
CNS11643.1986-2	UCS-2 <—> Chinese CNS11643-2
KSC5601.1987-0	UCS-2 <—> Korean KSC5601-0
IBM-eucCN	UCS-2 <—> Simplified Chinese EUC
IBM-udcCN	UCS-2 <—> Simplified Chinese user-defined characters
IBM-sbdCN	UCS-2 <—> Simplified Chinese IBM-specific characters
GB2312.1980-0	UCS-2 <—> Simplified Chinese GB
IBM-1381	UCS-2 <—> Simplified Chinese PC data code
IBM-935	UCS-2 <—> Simplified Chinese EBCDIC
IBM-936	UCS-2 <—> Simplified Chinese PC5550
IBM-eucJP	UCS-2 <—> Japanese EUC
IBM-eucKR	UCS-2 <—> Korean EUC
IBM-eucTW	UCS-2 <—> Traditional Chinese EUC
IBM-udcJP	UCS-2 <—> Japanese user-defined characters
IBM-udcTW	UCS-2 <—> Traditional Chinese user-defined characters
IBM-sbdTW	UCS-2 <—> Traditional Chinese IBM-specific characters
UTF-8	UCS-2 <—> UTF-8
IBM-437	UCS-2 <—> USA PC data code
IBM-850	UCS-2 <—> Latin-1 PC data code
IBM-852	UCS-2 <—> Latin-2 PC data code
IBM-857	UCS-2 <—> Turkish PC data code
IBM-860	UCS-2 <—> Portuguese PC data code
IBM-861	UCS-2 <—> Icelandic PC data code
IBM-863	UCS-2 <—> French Canadian PC data code
IBM-865	UCS-2 <—> Nordic PC data code
IBM-869	UCS-2 <—> Greek PC data code
IBM-921	UCS-2 <—> Baltic Multilingual data code
IBM-922	UCS-2 <—> Estonian data code
IBM-932	UCS-2 <—> Japanese PC data code
IBM-943	UCS-2 <—> Japanese PC data code
IBM-934	UCS-2 <—> Korea PC data code
IBM-936	UCS-2 <—> People's Republic of China PC data code
IBM-938	UCS-2 <—> Taiwanese PC data code
IBM-942	UCS-2 <—> Extended Japanese PC data code
IBM-944	UCS-2 <—> Korean PC data code
IBM-946	UCS-2 <—> People's Republic of China SAA data code
IBM-948	UCS-2 <—> Traditional Chinese PC data code
IBM-1124	UCS-2 <—> Ukrainian PC data code
IBM-1129	UCS-2 <—> Vietnamese PC data code
TIS-620	UCS-2 <—> Thailand PC data code
IBM-037	UCS-2 <—> USA, Canada EBCDIC
IBM-273	UCS-2 <—> Germany, Austria EBCDIC
IBM-277	UCS-2 <—> Denmark, Norway EBCDIC
IBM-278	UCS-2 <—> Finland, Sweden EBCDIC
IBM-280	UCS-2 <—> Italy EBCDIC
IBM-284	UCS-2 <—> Spain, Latin America EBCDIC
IBM-285	UCS-2 <—> United Kingdom EBCDIC
IBM-297	UCS-2 <—> France EBCDIC
IBM-500	UCS-2 <—> International EBCDIC
IBM-875	UCS-2 <—> Greek EBCDIC
IBM-930	UCS-2 <—> Japanese Katakana-Kanji EBCDIC
IBM-933	UCS-2 <—> Korean EBCDIC
IBM-937	UCS-2 <—> Traditional Chinese EBCDIC
IBM-939	UCS-2 <—> Japanese Latin-Kanji EBCDIC
IBM-1026	UCS-2 <—> Turkish EBCDIC
IBM-1112	UCS-2 <—> Baltic Multilingual EBCDIC
IBM-1122	UCS-2 <—> Estonian EBCDIC
IBM-1124	UCS-2 <—> Ukrainian EBCDIC
IBM-1129	UCS-2 <—> Vietnamese EBCDIC
TIS-620	UCS-2 <—> Thailand EBCDIC

UTF-8 Interchange Converters

UTF-8 is a universal, multibyte encoding described in UCS-2 and UTF-8. Conversions for each code set are provided in both directions, between the code set and UTF-8.

UTF-8 conversions are usually done by using the Universal_UCS_Conv and /usr/lib/nls/loc/uconv/UTF-8 converters. For more information, see UCS-2 Interchange Converters.
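
The two-step route (code set to a universal intermediate form, then to UTF-8) can be sketched as follows. Python's `str` type is used as the intermediate form in place of UCS-2, and the `cp850` and `euc_kr` codecs stand in for IBM-850 and IBM-eucKR; these substitutions and the helper name `to_utf8` are assumptions, not the AIX converters.

```python
def to_utf8(data: bytes, code_set: str) -> bytes:
    """Convert bytes in a given code set to UTF-8 through a Unicode pivot."""
    pivot = data.decode(code_set)   # step 1: code set -> universal form
    return pivot.encode("utf-8")    # step 2: universal form -> UTF-8

ascii_a = to_utf8(b"A", "cp850")
latin_e = to_utf8(b"\x82", "cp850")                # é in IBM-850
hangul = to_utf8("한".encode("euc_kr"), "euc_kr")
print(len(ascii_a), len(latin_e), len(hangul))     # 1 2 3
print(latin_e)                                     # b'\xc3\xa9'
```

Unlike the fixed two bytes of UCS-2, the UTF-8 result takes one, two, or three bytes here depending on the character.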

Converter	Description
ISO8859-1	UTF-8 <—> ISO Latin-1
ISO8859-2	UTF-8 <—> ISO Latin-2
ISO8859-3	UTF-8 <—> ISO Latin-3
ISO8859-4	UTF-8 <—> ISO Baltic
ISO8859-5	UTF-8 <—> ISO Cyrillic
ISO8859-6	UTF-8 <—> ISO Arabic
ISO8859-7	UTF-8 <—> ISO Greek
ISO8859-8	UTF-8 <—> ISO Hebrew
ISO8859-9	UTF-8 <—> ISO Turkish
JISX0201.1976-0	UTF-8 <—> Japanese JISX0201-0
JISX0208.1983-0	UTF-8 <—> Japanese JISX0208-0
CNS11643.1986-1	UTF-8 <—> Chinese CNS11643-1
CNS11643.1986-2	UTF-8 <—> Chinese CNS11643-2
KSC5601.1987-0	UTF-8 <—> Korean KSC5601-0
IBM-eucCN	UTF-8 <—> Simplified Chinese EUC
IBM-eucJP	UTF-8 <—> Japanese EUC
IBM-eucKR	UTF-8 <—> Korean EUC
IBM-eucTW	UTF-8 <—> Traditional Chinese EUC
IBM-udcJP	UTF-8 <—> Japanese user-defined characters
IBM-udcTW	UTF-8 <—> Traditional Chinese user-defined characters
IBM-sbdTW	UTF-8 <—> Traditional Chinese IBM-specific characters
UCS-2	UTF-8 <—> UCS-2
IBM-437	UTF-8 <—> USA PC data code
IBM-850	UTF-8 <—> Latin-1 PC data code
IBM-852	UTF-8 <—> Latin-2 PC data code
IBM-857	UTF-8 <—> Turkish PC data code
IBM-860	UTF-8 <—> Portuguese PC data code
IBM-861	UTF-8 <—> Icelandic PC data code
IBM-863	UTF-8 <—> French Canadian PC data code
IBM-865	UTF-8 <—> Nordic PC data code
IBM-869	UTF-8 <—> Greek PC data code
IBM-921	UTF-8 <—> Baltic Multilingual data code
IBM-922	UTF-8 <—> Estonian data code
IBM-932	UTF-8 <—> Japanese PC data code
IBM-943	UTF-8 <—> Japanese PC data code
IBM-934	UTF-8 <—> Korea PC data code
IBM-935	UTF-8 <—> Simplified Chinese EBCDIC
IBM-936	UTF-8 <—> People's Republic of China PC data code
IBM-938	UTF-8 <—> Taiwanese PC data code
IBM-942	UTF-8 <—> Extended Japanese PC data code
IBM-944	UTF-8 <—> Korean PC data code
IBM-946	UTF-8 <—> People's Republic of China SAA data code
IBM-948	UTF-8 <—> Traditional Chinese PC data code
IBM-1124	UTF-8 <—> Ukrainian PC data code
IBM-1129	UTF-8 <—> Vietnamese PC data code
TIS-620	UTF-8 <—> Thailand PC data code
IBM-037	UTF-8 <—> USA, Canada EBCDIC
IBM-273	UTF-8 <—> Germany, Austria EBCDIC
IBM-277	UTF-8 <—> Denmark, Norway EBCDIC
IBM-278	UTF-8 <—> Finland, Sweden EBCDIC
IBM-280	UTF-8 <—> Italy EBCDIC
IBM-284	UTF-8 <—> Spain, Latin America EBCDIC
IBM-285	UTF-8 <—> United Kingdom EBCDIC
IBM-297	UTF-8 <—> France EBCDIC
IBM-500	UTF-8 <—> International EBCDIC
IBM-875	UTF-8 <—> Greek EBCDIC
IBM-930	UTF-8 <—> Japanese Katakana-Kanji EBCDIC
IBM-933	UTF-8 <—> Korean EBCDIC
IBM-937	UTF-8 <—> Traditional Chinese EBCDIC
IBM-939	UTF-8 <—> Japanese Latin-Kanji EBCDIC
IBM-1026	UTF-8 <—> Turkish EBCDIC
IBM-1112	UTF-8 <—> Baltic Multilingual EBCDIC
IBM-1122	UTF-8 <—> Estonian EBCDIC
IBM-1124	UTF-8 <—> Ukrainian EBCDIC
IBM-1129	UTF-8 <—> Vietnamese EBCDIC
IBM-1381	UTF-8 <—> Simplified Chinese PC data code
GB18030	UTF-8 <—> Simplified Chinese
TIS-620	UTF-8 <—> Thailand EBCDIC

Miscellaneous Converters

The miscellaneous converters are a set of low-level converters that the code set converters and some of the interchange converters use internally. Using them directly is discouraged because they are intended only to support other converters.

Files

The following list describes the miscellaneous converters found in the /usr/lib/nls/loc/iconv and /usr/lib/nls/loc/iconvTable directories:

Converter	Description
IBM-932_JISX0201.1976-0	IBM-932 to JISX0201.1976-0
IBM-932_JISX0208.1983-0	IBM-932 to JISX0208.1983-0
IBM-932_IBM-udcJP	IBM-932 to IBM-udcJP (Japanese user-defined characters)
IBM-943_JISX0201.1976-0	IBM-943 to JISX0201.1976-0
IBM-943_JISX0208.1983-0	IBM-943 to JISX0208.1983-0
IBM-943_IBM-udcJP	IBM-943 to IBM-udcJP (Japanese user-defined characters)
IBM-eucJP_JISX0201.1976-0	IBM-eucJP to JISX0201.1976-0
IBM-eucJP_JISX0208.1983-0	IBM-eucJP to JISX0208.1983-0
IBM-eucJP_IBM-udcJP	IBM-eucJP to IBM-udcJP (Japanese user-defined characters)
IBM-eucKR_KSC5601.1987-0	IBM-eucKR to KSC5601.1987-0
IBM-eucTW_CNS11643.1986-1	IBM-eucTW to CNS11643.1986-1
IBM-eucTW_CNS11643.1986-2	IBM-eucTW to CNS11643.1986-2
IBM-eucCN_GB2312.1980-0	IBM-eucCN to GB2312.1980-0

Source: http://www-01.ibm.com/support/knowledgecenter/ssw_aix_53/com.ibm.aix.nls/doc/nlsgdrf/iconv.htm%23a197c1176
