byte2char, char2byte - convert between bytes and characters

include "sys.m";
sys := load Sys Sys->PATH;
byte2char: fn(buf: array of byte, n: int): (int, int, int);
char2byte: fn(c: int, buf: array of byte, n: int): int;

Description

byte2char (buf, n)

The byte2char function converts a UTF byte sequence to the corresponding Unicode character. The buf argument is an array of bytes holding the sequence, and n is the index in buf at which scanning begins. The returned tuple, (c, len, status), specifies the result of the translation:

c

The resulting Unicode character.

len

The number of bytes consumed by the translation.

status

Non-zero if the bytes are a valid UTF sequence and zero otherwise.

If the input sequence is not long enough to determine its validity, byte2char consumes zero bytes. If the input sequence is otherwise invalid, byte2char consumes one input byte and generates an error character (UTFerror), which prints in most fonts as a boxed question mark.
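The consumption rules above suggest a simple decoding loop. The following is a minimal sketch, not a definitive implementation: the module name Decode, the sample bytes, and the variable names are illustrative assumptions, and the middle tuple element is called length because len is a reserved word in Limbo.

```limbo
implement Decode;

include "sys.m";
	sys: Sys;
include "draw.m";

Decode: module
{
	init: fn(ctxt: ref Draw->Context, argv: list of string);
};

init(nil: ref Draw->Context, nil: list of string)
{
	sys = load Sys Sys->PATH;

	# Sample input (an assumption for illustration): 'a' followed by
	# the two-byte UTF encoding of U+00E9.
	buf := array[] of {byte 16r61, byte 16rC3, byte 16rA9};

	n := 0;
	while(n < len buf){
		(c, length, status) := sys->byte2char(buf, n);
		if(length == 0)
			break;	# sequence is incomplete; more input is needed
		if(status == 0)
			sys->print("invalid byte at offset %d\n", n);
		else
			sys->print("character %d, %d byte(s)\n", c, length);
		n += length;	# one byte is consumed even on an invalid sequence
	}
}
```

Checking for a zero length before checking status matters here: without it, a truncated sequence at the end of the buffer would leave n unchanged and the loop would never terminate.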

char2byte (c, buf, n)

The char2byte function performs the inverse of byte2char. It translates a Unicode character, c, to a UTF byte sequence, which is placed in buf starting at index n. The longest UTF sequence for a single Unicode character is UTFmax bytes.

If the translation succeeds, char2byte returns the number of bytes placed in the buffer. If the space between n and the end of the buffer is too small to hold the result, char2byte returns zero and leaves the array unchanged.
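The return value can drive an encoding loop in the same way. The sketch below is only an illustration under stated assumptions: the module name Encode and the sample string are hypothetical, and the buffer is sized at UTFmax bytes per character so that the zero return should not occur in practice.

```limbo
implement Encode;

include "sys.m";
	sys: Sys;
include "draw.m";

Encode: module
{
	init: fn(ctxt: ref Draw->Context, argv: list of string);
};

init(nil: ref Draw->Context, nil: list of string)
{
	sys = load Sys Sys->PATH;

	s := "a\u00E9";	# sample text, chosen for illustration

	# Worst case: every character needs UTFmax bytes.
	buf := array[(len s) * Sys->UTFmax] of byte;

	n := 0;
	for(i := 0; i < len s; i++){
		nb := sys->char2byte(s[i], buf, n);
		if(nb == 0){
			# Not enough room between n and the end of buf.
			sys->print("buffer too small at offset %d\n", n);
			break;
		}
		n += nb;
	}
	sys->print("encoded %d character(s) into %d byte(s)\n", len s, n);
}
```

After the loop, buf[0:n] holds the encoded bytes. Sizing the buffer for the worst case trades a little memory for not having to grow the array and retry.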

Notes

If the array bounds are invalid or insufficient to hold results, a run-time error occurs.

See Also
Limbo System Module

utfbytes - compute the Unicode length of a UTF byte sequence

UTF, Unicode, ASCII - character set and format in Appendix A



infernosupport@lucent.com
Copyright © 1996, Lucent Technologies, Inc. All rights reserved.