Understanding Escape Codes and Control Sequences

What is commonly called an escape code is, in the precise terminology of ISO 6429, a control function. "Escape code" usually refers to the code or sequence that represents a control function. The control codes defined by ISO 6429 are divided into two categories: C0 codes and C1 codes.

C0 codes correspond to the non-printing characters in ASCII. These codes are familiar to developers. They include line feed (\n), carriage return (\r), tab (\t), and the null character (\0).
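As a quick check, here is a minimal Python sketch showing that these familiar characters all fall in the C0 range, 0x00 to 0x1F. The helper name `is_c0` is hypothetical and exists only for this illustration.

```python
def is_c0(ch: str) -> bool:
    """Return True if ch is a single character in the C0 range 0x00-0x1F."""
    return len(ch) == 1 and 0x00 <= ord(ch) <= 0x1F


# ESC is included because it is itself a C0 code (0x1B).
for name, ch in [("LF", "\n"), ("CR", "\r"), ("HT", "\t"), ("NUL", "\0"), ("ESC", "\x1b")]:
    print(f"{name}: 0x{ord(ch):02X} is_c0={is_c0(ch)}")
```

Printable characters such as "A" (0x41) or a space (0x20) are outside this range, so `is_c0` returns False for them.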

There are 32 codes in C1. They are represented by one-byte values between 0x80 and 0x9F. Unlike C0 codes, C1 codes are not defined in ASCII, which is a 7-bit code, so their one-byte forms can only be used in 8-bit environments. In 7-bit environments, C1 codes cannot be used directly and must be expressed as escape sequences. Most modern terminals also use the escape-sequence form when they need a C1 code, largely because the bytes 0x80 to 0x9F have a different meaning in UTF-8, where they serve as continuation bytes.

An escape sequence is a series of characters that begins with ESC (0x1B). Sequences that correspond to C1 codes consist of a first byte of ESC (0x1B) and a second byte from @ (0x40) to _ (0x5F). For example, in an 8-bit environment, IND is represented by 0x84, but in a 7-bit environment, it is represented by ESC D (0x1B 0x44). This second byte is called the "Final character of Escape sequence," or "Fe," and the two-byte forms of C1 codes are also referred to as "Fe sequences."
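The relationship between the one-byte form and the two-byte form is a fixed offset: subtracting 0x40 from the C1 byte gives the Fe byte (0x84 - 0x40 = 0x44, which is "D"). The following Python sketch illustrates the mapping in both directions; the function names `c1_to_fe` and `fe_to_c1` are hypothetical and chosen only for this example.

```python
ESC = 0x1B


def c1_to_fe(c1: int) -> bytes:
    """Convert a one-byte C1 code (0x80-0x9F) to its two-byte ESC Fe form."""
    if not 0x80 <= c1 <= 0x9F:
        raise ValueError(f"not a C1 code: 0x{c1:02X}")
    return bytes([ESC, c1 - 0x40])


def fe_to_c1(seq: bytes) -> int:
    """Convert a two-byte ESC Fe sequence back to its one-byte C1 code."""
    if len(seq) != 2 or seq[0] != ESC or not 0x40 <= seq[1] <= 0x5F:
        raise ValueError(f"not an Fe sequence: {seq!r}")
    return seq[1] + 0x40


print(c1_to_fe(0x84))            # IND  -> b'\x1bD'
print(c1_to_fe(0x9B))            # CSI  -> b'\x1b['
print(hex(fe_to_c1(b"\x1bD")))   # ESC D -> 0x84
```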

In addition to Fe sequences, ISO 6429 defines other escape sequences. For example, values between ` (0x60) and ~ (0x7E) are called "Fs," and sequences consisting of ESC followed by an Fs byte are called "Fs sequences." Unlike C1 codes, Fs sequences are always expressed as escape sequences, regardless of the environment; the functions they represent are called "independent control functions."
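Because the type of a two-byte escape sequence depends only on the range of its second byte, classification can be sketched in a few lines of Python. The function name `classify_escape` and the returned labels are assumptions made for this illustration; sequences outside the two ranges described above are simply reported as "other".

```python
ESC = 0x1B


def classify_escape(seq: bytes) -> str:
    """Classify a two-byte escape sequence by the range of its second byte."""
    if len(seq) != 2 or seq[0] != ESC:
        return "not a two-byte escape sequence"
    second = seq[1]
    if 0x40 <= second <= 0x5F:
        return "Fe"   # two-byte form of a C1 code
    if 0x60 <= second <= 0x7E:
        return "Fs"   # independent control function
    return "other"    # ranges not covered in this article


print(classify_escape(b"\x1bD"))  # ESC D (IND) -> Fe
print(classify_escape(b"\x1bc"))  # ESC c (RIS, reset to initial state) -> Fs
```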

Fe sequences and Fs sequences mostly provide commands for controlling physical terminal devices. As a result, they are rarely used in modern terminal emulators. Instead, control sequences, which begin with CSI (Control Sequence Introducer), are mainly used.

CSI is one of the C1 control codes. In an 8-bit environment, it is represented by 0x9B; in a 7-bit environment, it is represented by ESC [ (0x1B 0x5B). A sequence that starts with CSI and ends with a byte between @ (0x40) and ~ (0x7E) is called a "control sequence." Control sequences can adjust various aspects of a terminal, such as font, color, and cursor position. We will discuss this in more detail in the next article.
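To make the shape of a control sequence concrete, here is a minimal Python sketch that builds and parses sequences using the ESC [ form of CSI. It assumes the usual layout of optional parameter bytes (0x30 to 0x3F), optional intermediate bytes (0x20 to 0x2F), and one final byte (0x40 to 0x7E). The names `CSI_PATTERN` and `parse_csi` are hypothetical helpers, not part of any library.

```python
import re

CSI = "\x1b["  # ESC [ form of the Control Sequence Introducer

# parameter bytes 0x30-0x3F, intermediate bytes 0x20-0x2F, final byte 0x40-0x7E
CSI_PATTERN = re.compile(r"\x1b\[([0-?]*)([ -/]*)([@-~])")


def parse_csi(text: str):
    """Split one control sequence into (parameters, intermediates, final byte).

    Returns None if text is not exactly one well-formed control sequence.
    """
    match = CSI_PATTERN.fullmatch(text)
    return match.groups() if match else None


red = CSI + "31m"      # SGR: select red foreground
home = CSI + "1;1H"    # CUP: move the cursor to row 1, column 1

print(parse_csi(red))   # ('31', '', 'm')
print(parse_csi(home))  # ('1;1', '', 'H')
```

The final byte identifies the function ("m" for graphic rendition, "H" for cursor position), while the parameter bytes carry its arguments.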

Seulgi Kim
