Understanding Escape Codes and Control Sequences

What is commonly called an escape code is formally called a control function in ISO 6429. In everyday use, "escape code" refers to the code or sequence that represents a control function. The control codes defined by ISO 6429 are divided into two categories: C0 codes and C1 codes.

C0 codes correspond to the non-printing control characters in ASCII (0x00 to 0x1F). These codes are familiar to developers. They include line feed (\n), carriage return (\r), tab (\t), and the null character (\0).
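As a quick illustration, the following Python sketch prints the byte values of a few familiar C0 codes. The helper name `is_c0` and the table `C0_EXAMPLES` are made up for this example and are not part of any standard or library.

```python
# A few well-known C0 codes, written with their usual escape notation.
# C0_EXAMPLES and is_c0 are hypothetical names used only for illustration.
C0_EXAMPLES = {
    "null (\\0)": "\0",
    "tab (\\t)": "\t",
    "line feed (\\n)": "\n",
    "carriage return (\\r)": "\r",
    "escape (ESC)": "\x1b",
}


def is_c0(ch: str) -> bool:
    """Return True if the single character ch lies in the C0 range 0x00-0x1F."""
    return 0x00 <= ord(ch) <= 0x1F


for name, ch in C0_EXAMPLES.items():
    print(f"{name}: 0x{ord(ch):02X}, C0={is_c0(ch)}")
```

Note that ESC (0x1B) is itself a C0 code, which is why it can introduce escape sequences even in plain ASCII.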

There are 32 codes in C1. They are represented by one-byte values between 0x80 and 0x9F. Unlike C0 codes, C1 codes are not defined in ASCII, which is a 7-bit code. The one-byte forms can therefore only be used in 8-bit environments. In 7-bit environments, C1 codes cannot be used directly and must be expressed as escape sequences. Most modern terminals also use the escape sequence form when they need C1 codes, in part because bytes in the range 0x80 to 0x9F collide with multi-byte encodings such as UTF-8.

An escape sequence is a series of characters that begins with ESC (0x1B). A sequence that corresponds to a C1 code consists of ESC (0x1B) followed by a second byte from @ (0x40) to _ (0x5F). For example, IND is represented by the single byte 0x84 in an 8-bit environment, but by ESC D (0x1B 0x44) in a 7-bit environment. The second byte is called the "Final character of Escape sequence," or "Fe," and the two-byte forms of C1 codes are also referred to as "Fe sequences."
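The relationship between the one-byte C1 form and the two-byte ESC Fe form is a fixed offset of 0x40 (0x84 - 0x40 = 0x44, which is "D"). Here is a minimal sketch of the conversion in both directions; the function names `c1_to_fe` and `fe_to_c1` are hypothetical and chosen only for this example.

```python
ESC = 0x1B  # the ESC control character


def c1_to_fe(c1_byte: int) -> bytes:
    """Convert a one-byte C1 code (0x80-0x9F) to its two-byte ESC Fe form."""
    if not 0x80 <= c1_byte <= 0x9F:
        raise ValueError(f"0x{c1_byte:02X} is not a C1 code")
    # The Fe byte is the C1 value shifted down by 0x40 into 0x40-0x5F.
    return bytes([ESC, c1_byte - 0x40])


def fe_to_c1(seq: bytes) -> int:
    """Convert a two-byte ESC Fe sequence back to its one-byte C1 code."""
    if len(seq) != 2 or seq[0] != ESC or not 0x40 <= seq[1] <= 0x5F:
        raise ValueError(f"{seq!r} is not an Fe sequence")
    return seq[1] + 0x40


print(c1_to_fe(0x84))             # IND as an escape sequence
print(hex(fe_to_c1(b"\x1bD")))    # back to the one-byte form
```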

ISO 6429 defines other escape sequences in addition to Fe sequences. For example, second-byte values from ` (0x60) to ~ (0x7E) are called "Fs," and sequences that consist of ESC followed by an Fs byte are called "Fs sequences." Unlike C1 codes, Fs sequences are always expressed as escape sequences, regardless of the environment. The control functions they represent are called "independent control functions."
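The two ranges described so far can be told apart by looking only at the byte after ESC. The sketch below classifies a two-byte escape sequence as Fe or Fs. The name `classify_escape` is hypothetical, and anything outside the two ranges covered in this article is simply reported as "other".

```python
ESC = 0x1B  # the ESC control character


def classify_escape(seq: bytes) -> str:
    """Classify a two-byte escape sequence by the range of its second byte."""
    if len(seq) != 2 or seq[0] != ESC:
        return "other"
    second = seq[1]
    if 0x40 <= second <= 0x5F:   # @ to _ : two-byte form of a C1 code
        return "Fe"
    if 0x60 <= second <= 0x7E:   # ` to ~ : independent control function
        return "Fs"
    return "other"               # ranges not covered in this article


for seq in (b"\x1bD", b"\x1b[", b"\x1bc"):
    print(seq, classify_escape(seq))
```

Here ESC D (IND) and ESC [ (CSI) fall in the Fe range, while ESC c (RIS, reset to initial state) falls in the Fs range.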

Fe sequences and Fs sequences mostly provide commands for controlling physical terminal devices, so they are rarely used in modern terminal emulators. Instead, control sequences that begin with CSI (Control Sequence Introducer) are mainly used.

CSI is one of the C1 control codes. In an 8-bit environment, it is represented by the single byte 0x9B; in a 7-bit environment, it is represented by ESC [ (0x1B 0x5B). A sequence that starts with CSI and ends with a final byte between @ (0x40) and ~ (0x7E) is called a "control sequence." Control sequences can adjust various aspects of a terminal, such as font, color, and cursor position. We will discuss this in more detail in the next article.
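As a rough preview, a control sequence can be split into its parts by byte range. The sketch below assumes the ESC [ form of CSI and the usual layout of parameter bytes (0x30 to 0x3F), intermediate bytes (0x20 to 0x2F), and one final byte (0x40 to 0x7E). The function name `parse_control_sequence` is hypothetical, and the parser only looks at the first sequence in its input.

```python
CSI = b"\x1b["  # CSI written as an escape sequence (ESC [)


def parse_control_sequence(data: bytes) -> tuple[str, str, str]:
    """Split one control sequence into (parameters, intermediates, final)."""
    if not data.startswith(CSI):
        raise ValueError("input does not start with CSI")
    body = data[len(CSI):]

    # Parameter bytes: 0x30-0x3F (digits, ';', and a few other symbols).
    i = 0
    while i < len(body) and 0x30 <= body[i] <= 0x3F:
        i += 1

    # Intermediate bytes: 0x20-0x2F.
    j = i
    while j < len(body) and 0x20 <= body[j] <= 0x2F:
        j += 1

    # Final byte: 0x40-0x7E; it identifies the control function.
    if j >= len(body) or not 0x40 <= body[j] <= 0x7E:
        raise ValueError("missing or invalid final byte")

    return body[:i].decode("ascii"), body[i:j].decode("ascii"), chr(body[j])


print(parse_control_sequence(b"\x1b[1;31m"))   # color and style (final byte m)
print(parse_control_sequence(b"\x1b[10;20H"))  # cursor position (final byte H)
```

In the first call, the final byte m selects graphic rendition with parameters 1 and 31 (bold, red foreground); in the second, H moves the cursor to row 10, column 20.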

