Seulgi Kim

Posts

Showing posts from 2023

Iterator Adapters in Rust

An iterator that takes another iterator and returns a new one is called an iterator adapter. The name "adapter" comes from one of the GoF design patterns, the adapter pattern. In reality, however, it corresponds more closely to the decorator pattern, so if you pay too much attention to the name, you might get confused about its purpose. It's better not to worry too much about the name. Enough complaining about the name; what does an iterator adapter do? An iterator adapter adds a task to be performed as the iterator iterates. This is easier to understand with an example. The map function is one of the best-known adapters. As those who have used functional languages will know, the iterator returned by map iterates over new values transformed from the original values. Various other adapters are already implemented in the standard library. Among them, the most frequently used are those that are convenient to use with loops. Examples include the ...

Difference Between the clear Command in Linux and Mac

I've been writing a series of posts about CSI sequences, but we rarely use CSI sequences directly. However, there is one place where we use them unknowingly: the clear command that clears the screen. The clear command basically uses two CSI sequences. One is CSI H (Cursor Position, a.k.a. CUP), which moves the cursor to the top-left corner of the screen; thanks to CUP, that is where the cursor is after the command ends. The second is CSI 2 J (Erase in Display, a.k.a. ED), which clears the entire screen. Linux and Mac both use these two sequences, so they behave the same way up to this point. However, Linux's clear and Mac's differ in what they do next. In a nutshell, Linux's clear clears the scrollback buffer, while Mac's does not. The Linux version prints CSI 3 J after the two sequences. CSI 3 J is an extension to the escape sequences, introduced by xterm, that removes lines stored in the scrollback buffer. Since be...
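The difference described above can be sketched as a small function that builds the bytes each variant would print. This is only an illustration under stated assumptions: `clear_sequence` and its `erase_scrollback` flag are hypothetical names, not part of any real clear implementation.

```cpp
#include <string>

// CSI is ESC (0x1B) followed by '['.
const std::string CSI = "\x1b[";

// Builds the byte sequence a clear-like command would print.
// erase_scrollback is a hypothetical flag: true mimics the Linux behavior
// described in the post, false mimics the Mac behavior.
std::string clear_sequence(bool erase_scrollback) {
    std::string out = CSI + "H";   // CUP: move the cursor to the top-left corner
    out += CSI + "2J";             // ED: erase the entire screen
    if (erase_scrollback) {
        out += CSI + "3J";         // xterm extension: erase the scrollback buffer
    }
    return out;
}
```

Writing `clear_sequence(true)` to a terminal should behave roughly like the Linux clear, assuming the terminal supports the xterm extension.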

What is the size of an empty object?

Consider a class like the one above. Commonly called an "empty class," this class has no member variables. So, how big is this empty class? At first glance, the size should be 0 since there are no member variables. However, the size is never 0 in any language, whether Java, C#, C (in this case, a struct), or C++. This is to ensure that two different objects never have the same address. Empty classes typically have a size of 1 byte in a 32-bit environment and 2 bytes in a 64-bit environment. However, the exact size cannot be determined. According to the specification, the size just needs to be non-zero. The precise size depends on the implementation. This is a translation of my old Korean post written in 2015. Because the size can vary depending on the implementation, it is now possible to have different sizes (although still not 0). Languages like Rust have even introduced zero-sized types. We will look at this topic in more detail in a future post.
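For C++ specifically, the non-zero size and the distinct-address guarantee can be checked with a short sketch. The class name `Empty` and the helper functions are hypothetical, and the code deliberately asserts only that the size is at least 1, since the exact value is implementation-defined.

```cpp
#include <cstddef>

// An empty class: it has no data members.
class Empty {};

// The standard only requires a non-zero size for a complete object.
static_assert(sizeof(Empty) >= 1, "an empty class still occupies storage");

// Returns true when two distinct Empty objects have distinct addresses.
bool has_distinct_addresses() {
    Empty a;
    Empty b;
    return &a != &b;
}

// Reports the implementation-defined size of Empty.
std::size_t empty_size() {
    return sizeof(Empty);
}
```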

Clear Screen with CSI Sequence

Today, following my previous post, I will explain how to clear the screen using CSI sequences. There are two CSI sequences for clearing. The first one is the Erase in Line sequence, called EL. It is written as CSI # K and, as the name suggests, it erases lines. If the # is not provided, the default value is 0; if a value is provided, it must be 0, 1, or 2. The terminal will ignore the sequence if any other value is provided. For example, if you print 0x311b5b334b32 (or 1^[3K2), the terminal ignores ^[3K, and the screen displays only 12. The behavior of 0, 1, and 2 can be summarized as follows: 0 erases from the cursor to the end of the line; 1 erases from the beginning of the line to the cursor; 2 erases the entire line, regardless of the cursor's position. Remember that the EL sequence does not move the cursor. Therefore, if you want to erase the current line and write a new line on the c...
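A minimal sketch of building EL sequences follows. The function names are hypothetical, and the use of a carriage return to move the cursor back to the start of the line before rewriting is my assumption about one reasonable approach, since EL itself leaves the cursor where it is.

```cpp
#include <string>

// Builds an Erase in Line (EL) sequence: CSI <mode> K.
// mode 0: cursor to end of line, 1: start of line to cursor, 2: entire line.
std::string erase_in_line(int mode = 0) {
    return "\x1b[" + std::to_string(mode) + "K";
}

// Erases the current line and rewrites it with new text.
// EL does not move the cursor, so a carriage return is sent first
// (an assumption of this sketch) to return to the first column.
std::string rewrite_line(const std::string& text) {
    return "\r" + erase_in_line(2) + text;
}
```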

[C++] Handling Exceptions in Constructors

When you use the RAII idiom, there are often situations where constructors have to do complex tasks. These complex tasks can sometimes fail and throw exceptions. This raises a concern: is it okay to throw exceptions in constructors? The first concern is memory leaks. Fortunately, memory leaks do not occur. Variables created on the stack are released through stack unwinding, and if a constructor throws during heap allocation with the new operator, the allocated memory is automatically deallocated before the exception propagates to the caller. The next concern is whether the destructors of the member variables will be called correctly. This is also not a problem. When an exception occurs, member variables can be divided into three categories: fully initialized member variables, member variables being initialized, and uninitialized member variables. Fully initialized member variables have had their constructors called and memory allocations completed successfully. In the example code, t...
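The behavior for fully initialized members can be observed with a small sketch. The types `Member` and `Owner` and the counters are hypothetical and stand in for whatever the original example used.

```cpp
#include <stdexcept>

int destroyed_members = 0;  // counts Member destructor calls
int destroyed_owners = 0;   // counts Owner destructor calls

struct Member {
    ~Member() { ++destroyed_members; }
};

struct Owner {
    Member first;  // fully initialized before the constructor body runs
    Owner() { throw std::runtime_error("constructor failed"); }
    ~Owner() { ++destroyed_owners; }
};

// Returns true if the exception thrown by Owner's constructor was caught.
bool construct_and_catch() {
    try {
        Owner* owner = new Owner();  // storage from new is freed automatically on throw
        delete owner;                // never reached
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

After `construct_and_catch()` returns, the fully initialized member has been destroyed once, while the destructor of `Owner` itself has not run because the object was never fully constructed.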

[C++] enum class

Traditional C++ enum had several issues. To solve these problems, C++11 introduced a new feature called enum class. In this article, I will examine the problems with the traditional enum and how they are solved with enum class. First, traditional enum could not be forward-declared. The reason was that, without knowing the enumerator values, it was impossible to determine the enum's size. An enum class, however, is treated as int if an underlying type is not specified, so its size is always known; assigning values outside the range of an int will raise a compilation error. If you want to use values outside the range of an int, you need to specify the underlying type. Another problem with traditional enum was that the scope of enumerator names was not limited. Let's see the following example. Here, we try to represent the results of IO and Parse functions with enums. However, this code will not compile because the Error and Ok of IOResult conflict with those of ParseResult. To resolve t...
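A short sketch of both points, using the IOResult and ParseResult names from the post. The helper functions and the choice of `std::uint8_t` as an explicit underlying type are assumptions made for illustration.

```cpp
#include <cstdint>

// Forward declarations are allowed because the underlying type is known:
// int by default, or whatever is specified explicitly.
enum class IOResult;
enum class ParseResult : std::uint8_t;

// Both enums can define Ok and Error; the names are scoped to each enum.
enum class IOResult { Ok, Error };
enum class ParseResult : std::uint8_t { Ok, Error };

static_assert(sizeof(IOResult) == sizeof(int), "default underlying type is int");
static_assert(sizeof(ParseResult) == sizeof(std::uint8_t), "explicit underlying type");

// Hypothetical helpers showing that enumerators must be qualified.
bool io_succeeded(IOResult result) { return result == IOResult::Ok; }
bool parse_succeeded(ParseResult result) { return result == ParseResult::Ok; }
```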

C is Not a Subset of C++

I came across an absurd article. "A well-written C program is a C++ program. Therefore, a well-written C program should be compilable with a C++ compiler." This statement was undoubtedly true before 1999. Bjarne Stroustrup definitely took C compatibility into account when creating C++. At that time, well-written C code that adhered to the ANSI C standard compiled correctly with a C++ compiler. However, that was only the case before the release of C99. C99 introduced various new features, which C++ had already implemented differently or did not consider necessary. Moreover, the release of the new C11 standard and the new C++ standards (C++03, C++11, and more) have widened the gap between the two languages to a point where it is practically impossible to bridge. Code that follows the C89 standard can still be compiled with a C++ compiler. But how many programs nowadays use C89? Try to find an actively developed project that uses C89. I have never tried to find one. So,...

[C++] Object slicing

Object slicing refers to the loss of information from a derived class instance when it is copied to a parent class instance. It arises from the nature of value types, which store values on the stack instead of the heap. This bug does not occur in languages like Java, which only have reference types that allocate values on the heap. Upcasting should not be used for value types due to the issue of object slicing. In most cases where upcasting is needed, there is already an issue with the code that needs to be fixed. If upcasting must be used under certain circumstances, values must be allocated on the heap. This article is a translation of a Korean post written in 2015. If you would like to view the original, please refer to this link.
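A minimal sketch of slicing in C++ follows. The types `Base` and `Derived` and the two helper functions are hypothetical; the by-reference variant is included only as a contrast in which no copy, and therefore no slicing, takes place.

```cpp
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Takes the argument by value: a Derived argument is copied into a Base
// object, so the Derived part is sliced off.
std::string name_by_value(Base object) {
    return object.name();
}

// Takes the argument by reference: no copy is made, so the dynamic type
// is preserved.
std::string name_by_reference(const Base& object) {
    return object.name();
}
```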

What Is RAII

RAII is a frequently used idiom in C++ that ensures the safe usage of resources by releasing them when an object's scope ends. In C++, resources allocated on the heap are not released unless explicitly done so, but those allocated on the stack are automatically released when their scope ends, triggering their destructor. Originally, RAII was used to guard against unexpected changes in control flow, such as exceptions. In the above code example, the unsafeFunction() function is not safe. If thisFunctionCanThrowException() throws an exception, the resource may not be released. The unmaintanableFunction releases the resource, but it is not easy to read and maintain. The safeFunction example uses unique_ptr, a smart pointer introduced in C++11, for RAII. unique_ptr automatically releases the memory it holds when it is destroyed, ensuring that the resource is released when the function exits. The resource does not only refer to heap memory but also includes files, d...
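The idea behind safeFunction can be sketched as follows. The function names are taken from the post, but the bodies, the `Resource` type, and the `released` counter are assumptions written for this illustration, not the original code.

```cpp
#include <memory>
#include <stdexcept>

int released = 0;  // counts how many Resource objects have been released

struct Resource {
    ~Resource() { ++released; }
};

// Stand-in for an operation that fails.
void thisFunctionCanThrowException() {
    throw std::runtime_error("failure");
}

// unique_ptr owns the Resource, so its destructor releases it during
// stack unwinding even though the function exits through an exception.
void safeFunction() {
    std::unique_ptr<Resource> resource(new Resource());
    thisFunctionCanThrowException();
}

// Hypothetical driver: returns true if the exception reached the caller.
bool run_safe_function() {
    try {
        safeFunction();
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```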

Cursor Movement with CSI Sequences

Today, we will continue from the previous article to explore how to move the cursor using CSI sequences. The types of CSI sequences for moving the cursor can be summarized as follows.

Code         Abbr  Name
CSI # A      CUU   CUrsor Up
CSI # B      CUD   CUrsor Down
CSI # C      CUF   CUrsor Forward
CSI # D      CUB   CUrsor Backward
CSI # E      CNL   CUrsor Next Line
CSI # F      CPL   CUrsor Previous Line
CSI # I      CHT   Cursor Horizontal forward Tabulation
CSI # Z      CBT   Cursor Backward Tabulation
CSI # G      CHA   Cursor Horizontal Absolute
CSI # ; # H  CUP   CUrsor Position

CUU, CUD, CUF, CUB

These are the abbreviations for CUrsor Up, CUrsor Down, CUrsor Forward, and CUrsor Backward; as the names suggest, they move the cursor up, down, forward, and backward. They take a single number as an argument; if the argument is omitted, it is treated as 1. Thus, 0x1b[A is equivalent to 0x1b[1A. In this case, CUF and CUB move only within the same line. In other words, CUB rece...
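A few of these sequences can be built with small helpers. This is a sketch under stated assumptions: all function names are hypothetical, and only the final bytes and the argument layout come from the table.

```cpp
#include <string>

// Builds a single-argument CSI sequence: CSI <n> <final byte>.
std::string csi(int n, char final_byte) {
    return "\x1b[" + std::to_string(n) + final_byte;
}

// An omitted argument is treated as 1, so 1 is the default here.
std::string cursor_up(int n = 1) { return csi(n, 'A'); }        // CUU
std::string cursor_down(int n = 1) { return csi(n, 'B'); }      // CUD
std::string cursor_forward(int n = 1) { return csi(n, 'C'); }   // CUF
std::string cursor_backward(int n = 1) { return csi(n, 'D'); }  // CUB

// CUP takes two arguments separated by ';': row first, then column.
std::string cursor_position(int row, int column) {
    return "\x1b[" + std::to_string(row) + ";" + std::to_string(column) + "H";
}
```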

Use Carriage Return for Simple Progress Bar in Text Applications

Since most systems, including Unix, use LF (\n, 0x0A) as the newline character, the use of CR (\r, 0x0D) is quite rare. One of the few cases where CR is used on modern computers is when creating progress bars in text applications. Using CR allows for a simple implementation of progress bars in terminals. The code above draws a progress bar with # and ' ' (space). For convenience, I fixed the progress bar's length at 20 characters, adding one # for every 5% increase in progress. When using CR to draw a progress bar like this, there are three points to consider. The first point is to draw the progress bar on stderr instead of stdout. One of the significant differences between stdout and stderr is that stdout buffers output rather than immediately displaying it on the screen. Typically, stdout buffers output until it encounters a newline character. Therefore, if you print a progress bar without a newline character on stdout, the screen will not be updated unti...
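The rendering step can be sketched as a function that returns one frame of the bar. The name `render_progress`, the square brackets, and the trailing percentage are assumptions of this sketch; the 20-character width and the one-# per 5% rule follow the description above.

```cpp
#include <string>

// Renders one frame of a 20-character progress bar.
// The leading CR returns the cursor to the start of the line so the
// frame overwrites the previous one instead of starting a new line.
std::string render_progress(int percent) {
    if (percent < 0) percent = 0;
    if (percent > 100) percent = 100;
    const int width = 20;
    const int filled = percent / 5;  // one '#' for every 5%
    return "\r[" + std::string(filled, '#') + std::string(width - filled, ' ') +
           "] " + std::to_string(percent) + "%";
}
```

Each frame would then be written to stderr, for example with `std::cerr << render_progress(p);`, in line with the first point above.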

Handling Terminal Output with Termios

As I explained in the previous article, Unix-like operating systems such as OS X and Linux use LF (line feed, 0x0A, \n) as the newline character, which moves the cursor to the beginning of the next line. However, the standard-defined behavior of LF only moves the cursor down to the next line, not to the beginning of the line. This difference is acceptable if files are always accessed through operating system-dependent applications. However, Unix-like systems make no distinction between files and input/output, so this difference can be problematic when file and process input/output interact. To handle this difference, a terminal emulator post-processes the output appropriately. The c_oflag in the termios structure defined by the POSIX.1 standard controls this. The c_oflag is a set of flags describing what post-processing the terminal should perform before displaying the received characters. The most important flag in c_oflag is OPOST. This flag determines whether or not to post-pro...
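Toggling OPOST can be sketched with the POSIX termios calls. The two function names are hypothetical; `tcgetattr`, `tcsetattr`, `TCSANOW`, `c_oflag`, and `OPOST` are the standard POSIX names. The sketch assumes a POSIX system and a file descriptor that refers to a terminal.

```cpp
#include <termios.h>

// Returns a copy of the given settings with OPOST set or cleared.
termios with_output_processing(termios settings, bool enabled) {
    if (enabled) {
        settings.c_oflag |= OPOST;
    } else {
        settings.c_oflag &= ~static_cast<tcflag_t>(OPOST);
    }
    return settings;
}

// Applies the change to the terminal behind fd.
// Returns false if fd is not a terminal or a call fails.
bool set_output_processing(int fd, bool enabled) {
    termios settings;
    if (tcgetattr(fd, &settings) != 0) {
        return false;
    }
    settings = with_output_processing(settings, enabled);
    return tcsetattr(fd, TCSANOW, &settings) == 0;
}
```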

CR, LF, and CRLF

One of the confusing aspects for people working across multiple platforms is the newline character. Mac OS, Windows, and Linux all use different characters for newline. Even Mac OS behaves differently between older and newer versions. In this article, we will explore the reasons behind the different newline characters used across systems. According to the ISO 6429 standard, LF (line feed, \n) moves the cursor to the next line while maintaining the current column, and CR (carriage return, \r) moves the cursor to the beginning of the current line. To achieve the newline function, both CR and LF should be used together. This distinction was made to mimic the behavior of early printers and typewriters that separated the line-changing action from the action of moving the cursor to the beginning.

A
 B

For instance, a string "A\nB" should not result in B directly below A, but rather B should appear diagonally below A, like in the above example. However, systems usin...
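The strict meaning of CR and LF can be modeled with a tiny cursor simulation. This is a sketch only: `cursor_after` is a hypothetical function, positions are zero-based, and every other character is assumed to advance the cursor by one column.

```cpp
#include <string>
#include <utility>

// Returns the (row, column) of the cursor after printing text,
// following the strict definitions of LF and CR.
std::pair<int, int> cursor_after(const std::string& text) {
    int row = 0;
    int column = 0;
    for (char c : text) {
        if (c == '\n') {
            ++row;       // LF: next line, same column
        } else if (c == '\r') {
            column = 0;  // CR: beginning of the current line
        } else {
            ++column;    // printable character advances the cursor
        }
    }
    return {row, column};
}
```

With "A\nB", the B is printed at column 1 of the second row, so the cursor ends at column 2; with "A\r\nB", the B is printed at column 0 and the cursor ends at column 1.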

Understanding Escape Codes and Control Sequences

The exact term defined in ISO 6429 for what is usually called an escape code is a control function. Escape code is commonly used to refer to the code or sequence that represents control functions. Control codes defined by ISO 6429 are divided into two categories: C0 codes and C1 codes. C0 codes correspond to non-printing characters in ASCII. These codes are familiar to developers. They include line feed (\n), carriage return (\r), tab (\t), and the null character (\0). There are 32 codes in C1. They are represented by one-byte values between 0x80 and 0x9F. Unlike C0 codes, C1 codes are not defined in ASCII, so they cannot be sent directly in 7-bit environments that support only ASCII. Even in modern 8-bit environments, C1 codes are generally not used directly and are expressed as escape sequences instead. Therefore, most modern terminals use escape sequences to represent C1 codes when needed. An escape sequence is a series of characters that begins with ESC (0x1B). Sequences t...
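The 7-bit representation of a C1 code is ESC followed by the byte obtained by subtracting 0x40 from the C1 value, which is how 0x9B (CSI) becomes ESC [. A minimal sketch, with a hypothetical function name:

```cpp
#include <string>

// Returns the 7-bit escape sequence equivalent to a C1 control code:
// ESC (0x1B) followed by the C1 byte minus 0x40.
// Returns an empty string if the byte is outside the C1 range 0x80..0x9F.
std::string c1_as_escape_sequence(unsigned char c1) {
    if (c1 < 0x80 || c1 > 0x9F) {
        return std::string();
    }
    std::string sequence = "\x1b";
    sequence += static_cast<char>(c1 - 0x40);
    return sequence;
}
```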

Brief History of Escape Codes

When installing Linux on a computer, I always install a program called sl. This program displays a train when you execute sl. It is not a practical program but rather one that gives you time to think when you mistype the commonly used ls command in the terminal. Showing a train on the screen helps you calm down and avoid further mistakes when you are typing in a hurry. That's why I install this program. source: https://github.com/mtoyoda/sl The terminal is a program that receives and displays two streams, stdout and stderr, from a program. These outputs are sequential and typically flow from the top left to the bottom right. However, to draw new characters on an already-used part of the screen, a special method is needed. This special method is called escape codes. Escape codes are a kind of convention agreed upon with the terminal. Currently, these conventions follow the standards defined in ISO 6429. However, in the past, there was no unified consensus, and each term...