
CR, LF, and CRLF

One confusing aspect of working across multiple platforms is the newline character. Mac OS, Windows, and Linux have not always agreed on which characters mark a newline, and Mac OS itself behaves differently between older and newer versions. In this article, we will explore the reasons behind the different newline characters used across systems.

According to the ISO 6429 standard, LF (line feed, \n) moves the cursor to the next line while keeping the current column, and CR (carriage return, \r) moves the cursor to the beginning of the current line. To start a new line, both CR and LF must be used together. This distinction mimics early printers and typewriters, which separated the action of advancing to the next line from the action of returning to the beginning of the line.

A
 B

For instance, under the standard the string "A\nB" should not place B directly below A. Instead, B should appear diagonally below A, as in the example above. However, many systems do not use CRLF as their newline sequence. Storage is a relatively cheap resource nowadays, but in the past it was quite expensive, and system designers of that time considered spending two bytes on every newline an excessive cost. Consequently, some of them began using either LF or CR alone as the newline character.
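The strict cursor behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not how any real terminal is implemented; the function name `render` and the grid representation are assumptions made for this example.

```python
def render(text: str) -> list[str]:
    """Render text on a grid using strict CR/LF cursor semantics.

    Hypothetical helper for illustration only.
    """
    rows: list[list[str]] = [[]]
    row, col = 0, 0
    for ch in text:
        if ch == "\n":
            # LF: move down one line, keep the current column.
            row += 1
            if row == len(rows):
                rows.append([])
        elif ch == "\r":
            # CR: return to column 0 on the same line.
            col = 0
        else:
            line = rows[row]
            # Pad with spaces so the cursor column exists, then write.
            while len(line) <= col:
                line.append(" ")
            line[col] = ch
            col += 1
    return ["".join(r) for r in rows]


if __name__ == "__main__":
    print("\n".join(render("A\nB")))    # LF only: B lands one column to the right
    print("\n".join(render("A\r\nB")))  # CR + LF: B lands directly below A
```

With LF alone, B keeps the column the cursor had after writing A, which reproduces the diagonal layout; with CR followed by LF, B starts at the beginning of the next line.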

Amidst this, some systems adhered to the standard. One such operating system was CP/M, which used the combination of CR and LF as its newline. This choice was not simply a desire to follow the standard but a strategic decision to maintain compatibility with earlier remote terminal devices. In other words, the extra storage cost was deemed worthwhile to secure market dominance through backward compatibility. Some subsequent operating systems made the same choice, one of which was Microsoft's MS-DOS. This choice continues today, with Microsoft Windows using CRLF as its newline.

Among the operating systems that chose to reduce the newline to a single byte, preferences differed. Some designers chose LF as the newline character, while others picked CR. Multics, an operating system that later influenced the design of Unix and BSD, chose LF. Though Multics is no longer in use, Unix-like systems, including Linux, adopted LF and treat it as moving the cursor to the beginning of the next line.

Apple was a prominent supporter of using CR as the newline character. The Apple II, created in 1977, used CR for newline and simply ignored LF. This choice was carried over to Mac OS. However, with the creation of the POSIX.1 (Portable Operating System Interface) standard in 1988, Unix-like operating systems, later including Linux, began to prioritize compatibility with one another. As a result, Apple started to change as well. Eventually, in 2001, Mac OS X adopted LF as its newline character, making systems that use CR as the newline character increasingly rare. Currently, aside from some legacy Apple programs, only LF and CRLF remain in common use.
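Since files using any of the three conventions can still turn up, it may help to see how they can be told apart. The following is a minimal sketch in Python; the function names `count_newlines` and `normalize_to_lf` are hypothetical helpers written for this example, not part of any library.

```python
def count_newlines(data: bytes) -> dict[str, int]:
    """Count each newline convention in raw bytes (illustrative helper)."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair contains one CR and one LF, so subtract the pairs
    # to get the number of lone CR and lone LF bytes.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    return {"CRLF": crlf, "CR": cr, "LF": lf}


def normalize_to_lf(data: bytes) -> bytes:
    """Convert CRLF and lone CR to LF. CRLF must be replaced first,
    otherwise each pair would turn into two LFs."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


if __name__ == "__main__":
    sample = b"windows\r\nunix\nclassic mac\rend"
    print(count_newlines(sample))
    print(normalize_to_lf(sample))
```

Working on bytes rather than text is a deliberate choice here: when Python opens a file in text mode with the default settings, it translates all three conventions to \n on reading, which would hide the difference this sketch is trying to show.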
