
CR, LF, and CRLF

One confusing aspect of working across multiple platforms is the newline character. Mac OS, Windows, and Linux have historically used different characters to mark a newline, and even Mac OS behaves differently between older and newer versions. In this article, we will explore why different systems ended up with different newline characters.

According to the ISO 6429 standard, LF (line feed, \n) moves the cursor down to the next line while keeping the current column, and CR (carriage return, \r) moves the cursor to the beginning of the current line. To get a full newline, CR and LF must be used together. The distinction mimics early printers and typewriters, which separated the action of advancing to the next line from the action of returning to the start of the line.

A
 B

For instance, under this strict interpretation, the string "A\nB" should not place B directly below A; B should appear diagonally below A, as in the example above. However, few systems actually adopted the two-character CRLF sequence as their newline. Storage is relatively cheap nowadays, but it used to be quite expensive, and system designers of that time considered spending two bytes on every newline excessive. Consequently, some of them began using either LF or CR alone as the newline character.
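The strict reading of CR and LF described above can be illustrated with a small simulation. This is a minimal sketch, not a real terminal emulator: the `render` function and its screen model (a list of lines that grows as needed) are hypothetical and exist only for this illustration.

```python
def render(text: str) -> list[str]:
    """Render text on an imaginary screen using strict CR/LF semantics."""
    rows: list[list[str]] = [[]]
    row, col = 0, 0
    for ch in text:
        if ch == "\n":
            # LF: move down one line and keep the current column.
            row += 1
            if row == len(rows):
                rows.append([])
        elif ch == "\r":
            # CR: return to the first column of the current line.
            col = 0
        else:
            line = rows[row]
            # Pad with spaces if the cursor sits beyond the end of the line.
            while len(line) <= col:
                line.append(" ")
            line[col] = ch
            col += 1
    return ["".join(line) for line in rows]


print(render("A\nB"))    # ['A', ' B']  (B lands diagonally below A)
print(render("A\r\nB"))  # ['A', 'B']   (B lands directly below A)
```

With LF alone, B keeps the column the cursor reached after printing A; only the CR and LF pair brings it back to the first column of the next line.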

Amid this, some systems adhered to the standard. One such operating system was CP/M, which used the combination of CR and LF as its newline. This choice was not simply a desire to follow the standard but a strategic decision to maintain compatibility with earlier remote terminal devices. In other words, the extra storage cost was deemed worthwhile to secure market dominance through backward compatibility. Some later operating systems made the same choice, one of which was Microsoft's MS-DOS. The choice persists today: Microsoft Windows still uses CRLF as its newline.

The operating systems that reduced the newline to a single byte did not agree on which byte to keep. Some designers chose LF, while others picked CR. Multics, an operating system that later influenced the design of Unix and BSD, chose LF. Though Multics is no longer in use, Unix-like systems, including Linux, adopted LF as well, and they move the cursor to the beginning of the next line when they encounter it.

Apple was a prominent supporter of using CR as the newline character. The Apple II, created in 1977, used CR for newline and simply ignored LF. This choice was carried over to Mac OS. However, with the creation of the POSIX.1 (Portable Operating System Interface) standard in 1988, Unix-like operating systems, which later included Linux, began to prioritize compatibility with one another. As a result, Apple started to change as well. Eventually, in 2001, OS X adopted LF as its newline character, making systems that use CR as the newline character increasingly rare. Currently, aside from some legacy Apple programs, only LF and CRLF remain in common use.
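Because files written under all three conventions still circulate, it can help to check which one a file uses before processing it. The following is a minimal sketch that operates on raw bytes; the function names `count_newlines` and `normalize_to_lf` are hypothetical helpers written for this article, not part of any library.

```python
def count_newlines(data: bytes) -> dict[str, int]:
    """Count how often each newline convention appears in data."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        # A lone LF is any LF that is not part of a CRLF pair.
        "LF": data.count(b"\n") - crlf,
        # A lone CR is any CR that is not part of a CRLF pair.
        "CR": data.count(b"\r") - crlf,
    }


def normalize_to_lf(data: bytes) -> bytes:
    """Convert CRLF and lone CR newlines to LF."""
    # Replace CRLF first so that its CR is not converted a second time.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


sample = b"windows\r\nunix\nold mac\rend"
print(count_newlines(sample))   # {'CRLF': 1, 'LF': 1, 'CR': 1}
print(normalize_to_lf(sample))  # b'windows\nunix\nold mac\nend'
```

Working on bytes rather than text avoids any newline translation that a language's text-mode file handling might perform on its own.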
