Ksh mystery: how are newlines represented in files?

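In summary, print turns the two-character sequence \n into a real newline when it writes the file, so a variable read back from the file holds an actual newline character that the pattern in ${var//\\n/} does not match. Post #2 works around this with the cat -E option; post #3 suggests deleting the newline with tr.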
  • #1
gnome
Please take a look at these lines which demonstrate my problem in a korn shell terminal:

$ var="abc\ndef"
$ print "$var"
abc
def
$ print "$var" > file1
$ var=${var//\\n/}
$ print $var
abcdef
$ var=$(<file1)
$ print $var
abc
def
$ var=${var//\\n/}
$ print $var
abc
def
$

First I defined var with a newline embedded in it. I printed it and, as expected, got
abc
def.
I printed the same thing to file1.

Next, I edited it using the pattern operator ${var//\\n/} to remove the newline, printed it and again got what I expected. Now var is
abcdef

Then, I replaced var with the contents of file1: var=$(<file1)
Printed it, and again there's
abc
def

So far, so good.
Now, I try to edit it again with exactly the same command var=${var//\\n/}
But it has no effect. Var still prints as
abc
def

What's going on here? Is the newline represented differently in file1? How? How can I edit it out?

One other observation:
I noticed that in the FIRST instance, after defining var="abc\ndef", if I entered
print $var
I got
abc def
Only by entering
print "$var"
would I get
abc
def

But after reading it back in from the file, it prints as
abc
def
whether I enter
print "$var"
or
print $var
 
  • #2
Well, I found a solution using the cat -E option that gives me a way to edit out the mystery newline character without knowing exactly what it is. For example:

$ var="abc\ndef"
$ print "$var"
abc
def
$ print "$var" > file1
$ cat file1
abc
def
$ var=$(cat -E file1)
$ print "$var"
abc$
def$
$ var=${var//\$?/}
$ var=${var//\$*/}
$ print "$var"
abcdef

But if anybody knows what that character is & how to edit it out of a variable without going through this merry-go-round procedure, please let me know.
 
  • #3




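In the Korn shell, var="abc\ndef" does not put a newline into the variable. It stores the two literal characters backslash and n, and it is print that interprets \n as an escape sequence and writes a real newline (print -r turns that interpretation off). This is why ${var//\\n/} worked the first time: the doubled backslash escapes the backslash, so the pattern matches the literal two-character sequence \n.

However, print has already converted that sequence when it wrote file1, so the file contains an actual newline character (line feed, hex 0A). When you read the file back with var=$(<file1), the variable holds that real newline, and the pattern \\n no longer matches anything.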
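One way to check this is to compare the length of the value before and after the round trip through the file. The following is a minimal sketch, not the original poster's session: it uses printf '%b' in place of print so that it also runs in bash, and it works in a scratch directory (an assumption added here) so that an existing file1 is not overwritten.

```shell
# Work in a scratch directory so an existing file1 is left alone.
cd "$(mktemp -d)" || exit 1

var="abc\ndef"              # backslash and n are stored literally
len_before=${#var}          # counts a b c \ n d e f

# %b expands backslash escapes, much as ksh's print does by default.
printf '%b\n' "$var" > file1

var=$(cat file1)            # same value as var=$(<file1) in the thread
len_after=${#var}           # counts a b c <newline> d e f

printf 'before=%s after=%s\n' "$len_before" "$len_after"
```

The value is one character shorter after the round trip because the two characters \n have become a single line feed.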

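To remove the newline from the variable after reading it from the file, you can use the tr command to delete the newline character. For example, var=$(tr -d '\n' < file1) reads the file with every newline removed. (The $(<file1) shorthand cannot be combined with a pipe, so the redirection has to be attached to tr.) In ksh93 you can also edit the variable directly by matching the real newline with ANSI-C quoting: var=${var//$'\n'/}.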
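Both ways of removing a real newline can be sketched as follows. This assumes a shell that supports $'...' quoting and the ${var//pattern/} operator (ksh93 and bash do); the names stripped and via_tr are illustrative only.

```shell
var=$'abc\ndef'                 # ANSI-C quoting: the value holds one real newline

# Pattern substitution against the real newline character.
stripped=${var//$'\n'/}

# Alternative with an external command: tr deletes every newline it reads.
via_tr=$(printf '%s' "$var" | tr -d '\n')

printf '%s\n' "$stripped" "$via_tr"
```

The pattern-substitution form avoids starting another process, while the tr form also works when the data is read straight from a file.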

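As for the difference in printing with and without quotes: an unquoted $var undergoes field splitting on the characters in IFS (space, tab and newline by default), and print then joins the resulting words with single spaces, so a real newline in the value comes out as a space. With quotes, "$var" is passed as a single word and its newline is preserved.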
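The effect of quoting can be seen by counting the words the shell produces. This is a small sketch that assumes the default IFS and a shell with $'...' quoting; it uses echo rather than print so that it also runs in bash, and it overwrites the positional parameters.

```shell
var=$'abc\ndef'        # value contains one real newline

set -- $var            # unquoted: split on IFS into separate words
unquoted_words=$#

set -- "$var"          # quoted: passed through as a single word
quoted_words=$#

joined=$(echo $var)    # echo joins its arguments with single spaces

printf 'unquoted=%s quoted=%s joined=%s\n' "$unquoted_words" "$quoted_words" "$joined"
```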

I hope this helps clarify the mystery of newlines in files for you.
 

1. What is a Ksh mystery and how does it relate to newlines in files?

The "Ksh mystery" in this thread is why a pattern that removes \n from a KornShell (ksh) variable stops working after the value has been written to a file and read back. The answer turns on how the newline is represented: as the two characters backslash and n in the original string, and as a single line feed character in the file.

2. Why is it important to understand how newlines are represented in files?

Understanding how newlines are represented in files is important because it can affect the functionality and compatibility of a program. For example, if a file was created on a Windows system with "CRLF" (carriage return + line feed) newlines, but is read on a Unix system that uses "LF" (line feed) newlines, it may cause errors or unexpected behavior.

3. How are newlines typically represented in files?

Newlines are represented differently depending on the operating system. In general, Windows systems use "CRLF" (carriage return + line feed) while Unix and Linux systems use "LF" (line feed). Some programming languages, such as Python, offer a "universal newlines" mode that translates any of these line endings to "\n" when a text file is read.
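To see which representation a piece of text actually uses, the bytes can be dumped in hex with od. A minimal sketch with made-up sample text; the exact spacing of od's output varies between systems, so none is shown here.

```shell
# Dump the same two lines with LF and with CRLF line endings.
lf_hex=$(printf 'abc\ndef\n' | od -An -tx1)
crlf_hex=$(printf 'abc\r\ndef\r\n' | od -An -tx1)

# 0a is line feed; 0d is carriage return and appears only in the CRLF dump.
printf 'LF:   %s\nCRLF: %s\n' "$lf_hex" "$crlf_hex"
```

Running od -An -tx1 file1 on the file from the thread would show in the same way that print wrote a single 0a byte between abc and def.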

4. Can newlines be changed or converted in a file?

Yes, newlines can be changed or converted in a file using various methods. One way is to use a text editor that allows you to specify the type of newline to use when saving the file. Another way is to use a command-line tool or programming language to convert the newlines in a file from one type to another.
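As one example of the command-line route, tr can strip the carriage returns from a CRLF file. This is a sketch only: dos.txt and unix.txt are hypothetical file names, created in a scratch directory so that nothing existing is touched.

```shell
cd "$(mktemp -d)" || exit 1

printf 'abc\r\ndef\r\n' > dos.txt        # sample file with CRLF line endings
tr -d '\r' < dos.txt > unix.txt          # delete every carriage return

# Byte counts before and after; arithmetic expansion trims any padding from wc.
dos_bytes=$(( $(wc -c < dos.txt) ))
unix_bytes=$(( $(wc -c < unix.txt) ))

printf 'dos=%s unix=%s\n' "$dos_bytes" "$unix_bytes"
```

Note that tr -d '\r' removes every carriage return, including any that are not part of a line ending, which is usually acceptable for plain text.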

5. How can I prevent newline issues when working with files?

To prevent newline issues when working with files, it is best to be aware of the default newline representation used by your operating system and programming language. If sharing files between different systems, it may be necessary to convert the newlines to ensure compatibility. Using a language or library that offers universal newline handling can also help prevent these issues.
