Multiple patterns in SED delete

SUMMARY

This discussion covers using sed to delete empty lines (the "starts with a newline" case) and lines that begin with a tab, a forward slash, a single space, or an equals sign. The user tried `sed -n '/^\(\t\|\n\|/\| \|=\)$/d' OUTPUT.txt > test.txt`, which fails in part because sed reads input one line at a time, so `\n` never matches at the start of a line, and because `\t` is not recognized by every sed implementation. The recommended solution is egrep: `egrep -v '(^$|^<TAB>|^ |^=|^/)' somefile > newfile`, where `<TAB>` stands for a literal tab character. A POSIX character class such as `^[[:space:]]` is also suggested for broader whitespace matching.

PREREQUISITES
  • Familiarity with SED command syntax
  • Understanding of regular expressions (regex)
  • Knowledge of egrep and its usage
  • Basic command line skills in Linux or Windows
NEXT STEPS
  • Research the differences between SED and egrep for text processing
  • Learn about POSIX character classes in regex
  • Explore advanced SED commands and their applications
  • Practice writing regex patterns for various text manipulation tasks
USEFUL FOR

This discussion is beneficial for system administrators, developers, and anyone involved in text processing or data cleaning tasks using command line tools like SED and egrep.

swartzism
I would like to delete all lines starting with the following

- \t (tab)
- \n (newline)
- /
- (single space)
- =

I've tried `sed -n '/^\(\t\|\n\|/\| \|=\)$/d' OUTPUT.txt > test.txt` and combos of that to no avail. What am I doing wrong?
 
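Without special syntax, sed will not treat \n the way you want: it reads the input one line at a time and strips the trailing newline, so \n never matches at the start of a line. \t is also not portable: GNU sed accepts it, but other sed implementations may not, so the safe approach is to press the Tab key and enter a literal tab, which looks like a run of spaces.
egrep syntax is easier for me at the moment, and you can use the same regex. Note that the run of whitespace in the first alternative below is meant to be a literal tab, not spaces.
^$ is the regex for an empty line, which is what "starts with \n" amounts to; the | symbol is alternation.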
Code:
egrep -v '(^$|^   |^ |^=|^/)' somefile > newfile
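Because a literal tab is easy to lose when copying a command, here is a minimal sketch of the same filter with the tab generated by printf instead of typed. The file name sample.txt and its contents are made up for illustration, and grep -E is used in place of egrep (they accept the same patterns; recent GNU grep releases flag egrep as obsolescent).

```shell
# Hypothetical sample input: two lines to keep, five lines to drop.
printf 'keep me\n\tstarts with tab\n\n/starts with slash\n starts with space\n=starts with equals\nalso keep\n' > sample.txt

# Generate a real tab character rather than typing one.
tab=$(printf '\t')

# -v prints only the lines that do NOT match; -E selects extended regex.
# \$ keeps the shell from expanding $ inside the double quotes.
kept=$(grep -Ev "^\$|^${tab}|^ |^=|^/" sample.txt)
printf '%s\n' "$kept"
```

Only the two lines that start with an ordinary character should survive the filter.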

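You can also use POSIX character classes like
Code:
^[[:space:]]
to match a tab or a space at the start of a line. The class name has to sit inside a bracket expression, hence the double brackets. [[:space:]] also covers other whitespace such as carriage returns; [[:blank:]] matches only tab and space.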
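For anyone who still wants to stay with sed, the following is a rough sketch rather than the thread's recommended answer. It avoids three problems in the original attempt: -n suppresses all output when the only command is d, the unescaped / ends the regex early, and the trailing $ would require the whole line to be a single character. The file names come from the question; the sample contents are hypothetical.

```shell
# Hypothetical sample input using the file name from the question.
printf 'keep me\n\tstarts with tab\n\n/starts with slash\n starts with space\n=starts with equals\nalso keep\n' > OUTPUT.txt

# No -n: lines that are not deleted are printed as usual.
# /^$/d             deletes empty lines
# /^[[:blank:]=]/d  deletes lines starting with a tab, a space, or =
# /^\//d            deletes lines starting with / (escaped, since / is the delimiter)
sed -e '/^$/d' -e '/^[[:blank:]=]/d' -e '/^\//d' OUTPUT.txt > test.txt

cat test.txt
```

Splitting the work into three small expressions keeps each pattern readable and sidesteps the `\|` alternation, which is a GNU extension in basic regular expressions.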
 
Do you want to do this on Windows or Linux? As jim mcnamara indicated, sed is not the best choice here; it brings more problems than you might expect.
 
