Last updated on February 22nd, 2022 at 11:29 am

Today I would like to share a way to skip blank lines and lines that have a ‘#’ at the start (or after one or more white-space characters, i.e. spaces, tabs etc.):

while ( <$f> ) {
    next if /^\s*($|#)/;
    do_something_with_the_line( $_ );
}

The part between the slashes is the regular expression. It is matched against the default variable ‘$_’, which in this case holds the line you just read from the file.

The ‘^’ at the start says that the comparison is to start at the very beginning of the line. The ‘\s*’ matches an unspecified number of white-space characters (between 0 and as many as there are). The ‘$|#’ means: either the end of the line (‘$’) or a ‘#’ character, with the ‘|’ being the or operator. Thus the whole line can be read as: next if the line just read starts with zero or more white-space characters, followed by the end of the line or a ‘#’ character.
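
To see the pattern at work, here is a small sketch that runs it against a handful of sample strings. The helper name is_skippable and the sample lines are made up for illustration. The regular expression is the one from above, written with the /x modifier so that each piece can carry its own comment.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 for blank lines and comment lines, 0 otherwise.
sub is_skippable {
    my ($line) = @_;
    return $line =~ /
        ^       # start of the line
        \s*     # zero or more white-space characters
        (       # group the two alternatives
            $   # end of the line
          |     # or
            \#  # a literal hash sign (escaped because of the x modifier)
        )
    /x ? 1 : 0;
}

# Made-up sample lines: blank, white space only, comment, indented comment, code.
my @samples = ( "\n", "   \n", "# comment\n", "  # indented comment\n", "code # trailing\n" );

for my $sample (@samples) {
    chomp( my $shown = $sample );
    printf "%-24s %s\n", "<$shown>", is_skippable($sample) ? 'skip' : 'keep';
}
```

The first four samples are reported as "skip". The last one is kept, because its ‘#’ comes after other text rather than after leading white space only.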

The parentheses around the ‘$|#’ are necessary because

/^\s*$|#/

would mean: match if the line either just contains zero or more white space characters (i.e. it’s a blank line) or if there’s a ‘#’ to be found anywhere within the line.
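
The difference is easy to check with a line that has real content followed by a trailing comment. The following is only a sketch; the variable names and the sample line are invented for the purpose.

```perl
use strict;
use warnings;

my $grouped   = qr/^\s*($|#)/;   # the pattern with parentheses
my $ungrouped = qr/^\s*$|#/;     # the same pattern without them

# Hypothetical sample: a statement followed by a trailing comment.
my $line = "my \$x = 1;  # set x\n";

print "grouped:   ", ( $line =~ $grouped   ? "skip" : "keep" ), "\n";
print "ungrouped: ", ( $line =~ $ungrouped ? "skip" : "keep" ), "\n";
```

The grouped pattern keeps the line, since the ‘#’ is not the first non-blank character. The ungrouped one would skip it, because its second alternative matches a ‘#’ anywhere in the line.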

Now let us take a look at the actual code. I have a file (master.txt) that contains an empty line, and its third line starts with a #:

$ cat master.txt
This is first
second and third
#fourth and fifth
sixth and seventh
eight and nine

tenth

As a next step, I created a Perl script named read_line.pl that applies the above logic:

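#!/usr/bin/perl
use strict;
use warnings;

my $file = "master.txt";
# Three-argument open: the file name is never interpreted as a mode
open my $info, '<', $file or die "Could not open $file: $!";
while ( my $line = <$info> ) {
    # Skip blank lines and lines whose first non-blank character is '#'
    next if $line =~ /^\s*($|#)/;
    print $line;
}
close $info;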

Let us execute the script. As you can see, the empty line between “eight and nine” and “tenth” is gone. The third line is gone as well: the script skips the whole line because it starts with a #, it does not just strip the # character.

$ ./read_line.pl
This is first
second and third
sixth and seventh
eight and nine
tenth
