Nice Things About Perl – Searching Gzip Files

Recently, I needed to search for a string across log files that had been compressed with gzip. The gzipped log files were spread across different directories and nested subdirectories. Uncompressing every file and then running grep was not an attractive option, so the first thing that came to mind was zgrep.

Zgrep works like grep, but it can also search compressed files. I thought this was pretty neat, until I needed to perform a more complex search using regular expressions.

For some reason, I couldn’t get zgrep to handle regex constructs like \s+ or $ (the end-of-line anchor). Rather than spend more time researching and experimenting with possible tweaks, I decided to just use Perl and get it over with.

Below is my crude implementation of zgrep with full Perl regex capability.

for i in $(find /var/log -name "*.gz"); do export file="$i"; gzip -dc "$i" | perl -ne 'print "$ENV{file}: $_" if /write\s+failure/'; done

It’s still a one-liner, although I agree it looks ugly. Once you figure out how it works, though, you’ll realize it’s really simple.

The find command lists all the gzip files under the directory you want to search. If you don’t want to search subdirectories, you can supply the -maxdepth option. The for loop iterates over each gzip file, and gzip -dc decompresses it to stdout instead of writing the uncompressed data to a file. That output is piped to the Perl script, where the string between the forward slashes is the regex pattern to search for. $ENV{file} is how the script reads the name of the gzip file currently being processed, which the loop exports as the environment variable file, so that the name can be printed alongside each matching line. Without it, you wouldn’t know which gzip file a match came from.

Given access to Perl’s regex, you now have a powerful search tool.


3 thoughts on “Nice Things About Perl – Searching Gzip Files”

  1. I like the use of the $ENV{file} !!

    I do like this, but as you say, it is ugly. I don’t mean ugly to look at; it is ugly because it relies on for/done, find and gzip.

    What would be nice would be to convert it to work entirely in Perl. It would be one hell of a one-liner, but this article is not really about one-liners; it is about ‘Nice things about Perl’.

    It would take just a few lines of Perl code and it would make for a great script as, like you, I’ve often needed to search my gzipped log files.

    I’ll put it in my Todos!

    Thanks for your idea!

