TheGeekery

The Usual Tech Ramblings

More Bash Magic...

While this really isn’t a huge amount of magic, it shows how powerful a simple shell script can be. It’s also probably an example of inefficiency, and there are probably much better ways of handling the problem.

One of the helpdesk guys reported an issue with a third-party vendor trying to send us pictures of vehicles. They claim that they have been sending pictures, but that we’re not processing them. So I decided to take a look at the logs. I’d been given a bunch of VINs to track down, and over 2GB of log files. This wasn’t going to be a fun job.

I first started off with TextPad, which has a “Search in Files” option. This would have been okay if I was handling a VIN or two, but I’d been given 40 to process. While TextPad does have a scripting/macro language built into it, it’d probably take me more time to work out how to record the macro and loop through the VINs than it would to write a quick script.

As with my last bash magic task, it requires the use of a simple loop, or two in this case. To start, I dropped all the VINs into a text file called “vins” (with no extension on purpose). Each VIN was entered on a new line.

for vin in $(cat vins);
do
  : # the body of the loop goes here
done

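This little snippet reads the contents of the “vins” file, and assigns one entry to the variable vin on each run through the loop. Strictly speaking the shell splits the file on any whitespace rather than on lines, but as VINs contain no spaces the result is the same. The next bit is to set a marker of some sort, to show whether we find it or not.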

  VCNT=0
  echo -n "${vin}"

VCNT is going to be our line counter, and using echo -n prints the VIN without a trailing newline, so that the result of the search can be appended to the same line later. The next step is to start going through the log files.

  for log in *.txt;
  do
    : # the search for the VIN goes here
  done

This simply takes all the .txt files in the current directory, and puts each name in turn into the variable log. This is where the fun stuff comes in. I started to use some external calls, storing the output in the VCNT variable.

    VCNT=$(grep -i "${vin}" "${log}" | wc -l)

This line executes the grep command, which looks for the VIN in the log file (the -i switch makes the match case-insensitive). wc -l simply counts the number of lines returned. If it’s more than 0, we know a record was found in that log file. As I’m processing over 2GB of log files, I decided to bail out at the first instance of a VIN, and not check any other logs for the same VIN.

    if [ ${VCNT} -ne 0 ]
    then
      echo " - ${log}"
      continue 2
    fi

The above code basically checks whether VCNT is something other than 0. If it is, it outputs “ - logname”. This will be appended to the end of the VIN from the earlier echo statement because I used the -n switch. continue skips the rest of the current pass through a loop and resumes from the top with the next item. The 2 makes it apply to the second enclosing loop, counting outwards. This means that it abandons the log loop, and goes back up to the VIN loop for the next VIN.
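To see continue 2 on its own, here is a minimal stand-alone sketch. The VINs (AAA, BBB) and the log names (one, two, three) are made up for the demo, and the grep call is replaced by a hard-coded test, so nothing here touches real files.

```shell
#!/usr/bin/env bash
# Stand-alone demo of "continue 2"; the VINs and log names are made up.

scan() {
  local vin log
  for vin in AAA BBB; do
    printf '%s' "${vin}"
    for log in one two three; do
      # Pretend that only AAA appears, and only in the log called "two".
      if [ "${vin}" = "AAA" ] && [ "${log}" = "two" ]; then
        echo " - ${log}"
        continue 2   # abandon the log loop and start on the next VIN
      fi
    done
    # Reached only when the log loop finished without a match.
    echo " - not found"
  done
}

output=$(scan)
echo "${output}"
```

This prints “AAA - two” followed by “BBB - not found”. Log “three” is never checked for AAA. With a plain break in place of continue 2, the inner loop would still stop early, but the “ - not found” line would then be printed for AAA as well.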

Now, binding it all together, I ended up with the following:

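for vin in $(cat vins);
do
  VCNT=0
  echo -n "${vin}"

  for log in *.txt;
  do
    # Count the lines in this log that mention the VIN (case-insensitive)
    VCNT=$(grep -i "${vin}" "${log}" | wc -l)
    if [ ${VCNT} -ne 0 ]
    then
      echo " - ${log}"
      # Found it: skip the remaining logs and move on to the next VIN
      continue 2
    fi
  done

  # Only reached when none of the logs contained the VIN
  echo " - not found"

done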

This will output something like this:

  123123123123123 - not found
  321321321321321 - 4_1_log.txt

Total execution time to process about 40 VINs was about 25 minutes, which is probably a lot quicker than trying to process them using TextPad.
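As for those better ways, one possible variation is sketched below. It swaps grep piped into wc -l for grep -q, which stops reading a log at the first match instead of counting every matching line, and it reads the “vins” file with a while read loop. The sample VINs and log contents are invented, and the script builds them in a temporary directory so it can be run anywhere. It has not been timed against the real 2GB of logs, so treat any speed-up as an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: the sample data below is invented and lives in a temp directory.
workdir=$(mktemp -d)
trap 'rm -rf "${workdir}"' EXIT
cd "${workdir}" || exit 1

printf '%s\n' 123123123123123 321321321321321 > vins
printf 'received image for vin 321321321321321\n' > 4_1_log.txt
printf 'nothing interesting here\n' > 4_2_log.txt

find_vins() {
  local vin log
  # read -r takes one whole line at a time; the extra test copes with a
  # final line that has no trailing newline.
  while IFS= read -r vin || [ -n "${vin}" ]; do
    [ -z "${vin}" ] && continue          # ignore blank lines
    for log in *.txt; do
      # -q: quit at the first match, -i: ignore case, -F: plain string
      if grep -qiF -- "${vin}" "${log}"; then
        echo "${vin} - ${log}"
        continue 2                       # on to the next VIN
      fi
    done
    echo "${vin} - not found"
  done < vins
}

output=$(find_vins)
echo "${output}"
```

With the sample data this reports 123123123123123 as not found and 321321321321321 as found in 4_1_log.txt. Because grep -q reports the result through its exit status, the VCNT counter is no longer needed.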
