pkgdiff and pkggrep:

Two useful shell scripts authored by Gary Kerbaugh

Download these here. Please note that there are two files in this tarball, pkgdiff and pkggrep, although the tar file itself is just called pkggrep.tgz.

Brief example of how to use pkgdiff to see if Apple's X11 header files were installed properly (a common problem):

% pkgdiff /usr/X11R6/include
You have everything X11SDK.pkg installed in /usr/X11R6/include
You have everything X11SDK.pkg installed in /usr/X11R6/include/DPS
You have everything X11SDK.pkg installed in /usr/X11R6/include/GL
You have everything X11SDK.pkg installed in /usr/X11R6/include/X11
You have everything X11SDK.pkg installed in /usr/X11R6/include/X11/ICE
You have everything X11SDK.pkg installed in /usr/X11R6/include/X11/PM
You have everything X11SDK.pkg installed in /usr/X11R6/include/X11/SM
You have everything X11SDK.pkg installed in /usr/X11R6/include/X11/Xaw
You have everything X11SDK.pkg installed in /usr/X11R6/include/X11/Xcursor
You have everything X11SDK.pkg installed in /usr/X11R6/include/X11/Xft
You have everything X11SDK.pkg installed in /usr/X11R6/include/X11/Xmu
You have everything X11SDK.pkg installed in /usr/X11R6/include/X11/extensions
You have everything X11SDK.pkg installed in /usr/X11R6/include/X11/fonts
You have everything X11SDK.pkg installed in /usr/X11R6/include/X11/fonts/codeconv
You have everything X11SDK.pkg installed in /usr/X11R6/include/fontconfig
You have everything X11SDK.pkg installed in /usr/X11R6/include/freetype2
You have everything X11SDK.pkg installed in /usr/X11R6/include/freetype2/freetype
You have everything X11SDK.pkg installed in /usr/X11R6/include/freetype2/freetype/cache
You have everything X11SDK.pkg installed in /usr/X11R6/include/freetype2/freetype/config
You have everything X11SDK.pkg installed in /usr/X11R6/include/freetype2/freetype/internal

You have everything X11User.pkg installed in /usr/X11R6/include
You have everything X11User.pkg installed in /usr/X11R6/include/X11
You have everything X11User.pkg installed in /usr/X11R6/include/X11/bitmaps

Gary's description:

Pkgdiff takes a directory (or directories) as an argument and compares the actual contents with contents that would be expected on the basis of the bom (bill of materials) files of the package receipts in /Library/Receipts. It reports missing files and extra files installed by the user or packages that don't have bom files. It also reports files that have changed since installation, based on their size, permissions and modification date. That's it; that's all you really need to know. How complicated could that be? The answer is that the script is rather complex, possibly the most complex I've written. However, those not interested in the gory details of how it works need not read on. Just grab the script, mark in your calendar this glorious day beginning the rest of your life and bask in the reverie of the newfound power of your computer. Just don't invoke pkgdiff in the root of your hard drive unless you don't need your computer for a couple of days. :-)

A month ago, I mentioned in a thread here that the pkgdiff script was done but I've spent the time since refining and optimizing. I essentially wrote two new scripts based on the original. One grouped the output by directory, did the work on each directory and then grouped that work by installer. The other grouped the output by installer, queried each installer in turn and then grouped the results by directory. Both have been optimized in a myriad of ways and sped up dramatically. However, no matter how much optimizing I did, the second version is still eight times faster. The final script is a combination of the two with the latter version being the default mode and the former available with the "-d" option, since I find that logical organization more useful. Since it's faster, the default version compares modification times of files while the former doesn't.

This could take an hour to run [NB: the above example took 2 minutes on my G4 --wgs] (and will take much longer if you aim it at a large directory) but the script contains quite a variety of optimizations. It uses diff, comm and sort to do a lot of comparisons between large lists but I really had to write some awk scripts to get the lines comparable to the character. The script also uses awk to manipulate large lists in long loops but it minimizes that work by paring down the lists as matches are found. Taking advantage of that, it checks installers from newest to oldest so that older versions of the same file are never considered.
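[NB: The "pare down the list as matches are found" idea can be sketched roughly as below. This is not code from pkgdiff. The function name attribute_files and the plain one-path-per-line input files are assumptions made for illustration; the real script works from bom files and compares more than path names.]

```shell
#!/bin/sh
# Hypothetical sketch, not taken from pkgdiff.
# $1 is a file listing the paths actually present on disk, one per line.
# Every later argument is a file listing the paths one package claims,
# given newest package first.
attribute_files() {
    remaining=$(mktemp) || return 1
    claimed=$(mktemp) || return 1
    sorted=$(mktemp) || return 1

    # comm needs both inputs sorted in the same collation order.
    LC_ALL=C sort -u "$1" > "$remaining"
    shift

    for receipt in "$@"; do
        LC_ALL=C sort -u "$receipt" > "$sorted"
        # Lines common to both lists: files on disk this package accounts for.
        LC_ALL=C comm -12 "$remaining" "$sorted" > "$claimed"
        awk -v tag="$receipt" '{ print tag ": " $0 }' "$claimed"
        # Pare the working list down so older packages never see these paths.
        LC_ALL=C comm -23 "$remaining" "$claimed" > "$remaining.next"
        mv "$remaining.next" "$remaining"
    done

    # Whatever is left was not claimed by any package.
    sed 's/^/unclaimed: /' "$remaining"
    rm -f "$remaining" "$claimed" "$sorted"
}
```

Called as "attribute_files disk.txt new.txt old.txt", a path listed in both new.txt and old.txt is reported only against new.txt, because it has already been removed from the working list by the time old.txt is examined. Each pass also has a shorter list to compare, which is where the time saving would come from.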

The "core engine" of the script has been factored into a function and completely rewritten. The entries in individual fields need to be compared, which suggests awk. However, I needed to input two lists and output at least four lists. I tried some fancy interprocess communication using named pipes. The file version of this worked but using named pipes resulted only in deadlocks. Then it dawned on me that I could "serialize" the output. Now, I dump all of the lists to STDOUT, separated by "markers". Then I filter the output into the appropriate list variables with sed filters, as many as there are lists. That actually sped up the process but I couldn't resist the power of the mechanism so I refined the lists to group files that differ by time, size or permissions only. I also added options to allow the user to turn off specified checks. The "core engine" now writes the awk script "on the fly", putting in only code necessary to do the comparisons the user requests. The net result of these changes is a 25% drop in speed. However, this new version is so powerful and flexible, I went with it.
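[NB: A minimal sketch of the "serialize with markers, then filter with sed" mechanism described above. It is not the actual core engine. The marker strings (@@MISSING and so on), the function names classify and extract, and the two-column "path size" input format are all assumptions made for illustration.]

```shell
#!/bin/sh
# Hypothetical sketch, not taken from pkgdiff.
# One awk pass reads two lists of "path size" lines and writes three
# result lists to stdout, each introduced by a marker line.
classify() {
    # $1 = expected list, $2 = actual list
    awk '
        FILENAME == ARGV[1] { want[$1] = $2; next }   # expected entries
        { have[$1] = $2 }                             # actual entries
        END {
            print "@@MISSING"
            for (p in want) if (!(p in have)) print p
            print "@@EXTRA"
            for (p in have) if (!(p in want)) print p
            print "@@CHANGED"
            for (p in want) if ((p in have) && want[p] != have[p]) print p
        }
    ' "$1" "$2"
}

# Pull one list out of the serialized stream: select the lines from the
# named marker up to the next marker (or end of input) and print only
# the ones that are not markers themselves.
extract() {
    sed -n "/^@@$1\$/,/^@@/{/^@@/!p;}"
}
```

The stream is captured once and then filtered once per list, for example out=$(classify expected.txt actual.txt) followed by missing=$(printf '%s\n' "$out" | extract MISSING). That is one sed filter per list, and no named pipes are involved, so nothing can deadlock. Note that awk does not promise any particular order for "for (p in ...)", so a list with several entries would want a sort afterwards.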

As with all of my recent code, these scripts function as expected when used in pipes. Please be tolerant of the "speed" of these scripts. They manipulate vast quantities of information; basically sifting through everything you installed on your machine.

-- Gary Kerbaugh


