Unix and OS X: Third-Party Unix Software

From OS X Scientific Computing

Compilers

Apple's GNU-Based Free Compilers

Apple provides compilers and associated utilities such as make free of charge, but they are not installed in OS X by default. You can install them along with the rest of Apple's Developer Tools (now called Xcode) from the similarly named package on the operating system installation disks or, on a new machine, in the directory /Applications/Installers. This installs the GNU (Free Software Foundation) gcc, g++, Objective-C and related compilers, as well as a host of other utilities and an entire suite of integrated development environment applications for creating Cocoa and Carbon OS X software. Our needs are more mundane: all we require is the command-line software, which includes the above compilers and the associated unix command-line utilities. These are installed in /usr/bin and similar directories. It is nevertheless best to install everything in the Xcode package, including the documentation. If you are very short on space, you can afterwards remove the root-level directory /Developer, preferably after backing it up to a CD or DVD; its contents are not required for unix-level command-line compilation.

The Xcode developer tools package is distributed with the OS X installation disks and can also be downloaded for free from Apple's website. To download it, you have to join Apple's developer group, which is free but involves submitting a tedious form, getting a username and password, etc. (You can spend a lot of time going in circles; the site is a pain to navigate.)

For additional details, please read the Xcode wiki page.
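
A quick way to confirm that the command-line tools are in place is to ask the shell where each one lives. The following is only a minimal sketch: the helper name check_tool is made up for this example, and the list of tools is simply the set of compilers and utilities named above.

```shell
# check_tool NAME: report where NAME is found on the PATH.
# Returns 0 if the tool exists, 1 otherwise. (Hypothetical helper name.)
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: $(command -v "$1")"
    else
        echo "$1: not found (is Xcode installed?)"
        return 1
    fi
}

# Check the compilers and make; keep going even if one is missing.
for tool in gcc g++ make; do
    check_tool "$tool" || true
done
```

If all three report a path under /usr/bin, the command-line portion of the developer tools is most likely installed.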


High Performance Computing

Apple's multiprocessor computers offer the possibility of high-performance computing and clustering. Advice for high-performance computing is currently beyond the scope of this document (and its author's competence), but here are some good resources to help you get started (or until someone else adds to this wiki section):

My very limited experience is that standard, multi-platform unix software tends to run faster "out of the box" on the fastest PCs than it does on the fastest G5, but if one has the knowledge and time to tweak the source code, the opportunity for optimization is greater on the OS X platform.

Installing Packages in /usr/local

The standard location for third-party unix software is the directory /usr/local. This directory is empty in a new OS X installation, and during an archive-install upgrade (say from 10.3.x to 10.4) its contents are moved to a temporary location (/Previous Systems). Most GNU (Free Software Foundation) and Open-Source software installs into /usr/local by default. OS X has been around long enough that many programs can be installed "out of the box" without having to modify the source code, configure scripts or makefiles. The usual procedure of ./configure followed by make in the unpacked source directory can be carried out anywhere. Since /usr/local is owned by the system, you will likely have to issue sudo make install to install software in /usr/local. The directories /usr/local/bin and /usr/local/sbin are not in the PATH by default. They must be added in the appropriate shell startup file (cf. section 1.5.4) with commands like


              PATH=$PATH:/usr/local/bin
              export PATH

for bash, ksh and zsh users, and


              setenv PATH $PATH:/usr/local/bin

for tcsh users.
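
Startup files are often read more than once (for example in nested shells), which can leave the same directory in PATH several times. For bash, ksh and zsh, one way to avoid this is sketched below. The function name path_append is hypothetical, not a standard command.

```shell
# path_append DIR: add DIR to the end of PATH unless it is already listed.
# (path_append is a hypothetical helper name, not a standard command.)
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, do nothing
        *) PATH="$PATH:$1" ;;        # otherwise append it
    esac
    export PATH
}

path_append /usr/local/bin
path_append /usr/local/sbin
```

Calling path_append a second time with the same directory leaves PATH unchanged.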

Installing Packages in /Library/Frameworks

The process of installing unix applications into /Library/Frameworks will likely be less familiar, as this directory structure is unique to OS X. It is best illustrated with an example. OS X version 10.4.0 installs python version 2.3 into /System/Library/Frameworks. However, version 2.4 was available at the time of the release, and I wanted to use the latest version of python. Python version 2.3 is part of the system, so you should not remove it or disable it. Rather than install version 2.4 in /System/Library/Frameworks, where it could overwrite the system-provided version, it is much safer to install it into /Library/Frameworks. Fortunately, this is in fact where Python 2.4 will install by default (given the appropriate directives).

Here is how I installed Python-2.4.1 into /Library/Frameworks:


    mkdir src
    cd src
    tar xvfz Python-2.4.1.tgz

    cd Python-2.4.1

    export PATH=/usr/bin:$PATH

    ./configure --enable-framework
    make
    sudo make frameworkinstall

The first command makes a (temporary) directory called src (for source code); the Python-2.4.1.tgz tarball can be downloaded, placed in that directory, and expanded. Once it is unpacked, we descend into the directory Python-2.4.1. For this to work properly, it is imperative that /usr/bin come at the front of the PATH; hence we explicitly prepend it. We then configure using the --enable-framework flag, followed by make and then sudo make frameworkinstall. When complete, the new python binary will reside in /Library/Frameworks/Python.framework/Versions/Current/bin/python, so it is convenient to create a symbolic link to it from /usr/bin.
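
The symbolic link can be created with ln -s. The sketch below wraps that in a small function that refuses to touch an existing file, so the system's own /usr/bin/python cannot be replaced by accident. The function name link_binary and the link name python2.4 are assumptions made for this example.

```shell
# link_binary TARGET LINK: create symlink LINK pointing at TARGET.
# Refuses to proceed if TARGET is not executable or LINK already exists,
# so an existing system binary is never replaced. (Hypothetical helper.)
link_binary() {
    target=$1
    link=$2
    if [ ! -x "$target" ]; then
        echo "link_binary: $target is not an executable" >&2
        return 1
    fi
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "link_binary: $link already exists, leaving it alone" >&2
        return 1
    fi
    ln -s "$target" "$link"
}

# The equivalent one-off command, run with sudo because /usr/bin is owned
# by the system (python2.4 is an assumed link name):
#   sudo ln -s /Library/Frameworks/Python.framework/Versions/Current/bin/python /usr/bin/python2.4
```

Because sudo cannot call a shell function directly, the function is only useful from a root shell; otherwise the plain command in the comment does the same job.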

Linking to Frameworks

Frameworks contain compiled libraries, executables and header files all in one location. The system provides a large number of these, and third-party frameworks, as noted in the previous section, may also be installed. Issuing the command


         ls -1F /System/Library/Frameworks | wc -l

gives us the number of individual framework directories supplied by the OS X system. In 10.4.1, I have 72 such framework directories. These are documented on Apple's website. The last five frameworks listed in my installation, for example, are as follows:


         /System/Library/Frameworks/Tcl.framework/
         /System/Library/Frameworks/Tk.framework/
         /System/Library/Frameworks/WebKit.framework/
         /System/Library/Frameworks/XgridFoundation.framework/
         /System/Library/Frameworks/vecLib.framework/

As mentioned in the previous section, OS X version 10.4 now provides native Aqua ports of Tcl and Tk. When we compiled Python 2.4 in the above example, the compiler linked against the Tcl and Tk frameworks in /System/Library/Frameworks. Because the configure script provided with Python 2.4 was already aware of frameworks, nothing special had to be done (apart from ensuring that /usr/bin was prepended to the PATH variable). Obviously, this will not be true in general for all unix programs you might wish to compile.

For example, if you are compiling scientific software that requires Apple's BLAS/LAPACK libraries provided in /System/Library/Frameworks/vecLib.framework, you can do this by issuing the compiler directives


    -Wl,-framework -Wl,vecLib

when linking. In 10.4 the vecLib framework has been subsumed into the "umbrella" framework Accelerate.framework, so it might be more appropriate to use


    -Wl,-framework -Wl,Accelerate

even though the former directive still seems to work. Some frameworks contain dynamic libraries as well as, occasionally, static libraries. Properly installed OS X dynamic libraries have their absolute paths hard-coded, which can be seen using the otool -L command as illustrated below. Doing so eliminates the need for setting the DYLD_LIBRARY_PATH and LD_LIBRARY_PATH environment variables.


    % cd /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A
    % ls
    Headers/                libBLAS.dylib          libvDSP.dylib         vecLib
    Resources/              libLAPACK.dylib        libvMisc.dylib

    % otool -L libLAPACK.dylib
    libLAPACK.dylib:
            /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLAPACK.dylib
            (compatibility version 1.0.0, current version 176.0.0)
            /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
            (compatibility version 1.0.0, current version 176.0.0)
    ...
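
As an aside on the linker directives given earlier: a build script that has to work on more than one platform can choose the flags at run time. The sketch below assumes a helper named blas_link_flags (a made-up name). The non-Darwin flags are a conventional guess, not something this document prescribes.

```shell
# blas_link_flags [OS]: print linker flags for BLAS/LAPACK.
# OS defaults to the output of `uname -s`. On Darwin (OS X) the Accelerate
# framework is used; the flags for other systems are an assumption and
# may differ on a given machine.
blas_link_flags() {
    os=${1:-$(uname -s)}
    case "$os" in
        Darwin) echo "-Wl,-framework -Wl,Accelerate" ;;
        *)      echo "-llapack -lblas" ;;
    esac
}

# Example use in a link step (solver and solver.o are placeholder names):
#   gcc -o solver solver.o $(blas_link_flags)
```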

When a unix executable is compiled and linked against a dynamic library or framework, it too will have the absolute path to the library or framework hard-coded in the binary. For example, the third-party crystallographic refinement program refmac5 will be linked to the vecLib framework as well as several dynamic libraries residing in /sw/lib and in /usr/lib, i.e.,


    % otool -L refmac5
    refmac5:
            /sw/lib/ccp4-5.0.2/libccp4c.dylib (compatibility version 0.0.0, current version 0.0.0)
            /sw/lib/ccp4-5.0.2/libmmdb.dylib (compatibility version 0.0.0, current version 0.0.0)
            /sw/lib/ccp4-5.0.2/libccif.dylib (compatibility version 0.0.0, current version 0.0.0)
            /System/Library/Frameworks/vecLib.framework/Versions/A/vecLib (compatibility version 1.0.0, current version 176.1.0)
            /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.0.0)

If any of these dynamic libraries cannot be found in these locations, the program will fail to execute.
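
When a program does fail to start for this reason, the missing library can be found by checking each path that otool -L reports. The following sketch reads otool -L output on standard input. The function name missing_dylibs is hypothetical, and library paths containing spaces are not handled.

```shell
# missing_dylibs: read `otool -L` output on stdin and print every listed
# library path that does not exist on this machine.
# (missing_dylibs is a hypothetical helper name.)
missing_dylibs() {
    # Line 1 is the "name:" header. On the remaining lines the path is
    # the first field; blank lines and wrapped continuation lines that
    # start with "(" are skipped.
    sed 1d | awk 'NF && $1 !~ /^\(/ { print $1 }' |
    while IFS= read -r lib; do
        [ -e "$lib" ] || echo "$lib"
    done
}

# Typical use on OS X (otool is not available on other systems):
#   otool -L refmac5 | missing_dylibs
```

An empty result means every listed library was found at its hard-coded location.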

Package Management Systems

Compiling every program that you need manually can rapidly turn into a major chore, especially if the software depends on many other software installations. For that reason, package management systems like RPM and Debian's on Linux have become quite popular, especially among those who are more interested in using their computers to run software than in spending all of their waking hours compiling it. These package management systems are a major asset for amateur systems administrators (which describes most owners of individual PCs and OS X machines) and professionals alike. On Apple's OS X, there are at least two reasonably mature third-party package management systems that may be obtained for free. For scientific applications, the Fink package management system, based on the Debian model, has the richest collection of software available. The other system, MacPorts, often provides a complementary set of software packages. There is no reason you can't have both, but I recommend that you at least start out with Fink.

Fink

Fink is a package management system that enables you to compile and install GNU, Open-Source and other software in a relatively painless and completely automated manner. It also finds and installs all required dependencies, thus removing much of the burden of software installation. Fink installs all of its software in a root-level directory, /sw, that it creates when the package manager is installed. By doing so, it ensures that its own software installs do not interfere with either the software provided by the system or the software you may have installed into /usr/local. For that reason, it is entirely self-contained, and in the unlikely event of a problem, you can rid yourself of everything by simply typing sudo rm -rf /sw.

Many unix programs critical for scientific computing can be installed easily with Fink. The GNU g77 Fortran compiler, for example, is often critical. As of this writing, g77 version 3.4.3 can be installed either from source code or as a precompiled Debian binary package (I recommend the latter, as it takes forever to compile g77) on PPC platforms only; for the newer Intel platform, you need to use gfortran or g95. The fast-Fourier transform libraries fftw2 and fftw3 are available, as are tcltk, btl, gsl, plotting programs such as grace (xmgr) and gnuplot, and many python extensions such as Numeric, Numarray, ScientificPython, and the like. Programs for biophysics, bio-informatics, chemistry, particle physics, math and many more subject areas are also well represented in Fink. A complete listing of scientific packages in Fink is available from the package database on the Fink website. If you need an open-source scientific package, it makes sense to check Fink's package database first.
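
Once Fink is set up, packages are installed with the fink install command. The sketch below wraps it in a dry-run helper so that the command can be inspected before anything is built. The names run, DRY_RUN and install_sci_packages are hypothetical. The package names are taken from the paragraph above and should be checked against the Fink package database.

```shell
# Set DRY_RUN=1 to print commands instead of running them.
# (run and DRY_RUN are hypothetical names used only in this sketch.)
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# install_sci_packages: install a few of the packages named above with Fink.
# Package names are assumptions based on the text; verify them first.
install_sci_packages() {
    run fink install fftw3 gsl gnuplot
}
```

With DRY_RUN=1 set, calling install_sci_packages prints the fink command line rather than executing it.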

Some packages are only available in the "unstable" tree of the Fink distribution. All this means is that the package has not been tested extensively on the OS X platform, or that the maintainer hasn't gotten around to moving it into the "stable" tree, or that one of its dependencies is only available in the "unstable" tree. It does not mean that the software itself is unstable, although the user should be on the lookout for any problems and should report them immediately to the package maintainers.

The following statement is taken directly from Fink's website:

Fink is a project that wants to bring the full world of Unix Open Source software to Darwin and Mac OS X. As a result, we have two main goals. First, to modify existing Open Source software so that it will compile and run on Mac OS X. (This process is called porting.) Second, to make the results available to casual users as a coherent, comfortable distribution that matches what Linux users are used to. (This process is called packaging.) The project offers precompiled binary packages as well as a fully automated build-from-source system.
To achieve these goals, Fink relies on the excellent package management tools produced by the Debian project - dpkg, dselect and apt-get. On top of that, Fink adds its own package manager, named (surprise!) fink. You can view fink as a build engine - it takes package descriptions and produces binary .deb packages from that. In the process, it downloads the original source code from the Internet, patches it as necessary, then goes through the whole process of configuring and building the package. Finally, it wraps the results up in a package archive that is ready to be installed by dpkg.
Since Fink sits on top of Mac OS X, it has a strict policy to avoid interference with the base system. As a result, Fink manages a separate directory tree and provides the infrastructure to make it easy to use.

Installing Fink:

MacPorts (formerly DarwinPorts)

MacPorts is the other major option for OS X package management. I personally have made only limited use of MacPorts, but most regular users are quite happy with it. MacPorts handles the problem of package management a bit differently from Fink. Whether this is really an advantage is a matter of personal taste. MacPorts software is installed into a directory created by the package manager called /opt/local and can thus peacefully coexist with Fink. There is no reason not to maintain both systems if you have enough space.

The following description is taken from the MacPorts website:

The aim of the MacPorts project is to develop a second-generation system for the building, installation and management of third party software. MacPorts is mainly developed on Mac OS X, however by design it is quite portable and is intended to work on other UNIX-like systems, especially *BSD and hopefully Linux-based systems.


MacPorts is probably best described by comparison: It's sort of like the FreeBSD ports collection or Fink in that it automates the process of building third party software for Mac OS X and other operating systems. MacPorts also tracks all dependency information for a given piece of software. In other words, it knows what it needs to build and install and in what order for the piece of software you want to work properly. MacPorts knows how to make, build and install the software to a specific location, meaning that software installed via MacPorts doesn't simply scatter itself all over the system or require user knowledge of dependencies in what order.

Like Fink, MacPorts involves first installing a package manager, which then allows you to install individual packages. The organization is a bit simpler; there are no "stable" and "unstable" trees, for example. At the time of this writing, the number of scientific packages available through MacPorts is more limited than through Fink.
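
Because Fink and MacPorts each live in their own directory tree, it is easy to check which of them is present on a given machine. The following sketch looks for the fink and port commands under their default prefixes. The function name installed_pkg_managers and the override variables FINK_PREFIX and MACPORTS_PREFIX are hypothetical.

```shell
# installed_pkg_managers: list which package managers are present.
# FINK_PREFIX and MACPORTS_PREFIX are hypothetical override variables;
# they default to the directories named in the text (/sw and /opt/local).
installed_pkg_managers() {
    fink_prefix=${FINK_PREFIX:-/sw}
    macports_prefix=${MACPORTS_PREFIX:-/opt/local}
    [ -x "$fink_prefix/bin/fink" ] && echo "fink ($fink_prefix)"
    [ -x "$macports_prefix/bin/port" ] && echo "macports ($macports_prefix)"
    return 0
}

installed_pkg_managers
```

The function prints nothing if neither system is installed.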

GNU-Darwin

A number of scientific programs are also available via the GNU-Darwin distribution. The GNU-Darwin project has been instrumental in making the free version of the Darwin operating system available on PC platforms as well as the Mac platform, and takes a self-described activist role in the Free Software and Fair Use movements. I believe it was the pioneer in making scientific programs available on the OS X platform. The GNU-Darwin website has the following description of the GNU-Darwin project:

The GNU-Darwin Distribution is an amalgamation of the Darwin and GNU operating systems and a large collection of free software compatible with Darwin and Mac OS X. We are committed to Darwin as a free OS, Mac OS X compatibility, and helping users attain the benefits of software freedom.
Founded in November 2000 by proclus, The ports system and package management system were adapted from FreeBSD in order to bring Unix software to Darwin / MacOSX on PowerPC and also on the x86 architecture.
In 2002, GNU-Darwin extended its services to full-featured mail accounts (with POP, IMAP and webmail support), Web Hosting and file sharing with an original web interface that provides users an easy way to manage their site and, since 2003, shell accounts on a Darwin x86 ssh server.
GNU-Darwin has always been very cautious regarding software freedom and dedicated to concrete progress in this direction while defending digital liberties in general.

The available scientific software is sometimes less up to date than that of the previously listed options, but you should visit their website anyway.

