Haskell Package Layout
I’ve been looking at cabal’s default layout of installed packages, as implemented in Distribution.Simple.InstallDirs. On non-Windows systems this ends up like:
$prefix              -- /usr/local if --global, ~/.cabal if --user
  bin                -- binaries
  lib
    $pkgid
      $compiler      -- libraries & .hi files
        include      -- include files
  libexec            -- private binaries
  share
    $pkgid           -- data files
    doc
      $pkgid         -- documentation
        html         -- html doc
    man              -- man pages
There are several problems with this layout:
1. It doesn’t truly support multiple compilers (or even versions of the same compiler), because while the libraries and .hi files can be multiply resident, things like the doc and the binaries only get the last set built. But the doc for a package could change depending on which compiler it is compiled for (perhaps not all of the API is available under an older version…)
2. If you want to remove a package, you’ve got to ferret all the pieces out of the global bin, libexec, and man directories, and there are three separate directories named with $pkgid to remove.
3. If you want to remove a compiler, you need to remove all the $compiler directories out of all the packages. Then, if a package has no other $compiler subtrees, remove that package (see #2).
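To make the removal problem concrete, here is a minimal Python sketch that expands the default templates described above for one package and then splits the results into directories owned by the package and directories shared with everything else. The dictionary keys, the helper names (expand, removal_plan) and the sample package and compiler ids are assumptions made up for illustration; this is not Cabal's own code.

```python
from string import Template

# Simplified copy of the default (non-Windows) layout described above.
# Keys and sample values are illustrative assumptions, not Cabal's API.
DEFAULT_LAYOUT = {
    "bindir": "$prefix/bin",
    "libdir": "$prefix/lib/$pkgid/$compiler",
    "includedir": "$prefix/lib/$pkgid/$compiler/include",
    "libexecdir": "$prefix/libexec",
    "datadir": "$prefix/share/$pkgid",
    "docdir": "$prefix/share/doc/$pkgid",
    "htmldir": "$prefix/share/doc/$pkgid/html",
    "mandir": "$prefix/share/man",
}


def expand(layout, **variables):
    """Substitute $prefix, $pkgid and $compiler into every template."""
    return {name: Template(path).substitute(variables)
            for name, path in layout.items()}


def removal_plan(paths, pkgid):
    """Split paths into package-owned roots and shared directories."""
    own, shared = set(), set()
    for path in paths.values():
        parts = path.split("/")
        if pkgid in parts:
            # Keep the path up to and including the $pkgid component.
            own.add("/".join(parts[: parts.index(pkgid) + 1]))
        else:
            # Shared directory: individual files must be picked out by hand.
            shared.add(path)
    return sorted(own), sorted(shared)


if __name__ == "__main__":
    paths = expand(DEFAULT_LAYOUT, prefix="/usr/local",
                   pkgid="foo-1.0", compiler="ghc-6.10")
    own, shared = removal_plan(paths, "foo-1.0")
    print("remove whole directories:", own)
    print("pick files out of:", shared)
```

For the hypothetical package foo-1.0, the sketch finds three $pkgid directories to delete and three shared directories to search, which is the situation described in problem 2.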
Most other language library sets on other platforms seem to place things under per interpreter version sub-trees[1]. In keeping with that, and trying to better support the three use cases above, I developed this:
$prefix              -- /usr/local/haskell if --global, and ~/.cabal if --user
  $compiler
    $pkgid
      bin            -- binaries
      lib            -- libraries & .hi files
        include      -- include files
      libexec        -- private binaries
      share          -- data files
      doc            -- documentation
        html         -- html doc
      man            -- man pages
The first big advantage is that a package can be installed for multiple compilers easily, and independently. The second is that removing an older compiler, and all the package versions for it, is really easy. And removing a package is quite a bit easier: just remove the $pkgid under each $compiler. Of course, there is also the nicety that bits of Haskell packages aren’t intermingled throughout /usr/local.
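The removal cases under the proposed layout can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (install, remove_package, remove_compiler), the simplified list of subdirectories and the sample ids are hypothetical, and the demo works inside a temporary directory rather than a real install prefix.

```python
import shutil
import tempfile
from pathlib import Path

# Simplified per-package subdirectories of the proposed layout.
SUBDIRS = ["bin", "lib", "libexec", "share", "doc", "man"]


def install(prefix, compiler, pkgid):
    """Create an empty $prefix/$compiler/$pkgid tree (stand-in for an install)."""
    root = Path(prefix) / compiler / pkgid
    for sub in SUBDIRS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root


def remove_package(prefix, pkgid):
    """Remove $pkgid under each $compiler; return the directories removed."""
    removed = []
    for compiler_dir in sorted(Path(prefix).iterdir()):
        target = compiler_dir / pkgid
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target)
    return removed


def remove_compiler(prefix, compiler):
    """Remove a compiler and every package built for it in one step."""
    shutil.rmtree(Path(prefix) / compiler)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as prefix:
        for compiler in ("ghc-6.8", "ghc-6.10"):
            for pkgid in ("foo-1.0", "bar-2.0"):
                install(prefix, compiler, pkgid)
        print(remove_package(prefix, "foo-1.0"))
        remove_compiler(prefix, "ghc-6.8")
        print(sorted(p.name for p in Path(prefix).iterdir()))
```

Each operation touches only whole subtrees, so nothing has to be picked out of shared bin, libexec, or man directories.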
The actual $prefix directories would probably be platform and distribution specific. For example, on Mac OS X they would be /Library/Haskell and ~/Library/Haskell.
This structure is similar to what I proposed for Mac OS X a while back, and have been running on my systems for about a year. Note that the GHC distribution uses a somewhat different layout for the packages it includes, but shares with this structure the ordering of $compiler/$pkgid rather than the other way ’round. This structure also has no need for the special $libsubdir and $datasubdir processing.
To facilitate easy access to binaries and docs, we could add:
$prefix              -- /usr/local/haskell or ~/.cabal
  $compiler
    bin              -- symlinks to binaries built with this $compiler
    doc              -- doc for packages for this $compiler
      html           -- master index of html
      man            -- symlinks to man pages under this $compiler
  current            -- symlink to current $compiler
  bin                -- symlink to current/bin
  doc                -- symlink to current/doc
Users can then put /usr/local/haskell/bin and ~/.cabal/bin on their PATH, or further symlink from those locations to bin directories that already are.
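As a rough sketch of how such a symlink farm might be maintained, the Python below collects each package's binaries into $prefix/$compiler/bin and then points current, bin, and doc at the selected compiler. The function names (link_binaries, select_compiler) and the sample ids are hypothetical, and the handling of doc and man indexes is left out; it assumes a POSIX system where symlinks are available.

```python
import tempfile
from pathlib import Path


def link_binaries(prefix, compiler):
    """Fill $prefix/$compiler/bin with symlinks to each package's binaries."""
    compiler_dir = Path(prefix) / compiler
    farm = compiler_dir / "bin"
    farm.mkdir(exist_ok=True)
    # Matches $pkgid/bin/<program>; the farm itself is one level too shallow.
    for program in sorted(compiler_dir.glob("*/bin/*")):
        link = farm / program.name
        if link.is_symlink():
            link.unlink()
        link.symlink_to(program)
    return farm


def select_compiler(prefix, compiler):
    """Point current, bin and doc at the chosen compiler using relative links."""
    prefix = Path(prefix)
    links = [("current", compiler), ("bin", "current/bin"), ("doc", "current/doc")]
    for name, target in links:
        link = prefix / name
        if link.is_symlink():
            link.unlink()
        # The doc link dangles until a doc directory exists for this compiler.
        link.symlink_to(target)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as prefix:
        pkg_bin = Path(prefix) / "ghc-6.10" / "foo-1.0" / "bin"
        pkg_bin.mkdir(parents=True)
        (pkg_bin / "foo").write_text("#!/bin/sh\n")
        link_binaries(prefix, "ghc-6.10")
        select_compiler(prefix, "ghc-6.10")
        print((Path(prefix) / "bin" / "foo").resolve())
```

Switching compilers is then a matter of calling select_compiler again; only the current link actually changes target, since bin and doc are defined relative to it.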
It is relatively easy to set up your own .cabal/config file to do this. But now, with the Haskell Platform, more people will be doing their initial installs via these prepackaged means, which all use the current layout and default new package installs to that layout as well. If there is consensus that the above layout would improve things going forward, especially in supporting multiple installed compiler versions, then I’d be happy to submit a patch to Cabal for it.
Thoughts?
- Mark
[1] See, for example, how Python installs things under /usr/lib/python$version, and Perl uses /usr/lib/perl5/$version.
Nice proposal.
My biggest concern is the approach to binaries – I don’t want to lose my ‘darcs’ or ‘cabal’ binaries because I decided to try out GHC 7.2. These are programs that are intended to work independent of which Haskell compiler is on your path.
However Haddock seems to have some ties to GHC versions (although I’m a bit hazy on this point), so I would want that in per-compiler bin directories.
I’ve also been using this scheme for about a year, and I heartily recommend it.
In particular, a good test of a GHC installation is to build another from source, including all desired packages. This scheme makes that easy.
About 1.
A package A with a specific version v must provide a specific API independent of the compiler. If a package B imports A-v it cannot additionally check what compiler was used to compile A (and B, too). Thus documentation should really only depend on the package version. If there is a feature, that is available only for a specific compiler, then this must be moved into a separate package C. Package C can then be compiled completely or not on a given compiler, but not in parts.
I have an additional problem with the current layout: You may use the same compiler version on different operating systems and processors, say Solaris/SPARC and Solaris/Intel. It’s currently not possible to use them in parallel, unless you use different Cabal directories. In Modula-3 they define Targets, where a target specifies operating system, processor, linker object format and compiler backend (GCC-based or others). In short: The target contains every feature that can make an installed library different for the same compiler version.
I still have a problem with the current local Cabal directory structure. Here everything is built below dist/build. I use Cabal in development and have long dependency chains and different compilers installed. If a basic package changes I have to recompile all dependent packages, but ghc’s ‘make’ feature of recompiling only dependent modules does not help: I have to recompile all modules, because dist/build cannot hold the compiled files for two compiler versions.