The Zero Install system

Dr Thomas Leonard [ contact | GPG public key | blog | donations ]

Zero Install: What's wrong with APT?

I'm going to talk here about the many problems with Debian's APT system. I want to make it clear that I'm directing my criticism at APT not because it's worse than other packaging systems, but because it's one of the best. Therefore, these arguments will also apply to other systems (RPM, urpmi, tarballs, etc).

Also, I'll give a little information about myself, because of the shocking number of people posting comments like "This guy's obviously never used a Debian system" or "He's clearly never packaged a non-trivial application". I run Debian (testing and unstable) on my own machines and have done for years. Debian is much more than its packaging system; it's about quality control, bug tracking and support. If Debian switched from APT to Zero Install, it would still be a great project. I'm also the original author and maintainer of the ROX desktop (which is why many of the examples use ROX). ROX is multi-platform (Linux, BSD, MacOS X, Solaris, etc), so I'm very familiar with binary and source compatibility issues.

Let's start with a simple example. A student is sitting at one of his school's Debian machines and wishes to write a report using AbiWord. But AbiWord either isn't installed, or the version installed is too old and doesn't have a feature he requires. We can identify the following problems with a typical Debian APT system:

Requires the root password

Our student doesn't have the root password. He must try to find the administrator for the machines and get her to run 'apt-get install abiword' for him. Yet, since his running AbiWord shouldn't affect any other users of the machine, there is no need to require this. Indeed, he could download the AbiWord source code, install it into his home, and run it, all without needing the root password. But then he'd lose APT's dependency handling.

Imagine if running Mozilla from the shell prompt would run it directly, but running it from GNOME's panel required the user to enter the root password first. Requiring root privileges unnecessarily is clearly unacceptable. It is both inconvenient and a security risk.

Zero Install allows any user to run software without needing the root password.

There seems to be a belief among some people that APT must be more secure than Zero Install because it requires a password (and passwords are about security, right?). The reason you enter a password when installing software with APT is to bypass the security system, so that the software can perform its (potentially dangerous) installation operations.

(In zero install - serious critiques?, a Debian user suggests that the admin of the machine should have full control over what software is installed. This is appropriate for some systems (though probably not in a university), and a whilelist should be used in that case. However, consider the case of a jointly-owned family computer if you want a situation where users clearly must be allowed to install software, and where installing each user's software as root would be an unreasonable risk.)

Confusing

Debian has three separate places where software is installed, corresponding to the three different ways of installing software: /usr contains software managed by the APT system and installed by root; /usr/local contains non-APT software installed by root; and each user's home directory contains user-installed software. This is rather confusing for everyone, and leads to duplication if several users install the same software in their home directories.

Zero Install has a single location for software: the Internet. The same program will always be at the same location in the filesystem, regardless of who 'installed' it.

Unnecessary extra steps

Our student just wants to run the software. The software has already been written, and he has found it and now wants to run it. There is no reason why the user interface for this operation should be any more complicated than simply telling the computer to run it.

Imagine if the disk cache worked this way:

$ gimp
bash: 'gimp' not found in disk cache
$ su -c 'disk-cache load /usr/bin/gimp'
Enter root's password:
$ gimp

We don't do this. We simply tell the computer to run the program from disk, and the computer will use the cache automatically to speed the operation up if possible.

Zero Install uses the disk as a cache for the Internet, just as the memory is used as a cache for the disk. Thus the behaviour of the system is simple to understand, but just as fast.

Security and stability risk

Running anything as root is a security risk. If the Debian package for AbiWord contains malicious code (or just a simple bug), it will be running that code as root, with full power to do anything it likes to the machine. When a Debian machine is upgraded (eg, with 'apt-get upgrade'), the administrator is running hundreds or thousands of newly downloaded shell scripts as root. And if the software isn't in the main Debian archive, this means running software from less trusted sources as root. And this is almost completely unnecessary.

Running software with Zero Install does not run any part of the downloaded code as root, only as the user who downloaded it. One user of the system may run any amount of software from all over the Internet without risking the security of other users. Remember: just because installing software is difficult doesn't mean that it's secure (no, not even if you had to type commands at a shell prompt!); neither does an easy installation procedure mean that it is less secure.

Fragile

APT relies on a database to keep track of what's installed and what isn't. This database must be kept in sync with the filesystem... if the admin deletes a file to save space, then APT will continue to think that the file is installed.

The admin may delete any file from the Zero Install cache at any time, or the system can remove old files automatically from time to time. The system determines whether it needs to fetch a file by simply checking the cache; if the file is not there, it fetches it.

Downloads too much

APT often downloads more than you need. Some packages have been split, for example 'python' and 'python-doc', but most packages require you to download a considerable amount of data that you simply don't need.

Zero Install makes it easy to group files in smaller packages (as if every package had it's own '-common' and '-doc' packages). In fact, files are grouped by directory by default. Thus, you only download what you need.

Downloads too little

Despite trying to download every file for every feature of a program you might possibly need, APT still often fails to get things you want. For example: install gqview and open an image. Choose 'Edit in Gimp' from the menu, and you'll get an error complaining that Gimp isn't installed.

Everything you need is always available in Zero Install. It may simply take a little longer to run if it's not yet in the cache.

APT is not scalable

Since every package is installed as root, every package must be carefully checked by a trusted Debian developer. This means that a great deal of software is not available. Though Debian prides itself on having a huge range of packages, it has only a tiny fraction of the software available. While Debian has core tasks such as word processing and web browsing covered, there are many more specialised programs that Debian will never have the resources to package. Ordinary software authors cannot just put their software on the web and be part of the Debian repository.

Although third-party repositories can be added manually, the security risks of editing sources.list mean that few admins are willing to do it. For example, the ROX project provides a Debian repository, but a university student sitting at a Debian machine is very unlikely to be able to run ROX-Filer using it.

Anyone can make software available via Zero Install. Trust is for individual users to decide, not the admin, since their choices only affect them. Sites such as freshmeat.net provide better ranking and rating information to help users decide which software is best. Of course, the same warnings that apply to opening email attachments must be given when running untrusted software.

Index handling is slow

APT must download the latest package listing for the whole archive before doing anything. Thus, the time taken to install a package is proportional to both the size of the package, and to the total number of packages available. Installing a 50K package with APT over a dial-up connection often takes around 15 minutes due to this.

Zero Install only needs to download the index for the site you are accessing, not for every site in the world. Thus, increasing the total amount of software available through Zero Install does not slow the system down.

Upgrading is very slow

Debian allows users to upgrade all their packages at once (with 'apt-get upgrade'). This allows users to keep up with security fixes and all the latest features. However, it requires downloading a vast amount of software, most of which won't be used before it's upgraded again.

Upgrading software in Zero Install only requires refetching the indexes. Thus, upgrading the whole system is very fast, and the actual software is only downloaded if it's used. It's also possible to 'upgrade' by simply deleting all the index files, causing them to be refetched on demand too. However, this may slow down other users of the computer, since they have to wait for each index to be refetched even if the software hasn't been updated.

Closing remarks

Of course, APT does have the considerable advantage over Zero Install that it is widely used and well tested. However, hopefully these points suggest that the design can still be bettered.

There is another tool, auto-apt, which tries to install Debian packages when files they contain are accessed. However, this does not solve the problems of security (installation still happens as root), scalability (only trusted packages can be used) or speed (if fact it's slower, as it must build an additional database). It also has the problem that resources don't appear until you access them, so you can't click on an application until it's installed... but you can't install it until you click on it, so it's mainly useful from the shell prompt.

I should also point out that APT does some things not covered by Zero Install: it handles configuration, and automatically starts services. That is, when you install Apache, APT will ask some questions to get it ready and then actually run it for you (which, of course, requires root permissions). In Zero Install, these functions would be handled by a separate system. Since the actual installation is automatic, this means that such tasks take the same number of steps. For example, this command might present the same configuration boxes as 'sudo apt-get install apache' does on Debian:

$ sudo /uri/0install/apache.org/apache