Last updated on January 15th, 1998



Disaster Strikes!

Okay, how many people out there have done something to their Linux system and really screwed something up, whether through their own fault or not? Ever boot up and see messages like "missing library..." or some other catastrophe?

This page is going to be a growing document of tips, tricks and techniques you can use to bring your Linux system back to life after a seemingly disastrous failure. You'd be surprised at how easy it is to both break and fix a Linux system, if you know the right hotspots and techniques.

If anyone out there would like to suggest any special techniques they've used, feel free to Email me, and I'll include them here and give you credit - I'm doing this page to help people out and any good technique is welcome!

Booting - LILO/MBR or Kernel Failure

Okay. Let's start with a system that won't boot at all because LILO, your MBR (Master Boot Record) or your kernel is hosed. Ideally, the way out of this mess is to have a Linux boot disk handy. If you compile your own kernel, the best way to get one is to pop a fresh floppy into the drive and do a "make zdisk" after your usual routine. Keep this floppy somewhere safe and near the computer. If trouble strikes, just boot with this floppy and your system should start up again as normal - then type lilo at the shell prompt to reinstall the loader.

You didn't make one? Shame on you! Here's another trick. Use your distribution's boot disk to boot your machine. At the LILO prompt, type this:

linux root=/dev/xxxx         Red Hat
mount root=/dev/xxxx         Slackware

...where the "xxxx" is your root disk/partition (for example, hda1). This boots the kernel from the floppy but mounts your normal root partition. Once you get in, re-run lilo (usually just by typing lilo at the shell prompt). Now go make a boot disk.

NOTE: This latter option (using the dist disk) only works if your system has an IDE drive. If you use a SCSI setup, you'll have to boot normally with the boot disk, and proceed into the setup - after the required modules/drivers are loaded, there is usually a shell prompt where you can start your repair mission. On Red Hat, it's on the second VT if you use the upgrade/install option at boot, and on both VT1 and VT2 if you use the rescue option at boot. This is kinda stinky though, because it's not as easy and efficient as using a boot disk that you make of your own kernel. Also, using this method, you'll need to manually mount your root partition, and it just gets more interesting from there.
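
For the SCSI case, the manual steps from the rescue shell might look something like the sketch below. This is only an illustration: the function name, /dev/sda1 and /mnt/root are placeholders, and it assumes the rescue environment has a working lilo binary and that your real root holds a usable /etc/lilo.conf. It leans on lilo's -r option, which makes lilo chroot into the given directory before doing anything else.

```shell
# Sketch: reinstall LILO from a rescue shell when root must be mounted by hand.
# ASSUMPTIONS: the function name is made up; /dev/sda1 and /mnt/root in the
# usage line are placeholders for your own root partition and mount point.
reinstall_lilo() {
    root_dev=$1          # your real root partition
    mount_point=$2       # any empty directory on the rescue system

    mkdir -p "$mount_point" || return 1
    mount "$root_dev" "$mount_point" || return 1

    # -r makes lilo chroot into the mounted root first, so it reads
    # etc/lilo.conf and the kernel images from your real system.
    lilo -r "$mount_point"
    status=$?

    umount "$mount_point"
    return $status
}

# Usage (as root, from the rescue shell):
#   reinstall_lilo /dev/sda1 /mnt/root
```

If lilo complains, the partition is unmounted again before the function returns, so you can fix things and retry.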

MAKE A BOOT DISKETTE OF YOUR KERNEL!

David Miller sent me another solution which also works very well. It uses the loadlin tool to boot a Linux kernel on your DOS partition. Here's what David writes:

I (gasp) also run DOS/Windows 3.1 on my system, and have the ability to boot Linux directly from DOS without rebooting the box. I use loadlin to do this. It is very simple.
  1. I have a separate/backup copy of a good kernel in my DOS partition, called C:\LINUX\VMLINUZ
  2. I have loadlin in C:\LINUX\LOADLIN.EXE
  3. I have a 'C:\DOS\LINUX.BAT' with this as its contents:

    loadlin c:\linux\vmlinuz root=/dev/hda2 rw

That's it. I just type 'linux' from DOS to retrieve sanity. Not only can I quickly escape from the DOS environment, but I have a backup copy of my kernel in a separate partition (of the same drive, Dohhh!)

Booting - Recent Change Backfired

Here's one that happens to all of us, and can happen every so often. Let's say you're mucking about in your system and upgrading or modifying something for whatever reason. You reboot the system to check it out, and your system barfs when it comes back up, if it even does!

Many distributions come with a "rescue" option on their boot disks. Red Hat has something like this - you just type "rescue" at the first prompt and you should then be able to hack your system. The problem is that many key tools are missing from the rescue environment, like mv or your favorite editor, just when you need them. This happened to me twice: once when I moved a partition to be mounted at boot time, which failed and locked me out, and another time when a kernel with a built-in SCSI driver locked the system as it initialized the SCSI adapter during boot. In the former case I needed the mv command, and in the latter I needed joe, which is my usual editor.

One great way to cure this is to have a back door into the system which lets you get in and reverse your changes. One nifty thing to have on your system is the ash shell, statically linked. This is a lightweight shell that, built this way, does not require any dynamic (outside) libraries (just in case your /lib or /usr/lib directories were affected). I have mine in the /bin directory, called ash. To make sure your copy is static, type:

ldd /bin/ash

This will tell you the libraries, if any, this copy of ash requires. If it reports back anything other than "statically linked (ELF)" then it depends on those libraries - it will still work, but not in all situations (such as library problems).
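
If you'd rather script that check, here is a minimal sketch. The function names are made up for illustration, and the two phrases it looks for are ones ldd is known to print for static binaries; the exact wording depends on your ldd version, so treat this as a starting point rather than a definitive test.

```shell
# Sketch: decide from ldd's output whether a binary is statically linked.
# ASSUMPTION: both function names are invented for this example.
ldd_says_static() {
    # $1 is the text printed by ldd. Some versions print
    # "statically linked", others "not a dynamic executable".
    case "$1" in
        *"statically linked"*|*"not a dynamic executable"*) return 0 ;;
        *) return 1 ;;
    esac
}

check_static() {
    # Capture stderr too, since some ldd versions report there.
    if ldd_says_static "$(ldd "$1" 2>&1)"; then
        echo "$1: static, safe to use as an emergency shell"
    else
        echo "$1: needs shared libraries, not a reliable back door"
    fi
}

# Usage:
#   check_static /bin/ash
```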

Boot your system at the LILO prompt like this:

linux init=/bin/ash

This will boot up your system right into the ash shell, bypassing a lot of normal system startup stuff. Your root filesystem will be mounted as read-only, which will prevent you from modifying things. To fix that, type:

mount -n -o remount,rw /

This will remount your root so that you can save (write) changes. You should also have all your normal commands at your disposal. Fix whatever you did! When you're done, just do a sync ; exit at your shell prompt and reboot the computer - you may need to hit the reset button. (That's because you're not in a normal "init'ed" mode, so shutdown and reboot don't work here.)
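
The start and end of such a repair session can be sketched as two small shell functions. The function names are made up, and the remount back to read-only before the reset is an extra precaution that is not part of the steps above; it is an assumption that it will succeed on your system, since it can fail if something still has a file open for writing.

```shell
# Sketch of the first and last steps of an init=/bin/ash repair session.
# ASSUMPTIONS: function names are invented; remounting read-only at the
# end is an added precaution, not something the steps above require.
begin_repair() {
    # -n: don't try to update /etc/mtab while root is still read-only
    mount -n -o remount,rw /
}

end_repair() {
    sync                                  # flush pending writes to disk
    if mount -n -o remount,ro /; then     # leave the filesystem clean
        echo "Root is read-only again; safe to hit reset."
    else
        echo "Could not remount read-only; writes were synced."
    fi
}

# Usage at the ash prompt:
#   begin_repair
#   ...fix whatever you did...
#   end_repair
```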

Using this method, you can get into your system and actually work with things using your normal tools. It's a lifesaver!

Installing a New Kernel the Right Way

I'm not going to go into how to compile and install your own kernel, but I will tell you how not to screw up your system - at least not without another way in, should the new kernel fail for some reason.

As mentioned above, you can make a boot disk by typing make zdisk during your kernel compile. I suggest when you first finish configuring your kernel and you want to compile it, use that command to make a boot floppy with the new kernel on it. Reboot your machine with that floppy to test your kernel. If it works, go back and do a make zlilo to make the change final. Note that there are differences between distributions that might alter this somewhat. Red Hat likes kernels in the /boot directory, whereas Slackware likes them in the root. You'll have to take this into consideration as well. The idea is to test the kernel on the floppy first and then overwrite your old one!

Another method of doing this without a floppy is to put your new kernel and map file in a different directory from your standard location, or follow a neat Red Hat convention. Red Hat names the kernel image something like "vmlinuz-2.0.30" (or whatever version of kernel you're running) and makes "vmlinuz" a symbolic link pointing to the real "vmlinuz-2.0.30". This lets you move your new kernel to the /boot directory with a name like "vmlinuz-2.0.33" (for example) and keep multiple kernel versions side by side, each with a different name. When you want a different kernel, you can either relink the "vmlinuz" and "System.map" symbolic links to the version you want and run lilo to make it stick, or try this method...

You might want to add a new entry to your /etc/lilo.conf file (for Red Hat systems) to add your new kernel to the boot options. Here is an example of a lilo.conf file (yours may vary from this):

boot=/dev/hda
map=/boot/map
install=/boot/boot.b
prompt
timeout=30
image=/boot/vmlinuz
	label=linux
	root=/dev/hda1
	initrd=/boot/initrd
	read-only

Now let's add your new configuration to this, so that you can boot your new kernel as well as the old, should the new one fail. This is your backup. Edit lilo.conf to reflect the new section as seen here:

boot=/dev/hda
map=/boot/map
install=/boot/boot.b
prompt
timeout=30
image=/boot/vmlinuz
	label=linux
	root=/dev/hda1
	initrd=/boot/initrd
	read-only
image=/boot/vmlinuz-2.0.33
	label=new
	root=/dev/hda1
	initrd=/boot/initrd
	read-only

Remember that after ANY change to lilo.conf, or after installing a newly compiled kernel, you should run the lilo command to update your LILO configuration, or your changes might not take effect!

There are two things to notice in the newly added section, which comes after the first one. The first is the "image=/boot/vmlinuz-2.0.33" line. This assumes that your newly compiled kernel image is called "vmlinuz-2.0.33" and is in the /boot directory. Change this to whatever and wherever yours is. The second is the "label=new" line. Give your new kernel a unique label - "new" is just an example. Keep it short and one word though.
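
Since a mistyped image path is an easy mistake to make here, a quick sanity check before running lilo can help. The following is a rough sketch, not a full lilo.conf parser: the function name is made up, and it assumes plain "image=/path" lines with no quoting and no spaces around the "=".

```shell
# Sketch: list image= entries in a lilo.conf whose kernel file is missing.
# ASSUMPTIONS: the function name is invented, and only simple
# "image=/path" lines are understood.
missing_images() {
    conf=$1
    root=${2:-}     # optional prefix, for when the real root is mounted elsewhere
    sed -n 's/^[[:space:]]*image=//p' "$conf" | while read -r img; do
        [ -e "$root$img" ] || echo "$img"
    done
}

# Usage: prints nothing when every kernel is where lilo.conf says it is.
#   missing_images /etc/lilo.conf
```

The optional second argument is there so the same check can be run from a rescue shell against a root partition mounted somewhere like /mnt/root.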

When you reboot, you can then type "new" at the LILO prompt (or whatever your label is) and the system will boot using the new kernel. If it fails, just boot up normally (or type whatever your label is for the regular kernel - in the example above, it's "linux").

You can keep multiple kernels in your /boot directory by naming each kernel image and its "System.map" file with the version, as above. Then either symlink "vmlinuz" and "System.map" to the version you like to use as the default, or have multiple entries in your /etc/lilo.conf file, or even a combination of the two... I hope this makes sense. If not, let me know.
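
To make the symlink convention concrete, here is a small sketch you can run safely: it works in a scratch directory with empty placeholder files, so nothing under the real /boot is touched. The use_kernel function name and the version numbers are just for illustration.

```shell
# Sketch: the versioned-kernel symlink convention, tried out in a scratch
# directory. ASSUMPTION: empty files stand in for real kernels and maps.
BOOT=$(mktemp -d)

# Stand-ins for two installed kernels and their map files
for v in 2.0.30 2.0.33; do
    : > "$BOOT/vmlinuz-$v"
    : > "$BOOT/System.map-$v"
done

# Point the generic names at one version. Relative link targets keep the
# links valid wherever the directory ends up mounted.
use_kernel() {
    ln -sf "vmlinuz-$1" "$BOOT/vmlinuz"
    ln -sf "System.map-$1" "$BOOT/System.map"
}

use_kernel 2.0.30     # the known-good default
use_kernel 2.0.33     # switch over to the new kernel
ls -l "$BOOT"
```

On a real system BOOT would be /boot, and you would run lilo after relinking so the change sticks.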


All images are (C) 1994-2005 by Michael Holve