GRUB (and LILO too, for that matter) has a useful ‘failsafe’ feature that can be configured. It is especially useful for remote kernel upgrades, where a failed boot would otherwise leave your machine offline and unreachable.

Here is my standard grub config. I have just added my new 2.6.28 kernel.

default         0
timeout         5
color cyan/blue white/blue

title           Debian GNU/Linux, kernel 2.6.28 #NEW KERNEL
root            (hd0,0)
kernel          /boot/vmlinuz-2.6.28 root=/dev/sda1 ro
initrd          /boot/initrd.img-2.6.28

title           Debian GNU/Linux, kernel 2.6.18 #CURRENT GOOD KERNEL
root            (hd0,0)
kernel          /boot/vmlinuz-2.6.18 root=/dev/sda1 ro
initrd          /boot/initrd.img-2.6.18

Firstly, I will change my 'default 0' to 'default 1' so that the default booted kernel is the current, known-good one. Then I will set panic=5 on the new kernel so that, if it does panic, the system reboots automatically after 5 seconds. My new configuration is now:

default         1
timeout         5
color cyan/blue white/blue

title           Debian GNU/Linux, kernel 2.6.28 #NEW KERNEL
root            (hd0,0)
kernel          /boot/vmlinuz-2.6.28 root=/dev/sda1 ro panic=5
initrd          /boot/initrd.img-2.6.28

title           Debian GNU/Linux, kernel 2.6.18 #CURRENT GOOD KERNEL
root            (hd0,0)
kernel          /boot/vmlinuz-2.6.18 root=/dev/sda1 ro
initrd          /boot/initrd.img-2.6.18

Once edited, I can now set the default for the next boot only to entry ‘0’, which is the new kernel:

echo "savedefault --default=0 --once" | grub --batch

At this point, the default kernel on my next reboot will be my new kernel. Should it panic, however, the system will reboot after 5 seconds into my regular default kernel, which is my known-good one.
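
Before actually rebooting, it is worth double-checking the edited file, since a typo here is exactly the kind of thing that leaves a remote box stranded. The following is only a rough sketch: check_menu is a name I made up for this example (it is not part of GRUB), and it assumes the two-entry layout shown above, with the good kernel as entry 1 and a space after the kernel path.

```shell
#!/bin/sh
# check_menu FILE VERSION
# Hypothetical helper, not part of GRUB: sanity-checks a GRUB menu file
# laid out like the example above (new kernel = entry 0, good kernel = entry 1).
check_menu() {
    menu="$1"
    new="$2"

    # The permanent default must point at the known-good entry (1).
    if ! grep -Eq '^default[[:space:]]+1[[:space:]]*$' "$menu"; then
        echo "default is not 1"
        return 1
    fi

    # The kernel line for the new version must carry a panic=N timeout.
    # grep -F treats the version string literally (dots are not wildcards).
    if ! grep '^kernel' "$menu" | grep -F "vmlinuz-$new " \
            | grep -Eq '[[:space:]]panic=[0-9]+'; then
        echo "no panic= on kernel $new"
        return 1
    fi

    echo "ok"
}
```

On my setup this would be called as check_menu /boot/grub/menu.lst 2.6.28 (the path is an assumption; use wherever your GRUB config actually lives). It prints "ok" only when both safety nets are in place.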

Upon rebooting, use ‘uname -a’ to check which kernel is running. If it is your new one, you can make it the permanent default by changing 'default 1' back to 'default 0'. If your old one reappears, then the new kernel panicked and needs work.
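
If you want to script that check (say, to mail yourself the result after the reboot), something along these lines would do. check_kernel is a hypothetical helper name, and the prefix match is my own assumption so that a release string with a local suffix, such as 2.6.28-custom, still counts as the new kernel.

```shell
#!/bin/sh
# check_kernel EXPECTED [RUNNING]
# Hypothetical helper: compares the running kernel release with the expected one.
# RUNNING defaults to the output of `uname -r`; passing it explicitly is handy for dry runs.
check_kernel() {
    expected="$1"
    running="${2:-$(uname -r)}"

    case "$running" in
        "$expected"*)
            # Release string starts with the expected version.
            echo "new kernel $running is running"
            return 0
            ;;
        *)
            echo "fell back to $running (expected $expected)"
            return 1
            ;;
    esac
}
```

For the example in this post, check_kernel 2.6.28 returns success on the new kernel and failure if the box came back up on 2.6.18.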

Bear in mind that this guards against a kernel panic only. If your new kernel lacks network device drivers, for example, it will likely still boot successfully but never bring a network interface up, which will lock you out of your machine.

You could set a timer on boot in one of the boot scripts or as a cron job that will test for such a situation and reboot, or automatically reboot within ‘x’ minutes unless you’ve successfully logged in to kill the timer.
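
As a rough sketch of the second idea, the script below waits for a grace period and then reboots unless a flag file has appeared in the meantime. Everything named here (failsafe_timer, reboot_wanted, GRACE_SECONDS, CANCEL_FLAG, REBOOT_CMD and the /tmp/kernel-ok path) is made up for illustration, so treat it as a starting point and not a finished tool.

```shell
#!/bin/sh
# Dead man's switch for a trial kernel boot (illustrative names throughout).
GRACE_SECONDS="${GRACE_SECONDS:-600}"        # how long you have to log in
CANCEL_FLAG="${CANCEL_FLAG:-/tmp/kernel-ok}" # touch this file to cancel
REBOOT_CMD="${REBOOT_CMD:-/sbin/reboot}"     # overridable for dry runs

# True (exit 0) while nobody has created the flag file.
reboot_wanted() {
    [ ! -e "$1" ]
}

# Clear any stale flag, wait out the grace period, then reboot if still wanted.
failsafe_timer() {
    rm -f "$CANCEL_FLAG"
    sleep "$GRACE_SECONDS"
    if reboot_wanted "$CANCEL_FLAG"; then
        $REBOOT_CMD
    fi
}

# From a boot script such as /etc/rc.local (run as root):
#     failsafe_timer &
# Once logged in and happy with the new kernel:
#     touch /tmp/kernel-ok
```

Since the reboot lands back on the good kernel (the ‘--once’ default has been used up by then), this covers the "booted fine but no network" case as well. Remember to remove the timer from the boot script once the new kernel has proven itself.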