Part 3 of the Linux Newbie Administrator Guide
3.1.6 How do I deal with a hanged program?
Buggy programs do hang under Linux. A crash of an application should not, however, affect the operating system itself, so you should rarely have to reboot your computer. Linux servers are known to run for more than a year without a reboot. In our experience, a misbehaving operating system may be a sign of hardware or configuration problems. We repeatedly encountered problems with the Pentium processor overheating (the fan on the Pentium did not turn as fast as it should or it stopped altogether, or the heat sink on the Pentium was plugged with dirt), bad memory chips, different timing of different memory chips (you may try re-arranging the order of the chips, it might help), and wrong BIOS setup (you should probably turn off all the "advanced" options; Linux takes care of things by itself). The "signal 11" error message is typically (99%) associated with hardware problems and is most likely to manifest itself when you perform computing-intensive tasks: Linux setup, kernel compilation, etc.

If your Pentium has the tendency to overheat (very common for early Pentiums), here are some tips to keep it cool, particularly during hot weather: clean the processor heat sink, replace the processor fan, operate the computer with the cover off and aim an extra fan inside, increase the processor "wait-state" in the computer BIOS, don't overclock, and decrease useless load, e.g., replace that super-fancy screen saver with a blank screen.

Not really hanged. Some programs might give the uninitiated the impression of hanging, although in reality they just wait for user input. Typically, this happens if a program expects an input filename as a command line argument and no input filename is given by the user, so the program defaults to the standard input (which is the console). For example, the cat command typed without a filename looks as if it hanged, but it is really waiting for keyboard input; pressing Ctrl+D (which means "end-of-file") satisfies it.

A text-mode program in the foreground can often be killed by pressing Ctrl+C. If that does not help, look up the program with:

    ps

This command stands for "process status" and shows the list of programs that are currently being run by the current user. In the ps output, I find the process id (PID) of the program that hanged, and now I can kill it. For example:

    kill 123

will kill the program with the process id (PID) of "123". As a user, I can only kill the processes I own (that is, the ones which I started). Root can kill any process.

To see the complete list of all processes running on the system, issue:

    ps axu | more

This lists all the processes currently running (option "a"), even those without a controlling terminal (option "x"), together with the login name of the user that owns each process (option "u"). Since the display is likely to be longer than one screen, I piped it through "more" so that the display stops after each screenful.

The kill command has a companion, killall, which kills programs by name. For example:

    killall netscape

will kill every process named "netscape", while

    killall pppd

will surely disconnect any dial-up connection by killing the ppp daemon.

X-windows-based programs have no controlling terminal and may be easiest to kill using this (typed in an X-terminal):

    xkill

after which the cursor changes into something looking like a death sentence. Point at the window of the program to kill and press the left mouse button; the window disappears for good, and the associated program is terminated. Some desktops also bind a keyboard shortcut to this command.

If you have stopped programs in the background, the shell will object to your logging out and issue a message like "There are stopped jobs". To override this and log out anyway, just repeat the logout (or exit) command. The background program(s) will be automatically terminated and you will be logged out.

Core files. When a program crashes, it often dumps a "core" into the current working directory (often your home directory). This is accompanied by an appropriate message. A core is a memory image (plus debugging info) and is meant to be a debugging tool. If you are a user who does not intend to debug the program, you may simply delete the core:

    rm core

or do nothing (the core will be overwritten the next time a core is dumped). You can also disable dumping the core using the command:

    ulimit -c 0

Check if it worked using:

    ulimit -a

(This shows "user limits"; the option "-a" stands for "all".) To make the option of disabling core dumps permanent for all users, edit the file /etc/profile (as root), where ulimit is set, and adjust the setting. Log in again for the changes to /etc/profile to take effect.

If you would like to see how a core file can be used, try (in the directory where you have a core file):

    gdb -c core

This launches the GNU debugger (gdb) on the core file "core" and displays the name of the program that created the core, the signal on which the program was terminated, etc. Type "quit" to exit the debugger. To learn the meaning of the different signals, try:

    cat /usr/include/bits/signum.h | more
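
The find-the-PID-then-kill routine and the core dump limit described above can be tried safely without a real hanged program. What follows is only a minimal sketch, assuming a POSIX-style shell: the "sleep 300" job is a hypothetical stand-in for the hanged program, and the variable names (pid, status, result, core_limit) are made up for this illustration.

```shell
# Start a harmless stand-in for a "hanged" program in the background.
sleep 300 &
pid=$!                       # $! holds the PID of the last background job

# Show the stand-in in the process list, as you would look it up by hand.
ps -p "$pid"

# Ask the process to terminate; kill sends SIGTERM unless told otherwise.
kill "$pid"

# Collect the exit status; a process ended by SIGTERM reports 128 + 15.
wait "$pid" 2>/dev/null
status=$?

# "kill -0" sends no signal, it only tests whether the PID still exists.
if kill -0 "$pid" 2>/dev/null; then
    result="still running"
else
    result="terminated"
fi
echo "PID $pid: $result (wait status $status)"

# Disable core dumps for this shell only, then read the limit back.
ulimit -c 0
core_limit=$(ulimit -c)
echo "core file size limit: $core_limit"
```

Both changes are local to the shell that runs them: the killed process is the script's own background job, and the core limit returns to its usual value in a new login shell unless /etc/profile says otherwise.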

