MTsort Troubleshooting guide

Introduction

There will be times when the sort program fails to function properly. Failures can arise at several stages (setup, execution, spectrum creation), and this page is intended to help the user work out how to proceed.

Sort setup problems

Make sure you have enough disc space and quota to create the directory structure that contains all the files necessary to successfully run the sort.
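The free-space part of this check can be sketched in Python as below. This is a minimal illustration, not part of MTsort: the directory and the 100 MB figure are placeholders, and the script says nothing about quota, which has to be checked with the system's own quota tools.

```python
import shutil


def free_megabytes(path):
    """Return the free space, in MB, on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / (1024 * 1024)


def has_room(path, needed_mb):
    """True if at least `needed_mb` MB are free on the filesystem holding `path`."""
    return free_megabytes(path) >= needed_mb


if __name__ == "__main__":
    sortdir = "."      # placeholder: replace with your own sort directory
    needed_mb = 100    # placeholder figure, not an MTsort requirement
    print(f"{free_megabytes(sortdir):.0f} MB free; "
          f"enough for {needed_mb} MB: {has_room(sortdir, needed_mb)}")
```

Run it from the directory where the sort is to be set up, before starting the setup.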

Compiling the sort source file will result in messages appearing in the text window at the bottom of the Setup window. If the setup has worked then the message

Sort setup of xxxxx successful

will appear at the end of the messages.

Errors in the syntax of the sort source file will be listed. Start with the first one, as later errors may be consequences of earlier ones.
If you cannot resolve an error, send the message together with the sort source file to the address below.

Sort execution problems

If the setup and compile seem to work but the sort fails when run, try setting "Pause before each event" before starting the sort with GO.
This results in a complete setup, including the spectra, with the sort pausing before any events are processed.
In this way setup problems and run problems can be separated.

If the setup and compile work but the Run windows do not appear, look for errors in the session xterm.

"sos proc start returned 0x100015" errors seen in the session xterm may be caused by the hostname used in the apparatus claim not being in the host tables, e.g. the entry
 127.0.0.1   localhost
is missing from the /etc/hosts file.
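A quick way to see whether a name has an entry in /etc/hosts is sketched below in Python. It is an illustration only: the helper names are made up for this example, and checking "localhost" plus the machine's own hostname is an assumption about which names the apparatus claim needs.

```python
import socket


def hosts_entries(hosts_text):
    """Map each hostname or alias in hosts-file text to its IP address."""
    entries = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()    # drop comments and padding
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        for name in names:
            entries.setdefault(name, address)   # keep the first entry found
    return entries


def has_host(hosts_text, hostname):
    """True if `hostname` appears as a name or alias in the hosts-file text."""
    return hostname in hosts_entries(hosts_text)


if __name__ == "__main__":
    try:
        with open("/etc/hosts") as hosts_file:
            text = hosts_file.read()
    except OSError as exc:
        print("cannot read /etc/hosts:", exc)
    else:
        # Assumption: these are the names worth checking on this machine.
        for name in ("localhost", socket.gethostname()):
            print(name, "present" if has_host(text, name) else "MISSING")
```

Any name reported as MISSING is a candidate for adding to /etc/hosts.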

Spectrum errors

During startup of a sort, spectra may have to be created. Errors during this stage will appear in the "Sort Message" text window at the bottom of the Sort Run window.

For example, "cant create RPC client (30015)" errors are caused by the shared memory spectrum server not running. Make sure a "shmsas" task is running. It can be started by executing the "rerunshmsas" command, or from the Setup window.

Error 0x3002b is caused by a previous sort run not exiting cleanly. Exit the run window and "kill -9" all of your tasks called "MTsortSchedule".

"No shared memory space free" is probably caused by debris remaining from a previous sort run. Exit from the running sort, then check for any MTsortSchedule, MTsortInput, MTsortEvent or MTsortOutput processes that are still running and kill them. Then execute killshm, which is found in the bin_Linux (or bin_SunOS) directory under /MIDAS/MTsort.
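The search for leftover processes can be sketched in Python as follows. This is an illustration under stated assumptions, not an MTsort tool: it relies on the standard "ps -eo pid,user,comm" output, the function name is made up for this example, and it only lists the processes, leaving the kill to you.

```python
import getpass
import os
import subprocess

# Process names taken from the text above.
SORT_TASKS = ("MTsortSchedule", "MTsortInput", "MTsortEvent", "MTsortOutput")


def leftover_tasks(ps_text, user, names=SORT_TASKS):
    """Return (pid, name) pairs for `user`'s processes whose command is in `names`.

    `ps_text` is the output of `ps -eo pid,user,comm`, header line included.
    """
    found = []
    for line in ps_text.splitlines()[1:]:       # skip the header line
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        pid, owner, command = fields
        name = os.path.basename(command.strip())  # tolerate a full path
        if owner == user and name in names:
            found.append((int(pid), name))
    return found


if __name__ == "__main__":
    try:
        # Note: ps may shorten or replace long user names in its USER column.
        user = getpass.getuser()
        listing = subprocess.run(["ps", "-eo", "pid,user,comm"],
                                 capture_output=True, text=True,
                                 check=True).stdout
    except (OSError, KeyError, subprocess.CalledProcessError) as exc:
        print("could not list processes:", exc)
    else:
        for pid, name in leftover_tasks(listing, user):
            print(pid, name)
```

An empty listing means none of the four sort tasks is still running under your user name.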

Error 0x30000 >>> OK is caused by a previous sort run overwriting one of the map files. This may have been fixed in the latest release. If it still occurs, exit the Run and compile the sort again.

Other 0x300xx errors are caused by problems associated with the shared memory spectrum server. Make sure a "shmsas" task is running. If it is, try freeing and claiming the shmsas resource in the MIDAS Apparatus window.

Many messages are output in the Sort Scheduler window as a sort is started, so error messages may not be noticed immediately. If a sort starts and the tape does seem to move, but no counts appear in the spectra, look more closely at the list of messages.

If there is a problem reading one of the map files, or in creating a spectrum, the event handler may exit. Correct the problem with the named map/spectrum file.

It is possible for the maps, spectra etc. to be created incorrectly if the filesystem that the sortdir resides on is NFS-exported from a server, and the time is not in sync between the server and client.
Ensure the client and server are both running a time-updating task such as ntpdate.
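One rough way to spot such a clock difference from the client is to create a scratch file in the sortdir and compare its modification time with the local clock. The Python sketch below assumes the usual NFS behaviour in which the server stamps newly written files. The function name is made up for this example, and the result is only indicative.

```python
import os
import tempfile
import time


def mtime_skew_seconds(directory):
    """Create a scratch file in `directory`; return its mtime minus the local clock.

    On an NFS mount the timestamp is normally assigned by the server, so a
    large absolute value suggests the two clocks disagree.
    """
    before = time.time()
    with tempfile.NamedTemporaryFile(dir=directory) as scratch:
        scratch.write(b"x")
        scratch.flush()
        os.fsync(scratch.fileno())              # push the write to the server
        stamp = os.fstat(scratch.fileno()).st_mtime
    after = time.time()
    # Compare against the midpoint of the local times around the write.
    return stamp - (before + after) / 2.0


if __name__ == "__main__":
    # Stand-in directory: replace with the path of your own sortdir.
    sortdir = tempfile.gettempdir()
    print(f"apparent clock difference: {mtime_skew_seconds(sortdir):+.2f} s")
```

A result of more than a second or two in either direction is a hint that time synchronisation should be checked on both machines.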

If a task thread crashes with a SEGV error, the thread ID number will be printed out. Trace back through the messages to find when the thread was started and see whether it is an input, output or event thread.

If the input handler has crashed, the data format may be the problem. Has the data format for that handler type changed in any way compared to the description in the help window?

If the event handler thread has crashed and you are using an Input handler other than Eurogam, then the data format may still be the problem, as above. Check the data format you have on the tapes.
Alternatively, if you are including a USER routine, it may contain a bug.

Please report any other crashes to the address below, with all relevant information.

Reporting problems

Any problems should be reported to the address below, with whatever information is relevant to the problem. This should include the Input handler type, whether the sort is outputting to disc or tape, and the machine type (Solaris or Linux). We may also need the sort source file.

Maintained by John Cresswell and Janet Sampson (University of Liverpool)
Email to support@ns.ph.liv.ac.uk