How to start cobra's online dedispersion code

This briefly describes the procedure for observing with COBRA using a number of separate 5 MHz bandwidth samplers.

I currently have a vncserver running with all the windows open, in all the necessary directories to do a single sampler run.
If you run vncviewer and log into megahard:5 with the online password, you should be able to pick up wherever it last left off.
There are 4 working directories at the moment, and different samplers should be run from separate directories because the software passes information between processes in files with the same names.

These are the directories that we're currently using:
/raid1/online/obs/subb1
/raid1/online/obs/subb2
/raid1/online/obs/subb3
/raid1/online/obs/subb4
I usually use subb1 for testing things with one sampler, so this should have the current version of the software: cobra_acquisition, cobra_display, cobra_command_testx and cobra_testy. It's worth making sure you have the same versions of these programs in the other subb directories if you're going to run more samplers.
I'm using the file README in this directory to make notes. If you get stuck or get error messages, have a look at this file;
if I've met the error before, it may be noted there.
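
One way to check that the other subb directories hold the same program versions as subb1 is a byte-for-byte comparison. The following is only a sketch: the function name check_versions is made up for illustration, and the program names passed to it should be whatever the current ones are (the x and y suffixes above vary).

```shell
# Compare programs in a reference directory against another directory.
# Prints one line per program that is missing or differs from the reference;
# prints nothing if everything matches.
# Usage: check_versions REF_DIR OTHER_DIR PROGRAM...
check_versions() {
    ref_dir=$1
    other_dir=$2
    shift 2
    for prog in "$@"; do
        if [ ! -f "$other_dir/$prog" ]; then
            echo "missing: $other_dir/$prog"
        elif ! cmp -s "$ref_dir/$prog" "$other_dir/$prog"; then
            echo "differs: $other_dir/$prog"
        fi
    done
}
```

For example: check_versions /raid1/online/obs/subb1 /raid1/online/obs/subb2 cobra_acquisition cobra_display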

The following sequence of commands must be repeated for each 5 MHz subband.
ADLINK drivers
Log into the sampler node as root.
Check whether the drivers are loaded by typing lsmod:
[root@node1-7 /root]# lsmod
Module                  Size  Used by
p7300b                 35633   1 
adl_mem_mgr             2632   2  [p7300b]
If these two drivers are not seen, install as follows:
cd /raid1/bjoshi/testsamp/
./dask_inst_SMP.pl 
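
The driver check could be scripted roughly as follows. This is a sketch, not part of the COBRA software: the function name adlink_drivers_loaded is hypothetical, and it only looks for the two module names shown in the lsmod listing above.

```shell
# Read lsmod output on standard input and succeed only if both
# ADLINK modules (p7300b and adl_mem_mgr) appear in the first column.
adlink_drivers_loaded() {
    awk '
        $1 == "p7300b"      { dev = 1 }
        $1 == "adl_mem_mgr" { mem = 1 }
        END { exit !(dev && mem) }
    '
}
```

For example: lsmod | adlink_drivers_loaded || echo "ADLINK drivers not loaded"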
cobra_acquisition
This program should be run as root on the sampler node, in the same subbx directory as everything else,
since it writes a parameter file header_ipcx, which holds shared memory parameters for other processes
to pick up. Make sure you remove any stop_acq file from the directory before starting.
Run command:
./cobra_acquisition <nsleep> <rank> <mjd> <sec> <pac_size> [logfile]
Where
  • nsleep is a delay in startup
  • rank is a variable to separate data from different samplers (it gets appended to the epn file name
    and the shared memory parameters file name)
  • mjd and sec are no longer in use, but were used to set the time tag on the data
  • pac_size is the size of the data packets
  • logfile is optional, but should be on a local disc since it may take a lot of traffic
e.g.
./cobra_acquisition 1 2 52575 40000 16384 /scratch/COBRA/acq.log
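
A small wrapper can take care of the stale stop_acq file and check the argument count before anything is started. This is a sketch under assumptions: acq_command is a made-up name, and it only prints the command line (to be checked and then run as root in the subbx directory) rather than running cobra_acquisition itself.

```shell
# Remove any leftover stop_acq file from the working directory, then print
# the cobra_acquisition command line for the given arguments.
# Usage: acq_command DIR NSLEEP RANK MJD SEC PAC_SIZE [LOGFILE]
acq_command() {
    if [ "$#" -lt 6 ] || [ "$#" -gt 7 ]; then
        echo "usage: acq_command DIR NSLEEP RANK MJD SEC PAC_SIZE [LOGFILE]" >&2
        return 1
    fi
    acq_dir=$1
    shift
    rm -f "$acq_dir/stop_acq"
    echo "./cobra_acquisition $*"
}
```

For example: acq_command /raid1/online/obs/subb1 1 2 52575 40000 16384 /scratch/COBRA/acq.log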
mpi software
All the rest of the software should be run from cobra, using the online account.
mpi software - parameter files
The processing software reads a set of parameter files on startup (in the current directory, as usual).
They are:
  • cobra_test.master.par
  • cobra_test.collector.par
  • cobra_test.server.par
  • cobra_test.client.par
I've annotated them a bit, but you'll probably only need to edit them if you change the number of
clients; everything else is mostly standard.
mpi software - starting it!
First remove any files called shut, abort or setup.
The main processing software is started with the script mpistart20, which should be edited for the
processing nodes that you want to use, e.g.
mpimon cobra_test5 -- node1-8 1 node1-8 1 node1-7 1 node2-1 2 node2-2 2 node2-3 2 node2-4 2 node2-5 2 node2-6 2 node2-7 2 node2-8 2 node2-9 2 node2-10 2
In this example:
  • cobra_test5 is the current evolution of the processing software.
  • node1-8 runs the master, which communicates with the outside world.
  • node1-8 also runs the collector (both are fairly low cpu/memory applications).
  • node1-7 runs the server, which must run on the sampler node.
  • node2-1 to node2-10 run 20 clients (2 per node).
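
When the number of client nodes changes, the mpimon line can be generated rather than edited by hand. The sketch below assumes the layout of the example above (master and collector on one node, server on the sampler node, two clients on each of node2-1 to node2-N); the function name mpimon_command is hypothetical and it only prints the line.

```shell
# Print an mpimon command line: master and collector on MASTER_NODE,
# the server on SAMPLER_NODE, then two clients on each of node2-1 .. node2-N.
# Usage: mpimon_command PROGRAM MASTER_NODE SAMPLER_NODE N_CLIENT_NODES
mpimon_command() {
    mpi_line="mpimon $1 -- $2 1 $2 1 $3 1"
    mpi_i=1
    while [ "$mpi_i" -le "$4" ]; do
        mpi_line="$mpi_line node2-$mpi_i 2"
        mpi_i=$((mpi_i + 1))
    done
    echo "$mpi_line"
}
```

Calling mpimon_command cobra_test5 node1-8 node1-7 10 reproduces the example line above. Remember that the client parameter file also needs editing if the number of clients changes.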
cobra_command
In a different window, you now need to start cobra_command, which links the MPI programs to the
outside world. This needs to be run at least 3 seconds after running mpimon, and before all the
node ready messages start coming along (see below). The run command is:
cobra_command_test7 192.168.0.108 1080 1082 d a 
where the IP address is that of the master process, and the two port numbers can be pretty much
anything (>1060) but should be those given in the parameter file for the master. d is a flag for logging and
a is a flag to look for commands from arthur (rather than the keyboard).
If you've started successfully, you'll get the message:
Connected to: 192.168.0.102
Connected to: 192.168.0.102
mpi software - what it does on startup...
As each node starts you get a message, and finally a node ready message, e.g.
Linux node2-1 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-1 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-5 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-5 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-3 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-3 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-8 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-8 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-4 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-2 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-9 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-9 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-4 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-2 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-7 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-7 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-6 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-6 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-10 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node2-10 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node1-8 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node1-8 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Linux node1-7 2.4.3-12scalismp #1 SMP Wed Oct 24 13:20:46 CEST 2001 i686 unknown
Cobra_master on node1-8 checking if nodes are ready
    Node 0 ready
    Node 1 ready
Server (node1-7) - System ready
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
    Node 2 ready
    Node 3 ready
    Node 4 ready
    Node 5 ready
    Node 6 ready
    Node 7 ready
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
Pack sizes : 16777216 16777236 33555476
    Node 8 ready
    Node 9 ready
    Node 10 ready
    Node 11 ready
    Node 12 ready
    Node 13 ready
    Node 14 ready
    Node 15 ready
    Node 16 ready
    Node 17 ready
    Node 18 ready
    Node 19 ready
    Node 20 ready
    Node 21 ready
    Node 22 ready
Master (node1-8) - System ready
And that's it, ready to go.
If you get "process 2 - segmentation violation", check that cobra_acquisition hasn't died.
Integrating - manually
Create a suitable pulsar.eph file and setup.in in subbx, then
touch setup

 To stop an integration, touch abort

 To shut down everything (including cobra_acquisition) touch shut
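
The control files can be wrapped in a couple of helper functions. This is a sketch with made-up function names. It relies on the behaviour described under Progress messages, where cobra_command removes setup and abort once it has read them; whether shut is removed as well is not stated here, so the wait is only known to be meaningful for setup and abort.

```shell
# Remove stale control files from a working directory before starting up.
# Usage: clear_control_files DIR
clear_control_files() {
    rm -f "$1/shut" "$1/abort" "$1/setup"
}

# Create a control file in DIR and wait until it has been removed again,
# checking once a second. Returns 0 if the file was consumed, 1 if it is
# still there after TIMEOUT seconds.
# Usage: send_control DIR NAME TIMEOUT
send_control() {
    ctl_file="$1/$2"
    ctl_left=$3
    touch "$ctl_file"
    while [ -e "$ctl_file" ]; do
        [ "$ctl_left" -gt 0 ] || return 1
        sleep 1
        ctl_left=$((ctl_left - 1))
    done
    return 0
}
```

For example: send_control /raid1/online/obs/subb1 setup 10 || echo "setup not picked up, is cobra_command running?"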

Integrating - arthur driven
In yet another window, go to the directory /raid1/online/scripts/ and check the file
cobracq_config_lovell.dat. Currently the only part of this in use is the frequency of the sampler!
Run the script /raid1/online/scripts/total_lovell.pl,
which checks what the filter bank hardware is doing and creates the appropriate pulsar.eph and
setup.in files, then starts and stops integrations in time with the filter bank observations.
Progress messages
When a new setup file is created in the working directory, the following sequence should take place:
  • cobra_command will read the files setup.in and pulsar.eph, and remove setup. It displays some
    of the parameters that it has found, e.g.
    s
    10 10 2002
    J1635+2418 pulsar.eph 52557.000000 480 12 h 1352.500000
    Using parameter file pulsar.eph
     254
    Arg 28 225 254
    and sends the data to cobra_master.
  • cobra_master, in the mpistart20 window, gets the parameters and displays some, e.g.
    header 1342.500000 5.000000 27.215000 0.00000020
    27.215000 1342.500000 5.000000 0.000000 0.000467 2333
    Server sent setup command to data acq
    Server allocated send buffers
    Opening /scratch/COBRA/polyco.dat003
    Opening /scratch/COBRA/polyco.dat006
    Opening /scratch/COBRA/polyco.dat005
    Opening /scratch/COBRA/polyco.dat004
    Opening /scratch/COBRA/polyco.dat007
    Opening /scratch/COBRA/polyco.dat008
    Opening /scratch/COBRA/polyco.dat009
    Opening /scratch/COBRA/polyco.dat011
    Opening /scratch/COBRA/polyco.dat010
    Opening /scratch/COBRA/polyco.dat012
    Opening /scratch/COBRA/polyco.dat013
    1144
    Opening /scratch/COBRA/polyco.dat015
    Opening /scratch/COBRA/polyco.dat014
    Opening /scratch/COBRA/polyco.dat016
    Server allocated ring buffers
    Server print_epnhead
    Server waiting for ready from acq
    Opening /scratch/COBRA/polyco.dat017
    Opening /scratch/COBRA/polyco.dat018
    Opening /scratch/COBRA/polyco.dat020
    Opening /scratch/COBRA/polyco.dat019
    Opening /scratch/COBRA/polyco.dat021
    Opening /scratch/COBRA/polyco.dat022
    Server received ready from acq
    mstat now : 1
    Master (node1-8) - System ready
    Server sent begin command to data acq
    Computing profiles 3
    Computing profiles 4
    Computing profiles 5
    Computing profiles 6
    Computing profiles 7
    Computing profiles 8
    Computing profiles 3
    Computing profiles 4
    Computing profiles 9
    etc. until a stop is received (an abort file created in the working directory). Integrating will
    continue indefinitely, with an epn file written after each x secs (specified in setup.in). The only
    indication that an epn file has been created is a new profile displayed by cobra_display or a
    directory listing showing new files.
  • When cobra_command detects a file abort, it will send a stop to cobra_master and remove the abort
    file. It prints the following message:
    new cmd e
    cobra_master sends a stop round to all the other processes, which finish processing their current
    buffer and finally return to System ready.
    Computing profiles 10
    Computing profiles 11
    Server sent end command to data acq
    Server received ready from acq
    Computing profiles 4
    Computing profiles 13
    Computing profiles 3
    Computing profiles 6
    Computing profiles 12
    Computing profiles 5
    Computing profiles 7
    Computing profiles 8
    Master (node1-8) - System ready
cobra_display
In the current directory, run cobra_display, which checks for new integrations and displays them.
To stop everything
In the current directory, touch shut and Ctrl-C the total_lovell script if you're running it.
There are long sleeps in some of the process shutdowns, and it may take about 60 secs to finally exit
the mpi stuff.

Christine Jordan

Last modified: Fri Feb 7 15:44:44 GMT 2003