This tutorial describes the data reduction for MERLIN 5-GHz radio continuum observations of the radio galaxy 3C277.1. This version covers the simplest steps. For more information see www.merlin.ac.uk. Please note that the general principles are common to most interferometer data but the exact route through data reduction and the parameters such as solution intervals depend on the nature of the array, the observer's intentions and the quality of the data, so different data sets would require modified inputs.
These data were taken using 6 antennas, Ca (32 m) and Mk Ta Da Kn De, all 25 m. Their sensitivities vary, with De being by far the least efficient at this frequency, but is is important for North-South coverage. The array contains baselines up to 217 km. A bandpass correction and approximate flux scale have already been applied and the data are averaged to one 14-MHz channel. All four polarizations (RR RR LR RL) are present but we will just use total intensity. Antenna positions are geocentric, terrestrial E, N positive.
The data can be downloaded from 3C277.1.MULTTB which is a multi-source file containing:
Two flux scale calibrators are used because sources like 3C286 with stable, well-known amplitudes are usually extended over several arcsec or more and hence are heavily resolved by MERLIN. On the other hand, point-like, bright sources like OQ208 are usually variable.
The script below covers:
It attempts to explain the least complicated route through the data calibration and imaging processes. It is possible to combine some steps at the expense of more complicated housekeeping, which might be important for very large data sets. You can plot any calibration step or compare its effects on the data, so you can investigate the impact of changing the values of parameters. Note that the default for most tasks is not to overwrite existing data or calibration files, so the task will fail if the directory or file already exisits. The exception is the output of the task clean, which will carry on further iterations of deconvolution on an existing output image.
Further information on any task e.g. importuvfits is obtained by clicking on the link the first time the task is mentioned, or from the CASA cookbook, or by typing
help('importuvfits')
inp importuvfits
Start CASA and set variables for convenience:
prefix ='3C277.1R' msfile = prefix + '.ms'
The msfile is the Measurement Set (MS) used by CASA for visibility data. If you want to delete all data files created here by CASA at any time, type
!rm -r 3C277.1R*
Load the fits data using importuvfits and list the contents using listobs.
default importuvfits vis=msfile fitsfile='3C277.1.MULTTB' inp importuvfits() listobs()
You will get a few warnings in the Logger which you can ignor, as long as listobs gives something like this in the Logger (some columns/messages omitted):
##### Begin Task: listobs #####
================================================
MeasurementSet Name: /home/amsr/scratch/ALMA/CASANOTES/3C277.1/3C277.1R.ms MS Version 2
===============================================================================
Observer: Project:
Observation: MERLIN2
Data records: 81210 Total integration time = 49913 seconds
Observed from 18-Apr-1995/17:08:02.0 to 19-Apr-1995/06:59:55.0 (UTC)
listobs::ms::summary
ObservationID = 0 ArrayID = 0
Date Timerange (UTC) Scan FldId FieldName nVis Int(s) SpwIds
18-Apr-1995/17:08:02.0 - 17:09:54.0 1 10 1300+580 225 7.99 [0]
17:10:37.0 - 17:17:49.0 2 11 3C277.1 825 7.99 [0]
17:18:36.0 - 17:19:56.0 3 10 1300+580 165 7.99 [0]
17:20:35.0 - 17:27:55.0 4 11 3C277.1 840 7.99 [0]
17:28:42.0 - 17:29:54.0 5 10 1300+580 150 7.99 [0]
.....
21:43:01.0 - 22:04:53.0 56 0 3C286 2475 7.99 [0]
22:06:00.0 - 22:59:52.0 57 1 OQ208 6075 7.99 [0]
.....
(nVis = Total number of time/baseline visibilities per scan)
.....
Fields: 4
ID Code Name RA Decl Epoch nVis
0 3C286 13:31:08.2873 +30.30.32.9590 J2000 2475
1 OQ208 14:07:00.3944 +28.27.14.6880 J2000 6075
10 1300+580 13:02:52.4649 +57.48.37.6175 J2000 11745
11 3C277.1 12:52:26.2859 +56.34.19.4882 J2000 60915
(nVis = Total number of time/baseline visibilities per field)
Spectral Windows: (1 unique spectral windows and 1 unique polarization setups)
SpwID #Chans Frame Ch1(MHz) ChanWid(kHz)TotBW(kHz) Ref(MHz) Corrs
0 1 TOPO 4993.5 14000 14000 4993.5 RR RL LR LL
The SOURCE table is empty: see the FIELD table
Antennas: 6:
ID Name Station Diam. Long. Lat.
0 1 DEFFORD 0.0 m +002.08.35.0 +51.54.50.9
1 2 CAMBRIDG 0.0 m -000.02.19.5 +51.58.50.2
2 3 KNOCKIN 0.0 m +002.59.44.9 +52.36.18.4
3 4 DARNHALL 0.0 m +002.32.03.3 +52.58.18.5
4 5 MK2 0.0 m +002.18.08.9 +53.02.58.7
5 6 TABLEY 0.0 m +002.26.38.3 +53.06.16.2
listobs::::casa
##### End Task: listobs #####
##########################################
Each integration is 8 sec (allowing for the annoying formatting of floating point numbers). This is the duration of a single complex visibility, corresponding to the phase and amplitude for one baseline, one polarization. The polarization products present (also termed correlations or corr in CASA) are LL, LR, RL and RR i.e. full polarization. The target:phase reference source are observed on an 8:2 min cycle. 3C286 and OQ208 are observed in the middle of the run. The diameters of all antennas are given as 0.0 m but as the target does not extend far out in the primary beam we can ignor this. Here, there is only one frequency sub-band, termed a spectral window (spw), containing a single 14-MHz channel.
The paths swept out by the baselines, as seen in projection from the sources, are known as uv tracks. Plot these for the phase-reference source using plotms.
default(plotms) vis=msfile xaxis='u' yaxis='v' field= '1300+580' spw='0' correlation='LL,RR' inp plotms
At present the conjugate visibilities are not plotted and the units are metres (not wavelength units) so this may not look like a plot which you are used to, but it shows that the maximum projected baseline is about 210 km and the minimum is 10 km. At a wavelength of 6 cm, this means that the resolution of the array is about 50-60 mas and the largest spatial scale which can be imaged is about 1 arcsec.
Using the Axes tab of plotms to plot amplitude as a function of uvdistance ('UVDist_L' is in wavelength units). You can use the Display tab to colour the points by the first antenna in each baseline. This shows a lot of scatter associated with the antenna contributing baselines around 1.5-2 Mlambda, which is Defford, known to be less sensitive at 6cm wavelength. Colouring by antenna 2 shows that most of the other low points come from antenna 6, Tabley (use the locate facility to find this, easiest to get someone to show you how). This is shown in the figure below.
Right-click on any plot to see it full-size.
The plot shows some bad data but the amplitudes are otherwise fairly constant with uv distance, confirming that the phase-reference, 1300+580, source is point-like.
Experiment with plotting the other sources and confirm which sources are extended or pointlike as expected. In Data Selection set spw 0:3~12 and for Averaging use Time 30 (seconds), which known to be suitable to maximise signal-to-noise in raw data, as demonstrated later.
The target, 3C277.1, has more flux on short baselines and complex structure.
Change the axes to plot the phase reference source amplitudes as a function of time. Zoom in on the later times which show bad data. Use the Tools tab to activate Hover and Display. This means that if you select phase-reference data to flag, you can note the same time-range to flag for the target. Use Locate to see which antennas are responsible (the times etc. to flag are given below, so if in a hurry don't spend too much time on this).
Phase-ref amplitudes (averaged every 30 sec). The Locate function shows which antennas are responsible for each of the major patches of bad data. You can flag these interactively or by noting the time ranges and using the flagdata task, see next step. You can also flag the isolated low points (which are unlikely also to affect the target).
Use flagdata to edit the main periods of bad data on the phase-reference and target together. Small falls in amplitude will be taken care of by calibration, but multiplying up very faint data adds too much noise.
default flagdata vis=msfile spw='0' field='1300+580','3C277.1' antenna='6&*' timerange='1995/04/19/02:01:00~1995/04/19/03:49:00' inp flagdata antenna='4&*' timerange='1995/04/19/05:23:00~1995/04/19/05:38:00' flagdata antenna='2&*' timerange='1995/04/19/06:24:00~1995/04/19/06:35:00' flagdata
Check the messages in the logger and check by re-plotting the data. When visibility data are loaded into CASA, as well as creating a MS, a .flagversions directory is created with a (usually empty) version 'Original'. You can use flagmanager to back up flags (or restore a previous version). If all the flagging so far has worked, back up the flags.
default(flagmanager) vis=msfile mode='save' versionname = msfile+'_calsource.flags' inp flagmanager()
Inspect the amplitudes of OQ208 and 3C286 as a function of time and then phases as a function of time. For point sources, with a good position, it should be flat at zero. In the plots below, the phase wanders about but can be traced easily from integration to integration. If the phase looks like pure noise, you could try averaging up but if that still does no good on a calibration source, the time should be flagged. Here, it looks all OK unless you plot the data already flagged due to bad amplitude.
The phases for 1300+580, for all baselines to the most distant antenna (Cambridge), change by up to a full turn in a day, seen in the upper plot. The middle plot shows that the phase only changes by about 10 deg or less in 5 min. The data shown in red in the lowest plot was already flagged due to low amplitudes and also has bad (apparently random) phase.
The phase-reference source is observed for about 90 sec in between each scan on the target. The plots above show that if we choose a 30-sec solution interval, we will achieve a phase accuracy of about 5 degrees or better, and this also allows accurate interpolation onto the target data.
The origin of phase is arbitrary and in order to achieve consistent calibration, especially for polarization, the phase of one antenna is set to zero and the other phases are worked out relative to that. Plot the antenna distribution using plotants to chose a reference antenna with good data with many short baselines (slower phase-rates so less chance of solutions failing).
vis = msfile plotants
MERLIN antenna positions
Close the plotter or you may have file lock problems in future. If you do find that plotms or some other task is hanging, or you get messages about waiting for a lock, close any (other) CASA visualisation tools and type
clearstat
default('gaincal')
vis=msfile
caltable=prefix+'_cals.phcal'
field = '3C286,OQ208,1300+580'
spw='0'
solint = '0.5min'
refant='5'
calmode='p'
minsnr = 3.0
minblperant= 3
inp
gaincal()
Check the logger and plot the solutions:
default('plotcal')
caltable=prefix+'_cals.phcal'
field = '1300+580'
subplot = 611
iteration='antenna'
xaxis='time'
yaxis='phase'
plotrange=[-1, -1, -180, 180]
inp
plotcal()
Phase reference source phase solutions (the range is [-180,+180] deg)
default('applycal')
vis = msfile
gaintable= prefix+'_cals.phcal'
field='1300+580,3C286,OQ208'
calwt=False
parang=False
inp
applycal()
Plot the corrected phase (you can only select this after displaying the data).
default('plotms')
vis=msfile
xaxis='time'
yaxis='phase'
field='1300+580'
spw='0'
correlation='LL,RR'
avgtime='30'
inp
plotms
In the viewer, Axes tab, set the Data Column to 'corrected'. You should see that the phase is within about 5 degrees of zero except for a short period near the middle of the run, as shown below.
Corrected phase for the phase-reference source (all antennas, 30-sec average).
3C286 is a standard flux and polarization calibration source, with properties tabulated by Baars (1977). It is somewhat resolved even on the shortest MERLIN baselines at 5 GHz, hence a sub-standard flux density (96% of Baars) is inserted in the MODEL data column of the measurement set using setjy.
default('setjy')
vis = msfile
field = '3C286'
fluxdensity=[7.077,0.0]
inp
setjy()
gaincal is now used to derive amplitude solutions using a point model for the primary flux calibration source, 3C286 and the point source OQ208. The plot below shows that 3C286 is heavily resolved and we will select a uv range less than 500,000 wavelengths. Use plotms to reproduce this and check that the data in this range does come from at least 3 antennas. This gives 3 baselines and is the minimum possible for normal phase solutions.
3C286 is heavily resolved on all but the shortest baselines.
The phase and bandpass solution tables must be applied since gaincal uses the Data column. Set the uvrange to the range where 3C286 is almost unresolved.
default gaincal vis=msfile caltable=prefix+'_cals.a1cal' field='3C286,OQ208' selectdata=True uvrange='0~500klambda' solint='2min' minblperant =2 refant='5' gaintable=prefix+'_cals.phcal' calmode='ap' inp gaincal()
Check the solutions. The solutions for 3C286 are close to unity but those for OQ208 are not because we do not yet know its correct flux density. Note that even asking for 4 plots only produces 3, confirming that solutions were only found for the 3 antennas providing the shortest baselines.
caltable = prefix+'_cals.a1cal' xaxis='time' yaxis='amp' field='' antenna = '4,5,6' plotrange=[-1,-1,-1,-1] subplot=411 iteration='antenna' plotcal()
fluxscale is used to compare the solutions for OQ208 with those of 3C286 to find the flux density of the former.
default('fluxscale')
vis=msfile
caltable=prefix+'_cals.a1cal'
fluxtable=prefix+'_OQ208.flux'
reference='3C286'
transfer='OQ208'
inp
fluxscale()
Flux density for OQ208 in SpW=0 is: 2.43903 +/- 0.00453 (SNR = 537, nAnt= 3)The fluxtable will contain the proper scaling, but since we have only used a subset of antennas we have to set the flux density of OQ208 (insert your value below) and repeat the amplitude calibration using all antennas:
default('setjy')
vis = msfile
field='OQ208'
fluxdensity=[2.439,0]
inp
setjy()
default gaincal vis=msfile selectdata=False field='OQ208,1300+580' solint = '2min' uvrange='' antennas = '' refant = '5' minsnr=3 minblperant = 4 caltable = prefix+'_cals.a2cal' gaintable = prefix+'_cals.phcal' inp gaincal()
vis=msfile caltable=prefix+'_cals.a2cal' fluxtable=prefix+'_1300+580.flux' reference='OQ208' transfer='1300+580' inp fluxscale()
Flux density for 1300+580 in SpW=0 is: 0.433907 +/- 0.000773 (SNR = 561, nAnt= 6)
3C277.1R_1300+580.flux now contains correctly-scaled amplitude solutions (check in plotcal if you have time).
default('applycal')
vis = msfile
gaintable= prefix+'_cals.phcal',default('applycal')
vis = msfile
gaintable= prefix + '_cals.phcal', prefix+'_1300+580.flux',
field='1300+580,3C277.1'
calwt=False
parang=True
inp
applycal()
default('split')
vis=msfile
datacolumn = 'corrected'
outputvis='3C277.1R.split.ms'
field='3C277.1'
inp
split()
default(plotxy) vis = '3C277.1R.split.ms' xaxis='time' yaxis='amp' datacolumn='data' correlation='RR LL' iteration='baseline' field = '3C277.1' subplot=511 plotxy()
Amplitude and phase of the target after applying reference source and bandpass calibration, on all baselines to the reference antenna. The signal-to-noise is good enough to see the source in short time intervals on all baselines so it is possible to use self-calibration.
Imaging consists of two processes. The calibrated visibility data are Fourier transformed to make a Dirty Map. This is what will appear first in the Viewer during interactive cleaning. It contains many artefacts due to the sparse sampling of the sky by the interferometer. A plot of the Dirty Beam is shown lower down, demonstrating the sidelobes above 20% of the peak which extend over several beamwidths (defined as FWHM, about 50 mas). Cleaning is a process in which the brightest point(s) in the map are identified and a fraction, say 10%, of their flux density is recorded as Clean Components (CC) and the sidelobe response for the dirty beam centred on that position is scaled and subtracted from the map. After a certain number of iterations (known as minor cycles) the accumulated CCs are FT'd and subtracted from the visibility data, the remaining visibility data are FT'd to the map plane and the process continues (major cycles). When there is nothing left much above the noise, the cumulative list of CC are added to the residuals and a final Clean map is made, convolved with the restoring beam.
Image the target using clean. The longest baseline is 217 km and the wavelength is 6 cm. Use this to estimate the interferometer resolution and hence a suitable pixel size to give at least 3 pixels across the beam. A minimum image size of at least 512 pixels is needed for for adequate deconvolution of the rather extended dirty beam produced by the sparse MERLIN array. psfmode='hogbom' provides a large beam patch for deconvolving extended sidelobes.
Note that if interactive=True you have to set a cleaning mask yourself. This approach is suitable for a bright, extended source such as 3C277.1. Start clean as described below. When the Viewer appears, experiment with these steps:
default('clean')
vis='3C277.1R.split.ms'
cell=['0.015arcsec', '0.015arcsec']
imsize=[512,512]
psfmode='hogbom'
imagermode='csclean'
cyclefactor=4
stokes='I'
interactive=True
niter=1000
gain=0.05
field='3C277.1'
imagename='3C277.1R_clean1'
interactive=True
stokes='I'
mask=''
inp
clean()
saveinputs ('clean', '3C277.1.clean1.saved')
Check progress in the logger. In this instance we have let CASA fit the natural synthesised beam, the size is reported in the logger (and also recorded in the image header). The Clean model is left in the MS in the MODEL_DATA column. Use the viewer to load 3C277.1R_clean1.image as a raster map and use the 'data display options' to tweak it. You can access these via the spanner icon or the Data menu option Ajust.
viewer('3C277.1R_clean1.image')
You can also plot the Dirty Beam, '3C277.1R_clean1.psf'.
The dirty beam and its sidelobes.
There are several ways to measure the noise and the peak. If you draw a box and double-click, the rms etc. will be displayed. To set the measurement in a variable (here called rms_target) use imstat, setting the limits to avoid the image.
default('imstat')
imagename='3C277.1R_clean1.image'
box='20,20,492,120'
stokes='I'
noise_target = imstat()
rms_target=noise_target['rms'][0]
print rms_target
First image of the target, contours at [-0.0025,0.0025, 0.005, 0.01, 0.02, 0.04, 0.08] Jy/beam
default('gaincal')
vis='3C277.1R.split.ms'
caltable=prefix+'_self1.phcal'
field='3C277.1'
refant='5'
minblperant=3
minsnr=1
calmode='p'
solint = '2min'
parang=True
inp
gaincal()
caltable=prefix+'_self1.phcal' xaxis='time' yaxis='phase' subplot=611 iteration='antenna' plotrange = [-1, -1, -180, 180] plotcal()
Target phase solutions (the range is [-180,+180] deg).
default('applycal')
vis='3C277.1R.split.ms'
gaintable=prefix+'_self1.phcal'
field='3C277.1'
parang=True
calwt=False
inp
applycal()
execfile('3C277.1.clean1.saved')
Inspect and adapt the inputs, using the previous noise level as the minimum clean flux. You should stop cleaning when the minimum clean components have sidelobes fainter than the noise. You can calculate the stopping criteria more accurately by inspecting the sidelobe ratio of the dirty beam and comparing that with the expected final noise level. The mask is set below to use the final mask from the previous round of clean. You can leave this blank to start afresh, or you can modify the input mask as you did before.
inp
imagename = '3C277.1R_clean2'
threshold=str(rms_target)+'Jy'
niter=2000
mask='3C277.1R_clean1.mask'
clean()
saveinputs ('clean', '3C277.1.clean2.saved')
default('imstat')
imagename='3C277.1R_clean2.image'
box='20,20,492,120'
stokes='I'
noise_target = imstat()
rms_target=noise_target['rms'][0]
print rms_target
default('gaincal')
vis='3C277.1R.split.ms'
caltable=prefix+'_self2.ampcal'
field='3C277.1'
refant='5'
minsnr=1
calmode = 'ap'
minblperant=4
solint='3min'
gaintable=prefix+'_self1.phcal'
parang=True
inp
gaincal()
caltable=prefix+'_self2.ampcal' iteration='antenna' subplot=611 yaxis='amp' plotrange = [-1, -1, 0.8, 1.2] plotcal()
Target amplitude corrections, range [0.8,1.2]
vis='3C277.1R.split.ms' gaintable=prefix+'_self1.phcal',prefix+'_self2.ampcal' calwt=False parang=True inp applycal()
execfile('3C277.1.clean2.saved')
inp
imagename = '3C277.1R_clean3'
mask='3C277.1R_clean2.mask'
threshold=str(rms_target)+'Jy'
niter=5000
clean()
default('imstat')
imagename='3C277.1R_clean3.image'
box='20,20,492,120'
stokes='I'
noise_target = imstat()
rms_target=noise_target['rms'][0]
print rms_target
default('plotxy')
vis='3C277.1R.split.ms'
xaxis='uvdist'
yaxis='amp'
datacolumn='corrected'
selectdata=True
correlation='LL'
field='3C277.1'
timebin='180' # 3 min, same as solution interval
plotcolor='blue'
overplot=true
plotxy()
datacolumn='model'
plotcolor='red'
plotxy()
Target model (red) overplotted on the calibrated visibility data.
3C277.1 after self-calibration, contour levels in logarithmic multiples of 0.4 mJy/beam (3 sigma_rms).
If time, try to improve the image... read the help for clean and try multi-scale clean, experiment with a smaller gain factor or more careful boxing. You could also split out 3C286 and 1300+580 and image these in full polarization as above. 3C286 will have strong sidelobes due to the poor uv coverage but boxing the central peak and just cleaning for about 100 iterations should produce an image good enough to check the polarization angle, which should be 33 deg. Additional techniques in cleaning include: