Data Editing<br />
Flagging<br />
Compression<br />
The first challenges:<br />
* Data are BIG<br />
* Data have significant RFI<br />
We need to FLAG & COMPRESS.<br />
Louise Ker, LOFAR-UK Data School, 30/08/11
The LOFAR Flagging Tools<br />
-Our subband is 12 GB in size, with 1 s integrations and 64 channels, so we need to compress in time<br />
and frequency. (Use msoverview in the tutorial to check this for yourself.)<br />
-It's also noisy, and we need an automated way to remove RFI (no AIPS!)<br />
Images from Offringa et al. (2010)<br />
Automated flagging and compression are done in NDPPP (New Default Pre-Processing Pipeline).<br />
Example Usage<br />
NDPPP is usually run by the Radio Observatory automatically.<br />
One line:<br />
ker@lce072> NDPPP NDPPP.parset<br />
[Diagram] NDPPP: Flagging (AOFlagger, MADFlagger); Compress (Time, Frequency)<br />
-Use the parset file to specify what you want to do.<br />
Result: a compressed & flagged file of a much more manageable size:<br />
12 GB ------> 113 MB<br />
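To make the compression step concrete, here is a minimal pure-Python sketch of averaging a series of samples in blocks while skipping flagged ones. It only illustrates the idea and is not NDPPP's implementation; the function name `average_blocks` and the toy amplitudes are hypothetical.

```python
def average_blocks(samples, flags, step):
    """Average consecutive blocks of `step` samples, ignoring flagged ones.

    Returns (averaged_samples, averaged_flags). An output sample is flagged
    only when every input sample in its block was flagged.
    """
    out_samples, out_flags = [], []
    for start in range(0, len(samples), step):
        block = [
            s
            for s, f in zip(samples[start:start + step], flags[start:start + step])
            if not f
        ]
        if block:
            out_samples.append(sum(block) / len(block))
            out_flags.append(False)
        else:
            # Nothing usable in this block: keep a placeholder and flag it.
            out_samples.append(0.0)
            out_flags.append(True)
    return out_samples, out_flags


# Toy example: 10 time slots, one flagged as RFI, averaged 5 at a time.
amplitudes = [1.0, 1.0, 50.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0]
flagged = [False, False, True, False, False] + [False] * 5
avg, avg_flags = average_blocks(amplitudes, flagged, step=5)
print(avg, avg_flags)
```

The same function can be applied along the frequency axis and then along the time axis, which is why flagging before averaging matters: an unflagged RFI spike would otherwise be smeared into the averaged sample.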
Example Usage<br />
msin = /data/scratch/tutorials/cyga/L24921_SB005_uv.MS<br />
msin.startchan = 1<br />
msin.nchan = 62<br />
msin.datacolumn = DATA<br />
msout = "L24921_SB005_uv.MS"<br />
msout.datacolumn = DATA<br />
steps = [preflag,flag1,count,avg1,flag2,avg2,count]<br />
preflag.type=preflagger<br />
preflag.corrtype=auto<br />
flag1.type=madflagger<br />
flag1.threshold=4<br />
flag1.freqwindow=31<br />
flag1.timewindow=5<br />
flag1.correlations=[0,3]<br />
avg1.type = squash<br />
avg1.freqstep = 64<br />
avg1.timestep = 1<br />
flag2.type=madflagger<br />
flag2.threshold=3<br />
flag2.timewindow=51<br />
avg2.type = squash<br />
avg2.timestep = 5<br />
Annotations on the parset:<br />
-msin.startchan, msin.nchan: flag the first & last channels<br />
-steps: the steps required<br />
-preflag: flags the autocorrelations<br />
-flag1: flag with MADFlagger; correlations=[0,3] flags the XX and YY polarizations<br />
-avg1: compress the data (compresses to one channel)<br />
-flag2: 2nd round of flagging<br />
-avg2: compresses 5 time-slots i.e. 15 s<br />
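As a rough illustration of what a median-based flagger such as the flag1 and flag2 steps does, the sketch below flags samples that deviate from the median by more than a threshold times a robust noise estimate derived from the median absolute deviation (MAD). This is a simplified sketch, not NDPPP's MADFlagger: it uses one global median over a 1-D series instead of sliding time and frequency windows, and the 1.4826 scaling of the MAD is the standard Gaussian factor, assumed here and not taken from the slides. The name `mad_flags` is hypothetical.

```python
from statistics import median


def mad_flags(values, threshold):
    """Flag samples further than threshold * sigma from the median,
    with sigma estimated as 1.4826 * MAD (median absolute deviation)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    sigma = 1.4826 * mad  # assumed MAD-to-sigma factor for Gaussian noise
    if sigma == 0:
        # Degenerate case: more than half the samples are identical.
        return [v != med for v in values]
    return [abs(v - med) > threshold * sigma for v in values]


# Toy amplitudes with one strong RFI spike.
amplitudes = [1.0, 1.1, 0.9, 1.0, 25.0, 1.05, 0.95, 1.0]
flags = mad_flags(amplitudes, threshold=4)
print(flags)
```

Because the median and MAD are barely affected by a few strong outliers, the spike does not inflate the noise estimate used to detect it, which is the reason for preferring them over the mean and standard deviation here.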
Data Inspection<br />
CASABROWSER<br />
CASAPLOTMS<br />
Variety of DIY Python scripts too for data inspection (see Neil's tutorial later this week)<br />
-Casa should only be run on compressed datasets.<br />
-Can manually flag any missed RFI (in practice this is rarely necessary)<br />
-Most importantly, ALWAYS check the data at this point.<br />
Try it Yourself<br />
Summary:<br />
-need to COMPRESS to have manageable dataset<br />
-need to FLAG to remove RFI in an AUTOMATED way.<br />
-need to INSPECT – LOFAR is still in commissioning!<br />
First Interactive Session:<br />
-Logging into the cluster<br />
-Running msoverview to get details of the Measurement Set.<br />
-Running NDPPP to compress & flag our Cygnus A subband.<br />
-Inspecting the data<br />
(P99-104 in the LOFAR Imaging Cookbook)<br />
Please ask questions!<br />
Advanced Things to Try<br />
Some Notes<br />
[Diagram] Workflow steps and tools: Flag (NDPPP), Calibrate (BBS), Image (Casapy), Model, Self-calibrate.<br />
[Diagram] Advanced steps: Demix, Adv Models, Global Calibration, Beam Correction, New Imager (to test soon?).<br />
Steps in red will get you started...<br />
Steps in green will take us up to more or less the current commissioning stage.<br />
All well documented in Cookbook and can be easily worked through<br />
Demixing<br />
Required for essentially all LBA, & some HBA datasets.<br />
-The 'A-Team' (CasA, CygA etc.) dominate at low frequencies & need to be removed.<br />
-Demixing allows subtraction of these sources prior to calibration.<br />
[Figures courtesy George Heald: amplitude and phase vs. time. Raw data, one baseline, 3C196, showing ripples from CasA etc.; same baseline, demixed, calibrated, 3C196 subtracted.]<br />
Demixing<br />
-How to?<br />
Follow Chapter 7 in the Cookbook on a raw subband.<br />
> /home/diepen/scripts/do_demixing.py<br />
BUT...<br />
-Does not work when an A-Team source is within 30 degrees of the target.<br />
-VERY compute intensive: only one demixing run should ever be active per node at a time.<br />
-Takes a long time …<br />
-Make sure you have enough room (demixing requires ~100 GB of disk space), and<br />
remove all the intermediate products afterwards.<br />
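A quick way to act on the disk-space warning is to check the free space before starting. The sketch below uses Python's standard `shutil.disk_usage`; the helper name `has_room` and the choice of the current directory as the scratch location are assumptions for illustration, and the 100 GB default simply mirrors the rough figure quoted above.

```python
import shutil


def has_room(path, required_gb=100):
    """Return True if the filesystem holding `path` has at least
    `required_gb` gigabytes (10**9 bytes each) free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 10**9


scratch = "."  # hypothetical: replace with your own scratch directory
if has_room(scratch):
    print("Enough free space to start demixing.")
else:
    print("Not enough free space: clean up intermediate products first.")
```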
Beam Correction<br />
-BBS has an option to calibrate taking into account the LOFAR beam shape.<br />
-Just need to add a couple of lines to parset file:<br />
Step.solve.Model.Beam.Enable = T<br />
Step.correct.Model.Sources = [CygA]<br />
Step.correct.Model.Beam.Enable = T<br />
(Step.correct.Model.Sources: need to specify direction)<br />
-*Big* caveat: self-calibration cannot be done for wide-field imaging, as the<br />
Casa Imager cannot apply the LOFAR beam correction.<br />
Work on an imager that can apply it is in progress, and it should be ready for testing<br />
soon.<br />
Making Models<br />
-Some models in wide use such as the A-Team, & some 3C sources etc are available in<br />
/globaldata/COOKBOOK/Models.<br />
-casapy2bbs.py converts a casa .model to a bbs source catalogue (as you did in the tutorial to self-calibrate).<br />
-Can also use pyBDSM, which takes a fits image & produces a bbs source file from detected sources<br />
(Chapter 9 in Cookbook).<br />
-Always worth asking other commissioners for existing models.<br />
>use LofIm<br />
>use LUS<br />
>pybdsm<br />
BDSM [1]: inp process_image<br />
BDSM [2]: filename = 'CygA.fits'<br />
BDSM [3]: go<br />
BDSM [4]: write_gaul(bbs_patches='source')<br />
--> Wrote BBS sky model 'CygA.pybdsm.sky_in'<br />
[Figure panels: VLSS Image, PyBDSM]<br />
Global Calibration<br />
We are also able to calibrate multiple subbands simultaneously.<br />
Again, just a few lines are needed in the parset file.<br />
For example, 10 subbands of our Cygnus A observation:<br />
>Strategy.UseSolver = T<br />
>Step..Solve.CalibrationGroups = [10]<br />
No more than 5-10 subbands (1-2MHz bandwidth) should be used due to<br />
increasing problems with ionosphere & station clock drift.<br />
Global calibration is particularly helpful for fainter fields, where S/N from just one<br />
subband is low.<br />
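The numbers above can be checked with a little arithmetic. The sketch below assumes a subband width of 195.3125 kHz (the value for the 200 MHz station clock, which is not stated in the slides) and the ideal thermal-noise case in which S/N grows as the square root of the bandwidth; real gains will be lower once ionosphere and clock effects matter. The function names are hypothetical.

```python
import math

# Assumption: 200 MHz clock mode, i.e. 195.3125 kHz per subband.
SUBBAND_WIDTH_MHZ = 0.1953125


def total_bandwidth_mhz(n_subbands):
    """Total bandwidth covered by n contiguous subbands, in MHz."""
    return n_subbands * SUBBAND_WIDTH_MHZ


def ideal_snr_gain(n_subbands):
    """Ideal thermal-noise S/N gain relative to a single subband: sqrt(N)."""
    return math.sqrt(n_subbands)


for n in (1, 5, 10):
    print(n, round(total_bandwidth_mhz(n), 2), round(ideal_snr_gain(n), 2))
```

Under these assumptions, 5 to 10 subbands span roughly 1 to 2 MHz, matching the guideline above, and give at best a factor of about 2 to 3 in S/N over a single subband.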
Troubleshooting<br />
The software sometimes fails ….<br />
[FAIL] error: setupsourcedb or remote setupsourcedb-part process(es) failed<br />
Things to check....<br />
-Are your parset files in the correct format?<br />
-INSPECT your data.<br />
-Look at the end of the .log files produced by the software: more explicit error<br />
messages can usually be found there. Still not sure? Post the log output on the LOFAR Users<br />
Forum.<br />
-Note that the software sometimes fails due to changes on the cluster, new bugs in BBS etc.;<br />
again, the Forum is a good place to look for help.<br />
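For the first check in the list, a small script can catch the most common parset slips before the software does. The sketch below is a deliberately simple sanity check, not the real parset grammar: it assumes one `key = value` pair per line and `#` comments, and the name `check_parset` is hypothetical.

```python
def check_parset(text):
    """Return a list of (line_number, message) problems found in parset text."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            problems.append((number, "missing '='"))
        elif not key.strip():
            problems.append((number, "missing key"))
        elif not value.strip():
            problems.append((number, "missing value"))
        elif value.count("[") != value.count("]"):
            problems.append((number, "unbalanced brackets"))
    return problems


# Toy parset with two deliberate mistakes (lines 2 and 3).
example = """\
msin.startchan = 1
steps = [preflag,flag1
flag1.type madflagger
"""
print(check_parset(example))
```

A clean result from a check like this does not guarantee that the parset is valid; it only rules out a few typing errors, so the .log files remain the place to look when a run fails.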
Typical Reduction<br />
[Flowchart]<br />
-LBA or HBA? LBA: Demix, then NDPPP. HBA: NDPPP (typically done by RO).<br />
-Bright, central source? Yes: high res clean component model, then BBS. No: model from pyBDSM+VLSS, then BBS.<br />
-Long baselines may need to be flagged with coarser VLSS models.<br />
-Global calibration useful for all, but essential for faint fields.<br />
-Check & flag solutions, corrected data (both branches).<br />
-Self-Cal: casapy image, bbs, casapy2bbs.<br />
-Image in Casapy.<br />
Commissioning<br />
Sign up to sources of help:<br />
-LOFAR Wiki: there's a record of commissioners' work under 'Commissioning/Busy Wednesdays'. Many entries<br />
include the bbs parset & casapy parameters used, a good reference!<br />
-LOFAR Users Forum – up to date record of any system downtime, bugs etc.<br />
-Join a Busyweek/Busy Wednesday (can do this remotely).<br />
-Getting started:<br />
-Check wiki for up to date list of data available<br />
-HBA bright central 3C sources (there are lots of these!) are good 'starters'.<br />
A 3C HBA Source:<br />
-Same basic procedure as Cygnus A.<br />
-Usually the radio observatory flags and compresses HBA datasets, and transfers 4 subbands per<br />
observation to the compute nodes for commissioners to get started (so no initial NDPPP).<br />
-You then need to: 1) Inspect the data and do any additional flagging necessary, 2) Make a good sky model (e.g.<br />
pyBDSM), 3) Calibrate in BBS, 4) Check & flag solutions & corrected data, 5) Image in Casapy, 6) Self-calibrate<br />
as required.<br />
-A good point to start is the uv-plane-cal.parset given in the cookbook for BBS. Have a browse through the<br />
wiki for other parsets.<br />
That's All!<br />
Find out more...<br />
LOFAR User's Forum<br />
LOFAR Busy Days & Busyweeks (can join remotely)<br />
Plenty of Data & exciting science on the way!<br />
Remember things change quickly: hopefully this talk will be out of date soon!<br />
[Images: courtesy van Weeren; Bootes Field at 50 MHz, Ker.]<br />