<strong>Jason</strong> <strong>Manley</strong>, <strong>Aaron</strong> <strong>Parsons</strong>, <strong>Dan</strong> <strong>Werthimer</strong>, <strong>Don</strong> Backer, Henry Chen, Terry Filiba, David MacMahon, Peter McMahon, Arash Parsa, Andrew Siemion, Mel Wright

Outline
• About <strong>CASPER</strong> and meerKAT
• Scaling correlators to large-N:
  – The architecture
  – The hardware
  – The software
  – The cost
• Closing thoughts
• Questions and comments

• Originally 7 dishes (KAT-7)
• Upgrade to ~80 dishes of ~12 m each (meerKAT)
• 1 GHz instantaneous bandwidth
• 16k channels

In the beginning…
[Diagram: big dish, analogue processor, control & monitoring, computing backend]

…more recently…
EVLA baseline board (Carlson, 2006)

<strong>CASPER</strong>: Center for Astronomy Signal Processing and Electronics Research
• Open-source, collaborative environment
• Reusable, platform-independent gateware
• Modular, upgradeable hardware
• Low instrument NRE, shared by the community
• Enables rapid development (8 instruments in 2 years)
• Industry-standard communication protocols
• Low cost

<strong>CASPER</strong> DSP backend concept
[Diagram: ADCs with polyphase filter banks (PFB) on reconfigurable FPGA DSP modules feed a commercial off-the-shelf multicast 10 Gbps (10GE or InfiniBand) switch. The switch connects further FPGA DSP modules (correlator, beamformers/spectrometers, pulsar timer) and a compute cluster of general-purpose CPUs.]

iBOBs: Internet Breakout Board
Shown here with two iADC 2 GSa/s boards.
Originally designed to packetise data. Now used to build complete instruments!

BEE2s: Berkeley Emulation Engine
• 5x 2VP70
• 24GB memory
• 180 Gbps IO

ROACH: Reconfigurable Open Architecture Computing Hardware

ROACH: Reconfigurable Open Architecture Computing Hardware
• Next-generation hardware platform
• Consolidates BEE2 and iBOB abilities
• Backwards compatible with ADCs and DACs
• Xilinx Virtex 5 SX95T or LX110T (your choice)
• 4x 10GbE ports
• DDR2 667 memory
• Onboard PPC for housekeeping, management, system health monitoring etc.
• Uses standard 1U cases and power supplies
• Look for it mid-2008
• Collaboration between <strong>CASPER</strong>, NRAO and KAT

Extensive DSP Library
• DDC
• FFT
• PFB
• X engine
• Vector accumulators
• Data reorder, timers, counters, computer interface modules etc.

Library Example
Biplex FFT: 1/6 resource requirements
Highly configurable:
• Transform length
• Wide bandwidth (> FPGA clock rate)
• Complex or real
• Number of polarizations
• Input bit width and output bit width
• Twiddle coefficient bit width
• Run-time programmable downshifting
• Decimate option

Development Environment

<strong>CASPER</strong>, the friendly… Group Helping Open-source Signal-processing Technology
• Goal: to help develop signal processing instrumentation and portable libraries for the community.
• Open-source hardware, gateware and software.
• Provide training and tutorials.
• Not so much delivering turn-key instruments.

<strong>CASPER</strong> FX Architecture
[Diagram: F Engine 0 … F Engine N-1 and X Engine 0 … X Engine N-1, all connected through a 10GbE switch]
• Number of F engines scales linearly with the number of antennas
• Number of ports on the switch scales linearly with the number of antennas
• Compute requirements of the X engines scale with N-squared
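The scaling claims on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `fx_scaling` is hypothetical, and it assumes X-engine load is proportional to the number of baselines including autocorrelations, N(N+1)/2, which the slide summarizes as "N-squared".

```python
def fx_scaling(n_antennas: int) -> dict:
    """Rough resource counts for an N-antenna FX correlator (illustrative)."""
    n = n_antennas
    return {
        # One F engine per antenna: linear growth.
        "f_engines": n,
        # Antenna pairs including autocorrelations: grows as ~N^2 / 2.
        # Assumption: X-engine compute is proportional to this count.
        "baselines": n * (n + 1) // 2,
    }


for n in (8, 32, 128):
    counts = fx_scaling(n)
    print(n, counts["f_engines"], counts["baselines"])
```

Going from 8 to 128 antennas multiplies the F-engine count by 16 but the baseline count by roughly 230, which is why the X engines dominate at large N.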

Design Philosophy
• Standardized processing hardware
• Commercial interconnect
• Asynchronous compute engines
• Synchronization using a common 1PPS
• UDP output delivery over Ethernet network
• Correlator scales with your array

Architecture to hardware mapping
Example 8-antenna system
[Diagram: four iBOBs (eight F engines in total) and four BEE2 user FPGAs (eight X engines in total), all connected through a 10GbE switch]

Engine Operations
F engine: ADC → DDC → Channelize → Quantize → Reformat → (to X engine)
X engine: (from F engine) → 10GbE → Buffer → X Eng → Accum
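A toy software model may help make the F/X split concrete. This is a minimal sketch, not the CASPER gateware: the names `f_engine` and `x_engine` are hypothetical, a plain DFT stands in for the PFB/FFT channelizer, and the DDC, quantization, reformatting and 10GbE transport stages are left out.

```python
import cmath


def f_engine(samples):
    """Channelize one antenna's block of samples (plain DFT as a stand-in)."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def x_engine(spectra, acc=None):
    """Cross-multiply every antenna pair per channel and accumulate."""
    n_ant = len(spectra)
    n_chan = len(spectra[0])
    if acc is None:
        acc = {}
    for i in range(n_ant):
        for j in range(i, n_ant):  # includes autocorrelations (i == j)
            vis = acc.setdefault((i, j), [0j] * n_chan)
            for k in range(n_chan):
                vis[k] += spectra[i][k] * spectra[j][k].conjugate()
    return acc


# Two antennas seeing the same complex tone in channel 2 of 8.
tone = [cmath.exp(2j * cmath.pi * 2 * t / 8) for t in range(8)]
acc = x_engine([f_engine(tone), f_engine(tone)])
print(sorted(acc))          # baseline keys
print(abs(acc[(0, 1)][2]))  # power in the tone's channel
```

Calling `x_engine` repeatedly with the same `acc` models the accumulation step: each call adds one more spectrum's worth of products to the running integration.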

Backend Software
• UDP packets received
• Currently received, parsed and saved in MIRIAD file format by a single computer.
• Computing requirements depend on the experiment.
• Usually a single computer is OK: 128 antennas, 1 sec integrations, 2k chan = 512MB/s
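The 512MB/s figure can be reproduced with a short calculation. The slide does not state how it was derived, so the parameters below are assumptions chosen because they match the quoted number: roughly N²/2 baselines, 4 polarization products per baseline, and 8 bytes per complex visibility. The function name is hypothetical.

```python
def output_rate_bytes_per_s(n_ant, n_chan, t_int_s,
                            n_pol_products=4, bytes_per_vis=8):
    """Estimate correlator output rate in bytes per second.

    Assumptions (not stated in the source): ~N^2/2 baselines, 4 polarization
    products per baseline, 8 bytes per complex visibility.
    """
    baselines = n_ant * n_ant // 2
    return baselines * n_pol_products * n_chan * bytes_per_vis / t_int_s


rate = output_rate_bytes_per_s(n_ant=128, n_chan=2048, t_int_s=1.0)
print(rate / 2**20)  # rate in MiB/s
```

Under these assumptions the result is 512 MiB/s. Shortening the integration time raises the rate proportionally, so a 10 ms integration would need 100 times the throughput.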

Pending systems
• Bench sys: 8 ant, DP, 200 MHz, 2k ch
• PAPER: 128 ant, DP, 100 MHz, 2k ch
• KAT-7: 7 ant, DP, 256 MHz, 2k ch
• meerKAT: 80 ant, DP, 1 GHz, 16k ch
• Bologna: 32 ant, SP, 32 MHz, 2k ch
• GMRT: 30 ant, DP, 600 MHz, 2k ch

How does it scale?
[Plot: number of boards (log scale, 1 to 1,000,000) for F engines and X engines]

FPGA Roadmap
[Plot: Xilinx Virtex family, logic cells (thousands) by year, 2000 to 2006]
• Processing power doubling every two years
• V4 = ½ the power requirements of V2Pro*
* Manufacturer's claim (Xilinx Inc.)

Coming soon…
• 10 Gbps output optionally gives integrations of ~10 ms
• More efficient use of hardware DSP slices
• High-speed, scalable, distributed data capture software
• Walsh codes and phase switching
• Phase rotation
• 64-antenna design
• Upgrade to 4096 channels
• ROACH hardware: –

Questions and Comments
Visit <strong>CASPER</strong> on the web (http://casper.berkeley.edu) for tutorials, documentation, libraries etc.
Learn about meerKAT and South Africa’s SKA bid (http://www.ska.ac.za)
Email me: jason_manley@hotmail.com

PFB-FFT response

Current uses
Pocket Spectrometer
• Uses ATMEL ADCs at 2 Gsamples/sec
• Performs 4 real FFTs in 1 (complex) biplex pipelined FFT module
• 2048 channels
• Uses just 1 ADC, 1 iBOB, and your laptop
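Running real FFTs through a complex FFT core relies on a standard trick: pack two real streams into the real and imaginary parts of one complex input, transform once, then separate the two spectra using conjugate symmetry. A biplex core presumably handles two complex streams at once, which would give four real transforms per module. The sketch below shows only the packing step for one complex stream, with a plain DFT standing in for the FFT. The function names are hypothetical, and this is not the CASPER library's implementation.

```python
import cmath


def dft(x):
    """Plain O(N^2) DFT, used here as a stand-in for a hardware FFT."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def two_real_dfts(a, b):
    """Compute the DFTs of two real sequences with one complex DFT."""
    n = len(a)
    z = dft([complex(ar, br) for ar, br in zip(a, b)])
    spec_a, spec_b = [], []
    for k in range(n):
        zk = z[k]
        zc = z[(n - k) % n].conjugate()  # mirrored bin, conjugated
        spec_a.append((zk + zc) / 2)     # spectrum of the real-part stream
        spec_b.append((zk - zc) / 2j)    # spectrum of the imaginary-part stream
    return spec_a, spec_b


a = [1.0, 2.0, 0.5, -1.0, 0.0, 3.0, -2.0, 1.5]
b = [0.5, -0.5, 2.0, 1.0, -3.0, 0.0, 1.0, 4.0]
spec_a, spec_b = two_real_dfts(a, b)
```

The separation costs only a few additions per channel, so the second real transform comes almost for free compared with running two separate complex FFTs.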

ROACH block diagram

