20.01.2015 Views

Dan Werthimer - CASPER - University of California, Berkeley


Collaboration for Radio Astronomy Instrumentation

Dan Werthimer and CASPER Collaborators

http://casper.berkeley.edu


CASPER

Collaboration for Radio Astronomy Signal Processing and Electronics Research

Collaborators

Xilinx, Fujitsu, HP, Sun/Oracle, Nvidia, NSF, NASA, NRAO, NAIC, CFA (Harvard/Smithsonian), Haystack (MIT), Caltech, Cornell, CSIRO/ATNF, JPL/DSN, South Africa KAT, Manchester/Jodrell Bank, GMRT (India), Oxford, Bologna, Metsähovi Observatory/Helsinki University, University of California, Berkeley; Swinburne University (Australia), SETI Institute, University of California, Santa Barbara; University of California, Los Angeles; CNRS (France), University of Maryland, Nançay Observatory, University of Cape Town (South Africa), ASTRON (Netherlands), Academia Sinica (Taiwan), Cambridge, Brigham Young University, Rhodes University (South Africa)


The Problem with the Traditional Hardware Development Model

• Takes 5 to 10 years
• Cost dominated by NRE because of custom boards, backplanes, protocols
• Antiquated by the time it’s released
• How to buy the hardware at the last minute
• Each observatory designs from scratch


Solution:

• Modular General Purpose Hardware
– Low number of board designs
– Can be upgraded piecemeal or all together
– Reusable
– Standard signal processing model which is consistent between upgrades


CASPER Real-time Signal Processing Instrumentation

• Low NRE, shared by the community
• Rapid development
• Open-source, collaborative
• Reusable, platform-independent gateware
• Modular, upgradeable hardware
• Industry standard communication protocols
• Use switches to solve correlator interconnect
• Low cost


Collaboration (not turn-key instruments)

• Share Open Source Libraries
• Workshops (Tamara)
• Videos and Docs on Tool Flow, Libraries
• Wiki, Mailing List
• Open Source Boards (available from vendors)


Tutorials

• 1: Introduction to Simulink, ROACH and BORPH
• 2: 10GbE
• 3: Basic spectrometer (400 MHz, 2k channels)
• 4: 4-input pocket correlator (400 MHz, 1k channels)
• 5: ADC ROACH CPU/GPU
• 6: GPU tutorials (Richard Edgar, David Kirk)

Jason Manley, Terry Filiba, Mark Wagner, Wesley New, Andrew Marten, Danny Price, Jack Hickish, Griffin Foster


Roach Motel (Roach Nest) (KAT)


Current CASPER ADC Boards

ADC2x1000-8 (dual 1 Gsps, single 2 Gsps, 8 bit)
ADC1x3000-8 (3 Gsps, 8 bit; 6 Gsps interleaved)
64ADCx64-12 (64x 64 Msps, 12 bit)
ADC4x250-8 (quad 250 Msps, 8 bit)
katADC (dual 1.5 Gsps, 8 bit, with gain, atten, synth)
ADC2x550-12 (dual 550 Msps, 12 bit)
ADC2x400-14 (dual 400 Msps, 14 bit)
ADC1x5000-8 (1x 5 Gsps, 2x 2.5 Gsps, 4x 1.25 Gsps – Taiwan)
ADC1x1000-12 (optically isolated, 12 bit, 1 Gsps – JPL)


Upcoming Hardware

• ROACH II (Virtex 6 – South Africa team)
• Rhino (Spartan 6, ARM CPU, FMC connect)
• ROACH III (Virtex 7)
• 20 to 26 Gsps ADC board


Board Interconnect - Upgradable

• Problem: Backplanes are short lived (S100, Multibus, VME, ISA, EISA, PCI, PCIx, PCIe, compactPCI, compactPCIe, ATCA…)
• Solution: Use 10 Gbit Ethernet (10GbE, InfiniBand, Myrinet, XAUI, Aurora)
– Copper CX4 (40 meters max) or optical


Beowulf Cluster Like General Purpose Architecture

Dynamic allocation of resources, need not be FPGA based

[Figure: block diagram. Labels: polyphase filter banks (ADC + PFB front ends); FPGA DSP modules; commercial off-the-shelf multicast 10 Gbps (10GE or InfiniBand) switch; reconfigurable compute cluster (correlator, beamformers/spectrometers, pulsar timer); general-purpose CPUs.]


BORPH Operating System – Hayden So

• An extended version of the Linux operating system
– Treats FPGAs like CPUs
• FPGA applications execute as hardware processes
• HW/SW communication
– UNIX file I/O
• Benefits
– Easy to understand for novice/experienced users
– Remote control + monitor

[Figure: software/hardware stack. Labels: SW processes, User Library, BORPH Kernel, Device Driver, Hardware Platform (Network, UART, HD…), HW processes on FPGA, Hardware User Library; communication paths labelled file, pipe, socket, IPC, ioreg.]

Poster Session 3 P3_09 (11am): File System Access From Reconfigurable FPGA Hardware Processes in BORPH
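
The "UNIX file I/O" bullet above can be illustrated with a minimal sketch. It assumes that a register of a running hardware process appears as an ordinary file holding one 32-bit big-endian word; the `/proc/<pid>/hw/ioreg/<name>` layout mentioned in the comment, the register name `acc_len`, and the helper names `write_reg` and `read_reg` are illustrative assumptions, not details taken from the slides.

```python
import struct
from pathlib import Path


def write_reg(path, value):
    """Write a 32-bit unsigned value to a register exposed as a file."""
    # Assumption: the register file holds one big-endian 32-bit word.
    Path(path).write_bytes(struct.pack(">I", value))


def read_reg(path):
    """Read a 32-bit unsigned value back from a register file."""
    data = Path(path).read_bytes()[:4]
    return struct.unpack(">I", data)[0]


if __name__ == "__main__":
    import os
    import tempfile

    # A temporary file stands in for a path such as
    # /proc/<pid>/hw/ioreg/<name> (hypothetical layout).
    with tempfile.TemporaryDirectory() as tmp:
        reg = os.path.join(tmp, "acc_len")
        write_reg(reg, 1024)
        print(read_reg(reg))
```

Because the interface is plain file I/O, the same access works from a shell, a script, or a remote login, which is the "remote control + monitor" benefit listed above.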


Simulink-based Design Tool Flow

• Simulink Xilinx System Generator Library
• Custom BEE2 Library Blocksets
• Software programmable registers
• BEE Platform Studio


FFT controls

Simulink Library – Aaron Parsons, David MacMahon
Verilog Library – Jeff Mock

• Transform length
• Bandwidth
• Complex or Real
• Number of Polarizations
• Input bit width and output bit width
• Twiddle coefficient bit width
• Run-time programmable down-shifting
• Decimate option
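
As an illustration of the "run-time programmable down-shifting" control, here is a minimal sketch of a radix-2 FFT in which each butterfly stage can optionally divide its output by two. The function name `fft_with_shifts` and the convention that bit `s` of `shift_mask` controls stage `s` are assumptions made for this example; the CASPER library's actual interface is not described on the slide. Floating-point numbers are used for clarity, whereas the hardware works in fixed point, where the shift is what prevents overflow.

```python
import cmath


def fft_with_shifts(x, shift_mask):
    """Radix-2 decimation-in-time FFT with an optional divide-by-2 per stage.

    Bit s of shift_mask enables the down-shift after stage s (stage 0 first).
    """
    n = len(x)
    stages = n.bit_length() - 1
    if n < 2 or 1 << stages != n:
        raise ValueError("length must be a power of two, at least 2")
    # Bit-reversal permutation of the input.
    a = [0j] * n
    for i in range(n):
        rev = int(format(i, "0{}b".format(stages))[::-1], 2)
        a[rev] = complex(x[i])
    for s in range(stages):
        half = 1 << s
        # A set bit halves every output of this stage.
        scale = 0.5 if (shift_mask >> s) & 1 else 1.0
        for start in range(0, n, 2 * half):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / (2 * half))
                top = a[start + k]
                bot = a[start + k + half] * w
                a[start + k] = (top + bot) * scale
                a[start + k + half] = (top - bot) * scale
    return a
```

With `shift_mask = 0` the result is the unscaled DFT; with every bit set, the result is the DFT divided by the transform length. Intermediate masks trade headroom against precision.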


PFB vs. FFT


Digital Down-Converter

• Selectable number of FIR taps
• On-the-fly programmable mix frequency
• Selectable FIR coefficients
• Agile sub-band selection
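
The controls listed above map onto the three steps of a digital down-converter: mix, filter, decimate. The following is a minimal sketch under stated assumptions: the function name `ddc`, the choice to express `mix_freq` in cycles per sample, and the direct-form FIR are all illustrative and not taken from the CASPER library.

```python
import cmath
import math


def ddc(samples, mix_freq, fir_coeffs, decimation):
    """Mix real samples down by mix_freq (cycles/sample), FIR filter, decimate."""
    # Mixer: multiply by a complex exponential at the negative mix frequency.
    mixed = [s * cmath.exp(-2j * math.pi * mix_freq * n)
             for n, s in enumerate(samples)]
    out = []
    taps = len(fir_coeffs)
    # Compute the FIR output only at the samples that survive decimation.
    for n in range(taps - 1, len(mixed), decimation):
        acc = 0j
        for k, c in enumerate(fir_coeffs):
            acc += c * mixed[n - k]
        out.append(acc)
    return out
```

For example, a cosine at 0.25 cycles/sample mixed by 0.25 lands at DC with amplitude 0.5, and a two-tap averaging filter `[0.5, 0.5]` removes the image at half the sample rate. Changing `mix_freq` between calls corresponds to the on-the-fly sub-band selection on the slide.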


X-Engine Correlation Architecture
(Lynn Urry, Aaron Parsons)


Hardware and Software Libraries


Applications


• VLBI Mark 5B data recorder – Haystack, NRAO – 512 MHz
• Beamforming – ATA, SMA –
• SETI – Arecibo (UCB)
– JPL/UCB DSN (Preston, Gulkis, Levin, Jones)
• Correlators and Imagers:
– ATA (Aaron Parsons, Mel Wright)
– PAPER (Reionization Experiment)
– CARMA Next Gen
– MeerKAT/SKA South Africa
– GMRT next gen correlator
– Bologna (SKA), FASR
• Pulsar Timing and Searching, Transient
– Green Bank, Allen Telescope Array, VLA, Swinburne (Parkes), MeerKAT, Nançay


SETI Spectrometers

• Parkes Southern SERENDIP
• ALFA SETI Sky Survey (300 MHz x 7 beams)
• JPL DSN Sky Survey (eventually 20 GHz bandwidth)

Radio Astronomy Spectrometers

• GALFA Spectrometer – Arecibo Multibeam Hydrogen Survey
• Astronomy Signal Processor – ASP – Don Backer, Ingrid Stairs, et al. (pulsars)
• Antenna Holography, ATNF, China
• GAVRT (DSN education, outreach) – 8 GHz BW – G. Jones
• CMB Bolometer Readout – Caltech, UCB
• Fast Readout Spectrometers (Parkes, NRAO, ATA...)


ATA Fly’s Eye Transient Instrument

44 fast readout spectrometers
3 weeks to build

Geoff Bower, Jim Cordes, Griffin Foster, Joeri van Leeuwen, Peter McMahon, Andrew Siemion, Mark Wagner, Dan Werthimer


Undergraduate Radio Astronomy Course


4096 channel Mars spectrometer
“Chip in a day” FPGA to ASIC


CASPER Correlator Collaboration

Allen Telescope Array (90 uS imaging – G. Jones)
PAPER (Epoch of Reionization)
CARMA Next Generation
MeerKAT/SKA South Africa
GMRT next gen
Bologna
ISI (Infrared) – 6 Gsps
SKADS (Oxford)
SMA next gen (CFA, ASIAA)
FASR, Baryon Acoustic Oscillation


CASPER FX Architecture

[Diagram: F Engine 0 … F Engine N-1, a 10GbE switch, and X Engine 0 … X Engine N-1.]
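
The division of labour in an FX correlator can be sketched as follows: each F engine turns one antenna's samples into frequency channels, the switch delivers each channel from every antenna to one X engine, and each X engine cross-multiplies all antenna pairs for the channels it owns. This is a toy illustration, not the CASPER implementation: a plain DFT stands in for the PFB/FFT, the modulo channel-to-engine assignment in `route` is an assumption, and the time accumulation that a real correlator performs is omitted.

```python
import cmath


def f_engine(samples):
    """Channelize one antenna's samples with a plain DFT (stand-in for a PFB/FFT)."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def route(spectra, n_xeng):
    """Play the role of the switch: X engine x gets channels k with k % n_xeng == x."""
    n_chan = len(spectra[0])
    return [{k: [spec[k] for spec in spectra]
             for k in range(x, n_chan, n_xeng)}
            for x in range(n_xeng)]


def x_engine(channels):
    """Cross-multiply every antenna pair (i <= j) for each channel this engine owns."""
    out = {}
    for k, volts in channels.items():
        for i in range(len(volts)):
            for j in range(i, len(volts)):
                out[(k, i, j)] = volts[i] * volts[j].conjugate()
    return out
```

The number of products per channel grows as the square of the number of antennas, while the F-engine work grows linearly, which is why the interconnect between the two stages is the hard part and why an off-the-shelf switch is attractive.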


CASPER FXB Correlator/Beamformer

(correlator needed to calibrate beamformer)

[Diagram: F Engine 0 … F Engine N-1, a 10GbE switch, and X Engine 0 … X Engine N-1.]


Correlators and Beamformers

• Globally Asynchronous (like a computer cluster)
• Data is time stamped with 1 PPS at ADC
• Locally Synchronous, Globally Asynchronous
• Solve the correlator/beamformer interconnect problem by using 10 GbE switches (for both interconnect and fast readout)
• No need for high density complex boards
• Use FIFOs to align data before correlation or beamforming…
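
The timestamp-and-FIFO idea above can be sketched in a few lines. The example assumes each input stream delivers `(timestamp, payload)` tuples in increasing timestamp order, where the timestamp might be a sample count since a 1 PPS edge; the function name `align` and the policy of dropping unmatched data are illustrative assumptions.

```python
from collections import deque


def align(fifos):
    """Return (timestamp, payloads) for every timestamp present on all inputs.

    Each FIFO is a deque of (timestamp, payload) tuples in increasing
    timestamp order; items older than the newest head are dropped.
    """
    aligned = []
    while all(fifos):
        newest = max(f[0][0] for f in fifos)
        # Discard data that can no longer be matched on every input.
        for f in fifos:
            while f and f[0][0] < newest:
                f.popleft()
        if all(fifos) and all(f[0][0] == newest for f in fifos):
            aligned.append((newest, [f.popleft()[1] for f in fifos]))
    return aligned
```

Because alignment relies only on the timestamps carried with the data, the boards do not need to share a clock across the network; each is synchronous locally and the FIFOs absorb the differences in arrival time.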


Correlator Comparison - 2009 benchmarks

GPU: 1.3 MHz per GPU, Greenhill et al, http://www.scigpu.org
CPU: 2.0 MHz per 8 core computer, Roy, Gupta, et al, optimized code
FPGA: 10.4 MHz per FPGA (CASPER: Xilinx XC5VSX95T ROACH boards)

Power (critical for SKA)

GPU: 150 watts/MHz (including GPU, CPU/motherboard, P.S.)
CPU: 150 watts/MHz
FPGA: 6 watts per MHz (including digitizers, P.S., CPU, motherboard)
ASIC: 2 watts per MHz (estimate)

(For 32 antenna, dual polarization, 500 MHz correlator)


Packetized FPGA FX Correlator Cost

(assuming non hierarchical)

Cost (2010) = Bandwidth/GHz ( $1200 N + $5 N^2 )

2010 $128M for 4,000 antenna, dual pol, 1 GHz bandwidth
2012 $64M
2014 $32M
2016 $16M
2018 $8M
2020 $4M
2022 $2M

• N = number of dual polarization receivers (full Stokes)
• Moore’s Law Foundry Prediction through 22nm
• ITRS roadmap predicts slow down (but they always do)
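
The 2010 cost formula above is easy to evaluate directly. The sketch below is a literal transcription of it; the function names are hypothetical, and reading the linear term as per-receiver cost and the quadratic term as cross-multiplication cost is an interpretation, not something the slide states.

```python
def correlator_cost_2010(n_receivers, bandwidth_ghz):
    """Evaluate the 2010 cost model: Bandwidth/GHz * ($1200 N + $5 N^2)."""
    return bandwidth_ghz * (1200 * n_receivers + 5 * n_receivers ** 2)


def crossover_receivers():
    """N at which the quadratic term equals the linear term: 1200 N = 5 N^2."""
    return 1200 // 5


if __name__ == "__main__":
    print(correlator_cost_2010(32, 0.5))
    print(crossover_receivers())
```

Below about 240 receivers the linear term dominates; above it the N^2 term does. At N = 4000 and 1 GHz the formula as written evaluates to $84.8M, which differs from the $128M quoted in the table; the sketch makes no attempt to reconcile the two.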


Astronomy Signal Processor
Terry Filiba, Peter McMahon


Parkes Pulsar Discoveries

Bailes, Filiba, McMahon et al

• BPSR is 13 beams, 1024 channels, 64 µs
• 50 pulsars
• 11 millisecond pulsars
• a magnetar (Levin et al.)
• a pulsar with a planetary-mass companion
• large number of RRATs due to increased dynamic range over previous generation filterbanks


1960 – First Radio Astronomy Digital Correlator

21 lags
300 kHz clock
discrete transistors
$19,000
Sandy Weinreb


Correlator processing power

[Figure: correlator processing power in GFlops (logarithmic axis) versus year, 1970 to 2015. Labelled points: DCB, DLB, DAS, DXB, VLA, EVN/WSRT, EVLA, SMA, LOFAR, ALMA, SKA.]

source: Arnold van Ardenne


Moore’s Law – Instruments using FPGAs: 2X per year
(1,000,000 over 20 years)


Future Spectrometers

2015 4 THz 400 beams, 10 GHz each
2020 128 THz 12,800 beams
2025 4000 THz 40,000 beams
2030 128,000 THz 1M beams
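
The bandwidth column follows from the "2X per year" growth rate quoted on the previous slide. A minimal sketch, assuming exact annual doubling from the 2015 figure (the function name and default arguments are illustrative):

```python
def projected_bandwidth_thz(year, base_year=2015, base_thz=4.0):
    """Total bandwidth assuming a doubling every year from the base point."""
    return base_thz * 2 ** (year - base_year)


if __name__ == "__main__":
    for year in (2015, 2020, 2025, 2030):
        print(year, projected_bandwidth_thz(year))
    # Twenty doublings give the "1,000,000 over 20 years" factor.
    print(2 ** 20)
```

Exact doubling gives 128 THz in 2020, 4096 THz in 2025 and 131,072 THz in 2030, which the slide rounds to 4000 and 128,000; twenty doublings give 1,048,576, the "1,000,000 over 20 years" figure.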


Cost of FPGA integer computing

2010 $200 per 1E11 MAC/sec
2012 $100
2014 $50
2016 $25
2018 $13
2020 $6
2022 $3

• XC6SLX150T FPGA with PCB, DRAM, P.S., Cooling…
• Moore’s Law Foundry Prediction through 22nm
• ITRS roadmap predicts slow down (but they always do)
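
The table above is a halving every two years from the 2010 figure, with the later entries rounded to whole dollars. A minimal sketch of that assumption (the function name and defaults are illustrative):

```python
def cost_per_1e11_mac(year, base_year=2010, base_cost=200.0):
    """Dollars per 1E11 MAC/sec, halving every two years from the 2010 figure."""
    return base_cost / 2 ** ((year - base_year) / 2)


if __name__ == "__main__":
    for year in range(2010, 2024, 2):
        print(year, cost_per_1e11_mac(year))
```

The unrounded values for 2018, 2020 and 2022 are $12.50, $6.25 and $3.125, matching the slide's $13, $6 and $3.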


CASPER the Friendly...

• Group Helping Open-source Signal-processing Technology (GHOST)
– Goal to help develop signal processing instrumentation and libraries for the community
– Open source hardware, gateware, and software
– Provide training and tutorials
– Not so much delivering turn-key instruments
– Promote Collaboration
