20.01.2015 Views

Dan Werthimer - CASPER - University of California, Berkeley


Collaboration for Radio Astronomy Instrumentation

Dan Werthimer and CASPER Collaborators

http://casper.berkeley.edu


CASPER

Collaboration for Radio Astronomy Signal Processing and Electronics Research

Collaborators

Xilinx, Fujitsu, HP, Sun/Oracle, Nvidia, NSF, NASA, NRAO, NAIC,

CFA (Harvard/Smithsonian), Haystack (MIT), Caltech, Cornell, CSIRO/ATNF,

JPL/DSN, South Africa KAT, Manchester/Jodrell Bank, GMRT (India),

Oxford, Bologna, Metsahovi Observatory/Helsinki University,

University of California, Berkeley; Swinburne University (Australia),

SETI Institute, University of California, Santa Barbara;

University of California, Los Angeles; CNRS (France), University of Maryland,

Nancay Observatory, University of Cape Town (South Africa),

ASTRON (Netherlands), Academia Sinica (Taiwan), Cambridge,

Brigham Young University, Rhodes University (South Africa)


The Problem with the Traditional Hardware Development Model

• Takes 5 to 10 years

• Cost Dominated by NRE because of custom Boards, Backplanes, Protocols

• Antiquated by the time it’s released

• How to buy the hardware at the last minute?

• Each observatory designs from scratch


Solution:

• Modular General Purpose Hardware

– Low number of board designs

– Can be upgraded piecemeal or all together

– Reusable

– Standard signal processing model which is consistent between upgrades.


CASPER Real-time Signal Processing Instrumentation

• Low NRE, shared by the community

• Rapid development

• Open-source, collaborative

• Reusable, platform-independent gateware

• Modular, upgradeable hardware

• Industry standard communication protocols

• Use switches to solve correlator interconnect

• Low Cost


Collaboration (not turn-key instruments)

• Share Open Source Libraries

• Workshops (Tamara)

• Videos and Docs on Tool Flow, Libraries

• Wiki, Mailing List

• Open Source Boards (available from vendors)


Tutorials

• 1: Introduction to Simulink, ROACH and BORPH

• 2: 10GbE

• 3: Basic spectrometer (400 MHz, 2k channels)

• 4: 4-input pocket correlator (400 MHz, 1k channels)

• 5: ADC ROACH CPU/GPU

• 6: GPU tutorials (Richard Edgar, David Kirk)

Jason Manley, Terry Filiba, Mark Wagner, Wesley New, Andrew Marten, Danny Price, Jack Hickish, Griffin Foster


Roach Motel (Roach Nest) (KAT)


Current CASPER ADC Boards

ADC2x1000-8 (dual 1 GSa/sec, single 2 Gsps, 8 bit)

ADC1x3000-8 (3 GSa/sec, 8 bit; 6 Gsps interleaved)

64ADCx64-12 (64x 64 MSa/sec, 12 bit)

ADC4x250-8 (quad 250 MSa/sec, 8 bit)

katADC (dual 1.5 GSa/sec, 8 bit, with gain, atten, synth)

ADC2x550-12 (dual 550 Msps, 12 bit)

ADC2x400-14 (dual 400 Msps, 14 bit)

ADC1x5000-8 (1x 5 Gsps, 2x 2.5 Gsps, 4x 1.25 Gsps – Taiwan)

ADC1x1000-12 (optically isolated 12 bit 1 Gsps – JPL)
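Most of the board names above appear to encode channel count, per-channel sample rate in MSa/sec, and bit depth. The Python sketch below decodes names under that assumption. The function name `parse_adc_name` and the returned field names are hypothetical, and boards that do not follow the pattern (such as katADC) return `None`.

```python
import re

# Assumed naming pattern: [count]ADC[count]x<rate in MSa/sec>-<bits>
_ADC_NAME = re.compile(r"^(\d*)ADC(\d*)x(\d+)-(\d+)$")


def parse_adc_name(name):
    """Decode a board name such as 'ADC2x1000-8' into its apparent parameters."""
    match = _ADC_NAME.match(name)
    if match is None:
        return None  # name does not follow the assumed pattern
    prefix, count, rate, bits = match.groups()
    # The channel count sits either before 'ADC' (64ADCx64-12) or after it (ADC2x1000-8).
    channels = int(count or prefix or 1)
    return {
        "channels": channels,
        "msps_per_channel": int(rate),
        "bits": int(bits),
    }


if __name__ == "__main__":
    for board in ["ADC2x1000-8", "64ADCx64-12", "katADC"]:
        print(board, parse_adc_name(board))
```

Interleaved modes (for example the single 2 Gsps mode of ADC2x1000-8) are not captured by the name, so the decoded rate is only the base per-channel figure.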


Upcoming Hardware

• Roach II (Virtex 6 – South Africa team)

• Rhino (Spartan 6, ARM CPU, FMC connect)

• Roach III (Virtex 7)

• 20 to 26 Gsps ADC board


Board Interconnect - Upgradable

• Problem: Backplanes are short lived (S100, Multibus, VME, ISA, EISA, PCI, PCIx, PCIe, compactPCI, compactPCIe, ATCA…)

• Solution: Use 10Gbit Ethernet (10GbE, Infiniband, Myrinet, XAUI, Aurora)

Copper CX4 (40 meters max) or Optical


Beowulf Cluster Like General Purpose Architecture

Dynamic Allocation of Resources, need not be FPGA based

[Diagram labels: ADC and PFB (polyphase filter bank) inputs; FPGA DSP modules; commercial off-the-shelf multicast 10 Gbps (10GE or InfiniBand) switch; reconfigurable compute cluster of FPGA DSP modules (correlator, beamformers/spectrometers, pulsar timer); general-purpose CPUs]


BORPH Operating System – Hayden So

• An extended version of the Linux operating system

– Treats FPGAs like CPUs

• FPGA applications execute as hardware processes

• HW/SW communication

– UNIX file I/O

• Benefits

– Easy to understand for novice and experienced users

– Remote control and monitoring

[Diagram labels: Software (SW processes, User Library); BORPH Kernel (Device Driver; IPC via file, pipe, socket); Hardware (Hardware Platform: Network, UART, HD…; FPGA HW processes, Hardware User Library, ioreg)]

Poster Session 3, P3_09 (11am): File System Access From Reconfigurable FPGA Hardware Processes in BORPH
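The "UNIX file I/O" point means that software can talk to a hardware process with ordinary `open`, `read`, and `write` calls. The Python sketch below illustrates the idea under stated assumptions: that a register appears as a file holding one 32-bit big-endian word, and that the caller already knows its path. The helper names `write_reg` and `read_reg` are hypothetical, and the actual file layout on a given BORPH system may differ.

```python
import struct


def write_reg(path, value):
    """Write a 32-bit unsigned value to a register exposed as a file (assumed big-endian)."""
    with open(path, "wb") as reg:
        reg.write(struct.pack(">I", value))


def read_reg(path):
    """Read a 32-bit unsigned value back from a register file (assumed big-endian)."""
    with open(path, "rb") as reg:
        return struct.unpack(">I", reg.read(4))[0]


if __name__ == "__main__":
    import os
    import tempfile

    # Stand-in for a real register file; on hardware the path would point at the FPGA process.
    demo_path = os.path.join(tempfile.mkdtemp(), "acc_len")
    write_reg(demo_path, 1024)
    print(read_reg(demo_path))
```

Because the interface is just files, the same calls work from a shell script, over SSH, or from any language, which is what makes remote control and monitoring straightforward.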


Simulink-based Design Tool Flow

• Simulink Xilinx System Generator Library

• Custom BEE2 Library Blocksets

• Software programmable registers

• BEE Platform Studio


FFT controls

Simulink Library – Aaron Parsons, David MacMahon

Verilog Library – Jeff Mock

• Transform length

• Bandwidth

• Complex or Real

• Number of Polarizations

• Input bit width and output bit width

• Twiddle coefficient bit width

• Run-time programmable down-shifting

• Decimate option
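Run-time programmable down-shifting lets the user choose, per FFT stage, whether to halve the data to avoid overflow in fixed-point arithmetic. The Python sketch below shows the effect on a floating-point radix-2 FFT, assuming one bit of a shift mask per stage. The function name `fft_with_shifts` and the bit ordering (bit 0 = first stage) are assumptions for illustration, not the library's interface.

```python
import cmath


def fft_with_shifts(x, shift_mask):
    """Radix-2 decimation-in-time FFT; bit s of shift_mask halves the data after stage s."""
    n = len(x)
    stages = n.bit_length() - 1
    if n < 1 or n != 1 << stages:
        raise ValueError("length must be a power of two")
    # Bit-reversal permutation of the input.
    a = [0j] * n
    for i in range(n):
        rev = int(f"{i:0{stages}b}"[::-1], 2) if stages else 0
        a[rev] = complex(x[i])
    for s in range(stages):
        half = 1 << s
        step = half * 2
        for start in range(0, n, step):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / step)  # twiddle factor
                top = a[start + k]
                bot = a[start + k + half] * w
                a[start + k] = top + bot
                a[start + k + half] = top - bot
        if (shift_mask >> s) & 1:
            a = [v / 2 for v in a]  # down-shift: divide by two after this stage
    return a


if __name__ == "__main__":
    # A constant input puts all power in bin 0; shifting every stage scales it by 1/N.
    print(abs(fft_with_shifts([1] * 8, 0b000)[0]))
    print(abs(fft_with_shifts([1] * 8, 0b111)[0]))
```

Shifting on every stage guarantees no growth but costs dynamic range for noise-like inputs, which is why the schedule is left as a run-time choice.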


PFB vs. FFT


Digital Down-Converter

• Selectable number of FIR taps

• On-the-fly programmable mix frequency

• Selectable FIR coefficients

• Agile sub-band selection.
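The controls listed above map onto the three steps of a digital down-converter: mix with a complex local oscillator, low-pass filter with an FIR, and decimate. The Python sketch below is a minimal software model under those assumptions. The function name `ddc` and its parameters are hypothetical and do not describe the gateware block's ports.

```python
import cmath
import math


def ddc(samples, mix_freq, sample_rate, fir_taps, decimation):
    """Mix real samples down by mix_freq, FIR filter, and keep every decimation-th output."""
    # Step 1: multiply by a complex exponential so the band at +mix_freq moves to DC.
    mixed = [
        s * cmath.exp(-2j * math.pi * mix_freq * n / sample_rate)
        for n, s in enumerate(samples)
    ]
    # Steps 2 and 3: evaluate the FIR only at the output instants that survive decimation.
    n_taps = len(fir_taps)
    out = []
    for n in range(n_taps - 1, len(mixed), decimation):
        out.append(sum(fir_taps[k] * mixed[n - k] for k in range(n_taps)))
    return out


if __name__ == "__main__":
    fs = 8.0
    tone = [math.cos(2 * math.pi * 1.0 * n / fs) for n in range(16)]
    # A 4-tap boxcar averages out the image at twice the mix frequency.
    print(ddc(tone, mix_freq=1.0, sample_rate=fs, fir_taps=[0.25] * 4, decimation=4))
```

Changing `mix_freq` between calls corresponds to the on-the-fly programmable mix frequency, and swapping `fir_taps` corresponds to selecting a different sub-band shape.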


X-Engine Correlation Architecture

(Lynn Urry, Aaron Parsons)


Hardware and Software Libraries


Applications


Applications

• VLBI Mark 5B data recorder – Haystack, NRAO – 512 MHz

• Beamforming – ATA, SMA –

• SETI – Arecibo (UCB), JPL/UCB DSN (Preston, Gulkis, Levin, Jones)

• Correlators and Imagers:

ATA (Aaron Parsons, Mel Wright)

PAPER (Reionization Experiment)

CARMA Next Gen

MeerKAT/SKA South Africa

GMRT next gen correlator

Bologna (SKA), FASR

• Pulsar Timing and Searching, Transient: Green Bank, Allen Telescope Array, VLA, Swinburne (Parkes), MeerKAT, Nancay


SETI Spectrometers

• Parkes Southern SERENDIP

• ALFA SETI Sky Survey (300 MHz x 7 beams)

• JPL DSN Sky Survey (eventually 20 GHz bandwidth)

Radio Astronomy Spectrometers

• GALFA Spectrometer – Arecibo Multibeam Hydrogen Survey

• Astronomy Signal Processor – ASP – Don Backer, Ingrid Stairs, et al (pulsars)

• Antenna Holography, ATNF, China

• GAVRT (DSN education, outreach) – 8 GHz BW – G. Jones

• CMB Bolometer Readout – Caltech, UCB

• Fast Readout Spectrometers (Parkes, NRAO, ATA...)


ATA Fly’s Eye Transient Instrument

44 fast readout spectrometers

3 weeks to build

Geoff Bower, Jim Cordes, Griffin Foster, Joeri van Leeuwen, Peter McMahon, Andrew Siemion, Mark Wagner, Dan Werthimer


Undergraduate Radio Astronomy Course


4096 channel Mars spectrometer

“Chip in a day” FPGA to ASIC


CASPER Correlator Collaboration

Allen Telescope Array (90 µs imaging – G. Jones)

PAPER (Epoch of Reionization)

CARMA Next Generation

MeerKAT/SKA South Africa

GMRT next gen

Bologna

ISI (Infrared) – 6 Gsps

SKADS (Oxford)

SMA next gen (CFA, ASIAA)

FASR, Baryon Acoustic Oscillation


CASPER FX Architecture

[Diagram: F Engines 0 … N-1 and X Engines 0 … N-1, connected through a 10GbE switch]


CASPER FXB Correlator/Beamformer

(correlator needed to calibrate beamformer)

[Diagram: F Engines 0 … N-1 and X Engines 0 … N-1, connected through a 10GbE switch]


Correlators and Beamformers

• Globally Asynchronous (like a computer cluster)

• Data is time stamped with 1 PPS at ADC

• Locally Synchronous, Globally Asynchronous

• Solve the correlator/beamformer interconnect problem by using 10 GbE switches (for both interconnect and fast readout)

• No need for high density complex boards

• Use FIFOs to align data before correlation or beamforming…
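Because the system is globally asynchronous, packets from different boards arrive at different times and must be lined up by timestamp before they are combined. The Python sketch below illustrates FIFO-based alignment under simplified assumptions: each stream delivers `(timestamp, sample)` pairs in increasing timestamp order, and a timestamp is only emitted when every stream has it. The function name `align_streams` and the data layout are hypothetical.

```python
from collections import deque


def align_streams(streams):
    """Return [(timestamp, {stream_name: sample})] for timestamps present in every stream.

    streams maps a stream name to a list of (timestamp, sample) pairs sorted by timestamp.
    """
    fifos = {name: deque(packets) for name, packets in streams.items()}
    aligned = []
    # Stop as soon as any FIFO runs dry: no further common timestamp is possible.
    while all(fifos.values()):
        newest_head = max(fifo[0][0] for fifo in fifos.values())
        # Discard data older than the newest head; it can never be matched.
        for fifo in fifos.values():
            while fifo and fifo[0][0] < newest_head:
                fifo.popleft()
        if all(fifo and fifo[0][0] == newest_head for fifo in fifos.values()):
            aligned.append(
                (newest_head, {name: fifo.popleft()[1] for name, fifo in fifos.items()})
            )
    return aligned


if __name__ == "__main__":
    streams = {
        "ant0": [(0, "a0"), (1, "a1"), (2, "a2"), (3, "a3")],
        "ant1": [(1, "b1"), (2, "b2"), (3, "b3"), (4, "b4")],
    }
    for stamp, samples in align_streams(streams):
        print(stamp, samples)
```

A real design bounds the FIFO depth by the worst-case network skew; this sketch keeps whole lists in memory for clarity.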


Correlator Comparison - 2009 benchmarks

GPU: 1.3 MHz per GPU, Greenhill et al, http://www.scigpu.org

CPU: 2.0 MHz per 8 core computer, Roy, Gupta, et al, optimized code

FPGA: 10.4 MHz per FPGA (CASPER: Xilinx XC5VSX95T ROACH boards)

Power (critical for SKA)

GPU: 150 watts/MHz (including GPU, CPU/motherboard, P.S.)

CPU: 150 watts/MHz

FPGA: 6 watts per MHz (including digitizers, P.S., CPU, motherboard)

ASIC: 2 watts per MHz (estimate)

(For 32 antenna, dual polarization, 500 MHz correlator)
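To put the benchmarks in concrete terms, the Python sketch below scales them to the 500 MHz reference correlator. It assumes the power figures pair up as GPU 150, CPU 150, FPGA 6, and ASIC 2 watts per MHz, and that bandwidth divides evenly across devices. The function names are hypothetical.

```python
import math

# Processed bandwidth per device (MHz), from the 2009 benchmarks above.
MHZ_PER_DEVICE = {"GPU": 1.3, "CPU": 2.0, "FPGA": 10.4}

# Power per MHz of processed bandwidth (watts); the pairing with labels is an assumption.
WATTS_PER_MHZ = {"GPU": 150, "CPU": 150, "FPGA": 6, "ASIC": 2}


def devices_needed(tech, bandwidth_mhz):
    """Number of devices required to cover the bandwidth, rounded up."""
    return math.ceil(bandwidth_mhz / MHZ_PER_DEVICE[tech])


def power_watts(tech, bandwidth_mhz):
    """Total power for the given bandwidth."""
    return WATTS_PER_MHZ[tech] * bandwidth_mhz


if __name__ == "__main__":
    for tech in ("GPU", "CPU", "FPGA"):
        print(tech, devices_needed(tech, 500), "devices,", power_watts(tech, 500), "W")
    print("ASIC", power_watts("ASIC", 500), "W")
```

On these numbers a 500 MHz correlator needs 385 GPUs or 49 FPGAs, and draws 75 kW on GPUs versus 3 kW on FPGAs.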


Packetized FPGA FX Correlator Cost (assuming non hierarchical)

Cost (2010) = Bandwidth/GHz ( $1200 N + $5 N^2 )

2010 $128M for 4,000 antenna, dual pol, 1 GHz bandwidth

2012 $64M

2014 $32M

2016 $16M

2018 $8M

2020 $4M

2022 $2M

• N = number of dual polarization receivers (full Stokes)

• Moore's Law Foundry Prediction through 22nm

• ITRS roadmap predicts slow down (but they always do)
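The cost formula has a term linear in N (per-receiver hardware) and a term quadratic in N (the cross-multiplications). The Python sketch below evaluates the formula exactly as transcribed and finds where the two terms are equal. The function names are hypothetical. Evaluated for N = 4,000 at 1 GHz it gives $84.8M rather than the $128M quoted in the table, so the coefficients as they appear here should be read as approximate.

```python
def correlator_cost_2010(n_receivers, bandwidth_ghz):
    """Cost in 2010 dollars: bandwidth/GHz * ($1200 N + $5 N^2), as transcribed."""
    linear_term = 1200 * n_receivers        # grows with the number of receivers
    quadratic_term = 5 * n_receivers ** 2   # grows with the number of receiver pairs
    return bandwidth_ghz * (linear_term + quadratic_term)


def crossover_receivers():
    """N at which the quadratic term equals the linear term: 1200 N = 5 N^2."""
    return 1200 / 5


if __name__ == "__main__":
    print(crossover_receivers())
    print(correlator_cost_2010(32, 0.5))
    print(correlator_cost_2010(4000, 1.0))
```

Below about 240 receivers the per-receiver term dominates; above it the N^2 term takes over, which is why large arrays are so sensitive to the cost of the cross-multiply stage.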


Astronomy Signal Processor

Terry Filiba, Peter McMahon


Parkes Pulsar Discoveries

Bailes, Filiba, McMahon et al

• BPSR is 13 beams, 1024 channels, 64 µs.

• 50 pulsars

• 11 millisecond pulsars

• a magnetar (Levin et al.)

• a pulsar with a planetary-mass companion.

• large number of RRATs due to increased dynamic range over previous generation filterbanks.


1960 – First Radio Astronomy Digital Correlator

21 lags

300 kHz clock

discrete transistors

$19,000

Sandy Weinreb


Correlator processing power

[Figure: correlator processing power in GFlops (log scale) versus year, 1970 to 2015; labeled points: DAS, DCB, DLB, VLA, EVN/WSRT, SMA, EVLA, LOFAR, ALMA, DXB, SKA]

source: Arnold van Ardenne


Moore's Law – Instruments using FPGAs: 2X per year (1,000,000 over 20 years)


Future Spectrometers (10 GHz per beam)

2015 4 THz 400 beams

2020 128 THz 12,800 beams

2025 4000 THz 40,000 beams

2030 128,000 THz 1M beams
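The bandwidth column follows from doubling once per year, which is a factor of 32 every five years. The Python sketch below reproduces that projection, assuming the 2015 figure of 4 THz as the starting point. The function name `projected_bandwidth_thz` is hypothetical, and the slide's later entries are rounded versions of these values.

```python
def projected_bandwidth_thz(year, base_year=2015, base_thz=4):
    """Total spectrometer bandwidth assuming capability doubles every year."""
    return base_thz * 2 ** (year - base_year)


if __name__ == "__main__":
    for year in (2015, 2020, 2025, 2030):
        print(year, projected_bandwidth_thz(year), "THz")
    # Twenty doublings give the "1,000,000 over 20 years" figure.
    print(2 ** 20)
```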


Cost of FPGA integer computing

2010 $200 per 1E11 MAC/sec

2012 $100

2014 $50

2016 $25

2018 $13

2020 $6

2022 $3

• XC6SLX150T FPGA with PCB, DRAM, P.S., Cooling…

• Moore's Law Foundry Prediction through 22nm

• ITRS roadmap predicts slow down (but they always do)
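The table halves the price every two years, starting from $200 per 1E11 MAC/sec in 2010. The Python sketch below expresses that trend and uses it to price an arbitrary compute load. Both function names are hypothetical, and the projection is only as good as the assumption that the halving continues.

```python
def dollars_per_1e11_mac(year):
    """Projected cost of 1E11 MAC/sec of FPGA integer compute, halving every two years."""
    return 200 / 2 ** ((year - 2010) / 2)


def compute_cost(mac_per_sec, year):
    """Projected cost in dollars of a given multiply-accumulate rate in a given year."""
    return dollars_per_1e11_mac(year) * mac_per_sec / 1e11


if __name__ == "__main__":
    for year in range(2010, 2024, 2):
        print(year, dollars_per_1e11_mac(year))
    print(compute_cost(1e12, 2010))
```

The unrounded values for 2018 to 2022 are $12.50, $6.25, and about $3.13, which the table rounds to $13, $6, and $3.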


CASPER the Friendly...

• Group Helping Open-source Signal-processing Technology (GHOST)

– Goal to help develop signal processing instrumentation and libraries for the community.

– Open source hardware, gateware, and software.

– Provide training and tutorials

– Not so much delivering turn-key instruments

– Promote Collaboration
