Simplest Visual Organs - MIT Media Lab
Mitsubishi Electric Research Laboratories, Raskar 2007<br />
Less is More:<br />
Coded Computational Photography<br />
[Figure: Projector and Tags, positions Pos=0 to Pos=255]<br />
Ramesh Raskar<br />
Mitsubishi Electric Research Labs (MERL)<br />
Cambridge, MA<br />
Simplest Visual Organs<br />
Larval Trematode Worm<br />
‘Single Pixel’ Camera<br />
1
Simplest Visual Organs<br />
?<br />
Larval Trematode Worm<br />
2
Special Aperture<br />
?<br />
Larval Trematode Worm<br />
Special Aperture<br />
The aperture of a 100 mm lens is modified<br />
Insert a coded mask with chosen binary pattern<br />
Rest of the camera is unmodified<br />
3
LED<br />
In Focus Photo<br />
Out of Focus Photo: Open Aperture<br />
4
Out of Focus Photo: Coded Aperture<br />
Bokeh<br />
5
Out of Focus Photo: Coded Aperture<br />
Captured Blurred<br />
Photo<br />
7
Refocused on Person<br />
Blurred Photos<br />
Open Aperture<br />
Coded Aperture, 7 * 7 Mask<br />
8
After Removing De-Focus Blur<br />
Open Aperture<br />
Coded Aperture, 7 * 7 Mask<br />
Motion Blurred Photo<br />
9
Shutter: Short Exposure, Traditional, MURA<br />
Rows: Captured Single Photo, Deblurred Result<br />
Annotations: Dark and noisy; Banding Artifacts and some spatial frequencies are lost<br />
10
Blurring == Convolution<br />
Traditional Camera: Shutter is OPEN: Box Filter<br />
Sharp Photo → Blurred Photo; Fourier Transform of the PSF == Sinc Function (in ω)<br />
Flutter Shutter: Shutter is OPEN and CLOSED<br />
Sharp Photo → Blurred Photo; Fourier Transform of the PSF == Broadband Function<br />
Preserves High Spatial Frequencies<br />
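A minimal numerical sketch of the box-filter versus coded-shutter comparison, assuming NumPy is available. The 8-slot sequence `coded` and the helper `min_spectrum_magnitude` are made up for illustration; this is not the code used in the flutter shutter camera. It only shows that a fully open shutter has exact zeros in its frequency response, while an open/closed sequence can keep every frequency above zero.

```python
import numpy as np

def min_spectrum_magnitude(shutter, n_fft=16):
    """Smallest DFT magnitude of a shutter sequence zero-padded to n_fft samples."""
    spectrum = np.fft.rfft(np.asarray(shutter, dtype=float), n=n_fft)
    return float(np.abs(spectrum).min())

# Traditional shutter: open during all 8 time slots (a box filter).
box = [1, 1, 1, 1, 1, 1, 1, 1]
# Hypothetical coded shutter: open during only 4 of the 8 slots.
coded = [1, 1, 0, 1, 0, 0, 0, 1]

box_min = min_spectrum_magnitude(box)      # hits the zeros of the sinc-like response
coded_min = min_spectrum_magnitude(coded)  # stays away from zero at every frequency
print(box_min, coded_min)
```

Frequencies where the response is zero cannot be recovered by deconvolution, which is why the box filter loses detail even though it collects twice as much light as this coded sequence.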
11
Flutter Shutter Camera<br />
Raskar, Agrawal, Tumblin [Siggraph2006]<br />
LCD opacity switched<br />
in coded sequence<br />
Traditional<br />
Coded<br />
Exposure<br />
Deblurred<br />
Image<br />
Deblurred<br />
Image<br />
Image of<br />
Static<br />
Object<br />
12
Deblurred Images<br />
13
×<br />
14
Application: Aerial Imaging<br />
Sharpness versus Image Pixel Brightness<br />
T=100ms<br />
T = 0<br />
Long Exposure:<br />
Short Exposure:<br />
Flutter Shutter<br />
Shutter Open<br />
Shutter Closed<br />
Time<br />
Sharp image with<br />
sufficient brightness<br />
Motion Blur<br />
Defocus Blur<br />
15
Coded Exposure<br />
Coded Aperture<br />
Temporal 1-D broadband code:<br />
Motion Deblurring<br />
Spatial 2-D broadband mask:<br />
Focus Deblurring<br />
Less is More<br />
Blocking Light<br />
== More Information<br />
Coding in Time<br />
Coding in Space<br />
16
Codes at Work<br />
• Imaging<br />
– Aperture Modification<br />
• Without Lens<br />
– Astronomy [Gottesman and Fenimore, ’89, Skinner, ’88]<br />
– Nuclear Medicine Imaging [Zhang et al.’99]<br />
– Lensless Imaging, [Zomet & Nayar, CVPR’06]<br />
• With Lens<br />
– Range Imaging, [Johnson et al.’00, Hiura and Matsuyama’98, Farid and Simoncelli’98]<br />
– Wavefront Coding, CDM Optics<br />
– Levin et al. Siggraph’07<br />
• Illumination<br />
– Global Direct Separation , [Nayar, Guru, Grossberg, Raskar, Sig’06]<br />
– Veiling Glare Removal , [Talvala, Adams, Levoy, Sig’07]<br />
• Audio<br />
– Reverberation Analysis<br />
• Radar<br />
– Chirps for ranging<br />
Larval Trematode Worm<br />
17
Coded Aperture in Nature ?<br />
Larval Trematode Worm<br />
Turbellarian Worm<br />
Less is More ..<br />
• Coded Exposure<br />
– Motion Deblurring<br />
• Coded Aperture<br />
– Focus Deblurring<br />
• Optical Heterodyning<br />
– Light Field Capture<br />
• Coded Illumination<br />
– Motion Capture<br />
– Multi-flash: Cartoons<br />
[Figure: Projector and Tags, positions Pos=0 to Pos=255]<br />
18
Computational Photography<br />
1. Epsilon Photography<br />
– Multi-photos by perturbing camera parameters<br />
– HDR, panorama<br />
– ‘Ultimate camera’: (Photo-editors)<br />
2. Coded Photography<br />
– Single/few snapshot<br />
– Reversible encoding of data<br />
– Additional sensors/optics/illum<br />
– ‘Scene analysis’ : (Consumer software?)<br />
3. Impossible Photos<br />
– Beyond single view/illum<br />
– Not mimic human eye<br />
– ‘New art form’<br />
[ Agrawal, Raskar, Nayar, Li Siggraph05 ]<br />
No-flash<br />
Flash Result Reflection Layer<br />
Gradient Vector Projection<br />
19
Computational Photography<br />
1. Epsilon Photography<br />
– Multiphotos by varying camera parameters<br />
– HDR, panorama<br />
– ‘Ultimate camera’: (Photo-editor)<br />
2. Coded Photography<br />
– Single/few snapshot<br />
– Reversible encoding of data<br />
– Additional sensors/optics/illum<br />
– ‘Scene analysis’ : (Next software?)<br />
3. Impossible Photos<br />
– Not mimic human eye<br />
– Beyond single view/illum<br />
– ‘New artform’<br />
20
Mask?<br />
Mask<br />
Sensor<br />
Mask<br />
Sensor<br />
Full Resolution Digital Refocusing: Coded Aperture Camera<br />
4D Light Field from 2D Photo: Heterodyne Light Field Camera<br />
Capturing Light Field Inside a Camera<br />
21
Capturing Light Field Inside a Camera<br />
Lenslet-based Light Field camera<br />
[Adelson and Wang, 1992, Ng et al. 2005 ]<br />
Stanford Plenoptic Camera [Ng et al 2005]<br />
Contax medium format camera<br />
Kodak 16-megapixel sensor<br />
Adaptive Optics microlens array<br />
125μ square-sided microlenses<br />
4000 × 4000 pixels ÷ 292 × 292 lenses = 14 × 14 pixels per lens<br />
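The pixels-per-lens figure above is a rounded ratio. A short worked check, using only the numbers on the slide (the variable names are illustrative):

```python
sensor_pixels_per_side = 4000   # Kodak 16-megapixel sensor, 4000 x 4000
lenses_per_side = 292           # microlens array, 292 x 292

# Pixels behind each microlens along one axis.
pixels_per_lens = sensor_pixels_per_side / lenses_per_side
print(round(pixels_per_lens, 2))   # 13.7, which the slide rounds to 14

# Spatial resolution is set by the lens count, angular resolution by pixels per lens.
spatial_samples = lenses_per_side ** 2
angular_samples = round(pixels_per_lens) ** 2
print(spatial_samples, angular_samples)
```

The point of the arithmetic is the trade-off: the 16-megapixel sensor yields a 292 × 292 image, with roughly 14 × 14 directional samples behind each lens.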
22
Digital Refocusing<br />
[Ng et al 2005]<br />
Can we achieve this with a Mask alone?<br />
Heterodyne Light Field Camera<br />
Mask<br />
Sensor<br />
Scanner sensor<br />
Mask<br />
23
How to Capture 4D Light Field with 2D Sensor ?<br />
What should be the pattern of the mask ?<br />
Optical Heterodyning<br />
Radio: Baseband Audio Signal on a High Freq Carrier (100 MHz) gives the Incoming Signal; Receiver: Demodulation with a Reference Carrier (99 MHz)<br />
Camera: Object, Main Lens, Mask, Sensor; the Photographic Signal (Light Field) and the Carrier give the Incident Modulated Signal; Software Demodulation with a Reference Carrier gives the Recovered Light Field<br />
24
Captured 2D Photo<br />
Encoding due to<br />
Cosine Mask<br />
Traditional Camera vs Heterodyne Camera<br />
2D<br />
FFT<br />
Traditional Camera Photo<br />
Magnitude of 2D FFT<br />
2D<br />
FFT<br />
Heterodyne Camera Photo<br />
Magnitude of 2D FFT<br />
25
Computing 4D Light Field<br />
2D Sensor Photo, 1800*1800<br />
→ 2D FFT → 2D Fourier Transform, 1800*1800 (9*9=81 spectral copies)<br />
→ Rearrange 2D tiles into 4D planes, 200*200*9*9<br />
→ 4D IFFT → 4D Light Field, 200*200*9*9<br />
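The FFT, tile rearrangement and inverse FFT steps can be sketched as below, assuming NumPy. The function name `demodulate_light_field` is hypothetical, and the sketch assumes the spectral copies sit on a regular grid of equal tiles in the centred spectrum. A real pipeline would also need calibration of tile positions and per-tile scaling, which the slides do not spell out.

```python
import numpy as np

def demodulate_light_field(photo, n_angular=9):
    """Rearrange the spectral tiles of a 2D photo into a 4D light field (sketch)."""
    height, width = photo.shape
    if height % n_angular or width % n_angular:
        raise ValueError("photo size must be a multiple of n_angular")
    tile_h, tile_w = height // n_angular, width // n_angular

    # 2D FFT with the zero frequency moved to the centre of the array.
    spectrum = np.fft.fftshift(np.fft.fft2(photo))

    # Cut into n_angular x n_angular tiles: axes are (tile_row, y, tile_col, x).
    tiles = spectrum.reshape(n_angular, tile_h, n_angular, tile_w)

    # Stack the tiles as 4D planes: axes become (y, x, tile_row, tile_col).
    spectrum_4d = tiles.transpose(1, 3, 0, 2)

    # 4D inverse FFT back to the light field.
    return np.fft.ifftn(np.fft.ifftshift(spectrum_4d))

photo = np.random.default_rng(0).random((1800, 1800))  # stand-in for a sensor photo
light_field = demodulate_light_field(photo)
print(light_field.shape)  # (200, 200, 9, 9)
```

With a random stand-in image the values are meaningless; the example only confirms the shapes quoted on the slide, 1800*1800 in and 200*200*9*9 out.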
A Theory of Mask-Enhanced Camera<br />
Main Lens<br />
Object Mask Sensor<br />
•Mask == Light Field Modulator<br />
•Intensity of ray gets multiplied by Mask<br />
•Convolution in Frequency domain<br />
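A one-dimensional illustration of why multiplying by a mask is a convolution in the frequency domain, assuming NumPy. The signal, the carrier frequency `f0` and the sample count are arbitrary choices for the demonstration, not values from the talk.

```python
import numpy as np

n = 256                                   # number of samples (hypothetical)
t = np.arange(n)
signal = np.cos(2 * np.pi * 3 * t / n)    # band-limited: energy only at bins +-3
f0 = 40                                   # mask carrier frequency, cycles per n samples
mask = 0.5 * (1 + np.cos(2 * np.pi * f0 * t / n))  # attenuation between 0 and 1

# Each ray's intensity is multiplied by the mask value it passes through.
modulated = signal * mask

magnitude = np.abs(np.fft.fft(modulated)) / n
occupied_bins = np.flatnonzero(magnitude > 1e-6).tolist()
print(occupied_bins)  # [3, 37, 43, 213, 219, 253]
```

The original spectrum (bins 3 and 253) survives because the mask has a constant term, and shifted copies appear around the carrier at bins 40 ± 3 and their mirror images.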
26
Related Work<br />
• Light Field Capture<br />
– Gortler et al., Levoy & Hanrahan, SIG’96, Isaksen et al. SIG’00<br />
– Light Field Microscopy: Levoy et al. SIG’06<br />
– Integral Photography<br />
• Lippmann’08, Ives’30, Georgiev et al. EGSR’06, Okano et al.’97<br />
– Camera arrays: Wilburn et al. SIG’05<br />
– Flatbed Scanner + Lenslet array: Yang, 2000<br />
– Light Field Video Camera: Wilburn et al.'02<br />
– Programmable Aperture: Liang et al. ICIP 2007<br />
– Plenoptic Camera<br />
• Wang and Adelson’92<br />
• Ng et al.’05<br />
Sensor Slice – Fourier Slice Theorem<br />
Photo = Slice of Light Field in Fourier Domain [Ren Ng, SIGGRAPH 2005]<br />
[Figure: band-limited light field in the (f x, f θ) plane, with extents f x0 and f θ0]<br />
27
How to Capture 2D Light Field with 1D Sensor ?<br />
[Figure: Fourier Light Field Space, band-limited light field in the (f x, f θ) plane with extents f x0 and f θ0, and the Sensor Slice along f x]<br />
Extra sensor bandwidth cannot capture extra dimension of the light field<br />
[Figure: Sensor Slice extended along f x by the extra sensor bandwidth]<br />
28
[Figure: (f x, f θ) plane with regions marked ???]<br />
Solution: Modulation Theorem<br />
Make spectral copies of 2D light field<br />
[Figure: Modulation Function in the (f x, f θ) plane, with extents f x0 and f θ0]<br />
29
Sensor Slice captures entire Light Field<br />
[Figure: Modulated Light Field and Modulation Function in the (f x, f θ) plane, with extents f x0 and f θ0]<br />
Demodulation to recover Light Field<br />
1D Fourier Transform of Sensor Signal<br />
Reshape 1D Fourier Transform into 2D<br />
30
Modulation Function == Sum of Impulses<br />
Physical Mask = Sum of Cosines<br />
[Figure: impulses of the Modulation Function in the (f x, f θ) plane, with extents f x0 and f θ0]<br />
Cosine Mask Used<br />
[Figure: Mask Tile with period 1/f 0]<br />
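A one-dimensional sketch of a sum-of-cosines mask and its impulse spectrum, assuming NumPy. Using four harmonics is an assumption chosen so that the mask produces 9 impulses per dimension, matching the 9*9 spectral copies quoted elsewhere in the deck. The function name and the sizes are hypothetical.

```python
import numpy as np

def cosine_mask_1d(n, f0, harmonics=4):
    """Sum of cosines at multiples of f0, rescaled to a printable range [0, 1]."""
    t = np.arange(n)
    pattern = np.ones(n)                               # constant (DC) term
    for k in range(1, harmonics + 1):
        pattern += 2 * np.cos(2 * np.pi * k * f0 * t / n)
    # A physical mask can only attenuate light, so shift and scale into [0, 1].
    return (pattern - pattern.min()) / (pattern.max() - pattern.min())

mask = cosine_mask_1d(n=360, f0=20)                    # mask tile repeats every 360/20 = 18 samples
magnitude = np.abs(np.fft.fft(mask)) / mask.size
impulses = np.flatnonzero(magnitude > 1e-6).tolist()
print(impulses)  # [0, 20, 40, 60, 80, 280, 300, 320, 340]
```

The nine impulses are the DC term plus four harmonics on each side of it. Shifting and scaling the pattern changes only the impulse heights, not their positions.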
31
Where to place the Mask?<br />
[Figure: Mask placed at different positions in front of the Sensor, each shown with its Mask Modulation Function in the (f x, f θ) plane]<br />
32
Where to place the Mask?<br />
Mask<br />
Sensor<br />
v<br />
d<br />
Mask Modulation<br />
Function<br />
α<br />
α = (d/v) (π/2)<br />
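A small worked example of the angle formula, assuming (from the diagram labels) that d is the mask-to-sensor distance and v is the aperture-to-sensor distance. The function name and the example distances are illustrative only.

```python
import math

def modulation_angle(d, v):
    """Angle alpha in radians, alpha = (d / v) * (pi / 2).

    d: assumed mask-to-sensor distance; v: assumed aperture-to-sensor distance.
    """
    if v <= 0 or not 0 <= d <= v:
        raise ValueError("expected 0 <= d <= v and v > 0")
    return (d / v) * (math.pi / 2)

at_sensor = modulation_angle(0.0, 100.0)      # mask on the sensor: alpha = 0
halfway = modulation_angle(50.0, 100.0)       # mask halfway: alpha = pi / 4
at_aperture = modulation_angle(100.0, 100.0)  # mask at the aperture: alpha = pi / 2
print(math.degrees(at_sensor), math.degrees(halfway), math.degrees(at_aperture))
```

Under these assumptions the modulation function sweeps from 0 to 90 degrees in the (f x, f θ) plane as the mask moves from the sensor to the aperture.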
Captured 2D Photo<br />
Encoding due to<br />
Cosine Mask<br />
33
Only cone in<br />
focus<br />
Captured Photo<br />
Digital Refocusing<br />
34
Captured 2D Photo<br />
divided by Image of White Lambertian Plane<br />
= Full resolution 2D image of Focused Scene Parts<br />
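The divide step can be sketched as a per-pixel normalization, assuming NumPy. The function name, the `eps` guard and the synthetic attenuation pattern are assumptions for the demonstration; the slide only states that the captured photo is divided by the image of a white Lambertian plane.

```python
import numpy as np

def remove_mask_shading(captured, white_plane, eps=1e-6):
    """Divide the captured photo by the calibration image of a white plane."""
    captured = np.asarray(captured, dtype=float)
    white_plane = np.asarray(white_plane, dtype=float)
    if captured.shape != white_plane.shape:
        raise ValueError("photo and calibration image must have the same shape")
    # eps guards against division by zero where the mask blocks nearly all light.
    return captured / np.maximum(white_plane, eps)

# Synthetic test: an in-focus scene seen through a cosine attenuation pattern.
rng = np.random.default_rng(0)
scene = rng.random((64, 64))
columns = np.arange(64)
attenuation = 0.6 + 0.4 * np.cos(2 * np.pi * columns / 8)[None, :]  # values in [0.2, 1.0]
captured = scene * attenuation
white_plane = np.ones((64, 64)) * attenuation   # what a uniform white plane would record

recovered = remove_mask_shading(captured, white_plane)
print(np.allclose(recovered, scene))  # True
```

This simple division only undoes the mask for scene parts that are in focus, which is why the slide restricts the result to focused scene parts.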
MERL<br />
Mask-Enhanced Cameras: Heterodyned Light Fields & Coded Aperture<br />
Veeraraghavan, Raskar, Agrawal,<br />
Mohan & Tumblin<br />
Differences with Plenoptic Camera<br />
Plenoptic Camera (Sensor + Microlens array)<br />
• Micro-lens array<br />
• Samples individual rays<br />
• Needs high alignment precision<br />
• Wasted pixels<br />
Heterodyne Camera (Sensor + Mask)<br />
• Narrowband Cosine Mask<br />
• Samples coded combination of rays<br />
• More flexible<br />
• No wastage<br />
35
Coding and Modulation in Camera Using Masks<br />
Mask?<br />
Sensor<br />
Mask<br />
Sensor<br />
Mask<br />
Sensor<br />
Coded Aperture for Full Resolution Digital Refocusing<br />
Heterodyne Light Field Camera<br />
Coded Imaging<br />
• Coded Exposure<br />
– Motion Deblurring<br />
• Coded Aperture<br />
– Focus Deblurring<br />
• Optical Heterodyning<br />
– Light Field Capture<br />
• Coded Illumination<br />
– Motion Capture<br />
– Multi-flash: Cartoons<br />
[Figure: Projector and Tags, positions Pos=0 to Pos=255]<br />
36
Projector-based Displays<br />
Planar<br />
Non-planar<br />
Curved<br />
Objects<br />
Pocket-Proj<br />
2000<br />
1998<br />
2001<br />
2002<br />
Single Projector<br />
[Figure labels: User, Projector]<br />
2000 1999<br />
2002<br />
1999<br />
2003<br />
Multiple<br />
Projectors<br />
Vicon<br />
Optical<br />
Motion Capture<br />
Medical Rehabilitation<br />
Athlete Analysis<br />
Body-worn markers<br />
High-speed<br />
IR Camera<br />
Performance Capture<br />
Biomechanical Analysis<br />
37
Projector<br />
Pos=0<br />
Coded Illumination<br />
High Speed Motion Capture<br />
Pos=255<br />
Tags<br />
- To increase tracking speed<br />
Code position: Non-colocated emitters<br />
- To work in Ambient Light<br />
Code time: 455 kHz modulation<br />
- Invisible<br />
Code wavelength: Infrared<br />
Projector<br />
Light Meters<br />
Pos=0<br />
Tags<br />
- Distributed, wireless<br />
- Real-time location<br />
- Incident light reading<br />
Pos=255<br />
- Annotate Event Photos<br />
- Coded Illumination<br />
- Capture image location of imperceptible tags<br />
- Works in ambient light, 500 Hz<br />
38
Labeling Space<br />
[Figure: Projector and Tags, positions Pos=0 to Pos=255]<br />
Each location receives a unique temporal code<br />
But a 60 Hz video projector is too slow<br />
Fast Switching using Non-colocated Emitters for Structured Light<br />
Tag<br />
Fixed Masks + Blinking LEDs<br />
Time multiplex, Freq or CDMA ?<br />
39
Fast Switching using<br />
Non-colocated<br />
Emitters for Structured Light<br />
How Labeling Works<br />
Light<br />
source<br />
Optics<br />
Screen<br />
GrayCode Mask<br />
pos=0<br />
pos=15<br />
The light sources blink one at a time, so each position on the screen receives a different light pattern.<br />
4 lights give 4-bit position resolution (pos=0 to pos=15).<br />
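A small sketch of the Gray-code labeling with 4 lights and 16 positions. It assumes each light sits behind one bit-plane of the Gray-code mask, with the most significant bit first. The function names and the bit order are illustrative, not taken from the talk.

```python
def gray_encode(position):
    """Binary-reflected Gray code of a non-negative integer position."""
    return position ^ (position >> 1)

def gray_decode(code):
    """Inverse of gray_encode."""
    position = 0
    while code:
        position ^= code
        code >>= 1
    return position

def light_pattern(position, n_lights=4):
    """On/off reading a tag at `position` sees as each light blinks in turn."""
    code = gray_encode(position)
    return [(code >> bit) & 1 for bit in reversed(range(n_lights))]

def position_from_pattern(pattern):
    """Recover the position from the sequence of on/off readings."""
    code = 0
    for bit in pattern:
        code = (code << 1) | bit
    return gray_decode(code)

for pos in (0, 7, 8, 15):
    print(pos, light_pattern(pos))
# 0 [0, 0, 0, 0]
# 7 [0, 1, 0, 0]
# 8 [1, 1, 0, 0]
# 15 [1, 0, 0, 0]
```

Neighbouring positions differ in exactly one reading (compare 7 and 8), so a tag sitting on a boundary between two positions is off by at most one position.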
40
Labeling Space<br />
LED<br />
Optics<br />
Screen<br />
GrayCode Mask<br />
pos=0<br />
pos=15<br />
1 LED for 1 Bit pattern<br />
42
Coded Illumination Projector<br />
Focusing Optics<br />
Condensing Optics<br />
Light Source<br />
Gray code Slide<br />
The Gray code pattern<br />
Photosensing Tag<br />
43
2D Location<br />
3D Location<br />
X data<br />
X2 data<br />
X data<br />
Y data<br />
Y data<br />
Emitter Complexity<br />
Optical<br />
Motion<br />
Capture<br />
Receiver Complexity<br />
44
Imperceptible tags, Ambient Lighting, Id per marker<br />
Prakash [Raskar, Nii, Summet et al Siggraph 2007]<br />
High Speed Tracking<br />
45
Lightmeters: Realistic Editing + Blurring<br />
46
Coded Illumination<br />
for<br />
Motion Capture<br />
• 500 Hz Tracking<br />
• Id for each Marker Tag<br />
• Capture in Natural Environment<br />
– Visually imperceptible tags<br />
– Photosensing Tag can be hidden under clothes<br />
– Ambient lighting is ok<br />
• Unlimited Number of Tags Allowed<br />
• Base station and tags cost only a few tens of dollars<br />
Acknowledgements<br />
• Amit Agrawal, MERL<br />
• Jack Tumblin, Northwestern U.<br />
• Shree Nayar, Columbia U.<br />
• MERL<br />
– Jay Thornton, Keisuke Kojima<br />
• Mitsubishi Electric Japan<br />
– Kazuhiko Sumi, Haruhisa Okuda<br />
• Coded Aperture and Light Field<br />
– Ashok Veeraraghavan, Ankit Mohan<br />
• Prakash, Motion Capture<br />
– Masahiko Inami, Hideaki Nii, Yuki Hashimoto, Jay Summet, Erich Bruns,<br />
Paul Dietz, Bert de Decker, Philippe Bekaert<br />
• Prof Yagi, Prof Ikeuchi and ACCV Organizers<br />
47
Future of Coding Light<br />
• How to block light in other ways?<br />
– Time, Space, Illumination .. Wavelength? On Sensor?<br />
• What other blockers?<br />
– Dynamic masks (LCDs), non-planar or colored masks?<br />
• Applications<br />
– Estimate params in presence of low pass convolution<br />
– Light Field Applications: lens aberration, microscopy<br />
Coded Photography<br />
• Coded Exposure<br />
– Motion Deblurring<br />
• Coded Aperture<br />
– Focus Deblurring<br />
• Optical Heterodyning<br />
– Light Field Capture<br />
• Coded Illumination<br />
– Motion Capture<br />
– Multi-flash: Shape Contours<br />
• Epsilon->Coded->Impossible Photos<br />
[Figure: Projector and Tags, positions Pos=0 to Pos=255]<br />
48
Blind Camera<br />
Sascha Pohflepp,<br />
University of the Arts, Berlin, 2006<br />
49
50
Multi-flash Camera for<br />
Detecting Depth Edges<br />
Left Top Right Bottom<br />
Depth<br />
Edges<br />
Canny Edges<br />
Depth Edges<br />
51
Car Manuals<br />
52
What are the problems with ‘real’ photo in conveying information ?<br />
Why do we hire artists to draw what can be photographed ?<br />
Shadows<br />
Clutter<br />
Many Colors<br />
Highlight Shape Edges<br />
Mark moving parts<br />
Basic colors<br />
53
Shadows<br />
A New Problem<br />
Highlight Edges<br />
Clutter<br />
Mark moving parts<br />
Many Colors<br />
Basic colors<br />
Gestures<br />
Input Photo Canny Edges Depth Edges<br />
54
Depth Edges with MultiFlash<br />
Raskar, Tan, Feris, Jingyi Yu, Turk – ACM SIGGRAPH 2004<br />
55
Depth Discontinuities<br />
Internal and external<br />
Shape boundaries, Occluding contour, Silhouettes<br />
57
Depth<br />
Edges<br />
Canny<br />
Our Method<br />
58
Photo<br />
Result<br />
Our Method<br />
Canny Intensity<br />
Edge Detection<br />
59