14.08.2013 Views

Sumit Basu - MIT Media Laboratory


<strong>Sumit</strong> <strong>Basu</strong><br />

Ph.D. Candidate, <strong>MIT</strong> Dept. of Electrical Engineering and Computer Science, <strong>MIT</strong> <strong>Media</strong> <strong>Laboratory</strong><br />

Personal Data<br />

Address: E15-383, 20 Ames Street<br />

Cambridge, MA 02139 USA<br />

Office: (617) 253-0370<br />

Home: (617) 784-4594<br />

Email: sbasu@media.mit.edu<br />

WWW: http://www.media.mit.edu/~sbasu<br />

Date of Birth: February 12, 1974<br />

Citizenship: USA<br />

Educational Background<br />

Massachusetts Institute of Technology<br />

1997 – Summer 2002 (expected)<br />

Candidate, Ph.D. in Electrical Engineering and Computer Science, minor in Brain and Cognitive Sciences<br />

Advisor: Professor Alex (Sandy) Pentland, <strong>MIT</strong> <strong>Media</strong> <strong>Laboratory</strong><br />

Thesis: Conversational Scene Analysis<br />

GPA: 5.0/5.0<br />

Massachusetts Institute of Technology<br />

1995 – 1997<br />

Master of Engineering in Electrical Engineering and Computer Science<br />

Advisor: Professor Alex (Sandy) Pentland, <strong>MIT</strong> <strong>Media</strong> <strong>Laboratory</strong><br />

Thesis: A Three-Dimensional Model of Human Lip Motions<br />

GPA: 5.0/5.0<br />

Massachusetts Institute of Technology<br />

1991 – 1995<br />

Bachelor of Science in Electrical Science and Engineering, Phi Beta Kappa<br />

Thesis: Hyperacuity Sensing for Image Processing<br />

GPA: 5.0/5.0<br />

Research/Industrial Experience<br />

<strong>MIT</strong> <strong>Media</strong> <strong>Laboratory</strong>, Vision and Modeling Group<br />

1995 – present<br />

Research Assistant<br />

Advisor: Professor Alex (Sandy) Pentland<br />

Topics: modeling conversational interactions from low-level features, prosodic feature estimation, active interfaces, Bayesian networks (exact and approximate inference), speech detection, wearable phased arrays, source localization, maximum likelihood tracking for deformable meshes (applied to lip tracking), finite element priors for 3D meshes, optical-flow regularization with 3D models (applied to head tracking), vision-steered beamforming.<br />

Perceptive Network Technologies<br />

August 2000<br />

Perceptual Engineer<br />

Supervisor: Dr. Julian Center, Jr.<br />

Topic: developed a real-time speech detection module.<br />


Microsoft Research, Vision Group<br />

August 1998<br />

Research Consultant<br />

Advisor: Kentaro Toyama<br />

Topic: developed a real-time version of my 3D lip tracking work, began the development of a mesh-based smoothing algorithm.<br />

<strong>MIT</strong> <strong>Media</strong> <strong>Laboratory</strong>, Speech Interfaces Group<br />

1993 – 1995<br />

Undergraduate Research Assistant<br />

Advisor: Chris Schmandt<br />

Topics: audio interfaces, speaker identification and segmentation.<br />

Xerox PARC, Electronic Materials <strong>Laboratory</strong> (EML)<br />

June 1994 – August 1994<br />

Research Intern<br />

Advisors: Warren Jackson, David Biegelsen, David Jared<br />

Topic: image enhancement using position-sensitive detectors (PSD) and their mechanisms as applied to standard scanning elements.<br />

Xerox PARC, Computer Science <strong>Laboratory</strong> (CSL)<br />

June 1993 – August 1993<br />

Research Intern<br />

Advisor: David Goldberg<br />

Topics: handwritten character recognition for a palmtop device, pen-based interfaces.<br />

Iowa Department of Transportation, Information Systems<br />

May 1992 – August 1992<br />

Developer<br />

Supervisors: Ron Laird, Robert Klopping<br />

Topics: developed a C++ application, BIAS (Bid Item Automation System), for entering/editing contract items, to be used by all contractors working for the state of Iowa.<br />

Research Interests<br />

Please see http://www.media.mit.edu/~sbasu for a full statement of my research interests.<br />

• Machine Perception (Vision and Audition)<br />

• Speech Processing<br />

• Machine Learning<br />

• Computer-Human Interfaces<br />

Teaching Experience<br />

<strong>MIT</strong> Department of Electrical Engineering and Computer Science<br />

Fall 1998<br />

Graduate Teaching Assistant<br />

Class: Probabilistic Systems Analysis (6.041/6.431)<br />

Supervisor: Professor Dimitri Bertsekas<br />

Description: Undergraduate/Graduate introduction to probability, going from basic axioms through central limit theorems, Markov chains, and basic stochastic processes.<br />

Duties: Each week, taught one 40-student recitation (undergraduates) and eight 5-student tutorials (interactive problem solving), held two to four office hours (undergraduates/graduates), and attended a two-hour staff meeting with other TAs. Additionally graded papers, led several quiz reviews for the entire class (300+ students), and developed handouts for the students. Received an excellent evaluation in the students’ “Underground Guide”: “Recitation instructor S. <strong>Basu</strong> was extremely well-liked by his students, receiving comments such as


"<strong>Sumit</strong> rocks the house!" from his students. He was very available, very friendly,<br />

and very helpful, taking the time to help his students whenever it was needed.<br />

He was considered an excellent teacher, who was always prepared and had<br />

good examples to clarify his points. His tutorials were "great fun" and very<br />

helpful for learning. His explanations were clear, and very helpful.”<br />

<strong>MIT</strong> <strong>Media</strong> <strong>Laboratory</strong><br />

Fall 1995 – present<br />

UROP (Undergraduate Research Opportunities Program) Supervisor<br />

Supervisor: Alex (Sandy) Pentland<br />

Description: Held regular meetings with undergraduate researchers under my supervision, taught relevant theory, gave guidance in choosing research goals and coursework/career paths, engaged in many one-on-one help sessions, and evaluated their progress.<br />

<strong>MIT</strong> F/ASIP (Freshmen/Alumni Summer Internship Program)<br />

Spring 1998; Spring 1999<br />

Counselor<br />

Supervisor/Organizer: Professor Arthur Steinberg<br />

Duties: Worked with groups of freshmen in the F/ASIP program. Led resume workshops, simulated work scenarios, and helped students develop simulated consulting reports. Also advised students in course selection and career strategies.<br />

<strong>MIT</strong> ISP (Integrated Studies Program)<br />

Spring 2001<br />

Guest Lecturer<br />

Supervisor: Professor Arthur Steinberg<br />

Gave a guest lecture on “Artificial Intelligence and the Future of Computing” to a group of 40 freshmen.<br />

<strong>MIT</strong> ESP (Educational Studies Program)<br />

Fall 2001<br />

Teacher for Songwriting Workshop<br />

Supervisor: <strong>MIT</strong> ESP<br />

Co-taught (with Regalp Sen) a two-hour workshop on songwriting skills for 25 high school students. Led warm-up exercises, discussed elements of rhyme and meter, and helped students combine words and music in group activities.<br />

Teaching Interests<br />

This is a subset of the possible courses I would be interested in teaching:<br />

• Introduction to Probability and Statistics (Undergraduate/Graduate)<br />

• Introduction to Pattern Recognition/Machine Learning (Undergraduate/Graduate)<br />

• Computational Perception – Vision and Audition (Undergraduate/Graduate)<br />

• Signal Processing for Voice and Music Applications (Undergraduate/Graduate)<br />

• Speech Processing/Speech Recognition (Graduate)<br />

• Machine Perception Seminar (Graduate)<br />

• Adaptive Interfaces (Graduate)<br />

• Bayesian Networks (Graduate)<br />

• Machine Learning Seminar (Graduate)<br />

Awards, Honor Societies, and Fellowships<br />

• NSF Graduate Research Fellowship – Sept. 1995 - Aug. 1998<br />

• Member, Phi Beta Kappa (Undergraduate Honor Society) – May, 1995 - present<br />

• Member, Sigma Xi (Scientific Research Society) – May, 1995 - present<br />


• Member, Tau Beta Pi (Engineering Honor Society) – March, 1994 - present<br />

• Member, Eta Kappa Nu (EECS Honor Society) – March, 1994 - present<br />

• Winner, <strong>MIT</strong> 6.270 Robotics Competition (team: Loren Shih, myself) – January, 1993<br />

• United States Presidential Scholar - 1991<br />

Patents<br />

Warren B. Jackson, David A. Jared, <strong>Sumit</strong> <strong>Basu</strong>, and David K. Biegelsen. "Macrodetector-Based<br />

Image Conversion System." US Patent No. 5,790,699. Granted August 4, 1998. Available at<br />

http://www.uspto.gov/patft/index.html.<br />

Warren B. Jackson, David A. Jared, <strong>Sumit</strong> <strong>Basu</strong>, and David K. Biegelsen. "Position Sensitive<br />

Detector Based Image Conversion System Capable of Preserving Subpixel Information." US<br />

Patent No. 5,754,690. Granted May 19, 1998. Available at http://www.uspto.gov/patft/index.html.<br />

Julian L. Center, Jr., Christopher R. Wren, Alex Pentland, Trevor Darrell, <strong>Sumit</strong> <strong>Basu</strong>, and<br />

Evgeniy Gusvatin. "Method of Establishing a Communications Link Using Perceptual Sensing of<br />

a User's Presence." Filed November 10, 2000 (Pending).<br />

Julian L. Center, Jr., Christopher R. Wren, Alex Pentland, Trevor Darrell, and <strong>Sumit</strong> <strong>Basu</strong>.<br />

"Method of Extending Image-Band Face Recognition Systems to Utilize Multi-View Image<br />

Sequences and Audio Information." Filed November 10, 2000 (Pending).<br />

Refereed Journal Publications<br />

<strong>Sumit</strong> <strong>Basu</strong>, Nuria Oliver, and Alex Pentland. "3D Lip Shapes from Video: A Combined Physical-Statistical Model." Speech Communication 26, 1998. pp. 131-148.<br />

Refereed Conference/Workshop Publications<br />

<strong>Sumit</strong> <strong>Basu</strong>*, Tanzeem Choudhury*, Brian Clarkson*, and Alex Pentland. "Towards Measuring<br />

Human Interactions in Conversational Settings." In Proceedings of the IEEE Int’l Workshop on<br />

Cues in Communication (CUES 2001) at CVPR 2001. Kauai, Hawaii. *The first three authors<br />

contributed equally to this work and are listed alphabetically.<br />

<strong>Sumit</strong> <strong>Basu</strong>, Brian Clarkson, and Alex Pentland. "Smart Headphones: Enhancing Auditory<br />

Awareness through Robust Speech Detection and Source Localization." In Proceedings of the<br />

IEEE Conf. on Acoustics, Speech, and Signal Processing (ICASSP ’01). Salt Lake City, Utah.<br />

2001.<br />

<strong>Sumit</strong> <strong>Basu</strong>, Brian Clarkson, and Alex Pentland. "Smart Headphones." In Proceedings of the<br />

Conference on Human Factors in Computing Systems (CHI ’01). (Short Paper). Seattle,<br />

Washington. April, 2001.<br />

<strong>Sumit</strong> <strong>Basu</strong>, Steve Schwartz, and Alex Pentland. "Wearable Phased Arrays for Sound Localization<br />

and Enhancement." In Proceedings of the IEEE Int’l Symposium on Wearable Computing (ISWC<br />

’00). Atlanta, Georgia. October, 2000. pp. 103-110.<br />

Jacob Strom, Tony Jebara, <strong>Sumit</strong> <strong>Basu</strong>, and Alex Pentland. "Real Time Tracking and Modeling of<br />

Faces: An EKF-based Analysis by Synthesis Approach." In Proceedings of the IEEE Modeling<br />

People Workshop at the IEEE Int’l Conf. on Computer Vision 1999 (ICCV '99). Kerkyra, Greece.<br />

September, 1999.


Christopher R. Wren, <strong>Sumit</strong> <strong>Basu</strong>, Flavia Sparacino, and Alex Pentland. "Combining Audio and<br />

Video in Perceptive Spaces." In Proceedings of the First Int’l Workshop on Managing Interactions<br />

in Smart Environments. Dublin, Ireland. 1999.<br />

<strong>Sumit</strong> <strong>Basu</strong>, Nuria Oliver, and Alex Pentland. "Coding Human Lip Motions with a Learned 3D<br />

Model." In Proceedings of the Int’l Workshop on Very Low Bitrate Video Coding (VLBV ’98).<br />

Urbana, Illinois. October, 1998.<br />

<strong>Sumit</strong> <strong>Basu</strong>, Nuria Oliver, and Alex Pentland. "3D Modeling and Tracking of Human Lips." In<br />

Proceedings of the IEEE Int'l Conf. on Computer Vision (ICCV ’98). Mumbai, India. January,<br />

1998. pp. 337-343.<br />

Christopher R. Wren, <strong>Sumit</strong> <strong>Basu</strong>, and Alex Pentland. "Perceptive Spaces: Learning Dynamic

Models of Human Behavior." In Proceedings of the Workshop on Perceptual User Interfaces (PUI<br />

'97). Banff, Canada. 1997.<br />

<strong>Sumit</strong> <strong>Basu</strong> and Alex Pentland. "Recovering 3D Lip Structure from 2D Observations Using a<br />

Model Trained from Video." In Proceedings of the ESCA Workshop on Audio-Visual Speech<br />

Processing (AVSP'97). Rodos, Greece. 1997.<br />

<strong>Sumit</strong> <strong>Basu</strong> and Alex Pentland. "A Three-Dimensional Model of Human Lip Motion." In<br />

Proceedings of the IEEE Non-Rigid and Articulated Motion Workshop at the IEEE Conference on<br />

Computer Vision and Pattern Recognition (CVPR '97). San Juan, Puerto Rico. June, 1997.<br />

Irfan Essa, <strong>Sumit</strong> <strong>Basu</strong>, Trevor Darrell, and Alex Pentland. "Modeling, Tracking, and Interactive<br />

Animation of Faces and Heads Using Input from Video." In Proceedings of Computer Animation<br />

’96. Geneva, Switzerland. 1996.<br />

<strong>Sumit</strong> <strong>Basu</strong>, Irfan Essa, and Alex Pentland. "Motion Regularization for Model-Based Head<br />

Tracking." In Proceedings of the 13th IEEE Int'l Conf. on Pattern Recognition (ICPR '96). Vienna,

Austria. September, 1996. pp. 611-616.<br />

<strong>Sumit</strong> <strong>Basu</strong>, Michael Casey, Bill Gardner, Ali Azarbayejani, and Alex Pentland. "Vision-Steered<br />

Audio for Interactive Environments." In Proceedings of the 1996 Image Communications<br />

Conference (IMAGE'COM '96). Bordeaux, France. 1996.<br />

Michael Casey, William G. Gardner, and <strong>Sumit</strong> <strong>Basu</strong>. "Vision Steered Beamforming and

Transaural Rendering for the Artificial Life Interactive Video Environment (ALIVE)." In<br />

Proceedings of the 99th Convention of the Audio Engineering Society. 1995.<br />

Selected Technical Reports<br />

<strong>Sumit</strong> <strong>Basu</strong>*, Tanzeem Choudhury*, Brian Clarkson*, and Alex Pentland. "Learning Human<br />

Interactions with the Influence Model." Vismod Technical Report #539. June, 2001. *The first<br />

three authors contributed equally to this work and are listed alphabetically.<br />

<strong>Sumit</strong> <strong>Basu</strong>. “ICA: A Critical Review of Three Prominent Approaches.” Technical Report. April,<br />

2000.<br />

<strong>Sumit</strong> <strong>Basu</strong>. "Empirical Results on the Generalization Capabilities and Convergence Properties of<br />

the Bayes Point Machine." December, 1999.<br />

<strong>Sumit</strong> <strong>Basu</strong>, Kentaro Toyama, and Alex Pentland. "A Consistent Method for Function<br />

Approximation in Mesh-based Applications." Vismod Technical Report #486. 1999.


<strong>Sumit</strong> <strong>Basu</strong>. "Efficient Multiscale Template Matching with Orthogonal Wavelet Decompositions."<br />

May, 1997.<br />

Trevor Darrell, <strong>Sumit</strong> <strong>Basu</strong>, Christopher Wren, and Alex Pentland. "Perceptually-Driven Avatars<br />

and Interfaces: Active Methods for Direct Control." Vismod Technical Report #416. 1997.<br />

Invited Talks<br />

<strong>Sumit</strong> <strong>Basu</strong> and Alex Pentland, "Concept Formation in Multi-Modal Learning." In Alex Pentland,<br />

Tony Jebara, Brian Clarkson, and <strong>Sumit</strong> <strong>Basu</strong>, Learning Techniques in Audio-Visual Information<br />

Processing, a tutorial at the Int’l Conf. on Pattern Recognition (ICPR ’00) Barcelona, Spain.<br />

September 3, 2000.<br />

<strong>Sumit</strong> <strong>Basu</strong>. "Empirical Results on the Generalization Capabilities and Convergence Properties of<br />

the Bayes Point Machine." Invited talk at Tomaso Poggio’s group meeting, <strong>MIT</strong> AI Lab/CBCL.<br />

May 12, 2000.<br />

<strong>Sumit</strong> <strong>Basu</strong>, Deb Roy, Brian Clarkson, and Alex Pentland. "Learning the Structure of Human<br />

Behavior from Sensory Inputs: Language, Daily Patterns, and Conversations." At Grounded<br />

Intersensory Language Learning in Sign and Speech (GILLS ’00). Grenoble, France. March 24,<br />

2000.<br />

Musical Projects<br />

I am an avid songwriter/singer/keyboardist/guitarist, and am involved in a number of musical projects:<br />

08:29:06, a solo album I released in June 2000 under the name deepoceanblue, available at http://www.mp3.com/deepoceanblue<br />

Sonovar/Bodybeat, an electronic music project experimenting with convolution and natural sounds as musical building blocks (joint work with Brian Clarkson). We received an <strong>MIT</strong> Council for the Arts Grant for this project in February 2000. Samples are available on my website.<br />

Additional projects are listed at http://www.media.mit.edu/~sbasu/music.html<br />

References<br />

Listed in a separate document.
