Sumit Basu - MIT Media Laboratory
<strong>Sumit</strong> <strong>Basu</strong><br />
Ph.D. Candidate, <strong>MIT</strong> Dept. of Electrical Engineering and Computer Science, <strong>MIT</strong> <strong>Media</strong> <strong>Laboratory</strong><br />
Personal Data<br />
Address: E15-383, 20 Ames Street<br />
Cambridge, MA 02139 USA<br />
Office: (617) 253-0370<br />
Home: (617) 784-4594<br />
Email: sbasu@media.mit.edu<br />
WWW: http://www.media.mit.edu/~sbasu<br />
Date of Birth: February 12, 1974<br />
Citizenship: USA<br />
Educational Background<br />
Massachusetts Institute of Technology<br />
Candidate, Ph.D. in Electrical Engineering and Computer Science, minor in<br />
Brain and Cognitive Sciences<br />
Advisor: Professor Alex (Sandy) Pentland, <strong>MIT</strong> <strong>Media</strong> <strong>Laboratory</strong><br />
Thesis: Conversational Scene Analysis<br />
GPA: 5.0/5.0<br />
Massachusetts Institute of Technology<br />
Master of Engineering in Electrical Engineering and Computer Science<br />
Advisor: Professor Alex (Sandy) Pentland, <strong>MIT</strong> <strong>Media</strong> <strong>Laboratory</strong><br />
Thesis: A Three-Dimensional Model of Human Lip Motions<br />
GPA: 5.0/5.0<br />
Massachusetts Institute of Technology<br />
Bachelor of Science in Electrical Science and Engineering, Phi Beta Kappa<br />
Thesis: Hyperacuity Sensing for Image Processing<br />
GPA: 5.0/5.0<br />
Research/Industrial Experience<br />
<strong>MIT</strong> <strong>Media</strong> <strong>Laboratory</strong>, Vision and Modeling Group<br />
Research Assistant<br />
Advisor: Professor Alex (Sandy) Pentland<br />
Topics: modeling conversational interactions from low-level features, prosodic<br />
feature estimation, active interfaces, Bayesian networks (exact and approximate<br />
inference), speech detection, wearable phased arrays, source localization,<br />
maximum likelihood tracking for deformable meshes (applied to lip tracking),<br />
finite element priors for 3D meshes, optical-flow regularization with 3D models<br />
(applied to head tracking), vision-steered beamforming.<br />
Perceptive Network Technologies<br />
Perceptual Engineer<br />
Supervisor: Dr. Julian Center, Jr.<br />
Topic: developed a real-time speech detection module.<br />
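Dates for the entries above:<br />
Ph.D. candidacy: 1997 – Summer 2002 (expected)<br />
Master of Engineering: 1995 – 1997<br />
Bachelor of Science: 1991 – 1995<br />
<strong>MIT</strong> <strong>Media</strong> <strong>Laboratory</strong>, Vision and Modeling Group: 1995 – present<br />
Perceptive Network Technologies: August 2000<br />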
Microsoft Research, Vision Group<br />
Research Consultant<br />
Advisor: Kentaro Toyama<br />
Topic: developed a real-time version of my 3D lip tracking work, began the<br />
development of a mesh-based smoothing algorithm.<br />
<strong>MIT</strong> <strong>Media</strong> <strong>Laboratory</strong>, Speech Interfaces Group<br />
Undergraduate Research Assistant<br />
Advisor: Chris Schmandt<br />
Topics: audio interfaces, speaker identification and segmentation.<br />
Xerox PARC, Electronic Materials <strong>Laboratory</strong> (EML)<br />
Research Intern<br />
Advisors: Warren Jackson, David Biegelsen, David Jared<br />
Topic: image enhancement using position-sensitive detectors (PSD) and their<br />
mechanisms as applied to standard scanning elements.<br />
Xerox PARC, Computer Science <strong>Laboratory</strong> (CSL)<br />
Research Intern<br />
Advisor: David Goldberg<br />
Topics: handwritten character recognition for a palmtop device, pen-based<br />
interfaces<br />
Iowa Department of Transportation, Information Systems<br />
Developer<br />
Supervisors: Ron Laird, Robert Klopping<br />
Topics: developed a C++ application, BIAS (Bid Item Automation System), for<br />
entering/editing contract items, to be used by all contractors working for the<br />
state of Iowa.<br />
Research Interests<br />
Please see http://www.media.mit.edu/~sbasu for a full statement of my research interests.<br />
• Machine Perception (Vision and Audition)<br />
• Speech Processing<br />
• Machine Learning<br />
• Computer-Human Interfaces<br />
Teaching Experience<br />
<strong>MIT</strong> Department of Electrical Engineering and Computer Science<br />
Graduate Teaching Assistant<br />
Class: Probabilistic Systems Analysis (6.041/6.431)<br />
Supervisor: Professor Dimitri Bertsekas<br />
Description: Undergraduate/Graduate introduction to probability, going from<br />
basic axioms through central limit theorems, Markov chains, and basic stochastic<br />
processes.<br />
Duties: Each week, taught one 40-student recitation (undergraduates) and eight 5-student<br />
tutorials (interactive problem solving), held two to four office hours<br />
(undergraduates/graduates), and attended a two-hour staff meeting with other<br />
TAs. Additionally graded papers, led several quiz reviews for the entire class<br />
(300+ students), and developed handouts for the students. Received an excellent<br />
evaluation in the students’ “Underground Guide”: “Recitation instructor S.<br />
<strong>Basu</strong> was extremely well-liked by his students, receiving comments such as<br />
‘<strong>Sumit</strong> rocks the house!’ from his students. He was very available, very friendly,<br />
and very helpful, taking the time to help his students whenever it was needed.<br />
He was considered an excellent teacher, who was always prepared and had<br />
good examples to clarify his points. His tutorials were ‘great fun’ and very<br />
helpful for learning. His explanations were clear, and very helpful.”<br />
Dates for the positions listed above:<br />
Microsoft Research, Vision Group: August 1998<br />
<strong>MIT</strong> <strong>Media</strong> <strong>Laboratory</strong>, Speech Interfaces Group: 1993 – 1995<br />
Xerox PARC, Electronic Materials <strong>Laboratory</strong> (EML): June 1994 – August 1994<br />
Xerox PARC, Computer Science <strong>Laboratory</strong> (CSL): June 1993 – August 1993<br />
Iowa Department of Transportation, Information Systems: May 1992 – August 1992<br />
Graduate Teaching Assistant, Probabilistic Systems Analysis: Fall 1998<br />
<strong>MIT</strong> <strong>Media</strong> <strong>Laboratory</strong><br />
UROP (Undergraduate Research Opportunities Program) Supervisor<br />
Supervisor: Alex (Sandy) Pentland<br />
Description: Held regular meetings with undergraduate researchers under my<br />
supervision, taught relevant theory, gave guidance in choosing research goals<br />
and coursework/career paths, engaged in many one-on-one help sessions,<br />
evaluated their progress.<br />
<strong>MIT</strong> F/ASIP (Freshmen/Alumni Summer Internship Program)<br />
Counselor<br />
Supervisor/Organizer: Professor Arthur Steinberg<br />
Duties: Worked with groups of freshmen in the F/ASIP program. Led resume<br />
workshops, simulated work scenarios, helped students develop simulated<br />
consulting reports. Also advised students in course selection and career<br />
strategies.<br />
<strong>MIT</strong> ISP (Integrated Studies Program)<br />
Guest Lecturer<br />
Supervisor: Professor Arthur Steinberg<br />
Gave a guest lecture on “Artificial Intelligence and the Future of Computing” to<br />
a group of 40 freshmen.<br />
<strong>MIT</strong> ESP (Educational Studies Program)<br />
Teacher for Songwriting Workshop<br />
Supervisor: <strong>MIT</strong> ESP<br />
Co-taught (with Regalp Sen) a two-hour workshop on songwriting skills for 25<br />
high school students. Led warm-up exercises, discussed elements of rhyme and<br />
meter, helped students combine words and music in group activities.<br />
Teaching Interests<br />
This is a subset of the possible courses I would be interested in teaching:<br />
• Introduction to Probability and Statistics (Undergraduate/Graduate)<br />
• Introduction to Pattern Recognition/Machine Learning (Undergraduate/Graduate)<br />
• Computational Perception – Vision and Audition (Undergraduate/Graduate)<br />
• Signal Processing for Voice and Music Applications (Undergraduate/Graduate)<br />
• Speech Processing/Speech Recognition (Graduate)<br />
• Machine Perception Seminar (Graduate)<br />
• Adaptive Interfaces (Graduate)<br />
• Bayesian Networks (Graduate)<br />
• Machine Learning Seminar (Graduate)<br />
Dates for the teaching positions above:<br />
<strong>MIT</strong> <strong>Media</strong> <strong>Laboratory</strong> UROP Supervisor: Fall 1995 – present<br />
<strong>MIT</strong> F/ASIP Counselor: Spring 1998; Spring 1999<br />
<strong>MIT</strong> ISP Guest Lecturer: Spring 2001<br />
<strong>MIT</strong> ESP Songwriting Workshop: Fall 2001<br />
Awards, Honor Societies, and Fellowships<br />
• NSF Graduate Research Fellowship – Sept. 1995 - Aug. 1998<br />
• Member, Phi Beta Kappa (Undergraduate Honor Society) – May, 1995 - present<br />
• Member, Sigma Xi (Scientific Research Society) – May, 1995 - present<br />
• Member, Tau Beta Pi (Engineering Honor Society) – March, 1994 - present<br />
• Member, Eta Kappa Nu (EECS Honor Society) – March, 1994 - present<br />
• Winner, <strong>MIT</strong> 6.270 Robotics Competition (team: Loren Shih, myself) – January, 1993<br />
• United States Presidential Scholar - 1991<br />
Patents<br />
Warren B. Jackson, David A. Jared, <strong>Sumit</strong> <strong>Basu</strong>, and David K. Biegelsen. "Macrodetector-Based<br />
Image Conversion System." US Patent No. 5,790,699. Granted August 4, 1998. Available at<br />
http://www.uspto.gov/patft/index.html.<br />
Warren B. Jackson, David A. Jared, <strong>Sumit</strong> <strong>Basu</strong>, and David K. Biegelsen. "Position Sensitive<br />
Detector Based Image Conversion System Capable of Preserving Subpixel Information." US<br />
Patent No. 5,754,690. Granted May 19, 1998. Available at http://www.uspto.gov/patft/index.html.<br />
Julian L. Center, Jr., Christopher R. Wren, Alex Pentland, Trevor Darrell, <strong>Sumit</strong> <strong>Basu</strong>, and<br />
Evgeniy Gusvatin. "Method of Establishing a Communications Link Using Perceptual Sensing of<br />
a User's Presence." Filed November 10, 2000 (Pending).<br />
Julian L. Center, Jr., Christopher R. Wren, Alex Pentland, Trevor Darrell, and <strong>Sumit</strong> <strong>Basu</strong>.<br />
"Method of Extending Image-Band Face Recognition Systems to Utilize Multi-View Image<br />
Sequences and Audio Information." Filed November 10, 2000 (Pending).<br />
Refereed Journal Publications<br />
<strong>Sumit</strong> <strong>Basu</strong>, Nuria Oliver, and Alex Pentland. "3D Lip Shapes from Video: A Combined Physical-<br />
Statistical Model." Speech Communication 26, 1998. pp. 131-148.<br />
Refereed Conference/Workshop Publications<br />
<strong>Sumit</strong> <strong>Basu</strong>*, Tanzeem Choudhury*, Brian Clarkson*, and Alex Pentland. "Towards Measuring<br />
Human Interactions in Conversational Settings." In Proceedings of the IEEE Int’l Workshop on<br />
Cues in Communication (CUES 2001) at CVPR 2001. Kauai, Hawaii. *The first three authors<br />
contributed equally to this work and are listed alphabetically.<br />
<strong>Sumit</strong> <strong>Basu</strong>, Brian Clarkson, and Alex Pentland. "Smart Headphones: Enhancing Auditory<br />
Awareness through Robust Speech Detection and Source Localization." In Proceedings of the<br />
IEEE Conf. on Acoustics, Speech, and Signal Processing (ICASSP ’01). Salt Lake City, Utah.<br />
2001.<br />
<strong>Sumit</strong> <strong>Basu</strong>, Brian Clarkson, and Alex Pentland. "Smart Headphones." In Proceedings of the<br />
Conference on Human Factors in Computing Systems (CHI ’01). (Short Paper). Seattle,<br />
Washington. April, 2001.<br />
<strong>Sumit</strong> <strong>Basu</strong>, Steve Schwartz, and Alex Pentland. "Wearable Phased Arrays for Sound Localization<br />
and Enhancement." In Proceedings of the IEEE Int’l Symposium on Wearable Computing (ISWC<br />
’00). Atlanta, Georgia. October, 2000. pp. 103-110.<br />
Jacob Strom, Tony Jebara, <strong>Sumit</strong> <strong>Basu</strong>, and Alex Pentland. "Real Time Tracking and Modeling of<br />
Faces: An EKF-based Analysis by Synthesis Approach." In Proceedings of the IEEE Modeling<br />
People Workshop at the IEEE Int’l Conf. on Computer Vision 1999 (ICCV '99). Kerkyra, Greece.<br />
September, 1999.
Christopher R. Wren, <strong>Sumit</strong> <strong>Basu</strong>, Flavia Sparacino, and Alex Pentland. "Combining Audio and<br />
Video in Perceptive Spaces." In Proceedings of the First Int’l Workshop on Managing Interactions<br />
in Smart Environments. Dublin, Ireland. 1999.<br />
<strong>Sumit</strong> <strong>Basu</strong>, Nuria Oliver, and Alex Pentland. "Coding Human Lip Motions with a Learned 3D<br />
Model." In Proceedings of the Int’l Workshop on Very Low Bitrate Video Coding (VLBV ’98).<br />
Urbana, Illinois. October, 1998.<br />
<strong>Sumit</strong> <strong>Basu</strong>, Nuria Oliver, and Alex Pentland. "3D Modeling and Tracking of Human Lips." In<br />
Proceedings of the IEEE Int'l Conf. on Computer Vision (ICCV ’98). Mumbai, India. January,<br />
1998. pp. 337-343.<br />
Christopher R. Wren, <strong>Sumit</strong> <strong>Basu</strong>, and Alex Pentland. "Perceptive Spaces: Learning Dynamic<br />
Models of Human Behavior." In Proceedings of the Workshop on Perceptual User Interfaces (PUI<br />
'97). Banff, Canada. 1997.<br />
<strong>Sumit</strong> <strong>Basu</strong> and Alex Pentland. "Recovering 3D Lip Structure from 2D Observations Using a<br />
Model Trained from Video." In Proceedings of the ESCA Workshop on Audio-Visual Speech<br />
Processing (AVSP'97). Rodos, Greece. 1997.<br />
<strong>Sumit</strong> <strong>Basu</strong> and Alex Pentland. "A Three-Dimensional Model of Human Lip Motion." In<br />
Proceedings of the IEEE Non-Rigid and Articulated Motion Workshop at the IEEE Conference on<br />
Computer Vision and Pattern Recognition (CVPR '97). San Juan, Puerto Rico. June, 1997.<br />
Irfan Essa, <strong>Sumit</strong> <strong>Basu</strong>, Trevor Darrell, and Alex Pentland. "Modeling, Tracking, and Interactive<br />
Animation of Faces and Heads Using Input from Video." In Proceedings of Computer Animation<br />
’96. Geneva, Switzerland. 1996.<br />
<strong>Sumit</strong> <strong>Basu</strong>, Irfan Essa, and Alex Pentland. "Motion Regularization for Model-Based Head<br />
Tracking." In Proceedings of the 13th IEEE Int'l Conf. on Pattern Recognition (ICPR '96). Vienna,<br />
Austria. September, 1996. pp. 611-616.<br />
<strong>Sumit</strong> <strong>Basu</strong>, Michael Casey, Bill Gardner, Ali Azarbayejani, and Alex Pentland. "Vision-Steered<br />
Audio for Interactive Environments." In Proceedings of the 1996 Image Communications<br />
Conference (IMAGE'COM '96). Bordeaux, France. 1996.<br />
Michael Casey, William G. Gardner, and <strong>Sumit</strong> <strong>Basu</strong>. "Vision Steered Beamforming and<br />
Transaural Rendering for the Artificial Life Interactive Video Environment (ALIVE)." In<br />
Proceedings of the 99th Convention of the Audio Engineering Society. 1995.<br />
Selected Technical Reports<br />
<strong>Sumit</strong> <strong>Basu</strong>*, Tanzeem Choudhury*, Brian Clarkson*, and Alex Pentland. "Learning Human<br />
Interactions with the Influence Model." Vismod Technical Report #539. June, 2001. *The first<br />
three authors contributed equally to this work and are listed alphabetically.<br />
<strong>Sumit</strong> <strong>Basu</strong>. “ICA: A Critical Review of Three Prominent Approaches.” Technical Report. April,<br />
2000.<br />
<strong>Sumit</strong> <strong>Basu</strong>. "Empirical Results on the Generalization Capabilities and Convergence Properties of<br />
the Bayes Point Machine." Technical Report. December, 1999.<br />
<strong>Sumit</strong> <strong>Basu</strong>, Kentaro Toyama, and Alex Pentland. "A Consistent Method for Function<br />
Approximation in Mesh-based Applications." Vismod Technical Report #486. 1999.
<strong>Sumit</strong> <strong>Basu</strong>. "Efficient Multiscale Template Matching with Orthogonal Wavelet Decompositions."<br />
May, 1997.<br />
Trevor Darrell, <strong>Sumit</strong> <strong>Basu</strong>, Christopher Wren, and Alex Pentland. "Perceptually-Driven Avatars<br />
and Interfaces: Active Methods for Direct Control." Vismod Technical Report #416. 1997.<br />
Invited Talks<br />
<strong>Sumit</strong> <strong>Basu</strong> and Alex Pentland, "Concept Formation in Multi-Modal Learning." In Alex Pentland,<br />
Tony Jebara, Brian Clarkson, and <strong>Sumit</strong> <strong>Basu</strong>, Learning Techniques in Audio-Visual Information<br />
Processing, a tutorial at the Int’l Conf. on Pattern Recognition (ICPR ’00). Barcelona, Spain.<br />
September 3, 2000.<br />
<strong>Sumit</strong> <strong>Basu</strong>. "Empirical Results on the Generalization Capabilities and Convergence Properties of<br />
the Bayes Point Machine." Invited talk at Tomaso Poggio’s group meeting, <strong>MIT</strong> AI Lab/CBCL.<br />
May 12, 2000.<br />
<strong>Sumit</strong> <strong>Basu</strong>, Deb Roy, Brian Clarkson, and Alex Pentland. "Learning the Structure of Human<br />
Behavior from Sensory Inputs: Language, Daily Patterns, and Conversations." At Grounded<br />
Intersensory Language Learning in Sign and Speech (GILLS ’00). Grenoble, France. March 24,<br />
2000.<br />
Musical Projects<br />
I am an avid songwriter/singer/keyboardist/guitarist, and am involved in a number of musical projects:<br />
08:29:06, a solo album I released in June 2000 under the name deepoceanblue, available at<br />
http://www.mp3.com/deepoceanblue<br />
Sonovar/Bodybeat, an electronic music project experimenting with convolution and natural sounds<br />
as musical building blocks (joint work with Brian Clarkson). We received an <strong>MIT</strong> Council for the<br />
Arts Grant for this project in February 2000. Samples are available on my website.<br />
Additional projects are listed at http://www.media.mit.edu/~sbasu/music.html<br />
References<br />
Listed in a separate document.