18.07.2013 Views

Image/Video Processing Technologies


<strong>Image</strong> <strong>Processing</strong> <strong>Technologies</strong><br />

for the 21st Century<br />

Lawrence Rabiner<br />

Professor, Rutgers University and the<br />

University of California at Santa Barbara


<strong>Image</strong>/<strong>Video</strong> <strong>Processing</strong> <strong>Technologies</strong><br />

• FAX (Bilevel)/<strong>Image</strong> Compression Standards—JBIG,<br />

JBIG2, JPEG, JPEG2000<br />

• Document Compression—DjVu, pdf<br />

• <strong>Image</strong> Understanding—image libraries, image retrieval,<br />

image search<br />

• <strong>Image</strong> (character) Recognition—OCR, handwriting<br />

• <strong>Video</strong> Compression Standards—MPEG1, MPEG2,<br />

MPEG4, MPEG7<br />

• <strong>Video</strong> Segmentation and Classification—finding faces,<br />

shot/scene/… based on content<br />

• <strong>Video</strong> Conferencing—between groups of individuals, and<br />

between individuals and groups<br />

• <strong>Video</strong> Telephony—narrowband and broadband<br />

• Visual TTS—avatars and sampled faces<br />

• <strong>Image</strong>-based Info Services—networking of multimedia<br />

information—searching and browsing


AT&T <strong>Image</strong>, <strong>Video</strong> and Multimedia<br />

Technology Services/Products<br />

[Timeline chart, 1992 to 1999+; legend: Prototype, Trial, Developed, Deployed, New in 1999]<br />

• WorldWorx<br />

• Vistium (NCR)<br />

• H.261/AVC Chipset<br />

• POTS <strong>Video</strong> Phone<br />

• Kansas City Bill Reader<br />

• “FIRST” Fax Form Reader<br />

• H.263/MPEG-1 Chipset<br />

• Digital TV (MPEG-2)<br />

• NCR Check Reader/Verifier<br />

• Pictorial Transcripts<br />

• FusionNet<br />

• NCR Check Reader-Sorter<br />

• Digital <strong>Video</strong> Libraries<br />

• <strong>Video</strong>Talks<br />

• Language ID for FAX<br />

• Catalog.com<br />

• DjVu Clients<br />

• DjVu Web Site<br />

• SALSA<br />

• DjVu SDK<br />

• SBTV


<strong>Image</strong> <strong>Processing</strong> Systems<br />

• DjVu: high quality document compression for web display of<br />

document images<br />

• Shoebox: image libraries with searchable access using audio and<br />

textual content description<br />

• DVL: digital video libraries for searchable, browsable video<br />

content<br />

• <strong>Video</strong> Talks: high quality, real-time, audio-visual presentations to<br />

remote locations and the desktop<br />

• SALSA: utilization of cable upstream capability for TV-based<br />

video telephony over 384 kb/s channels<br />

• Visual TTS: video agent-based technology for user interactions on<br />

the desktop and for portable devices<br />

• netFridge: home information center using cyber-fridge as central<br />

access and control point for information delivery and access<br />

• Broadband Phone: IP-telephone with thin client access to<br />

network-based services including telephony, image transport and<br />

display, games, video, handwriting, messaging, fax


DjVu--Document Compression


What is DjVu?<br />

• A New Method of <strong>Image</strong> Compression from AT&T Labs<br />

– separates the high-contrast detail from the general<br />

graphics<br />

– compresses each with separate algorithms<br />

– decompresses and reassembles the layers on-the-fly<br />

with a browser plug-in<br />

– progressive rendering<br />

– multi-page documents<br />

– hyperlinks


Size Comparisons*<br />

TIFF/BMP: 24 MB<br />

PDF: 17 MB<br />

GIF: 4.8 MB<br />

JPEG: 1.7 MB<br />

DjVu: 55 KB<br />

* Sizes based on a 300 DPI scanned DjVu brochure page
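As a rough check on the sizes quoted on this slide, the ratio of each format to the DjVu file can be computed directly. This is only a sketch: it assumes 1 MB = 1000 KB, which the slide does not specify, and the function name is illustrative.

```python
# Sizes quoted on the slide, in KB (assumption: 1 MB = 1000 KB).
SIZES_KB = {
    "TIFF/BMP": 24_000,
    "PDF": 17_000,
    "GIF": 4_800,
    "JPEG": 1_700,
    "DjVu": 55,
}


def ratio_to_djvu(fmt: str) -> float:
    """Return how many times larger the given format is than the DjVu file."""
    return SIZES_KB[fmt] / SIZES_KB["DjVu"]


if __name__ == "__main__":
    for fmt in SIZES_KB:
        print(f"{fmt:8s} {ratio_to_djvu(fmt):7.1f}x")
```

For this one brochure page the quoted sizes put DjVu at roughly 1/30 of the JPEG size and roughly 1/440 of the uncompressed size.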


DjVu Segmentation and Coding


How it Works<br />

• Mask created from the high-contrast portions of the image; compressed at 300 DPI using JB2.<br />

• Background graphics compressed at 100 DPI using IW44 wavelet compression.<br />

• Foreground colors compressed at 25 DPI using IW44.<br />

• Image recombined: up to 5 to 10 times smaller than JPEG, and decompressed on-the-fly in the browser.
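The recombination step can be sketched as follows. This is a minimal illustration rather than the DjVu decoder: it assumes the three layers are already decoded into 2D lists, that the mask selects the foreground color where it is 1, and that the lower-resolution layers are upsampled by simple pixel replication. The scale factors 3 and 12 follow from the 300, 100, and 25 DPI figures on the slide; the function name is hypothetical.

```python
def recombine(mask, background, foreground, bg_scale=3, fg_scale=12):
    """Rebuild a full-resolution page from three decoded layers.

    mask:       2D list of 0/1 at full resolution (300 DPI)
    background: 2D list of pixel values at 1/bg_scale resolution (100 DPI)
    foreground: 2D list of pixel values at 1/fg_scale resolution (25 DPI)
    """
    page = []
    for y, mask_row in enumerate(mask):
        out_row = []
        for x, bit in enumerate(mask_row):
            if bit:
                # Mask pixel set: take the color from the foreground layer.
                out_row.append(foreground[y // fg_scale][x // fg_scale])
            else:
                # Mask pixel clear: take the color from the background layer.
                out_row.append(background[y // bg_scale][x // bg_scale])
        page.append(out_row)
    return page
```

Because the mask stays at full resolution, text edges remain sharp even though both color layers are stored much more coarsely.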


JPEG<br />

DJVU<br />

(Same file sizes)


300 dpi<br />

DJVU Document <strong>Image</strong> Sizes


Declaration of<br />

Independence.djvu<br />

yellowstone.djvu<br />

> DEMO <<br />

technology<br />

article.djvu<br />

French Comic<br />

Strip.djvu<br />

stanley Catalog.djvu


DjVu Software<br />

• Compressors<br />

– Online Compression Server<br />

– Linux, Solaris, IRIX: “djvuencode”<br />

– Windows (third-party)<br />

– Windows/UNIX/Mac SDK<br />

• Plug-in Viewers for Netscape, IE, or Opera<br />

– Linux, Solaris, IRIX, OSF, FreeBSD<br />

– Windows 95/98/NT<br />

– MacOS8<br />

• Editor<br />

– Linux, Solaris, IRIX, HP-UX, (Windows soon)<br />

• Reference Library: Source Code Release


DjVu in Action<br />

http://www.att.net/<br />

http://www.feith.com/<br />

BOOTH #1604<br />

http://www.monarchis.com/


DjVu in Action (cont.)<br />

http://www.roottech.com/<br />

http://www.hp.com/<br />

http://www.cgk.de/<br />

BOOTH #3438


DjVu in Action (cont.)<br />

http://www.teletrade.com/<br />

http://www.umi.com/<br />

http://www.art.com/


Digital Photo<br />

Management Technology


What is Shoebox?<br />

• Digital photo management system<br />

• Allows convenient and intuitive organization,<br />

annotation, and navigation of digital photos<br />

– Features:<br />

• Browse by thumbnail<br />

• Spoken or text annotation of photos<br />

• Text query on all annotations<br />

• Timeline index<br />

• Concept index<br />

• Visual similarity and content search<br />

• Import/Export via XML, publish via HTML


Underlying Technology<br />

• ODMG-compliant OODB<br />

– Text indexing<br />

– B-trees, M-trees<br />

– Extensible and scalable<br />

• Windows front end<br />

• Speech recognition<br />

• <strong>Image</strong> processing


Why here?<br />

Wireless cameras are coming: one click anywhere and it’s safe in your Shoebox<br />

Web<br />

GSM<br />

UMTS<br />

PCS<br />

GPRS<br />

...<br />

Central<br />

Office<br />

Shoebox<br />

Home Shoebox<br />

Email<br />

Broadband Service Opportunities


<strong>Image</strong> <strong>Processing</strong><br />

• Designed to give the computer access<br />

to the ‘content’ of an image<br />

– Segmentation into regions<br />

• Region description<br />

• Whole image similarity metrics<br />

• Composed queries<br />

– Learning: Neural network based semantic<br />

classification of region descriptors


Indexing paradigms<br />

• Generate one “word” per region<br />

– Use text retrieval models (B-trees)<br />

– “smudged queries” to address boundary problems<br />

• Store region properties in a metric space<br />

– M-trees<br />

• Compact summary of an entire image<br />

– Intended to support nearest neighbour browsing
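The first paradigm, one "word" per region with "smudged queries", can be sketched as below. This is an assumption-laden illustration, not the Shoebox implementation: the two-value region descriptor, the bin count, the plain dictionary standing in for a B-tree, and the reading of "smudged" as "also search the neighbouring quantization bins" are all hypothetical.

```python
from collections import defaultdict


def region_word(hue, texture, bins=8):
    """Quantize a region descriptor (two values in [0, 1)) to a discrete 'word'."""
    return (int(hue * bins), int(texture * bins))


def build_index(images, bins=8):
    """Inverted index from region word to the set of images containing it."""
    index = defaultdict(set)
    for image_id, regions in images.items():
        for hue, texture in regions:
            index[region_word(hue, texture, bins)].add(image_id)
    return index


def exact_query(index, hue, texture, bins=8):
    """Look up only the bin the query descriptor falls into."""
    return set(index.get(region_word(hue, texture, bins), set()))


def smudged_query(index, hue, texture, bins=8):
    """Look up the query bin and its eight neighbours."""
    h, t = region_word(hue, texture, bins)
    hits = set()
    for dh in (-1, 0, 1):
        for dt in (-1, 0, 1):
            hits |= index.get((h + dh, t + dt), set())
    return hits
```

Two regions with nearly identical descriptors can land on opposite sides of a bin boundary; the exact lookup then misses one of them, while the smudged lookup finds both.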


Visualisation of<br />

inter-image<br />

distance:<br />

• Summary vector<br />

representing:<br />

– Dominant region<br />

in each 9th of<br />

the image<br />

– Colour and<br />

texture summary<br />

• MDS algorithm<br />

used to order<br />

images
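A summary vector built from a 3x3 grid, and the inter-image distance derived from it, might look like the sketch below. It is a simplification under stated assumptions: each ninth of the image is summarized by its mean pixel value rather than by a dominant region with colour and texture, images are 2D lists at least 3 pixels on each side, and distance is Euclidean. The MDS step itself is not shown; it would take the pairwise distances as input.

```python
import math


def summary_vector(image):
    """Mean pixel value of each cell of a 3x3 grid laid over the image."""
    height, width = len(image), len(image[0])
    vector = []
    for gy in range(3):
        for gx in range(3):
            ys = range(gy * height // 3, (gy + 1) * height // 3)
            xs = range(gx * width // 3, (gx + 1) * width // 3)
            cell = [image[y][x] for y in ys for x in xs]
            vector.append(sum(cell) / len(cell))
    return vector


def inter_image_distance(image_a, image_b):
    """Euclidean distance between the summary vectors of two images."""
    return math.dist(summary_vector(image_a), summary_vector(image_b))


def nearest(query, library):
    """Name of the library image whose summary vector is closest to the query."""
    return min(library, key=lambda name: inter_image_distance(query, library[name]))
```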


DVL-Digital <strong>Video</strong> Libraries


Digital <strong>Video</strong> Library (DVL)<br />

• DVL is a digital video management system that<br />

enables the content-based retrieval and intelligent<br />

browsing of video information<br />

– applies text, image, video, audio, and speech<br />

processing techniques to organize, index, and<br />

condense video information<br />

– generates multiple representations of the video<br />

contents to enable the delivery of information over<br />

a range of information appliances and a wide<br />

range of available bandwidth.


Applications of DVL<br />

• Digital media asset management<br />

– effective and efficient utilization of large archives<br />

of video material (e.g., TV content)<br />

• Content “repurposing”<br />

– automatic conversion of TV content to Web<br />

content<br />

• On-demand and content-based retrieval of video<br />

– information<br />

– entertainment<br />

– educational material


Capabilities of DVL<br />

• <strong>Image</strong> and video processing<br />

– Content-based sampling of video, image<br />

matching, image spotting, people detection<br />

• Text processing<br />

– Case restoration (capitalization), automatic<br />

hyperlink generation, parallel text alignment<br />

• Speech and audio processing<br />

– audio classification, speech recognition, speaker<br />

identification, text-speech alignment<br />

• Multimodal processing<br />

– story segmentation (using speech, text, and video)
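Content-based sampling of video can be illustrated with a simple frame-difference rule. The slides do not describe the method DVL uses, so this is only a hypothetical sketch: frames are flat lists of pixel intensities, and a frame is kept when its mean absolute difference from the last kept frame exceeds an arbitrary threshold.

```python
def content_based_sample(frames, threshold=20.0):
    """Return the indices of frames that differ enough from the last kept frame."""
    if not frames:
        return []
    kept = [0]  # always keep the first frame
    for i in range(1, len(frames)):
        reference = frames[kept[-1]]
        current = frames[i]
        # Mean absolute pixel difference against the last kept frame.
        diff = sum(abs(a - b) for a, b in zip(reference, current)) / len(current)
        if diff > threshold:
            kept.append(i)
    return kept
```

Unlike sampling at a fixed interval, this keeps one representative frame per visually distinct segment, however long or short the segment is.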


Current Activities and Future Directions<br />

• Extraction of high-level semantic information to<br />

describe the visual content of video<br />

• <strong>Video</strong> summarization<br />

• Multimodal story segmentation and categorization<br />

• Audio-based searching and browsing<br />

• Speaker-based audio segmentation<br />

• Natural language based interfaces<br />

• Speech-based interfaces


Automated Extraction of the Semantic<br />

Hierarchy of News<br />

[Figure: broadcast news programs turned into a semantic hierarchy using automated multimedia techniques; the text stream shown is the broadcast content transcribed by a human]<br />

• Input: broadcast news programs (audio, <strong>video</strong>, text): across multiple media; linear in time; flat structure<br />

• Processing: multi-modal categorization; segmentation using audio, video, and text; story extraction by text processing; topic detection and categorization<br />

• Hierarchy levels: table of contents; broadcast news categories; news summary; Story 1, Story 2, ...; anchor and detailed news reporting; news and commercials<br />

• Retrieval modes shown: linear retrieval; content-based searching and browsing


WorldNet DVL Home <strong>Video</strong> Trial<br />

DVL Demo


<strong>Video</strong> Talks


A Comprehensive Multimedia Conferencing System<br />

A <strong>Video</strong>Talks Conference Room<br />

☯ VT Interactive - high-quality<br />

multimedia interaction<br />

☯ VT Desktop - multimedia<br />

presentations on desktop<br />

☯ VTJukebox - multimedia<br />

presentations recorder/player<br />

• VTonDemand - VoD with<br />

indexing<br />

• VT Lite - low-bandwidth<br />

(modem) access to <strong>Video</strong>Talks<br />

☯ completed • prototype


Current Networking<br />

[Network diagram: conference rooms and desktops on the Florham Park LAN and the Newman Springs LAN, connected through ATM switches (SW), multicast tunnels, and an IP/ATM guaranteed QoS connection; links are labelled multicast, virtual channel, and unicast]<br />

• Unicast to conference rooms: presenter and audience audio/video<br />

• Multicast to offices: mixed audio/video from presenter/audience; vugraphs and pointer<br />

• Audience: M-JPEG, VBR, 500-1000 Kb/s<br />

• Presenter: M-JPEG, VBR, 500-1000 Kb/s<br />

• NetVG: M-JPEG (640 x 480 pixels) with 4 second refresh at 100-200 Kb/s
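The NetVG figures imply a per-pixel bit budget that can be worked out directly. The sketch below assumes that 1 Kb/s means 1000 bits per second and that each 4-second refresh carries one complete 640 x 480 frame; neither assumption is stated on the slide.

```python
WIDTH, HEIGHT = 640, 480      # NetVG frame size from the slide
REFRESH_SECONDS = 4           # NetVG refresh period from the slide


def bits_per_pixel(rate_kbps: float) -> float:
    """Average bits available per pixel for one full-frame refresh."""
    bits_per_frame = rate_kbps * 1000 * REFRESH_SECONDS
    return bits_per_frame / (WIDTH * HEIGHT)


if __name__ == "__main__":
    for rate in (100, 200):
        print(f"{rate} Kb/s: {bits_per_pixel(rate):.2f} bits per pixel")
```

Under these assumptions the 100-200 Kb/s range corresponds to roughly 1.3 to 2.6 bits per pixel, or about 50 to 100 KB per viewgraph.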


<strong>Video</strong> Codecs<br />

• Developed in AT&T Labs - Research<br />

• Low delay (


NetVG *<br />

World’s only high resolution viewgraph presentation system featuring a live pointer<br />

Document Camera<br />

Ethernet<br />

Network Interface<br />

<strong>Video</strong> Capture<br />

VGA Interface<br />

Mouse Interface<br />

PC<br />

Mouse (used as a pointer)<br />

Projector<br />

• High resolution, color viewgraphs<br />

• Full-motion pointer synchronized with A/V<br />

• Paper, transparency, or computer presentations<br />

• Real-time annotation<br />

• No security risk<br />

* Patent pending<br />

Display<br />

Pointer


VT Desktop - Desktop Viewing<br />

• Software only (a soundboard is assumed)<br />

• IP Multicast based<br />

• 320x240, 10-15 fps H.320 video with mu-law PCM audio using:<br />

– MBONE tools for audio/video, or<br />

– IP/TV viewer from Cisco<br />

• NetVG viewer<br />

• High packet loss tolerance<br />

• Not interactive, but text-based or telephone-based questions are<br />

possible<br />

• Total bandwidth < 400 Kbps<br />

VTJukebox<br />

• On-demand re-multicasting of recorded multimedia presentations<br />

• Can be received using the same software as VT Desktop<br />

• The playback is available to everyone on the intranet<br />

• Requests are played immediately if a channel is available, otherwise queued<br />

• Number of channels depends on available bandwidth<br />

• Improved performance through intelligent recorder/player deployment
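The VTJukebox rule "played immediately if a channel is available, otherwise queued" can be sketched as a small scheduler. The class and method names are hypothetical, and first-come-first-served ordering of the queue is an assumption; the slide does not say how queued requests are ordered.

```python
from collections import deque


class Jukebox:
    """Plays requests on a fixed number of multicast channels, queueing the rest."""

    def __init__(self, channels: int):
        self.free = channels      # channels not currently in use
        self.playing = []         # titles currently being re-multicast
        self.waiting = deque()    # requests waiting for a channel (FIFO, assumed)

    def request(self, title: str) -> str:
        """Start playback if a channel is free; otherwise queue the request."""
        if self.free > 0:
            self.free -= 1
            self.playing.append(title)
            return "playing"
        self.waiting.append(title)
        return "queued"

    def finished(self, title: str) -> None:
        """Release the channel used by title and hand it to the next request."""
        self.playing.remove(title)
        if self.waiting:
            self.playing.append(self.waiting.popleft())
        else:
            self.free += 1
```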


VTonDemand - Indexed presentations on the desktop<br />

• Viewgraph-based indices for on-demand playback of a presentation<br />

– derived automatically<br />

– displayed on a web browser<br />

– presentation starts from the selected viewgraph
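A viewgraph-based index can be represented as a sorted list of the times at which each viewgraph first appears. The sketch below assumes those times have already been derived (the slide says automatically, but not how); the class and method names are illustrative only.

```python
from bisect import bisect_right


class ViewgraphIndex:
    """Maps viewgraph numbers to playback start times and back."""

    def __init__(self, change_times):
        # change_times[i] is the time in seconds at which viewgraph i
        # first appears; the list must be in ascending order.
        self.change_times = list(change_times)

    def start_time(self, viewgraph: int) -> float:
        """Playback position to seek to when the user selects a viewgraph."""
        return self.change_times[viewgraph]

    def viewgraph_at(self, seconds: float) -> int:
        """Viewgraph on screen at a given playback position."""
        return max(0, bisect_right(self.change_times, seconds) - 1)
```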


SALSA--<strong>Video</strong> Telephony over<br />

Cable Systems


SALSA* - A Prototype to Enable Upstream <strong>Video</strong> over Cable TV<br />

• Targets enabling video services using the enhanced upstream<br />

bandwidth to be provided by the multiplexed fiber passive coax<br />

network (LightWire) technology and guaranteed QoS over IP.<br />

• Potential service examples include:<br />

• <strong>Video</strong>phone with family and friends<br />

• <strong>Video</strong> dating<br />

• Personal TV studio at home<br />

• Virtual gatherings, video-telecommuting, virtual town hall<br />

• <strong>Video</strong> messaging<br />

• Interactive shopping: personal tailoring/fashion, customized home shopping assistant,<br />

home design/interior decorator<br />

• Monitoring/surveillance: babysitter camera, elderly citizens’ watch, children at parks and<br />

swimming pools, remote medical monitoring and diagnostics<br />

• Auctions/garage sales<br />

• Distance learning, remote tutoring<br />

• Interactive customer service<br />

• Remote interviewing<br />

• Small business monitoring<br />

• Remote video archiving, video clip-art library<br />

• Play along game shows<br />

• ...<br />

* System for Audio/Visual Live Services & Applications<br />

<strong>Video</strong> Telephony;<br />

Distance Learning;<br />

Home Monitoring


SALSA<br />

• Architectural Elements<br />

– Based on a reduced, off-the-shelf PC platform as an add-on<br />

component for a set-top box (STB)<br />

– Software-only codecs<br />

• packet loss resilient coding (H.263+/MPEG-4 and RTP over UDP/IP)<br />

• layered coding for economical use of QoS (optimal split between<br />

guaranteed QoS and best effort bandwidth)<br />

– UI is through the STB<br />

– All signaling and communications are IP-based<br />

– Multipoint ready<br />

• Advantages<br />

– Low cost<br />

– Stable<br />

– Easy to use<br />

– Best possible quality


SALSA <strong>Video</strong> Codec & Logical<br />

Transport<br />

[Diagram: two H.263+ / MPEG4 software-only layered codecs communicating over an IP network with guaranteed QoS]<br />

• High-priority bitstream: guaranteed QoS delivery, < 100 Kb/s, fixed rate<br />

• Low-priority bitstream: best-effort delivery, ~ 300 Kb/s, variable rate
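The effect of splitting the stream into a guaranteed layer and a best-effort layer can be sketched as follows. This is a hypothetical model, not the SALSA codec: it assumes the high-priority layer always arrives because the guaranteed channel is sized for it, and that the low-priority layer for an interval is either delivered whole or dropped whole.

```python
def schedule(frames, best_effort_capacity):
    """Decide, per interval, which quality the receiver can decode.

    frames:               list of (base_kbits, enhancement_kbits) per interval
    best_effort_capacity: kbits the best-effort path can carry in each interval
    Returns a list of "full" or "base", one entry per interval.
    """
    result = []
    for (base, enhancement), capacity in zip(frames, best_effort_capacity):
        # The base layer is assumed to arrive over the guaranteed-QoS channel,
        # so only the enhancement layer depends on best-effort capacity.
        result.append("full" if enhancement <= capacity else "base")
    return result
```

Only the small fixed-rate layer needs a paid-for guarantee, and congestion on the best-effort path lowers quality instead of interrupting the call.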


First working prototype:


SALSA Block Diagram<br />

[Block diagram; labels recovered from the figure]<br />

• Peripherals: microphone, camera, DOCSIS modem, TV<br />

• SALSA Box: 500 MHz Pentium motherboard (compress/decompress) with echo canceller plug-in card<br />

• Set Top Box (CISCO DCT 5000)<br />

• Interface labels: analog, USB, 10baseT (audio, video, data), VGA/NTSC, audio, baseband <strong>video</strong>, RS232, NTSC<br />

Salsa <strong>Video</strong>: comparison of SALSA and Polycom


STB Main Screen Featuring <strong>Video</strong>Phone


Simple <strong>Video</strong>Phone UI on the STB


Auditory and Visual TTS


TTS Demos<br />

U.S. English Female:<br />

U.S. English Male:<br />

Spanish Female:<br />

TTS: AT&T Lucent Acuvoice


Visual Text-to-Speech Synthesis<br />

• Personalized friendly agents provide an entertaining and<br />

effective user experience.<br />

• Subjective Tests confirm:<br />

– Agents are preferred over text and audio interfaces<br />

– Agents are more trusted<br />

than text and audio interfaces<br />

• Applications:<br />

– Virtual Customer Service<br />

– Virtual Newscaster<br />

– E-commerce<br />

– Games


Non-AT&T Players<br />

• Computer/<strong>Video</strong> Game Industry<br />

– Motion capture; emphasis on fast rendering<br />

– Results are still far from looking natural<br />

• Animation Movie Studios<br />

– Synthetic worlds<br />

– Photo realistic manual animations<br />

• Several Talking Head Companies<br />

– Synthetic models<br />

– Marginal lip synchronization


AT&T Synthetic and Sample-Based Models<br />

• Synthetic models<br />

• Sample-based models


Two Rendering Techniques<br />

• Synthetic models: parametrized shapes<br />

– (+) Keeps correct appearance under full range of views<br />

– (-) Hard to reproduce minute skin details, like wrinkles, that look absolutely natural<br />

• Sample-based: parametrized textures<br />

– (+) Reproduces photo-realistic appearances; fast<br />

– (-) Range of views limited by planar approximation of parts


AT&T VTTS<br />

• Synthetic (3D) face animation part of MPEG-4 and H-anim<br />

standards<br />

• Most advanced data-driven talking heads<br />

– Sample-based coarticulation and head movements.<br />

– Face recognition without sensors.<br />

• Only sample-based models look natural<br />

– Synthetic models improve fast in quality, but are still orders of<br />

magnitude away from looking natural;<br />

• Option to combine sample-based heads with synthetic<br />

environments to get best of both worlds<br />

• Strong AT&T patent portfolio


AT&T VTTS Challenges<br />

• Business Challenge:<br />

– Need to establish business needs: where and under what scenarios<br />

is VTTS likely to have an impact?<br />

• Technical Challenges:<br />

– Sample-based VTTS needs to be sped up by at least 10 times<br />

– Not clear how to combine 3D (e.g., full-body animation) and<br />

sample-based (e.g., talking head) technologies<br />

– “Easy” plug-in as a component of MMUI output<br />

– Scripting too low level: need to develop a high-level scripting/text<br />

tagging language that allows anticipatory planning of gestures<br />

(“reach full pointing amplitude at this syllable in the input text;<br />

‘normal’ speed, but don’t care when you have to start”; “reach<br />

maximum gaze at object 13 on page at this syllable in the input<br />

text; start 1 second earlier”)


VTTS Process<br />

[Flow diagram; labels recovered from the figure]<br />

• Input: text, via the text-to-speech synthesizer and the conversation module<br />

• Intermediate data: phonemes, stress, emotions<br />

• Models: coarticulation model/library; movements and emotions model/library<br />

• Animation parameters: lip shapes, emotions, movements (= FAPs)<br />

• Rendering: 3D model or sample-based model


Model-Independent Animation in MPEG-4<br />

• Facial Animation Parameter Units (FAPU)<br />

– Low-level FAPs (3-68)<br />

– High-level FAPs<br />

• Visemes<br />

• Expressions<br />

[Figure labels: ES0, MW0, ENS0, MNS0, IRISD0]
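FAPUs are what make the animation model-independent: a FAP value is expressed in units derived from the target face's own proportions. The sketch below assumes the MPEG-4 convention that a distance-based FAPU is 1/1024 of the corresponding neutral-face distance (ES0, MW0, ENS0, MNS0, or IRISD0); the function names and the example numbers are illustrative.

```python
def fapu(neutral_distance: float) -> float:
    """Distance-based FAPU: 1/1024 of a neutral-face distance such as MW0."""
    return neutral_distance / 1024.0


def fap_displacement(fap_value: float, neutral_distance: float) -> float:
    """Convert a FAP value into a displacement in the model's own units."""
    return fap_value * fapu(neutral_distance)
```

With a mouth width MW0 of 60 model units, a FAP value of 512 expressed in mouth-width units moves a feature point by 30 units; on a face twice as wide the same FAP value produces twice the displacement.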


Model-Independent Animation<br />

• Facial expressions<br />

Anger<br />

Joy<br />

Disgust<br />

Sadness<br />

Fear<br />

Surprise


Sample-Based Model<br />

• Concatenate snippets of video to<br />

synthesize talking heads<br />

• Reduce the number of samples to<br />

store by decomposing recorded<br />

head into sub-parts.<br />

• Use a background image of the<br />

whole head onto which parts are<br />

warped.<br />

• Feathering (transparency gradient<br />

at border) helps smooth blending<br />

• Smooth transitions of each object<br />

(e.g., mouth shape) across unit<br />

boundaries by using advanced<br />

morphing techniques
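Feathering can be sketched as an alpha mask that rises linearly from the patch border to its interior, followed by an alpha blend onto the background image. This is a minimal illustration on 2D lists of grey values; the linear ramp, the border width, and the function names are assumptions, and the warping step is not shown.

```python
def feather_alpha(width, height, border):
    """Alpha mask rising from 1/(border+1) at the patch edge to 1.0 inside."""
    alpha = []
    for y in range(height):
        row = []
        for x in range(width):
            # Distance in pixels from (x, y) to the nearest patch edge.
            edge = min(x, y, width - 1 - x, height - 1 - y)
            row.append(min(1.0, (edge + 1) / (border + 1)))
        alpha.append(row)
    return alpha


def blend(background, patch, top, left, border=2):
    """Return a copy of background with patch feathered in at (top, left)."""
    out = [row[:] for row in background]
    height, width = len(patch), len(patch[0])
    alpha = feather_alpha(width, height, border)
    for y in range(height):
        for x in range(width):
            a = alpha[y][x]
            out[top + y][left + x] = (
                a * patch[y][x] + (1 - a) * background[top + y][left + x]
            )
    return out
```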


Parameterizing Samples<br />

• Record a “unit selection”<br />

VIDEO database<br />

• Label database with a set of<br />

parameters:<br />

– phonetic information (Audio TTS)<br />

– measured features<br />

• Define a low dimensional space<br />

that is quantized, creating a<br />

“codebook”.<br />

• Populate each “bin” with many<br />

samples.<br />

• Use unit selection approach from<br />

Audio TTS to select appropriate<br />

units on the fly
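The codebook and unit-selection idea can be sketched as below. The details are assumptions made for illustration: samples are described by a small tuple of measured features, the space is quantized on a uniform grid, and among the samples in the target bin the one closest to the previously chosen sample is preferred, so that consecutive units join smoothly.

```python
import math
from collections import defaultdict


def bin_of(features, step=0.25):
    """Quantize a feature tuple to its codebook bin."""
    return tuple(int(f / step) for f in features)


def build_codebook(samples, step=0.25):
    """Group sample ids by the bin their measured features fall into."""
    codebook = defaultdict(list)
    for sample_id, features in samples.items():
        codebook[bin_of(features, step)].append(sample_id)
    return codebook


def select_units(targets, samples, codebook, step=0.25):
    """Pick one sample per target; None where the target bin is empty."""
    chosen = []
    for target in targets:
        candidates = codebook.get(bin_of(target, step), [])
        if not candidates:
            chosen.append(None)
            continue
        if chosen and chosen[-1] is not None:
            # Prefer the candidate closest to the previous unit (smooth join).
            anchor = samples[chosen[-1]]
        else:
            # No previous unit: prefer the candidate closest to the target.
            anchor = target
        chosen.append(min(candidates, key=lambda s: math.dist(samples[s], anchor)))
    return chosen
```

Populating each bin with many samples, as the slide suggests, is what gives the selection step room to trade closeness to the target against smoothness at unit boundaries.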


Head Movements<br />

• Background sequence (4)<br />

– typical short movements<br />

(nods, …)<br />

• Match mouth sequence to<br />

the background sequence<br />

– Extract mouth sequence<br />

(1) -> (2)<br />

– compute pose on background<br />

sequence (4)<br />

– warp sample to match pose (3)<br />

– “Feather” (3) + (4) => (5)


Giving Machines High Quality Voices and Faces<br />

‘Natural Speech’


VTTS Perspective<br />

• In the future…<br />

– Multimedia Customer Care


Recent VTTS<br />

Au Clair de la Lune Virtual Secretary


Communicator


netFridge--a New<br />

Communications Paradigm?


Why a netFridge?<br />

• The kitchen is the focus of the home<br />

• The refrigerator is the focus of the<br />

kitchen<br />

• Lots of the family’s stuff gets posted<br />

on the fridge door:<br />

– notes, artwork, photos, postcards,<br />

coupons<br />

• Why not have one in cyberspace?


Kitchen2005<br />

• Always-on high-speed internet<br />

connection to every room of the house<br />

• Large flat-panel touch screen with pen,<br />

camera and microphone mounted on<br />

the fridge door<br />

• But people aren’t necessarily much<br />

more technologically savvy


Benefits to Consumer<br />

• “easy to use” security/access control<br />

• simple post-it like messaging in<br />

visual media (like an offline chat<br />

room)<br />

• sharable across households<br />

• easy access via telephone<br />

• download to PDA<br />

• simple interface hides complex<br />

technology


[Figure: the Smith Family Page, reached from the netFridge and from Web TV]<br />

• Schedule family events<br />

• Check the family calendar<br />

• Check the grocery list<br />

• Keep up with extended family


What’s in the Fridge?<br />

• Notes<br />

– Typed<br />

– Handwritten<br />

• Pictures<br />

– Doodles<br />

– Photos<br />

– Faxes<br />

• Sounds<br />

• Calendar


WorldNet Beta Site trial of netFridge sm


netFridge Architecture Overview<br />

• Browser access: IE 4, Netscape 4 (JavaScript, Java required). Need special camera control app (in VB) to use PC camera.<br />

• POTS access: PhoneWeb Server (PML); Array Systems (PC) box with Dialogics board(s) running SCO Unix<br />

• Internet<br />

• netFridge Server: Unix or Win32 platform; tested on Linux, IRIX, Solaris, Windows 95/98<br />

– Web Server: Apache 1.3; should easily port<br />

– CGI scripts: fridge.cgi, donote.cgi, calendar.cgi, family.cgi, personal.cgi, admin.cgi<br />

– Libraries: fridgelib.pl, fridgedb.pl<br />

– Static Site Content: HTML pages, images (gifs, jpegs), Java applets<br />

• Fridge Metadatabase: global map of fridge databases; fridge names ⇔ telephone exts.<br />

• Fridge Databases (e.g., Jones Family fridge, Smith Family fridge, Armstrong Family fridge)<br />

– data (a single binary file): all note/calendar event text and parameters; all global fridge data and parameters; all per-user data<br />

– Uploaded objects in separate files (images and sound)<br />

– Each family fridge database is in its own directory
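The storage layout described on this slide (a global map between fridge names and telephone extensions, plus one directory per family fridge holding a single `data` file) can be sketched as below. The original system is Perl-based (fridgelib.pl, fridgedb.pl); this Python class, its method names, the root path, and the extension number are hypothetical stand-ins for illustration.

```python
from pathlib import PurePosixPath


class FridgeMetadatabase:
    """Global map of fridge databases: fridge names <-> telephone extensions."""

    def __init__(self, root: str):
        self.root = PurePosixPath(root)
        self.ext_by_name = {}
        self.name_by_ext = {}

    def register(self, name: str, extension: str) -> None:
        """Record a fridge and the telephone extension that reaches it."""
        self.ext_by_name[name] = extension
        self.name_by_ext[extension] = name

    def fridge_for_extension(self, extension: str) -> str:
        """Resolve a dialed extension (POTS access) to a fridge name."""
        return self.name_by_ext[extension]

    def directory(self, name: str) -> PurePosixPath:
        """Each family fridge database lives in its own directory."""
        return self.root / name

    def data_file(self, name: str) -> PurePosixPath:
        """The single binary file holding notes, events, and per-user data."""
        return self.directory(name) / "data"
```

Keeping the name-to-extension map in one place lets the browser path and the telephone path resolve to the same per-family directory.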


How does this benefit AT&T?<br />

• We surely won’t be in the business of<br />

supplying the hardware<br />

• We may not even be in the business of<br />

supplying the software<br />

• But we ought to be in the business of<br />

supplying the service<br />

– keep it multi-modal, network-centric


[Screenshot: The Brown netFridge]<br />

• Navigation: Calendar, Family, Personal, Options, Leave Fridge<br />

• Fridge Door: Add a new note; View deleted notes; Turn brief mode on<br />

• Lists: Grocery List; Xmas Wish List; Needed for tomorrow (Kim needs notebook paper)<br />

• Window on the World: Stock Exchange (T 45¼, IBM 120¼, GRC 37¼); News (3 Articles); Sports; Traffic<br />

• Calendar Events for Tuesday, April 27, 1999: PTA Meeting 6pm at Kim’s School; Go to calendar<br />

• Other areas: Kid Cam; Family Photo Album; File Drawer; Family Chat (Brown’s Family Room); Alert Monitor (Page, Home Alarm); Family Connect<br />

• Notes: We Won; Dentist appointment; Pick up Jimmy; Urgent fax; Grandma’s recipe for pie<br />

• Sample note: “A fax came in for you it looked important from Kesler and Kesler it is on your desk in the rec-room”<br />

• Sample messages: “Joe stop by for cards tonight - Dave”; “I saw Sarah yesterday give me a call - Barbara”


The Broadband Phone


Broadband phone: enhancing everyday communications<br />

• Philosophy: it’s a phone, not a computer<br />

• Architecture: 100% network centric<br />

• Cute feature: both parties can see the same thing<br />

• Sound bite: simple phone, smart network<br />

• No operating system; no web browser; no downloadable code; nothing to go wrong<br />

• Broadband IP network; IP telephony; remote graphics protocol<br />

• The screen comes from the network; the services are on the screen; the network is the phone<br />

• Enhanced applications: home shopping, photo albums, fax & mail, web browsing, reservations & information, chat ‘n’ draw, live video


Summary<br />

• <strong>Image</strong> and <strong>Video</strong> <strong>Processing</strong> technologies are<br />

beginning to mature and reach the point where<br />

they can be utilized in mass market, high<br />

volume services<br />

• <strong>Image</strong> and <strong>Video</strong> <strong>Processing</strong> technologies will<br />

drive broadband technologies into the home,<br />

the office, and for mobility applications<br />

• Real-time <strong>Image</strong> and <strong>Video</strong> Signal <strong>Processing</strong><br />

will enhance the rate at which improvements<br />

are made to the technology, at the same time<br />

lowering the cost of service and increasing the<br />

penetration of these services
