Content-Based Image Retrieval


Outline of the Talk
• Introduction
• A Colour Feature Representation Method
• Relevance Feedback
• Feature Re-weighting Method (FRM)
• Issues of Dimensionality Reduction and Sample Size
• Relevance Score Methods (RSM and RSM-CD)
• Discussion


Typical User Queries
1. Retrieve images which are mostly red.
2. Retrieve images which are red with coarse texture at the bottom.
3. Retrieve images of sunset.


IP/PR Approach to CBIR
• Each image is described by its visual features, e.g., colour, shape, texture.
• Feature representation: e.g., colour can be represented by a colour histogram, colour moments, ...
• Each image is described by an M-dimensional feature vector.
• A similarity measure is used to find the similarity between a query image and the database images.
• Images are ranked in order of closeness to the query, and the top K images are returned to the user.
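The pipeline above can be sketched in a few lines: images as feature vectors, a distance to the query, and a top-K ranking. This is a minimal illustration; the feature values and image ids are invented, and the distance is a plain (unweighted) L1 measure.

```python
# Minimal sketch of CBIR ranking: compute distances from the query's
# feature vector to every database vector, sort, and return the top K.
# All feature values below are illustrative, not from a real extractor.

def l1_distance(a, b):
    """Unweighted Minkowski (L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def retrieve_top_k(query, database, k):
    """Rank database images by closeness to the query; return top-K ids."""
    ranked = sorted(database, key=lambda item: l1_distance(query, item[1]))
    return [image_id for image_id, _ in ranked[:k]]

database = [
    ("img1", [0.9, 0.1, 0.0]),
    ("img2", [0.2, 0.7, 0.1]),
    ("img3", [0.8, 0.2, 0.0]),
]
query = [1.0, 0.0, 0.0]
print(retrieve_top_k(query, database, k=2))  # ['img1', 'img3']
```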


CBIR System Architecture
Image Database → Feature Extraction → Feature Database
Query Image → Feature Extraction → Similarity Measure (against the Feature Database) → Retrieved Images


Research Issues: An IP/PR View
• Feature Representation
  - Proper selection of features and their representation
  - Use of multiple features
  - Integration of features with spatial features
• Reduction of the Semantic Gap
  - The gap between the low-level representation of an image and its actual high-level semantic content
• Reduction in Retrieval Time
  - Reduction in feature dimension
  - Efficient indexing


Performance Evaluation
• Commonly used criteria: Precision and Recall

  Precision = (No. of relevant retrieved images) / (No. of retrieved images)
  Recall = (No. of relevant retrieved images) / (No. of relevant images in DB)

• In our experiments we will use Precision only.
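A small worked example of the two criteria above; the retrieved and relevant sets are invented for illustration.

```python
# Precision: fraction of the retrieved images that are relevant.
# Recall: fraction of all relevant images in the DB that were retrieved.

def precision(retrieved, relevant):
    hits = len(set(retrieved) & set(relevant))
    return hits / len(retrieved)

def recall(retrieved, relevant):
    hits = len(set(retrieved) & set(relevant))
    return hits / len(relevant)

retrieved = ["a", "b", "c", "d"]      # 4 images returned to the user
relevant = ["a", "c", "e", "f", "g"]  # 5 relevant images in the DB
print(precision(retrieved, relevant))  # 2/4 = 0.5
print(recall(retrieved, relevant))     # 2/5 = 0.4
```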


A Colour Feature Representation Method
• Which colour model? We use the HSV model, which is perceptually uniform.
• Which colour representation? We use Colour Co-occurrence Matrices (CCM) of the H, S, V spaces to construct a feature vector. [Shim and Choi, 2003; Huang, 1998]
• What is a CCM? In a CCM P = [p_ij], p_ij indicates the probability of a pixel having colour level i co-occurring with another pixel having colour level j, at a relative position, say, d. [Haralick et al., 1973]
• Why CCM? It gives not only pixel information but also spatial information about an image.


A compact feature vector
• We used all diagonal elements of the CCM.
• And a single sum-average value to represent all non-diagonal elements, as per the following formula:

  Sum_ndiag = Σ_{i=1}^{L−1} Σ_{j=i+1}^{L} (i + j) · p_ij

  where i, j are row and column numbers.
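The two steps can be sketched as follows. This is not the authors' exact implementation: the co-occurrence counting here uses only horizontal neighbour pairs over a small quantised level map, purely for illustration. The compression matches the slide: keep the L diagonal entries and fold the whole upper triangle into the single Sum_ndiag term.

```python
# Build a symmetric co-occurrence matrix over quantised colour levels,
# then compress it to L diagonal entries plus one sum-average value.

def cooccurrence_matrix(levels, num_levels):
    """Symmetric CCM over horizontally adjacent pixels of a 2-D level map."""
    counts = [[0.0] * num_levels for _ in range(num_levels)]
    total = 0
    for row in levels:
        for a, b in zip(row, row[1:]):  # horizontal neighbour pairs
            counts[a][b] += 1
            counts[b][a] += 1           # symmetry
            total += 2
    return [[c / total for c in r] for r in counts]

def compact_feature(ccm):
    """Diagonal elements plus one sum-average term for the off-diagonal."""
    L = len(ccm)
    diag = [ccm[i][i] for i in range(L)]
    # Sum_ndiag = sum over the upper triangle of (i + j) * p_ij,
    # with 1-based row/column numbers as in the slide's formula.
    sum_ndiag = sum((i + 1 + j + 1) * ccm[i][j]
                    for i in range(L - 1) for j in range(i + 1, L))
    return diag + [sum_ndiag]

levels = [[0, 0, 1],
          [1, 2, 2],
          [0, 1, 2]]
feature = compact_feature(cooccurrence_matrix(levels, num_levels=3))
print(len(feature))  # 3 diagonal entries + 1 sum-average = 4
```

This mirrors how a 16-level H matrix contributes 16 + 1 components to the 25-D vector.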


Distance Criterion
Weighted Minkowski distance to measure (dis-)similarity between query image Q and database image I:

  D(I, Q) = Σ_{i=1}^{M} w_i · |f_i^I − f_i^Q|

where w_i is the weight for the i-th feature component, f_i^I and f_i^Q are the i-th feature components of I and Q respectively, and M is the dimension of the feature vector.

With no RF, equal weights are applied to all feature components.
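As a direct sketch of the formula above (feature values are illustrative, and the "learned" weights are hypothetical stand-ins for what re-weighting might produce):

```python
# Weighted Minkowski (p = 1) distance: D(I, Q) = sum_i w_i * |f_i^I - f_i^Q|.

def weighted_l1(f_i, f_q, weights):
    return sum(w * abs(a - b) for w, a, b in zip(weights, f_i, f_q))

f_query = [0.2, 0.5, 0.9]
f_image = [0.1, 0.9, 0.4]
equal_w = [1.0, 1.0, 1.0]    # no relevance feedback yet
learned_w = [2.0, 0.5, 1.0]  # hypothetical weights after re-weighting

print(weighted_l1(f_image, f_query, equal_w))    # 0.1 + 0.4 + 0.5 = 1.0
print(weighted_l1(f_image, f_query, learned_w))  # 0.2 + 0.2 + 0.5 = 0.9
```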


A compact feature vector (Contd.)
• As we considered pixel pairs in both horizontal and vertical directions, the (H, S, V) CCMs are symmetric.
• For H = 16, S = 3, V = 3:
  Original dimension: 148-D (16 + 120 + 3 + 3 + 3 + 3)
  Reduced dimension: 25-D (16 + 1 + 3 + 1 + 3 + 1)


Image databases and ground truth
• ImageDB2000: 10 categories (Flowers, Veg & Fruits, Nature, Leaves, Ships, Faces, Fishes, Cars, Animals, Aeroplanes); each category contains 200 images.
• ImageDB2020: 12 categories (Flowers, Leaves, Faces, Farm Animals, Cars, Natural Scenes, Aeroplanes, Cougars, Crocodiles, Flamingo, Vegetables & Seafood); the number of images per category varies from 96 to 376.
• DB3: 98 categories, 8365 images in total (Caltech-101)
• DB4: 43 categories, 19511 images in total (Corel collection) [Giacinto and Roli, 2005]


Performance Comparison of 25-D and 148-D for DB2000 and DB2020
[Charts: Performance for DB2000; Performance for DB2020]


Retrieved images for query image 1400.jpg with 148-D: a total of 9 relevant images retrieved


Retrieved images for query image 1400.jpg with 25-D: a total of 12 relevant images retrieved


Relevance Feedback (RF)
• Human intervention via relevance feedback: a proven way to reduce the semantic gap.
• The user marks the retrieved images as relevant and non-relevant.
• This information is fed back to the system, which tries to improve the system parameters.
• Result? Improved retrieval accuracy.
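The loop just described can be sketched generically: retrieve, collect the user's relevant/non-relevant marks, and hand them to a parameter-update rule. Here `update_weights` is a placeholder for any of the re-weighting rules on the following slides; the database, ids, and the trivial "no change" rule are all invented for illustration.

```python
# One round of relevance feedback: retrieve top-K, split the results by the
# user's marks, and update the feature weights via a supplied update rule.

def weighted_l1(a, b, w):
    return sum(wi * abs(x - y) for wi, x, y in zip(w, a, b))

def retrieve(query, database, weights, k):
    ranked = sorted(database, key=lambda it: weighted_l1(query, it[1], weights))
    return ranked[:k]

def feedback_round(query, database, weights, k, mark_relevant, update_weights):
    results = retrieve(query, database, weights, k)
    relevant = [f for img_id, f in results if mark_relevant(img_id)]
    nonrelevant = [f for img_id, f in results if not mark_relevant(img_id)]
    return update_weights(weights, relevant, nonrelevant)

database = [("img1", [0.1, 0.2]), ("img2", [0.8, 0.9]), ("img3", [0.2, 0.1])]
truth = {"img1", "img3"}              # ground-truth relevant ids (invented)
no_change = lambda w, rel, nonrel: w  # stand-in update rule
w = feedback_round([0.0, 0.0], database, [1.0, 1.0], k=2,
                   mark_relevant=lambda i: i in truth,
                   update_weights=no_change)
print(w)  # [1.0, 1.0]
```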


RF using Feature Re-weighting Method (FRM)
Weighted Minkowski distance to measure (dis-)similarity between query image Q and database image I:

  D(I, Q) = Σ_{i=1}^{M} w_i · |f_i^I − f_i^Q|

where w_i is the weight for the i-th feature component, f_i^I and f_i^Q are the i-th feature components of I and Q respectively, and M is the dimension of the feature vector.

With no RF, equal weights are applied to all feature components.


FRM continued...
Weight-type 1:

  w_i^{t+1} = σ_{K,i}^t / σ_{rel,i}^t

Here σ_{K,i}^t is the standard deviation over the K retrieved images and σ_{rel,i}^t is the standard deviation over the relevant images, in the t-th iteration.
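A sketch of weight-type 1: a feature on which the relevant images agree tightly (small σ_rel relative to σ_K) gets a large weight. The feature vectors below are invented.

```python
# w_i = sigma_{K,i} / sigma_{rel,i}, computed feature by feature.

from statistics import pstdev

def weights_type1(retrieved, relevant):
    weights = []
    for feat_k, feat_rel in zip(zip(*retrieved), zip(*relevant)):
        s_k, s_rel = pstdev(feat_k), pstdev(feat_rel)
        weights.append(s_k / s_rel if s_rel > 0 else 1.0)  # guard sigma = 0
    return weights

retrieved = [[0.1, 0.5], [0.2, 0.1], [0.9, 0.9], [0.8, 0.2]]
relevant = [[0.1, 0.5], [0.2, 0.1]]  # the user-marked relevant subset
w = weights_type1(retrieved, relevant)
print(w[0] > w[1])  # True: relevant images agree more on feature 0
```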


FRM continued...
Weight-type 2 [Wu and Zhang, 2002]:

  w_i^{t+1} = δ_i^t / σ_{rel,i}^t,   with   δ_i^t = 1 − (Σ_{l=1}^{F^t} ψ(l, U_i)) / F^t

Here Σ_{l=1}^{F^t} ψ(l, U_i) is the number of non-relevant images located inside the dominant range of the relevant samples, F^t is the total number of non-relevant images among the retrieved images, for the i-th feature component, and σ_{rel,i}^t is the standard deviation over the relevant images, in the t-th iteration.


FRM continued...
Weight-type 3 [Wu and Zhang, 2002; Das and Ray, 2006]:

  w_i^{t+1} = δ_i^t · (σ_{K,i}^t / σ_{rel,i}^t)

where
  w_i^{t+1}: weight of the i-th feature in the (t+1)-th iteration
  σ_{K,i}^t: SD over the K retrieved images
  σ_{rel,i}^t: SD over the relevant images
  δ_i^t: ratio of irrelevant images outside the dominant range (min & max of relevant images) over all irrelevant images
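A sketch of weight-type 3, combining the sigma ratio of type 1 with the δ factor: the fraction of non-relevant images falling outside the dominant range (min..max of the relevant images) on each feature. The feature vectors are invented for illustration.

```python
# w_i = delta_i * sigma_{K,i} / sigma_{rel,i} per feature component.

from statistics import pstdev

def weights_type3(retrieved, relevant, nonrelevant):
    weights = []
    for i in range(len(retrieved[0])):
        rel_vals = [v[i] for v in relevant]
        lo, hi = min(rel_vals), max(rel_vals)  # dominant range
        outside = sum(1 for v in nonrelevant if not (lo <= v[i] <= hi))
        delta = outside / len(nonrelevant)
        s_k = pstdev([v[i] for v in retrieved])
        s_rel = pstdev(rel_vals)
        weights.append(delta * s_k / s_rel if s_rel > 0 else delta)
    return weights

relevant = [[0.1, 0.4], [0.2, 0.6]]
nonrelevant = [[0.9, 0.5], [0.8, 0.1]]
retrieved = relevant + nonrelevant
w = weights_type3(retrieved, relevant, nonrelevant)
# Feature 0 separates relevant from non-relevant (delta = 1); feature 1
# does not (one non-relevant value falls inside the range, delta = 0.5).
print(w[0] > w[1])  # True
```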


Retrieved images at 7rf by 25-D: all 20 retrieved images are relevant


Retrieved images at 7rf by PCA25-D: 14 relevant images retrieved


Dimensionality Reduction and Sample Size Issues: Effect of RF
• Effect of RF on feature vectors 148-D, 25-D and PCA25-D
• Effect of noise on feature vectors


Effect of RF on feature vectors 148-D, 25-D and PCA25-D: Experimental Results


Effect of Noise


A Summary of the above Results
• For all three datasets and all three feature vectors, weight-type 3 produced the highest precision values.
• For all three datasets, our 25-dimensional feature vector produces much better retrieval accuracy than the one (of the same dimension) obtained using PCA.
• For ImageDBCaltech, the precision value with 148-D is higher than with 25-D.
• The presence of noise in the raw images reduces the retrieval accuracy for all three databases.


Dimensionality Reduction and Sample Size Issues: Effect of Varying Sample Size
Hughes Phenomenon [Hughes, 1968]


Objectives:
• To study the variation in precision value with the change in relevant class size while keeping the non-relevant class size constant.
• To study the improvement with relevance feedback as the relevant class size is varied.


Experimental Results


Some Experimental Findings
• Precision increases with increase in R/NR in a non-linear fashion and tends to saturate at higher values of R/NR. This is irrespective of database size and the type of images in the database.
• The precision curves tend to saturate at higher sample sizes.
• The improvement with RF is not so significant at lower sample sizes as opposed to higher sample sizes.


• For 25-D and PCA25-D with the real data set, the variation of precision with R values follows that of the synthetic data pretty closely, unlike with 148-D. This means the assumption of feature independence in the feature re-weighting method is more realistic with 25-D than with 148-D.


• For both ImageDB1000-1 and ImageDB1000-2, with real data, 25-D performs the best for all relevant class sizes used. This means that, with respect to accuracy and online computation time, our 25-D feature representation is a better choice than the 148-D one.


Instance-based Approach
• Limitation of FRM: estimation of class parameters is not so accurate when the number of feedback samples is low. This is worse when the feature dimension is high.
• Better solution? An instance-based approach, where a new instance is classified based on its similarity to a class of similar instances.


Visually different leopards


Relevance Score Method (RSM)
• Database images are associated with a Relevance Score (RS) and ranked in descending order.
• The RS of image I is [Giacinto and Roli, 2006]:

  RS(I) = 1 / (1 + dR(I) / dN(I))

  dR(I): minimum distance of I from the relevant set
  dN(I): minimum distance of I from the non-relevant set

• The value of RS(I) lies between 0 and 1.
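A sketch of the scoring rule above, using the L1 distance from the earlier slides; all vectors are invented. An image close to some relevant example and far from all non-relevant ones scores near 1.

```python
# RS(I) = 1 / (1 + dR(I) / dN(I)), with dR and dN the minimum distances
# to the user-marked relevant and non-relevant sets respectively.

def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def relevance_score(img, relevant, nonrelevant):
    d_r = min(l1(img, r) for r in relevant)
    d_n = min(l1(img, n) for n in nonrelevant)
    if d_n == 0:
        return 0.0  # coincides with a non-relevant image
    return 1.0 / (1.0 + d_r / d_n)

relevant = [[0.1, 0.1], [0.2, 0.2]]
nonrelevant = [[0.9, 0.9]]
near = [0.15, 0.15]  # close to the relevant set
far = [0.85, 0.9]    # close to the non-relevant set
print(relevance_score(near, relevant, nonrelevant) >
      relevance_score(far, relevant, nonrelevant))  # True
```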


Experimental Setup and Results
Image databases and ground truth:
• DB1: 10 categories, 1000 images in total
• DB2: 10 categories, 1000 images in total
• DB3: 98 categories, 8365 images in total (Caltech-101)
• DB4: 43 categories, 19511 images in total (Corel collection) [3]

Two feature vectors: 25-D (from Colour Co-occurrence Matrices in HSV space) [1], 9-D (first three colour moments in HSV space)

Accuracy measured by (Scope = 20):
  Precision = (No. of relevant retrieved images) / (No. of retrieved images)


Results with 25-D feature vector
[Charts: DB1; DB2]
1. RF increases accuracy significantly.
2. RSM performs better than FRM. For DB1, at 7rf, precision with RSM is 7.6% more than with FRM. For DB2, this figure is 6.3%.


Instance-based RF using Cluster Density (RSM-CD)


RSM-CD (Contd.)


RSM-CD (Contd.)
In the equation for RS(I),
  dC(I) = average distance of image I from the cluster of relevant images,
  |R| = cardinality of the relevant set, and
  dist(I, R_i) = distance of image I from the relevant image R_i.
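From the definitions above, dC(I) = (1/|R|) Σ_{i=1}^{|R|} dist(I, R_i). A sketch of the cluster-density variant follows; writing the score as 1 / (1 + dC/dN), i.e. substituting dC for the nearest-neighbour distance dR of RSM, is an assumption here, and the vectors are illustrative.

```python
# RSM-CD sketch: score an image by its AVERAGE distance to the relevant
# cluster rather than its distance to the single nearest relevant image.

def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def d_cluster(img, relevant):
    """dC(I) = (1/|R|) * sum over relevant images R_i of dist(I, R_i)."""
    return sum(l1(img, r) for r in relevant) / len(relevant)

def relevance_score_cd(img, relevant, nonrelevant):
    # Assumed form: dC replaces dR in the RSM score 1 / (1 + dR/dN).
    d_c = d_cluster(img, relevant)
    d_n = min(l1(img, n) for n in nonrelevant)
    return 1.0 / (1.0 + d_c / d_n) if d_n > 0 else 0.0

relevant = [[0.1, 0.1], [0.3, 0.1], [0.2, 0.3]]
nonrelevant = [[0.9, 0.8]]
print(round(d_cluster([0.2, 0.1], relevant), 4))  # (0.1 + 0.1 + 0.2) / 3
```

Averaging over the whole relevant cluster makes the score less sensitive to a single outlying relevant mark than RSM's minimum distance.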


DB3: Retrieved images at 0rf with 9-D feature vector with RSM method – only 3 relevant images


DB3: Retrieved images at 7rf with 9-D feature vector with RSM method – 6 relevant images


DB3: Retrieved images at 7rf with 9-D feature vector with RSM-CD method – 10 relevant images


Conclusion and Future Directions
• Irrespective of the feature vectors and databases used, application of FRM, RSM, and RSM-CD improves retrieval accuracy significantly.
• RSM methods perform much better than FRM.
• RSM methods perform very well on DB4. However, this is not the case with DB3, in spite of the fact that it has more semantic categories than DB4.


• Further research is needed to isolate the contribution from each factor in order to establish the goodness of one method over another.
• Detailed statistical analysis is required with respect to dimensionality reduction and sample size issues.


Discussion


References
1. Yong Rui, Thomas S. Huang, and Shih-Fu Chang, "Image Retrieval: Current Techniques, Promising Directions and Open Issues," Journal of Visual Communication and Image Representation, Vol. 10, No. 4, April 1999.
2. R. M. Haralick, K. Shanmugam, and I. Dinstein, "Textural features for image classification," IEEE Transactions on Systems, Man, and Cybernetics, pp. 610–621, November 1973.
3. S. Aksoy, R. M. Haralick, F. A. Cheikh, and M. Gabbouj, "A weighted distance approach to relevance feedback," in International Conference on Pattern Recognition, Barcelona, Spain, September 2000.
4. S.-O. Shim and T.-S. Choi, "Image indexing by modified colour co-occurrence matrix," in International Conference on Image Processing, Vol. 3, September 2003.


References (Contd.)
5. G. Das and S. Ray, "A compact feature representation and image indexing in Content-Based Image Retrieval," in Proceedings of the Image and Vision Computing New Zealand 2005 Conference (IVCNZ 2005), pages 387–391, Dunedin, New Zealand, 28–29 November 2005.
6. Jing Huang, "Colour-spatial Image Indexing and Applications," PhD thesis, Cornell University, 1998.
7. G. Das and S. Ray, "Feature re-weighting in Content-Based Image Retrieval," in Proceedings of the International Conference on Image and Video Retrieval, pages 387–391, Arizona State University, Tempe, AZ, July 13–15, 2006.
8. G. Giacinto and F. Roli, "Instance-based relevance feedback for image retrieval," in Advances in Neural Information Processing Systems 17, pages 489–496, Cambridge, MA, 2005.
