30.07.2015 Views

Information retrieval on mind maps – what could it be ... - Jöran Beel


III. DOCUMENT SUMMARIZATION

Search engines usually display summarized data for each search result. This could be the document's title, URL, or a short extract of the document's text. Academic search engines additionally display data such as the author, publishing date or the abstract (see Figure 2). This does not always deliver satisfying results. In Figure 2, for instance, the extract is not very informative; it basically equals the title. As an alternative to text extracts, some researchers have attempted to automatically create abstracts [17-19] or to summarize documents based on user-generated data such as hyperlinks [20], social annotations [21] and annotated bibliographies [22].

Figure 2. Example of summary data on Google Scholar

Mind maps could be used to complement the summarization of documents. Most mind mapping tools allow users to link nodes in the mind map with documents on the user's hard drive or to link a node to a webpage. The node's text, and the text of its parent nodes, could be seen as a summary of the linked document. Figure 3 illustrates this approach: the node with the red arrow and gray background links to the PDF file of the scholarly article 'Are your citations clean?' [23]. This article deals with problems of citation analysis.
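
The idea that a linking node's text, together with the text of its parent nodes, summarizes the linked document could be sketched as follows. This is a minimal Python illustration and not part of the original paper: the `Node` class, the `path_summaries` function and the file name `are_your_citations_clean.pdf` are hypothetical, and the node labels follow the Figure 3 example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A mind map node; `link` is an optional path or URL of a linked document."""
    text: str
    link: Optional[str] = None
    children: list[Node] = field(default_factory=list)


def path_summaries(root: Node) -> dict[str, list[str]]:
    """Map each linked document to the 'root -> ... -> node' paths linking to it."""
    summaries: dict[str, list[str]] = {}

    def walk(node: Node, ancestors: list[str]) -> None:
        path = ancestors + [node.text]
        if node.link is not None:
            # The linking node's text plus all parent texts act as the summary.
            summaries.setdefault(node.link, []).append(" -> ".join(path))
        for child in node.children:
            walk(child, path)

    walk(root, [])
    return summaries


# Hypothetical mind map mirroring the node labels of the Figure 3 example.
root = Node("Citation Analysis", children=[
    Node("Problems", children=[
        Node("Technical Problems", children=[
            Node("problem: different authors with the same name",
                 link="are_your_citations_clean.pdf"),
        ]),
    ]),
])

for document, paths in path_summaries(root).items():
    print(document, paths)
```

Under these assumptions, a document linked from several mind maps would simply accumulate one path per link once the per-map results are merged, which is one possible way to aggregate what different readers found relevant.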
In the example of Figure 3, the node linking to the PDF and its parent nodes summarize the article's content well:

Citation Analysis -> Problems -> Technical Problems -> problem: different authors with the same name

Certainly, one occurrence in a mind map would not be sufficient for a thorough summary. But if several users linked a document in several mind maps, this could add up to a decent summary, highlighting what readers found most relevant in the document.

Figure 3. Mind maps as document summary and for determining word relatedness

IV. KEYWORD BASED SEARCH ENGINES

When searching for documents, a user usually enters a keyword and the search engine returns those documents containing the keyword. Various algorithms exist to calculate how relevant a document is for a certain keyword search (e.g. tf-idf and BM25(f)), but usually only words contained in the documents are considered. Only a few approaches additionally consider words of 'neighbored' documents [24]. Considering neighbored documents means that document A could be found for a keyword search even if document A does not contain the keyword, as long as document B, which links to document A, does. Usually this kind of link analysis is applied to scholarly literature and websites. However, it seems likely that the same concept could be applied to mind maps.

Figure 4. Enhance keyword based search engines

If a mind map links to a document, the words of the linking (and parental) node could be assigned to the linked document. Figure 4 illustrates this: the mind map contains a node called 'expert search', and child nodes link to documents related to expert search (those with the red arrows). However, many of these documents do not contain the term 'expert search', but other expressions such as 'expert finder', 'expertise management' or 'skill management'. If search engines analyzed mind maps and treated them as 'neighbored' documents, recall in document retrieval could be increased.

V. (DOCUMENT) RECOMMENDER SYSTEMS

One common recommendation approach is to recommend those items which are related to items a user likes (item based recommendations). For scholarly literature and websites, relatedness is often determined via citation analysis and hyperlink analysis, respectively. The same concept could be applied to mind maps.

Figure 5. Expected Link Relatedness (Illustration)

The basic idea of what we call 'Mind Map Citation Analysis' is this: when two documents A and B are linked by a mind map, document B could be recommended to those users liking document A. This concept could be enhanced with common citation analysis approaches. For instance, if two documents are linked in high proximity, their relatedness can be expected to be higher than that of two documents linked in lower proximity [25, 26]. Figure 5 illustrates this concept: Links 1 and 2 are in direct proximity. Therefore, the linked documents can be expected to be highly related. The distance between links 3 and 4 is greater, so their relatedness is likely to be lower.
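
The proximity idea could be made concrete in many ways; the text does not prescribe a measure. The following Python sketch assumes one simple option: the distance between two linking nodes is the number of edges on the tree path between them, and relatedness is `1 / (1 + distance)`. The map layout in `PARENT`, the function names and the scoring formula are all illustrative assumptions, not a reproduction of Figure 5 or of the approaches in [25, 26].

```python
from __future__ import annotations

# Hypothetical mind map stored as child -> parent edges.
# Nodes named "Link n" stand for nodes that link to a document.
PARENT = {
    "Section 1": "Topic",
    "Section 2": "Topic",
    "Section i": "Topic",
    "Link 1": "Section 1",
    "Link 2": "Section 1",
    "Link 3": "Section 2",
    "Link 4": "Section i",
}


def path_to_root(node: str, parent: dict[str, str]) -> list[str]:
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def tree_distance(a: str, b: str, parent: dict[str, str]) -> int:
    """Number of edges between a and b via their lowest common ancestor."""
    steps_from_a = {n: i for i, n in enumerate(path_to_root(a, parent))}
    for steps_from_b, n in enumerate(path_to_root(b, parent)):
        if n in steps_from_a:
            return steps_from_a[n] + steps_from_b
    raise ValueError("nodes are not part of the same mind map")


def relatedness(a: str, b: str, parent: dict[str, str]) -> float:
    """Assumed score: closer links yield a higher value in (0, 1]."""
    return 1.0 / (1 + tree_distance(a, b, parent))


def recommend(liked: str, link_nodes: list[str], parent: dict[str, str]) -> list[str]:
    """Rank all other linked documents by relatedness to the liked one."""
    others = [n for n in link_nodes if n != liked]
    return sorted(others, key=lambda n: relatedness(liked, n, parent), reverse=True)


links = ["Link 1", "Link 2", "Link 3", "Link 4"]
print(recommend("Link 1", links, PARENT))
```

In this hypothetical layout, Link 1 and Link 2 are siblings and therefore score higher with each other than Link 3 and Link 4, which only meet at the root. A real system would likely combine such scores over many users' mind maps.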
