Pakistan Journal of Science (Vol. 62 No. 3 September, 2010)

SOFTWARE QUALITY METRICS FOR RHETORICAL STRUCTURE THEORY BASED INFORMATION RETRIEVAL SYSTEMS

M. Shoaib, S. Arshad, S. Jabeen and M. P. Tariq*

Department of Computer Science & Engineering, Lahore, Pakistan; *Department of CS, Virtual University of Pakistan

Corresponding author: tariq_cp@hotmail.com
ABSTRACT: Several information retrieval models exist today, for example the Boolean model, the vector space model and the probabilistic model, but none of them provides exactly the information the user requires. Search engines based on such models emphasize the syntactic aspects of the query rather than capturing its semantics. Most work in the domain of information retrieval relies on keyword based indexing, and searching over such an index leads to irrelevant result sets and poor search-engine performance. Rhetorical structure theory has revolutionized information retrieval by introducing semantic based retrieval, improving the performance of search engines and reducing irrelevant results to a minimum. In this paper, we present and validate design metrics for rhetorical structure theory based information retrieval systems, so as to measure retrieval relevance and to improve retrieval performance.

Keywords: Keyword based indexing, dynamic weight, text based information retrieval, tokens, discourse structure
INTRODUCTION

In Text Based Information Retrieval (TBIR) systems, when the user enters a query 'q' against a collection of documents 'c', each document 'd' is examined and given a weight according to how well it satisfies the semantics of the query 'q', or in other words, how well it fulfils the user's requirement. For every instance of the triple (q, d, c), the weight attributed to a document 'd' is computed by evaluating a ranking function over that triple. Generally speaking, an Information Retrieval (IR) system (Arnt et al., 2004; Petratos, 2006) performs various retrieval tasks such as document indexing, ranking and classification. Document ranking is achieved by sorting all documents in the collection on the basis of the assigned weights (Papka et al., 2008). The core job of an IR system is the parsing process. The tokens produced by parsing documents and queries are called terms. The query provided to the IR system in natural language is parsed into a set of terms, and the terms derived from a document are used to build an inverted list, which serves as an index to the document collection. Normally, IR systems assume that if a term co-occurs in the query and a document then the document is relevant to the query; the co-occurrence of multiple query terms in a document further contributes to that document's degree of relevance. Current information retrieval techniques (Liu et al., 2009) are able to retrieve only 30% of the relevant information. Present information retrieval systems are based on keyword based indexing, in which keywords are assigned static weights using retrieval models such as the extended Boolean model, the vector model or the probabilistic model. Words carrying different meanings in different contexts can then cause the retrieval of irrelevant information (Shoaib and Shah, 2006).
Rhetorical Structure Theory (RST) (Mann and Thompson, 1986; Mann and Thompson, 1987; Mathkour et al., 2008; Taboada and Mann, 2006) is a descriptive linguistic approach for identifying textual relations, originally with the intention of generating text. The success of RST can be seen from the wide application of the theory in domains ranging from discourse analysis to psycholinguistics, theoretical linguistics and computational linguistics (Jun'chi and Tsujii, 1994; Mathkour et al., 2008; Taboada and Mann, 2005). The theory is notably used in applications of computational linguistics such as text parsing and generation, machine translation and essay scoring, and most importantly in natural language processing (Mann and Thompson, 1987; Marcu, 1997). RST organizes text by means of relations that exist between its sections and establishes a coherent structure in which every section of the text plays a role or function with respect to the other sections. The resulting relations are termed coherence relations, discourse relations, conjunction relations or rhetorical relations, and around 30 different relations may hold between the sections of a text (Taboada and Mann, 2005; Taboada and Mann, 2006). RST identifies two different types of text units: the nucleus and the satellite. Nuclei are of prime importance, whereas satellites contribute to the nucleus and are considered secondary. In the elaboration relation, for instance, the nucleus is the part of the text containing the basic information and the satellites are the parts containing additional or supporting information about the nucleus. RST relations are applied to the text recursively until all text units are associated with one of the RST relations (Taboada and Mann, 2006).
With the gigantic increase in the volume of web content, it has become a pressing need to improve the performance of search engines so that they can retrieve semantically relevant information in response to the user query. Currently, most search engines are based on keyword based searching techniques, which are considered static because they rely on syntax only. These IR systems are unable to handle the various contexts in which the same term appears in different documents. RST based IR systems cope with this problem by capturing the semantics of the terms using dynamic weight assignment techniques.
In this paper, we present various design metrics for RST based IR systems to measure retrieval relevancy and improve the performance of IR systems. We also validate the proposed metrics through a number of case studies. We base our work on the architecture of the RST based IR system presented by Shoaib and Shah (2006); the case studies used to validate the proposed design metrics are also taken from that work.

The rest of the paper is organized as follows: in section 2, we describe the architecture of the RST based IR system upon which we build the design metrics and their validation; in section 3, we present materials and methods; in section 4, we present results and discussion; and in section 5, we conclude our work and outline future directions.
Architecture of RST based IR system: Since we have selected the architecture of the RST based IR system proposed by Shoaib and Shah (2006) as the basis of our research, we describe this architecture in detail in this section. They proposed an indexing technique with dynamic weight assignment based on RST. They used punctuation marks and cue phrases (Marcu, 2000), i.e. words that connect two or more text spans, to identify the rhetorical relations, and constructed an RST tree whose leaves represent the text spans and whose internal nodes represent the rhetorical relations. Keywords extracted from the text spans are assigned dynamic weights such that keywords in text spans closer to a nucleus receive priority weights. With this dynamic weight assignment technique the precision rate is clearly improved, and the system developed on this basis is able to capture the semantics of a document efficiently. The architecture of their model consists of four modules: 1) Segmentor, 2) Rhetorical Relation Finder, 3) Rhetorical Tree Parser and 4) RST based Indexer (fig. 1).

Fig-1: Architecture of RST based IR systems proposed by Shoaib and Shah (2006)

The purpose and functionality of each component is defined as follows.
Segmentor: Text structure is usually organized into words, sentences and paragraphs, and there is a correlation between these different text spans which needs to be identified. Handling the structure in the text requires identifying the boundaries within it; this is the primary function of text segmentation. The text segmentor provides the structural information that is utilized for text analysis.
Rhetorical relation finder: In the next step, the indexer automatically finds the rhetorical relations. For this purpose, cue phrases and punctuation are used to obtain the relationship between the text spans.
RST tree: From the list of text spans and the list of relations generated in the previous stages, the rhetorical tree is generated; its internal nodes represent the relations and its leaves represent the text spans. Based on the height of the tree, an initial weight is assigned to each text span. The indexer utilizes the keywords and their corresponding initial/dynamic weights to populate its knowledge base.
Indexer: The indexer uses the concept of strong and weak nodes to assess the initial weight, and this initial weight assessment is used for dynamic weight assignment. The initial weight is taken between 0 and 1: the root node is assigned 1, whereas a nucleus and a satellite of a parent node are assigned 0.9 and 0.5 respectively. These weights are variable. The weight assigned to a child node is generalized by the following formula:

Initial weight of child node = Weight of child node * Weight of parent node

After calculating the initial weights of all child nodes, these weights are associated with the index terms occurring in the text spans. On the basis of the initial weight
assessment and the term frequency, the actual weight of each index term is calculated as follows:

Actual weight of the index term = Initial weight assessment * Term frequency

The indexer maintains the knowledge base by saving the document ID, vocabulary ID and actual weights. The document ID keeps track of which word belongs to which document. The vocabulary ID ensures that there is no redundancy of words in the knowledge base, thereby consuming less space. The dynamic weight indicates the semantics-based occurrence of important index terms.
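The weight rules above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the class and function names are our own, and only the stated rules are encoded (root = 1, nucleus child multiplies its parent's weight by 0.9, satellite by 0.5, actual weight = initial weight times term frequency).

```python
NUCLEUS_FACTOR = 0.9    # weight factor for a nucleus child (from the paper)
SATELLITE_FACTOR = 0.5  # weight factor for a satellite child (from the paper)

class RstNode:
    """A node of the RST tree; names here are hypothetical."""
    def __init__(self, role, children=None):
        self.role = role                  # "root", "nucleus" or "satellite"
        self.children = children or []
        self.initial_weight = 0.0

def assign_initial_weights(node, parent_weight=None):
    """Initial weight of child node = weight of child node * weight of parent node."""
    if parent_weight is None:             # the root node is assigned 1
        node.initial_weight = 1.0
    else:
        factor = NUCLEUS_FACTOR if node.role == "nucleus" else SATELLITE_FACTOR
        node.initial_weight = factor * parent_weight
    for child in node.children:
        assign_initial_weights(child, node.initial_weight)

def actual_weight(initial_weight, term_frequency):
    """Actual weight of an index term = initial weight assessment * term frequency."""
    return initial_weight * term_frequency

tree = RstNode("root", [RstNode("nucleus"), RstNode("satellite")])
assign_initial_weights(tree)
print(tree.children[0].initial_weight)          # nucleus: 0.9 * 1.0 = 0.9
print(round(actual_weight(0.9, 3), 2))          # term appearing 3 times: 2.7
```

A deeper tree simply multiplies the factors down each path, so spans far from the root nucleus receive progressively smaller initial weights.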
MATERIALS AND METHODS

In this section, we propose a number of metrics related to the following basic components of an RST based IR system:

1. Segmentor
2. Rhetorical relation finder
3. Rhetorical tree parser
4. RST based indexer
Number of segments metric: Number of Segments (NoS) is one of the proposed metrics for improving the performance of IR systems. It concerns the segmentor, one of the basic components of RST based IR systems. This metric calculates the total number of segments into which a text document can be divided, and is defined as the summation of the number of cue phrases and punctuation marks identified, since these two parameters are used to divide a text into segments. A segment is any elementary unit such as a keyword, a line of text or a paragraph itself.

No way has yet been defined to validate the correct number of segments in advance. The output of the first component of the architecture is the list of individual segments. This list is fed into the next component, the rhetorical relation finder, which works on the list of segments to identify the corresponding rhetorical relations. Computing the number of segments in advance gives an exact idea of how much memory will be required to store the segments; the number of segments also helps to judge the structure of the RST tree, and hence the corresponding rhetorical relations, in advance.

Mathematically, the No. of Segments metric is defined as follows:

NoS = Σ_{d=1}^{│D│} (NoC_d + NoP_d)

Where,
NoC = Number of Cue Phrases identified
NoP = Number of Punctuation Marks identified
and d ranges over the documents of the collection D.
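A minimal sketch of the NoS computation follows. The cue-phrase list here is a tiny illustrative sample of our own choosing, not the full inventory used in the cited work (Marcu, 2000), and the punctuation set is likewise an assumption.

```python
import re

# Hypothetical sample of cue phrases; the real system uses a much larger list.
CUE_PHRASES = ["because", "although", "however", "therefore", "for example"]

def number_of_segments(documents):
    """NoS = sum over documents of (cue phrases found + punctuation marks found)."""
    total = 0
    for text in documents:
        noc = sum(text.lower().count(cue) for cue in CUE_PHRASES)  # NoC
        nop = len(re.findall(r"[.,;:!?]", text))                   # NoP
        total += noc + nop
    return total

docs = ["It rained, because the front moved in.",
        "However, the match went ahead; the pitch was covered."]
print(number_of_segments(docs))  # → 7  (doc 0: 1+2, doc 1: 1+3)
```

Knowing this count before building the RST tree bounds both the memory needed for the segment list and the number of leaves the tree can have.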
Rhetorical relation distance metric: There are two ways to measure the distance between two text spans. The first is to calculate the physically linear distance between them, i.e. the number of consecutive sentences that occur between the two text spans. The other is to compute the semantic distance between the text spans, which can be determined by the rhetorical relation distance (RRD) metric. This metric gives the semantic distance between two text spans in terms of the number of internal nodes of a discourse graph, where the nodes of the graph represent the text spans. Computing the rhetorical distance helps in analyzing the strength of a rhetorical relation: the greater the value of the RRD metric, the weaker the rhetorical relation, and vice versa. Hence,

RRD = 0 means strongest relation (direct relation)
RRD = 1 means good relation (indirect relation)
RRD > 1 means weak relation (indirect relation)

A weaker rhetorical relation indicates an indirect relation between two nodes, each node representing a text span, whereas a rhetorical relation is considered stronger if it is a direct relation between two nodes. Rhetorical relation strength can help in the construction and improvement of the rhetorical structure tree.
Suppose there are two nodes N_i and N_j in the RST graph, and the level of any node is represented as L with indexes i and j denoting the specific level number. We assume that index L_j always indicates a deeper level than L_i. If index k represents the particular node number at a specific level, then mathematically the rhetorical relation distance between these two nodes is
Rhetorical Relation Distance (RRD) = L_j(N_j^k) - L_{i+1}(N_i^k)

For nodes at various depths whose parents are non-sibling nodes at the same level, the RRD is computed as:

L_x = x + (x - 1), where x represents the number of the level

L_1 = 1 + 0 = 1    case 1 [common parent]
L_2 = 2 + 1 = 3    case 2 [sibling parent]
L_3 = 3 + 2 = 5    case 3 [non-sibling parent]
L_4 = 4 + 3 = 7    as above
And so on…
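The two RRD formulas can be sketched as follows. This is a hedged reading of the definitions above, with function names of our own: `rrd_direct` applies RRD = L_j - (L_i + 1) for a direct chain of relations, and `rrd_same_level` applies L_x = x + (x - 1) for the exceptional cases where both nodes sit at level x under distinct parents.

```python
def rrd_direct(level_i, level_j):
    """RRD = L_j - (L_i + 1): semantic distance along a direct chain of relations."""
    assert level_j >= level_i, "L_j is assumed to be the deeper level"
    return level_j - (level_i + 1)

def rrd_same_level(x):
    """Exceptional cases: both nodes at level x, parents at the same level."""
    return x + (x - 1)

def strength(rrd):
    """Interpret the RRD value as relation strength, per the scale above."""
    if rrd == 0:
        return "strongest (direct relation)"
    if rrd == 1:
        return "good (indirect relation)"
    return "weak (indirect relation)"

print(rrd_direct(0, 2), strength(rrd_direct(0, 2)))  # 1 good (indirect relation)
print(rrd_direct(0, 1), strength(rrd_direct(0, 1)))  # 0 strongest (direct relation)
print(rrd_same_level(3))                             # non-sibling parents: 5
```

The two `rrd_direct` calls reproduce the 3C–3A and 3C–3D computations worked in the validation section, and `rrd_same_level` reproduces the case table.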
Metrics for indexer: This section defines some important measurements for the indexer component of the architecture.
Indexer size metric: There are various measures of indexer size. Some researchers take the size of the indexer in terms of the number of web pages crawled, while others measure it by how much computer storage is required to support the index (Brown, 1995; Henzinger et al., 1999). Our proposed indexer size metric adopts a different approach: we measure the size of the indexer in terms of the number of key terms taken from the documents of the corpus and stored in the indexer as index terms.

So the indexer size metric (S_idx) is defined as "the sum of the index terms taken from the individual documents of the document collection". Mathematically,

S_idx = Σ_{d=1}^{│D│} d(T_idx)

Where 'd' represents a particular document in the document corpus 'D', T_idx_j represents an index term taken from a particular document, j is the counter over the index terms of that document, and d(T_idx) denotes the number of such terms in document d.
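As a minimal sketch of S_idx, we can represent each document simply by its list of index terms (an assumption made for brevity; the actual indexer stores IDs and weights as described earlier) and sum the per-document counts:

```python
def indexer_size(corpus):
    """S_idx = sum over documents of the number of index terms taken from each."""
    return sum(len(index_terms) for index_terms in corpus)

# Hypothetical term lists sized to match the paper's validation (19 and 21 terms).
corpus = [["day", "lyric", "song"] * 6 + ["music"],   # 19 index terms
          ["day", "doom", "judgment"] * 7]            # 21 index terms
print(indexer_size(corpus))  # → 40
```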
Indexer term relevancy metric: The efficacy of the indexer mainly depends upon the degree of relevancy of the index terms. We define the index term relevancy (ITR) metric as "the degree of relevancy of an index term according to the various contexts in which the term is used in different documents".

The indexer ensures semantically relevant results by assigning initial weights to the nodes of the RST tree and then computing the dynamic weights of the index terms from the initial weights and the term frequencies. Mathematically,

ITR_i = TF_i * W_init_i

Where TF_i indicates the term frequency of an index term and W_init_i represents the initial weight of that index term.
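A short sketch of the per-term computation: the same term scores differently in different documents because both its frequency and the initial weight of the span holding it differ. The numeric values below are illustrative assumptions, not entries from the authors' knowledge base.

```python
def index_term_relevancy(term_frequency, initial_weight):
    """ITR_i = TF_i * W_init_i: the dynamic weight of an index term in one document."""
    return term_frequency * initial_weight

# The same term used in two different documents/contexts (hypothetical values):
w_doc0 = index_term_relevancy(term_frequency=3, initial_weight=0.9)   # near a nucleus
w_doc1 = index_term_relevancy(term_frequency=4, initial_weight=0.45)  # in a satellite
print(w_doc0 > w_doc1)  # → True: doc 0 provides the more relevant context
```

Note that a higher raw frequency (4 vs 3) does not win here: the context, via the initial weight, dominates, which is exactly what distinguishes this scheme from static keyword weighting.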
User query relevancy metric: The User Query Relevancy (UQR) metric is defined as "how relevant the results returned by the indexer are to a user's query".

The UQR is computed in terms of the major operation performed in the indexer searching process, namely finding the index terms with maximum dynamic weight, so we express this operation using the max() function. When the user enters a query in natural language form, the max() function is computed against each term of the query to find the most relevant index terms (the terms having maximum dynamic weights). Mathematically, for each term t_i of the query,

UQR(t_i) = max(W_dyn(t_i))

where the maximum is taken over the dynamic weights W_dyn of the matching index terms across the indexed documents.
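The max() search step can be sketched as below. The toy knowledge base is hypothetical in shape, though the two weights for "day" (5.39 and 2.24) are the values reported in table 2 of this paper.

```python
knowledge_base = {
    # index term -> {document_id: dynamic_weight}
    "day":    {0: 5.39, 1: 2.24},
    "lyrics": {0: 3.10},           # assumed value for illustration
}

def most_relevant_documents(query_terms):
    """For each query term, pick the document where its dynamic weight is maximum."""
    results = {}
    for term in query_terms:
        postings = knowledge_base.get(term, {})
        if postings:
            # max() over the postings selects the document with the peak dynamic weight
            results[term] = max(postings, key=postings.get)
    return results

print(most_relevant_documents(["lyrics", "day"]))  # → {'lyrics': 0, 'day': 0}
```

Both terms point at document 0, so that document would be returned first, matching the "lyrics day" walk-through in the validation section.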
Quality of indexer metric: Another important metric for measuring the effectiveness of the indexer is the Quality of Indexer (QI) metric. When a user enters a query for which there can be multiple answers to choose from, the most important issue for indexer effectiveness is whether it indexes the information that is most useful from the user's perspective. Another reason for considering indexer quality is that the size of indexers is growing at a slower rate than the amount of web content. Due to the storage and processing power limitations of the indexer, it becomes important to index the most useful and relevant documents. In addition, search engines should index useful documents in some ranked order, so that the user gets the most relevant information rather than too many results, most of which are irrelevant to the query. Hence, an indexer with high quality indexes is a growing need of users.
Q(I) in a broader perspective can be considered at two tiers:

a. The overall quality of the documents indexed
b. The average quality per indexed document

Suppose every document indexed by the search engine is given a weight w(d). For convenience, we assume the weights are scaled so that the sum of all weights is 1. If I refers to the indexer and D to the set of all documents indexed by a search engine, then we define the quality of the indexer Q(I) as

Q(I) = Σ_{d ∈ D} w(d)

Note that since w is scaled, we always have 0 ≤ Q(I) ≤ 1. Letting │D│ denote the number of documents indexed by I, the average quality per indexed document is:

Q_avg(I) = Q(I) / │D│

This average quality per indexed document is a very good measure of how well an indexer selects documents for indexing.
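The two tiers of the metric reduce to a sum and a mean over the scaled weights, as this sketch shows; the weights below are hypothetical examples, not measured values.

```python
def indexer_quality(weights):
    """Q(I) = sum of the scaled weights w(d) of the indexed documents."""
    q = sum(weights)
    assert 0.0 <= q <= 1.0, "scaled weights must keep Q(I) within [0, 1]"
    return q

def average_quality(weights):
    """Q_avg(I) = Q(I) / |D|, the average quality per indexed document."""
    return indexer_quality(weights) / len(weights)

w = [0.2, 0.1, 0.3, 0.15]               # hypothetical scaled weights, |D| = 4
print(round(indexer_quality(w), 3))     # → 0.75
print(round(average_quality(w), 4))     # → 0.1875
```

Because the weights are scaled over everything the engine indexes, a Q(I) near 1 means the indexed set captures most of the available document value.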
RESULTS AND DISCUSSIONS

In this section we validate the various metrics defined for the architecture of RST based IR systems.
Validation of NoS metric: We validate the Number of Segments (NoS) metric for the following documents.

Table-1. Documents taken as sample text, as used by Shoaib and Shah (2006).

We take two documents (Doc 0 and Doc 1) as shown in table 1.

For Document 0: NoC = 1, NoP = 3
For Document 1: NoC = 8, NoP = 4

The total number of segments is then computed as follows:

NoS = [(1 + 3) + (8 + 4)] = 4 + 12 = 16

The same number of segments is reported by Shoaib and Shah (2006), which is in accordance with our NoS metric.
Validation of RRD metric: Taking as a case study an example from the work presented by Bosma (2005), the discourse structure shown in figure 2 is converted to the RST graph in figure 3.

The nodes of the RST graph in figure 3 represent the text spans in terms of nuclei and satellites. The arrows represent the relation between a nucleus and a satellite, with the head pointing from the nucleus to the satellite. From this RST graph, the RRD between nodes 3C and 3A is calculated as

RRD between 3C and 3A = L_j(N_j^k) - L_{i+1}(N_i^k) = 2 - (0 + 1) = 1

Hence, the number of internal nodes between 3C and 3A is 1, which shows that the rhetorical relation between 3C and 3A is good.

Whereas,

RRD between 3C and 3D = L_j(N_j^k) - L_{i+1}(N_i^k) = 1 - (0 + 1) = 0

Hence, the rhetorical relation between 3C and 3D is direct, or strong.
Exceptional cases

Case No. 1: If two nodes at the same level have a common parent node, then the RRD will always be equal to 1. So, for nodes 3D and 3B in figure 4, RRD between 3B and 3D = 1.

Fig-4: RST Graph 2

Fig-2: Discourse structure taken from Bosma (2005).

Case No. 2: The second exceptional case states that the RRD between two nodes 3F and 3A (figure 4), having sibling nodes as parents, will always be equal to 3. Hence, RRD between 3F and 3A = 3.

Fig-3: RST Graph 1

Fig-5: RST Graph 3
Table-2. Indexed terms in the knowledge base used by Shoaib and Shah (2006)
Fig-6: Top ten results returned by the Google search engine
Case No. 3: The third exceptional case states that the RRD between two nodes 3H and 3E (figure 5), existing at level 3 and having non-sibling nodes as parents, will be

RRD between 3H and 3E: L_x = x + (x - 1), so L_3 = 3 + 2 = 5

Hence, the number of internal nodes between 3H and 3E is 5, which indicates that the relation between nodes 3H and 3E is indirect and weak.
Validation of S_idx metric: Suppose the document corpus consists of the two documents shown in table 1. The size of the indexer is calculated as follows. We have │D│ = 2, and

For Doc 0: d0(T_idx) = 19
For Doc 1: d1(T_idx) = 21

S_idx = 0 + [19 + 21] = 40

Hence, the size of the indexer is 40 index terms, which is in accordance with the indexer proposed by Shoaib and Shah (2006).
Validation of ITR metric: To validate the ITR metric, we take the case study shown in table 2. Dynamic weights vary according to the context in which an index term is used in different documents. For example, suppose the user provides a query to search for information about "lyrics day". In the indexer, as shown in table 2, the index term "day" is assigned the dynamic weight 5.39 in document 0 but the dynamic weight 2.24 in document 1, according to the context in which it is used. The higher the value of the dynamic weight, the greater the relevancy; therefore, document 0 is returned in response to the user query.
Validation of QI metric: The indexer quality metric is validated by the following case study. Suppose the user queries the Google search engine for the keyword "dooms day" with the intention of getting information about the Day of Judgment. The result of the Google search engine in response to this query is shown in figure 6.

The search engine retrieved 5,520,000 results against the user query. For the case study we take only the top ten results, shown in ranking order. After analyzing the contents of the documents, the following scaled weights are assigned to the top ten documents:

d1 = 0.10, d2 = 0.075, d3 = 0.050
d4 = 0.20, d5 = 0.025, d6 = 0.075
d7 = 0.30, d8 = 0.025, d9 = 0.050
d10 = 0.075

Then the quality of the indexer is computed as:

Q(I) = 0.10 + 0.075 + 0.050 + 0.20 + 0.025 + 0.075 + 0.30 + 0.025 + 0.050 + 0.075 = 0.975

which satisfies 0 ≤ Q(I) ≤ 1. A value of Q(I) closer to 1 means the indexer has indexed valuable and relevant documents.

Proceeding further, we get the average quality per indexed document as

Q_avg(I) = Q(I) / │D│ = 0.975 / 10 = 0.0975
Conclusion and Future Work: We have proposed a number of metrics for RST based IR systems. The Number of Segments (NoS) metric shows that the correct number of segments can be validated in advance, leading to better decisions on storage capacity and a better understanding of the rhetorical relations and the RST tree construction. The next metric concerns the computation of the semantic distance between any two nodes of the RST graph. We then propose a number of metrics related to the most important component of the RST based IR architecture, the indexer. We propose the indexer size metric, based on the number of key terms indexed, and validate it to show that the size of the indexer is directly related to its efficiency. The Indexer Term Relevancy metric is based on the value of the maximum dynamic weight; the higher the degree of term relevancy, the better the efficiency of the indexer. Extending the same concept, we propose and validate the user query relevancy metric, which computes the efficacy of the indexer in returning the most relevant results in response to the user query. Finally, we propose the quality of indexer metric, based on the weights assigned to the various documents indexed.

A challenging issue regarding the enhancement of this research is to automate the computation of the metrics, so that when the user enters a query these metrics are automatically computed to show the performance of the search engine. Another issue which can be addressed in future is a comparative study of RST based IR systems, comparing their performance with that of other information retrieval systems using the metrics defined here. An automated model can be designed for this comparative analysis; such a model would have an interface like a search engine to accept the user query and return comparative performance statistics of the RST based IR systems.
REFERENCES

Arnt, A., S. Zilberstein, J. Allan and A. Mouaddib. Dynamic Composition of Information Retrieval Techniques, Journal of Intelligent Information Systems, 23:67-97 (2004).

Bosma, W. Query-Based Summarization using Rhetorical Structure Theory, 15th Meeting of CLIN, 29-44 (2005).

Brown, E. W. Execution Performance Issues in Full-Text Information Retrieval, Technical Report 95-81 (1995).

Henzinger, M. R., A. Heydon, M. Mitzenmacher and M. Najork. Measuring Index Quality Using Random Walks on the Web, Comput. Netw., 31:1291-1303 (1999).

Jun'chi, F. and J. Tsujii. Breaking down rhetorical relations for the purpose of analyzing discourse structures, 15th International Conference on Computational Linguistics, Kyoto, 1177-1183 (1994).

Liu, P., Z. Zhu and L. Zhao. Research on Information Retrieval System Based on Ant Clustering Algorithm, Journal of Software, 4:1032-1036 (2009).

Mann, W. C. and S. A. Thompson. Rhetorical Structure Theory: description and construction of text structures, Information Sciences Institute, Nijmegen, The Netherlands, ISI/RS-86-174, 1-15 (1986).

Mann, W. C. and S. A. Thompson. Rhetorical Structure Theory: A Framework for the Analysis of Texts, IPRA Papers in Pragmatics, 1:79-105 (1987).

Mann, W. C. and S. A. Thompson. Rhetorical Structure Theory: description and construction of text structures, in Natural Language Generation: Recent Advances in Artificial Intelligence, Psychology, and Linguistics (G. Kempen, ed.), Boston/Dordrecht: Kluwer Academic Publishers, 85-96 (1987).

Marcu, D. The Rhetorical Parsing of Natural Language Texts, Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (ACL'97/EACL'97), Madrid, Spain, 96-103 (1997).

Marcu, D. The Theory and Practice of Discourse Parsing and Summarization, 1st Edn., The MIT Press, Cambridge, MA, ISBN-10: 0262133725, 268 (2000).

Mathkour, H. I., A. A. Touir and W. A. Al-Sanea. Parsing Arabic Texts Using Rhetorical Structure Theory, Journal of Computer Science, 4:713-720 (2008).

Papka, R., J. P. Callan and A. G. Barto. Text-Based Information Retrieval Using Exponential Gradient Descent, Proc. NIPS, X, 3-9 (2008).

Petratos, P. Information Retrieval Systems: A Perspective on Human Computer Interaction, Issues in Informing Science and Information Technology, 3:511-518 (2006).

Shoaib, M. and A. A. Shah. A new indexing technique for IR systems using Rhetorical Structure Theory, Journal of Computer Science (2006).

Taboada, M. and W. C. Mann. Applications of Rhetorical Structure Theory, Discourse Studies, 8:567-588 (2005).

Taboada, M. and W. C. Mann. Rhetorical Structure Theory: Looking Back and Moving Ahead, Discourse Studies, 8:423-459 (2006).