




Automated Filipino Verbal Sentence Evaluator<br />

Jennefer B. Jore<br />

Associate Software Engineer<br />

Cybergate 1, Robinson’s Pioneer,<br />

Boni Ave., Mandaluyong<br />

Philippines<br />

jennefer.b.jore<br />

@yaccenture.com<br />

Ana Ruby B. Ramos<br />

Associate Software Engineer<br />

Cybergate 1, Robinson’s Pioneer,<br />

Boni Ave., Mandaluyong<br />

Philippines<br />

mam.ana.ruby.r.cordero<br />

@accenture.com<br />

Qurrata-Ayn K. Karim<br />

Ayn_karim@yahoo.com<br />

Erlyn Q. Manguilimotan<br />

Faculty,<br />

Computer Science Dept.<br />

College of Science and Information<br />

Technology<br />

Ateneo De Zamboanga University<br />

erlynqm@yahoo.com<br />

Ebony C. Domingo<br />

Chairperson,<br />

Computer Science Dept.<br />

College of Science and Information<br />

Technology<br />

Ateneo De Zamboanga University<br />

domingoeboc@yahoo.com<br />

ABSTRACT<br />


Grammar acquisition is an important part of language acquisition<br />

and learning for human beings. Many projects have been designed<br />

to assist in the grammar development of people by having<br />

automated checking of grammars both for fixed word order and<br />

free word order languages. The Filipino language is a free word<br />

order language. It exhibits the problem of discontinuous<br />

constituents. Several approaches used to treat this problem use a<br />

hierarchical syntactic structure that results in parsing and<br />

processing delays. One approach that also treats this problem,<br />

the Tagalog Free-Word Order (TagFWO) Parser, uses a flat<br />

syntactic structure. This approach is able to solve the problem of<br />

discontinuous constituents syntactically. However, the semantic<br />

side is not treated by this approach. The aim of this research then<br />

is to develop a system that evaluates a Filipino verbal sentence by<br />

checking the syntactic structure and semantic relation of the<br />

constituents of the sentence.<br />

The Automated Filipino Verbal Sentence Evaluator is a system<br />

capable of evaluating Filipino verbal sentences based on their<br />

grammar. It uses a Parser in checking the grammar structure, and<br />

the Lexical Functional Grammar (LFG) formalism for the<br />

grammar relation. Grammar structure takes into account the<br />

syntax of the sentence by verifying if such sentence structure<br />

is valid in the system. Grammar relation considers the functional<br />

relationship of each constituent in the sentence by checking if the<br />

doer in the sentence has the capability to do such action.<br />

The system is trained on a set of 33 Filipino verbal and non-verbal<br />

sentences (grammatical and ungrammatical). The results<br />

showed that the grammatical verbal sentences were all evaluated<br />

properly with their corresponding detailed user evaluation<br />

feedback. The grammatical non-verbal and ungrammatical<br />

sentences were rejected, each with a corresponding error<br />

message.<br />

The method developed in this research has resolved issues on<br />

syntactic and semantic relations in Tagalog verbal sentences.<br />

However, the issues of lexical ambiguities and deeper semantic<br />

interpretations have not yet been included in this research. This<br />

study can further be enhanced to embrace a more complex verbal<br />

system of the Filipino language considering other parts of speech.<br />

General Terms<br />

Algorithms, Languages.<br />

Keywords<br />

Grammar checker, Filipino, natural language processing, artificial<br />

intelligence, text processing.<br />

1. INTRODUCTION<br />


Language systems consist of words arranged in certain learned<br />

ways (grammar and syntax). Internationalization of language<br />

systems is developed through recognition of syntactic structures<br />

or grammar of a language. Syntactic analysis is the process of<br />

determining the syntactic structure of a sentence according to<br />

grammar rules. This analysis is vital for the recognition of the<br />

grammatical correctness of a sentence [14].<br />

Syntactic analysis is subdivided into structure and relation. One<br />

major application of structural syntactic analysis is parsing. This<br />

method is the decomposition of scanned tokens in an input stream<br />

(a sentence in a language) into components based on phrase<br />

structure grammar rules. Grammar is defined as a system of rules<br />

and principles that determine the formal, legal and semantic<br />

properties of sentences [5] and the description of the signals<br />

which lead to the understanding of a language. Most studies<br />

conducted in the field of syntactic analysis are concentrated on<br />

parsing algorithms.<br />

Parsing algorithms are given higher priority than grammatical<br />

relations because researchers in this field are seeking a universal<br />

model on syntax for both free and fixed word order languages.<br />


However, fixed word order languages are considered in most<br />

investigations [9].<br />

Fixed word order languages are languages that have a strict<br />

ordering of constituents [3] and are said to be configurational,<br />

while free word order languages do not follow any rule for the<br />

ordering of the constituents and are said to be non-configurational.<br />

Non-configurational means the verb, as the head of the sentence<br />

structure, along with the other constituents in the structure can be<br />

treated as sisters [7]. In a configurational setting the verb and<br />

other constituents cannot be treated as sisters. A separate verb<br />

node is required. Current approaches to free word order<br />

languages are based on the configurational approach and thus<br />

result in problems of capturing discontinuous constituents,<br />

which are present in free word order languages [3].<br />

Treatments for this problem are already available. One<br />

approach is the scrambling approach, which involves<br />

transformations of a constituent from its original position to other<br />

positions until the right position is found [3]. However, this<br />

approach creates parsing delays due to searching of the adjacent<br />

constituents. Another approach is the sortal hierarchy of types,<br />

which was modeled using the German language [11].<br />

Unfortunately, this approach cannot be applied to Filipino<br />

because it is unsuitable for representing adjacent constituents.<br />

Another approach is the discontinuous dependency parsing which<br />

was applied to Russian and Latin [1]. This approach is applicable<br />

to other languages; however, it is time consuming. It backtracks<br />

and finds an alternative solution, and thus exhibits non-determinism<br />

due to the lack of a predictive capability [3].<br />

One study in the Philippines on syntactic analysis is called<br />

Tagalog Free Word Order (TagFWO) Parser by Editha D.<br />

Dimalen [3]. TagFWO Parser is a web-based implementation of a<br />

new technique to address the problem of discontinuous<br />

constituents in a free word order language, Tagalog. It uses a flat<br />

syntactic structure, unlike the current approaches, which<br />

use a hierarchical syntactic structure. It uses the concept of Head<br />

Specifier and Head Complement rules to handle the constituency<br />

of the Tagalog language. It is appropriate for the Tagalog language and<br />

requires less computing time than other existing approaches.<br />

However, the above-mentioned approaches are focused on the<br />

syntactic structure of the sentences and less on the grammatical<br />

relations. A study by Kroeger [7] showed the insufficiency of<br />

phrase structure rules to capture the syntactic relations and the<br />

importance played by grammatical relations for the Filipino<br />

language. Filipino is the national language of the Philippines. This<br />

language is characterized as non-configurational. As a non-configurational language,<br />

Filipino does not follow a fixed ordering of words<br />

in sentence constructions. Thus, phrase structure rules are not<br />

considered to be sufficient to address the non-configurationality<br />

of the language.<br />

Syntactic relationships and grammatical relations in Filipino are<br />

signified by case markings and verbal affixations [8]. These<br />

syntactic attributes contribute to working out what is meant in<br />

a sentence. The affixations in the Filipino language signify<br />

semantic criteria and categories. Phrasal structures do not succeed<br />

in understanding lexical structures in words but only<br />

componential functions within phrases [6].<br />

Grammar formalism is needed in order to capture syntactic<br />

relationships and grammatical relations of each constituent in a<br />

sentence in any natural language like Filipino. The Lexical<br />

Functional Grammar (LFG) is able to capture both of these<br />

syntactic attributes. Dimalen [3] made use of Head-driven Phrase<br />

Structure Grammar (HPSG) formalism. However, according to<br />

the author, LFG is simpler while retaining the same capabilities as<br />

HPSG. This research then developed an automated grammar<br />

checker for Filipino verbal sentences using the LFG<br />

formalism.<br />


Filipino verbal sentences are sentences that contain a verb or verb<br />

form in the predicate position. The verbal form of the predicate<br />

determines the role of the noun(s) in the sentence. This depends<br />

on the affix in the verb, which tells whether the noun is an<br />

actor, object, instrument, etc.<br />

One interesting feature of the Filipino language is its focus<br />

system. This means that the role of the noun in focus is reflected<br />

in the verb. Focus is the feature of a verbal predicate that<br />

determines the semantic relationship between a predicate verb and<br />

its topic [12]. There are two types of focus that occur in a basic<br />

Filipino sentence: Actor-focus, where the focus is on the actor or doer,<br />

and Goal-focus, where the focus is not on the actor. There are different<br />

classes of goal-focus. However, Schachter and Otanes [13]<br />

pointed out that only two of these classes are found in basic<br />

Filipino sentences: Object-focus and Directional-focus. The use of<br />

these different focus types is based on the affixes of the verb.<br />

The verb is formed through the use of affixes. An affix is a way of<br />

packaging some extra information into a word. Filipino uses<br />

affixes to indicate the tense of a verb, that is, whether an action is<br />

completed or not. In addition to this, Filipino uses affixes to<br />

indicate the role of the focus of the sentence. In other words,<br />

affixes are used to determine what the focus is doing in the<br />

sentence.<br />

2.1 LFG as a grammar checker<br />

A grammar checker was developed to address problems of<br />

word order, subject-verb agreement and pragmatically incorrect<br />

constituent orders in German sentences. This project made use of<br />

LFG, supplemented with rule components for the analysis of<br />

ungrammatical input. LFG is composed of a constituent structure<br />

containing the linear hierarchical constituent order and a functional<br />

structure representing functional relations and grammatical<br />

features by means of attribute-value matrices. With a rule-based<br />

grammar checker using LFG, this project was able to parse<br />

unrestricted input and correctly identify errors. However,<br />

the orthographic and morphological errors identified are still<br />

unresolved [4].<br />

LFG has two structures for representing different levels of<br />

linguistic information: constituent structure (c-structure) and the<br />

functional structure (f-structure). The c-structure in LFG<br />

represents the external structure of a sentence in the form of a<br />

phrase structure tree [15]. It shows the syntactic constituents of<br />

the sentence. It relies on the grammar rules defined by the LFG. It<br />

is the more concrete level of linear and hierarchical organization<br />


of words into phrases [2]. It contains lexical and functional<br />

categories. A sample c-structure is shown in Figure 1 applying<br />

phrase structure rules for the sentence “natulog ang bata”.<br />

Figure 1. Sample c-structure with Functional Schemata<br />

The functional schemata (↑SUBJ) = ↓ and ↑ = ↓ show in symbols<br />

the role each string plays in a sentence [10]. The f-structure<br />

does not have a direct mapping from the c-structure. It is<br />

constructed through instantiation. Thus, the arrow symbols assume<br />

referential values that point to their own node (↓) and to the node that<br />

immediately dominates them (↑) [10].<br />

The f-structure models the internal structure of a language and the<br />

functional roles of each constituent or word order in producing<br />

the meaning of the sentence [2]. Each word is designated a set of<br />

categories like subject, object, topic, focus, aspect, case, number,<br />

gender, and other important lexical attributes. The following shows how the<br />

f-structure checks the grammaticality of the sentence “bumili ang bata ng isda” (Figure 2).<br />

Figure 2. Sample f-structure<br />

The verb bumili is considered to be in actor focus since it has the<br />

affix um, thus making the actor the subject of the sentence.<br />

The determiner ang marks the subject. The noun bata, which<br />

is preceded by the determiner ang, together with the focus, signifies the term<br />

as the subject. Thus, from the relationship of these three<br />

constituents alone, the f-structure can immediately identify the subject.<br />

As a rule, actor focus requires an actor to make the sentence complete. The<br />

object is optional in the sentence. However, in this sentence, an<br />

object phrase is included. To check if the phrase is an object, the<br />

determiner ng is checked after the subject. The noun isda, which is<br />

preceded by ng, is thus identified as the object. The f-structure does<br />

not rely only on checking the subject and doer; it also<br />

checks for the capability of the doer to do the task, which is the<br />

verb. It checks the lexical entry of the verb to see if such an object is<br />

accepted by it. It also checks the relationship between the two<br />

nouns through the verb. Since the verb accepts an object and the<br />

doer has the capability to do the action based on the lexical<br />

entry, the f-structure considers this sentence grammatically<br />

correct.<br />


The overall flow of the system is shown in Figure 3. An input<br />

sentence is passed on to the Lexical Analyzer module. There are<br />

three applications that process the sentence in this module. The<br />

first application is called Tokenization, which separates each word<br />

of the sentence as a unique entity called a token. Once tokenized,<br />

the first token, which should be the verb, is passed on to the<br />

second application, called Word Stemming. This application<br />

determines the root word of the verb by extracting the affixes. At<br />

the same time, it checks the validity of the root word form using<br />

the lexicon. Based on the extracted affix, the focus type of the<br />

sentence can be determined [13]. The remaining tokens are also<br />

checked to see if each word exists in the lexicon. The final application<br />

for this module is Tagging. Each token is tagged with the proper<br />

part-of-speech tag, and the tags are passed on to the parser.<br />

Figure 3. Architectural Design<br />
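
The three Lexical Analyzer steps (Tokenization, Word Stemming, Tagging) can be illustrated with a minimal Python sketch. This is not the system's actual implementation: the toy lexicon, the tag names (`VERB_ROOT`, `DET_SUBJ`, `DET_OBJ`) and the function names are assumptions made for illustration, and only the affix um and its actor-focus reading are taken from the text.

```python
# Toy lexicon for illustration only; the paper's actual lexicon
# (16 verbs and 38 nouns) is not reproduced here. Tag names are assumed.
LEXICON = {
    "bili": "VERB_ROOT",
    "bata": "NOUN",
    "isda": "NOUN",
    "ang": "DET_SUBJ",
    "ng": "DET_OBJ",
}

# Only the um -> actor focus pairing is stated in the text.
AFFIX_FOCUS = {"um": "actor"}


def tokenize(sentence):
    """Tokenization: split the sentence into lowercase word tokens."""
    return sentence.lower().split()


def stem_verb(verb):
    """Word Stemming: strip the affix um and validate the root in the lexicon.

    Returns (root, affix), or None if no known root is found.
    """
    candidates = []
    if verb.startswith("um"):      # um- prefixed to a vowel-initial root
        candidates.append(verb[2:])
    if verb[1:3] == "um":          # -um- infixed after the first consonant
        candidates.append(verb[0] + verb[3:])
    for root in candidates:
        if LEXICON.get(root) == "VERB_ROOT":
            return root, "um"
    return None


def analyze(sentence):
    """Run tokenization, stemming of the first token, and tagging."""
    tokens = tokenize(sentence)
    if not tokens:
        raise ValueError("empty sentence")
    stemmed = stem_verb(tokens[0])
    if stemmed is None:
        raise ValueError(f"first token is not a recognized verb: {tokens[0]!r}")
    root, affix = stemmed
    tagged = [(tokens[0], "VERB")]
    for token in tokens[1:]:
        if token not in LEXICON:
            raise ValueError(f"word not in lexicon: {token!r}")
        tagged.append((token, LEXICON[token]))
    return {"root": root, "focus": AFFIX_FOCUS[affix], "tagged": tagged}
```

Calling `analyze("Bumili ang bata ng isda")` yields the root `bili`, the focus `actor`, and one tag per token. The sketch does no punctuation handling, so a trailing period would be treated as part of the last word.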

The parser verifies the grammar structure through the grammar rules<br />

specified in the system. It is the syntactic structure that is<br />

evaluated by the parser first through the grammar syntax rules<br />

provided by the system. The semantic side is evaluated by Lexical<br />

Functional Grammar (LFG).<br />
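
A structure check of this kind might look like the following sketch. The system's real grammar rules are those of Figure 4 and are not reproduced here; the rule used below (a verb followed by determiner-noun phrases in any order, with exactly one ang-phrase and at most one ng-phrase) is an assumption based only on the actor-focus example discussed earlier, and the tag names are hypothetical.

```python
# Hypothetical mapping from determiner tags to grammatical functions.
PHRASE_OF_DET = {"DET_SUBJ": "SUBJ", "DET_OBJ": "OBJ"}


def parse_flat(tags):
    """Check a tag sequence against an assumed flat rule:
    VERB followed by DET+NOUN phrases in any order.

    Returns the list of phrase functions in surface order,
    or None if the sequence is rejected.
    """
    if not tags or tags[0] != "VERB":
        return None
    rest = tags[1:]
    if len(rest) % 2 != 0:
        return None
    phrases = []
    for i in range(0, len(rest), 2):
        det, noun = rest[i], rest[i + 1]
        if det not in PHRASE_OF_DET or noun != "NOUN":
            return None
        phrases.append(PHRASE_OF_DET[det])
    # Assumed constraint for actor focus: one subject, object optional.
    if phrases.count("SUBJ") != 1 or phrases.count("OBJ") > 1:
        return None
    return phrases
```

Because the phrases after the verb are treated as sisters, both `VERB DET_SUBJ NOUN DET_OBJ NOUN` and `VERB DET_OBJ NOUN DET_SUBJ NOUN` are accepted, which reflects the free word order of the language.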

LFG evaluates the semantics of the sentence by means of<br />

grammatical relations. Each word in the sentence has its<br />

respective lexical information defined in the lexicon. After LFG<br />

evaluates the sentence, the system outputs user feedback that states the<br />

evaluation process of the system.<br />
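
The grammatical-relation check can be sketched as a small attribute-value structure plus a lookup in the lexical entries. The lexical features below (`doer_needs`, `takes_object`, and the capability sets on nouns) are hypothetical stand-ins for the system's lexicon, and the sketch covers actor-focus sentences only, where the subject is the doer.

```python
# Hypothetical lexical entries; the real lexicon's features are not given in the text.
VERBS = {
    "bili": {"doer_needs": "human", "takes_object": True},
    "tulog": {"doer_needs": "human", "takes_object": False},
}
NOUNS = {
    "bata": {"human"},
    "isda": {"animal"},
    "bato": set(),
}


def build_f_structure(root, focus, phrases):
    """Build a flat attribute-value structure from (function, noun) pairs."""
    f = {"PRED": root, "FOCUS": focus}
    for function, noun in phrases:
        if function in f:
            raise ValueError(f"duplicate grammatical function: {function}")
        f[function] = noun
    return f


def evaluate(f):
    """Return a list of error messages; an empty list means the sentence is accepted."""
    verb = VERBS[f["PRED"]]
    errors = []
    subj = f.get("SUBJ")
    if subj is None:
        errors.append("actor focus requires a subject")
    elif verb["doer_needs"] not in NOUNS[subj]:
        errors.append(f"'{subj}' cannot perform '{f['PRED']}'")
    if "OBJ" in f and not verb["takes_object"]:
        errors.append(f"'{f['PRED']}' does not accept an object")
    return errors
```

For “bumili ang bata ng isda”, `build_f_structure("bili", "actor", [("SUBJ", "bata"), ("OBJ", "isda")])` passes with no errors, while replacing the subject with `bato` produces a capability error that could be shown as user feedback.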


The Filipino verbal sentence is the main subject of this research. The<br />

following rules, adopted from different Balarilang Filipino<br />

books, were used as a basis for determining grammatically correct<br />

and wrong Filipino verbal sentences.<br />

Figure 4. Grammar Rules<br />

This research initially made use of 16 verbs and 38 nouns chosen<br />

randomly from the Handbook of Tagalog Verbs by Teresita V.<br />

Ramos [12]. These were made part of the lexicon. The system<br />

was tested and evaluated using different Filipino verbal and non-verbal<br />

sentences. There were seven (7) grammatically correct<br />

Filipino verbal sentences that were successfully evaluated by the<br />

system. Taking all the possible orderings of the 7 sentences<br />

resulted in thirty-three (33) combinations in all due to free word<br />

ordering. The system was able to evaluate the sample<br />

sentences. Grammatically correct verbal sentences were<br />

acknowledged with a detailed evaluation as an output of the<br />

system, while grammatically wrong sentences were also<br />

acknowledged and given the necessary information on why they are<br />

incorrect.<br />
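
How one base sentence expands into several test orderings can be sketched as follows. The sketch assumes the verb stays in first position and only the phrases after it are permuted; the paper does not list the seven sentences or state exactly how the 33 combinations were derived, so the function name and the ordering rule are illustrative assumptions.

```python
from itertools import permutations


def orderings(verb, phrases):
    """Generate every ordering of the post-verbal phrases, keeping the verb first."""
    return [" ".join((verb,) + order) for order in permutations(phrases)]
```

For example, `orderings("bumili", ["ang bata", "ng isda"])` gives `bumili ang bata ng isda` and `bumili ng isda ang bata`. A sentence with three post-verbal phrases would give six orderings under the same assumption.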


The Automated Filipino Verbal Sentence Evaluator has resolved<br />

the issues on syntactic and semantic relations. However, the<br />

issues of lexical ambiguities and deeper semantic interpretations<br />

have not yet been included in this research. With LFG’s<br />

ability to employ semantic relation rules, it is possible to<br />

resolve the issues of lexical ambiguities and deeper semantic<br />

interpretations. However, this requires changes in the semantic<br />

rules and is subject to further investigation.<br />

This study can further be enhanced to embrace a more complex<br />

verbal system of the Filipino language. Other Filipino parts of<br />

speech may also be considered as an additional scope to the study.<br />

In line with this, an automated Filipino essay evaluator can be<br />

developed through such advanced studies.<br />

Kroeger [7] has said that Philippine-type languages exhibit<br />

structural similarities. This means that it is possible for the system<br />

to also be used for other Philippine languages, requiring only<br />

additional entries in the lexicon. Moreover, this research<br />

contributes to the field of Natural Language<br />

Processing, especially to the research and studies<br />

conducted for the Filipino language.<br />


REFERENCES<br />

[1] Covington, M. Discontinuous Dependency Parsing of Free<br />

and Fixed Word Order. Available:<br />

http://www.ai.uga.edu/ftplib/ai_reports/reports.txt, 1994.<br />

[2] Dalrymple, M. A Lexical Functional Grammar. Available:<br />

http://users.ox.ac.uk/~cpgl0015/lfg.pdf, 2001.<br />

[3] Dimalen, E. Algorithm for Constituent Structures of Tagalog.<br />

MS Thesis, De La Salle University Professional Schools, Inc.,<br />

Manila, Philippines, 2003.<br />

[4] Fortmann, C. and Forst, M. An LFG Grammar Checker for<br />

CALL. Available: ftp://www.ims.uniuttgart.de/pub/Users/forst/Fortmann:Forst-ICALL04.pdf<br />

[5] Fries, P. The 31st International Systemic Functional<br />

Congress. Doshisha University, Kyoto, Japan.<br />

Available: http://www1.doshisha.ac.jp/~mtatsuki/ISFC31/pages/abstract_plenary.pdf, 2004.<br />

[6] Koopman, H., Sportiche, D. and Stabler, E. An Introduction<br />

to Syntactic Analysis and Theory. Available:<br />

http://www.linguistics.ucla.edu/people/sportiche/isat.pdf,<br />

2002.<br />

[7] Kroeger, P. Phrase Structure and Grammatical Relations in<br />

Tagalog. Dissertations in Linguistics. Stanford, CA: Center<br />

for the Study of Language and Information. xiv, 240 p., 1993.<br />

[8] Lupyan, G. Modelling Syntactic Devices: An Explanation of<br />

Language Evolution from Connectionist and Memetic<br />

Perspectives.<br />

Available: http://www.isr/uiuc.edu/~amag/langev/paper/lupyan02modeling.html, 2002.<br />

[9] Maegard, B. Machine Translation.<br />

Available: http://www.cs.uregina.ca/Research.Techreports/9509.ps, 2002.<br />

[10] Manguilimotan, E. Syntactic Representation of<br />

Tausug Verbal Sentences. MS Thesis, MSU-Iligan Institute<br />

of Technology, Iligan City, Philippines, 2001.<br />

[11] Oliva, K. The Proper Treatment of Word Order in HPSG. In<br />

Proceedings of the 14th International Conference on<br />

Computational Linguistics, Nantes.<br />

Available: http//www.acl.ldc.upenn.edu/C/C92?c92-1031.pdf, 1992.<br />

[12] Ramos, T. Handbook of Tagalog Verbs. University of Hawaii<br />

Press, 320 pp., 1986.<br />

[13] Schachter, P. and Otanes, F. Tagalog Reference Grammar.<br />

University of California Press, Berkeley, CA, 1972.<br />

[14] Tablante, N. The Predictive Value of Knowledge in<br />

Grammar in the Writing Proficiency of the Freshmen<br />

Engineering Students, 1997.<br />

[15] Wong, S. Lexical Functional Grammar.<br />

Available:<br />

http://www.fi.muni.cz/usr/wong/teaching/mt/notes/node15.html.iso-8859-1, 2001.<br />

