22.01.2015 Views

Military Communications and Information Technology: A Trusted ...


is being carried out through social media [11]. The extensive presence of terrorist activity on the internet underlines the urgency of monitoring internet data for security and defense purposes.

As the use of web technologies grows, so does the information content of the web. The probability that information relevant for military purposes is present on the internet is therefore increasing as well. Hence, it is essential to be able to extract information not only from traditional intelligence sources but also from the internet. Due to the vast quantity of possibly important documents, intelligence tools are needed [1]. These tools must help analysts to efficiently and rapidly retrieve relevant documents from the internet and from collected data sources, translate them if necessary, extract critical information, and analyze it.

III. Natural language processing for intelligence purposes<br />

Natural language processing (NLP) is an active research field that combines computer science with linguistics. It applies techniques from computer science to process natural language text or speech. As this paper focuses on the processing of text, only “text” will be used in the rest of this paper, although in all cases NLP research exists that also looks into the corresponding processing of speech.

In general, NLP approaches are based either on rules or on machine learning. Rule-based approaches apply a set of hand-written rules that indicate how to process the input text. Machine learning is a technique from the field of artificial intelligence (AI). NLP approaches that are based on machine learning usually apply statistical models to analyze the input. Such models are automatically learned (or trained) from example data (training data). After training, the system is able to analyze new input. That is, the system derives the most probable analysis based upon the trained statistical model. For example, in statistical machine translation (SMT), translations are generated according to the translation model. This model is trained on a parallel corpus, a collection of texts that are translations of each other in the two languages of interest. During training, the system learns how to translate source language text into target language text based on the probability distribution of the training data. The trained SMT system is then able to find the most probable translation in the target language given the source language input text.
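The idea of training a translation model and then choosing the most probable translation can be illustrated with a minimal sketch. It rests on strong assumptions that are not part of the paper: the toy corpus `ALIGNED_PAIRS` is hypothetical and already word-aligned, translation is done word by word, and the function names `train` and `translate` are chosen for illustration only. Real SMT systems learn alignments from sentence-aligned text and typically combine the translation model with further models, so this is not a description of any actual system.

```python
from collections import Counter, defaultdict

# Hypothetical toy "parallel corpus": word-aligned (source, target) pairs.
# A real parallel corpus is sentence-aligned; word alignments must be learned.
ALIGNED_PAIRS = [
    ("haus", "house"),
    ("haus", "house"),
    ("haus", "home"),
    ("buch", "book"),
    ("klein", "small"),
    ("klein", "little"),
    ("klein", "small"),
]


def train(pairs):
    """Estimate P(target | source) by relative frequency in the training data."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    model = {}
    for source, target_counts in counts.items():
        total = sum(target_counts.values())
        model[source] = {t: c / total for t, c in target_counts.items()}
    return model


def translate(model, sentence):
    """Translate word by word, choosing the most probable target word."""
    output = []
    for word in sentence.lower().split():
        candidates = model.get(word)
        if candidates is None:
            # Unknown source word: pass it through unchanged.
            output.append(word)
        else:
            output.append(max(candidates, key=candidates.get))
    return " ".join(output)


model = train(ALIGNED_PAIRS)
print(translate(model, "klein haus"))  # prints "small house"
```

Note that the sketch always returns the most frequent option seen in training ("house" rather than "home"), which is the most probable output under the model but not necessarily the correct one in a given context.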

Rule-based and statistical NLP techniques each have advantages and disadvantages. Rule-based NLP technology is usually more accurate than statistical approaches, as it processes everything based on rules. Statistical systems always output the most probable result, which is not necessarily the correct result. Statistical approaches, however, have advantages of their own, especially with respect to military applications: in general, statistical systems have higher coverage, tend to be more robust, and can be produced rapidly and more cost-efficiently. Language is highly irregular and dynamic. As a result, it is almost impossible to
