
Analyzing Real-Time Performance Problems in 

Embedded Linux 

Analysis of Problems Concerning the Real-Time Performance of the Kernel in Embedded Linux (translation of the Japanese title)

A DISSERTATION 

SUBMITTED TO THE DEPARTMENT OF COMPUTER SCIENCE 

AND THE COMMITTEE ON GRADUATE STUDIES 

OF WASEDA UNIVERSITY 

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS 

FOR THE DEGREE OF 

DOCTOR OF ENGINEERING 

July 2010 

Ki Duk Kwon

Abstract 

Embedded systems are now used in many fields, including home appliances, smartphones, PDAs, and automobiles. Embedded systems, which are usually designed for a specific purpose, are advancing rapidly together with embedded software technology, and among embedded software technologies, embedded operating systems have changed the most. Even when an embedded system currently operates without any problems, an unexpected error can still occur. No system is one hundred percent perfect, and kernel problems can occur even in commercial embedded systems.

Problems that arise while developing embedded systems can usually be categorized into two groups: user level [59] and kernel level. A user-level problem is not that hard to fix, since there are many tools for development and debugging. In contrast, a kernel-level problem is much more difficult to fix, because tools for kernel development usually provide only minimal functions, and in many cases those functions do not help to fix the problem. Because of these characteristics, it is not easy to propose a solution for a problem that occurs during project development. Moreover, the structure of embedded systems is rapidly becoming more complex. Therefore, a framework for performance measurement and analysis is urgently needed in order to analyze and solve problems in these systems.

In this dissertation, we propose a system architecture called the Kernel Analysis System (KAS), which analyzes the event log of an embedded kernel. KAS quickly identifies problems in the kernel and has three main layers. The first is the Detection Layer, in which KAS finds problems by checking all events that occurred in the kernel and counting their number. The second is the Separation Layer, in which KAS separates the events related to the problems from all executed events. The third is the Analysis Layer, in which KAS analyzes the problems by calculating the running time of every event and the number of error occurrences in order to determine the cause of the problems. KAS cannot fix every problem. So far, KAS has been tested only on the analysis of HRTimer problems, and it cannot yet analyze other problems. However, by applying KAS to the kernel timer, which is one of the most important and difficult areas of the kernel, we show that a developer or administrator can analyze timer problems quickly and efficiently.


Contents

Abstract

1. Introduction
1.1 Motivation
1.2 Challenge
1.3 Contribution
1.4 Outline

2. Background
2.1 Embedded System
2.2 Linux Kernel
2.3 Embedded Linux
2.4 System Monitoring
2.5 Event Log
2.6 Problem Analysis
2.7 Linux Timer

3. Related Work
3.1 Event log
3.1.1 Event Logging
3.1.2 Event Log Monitoring
3.2 System Monitoring
3.3 Performance Analysis Tools
3.4 Linux Trace Toolkit next generation

4. System Framework
4.1 Introduction
4.2 Kernel Analysis System
4.2.1 Detection Layer
4.2.1.1 When developing Embedded System
4.2.1.2 Used by users
4.2.1.3 Process Flow of DL
4.2.2 Separation Layer
4.2.3 Analysis Layer
4.3 Kernel Analysis System Algorithm
4.3.1 Important function and parameter
4.4 Trace Point and Event Log
4.5 LTTng and KAS
4.6 Summary

5. Case Study
5.1 Timer Latency
5.2 Preemptive vs. Non-preemptive
5.2.1 Preemptive Kernel
5.2.2 Non-preemptive Kernel
5.3 High Resolution Timer
5.4 Latency Policy
5.5 Evaluation
5.5.1 Result of KAS
5.5.1.1 Result of DL
5.5.1.2 Result of SL
5.5.1.3 Result of AL
5.5.2 Analysis of HRTimer Latency
5.6 Summary

6. Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
6.2.1 Real-Time Architecture of KAS
6.2.2 KAS and CABI

Appendix
A.1 RTOS
A.2 RT-Linux
A.3 Real-Time Scheduling
A.4 CABI

Bibliography

Acknowledgements

Publication List

List of Tables

2.1 General monitoring tools by Resource
4.1 Important parameters in KAS
4.2 Important functions in KAS
5.1 Result of Detection Layer
5.2 Result of Analysis Layer
5.3 Result of Analysis Layer

List of Figures

2.1 User-space vs. kernel-space
2.2 Process of general event logging
2.3 The execution of local timer soft interruption handler
3.1 Flow chart of the event log
3.2 Example of the free command
3.3 Process viewer for Linux
3.4 Mevalet viewer’s execution
3.5 LTTV viewer’s execution
3.6 Event logging sequence of LTTng
4.1 Normal method of event logging and analysis
4.2 Kernel Analysis System Architecture
4.3 Event log process flow of KAS
4.4 Problem definition
4.5 Process flow of Detection Layer
4.6 Event log separated in SL
4.7 Process flow of Separation Layer
4.8 Process flow of Analysis Layer
4.9 Dependency of each module in KAS
4.10 Pseudo code of KAS
4.11 Parameter and event log
4.12 Relation between trace point and event log
4.13 An example showing the usage of trace point in LTTng and KAS
4.14 Basic trace point offered by LTTng
4.15 Procedure of problem analysis process of LTTng and KAS
5.1 Task Preemption Latency Model
5.2 Priority Task Latency Model
5.3 Process of interrupt of preemptive kernel
5.4 Process of interrupt of non-preemptive kernel
5.5 Hrtimer latency model
5.6 Source code of setitimer
5.7 Setup of setitimer
5.8 Periodic process of HRTimer set to 100 μs
5.9 Source code of DL for line information
5.10 Result of Separation Layer
5.11 Event log of part where delay occurred
5.12 One of the reasons of HRTimer latency
5.13 Result of analysis of HRTimer latency
5.14 Kernel source of softirq modified HRTimer
5.15 Result of experiment on Linux-RT and general Linux in 100 μs
5.16 Result of experiment on Linux-RT and general Linux in 1 ms
5.17 Result of experiment of HRTimer latency in Linux-2.6.24
5.18 Result of experiment of HRTimer latency in Linux-2.6.24-changed-softirq
5.19 Result of experiment of HRTimer latency in Linux-2.6.24-rt-patched
6.1 Real-Time Architecture of KAS
6.2 KAS and CABI
A.1 Classification of real-time scheduling algorithm
A.2 Control the consumptions of the resources by CABI

Chapter 1 

Introduction 

Today, thanks to the rapid development of hardware, both embedded hardware and embedded software have advanced dramatically. In addition, embedded systems are being studied intensively, especially in the context of ubiquitous systems and cyber-physical systems (CPS) [62]. Smartphones [57] and PDA phones, for example, are spreading rapidly. A PDA phone, which is a mobile phone with the functions of a PDA, is equipped with a high-performance CPU and a general-purpose operating system (OS) to provide various functions, including multimedia features. A smartphone, which is a mobile phone offering advanced PC-like capabilities, supports not only the functions of a PDA but also remote control, Internet access, and user-friendly interfaces such as a touch screen and handwriting input. Moreover, since it supports wireless Internet, functions such as e-mail, web browsing, fax, banking services, and games have become available. Some smartphones have already begun to standardize their functions or are equipped with their own OS. Because of these innovations, people prefer smartphones, and the role of the smartphone is growing as it substitutes for the computer.

In terms of hardware, Moore's law is one of the most important laws in the history of computer hardware. It describes a long-term trend in which the number of transistors that can be placed inexpensively on an integrated circuit doubles approximately every two years. Although Moore's law has held for the past several decades, it now seems to be approaching a breaking point. Over the past years the efficiency of semiconductor manufacturing has greatly increased, and the internal structure of semiconductors has become highly integrated, so there is only limited room for further integration. Software, on the other hand, has great potential for development. Different kinds of software can be installed on the same hardware. Moreover, even for the same software, the performance [4] and the errors that occur differ depending on the developer. Software is therefore becoming more important as time goes by.

In general, software consists of the OS, middleware, and applications. Because each group has different features and functions, each is important in its own right. The OS, however, is especially important, since it controls and manages all the hardware of an embedded system so that applications keep running without problems. It is therefore fundamental to computer science. Many OSes, such as Unix, Linux, and Windows, are currently used on personal computers, and there are many OSes for embedded systems as well.

Examples of embedded OSes include Android [53] from Google, iOS4 [54] from Apple, embedded Windows from Microsoft, and embedded Linux. Embedded Linux in particular has been widely used because it retains many powerful features of general Linux, such as multitasking, a variety of network environments, different types of file systems, and system scalability, and because it is provided for free. However, embedded systems have disadvantages compared with general systems: there are not enough skilled developers, and there are many hardware constraints. The embedded systems market demands a fast development cycle, which is hard to achieve because of the lack of skilled developers. Embedded systems have become one of the most important areas in every industry. Therefore, debugging [55] various problems and improving the performance of embedded systems will also be an important area in the near future.

1.1 Motivation 

Recently, embedded OSes have been used in many fields, such as home appliances, mobile phones, and PDAs. There are important reasons why embedded OSes are so widely used. A general-purpose OS must support a wide variety of functions, whereas an embedded OS needs to support only the minimum required functions. Configuring the system with only minimal resources therefore lowers the cost. Moreover, embedded OSes are usually built for a specific purpose. For example, satellite or missile control requires the stability of a real-time system. In this case, a real-time operating system (RTOS) [14] is more suitable than a general-purpose OS.

Some people think that an embedded system can be developed faster than a general system. However, we should not focus only on the rapid application development (RAD) approach of software engineering. Fast development can be valuable, but it sometimes causes huge losses. For instance, an error in an electric power system resulted in a power outage across the whole of New York City. People also tend to think that the quality of a program is tied to the length of the development period. However, if development is delayed beyond the expected period, quality easily degrades, and shortening the development period usually causes many problems as well. Developers therefore need strong analysis skills to complete a project without problems. In this dissertation, we focus on how to trace problems in the kernel and how to solve them efficiently.

1.2 Challenge 

Problems that arise while developing embedded systems can usually be categorized into two groups: the user level [59] and the kernel level. There are many debugging tools for solving a user-level problem. In contrast, a problem at the kernel level [7] is much more difficult to fix than one at the user level, because tools for kernel development usually provide only minimal functions, and in many cases those functions do not help to fix the problem. Moreover, even when an embedded system currently operates without any problems, an unexpected error can still occur. No system is one hundred percent perfect, and kernel problems can occur even in commercial embedded systems. Such problems are often not found during development and can still appear after the system has been commercialized as a product. Furthermore, even a small unexpected error can inconvenience many users and may be life-threatening.

An embedded system project is usually complex and, compared to a general software project, requires developers with a deep understanding of both hardware and software. In addition, an embedded system project is subject to many hardware constraints. Because of these characteristics, it is not easy to propose a solution for a problem that occurs during project development. Various solutions to these problems have recently been proposed.

Among the many approaches to problem analysis, we describe how to use the event log.

• The kernel processes many events, such as memory-related events, system call events, and network-related events, in a very short time. These events help developers analyze problems and suggest solutions. Hence, we redefine the logging information in order to analyze this event information.

• Generally, there are two main ways to analyze event information. The first is to visualize kernel events, which allows a problem to be analyzed approximately but not exactly. The other is to print the event log in text mode, which is very efficient when a developer needs to analyze a problem exactly. We therefore use the advantage of text mode to find problems and suggest solutions.

• Finding problems in kernel mode requires much more time and effort than in user mode [63]. A tool for detecting and fixing kernel problems is therefore very important in embedded system development.

Embedded system development involves many important factors, such as hardware, development time, and development cost. Like these factors, the analysis tool is one of the most important factors in the development environment for embedded systems.


1.3 Contribution 

When developing a project, a variety of SDKs and debugging methods are available [12] [23] [38] [40]. When developing embedded systems, however, developers must solve problems with limited tools and debugging methods [36]. Debugging and performance tuning [41] [56] are an important part of system development. For debugging and performance tuning, the system developer needs the error messages generated by file system errors, network errors [34], hard drive errors, and memory errors in the system. In addition, it is important to identify where the problem happened and what it is. The proposed system, which uses the event log, makes the following contributions.

• We can easily find the problem among huge event logs.

• We can separate the error data from the whole data and analyze only the error data.

• We can analyze the cause of the problem by analyzing the event logs.

By finding and analyzing problems effectively, we aim to suggest solutions to kernel problems and to improve performance.
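The three contributions can be illustrated with a minimal sketch. It is not the KAS implementation: the record format, the event names, and the latency limit are assumptions made only for illustration, and the three functions merely mirror the ideas of detecting, separating, and analyzing problem events.

```python
from collections import Counter
from typing import NamedTuple


class Event(NamedTuple):
    """One event-log record (hypothetical format, not the KAS log format)."""
    timestamp_us: int   # time the event started, in microseconds
    name: str           # event name, e.g. "hrtimer_expire" (made up)
    duration_us: int    # running time of the event, in microseconds


def detect(events, name, limit_us):
    """Detection: count all events and those that exceed the limit."""
    totals = Counter(e.name for e in events)
    late = sum(1 for e in events if e.name == name and e.duration_us > limit_us)
    return totals, late


def separate(events, name, limit_us):
    """Separation: keep only the events related to the problem."""
    return [e for e in events if e.name == name and e.duration_us > limit_us]


def analyze(problem_events):
    """Analysis: summarize running time and number of occurrences."""
    durations = [e.duration_us for e in problem_events]
    if not durations:
        return {"count": 0, "max_us": 0, "mean_us": 0.0}
    return {
        "count": len(durations),
        "max_us": max(durations),
        "mean_us": sum(durations) / len(durations),
    }


# Made-up log: a timer event is treated as a problem above 150 microseconds.
log = [
    Event(0, "hrtimer_expire", 100),
    Event(100, "syscall_entry", 12),
    Event(200, "hrtimer_expire", 340),
    Event(540, "hrtimer_expire", 101),
    Event(641, "softirq_entry", 30),
    Event(700, "hrtimer_expire", 520),
]

totals, late_count = detect(log, "hrtimer_expire", limit_us=150)
report = analyze(separate(log, "hrtimer_expire", limit_us=150))
print(totals["hrtimer_expire"], late_count, report)
```

Keeping the three steps as separate functions means that only the small, separated subset of the log reaches the analysis step, which is the point of the second contribution.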

1.4 Outline 

The dissertation is structured as follows. Chapter 1 presents the research motivation, challenges, and contributions.

In Chapter 2, we explain the background knowledge related to our work. We describe embedded systems, the Linux kernel, and embedded Linux, as well as system monitoring, the event log, and problem analysis.

In Chapter 3, we review and discuss related work, covering tools for system monitoring and event log analysis. In Chapter 4, we propose a new system framework with three main layers, the Detection Layer, the Separation Layer, and the Analysis Layer, and explain each layer in detail. In Chapter 5, we analyze the kernel timer using the proposed Kernel Analysis System, focusing on the high-resolution timer latency problem. Finally, in Chapter 6, we present our conclusions and suggest possible future directions.


Chapter 2

Background

2.1 Embedded System 

With the development of technologies in the fields of electricity, electronics, and computing, many kinds of applied equipment have entered our daily life, for example TVs, refrigerators, microwave ovens, washing machines, cellular phones, computers, PDAs, cyber home care systems in apartments, elevator systems, ATMs, and airport traffic control systems. These technologies are closely tied to our daily life and make it more convenient.

An embedded system [26] is an electronic control system that combines hardware and software. The equipment used in our daily life, such as electronic devices, home appliances, and control units, consists not only of simple electric circuits but also of microprocessors. An embedded system has built-in programs that perform dedicated functions on these microprocessors. Early embedded systems were very simple. They were built around 8-bit or 16-bit controllers, which are still in use. More recently, the embedded system industry has adopted more powerful microprocessors and digital signal processing (DSP) chips, and embedded OSes have become necessary to control these larger systems.

Early embedded systems ran a sequential program without an OS and left the sequential program only when an interrupt occurred. There was therefore no need for an OS, which would only have wasted system resources. Recently, however, embedded systems have grown larger, and networking, multimedia, and similar features have increased their complexity, so it is hard to run them as a single sequential program. These changes make an OS necessary in embedded systems, and because such systems cannot ignore real-time requirements, they use real-time OSes. The number of products that adopt a real-time OS is increasing, and in practice many embedded systems use a real-time OS chosen according to their purpose.

2.2 Linux Kernel 

Linux is a member of the large family of Unix-like operating systems. A relative newcomer that experienced sudden, spectacular popularity starting in the late 1990s, Linux joins such well-known commercial Unix operating systems as System V. Linux was initially developed by Linus Torvalds in 1991 as an operating system for IBM-compatible personal computers based on the Intel 80386 microprocessor. Linus remains deeply involved in improving Linux, keeping it up to date with hardware developments and coordinating the activity of hundreds of Linux developers around the world. The Linux kernel resides in memory and manages memory, processes, and I/O devices. Every system has a kernel, and the performance of the kernel affects the performance [4] of the whole system. The kernel is therefore important in the embedded system industry as well.

The most important feature of the Linux kernel is that users can modify it themselves. Like other Linux programs, the kernel is distributed as source code and can be obtained through distribution packages, FTP, or BBS user groups. The build environment is easy to set up with a few well-made scripts, and documentation is easy to find on the Internet. The open source policy is one of the reasons why the Linux kernel and its user groups have made a quantum leap. The openness of Linux makes a fast and robust kernel possible, while other OSes are constrained by commercial interests.

The strengths of the Linux kernel are as follows [24].

• No royalty: Linux can be downloaded from the Internet free of charge, which decreases development costs.

• Open source: the OS can be extended.

• Stability: the Linux system is stable, so the possibility of errors is low.

• Portability: Linux can be used on a wide variety of hardware.

• Security: the Linux security model is based on the UNIX security model, which is known for its robustness and proven quality.

• Immediate modification is possible when kernel bugs occur.


The weaknesses of the Linux kernel are as follows.

• Limited development environment.

• A large number of different Linux distributions.

• Uncertainty about whether open source products can be trusted.



The Linux kernel operates in two major modes, user mode and kernel mode, as shown in Figure 2.1. The first is user space, where applications run. The second is kernel space, where kernel modules and device drivers run. User space and kernel space communicate through interfaces such as system calls and ioctl(), and kernel space and the hardware communicate through hardware interfaces and protocols.

Figure 2.1: User-space vs. kernel-space


2.3 Embedded Linux 

An embedded OS has to provide a development environment that includes middleware, libraries, development tools, and tools for analyzing kernel problems. Linux is a commonly used OS among embedded OSes. Nevertheless, embedded systems such as cellular phones and real-time products have used an RTOS because of their timing constraints, as in hard real-time systems. However, as the performance of embedded systems has improved rapidly, systems based on an RTOS have reached their limits. Linux with strengthened real-time characteristics is therefore attracting attention again.

Embedded Linux simply means Linux used in an embedded system. Early embedded Linux was developed for systems with small memory and low-performance processors. Its size and functions were therefore minimized and customized so that it could fit into small memory. These conditions are essential characteristics of embedded Linux. Nevertheless, embedded Linux has been applied in a wide range of products.

There are many reasons why Linux is in the spotlight in the embedded system industry. The three main reasons are as follows. First, there are no royalty or licensing costs. The open source license is one of the reasons for the position Linux holds today. Second, it supports functions that an RTOS cannot support on devices such as smartphones and PDAs. Embedded devices are gradually changing in response to demands for larger memory, wireless Internet, hard disks, and so on. Demands that an RTOS could not meet, such as safety, rich graphical user interfaces (GUIs), memory security, and support for personal information, will keep growing. When developing an embedded system with Linux, there is no need to implement these functions from scratch, because application libraries and device drivers from Linux can be reused. Third, it is more flexible than other embedded OSes, because programmers can choose their own development and debugging environment [55].

However, embedded Linux has a weakness in stability because it undergoes no official quality testing. There is also a lack of embedded system developers. Developing products with embedded Linux requires engineers such as device driver developers, embedded application developers, and GUI program developers.

Even though embedded Linux is smaller and lighter than general Linux, its kernel is larger than that of an RTOS, which used to make embedded Linux difficult to use in embedded systems. However, thanks to high clock speeds, recent embedded systems deliver performance similar to that of a Pentium computer. Embedded Linux is therefore becoming more useful and practical.

2.4 System Monitoring 

In general, system monitoring [44] [58] means finding problems in a system. System monitoring tools show how efficiently the kernel uses system resources, or why a problem has occurred in the CPU [61], memory, disk I/O, network, and so on. However, finding problems through system monitoring is not simple. For example, consider a disk problem and the checklist needed to analyze it.

• How much disk space remains?

• How many I/O operations are performed per second?

• How many of the I/O operations are reads, and how many are writes?

• How much data is read and written?

These questions cover only a very small part of system monitoring; studying the behavior of a disk drive requires much broader monitoring.
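The disk checklist above can be answered from two samples of the kernel's disk counters. The sketch below is only an illustration under stated assumptions: it parses text in the /proc/diskstats layout (reads completed, sectors read, writes completed, and sectors written in 512-byte sectors), the two sample lines are made up rather than measured, and the function names are hypothetical.

```python
import shutil

SECTOR_BYTES = 512  # /proc/diskstats reports sectors in 512-byte units


def parse_diskstats(text):
    """Parse /proc/diskstats text into per-device read/write counters."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or malformed lines
        stats[fields[2]] = {
            "reads": int(fields[3]),
            "bytes_read": int(fields[5]) * SECTOR_BYTES,
            "writes": int(fields[7]),
            "bytes_written": int(fields[9]) * SECTOR_BYTES,
        }
    return stats


def io_per_second(before, after, interval_s):
    """I/O operations per second between two samples of one device."""
    ops = (after["reads"] - before["reads"]) + (after["writes"] - before["writes"])
    return ops / interval_s


# Two made-up samples of one device, assumed to be taken 2 seconds apart.
sample_t0 = "8 0 sda 1000 10 80000 500 400 5 16000 300 0 700 800"
sample_t1 = "8 0 sda 1100 10 88000 550 500 5 24000 380 0 790 930"

t0 = parse_diskstats(sample_t0)["sda"]
t1 = parse_diskstats(sample_t1)["sda"]

usage = shutil.disk_usage(".")  # free space on the current file system
rate = io_per_second(t0, t1, 2.0)
print("free bytes:", usage.free)
print("I/O per second:", rate)
print("reads:", t1["reads"] - t0["reads"], "writes:", t1["writes"] - t0["writes"])
print("bytes read:", t1["bytes_read"] - t0["bytes_read"])
print("bytes written:", t1["bytes_written"] - t0["bytes_written"])
```

On a Linux system the sample strings could be replaced by the contents of /proc/diskstats read at two points in time; tools such as iostat report the same kind of differences.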

The following monitoring programs are used in practice. Most of the monitoring programs in Table 2.1 are installed automatically when Linux is installed. On the other hand, there are many monitoring tools that are not installed automatically, such as sar, iostat, nmap, netcat, and ntop.

Table 2.1: General monitoring programs by resources

Resource    Monitoring programs
CPU         top, ps, uptime, vmstat, pstree, iostat, sar
Memory      free, vmstat, sar
Disk I/O    df, du, quota, iostat, sar
Network     ping, netstat, traceroute, tcpdump, nmap, netcat, ntop
File        lsof

In general, system monitoring [58] should be performed regularly while the system is working normally. The system administrator can then analyze problems easily and quickly when they occur. In other words, system monitoring is necessary even for systems without problems.


2.5 Event Log 



In general, an event log [35] is a record of events produced while the Linux kernel is running. The events are recorded in sequential order, and network information is recorded as well. In brief, the log provides the facts of when, where, what, who, and why. Event logs provide a basis for analyzing problems and can also be used to prevent problems before they occur. In addition, event logs are used for real-time problem verification and for verifying network status. For example, if a Linux system goes down in the middle of an operation, all the work in progress is lost. Event logs help to explain what happened and to prevent such problems from recurring.

In general, the analysis of event logs proceeds as follows.

• Collection: collect logs by various methods.

• Storage: transmit the events to one place and save them.

• Analysis: analyze the events by various methods.

• Finding the causes: find the causes of problems on the basis of the data analysis.

Figure 2.2 shows the most general way to log events with an event logging tool. The log server gathers the event information (such as network events, system calls, and interrupts).
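The collection, storage, and analysis steps described above can be sketched in a few lines. This is a minimal illustration, not the behavior of any particular logging tool: the "timestamp event" line format, the source names, and the event names are assumptions made for the example.

```python
import heapq
from collections import Counter


def collect(source_name, lines):
    """Collection: parse 'timestamp event' lines from one source."""
    for line in lines:
        timestamp, event = line.split(maxsplit=1)
        yield (float(timestamp), source_name, event.strip())


def store(*sources):
    """Storage: merge time-ordered sources into one ordered log."""
    return list(heapq.merge(*sources))


def analyze(log):
    """Analysis: count how often each event type occurs."""
    return Counter(event for _, _, event in log)


# Made-up input from two hypothetical sources, each already in time order.
kernel_lines = ["0.001 irq_entry", "0.004 syscall_entry", "0.009 irq_entry"]
network_lines = ["0.002 packet_rx", "0.003 packet_drop", "0.008 packet_drop"]

log = store(collect("kernel", kernel_lines), collect("network", network_lines))
counts = analyze(log)

# Finding the causes: look at the entries of one suspicious event type.
suspects = [entry for entry in log if entry[2] == "packet_drop"]
print(counts)
print(suspects)
```

Merging by timestamp plays the role of the log server: events from different sources end up in one place and in one order, so events that happened close together can be compared directly.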

The event log is necessary for finding the cause of problems and devising solutions, but it is difficult to analyze, because each tool produces a different type of log and the amount of logged event information is huge. For the same reason, analyzing the logs takes a lot of time.


Figure 2.2: Process of general event logging

2.6 Problem Analysis

Professional developers advise spending more time reading code than writing it. In other words, more time and effort should go into improving, reviewing, and debugging code than into writing it. Unless the work is a simple hobby or homework, problem analysis is a very important task.

In the early UNIX period, every system programmer was also a system manager, and their work was the same. These days, however, the work is divided into separate fields. The strength of this division is that each person can focus on a specialized field, whereas the weakness is that problems spanning several fields are hard to analyze. A good problem investigation technique strikes a proper balance among the demand for a fast solution, the improvement of skills, and the efficient use of experts. When a problem occurs, information about it must be collected and recorded.

The information to record is as follows.

• The exact time the problem occurred

• Dynamic operating system information

• What we were doing when the problem occurred

• A problem description

• Anything that may have triggered the problem

• Technical investigation (symptom and cause)



The symptom in the technical investigation is the external evidence of a problem. Symptoms are classified into the following five categories [16].

• Error

• Crash

• Hang (or very slow performance)

• Performance problem

• Unexpected behavior/output

Once this information has been collected and the problem has been classified, the problem becomes easier to solve.

2.7 Linux Timer 

The Linux kernel timer has two main tasks.

• To keep time accurately.
• To manage deadlines.

The Linux kernel drives its timer functions with periodic timer interrupts. The deadline
management function is especially useful. For instance, it is used for retransmission in
networks, for retrying devices that do not respond, and for polling devices that cannot
generate interrupts.

There are two main types of Linux timer.

• Global timer
    • Manages the system time.
    • Generates interrupts periodically.
    • Provides the alarm function.
• CPU local timer
    • Runs for a specific CPU.
    • Fires periodically on each CPU.
    • Its interrupt is acknowledged on the local APIC.

Figure 2.3 shows the sequence in which a local timer interrupt is handled. When a local timer
interrupt occurs, its interrupt handler is executed. The local timer soft interrupt is then
raised, and the soft interrupt handler runs.

Figure 2.3: The execution of local timer soft interruption handler 

In addition, there are many kinds of timers in the Linux kernel.


• RTC: Every system has a real-time clock that runs on its own, independently of any other
  chip. At boot, the Linux kernel reads the RTC and sets the current time.
• TSC: The 80x86 microprocessor has a CLK pin that receives the clock signal of an external
  oscillator. The 64-bit Time Stamp Counter register is incremented each time the CLK pin
  receives a signal.
• PIT: The PIT is a counter that triggers an interrupt when it reaches the programmed count.
  It has a one-shot mode and a periodic mode. A one-shot timer interrupts only once and then
  stops counting. A periodic timer interrupts every time it reaches a specific value.
• APIC: The local timer of each CPU. Like the PIT, the APIC timer generates an interrupt
  either once or periodically. However, it sends the interrupt only to its own processor.
• ACPI PMT: ACPI Power Management Timer. The ACPI PMT is built into ACPI-based main boards.
  Its clock signal runs at approximately 3.58 MHz, and its counter is incremented at every
  clock tick.
• HRTimer: HRTimer [28] provides high-resolution (nanosecond) timers and exploits the
  system-dependent timers/clocks.
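The difference between the one-shot and periodic modes described for the PIT and the APIC timer can be illustrated with a small simulation. The following Python sketch is only a toy model: the class name CountdownTimer and the helper fire_ticks are hypothetical names introduced for this example, and no real timer hardware or kernel interface is accessed.

```python
class CountdownTimer:
    """Toy model of a programmable timer (assumption: no real hardware access)."""

    def __init__(self, count, periodic):
        self.reload = count        # programmed count
        self.remaining = count     # ticks left until the next interrupt
        self.periodic = periodic   # True: periodic mode, False: one-shot mode
        self.active = True

    def tick(self):
        """Advance one clock tick and return True if an interrupt fires."""
        if not self.active:
            return False
        self.remaining -= 1
        if self.remaining > 0:
            return False
        if self.periodic:
            self.remaining = self.reload   # periodic: restart the count
        else:
            self.active = False            # one-shot: stop counting
        return True


def fire_ticks(timer, total_ticks):
    """Return the 1-based tick numbers at which the timer raised an interrupt."""
    return [t for t in range(1, total_ticks + 1) if timer.tick()]


if __name__ == "__main__":
    print(fire_ticks(CountdownTimer(3, periodic=False), 10))
    print(fire_ticks(CountdownTimer(3, periodic=True), 10))
```

With a programmed count of 3 and ten clock ticks, the one-shot timer fires only at tick 3, while the periodic timer fires at ticks 3, 6, and 9.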



Related Work 

It has been reported that more than 90 percent of the computer systems in the world are
embedded systems. Embedded systems are widely used in our daily routine these days. They have
been developing ever since the computer was invented and became widespread. In recent years,
however, much attention has been given to embedded systems because they are becoming
complicated. Semiconductor and network technology has evolved rapidly, and software
technology such as multimedia and Internet technology has also advanced considerably. For
example, the smartphone has gone beyond a simple message transfer function and taken on
various functions such as music player, movie player, games, and Internet access. These
functions have increased the complexity of embedded systems [9]. In the past, embedded
systems were simple hardware. Nowadays, however, embedded hardware has become more complex
because of advances in hardware and the many needs of users. Along with advanced hardware
such as SoC (System on Chip) technology, embedded software has become very complex. These
technologies have led to many innovations such as the smartphone, PDA, netbook, and tablet PC.

On the other hand, these innovations make systems complicated and continuously increase the
errors and bugs that appear while embedded systems are being developed. Moreover, these
problems take a lot of time and effort to fix. The way errors and bugs are fixed has also
become very important, because it is closely related to the performance and stability of the
system. Therefore, most developers and system managers analyze event logs to figure out the
best solutions to these problems.

3.1 Event log 

Event logging and event log monitoring play an important role in modern IT systems. 

Today, many applications, operating systems, network devices, and other system components 

are able to log their events to a local or remote log server. For this reason, event logs are an 

excellent source for determining the health status of the system, and a number of tools have 

been developed over the past 15-20 years for monitoring event logs in real-time [45]. 

3.1.1 Event logging 

The events that occur in a system depend on the status of the system, which is always
changing. When a system component encounters an event, the component can emit an event
message that describes the event. For example, when a disk of a server becomes full, the
server could generate a timestamped “disk full” message to append to a local log file or to
send over the network as an SNMP trap. Event logging is the procedure of storing event
messages in the event log, where the event log is a regular file that is modified by
appending event messages (although databases of event messages are sometimes also called
event logs). A log client is a system component that emits event messages for event logging.
In this thesis, the term event is often used to denote an event message when the meaning is
clear from the context.

In modern IT systems, event logs play an important role: 

• Since in most cases event messages are appended to event logs in real-time as they are
  emitted by system components, event logs are an excellent source of information for
  monitoring the system,
• Information that is stored to the event log can be useful for analysis at a later time,
  e.g., for audit procedures or for retrospective incident analysis.

Event logging can take place in various ways. In the simplest case the log client keeps the 

event log on a local disk and modifies it when an event occurs. Unfortunately, event logs will 

be scattered across the system with this logging strategy, each log possibly requiring separate 

monitoring or other analysis. Furthermore, the strategy assumes the presence of a local disk 

which is not the case for many network nodes (e.g., switches and routers). 

Figure 3.1 shows a centralized logging infrastructure. It is a flow chart of the event log
that shows how useful the logged events are to a system developer.

Figure 3.1: Flow chart of the event log

3.1.2 Event Log Monitoring

Because of the importance of event logs as the source of system health information, many 

tools have been developed over the past 15-20 years for monitoring event logs in real-time. 

Swatch [Hansen and Atkins, 1993] was the first such tool and is still used by many sites. 

Swatch [47] monitors log files by reading every event message line that is appended to the 

log file, and compares it with rules where the conditional part of each rule is a regular 

expression (rules are stored in a textual configuration file). If the regular expression of a 

certain rule matches the event message line, Swatch executes the action part of the rule. 

Actions include sending a mail, executing an external program, writing a notification to the 

system console, etc. Swatch has also an option for ignoring repeated event messages for a 

given time interval. 
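The rule-based approach described above can be sketched in a few lines of Python. This is only an illustration of the idea (regular expression as the condition, an action on a match, and suppression of repeated messages for a time interval). It does not reproduce Swatch's configuration syntax or implementation, and the names Rule, process_line, and throttle_seconds are hypothetical.

```python
import re


class Rule:
    """One monitoring rule: a regular expression plus an action (hypothetical format)."""

    def __init__(self, pattern, action, throttle_seconds=0):
        self.regex = re.compile(pattern)
        self.action = action                      # callable that receives the matched line
        self.throttle_seconds = throttle_seconds  # ignore repeats within this interval
        self.last_fired = {}                      # message text -> time of last action


def process_line(rules, timestamp, line):
    """Apply every rule to one appended log line; return how many actions ran."""
    fired = 0
    for rule in rules:
        if not rule.regex.search(line):
            continue
        last = rule.last_fired.get(line)
        if last is not None and timestamp - last < rule.throttle_seconds:
            continue  # repeated message inside the interval: ignore it
        rule.last_fired[line] = timestamp
        rule.action(line)
        fired += 1
    return fired


if __name__ == "__main__":
    notifications = []
    rules = [Rule(r"disk full", notifications.append, throttle_seconds=60)]
    log = [(0, "disk full on /var"), (10, "disk full on /var"),
           (20, "login ok"), (70, "disk full on /var")]
    for timestamp, line in log:
        process_line(rules, timestamp, line)
    print(len(notifications))
```

In this example the action simply appends the line to a list. In a real monitor it would correspond to sending a mail or running an external program.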

Another popular tool for event log monitoring is Logsurfer [Ley and Ellerman, 1996]. Like 



Swatch, Logsurfer [46] uses a rule-based approach for event processing, employs regular 

expressions for recognizing input events, and monitors log files by comparing appended 

message lines with its rules. Apart from executing actions immediately when certain event 

messages are observed, Logsurfer also supports contexts and dynamic rules. Context is a 

memory-based buffer for storing event messages, and Logsurfer can report the content of a 

context through an external program. Dynamic rule is a rule that has been created from 

another rule with a special action. 

In addition to commonly used Swatch and Logsurfer, a number of other tools exist for 

monitoring event logs in real-time, and the interested reader is referred to the Log analysis 

website [48] for more information. Apart from standalone monitoring tools, some systems 

and network management platforms like HP OpenView Operations (formerly called ITO) 

[64] and Tivoli Risk Manager [65] have also capabilities for monitoring event logs. 

Nevertheless, in order to use these capabilities, the whole platform must be deployed,
which is a complex and time-consuming task.

3.2 System Monitoring 

System monitoring means checking how a system is working. If a system works very slowly, the
system manager should figure out the cause and how to fix it. This is not a rare situation
for a system manager. System management starts with checking the system’s condition
periodically. Monitoring is very important because its results are needed when problems
occur. In addition, if the manager misinterprets the monitoring output, the result is an
incorrect error report and a lot of time spent on the fix. For instance, in Linux, many
people use the free command to check memory. However, how would you handle a case like the
one below? The system is not running any workload, yet 503M of its memory is reported as
used. The manager assumes that there is a problem in the system and reboots it, but the
memory is still reported as “used”.

Figure 3.2: Example of the free command

Reading from a disk is very slow compared to reading from memory. If many people access the
system and execute the “ls” command, which is one of the simplest commands that read from
disk, the system becomes very slow. In this case, the system works better if the information
read from disk is kept temporarily in memory. This is called “disk buffering”, and the
buffer cache is widely used for it. If the size of the cache were fixed, memory could run
short and swapping could occur even when the memory is large, which slows the system down.
Linux therefore automatically assigns otherwise empty memory to the buffer cache in order to
improve the efficiency of the system. For instance, in Figure 3.2 the usable memory is
“free + buffers + cached”, and the “-/+ buffers/cache:” line reflects this adjustment.
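The calculation “free + buffers + cached” can be made concrete with a short Python sketch. It parses text in the style of /proc/meminfo, which is an assumption made for this example (the text above uses the output of the free command). The sample values are hypothetical and were chosen so that about 503M appears as used, as in the case described above. The function names parse_meminfo and usable_kb are also hypothetical.

```python
def parse_meminfo(text):
    """Parse '/proc/meminfo'-style lines ('Key:   value kB') into a dict of kB values."""
    info = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, rest = line.split(":", 1)
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def usable_kb(info):
    """Memory that applications can still obtain: free + buffers + cached."""
    return info["MemFree"] + info["Buffers"] + info["Cached"]


# Hypothetical sample values, not measured on a real system.
SAMPLE = """\
MemTotal:  515000 kB
MemFree:    12000 kB
Buffers:    90000 kB
Cached:    300000 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    naive_used = info["MemTotal"] - info["MemFree"]     # what looks "used" at first glance
    real_used = info["MemTotal"] - usable_kb(info)      # used by applications only
    print(naive_used, usable_kb(info), real_used)
```

With these sample values, 503000 kB appears to be used, but 402000 kB of the total is still usable, so applications actually occupy only 113000 kB.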

General Linux system management guides and man pages still do not explain these points well.
The result has to be interpreted on the basis of how the kernel and the OS work. From this
point of view, both system monitoring and the analysis of its results are very important.


The most widely used system monitoring tool is Nagios [66], which is used for network
monitoring. Nagios can monitor hosts and networks, and local servers can also be monitored
remotely over the network. For example, a system monitor can run Nagios over the network and
manage monitoring reports by connecting to a central management server.

Another system monitoring tool, widely used in Linux, is the top command. The top command
prints the status of processes, CPU, and memory, as well as the uptime and load average.
There are two related commands, ntop and htop.

Ntop [67] is a free network monitoring software. Ntop displays network usage information 

in a similar fashion to the top command output. The current version of ntop features both 

command line and web-based user interfaces, and is available on both UNIX and Win32 

platforms. Ntop focuses on: 

• Traffic measurement, 

• Traffic monitoring, 

• Network optimization and planning, and 

• Detection of network security violations. 

Htop [68] is similar to the top command, with a few additional features. The main difference is 

that you can use a mouse to interact with the htop command output. Figure 3.3 shows htop, 

an interactive process viewer for Linux. It is a text-mode application (for console or X 

terminals).


Figure 3.3: Process viewer for Linux 

3.3 Performance Analysis Tools 

Debugging and tuning [60] are among the most important parts of system development. They
remain important after development because unexpected errors may still occur. Therefore,
performance analysis tools play a critical role in finding and fixing problems.

One of the best-known tools is the Linux Kernel State Tracer (LKST) [11]. It is an event
tracer [21] that records information about the kernel’s state. For instance, it records
various kinds of kernel events such as context switches, signal transmission, interrupts,
memory allocation, and packet transmission. Among its functions, two are critically
important.

• Process root trace: helps to grasp where the problem has happened and what is going on.
• LKST log tool: a tool for analyzing log data, with a function that suggests solutions to
  problems.

Kernel Function Trace (KFT) [2] [49] is a kernel function tracing system. Instrumentation
callouts are added to every function entry and exit, and the KFT system captures these
callouts and generates a trace of events with timing details. KFT is excellent at providing
a good timing overview of kernel procedures. The trace data contains general information
such as the PID, start time, and end time, where the times are given in Time Stamp Counter
(TSC) ticks.

System Director Mevalet [8], developed by NEC Japan, helps to analyze system performance. It
detects problems early so that they can be prevented in advance. Mevalet can present a
system’s behavior in terms of CPU, disk, and network. In addition, it can analyze
bottlenecks and performance tuning problems in an embedded system. Because Mevalet is
applied as a patch at the OS level, applications do not need to be modified, and there is a
wide choice of languages and middleware.

Figure 3.4: Mevalet viewer’s execution

As Figure 3.4 shows, the Mevalet viewer lets a user check a lot of information such as the
process name, CPU processing time, and Inter-Process Communication (IPC).

Finally, there is a utility program, NMON [22], from IBM. NMON is very useful for monitoring
a system because it shows more information than the top command in Linux. It can be
downloaded from the IBM homepage. NMON was designed to help professionals monitor and
analyze the performance of AIX and Linux systems.

It mainly provides the following information.

• CPU usage
• Memory usage
• Disk I/O rates, transfers, and read/write ratios
• Free space on file systems
• Network I/O rates, transfers, and read/write ratios

Based on this information, the NMON output can be drawn as graphs and saved as graphic files.

3.4 Linux Trace Toolkit next generation 

Through execution tracing, LTTng [3] [5] [20] [32] [39] [42] [59] analyzes the system
precisely. An execution trace shows a lot of information such as task handling time, period,
and information about the assigned process. In addition, it can be used to calculate the
delay of application programs or the time a certain program takes to read from disk.

It is very useful for the following purposes.

• To understand system problems.
• To analyze system performance by monitoring the system and application programs.
• To analyze the communication among processes.

Moreover, LTTng differs from strace [69], gprof [70], and DTrace [18] in that it shows the
whole system, including the inside of the kernel.

By using LTTng, we can quickly copy the events occurring inside the kernel, such as thread,
fork, interrupt, signal, and memory events, from kernel space to user space and record them.
In addition, by using LTTV (Linux Trace Toolkit Viewer) [3] [5], we can review the recorded
event log visually. The reported overhead ranges from 1.54 to 2.28 [3].

Figure 3.5: LTTV viewer’s execution 

Figure 3.5 shows LTTV after event logging. It provides time information with nanosecond
resolution. In addition, it analyzes the events of each CPU.


Figure 3.6: Event logging sequence of LTTng 

Figure 3.6 shows an event logging sequence. To log events, a trace point is first added to
the kernel to extract the event. The LTTng daemon is then executed, event logging is
performed, and the information is saved.


Infrastructure Framework 

In this chapter, we describe the Kernel Analysis System (KAS), which helps to solve problems
occurring in the kernel. The infrastructure of KAS is composed of three main layers, as
follows:

• Detection Layer (DL): This layer finds problems that occurred in the kernel by using the
  event log and saves their line information (start line and end line), which makes the
  result easy for the other layers to use. By counting every occurrence of a problem, it also
  makes it possible to check the overall problem occurrence ratio.
• Separation Layer (SL): This layer separates the event log of a problem from the whole
  event log. With the logs separated, a developer or administrator can check the problem
  easily and find out which events actually occurred.
• Analysis Layer (AL): This layer uses the event log of the problem to display the execution
  time, the number of executions, and the total latency of each event, so that the cause of
  the problem can be detected easily and quickly.
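As an illustration only, the kind of per-event statistics named for AL (execution time, number of executions, and total time per event) could be computed as in the following Python sketch. The record format (event name, start time, and end time in nanoseconds) and the function name summarize are assumptions made for this example and are not taken from the KAS implementation.

```python
from collections import defaultdict


def summarize(events):
    """Aggregate per-event statistics.

    events: iterable of (name, start_ns, end_ns) tuples (hypothetical record format).
    Returns {name: {"count": ..., "total_ns": ..., "max_ns": ...}}.
    """
    stats = defaultdict(lambda: {"count": 0, "total_ns": 0, "max_ns": 0})
    for name, start_ns, end_ns in events:
        duration = end_ns - start_ns
        entry = stats[name]
        entry["count"] += 1                                # number of executions
        entry["total_ns"] += duration                      # accumulated execution time
        entry["max_ns"] = max(entry["max_ns"], duration)   # longest single execution
    return dict(stats)


if __name__ == "__main__":
    sample = [("irq", 0, 100), ("syscall", 100, 400), ("irq", 400, 450)]
    for name, entry in sorted(summarize(sample).items()):
        print(name, entry)
```

A table of this kind lets a reader see at a glance which event ran most often or consumed the most time inside the separated problem interval.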

4.1 Introduction 

As mentioned above, embedded systems are widely used in daily life. One of the most
important properties of embedded systems is real-time behavior. The real-time property is
the most important criterion that distinguishes the embedded kernel from the general Linux
kernel. In other words, an embedded system has strong real-time requirements. An automobile
brake system is a simple example of an embedded system. It is a real-time system that must
not allow any delay, because latency in the brake system can cause a traffic accident.
Real-time behavior must also be assured for home appliances such as microwave ovens and
washing machines, and time is an important factor for the navigation system of an airplane
or for a weapon system. Most embedded systems are therefore designed with real-time
characteristics as one of the important factors. From small and light devices to very large
ones, most embedded systems have to guarantee deadlines, and when a delay occurs there is a
high possibility of a serious accident.

Therefore, there is a need for a tool to analyze timer latency problems [33] and other
problems occurring in the kernel. A good kernel analysis tool is very important for problem
solving and application development. There are many kernel analysis tools for analyzing the
Linux kernel. Some are provided as commercial products and some as open source. Kernel
analysis tools are essential for any kernel, but most of them are not complete. Linux itself
provides only very basic analysis tools, and it is impossible to analyze every problem with
them.


The appropriate analysis tool depends on the development environment. If the main purpose is
network analysis, there are well-known tools such as ethereal [72], MRTG, and Ntop. In
addition, there are open source programs such as Nagios and JFFNMS [71]. For memory and
other monitoring, there are nmon, strace, and many other usable tools.

Before solving problems with the tools mentioned above, however, the simplest thing to check
in Linux is syslog [48]. The log is mostly recorded in /var/log/messages, but it can be saved
in another place by changing the setting in /etc/syslog.conf. The syslog file is a text-based
message log recorded by the syslog daemon. By checking this file periodically, it is possible
to find important hints about overall system stability, such as lack of disk space, memory
shortage, I/O errors, and device failures.

As mentioned above, although analysis tools and event logging tools exist, they can only
save the logged event information as text or show it in a viewer. Therefore, in this thesis
we propose KAS, which can analyze the cause of a problem quickly and efficiently by using the
logged event information.

4.2 Kernel Analysis System 

Normally, kernel analysis tools output the status information of the kernel (CPU
utilization, memory information, time information, etc.) as text. Sometimes the logged
events are displayed in a viewer, which makes them easier for a developer or an
administrator to read. Even so, the analysis is not easy for a developer or a system
administrator. In text mode, the large amount of output makes it difficult to find where a
problem has occurred. In a viewer, event information is normally displayed with nanosecond
resolution, which also makes it difficult to find the problem and its cause. Figure 4.1
shows the event analysis method normally used. An administrator or a developer analyzes by
choosing between two analysis methods, text or viewer.

Figure 4.1: Normal method of event logging and analysis 

If a problem and its cause can be analyzed quickly and efficiently by using the event
information generated by the kernel, the development time of the embedded system will
decrease and its reliability and stability will increase.

Figure 4.2 shows the architecture of the Kernel Analysis System (KAS). Ten to twenty years
ago, embedded systems were developed mainly for simple tasks, and many processes could not
run on one system. Recently, however, one embedded system has to run many programs (mail,
Internet, music player, movie player, games, etc.). Increasing complexity increases the
possibility of problems occurring in the kernel, and the more complex the system becomes,
the harder those problems are to solve. Therefore, for problems that occur in the kernel, a
solution can be found by analyzing event information.

Figure 4.2: Kernel Analysis System Architecture


For example, in order to analyze timer latency, not only timer events but also all other
information about events that occurred in the kernel (for example, system calls, interrupts,
threads, and memory) must be analyzed. To analyze a specific problem, we have to insert a
hook point into the kernel source so that the event information is logged.

Figure 4.3: Event log process flow of KAS 

Figure 4.3 shows the flow of the event log process in KAS. With KAS, a problem can be
analyzed more efficiently than with the normal event log analysis methods (for example,
Figure 4.1).


The advantages of analysis with KAS are as follows.

• Fast problem diagnosis
  Normally, diagnosing a problem by reading text or using a viewer consumes a large amount
  of time and effort because of the large amount of log data. With KAS, a developer can find
  the problem quickly.
• Reduction of development time
  When developing a system, analyzing errors, bugs, or performance issues takes more time
  than coding. Therefore, if errors and bugs can be found quickly by using KAS, the
  development time can be decreased.
• Lower occurrence rate of bugs and errors
  If errors and bugs are diagnosed and solved accurately while a system is being developed,
  the occurrence rate of problems decreases. Moreover, because one error or bug can cause
  another, solving each problem accurately helps to decrease the problem occurrence rate of
  the whole system.
• After development
  Even if every problem was solved during the development period of an embedded system,
  there is an 80% possibility of a problem occurring in the commercialized system. In other
  words, problems are highly likely to occur when an embedded system is used by ordinary
  customers. Therefore, when a problem occurs in a commercialized embedded system, a
  developer or a system manager can quickly minimize the damage by analyzing the problem
  with KAS.
• Increase of system stability
  Solving the problems that occur during the development period improves the stability of
  the embedded system.

We propose a system that can analyze kernel events, find problems in the kernel, and suggest
an effective solution. Because embedded systems are developed in a cross-development
environment, development differs from that on a server or PC. When a timer problem occurs,
fixing it therefore takes more time and effort in an embedded system than in a server or PC
environment. For a developer of embedded systems, the system we propose would enhance both
the convenience of development and the stability of the system.

4.2.1 Detection Layer 

When a problem occurs, the most important thing is to figure out its cause. It is therefore
important to find where a problem came from when it occurs in the embedded kernel on which
an embedded system is built. Embedded Linux, which is frequently used in embedded systems,
generates many events as the system runs. It is hard to know the best way of debugging when
problems occur in such a complicated system. Many developers and experts are looking for
ways to find and solve problems quickly. During the development period, solving one problem
may take six months or it may be solved immediately, because the appropriate way of
debugging depends on the level and features of the problem. Some problems require a lot of
time to analyze, while others can be debugged simply. However, the problems in most real
projects are not simple. The important issue is therefore how to solve the problem.

• Reproduction of the problematic situation
  To solve a problem, it is important to know how to reproduce it. How the error occurs and
  how it leads to system-level problems are vital. However, the problems that occur in an
  embedded system range from a simple problem caused by an error in one piece of source code
  to complicated problems in which individual problems accumulate and cause a butterfly
  effect. For instance, in an embedded Linux OS, a memory leak may cause no trouble during a
  short test but appear in a long, repeated test. In this situation, it is hard to link
  cause and result unless the occurrence of the memory leak is checked. If, for example, a
  function in a device driver allocates 1 Kbyte that is never released, it takes a long time
  for the system to stop because of memory shortage. Therefore, how quickly the problem can
  be reproduced is a good stepping-stone for debugging.
• Clear understanding of the problem
  Working with various engineers reveals their different characteristics: some developers
  concentrate on ‘Copy & Paste’, some are overly particular about aligning the spacing of
  program lines, some work fast, and so on. Among them, the developer who knows the specific
  details of the written code has the highest ability in debugging. System engineers need to
  meet some conditions to debug a Linux system:
    • To understand the Linux system deeply.
    • To understand the relationship between hardware and software.
    • To have patience in solving problems.
    • To analyze problems accurately.
• Using a debugger
  When a problem occurs during development, the developer must deal with it quickly by using
  a debugger. A developer who is skillful with a debugger program can debug quickly.
• Approach to problem solving
  To understand a problem in a complicated embedded system, it is better to start the
  analysis with the drivers and applications reduced to a minimum and approach the problem
  gradually than to analyze it while numerous programs, such as applications and device
  drivers, are running. When debugging a large OS, it is easier to look into the problem by
  analyzing it piece by piece.

We have mentioned various methods for debugging. A skillful developer will know these
methods very well. However, not every developer is skillful, and even an experienced
developer may spend a long time solving a problem if it is very complicated. Therefore, DL
is the step that checks for a problem in a system and, if a problem has occurred, finds
where it occurred. A problem may occur either while an embedded system is being developed or
while a user is using it.

4.2.1.1 When developing embedded system 

Commonly, the embedded devices that users use are products created by developers. These
products are completed after many developers run numerous tests and fix many bugs and
errors. Developers therefore need a way to solve the numerous problems that arise while a
system is being developed. Many existing debugging methods can be used, but a method that
solves problems much more quickly and efficiently is needed. KAS logs events by using LTTng
and solves problems on the basis of the logged data. Analyzing text-based logs is accurate
but takes too much time, because there is a large amount of data to analyze and knowledge of
the event information is required. When events are logged for 5 to 10 minutes with LTTng,
the amount of data ranges from a few gigabytes to over 10 gigabytes. It is possible to check
such logs by reading them from top to bottom to find out why a problem has occurred, but
this traditional way of debugging is the slowest way to find bugs, and no one knows how long
it will take to debug a serious problem. It might take days, weeks, or months. Therefore, it
is important to find the cause of a problem quickly and efficiently when it occurs.

4.2.1.2 Used by users 

According to the Ganssle Research Group of the United States of America, “80% of all
embedded systems are delivered late” and “New code generally has 50 to 100 bugs per 1000
lines”. This means that problems are likely to occur in embedded systems. Generally, because
the development period of an embedded system is rather short, testing and verification are
insufficient. We therefore need to consider how to solve such problems. The answer is to
solve them quickly by using event logs when a problem occurs. Finding the cause of the
problem is the job of a developer or an administrator.

4.2.1.3 Process Flow of DL 

In DL, a problem is found from the logged event information. To find a problem with KAS, the
problem to be found must first be defined (for instance, for a timer: an algorithm that
decides whether the timer has passed its deadline). The timer latency problem is easy to
define because we only need to check whether a process passed its deadline. In Figure 4.4,
HRTimer_Tick means the deadline (the expiry time of the task) of a high-resolution timer, and
HRTimer_Latency means the whole time until the high-resolution timer expired (including
latency). Therefore, the timer problem can be defined as whether or not the running time of
a certain process has passed the deadline.

Figure 4.5 shows the processing flow of DL. First, DL defines a problem as shown in Figure 4.4. It then checks the whole data from top to bottom to find problems. If no problem is found at a given position, it simply continues checking; if a problem is found, it saves the line information and counts how many times the problem has occurred. When the search reaches the bottom of the data, the next layer, the Separation Layer, is processed.
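The detection pass described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the KAS source: the record layout (event name, timestamp in ns), the function name detect_problems, the marker names timer_start and timer_expired, and the sample values are all hypothetical.

```python
def detect_problems(events, deadline_ns, start_event, end_event):
    """Scan the log top to bottom and return (start_line, end_line, elapsed_ns)
    for every start/end pair whose elapsed time exceeds the deadline."""
    problems = []
    start_line = start_time = None
    for line_no, (name, timestamp_ns) in enumerate(events, start=1):
        if name == start_event:
            start_line, start_time = line_no, timestamp_ns
        elif name == end_event and start_line is not None:
            elapsed = timestamp_ns - start_time
            if elapsed > deadline_ns:
                # Save the line information of the problem.
                problems.append((start_line, line_no, elapsed))
            start_line = start_time = None
    return problems

# Hypothetical log: two timer periods, the second one misses a 100,000 ns deadline.
log = [
    ("timer_start", 0), ("syscall", 40_000), ("timer_expired", 90_000),
    ("timer_start", 100_000), ("page_alloc", 180_000), ("timer_expired", 230_000),
]
problems = detect_problems(log, 100_000, "timer_start", "timer_expired")
print(len(problems), problems)  # prints: 1 [(4, 6, 130000)]
```

The length of the returned list plays the role of the problem count, and each tuple carries the line information that the next layer needs.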


Figure 4.4: Problem definition

Figure 4.5: Process flow of detection layer

4.2.2 Separation Layer

Commonly, to find a problem in the Linux kernel, a developer or administrator analyzes either text-based raw data or graphical information shown in a data viewer. Of the two, analyzing the text-based data is the more accurate method, but it is inefficient in terms of time. Therefore, SL separates the event logs related to the problem from the whole event log, which is a great help to a programmer. An administrator or a developer can use the separated data to diagnose the problem accurately.

Figure 4.6: Event log separated in SL 

Figure 4.6 shows the event logs recorded when the problem occurred, separated from the whole event log. The separation is based on the line information received from DL: by reading the entire event log from top to bottom, SL separates the event logs that match the line information from DL.

Figure 4.7 shows the process flow of SL. After reading the line information produced by DL (start line and end line), SL compares the start line with each line of the whole event log. After reading the end line from the line information, SL starts the separation work. When all of the line information has been read, KAS executes AL.
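The separation step can be illustrated with a short Python sketch. It assumes that DL hands over a list of 1-based, inclusive (start line, end line) pairs; the function name separate_events and the sample log lines are hypothetical and are not taken from KAS.

```python
def separate_events(log_lines, line_ranges):
    """Return one list of log lines per (start_line, end_line) range.

    Line numbers are 1-based and both ends are inclusive.
    """
    separated = [[] for _ in line_ranges]
    # Read the entire event log from top to bottom, as SL does.
    for line_no, line in enumerate(log_lines, start=1):
        for idx, (start, end) in enumerate(line_ranges):
            if start <= line_no <= end:
                separated[idx].append(line)
    return separated

log_lines = ["e1", "e2", "e3", "e4", "e5", "e6"]
print(separate_events(log_lines, [(4, 6)]))  # prints: [['e4', 'e5', 'e6']]
```

Keeping one output list per range means that each detected problem can later be analyzed on its own.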


Figure 4.7: Process flow of Separation Layer

4.2.3 Analysis Layer

Analysis tools normally provide a log but do not analyze problems. The Analysis Layer (AL) of KAS does not determine the exact cause of a problem either; that analysis is the job of an administrator or a developer. However, by examining the result of AL, it is possible to find out where, why, and when the problem occurred, using the statistical information computed in AL. Figure 4.8 shows the process flow of AL. First, AL reads the results of DL and SL. Next, it checks how many times each event occurred and how long it executed. It also calculates and saves the latency of each problem found in the event logs. If a developer or an administrator needs more information than is described in this thesis, it can be obtained by modifying the AL program.
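The statistics gathered by AL can be sketched as follows. The sketch assumes that each separated event is available as an (event name, duration in ns) pair; the function name summarize_events and the sample durations are hypothetical.

```python
from collections import defaultdict

def summarize_events(events):
    """events: iterable of (event_name, duration_ns).

    Returns {event_name: (occurrence_count, total_duration_ns)}.
    """
    counts = defaultdict(int)
    totals = defaultdict(int)
    for name, duration_ns in events:
        counts[name] += 1
        totals[name] += duration_ns
    return {name: (counts[name], totals[name]) for name in counts}

sample = [("mm_page_alloc", 1500), ("fs_read", 800), ("mm_page_alloc", 1700)]
print(summarize_events(sample))
# prints: {'mm_page_alloc': (2, 3200), 'fs_read': (1, 800)}
```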

Figure 4.8: Process flow of analysis layer

4.3 KAS Algorithm


The processing order of the layers is fixed in KAS. Because the result of each layer is used by another layer, the processing of one layer must finish before the next layer runs.

Figure 4.9: Dependency of each module in KAS

Figure 4.9 shows the dependency of each layer. The result of DL is used by SL, and the results of DL and SL are used by AL. When an administrator or a developer analyzes a problem, they should use the results from every layer to analyze it accurately. In some cases the cause of a problem can be found just by analyzing the results from SL, but for an accurate analysis it is recommended to use the results from every layer.

Figure 4.10: Pseudo code of KAS 



In Figure 4.10, the pseudo code shows the relation between the layers. First, in the detection layer, KAS checks whether a kernel problem has occurred. As explained for Figure 4.9, if one has occurred, the detection_problem() function saves the location of the errors and the number of times they occurred. Next, in the separation layer, the separation_data() function separates the problem events from the whole event log by using position_data (line information). After that, the save_separation_data() function saves the separated information. Finally, in the analysis layer, the analysis() function analyzes the information, and the analysis_save_data() function unifies and saves the analyzed data. By analyzing the cause of the problem with the results of these three steps, a solution can be found more easily and effectively.

4.3.1 Important Function and Parameter 

In this section, we explain the main parameters and functions used by KAS.

We first introduce the most important of the parameters declared by KAS. The most important parameter is event_name. KAS reads this variable before it starts analyzing, and its value decides which events KAS will analyze. Next is the event_time variable, which saves the processing time of every event. By looking at event_time, one can determine how much time each event used. Finally, there is the event_description variable, which saves the information of each event other than its name and time (PID, syscall_id, CPU_id, etc.).


Table 4.1: Important parameters in KAS

Parameter                   Description
char *event_name;           Name of the logged event
double event_time;          Processing time of each event
char *event_description;    Information on each event (PID, syscall_id, CPU_id, etc.)

Figure 4.11 shows how the parameters explained in Table 4.1 correspond to the values of an actual event log entry.
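To make the role of the three parameters concrete, the following Python sketch splits one event log line into them. The line layout used here ("name: time description") is a hypothetical simplification chosen for this example; the real layout is the one shown in Figure 4.11, and the names EventRecord and parse_event_line are not part of KAS.

```python
from dataclasses import dataclass

@dataclass
class EventRecord:
    event_name: str         # name of the logged event
    event_time: float       # time value attached to the event
    event_description: str  # remaining fields (PID, syscall_id, CPU_id, ...)

def parse_event_line(line):
    """Parse a line of the assumed form 'name: time description...'."""
    name, rest = line.split(":", 1)
    time_text, _, description = rest.strip().partition(" ")
    return EventRecord(name.strip(), float(time_text), description.strip())

record = parse_event_line("kernel_timer_set: 1203.000150 PID=42 CPU_id=0")
print(record.event_name, record.event_time, record.event_description)
```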

Figure 4.11: parameter and event log 

Table 4.2 shows the three most important functions. As mentioned above, KAS is largely divided into three layers. The problem_detection() function is the main function in DL, the problem_separation_data() function is the main function in SL, and the problem_analysis() function is the main function in AL.

Table 4.2: Important functions in KAS

Function                     Description
void problem_detection();    Function that finds the problems that occurred.
void separation_data();      Function that separates the problem event log from the whole event log.
void problem_analysis();     Function that analyzes event logs by using the result of SL.

4.4 Trace Point and Event Log 

Most commonly used system performance tools and kernel analysis tools can log an event after a trace point has been added to the kernel source. As Figure 4.12 shows, a trace point is added to the kernel source, the added trace point is recognized while the event log tracer daemon is running, and the added event is logged.

Figure 4.12: Relation between trace point and event log 

When KAS needs an event that is not yet logged, a trace point is added to LTTng, and the event log daemon of LTTng then logs the event. Figure 4.13 is an example of adding a trace point. First, the kernel source is modified to trace the event information from the kernel. After that, the event name that was added to the kernel source is added to the event_name.h file. With these changes, the event can be logged by LTTng and KAS can analyze the event logs.

Figure 4.13: An example showing the usage of trace point in LTTng and KAS. 

4.5 LTTng and KAS 

As briefly mentioned in Chapter 3, LTTng is a performance monitoring tool that is currently used by various corporations and research centers such as Ecole Polytechnique de Montreal, Google, IBM Research, Autodesk, Wind River, MontaVista and STMicroelectronics. The tool can log a large amount of information from the Linux kernel. If further information is wanted, the kernel source can be modified to log it.

Figure 4.14 lists the events that LTTng logs by default. Related events are gathered into one group, and the specific events are listed within each group. There are 18 groups in the figure, and 118 events are defined within them, so 118 events can be analyzed by default. Other events can be logged by adding a trace point.

Figure 4.14: Basic trace point offered by LTTng 

Figure 4.15: Procedure of problem analysis process of LTTng and KAS 

Figure 4.15 illustrates the process flow before and after KAS is introduced. Normally, LTTng logs events and the event logs are then analyzed with the Linux Trace Toolkit Viewer (LTTV). In the proposed system, KAS analyzes the event logs after event logging. Whether the analysis is done with LTTV or with KAS, the final analysis of the problem is made by a developer or an administrator. KAS's dependency on LTTng is also low: when another event logging tool logs the events, it is possible to modify the DL of KAS and reuse SL and AL for the analysis.

4.6 Summary 

In this chapter, we described the infrastructure of KAS. KAS is composed of three layers, and each layer saves its result of the event log analysis. By analyzing the results of every layer, it is possible to analyze precisely a problem that occurred in the Linux kernel. We therefore proposed KAS as an event log analysis method for analyzing kernel problems efficiently.



Case Study 

Kernels such as Linux, Windows, Mac OS, and microkernels operate on the basis of time. The timer is therefore the most important element of a kernel. A timer delay can be a problem of the kernel itself, but it can also be a problem of middleware or an application. The timer is especially important in an RTOS or an embedded system.

In this thesis, we measured the latency of the High Resolution Timer (HRTimer) [6] in the Linux kernel. We detected and analyzed the HRTimer latency in the kernel by using KAS. The case study showed that, with the proposed system, it is possible to find a problem quickly and analyze it accurately.


5.1 Timer Latency 

When a process calls a function of the Linux kernel, it uses a system call. When hardware calls the Linux kernel, however, it uses an interrupt. When the kernel receives an interrupt, it stops the current process and runs an interrupt handler; the interrupt is clearly given priority. A request from an interrupt handler with higher priority stops the lower-priority task, which resumes when the higher-priority task has finished. All kernels are driven by interrupts (hardware interrupts or software interrupts). Because a timer also operates inside the kernel, it is driven by interrupts as well: the timer controller generates interrupts periodically. Commonly, Linux timer interrupts use a global timer interrupt and a local timer interrupt.

Timer latency means that a deadline is missed. There are two reasons for timer latency. Firstly, latency arises because many tasks must run after an interrupt occurs. Figure 5.1 shows each latency that is incurred between a hardware interrupt and the scheduling of the task.

Figure 5.1 Task Preemption Latency Model 

• Interrupt Latency: The latency between the occurrence of a hardware interrupt and the start of the Interrupt Service Routine (ISR) [30]. It includes hardware latency, interrupt disable latency, interrupt vectoring latency, and interrupt dispatch latency.
• Interrupt Service Routine Latency: The time until the interrupt service routine has been run after the interrupt occurs.
• Scheduler Latency: The time until the scheduler is reached after the interrupt service routine has been handled.
• Scheduling Latency: The latency from the start of the scheduler until it ends.
• Task Preemption Latency: The time until the higher-priority task starts after the lower-priority task has been stopped.

Secondly, a delay may occur while a task is running if a task with higher priority than the currently running task arrives. Figure 5.2 shows the latency due to priority. Latency such as that in Figure 5.2 occurs frequently in a preemptive kernel.

Figure 5.2 Priority Task Latency Model 

5.2 Preemptive vs. Non-preemptive 

Linux kernels before version 2.6 are non-preemptive [15] [27]; from kernel 2.6 onward, one can choose between a preemptive and a non-preemptive kernel. In a non-preemptive kernel, a process cannot be stopped once it has entered kernel mode [31] from user mode [63]. In contrast, in a preemptive kernel a process running in kernel mode can be stopped forcibly by the scheduling policy or by another interrupt. FCFS (First-Come-First-Served) is the representative non-preemptive scheduling policy and Round-Robin is the representative preemptive scheduling policy.

5.2.1 Preemptive Kernel 

For embedded kernels [1], as for all other operating systems, preemption is important. A Linux kernel that has a preemption function is a preemptive kernel. The preemptive kernel gives the system a real-time characteristic, which helps to guarantee the deadline of a high-priority task. Because the response time of a real-time kernel in embedded systems is directly related to the safety and reliability of the systems, the interval between response times needs to be minimized by using the preemptive kernel.

Figure 5.3 Process of interrupt of preemptive kernel 

Figure 5.3 shows the order of interrupts in the preemptive kernel. If the interrupt of a high-priority task occurs while a low-priority task is running, the low-priority task is put to sleep and the high-priority task starts running. The preemptive kernel can run the low-priority task again after the high-priority task ends.

5.2.2 Non-preemptive Kernel 

Non-preemptive sections are very important in both general-purpose operating systems and RTOSes, because priorities are not honored inside a non-preemptive section. The performance of an RTOS is determined by the response time in non-preemptive sections: even if a high-priority task makes a request, it cannot be served immediately, and this causes problems with response time. For example, when the latency of a certain non-preemptive section is 10 seconds, a real-time task can run only after those 10 seconds. In general Linux, the problem of non-preemptive sections is therefore handled by locking critical sections.

Figure 5.4 Process of interrupt of non-preemptive kernel


Figure 5.4 shows the interrupt process in the non-preemptive kernel. Unlike in the preemptive kernel, a low-priority task keeps running even if a high-priority task arrives while the low-priority task is running. The kernel runs the high-priority task only after the low-priority task ends.
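The difference between the two kernels can be reduced to a small worked example. The sketch below is a deliberately simplified model: it ignores the interrupt, scheduler, and preemption latencies of Figure 5.1, and the function name and time values are hypothetical.

```python
def high_priority_start(arrival, low_task_end, preemptive):
    """Time at which a high-priority task starts running.

    arrival:      time at which the high-priority task becomes ready
    low_task_end: time at which the running low-priority task would finish
    """
    if preemptive or arrival >= low_task_end:
        return arrival       # the low-priority task is preempted (or already done)
    return low_task_end      # the task must wait for the low-priority task to end

# A high-priority task arrives at t=2 while a low-priority task runs until t=10.
print(high_priority_start(2, 10, preemptive=True))   # prints: 2
print(high_priority_start(2, 10, preemptive=False))  # prints: 10
```

In the non-preemptive case the whole remaining run time of the low-priority task is added to the response time of the high-priority task.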

5.3 High Resolution Timer 

We already have a timer subsystem (kernel/timers.c), so why do we need two timer subsystems? Normally, the finest granularity supported by the timer in the Linux kernel is 1 ms, but embedded Linux needs much finer granularity. Therefore, many system engineers have tried to integrate high-resolution and high-precision features into the existing timer framework. However, the general Linux timer cannot support microsecond accuracy. HRTimer provides microsecond resolution with lower overhead and controls time more precisely than other timers. HRTimer cannot be used in every system: it must be supported by the hardware. The HRTimer system allows a user-space program to be woken up from a timer event with better accuracy when using the POSIX timer APIs. Without this system, the best accuracy that can be obtained for timer events is 1 jiffy, which depends on the setting of HZ in the kernel. In the 2.4 kernel, HZ was set to 100, which means that the best accuracy one could get on a timer wakeup in user space was 10 milliseconds.
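The relation between HZ and the length of one jiffy is simple arithmetic, shown here as a short Python example. The value HZ=100 comes from the text above; the other two HZ values are included only for comparison.

```python
def jiffy_ms(hz):
    """Length of one jiffy in milliseconds for a given kernel HZ setting."""
    return 1000.0 / hz

for hz in (100, 250, 1000):
    print(f"HZ={hz}: 1 jiffy = {jiffy_ms(hz):g} ms")
# prints:
# HZ=100: 1 jiffy = 10 ms
# HZ=250: 1 jiffy = 4 ms
# HZ=1000: 1 jiffy = 1 ms
```

Even at HZ=1000, one jiffy is ten times longer than the 100μs period used later in the evaluation, which is why a jiffy-based timer is not sufficient for that experiment.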

To use HRTimer, the following steps are needed:

• Verify that the kernel supports this feature for your target processor (and board).
• Configure support for it in the Linux kernel.
• Set CONFIG_HIGH_RES_TIMERS=y in the kernel config.
• Compile the kernel.

The timer APIs that support microsecond resolution are as follows:

• timer_settime(): sets the time until the next expiration of the timer specified by timer-id.
• setitimer(): the system provides each process with an interval timer. When the timer expires, a signal is sent to the process.
• nanosleep(): sets up a high-resolution sleep.
• ualarm(): causes the SIGALRM signal to be generated for the calling process after the specified number of microseconds.
• usleep(): causes the calling thread to be suspended from execution until either the specified number of real-time microseconds has elapsed or a signal is delivered.

The HRTimer does not generate timer interrupts at a fixed period; its period can be set by the programmer.

5.4 Latency Policy 

In this thesis, we describe the timer latency and the policy of HRTimer that we use to measure HRTimer latency. Figure 5.5 shows the HRTimer latency model. When a local APIC interrupt occurs, software interrupts are raised after the interrupt handler has run. When the HRTimer expires, it is stopped at kernel_timer_itimer_expired.

Figure 5.5 Hrtimer latency model 

We define the HRTimer latency model as follows:

• $T_{lapic}$: The time from the occurrence of the HRTimer hardware interrupt until the hardirq handler has finished.
• $T_{softirq}$: The time from the occurrence of the softirq until the softirq handler has been processed.
• $T_{expired}$: The time from the end of the softirq handler until the HRTimer has expired.

When a timer is delayed, the following formulas are required to check how the delay occurred.

• Formula (1): $T_{time}$ is the execution time of the HRTimer, which is the total of the HRTimer processing time and the HRTimer latency.
• Formula (2): Checks whether time latency occurred by comparing $HRT_{tick}$ with $T_{time}$ ($HRT_{tick}$ is the time set by a programmer). If $HRT_{latency} > 0$, the HRTimer is considered to have had time latency.

$T_{time} = T_{lapic} + T_{softirq} + T_{expired}$    (1)

$HRT_{latency} = T_{time} - HRT_{tick}$    (2)

5.5 Evaluation

This section describes the experimental setup and the evaluation of HRTimer latency. The system has a 1.83GHz Intel Pentium 4 uniprocessor and 1GB of RAM, and runs Linux kernel 2.6.24.

First of all, we apply the LTTng patch to the Linux kernel in order to collect event logs. We use the setitimer() system call to send a SIGALRM signal to the process when a timer expires; setitimer() interrupts the calling process itself at a certain future time. Figure 5.6 shows the setitimer function. The first argument selects the timer: ITIMER_REAL, ITIMER_VIRTUAL, or ITIMER_PROF. In this experiment, we use the ITIMER_REAL argument, a real-time timer that runs independently of the execution of the process and generates SIGALRM when it times out. The second argument sets the time value, and the timeout is generated after the set time has elapsed. setitimer() is also more accurate than alarm() and makes it possible to set an exact time value.


Figure 5.6 Interface of setitimer 

Figure 5.7 set up setitimer 

Figure 5.7 shows how setitimer operates. setitimer takes two time values: it_value sets the time until the first expiration, and it_interval sets the period of the expirations after the first one.
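The use of it_value and it_interval can be tried out from user space with Python's signal module, whose setitimer(which, seconds, interval) call wraps the same system call (seconds corresponds to it_value and interval to it_interval). The following is only a sketch, not the measurement program of this thesis: it runs on Unix only, it must be called from the main thread, the function name is hypothetical, and the measured lateness includes the overhead of the Python interpreter.

```python
import signal
import time

def measure_itimer_lateness(period_s, cycles):
    """Arm ITIMER_REAL with it_value = it_interval = period_s and return, for
    each of the first `cycles` SIGALRM deliveries, how late it arrived (s)."""
    wakeups = []

    def on_alarm(signum, frame):
        wakeups.append(time.monotonic())

    previous = signal.signal(signal.SIGALRM, on_alarm)
    start = time.monotonic()
    signal.setitimer(signal.ITIMER_REAL, period_s, period_s)
    try:
        while len(wakeups) < cycles:
            signal.pause()  # sleep until a signal handler has run
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0, 0)  # disarm the timer
        signal.signal(signal.SIGALRM,
                      previous if previous is not None else signal.SIG_DFL)
    # Ideal expiry of cycle i (0-based) is start + (i + 1) * period_s.
    return [t - (start + (i + 1) * period_s)
            for i, t in enumerate(wakeups[:cycles])]

lateness = measure_itimer_lateness(period_s=0.001, cycles=5)
print([f"{x * 1e6:.0f} us" for x in lateness])
```

The printed values vary from run to run and from machine to machine, so no expected output is given.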

We set $HRT_{tick}$ to 100μs and 1ms and set the number of repetitions to 100,000, with heavy background load. Figure 5.8 shows the repetition process when the HRTimer period is set to 100μs. The timer runs continuously from the first period to the n-th period, and the HRTimer latency is calculated separately for every measured cycle.

Figure 5.8 Periodic process of HRTimer is set by 100μs 

The $HRT_{latency}$ analysis is based on a loop as follows:

• reads the hardirq for HRTimer
• reads the softirq of HRTimer
• reads the itimer_expired time
• computes formula (1) and formula (2)
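The last step of this loop, applying formula (1) and formula (2) to the three measured times of each period, can be sketched in Python. The function name and the sample values are hypothetical; only the two formulas themselves are taken from Section 5.4.

```python
def hrtimer_latency(t_lapic, t_softirq, t_expired, hrt_tick):
    """Apply formulas (1) and (2); all values use the same time unit (here ns)."""
    t_time = t_lapic + t_softirq + t_expired   # formula (1)
    hrt_latency = t_time - hrt_tick            # formula (2)
    return t_time, hrt_latency

periods = [  # hypothetical (T_lapic, T_softirq, T_expired) per period, in ns
    (5_000, 20_000, 60_000),
    (8_000, 150_000, 45_000),
]
for t_lapic, t_softirq, t_expired in periods:
    t_time, latency = hrtimer_latency(t_lapic, t_softirq, t_expired, hrt_tick=100_000)
    print(t_time, latency, "late" if latency > 0 else "on time")
# prints:
# 85000 -15000 on time
# 203000 103000 late
```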

5.5.1 Result of KAS 

In this section, we explain the results analyzed by KAS. The result of each layer is output as text data.

5.5.1.1 Result of DL 

Table 5.1 shows the results from the Detection Layer. The latency is expressed as $T_{time} - HRT_{tick}$, which is the HRTimer latency. We defined $HRT_{latency} \geq 100\mu s$ as a latency and recorded the number of such latencies in the latency count column of Table 5.1. DL records not only the latency time but also the line information (start line and end line) where the latency occurred.

Figure 5.9 shows the source code that saves the line information of each latency. latency_start_end[start_end_num][] is a two-dimensional array: the first index, start_end_num, is the latency count, and the second dimension holds the start line and the end line of the latency.


Table 5.1: Result of Detection Layer

HRTimer latency (ns)   Latency count   Start line   End line
100,032                1               1,203        1,588
116,693                2               5,606        5,810
1,423                  -               -            -
…                      …               …            …
100,806                15              18,445       18,548
950                    -               -            -
93                     -               -            -
5,393                  -               -            -
100,465                16              31,220       31,310
1,423                  -               -            -
548                    -               -            -
46                     -               -            -
103,898                17              45,101       45,204
432                    -               -            -

Figure 5.9: Source code of DL for line information 


5.5.1.2 Result of SL 



SL reads the latency_start_end value, which is the location information saved by DL, together with the event log. It then separates the latency data from the whole event log.

Figure 5.10: Result of Separation Layer 

Figure 5.10 shows the result of SL. It separates every event that occurred between smp_apic_timer_interrupt_entry and smp_apic_timer_interrupt_exit, which mark the entry and exit of the HRTimer interrupt.

5.5.1.3 Result of AL 

AL is the last layer of KAS, and it computes statistical information about every delay that occurred in HRTimer. The statistical information is output for every period, and it can be used to trace the latency.


Table 5.2: Result of Analysis Layer

Event name                    Execution times   Consumption (ns)
kernel_arch_syscall_exit      114               129,687
kernel_arch_syscall_entry     114               269,678
kernel_sched_try_wakeup       8                 16,628
kernel_timer_itimer_expired   1                 688
kernel_softirq_raise          2                 2,976
kernel_softirq_exit           3                 2,257
kernel_softirq_entry          3                 4,182
kernel_timer_set              4                 4,013
kernel_timer_update_time      1                 1,955
kernel_send_signal            2                 2,939
kernel_irq_exit               3                 9,675
kernel_irq_entry              3                 15,591
mm_page_free                  9                 18,643
mm_page_alloc                 351               553,913
fs_writev                     2                 1,616
fs_write                      1                 1,010
fs_read                       4                 3,317
fs_ioctl                      4                 3,126
fs_pollfd                     24                19,678
fs_select                     66                54,945
kernel_sched_schedule         6                 9,645
input_event                   7                 30,820
net_socket_call               85                53,884
net_socket_sendmsg            85                68,806
net_dev_receive               6                 4,761
net_dev_xmit                  6                 15,093



Table 5.2 shows statistical information on a latency that occurred: event names, execution times, and consumption time. Execution times is the number of times each event was called in a period, and consumption time is the total time spent in each event during a period, in nanoseconds. By examining execution times and consumption time in Table 5.2, we can see that events such as mm_page_alloc, net_socket_call, and net_socket_sendmsg have higher values than most other events.

Table 5.3: Result of Analysis Layer: (a) Execution times when the time latency did not occur, (b) Execution times when the time latency occurred

Event name                    (a) Execution times   (b) Execution times
kernel_arch_syscall_entry     37                    92
kernel_arch_syscall_exit      37                    91
net_socket_recvmsg            0                     2
net_socket_sendmsg            0                     85
net_dev_xmit                  0                     85
mm_page_alloc                 3                     359
mm_page_free                  3                     20
…                             …                     …
kernel_softirq_entry          1                     6
kernel_softirq_exit           1                     6
kernel_timer_itimer_expired   1                     1

Table 5.3 also shows results of the analysis layer. In the table, case (a) gives the execution times when the time latency did not occur, and case (b) gives the execution times when the time latency occurred under the network stress and I/O stress programs. By comparing case (a) with case (b), we can figure out which events cause the time latency. In this result, the events net_socket_sendmsg, net_dev_xmit, and mm_page_alloc were executed most often. In particular, the mm_page_alloc event caused the largest time latency. Consequently, we can find the events that caused the latency.
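The comparison of case (a) and case (b) can also be done mechanically. The following Python sketch ranks the events by the increase in execution times between the two cases. The counts are copied from the rows listed in Table 5.3 (the rows elided in the table are not included), and the function name rank_by_increase is hypothetical.

```python
# Execution times from Table 5.3: (a) without latency, (b) with latency.
table_5_3 = {
    "kernel_arch_syscall_entry": (37, 92),
    "kernel_arch_syscall_exit": (37, 91),
    "net_socket_recvmsg": (0, 2),
    "net_socket_sendmsg": (0, 85),
    "net_dev_xmit": (0, 85),
    "mm_page_alloc": (3, 359),
    "mm_page_free": (3, 20),
    "kernel_softirq_entry": (1, 6),
    "kernel_softirq_exit": (1, 6),
    "kernel_timer_itimer_expired": (1, 1),
}

def rank_by_increase(counts):
    """Sort event names by (b) - (a), largest increase first."""
    return sorted(counts,
                  key=lambda name: counts[name][1] - counts[name][0],
                  reverse=True)

print(rank_by_increase(table_5_3)[:3])
# prints: ['mm_page_alloc', 'net_socket_sendmsg', 'net_dev_xmit']
```

The three events at the top of this ranking are the same three events singled out in the text above.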

5.5.2 Analysis of HRTimer Latency 

In this section, we analyze the cause of the delay in HRTimer and describe a solution based on the result analyzed by KAS.

Figure 5.11: Event log of part where delay occurred 



Figure 5.11 shows the part of the event log where HRTimer latency occurred. According to the result of AL, events such as net_socket_sendmsg, net_dev_xmit, and mm_page_alloc occurred many times. These events were processed before the HRTimer event (HRTIMER_SOFTIRQ), and according to our analysis of the kernel source, they have higher priority than the softirq of HRTimer.

Figure 5.12 shows the result of the analysis based on the event logs in Figure 5.11, that is, how the HRTimer latency arises. The HRTimer softirq (HRTIMER_SOFTIRQ) is executed between the executions of the other softirq handlers. We can see that run_hrtimer_softirq() ran after net_dev_xmit (NET_TX_SOFTIRQ) and mm_page_alloc (BLOCK_SOFTIRQ), which have higher priority than the HRTimer softirq.

Figure 5.12: One of the reasons of HRTimer latency 

Figure 5.13 shows the interrupt processing on a timeline. While a softirq with high priority is in progress, a softirq with low priority cannot be executed. After the high-priority softirq has finished, the low-priority softirq (the HRTimer softirq) is executed.


Figure 5.13: Result of analysis of HRTimer latency 

Figure 5.14 shows the kernel source after changing the priority of the softirq that was the cause of the HRTimer delay. We ran the experiment again after raising the priority of the HRTimer softirq above the network softirq and the block softirq in interrupt.h.
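The effect of this change can be illustrated with a small model in Python. The model assumes that pending softirqs are served one after another in a fixed priority order; the handler run times are hypothetical values chosen for illustration, and the function name wait_before is not part of the kernel or of KAS.

```python
def wait_before(target, pending_order, handler_ns):
    """Time the target softirq waits while the softirqs ahead of it run."""
    wait = 0
    for name in pending_order:
        if name == target:
            return wait
        wait += handler_ns[name]
    raise ValueError(f"{target} is not pending")

handler_ns = {  # hypothetical handler run times in ns
    "NET_TX_SOFTIRQ": 60_000,
    "BLOCK_SOFTIRQ": 50_000,
    "HRTIMER_SOFTIRQ": 5_000,
}
original = ["NET_TX_SOFTIRQ", "BLOCK_SOFTIRQ", "HRTIMER_SOFTIRQ"]
modified = ["HRTIMER_SOFTIRQ", "NET_TX_SOFTIRQ", "BLOCK_SOFTIRQ"]
print(wait_before("HRTIMER_SOFTIRQ", original, handler_ns))  # prints: 110000
print(wait_before("HRTIMER_SOFTIRQ", modified, handler_ns))  # prints: 0
```

With these assumed run times, the wait in the original order alone exceeds a 100μs period, while in the modified order the HRTimer softirq does not wait for the network and block handlers at all.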

Figure 5.14: Kernel source of softirq modified HRTimer 

After changing the softirq priority, we repeated the experiment with a 100μs period in the same environment as before. Figure 5.15 shows the data analyzed with KAS after the repeated experiments. We compared real-time-patched Linux with general-purpose Linux, using both the unmodified and the modified softirq priority. According to the data shown in Figure 5.15(a), there were 608 delays in Linux-rt-patched and 5,546 delays in general-purpose Linux. In Figure 5.15(b), the number of delays was 1,607 in Linux-rt-patched and 1,620 in general-purpose Linux.

Figure 5.15: Result of experiment on Linux-RT and general Linux in 100μs. (a) Linux-2.6.24-rt-patched and Linux-2.6.24-not_changed-softirq. (b) Linux-2.6.24-rt-patched and Linux-2.6.24-changed-softirq.

Figure 5.16 shows the result of the experiment on Linux-rt-patched and general-purpose Linux with a 1ms period. In Figure 5.16(a), there were 229 delays in Linux-rt-patched and 524 delays in general-purpose Linux. In Figure 5.16(b), there were 289 delays in Linux-rt-patched and 225 delays in general-purpose Linux. Comparing the HRTimer delays at 100μs and at 1ms, it is clear that delays occur much less often at 1ms. The number of delays in general-purpose Linux is also lower in Figure 5.16(b), after the HRTimer delay problem was addressed with KAS, than in Figure 5.16(a). This is the result of the experiment after modifying the softirq priority, which is one of the causes of latency.


Figure 5.16: Result of experiment on Linux-RT and general Linux in 1ms. (a) Linux-2.6.24-rt-patched and Linux-2.6.24-not_changed-softirq. (b) Linux-2.6.24-rt-patched and Linux-2.6.24-changed-softirq.

Figure 5.17 shows the result of an experiment on HRTimer latency in Linux-2.6.24. Figure 5.18 shows the HRTimer latency in Linux-2.6.24 with the changed softirq priority, in the same environment as Figure 5.17. We can see that the delay has decreased compared to the general-purpose Linux kernel.

Figure 5.17: Result of an experiment of HRTimer latency in Linux-2.6.24 with heavy 

background load 



Figure 5.18: Result of an experiment of HRTimer latency in Linux-2.6.24-changed-softirq 

with heavy background load 

Figure 5.19: Result of experiment of HRTimer latency in Linux-2.6.24-rt-patched with heavy 

background load 

Figure 5.19 shows the result of an experiment on HRTimer latency in Linux-2.6.24-rt-patched with heavy background load. This is the result in the same environment as Figure 5.17 after applying the real-time patch to Linux-2.6.24.


5.6 Summary 

We analyzed the latency of HRTimer by using KAS, which is not a perfect solution for timer latency. Latency in the kernel has not just one cause but a complex set of causes, such as the relations between processes and the dependencies on hardware interrupts (hardirq) [10] [19] and software interrupts (softirq). Because there are so many causes of latency, it is impossible to solve the problem with one perfect solution. In our high-resolution timer latency experiment and event log analysis, the most common problem was the priority of softirq.

KAS has been evaluated on the HRTimer problem, and it cannot currently analyze other problems. However, the evaluation demonstrated the use of KAS for analyzing the kernel timer, which is one of the most important and difficult problems in the kernel. A developer or a system administrator who uses KAS can analyze timer problems quickly and efficiently.



Conclusions and Future Work 

In this chapter, we present our conclusions, point out the flaws of KAS, and describe future work.

6.1 Conclusions 

Embedded systems are widely used in various fields. Embedded Linux in particular has many advantages because it inherits many of the strong points of Linux. However, embedded Linux sometimes faces complex problems because embedded systems are rapidly becoming more complex. Sometimes the problems occur after the system has been released. Therefore, not only the developer of the system but also its users may be harmed by unexpected problems.


Embedded systems contain many errors and bugs. Some of them are solved easily, but many are complicated problems such as memory leaks and timer latency. In addition, it is not easy to find where a problem occurred or how a developer or system manager should fix it in order to improve the performance of the system.

Every embedded kernel generates event information such as irq, system call, I/O, memory and network events. This event information is one of the most useful resources for analyzing the cause of a problem in an embedded system and improving its performance. However, the amount of event information is huge, so analyzing all of it is neither easy nor effective.

We propose KAS, a new system architecture that analyzes the event log of the embedded kernel. KAS finds problems in the kernel very quickly and separates them from the whole event information. It then analyzes them statistically and provides a developer or system manager with a recommended solution. In a case study, we tested HRTimer latency and found that it is usually caused by softirq priority. However, solving the priority problem alone cannot settle the HRTimer latency problem completely, since the latency is sometimes caused by hardirq latency or by latency at an interrupt lock. Nevertheless, we examined how effectively and quickly problems in an embedded kernel can be solved by using KAS.

6.2 Future Work 

In this section, we describe the problems of KAS and explain how to improve it. KAS does not currently support real-time analysis. In addition, it needs to be improved so that it can be connected to an accounting system that is able to control processes.

6.2.1 Real-Time Architecture of KAS 



KAS starts its analysis only after event logging has ended. This can be a problem because all the event log information must be saved first. In addition, problems cannot be prevented in advance because KAS detects them from the saved event log information. There are thus two main problems in KAS. First, it is not easy to analyze a problem because the event log information is huge. Second, problems cannot be prevented in advance.

To improve the KAS, we need to decrease of the event log information not by saving all the 

information but by saving only the part of the problem in real-time. This can be very helpful 

to decrease workload of a system and, save time for analyzing SL and AL at KAS. 
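The idea of saving only the part of the event log around a problem can be sketched as follows. This is a minimal illustration, not the KAS implementation: the `Event` record, the `LATENCY_THRESHOLD_US` value and the `ProblemWindowLogger` class are hypothetical names introduced here. The sketch keeps the most recent events in a bounded buffer and, when an event exceeds the threshold, saves the buffered events that led up to it together with a fixed number of events that follow.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical threshold; the text does not give a concrete value.
LATENCY_THRESHOLD_US = 100


@dataclass
class Event:
    timestamp_us: int
    name: str
    latency_us: int


class ProblemWindowLogger:
    """Keep a bounded history and save only windows around problem events."""

    def __init__(self, before=3, after=2):
        self.history = deque(maxlen=before)  # most recent normal events
        self.after = after                   # events to keep after a problem
        self.remaining_after = 0
        self.saved = []                      # events that would be written out

    def record(self, event):
        if event.latency_us > LATENCY_THRESHOLD_US:
            # Problem detected: flush the context that preceded it.
            self.saved.extend(self.history)
            self.history.clear()
            self.saved.append(event)
            self.remaining_after = self.after
        elif self.remaining_after > 0:
            # Still inside the window that follows a problem.
            self.saved.append(event)
            self.remaining_after -= 1
        else:
            # Normal event: remember it only temporarily.
            self.history.append(event)


def run_demo():
    """Feed 20 events, one of which (index 10) has a high latency."""
    logger = ProblemWindowLogger(before=3, after=2)
    for i in range(20):
        latency = 500 if i == 10 else 10
        logger.record(Event(timestamp_us=i * 1000, name="hrtimer", latency_us=latency))
    return [e.timestamp_us // 1000 for e in logger.saved]
```

In this demo only six of the twenty events are saved, which is the kind of reduction the real-time architecture aims for. The window sizes are arbitrary choices for illustration.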

Figure 6.1 shows the real-time architecture model of KAS. It checks for problems in real time through the KAS daemon. If a problem is detected, KAS saves only the part of the event log in which the problem occurred and analyzes it by using SL and AL.

Figure 6.1: Real-Time Architecture of KAS

6.2.2 KAS and CABI

In the case study, we tested HRTimer latency. The results show that some processes occupy the CPU excessively or cause timer latency in the Linux kernel. Therefore, if KAS finds a process that uses too much CPU, it sends the process id to an accounting system. The accounting system [50] can then manage CPU usage based on the process id to lower the timer latency in embedded systems. We have integrated KAS and the CABI system [51] [52] in the Linux kernel; however, we have not yet measured the timer latency with the integrated system.

Figure 6.2 shows the integrated system of KAS and CABI. KAS finds the process that causes the timer latency and sends its process id to CABI. CABI then manages the CPU usage of that process.

Figure 6.2: KAS and CABI
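The flow from detection to accounting can be sketched as follows. The interface between KAS and CABI is not described in this section, so everything here is an assumption made for illustration: `find_cpu_hogs`, `report_hogs`, the `AccountingStub` class and the 50% threshold are hypothetical. The sketch computes each process's share of CPU time over a trace interval and hands the ids of processes above the threshold to a stand-in accounting object.

```python
def find_cpu_hogs(cpu_time_us_by_pid, interval_us, max_share=0.5):
    """Return pids whose CPU share in the interval exceeds max_share."""
    if interval_us <= 0:
        raise ValueError("interval_us must be positive")
    hogs = []
    for pid, cpu_time_us in sorted(cpu_time_us_by_pid.items()):
        if cpu_time_us / interval_us > max_share:
            hogs.append(pid)
    return hogs


class AccountingStub:
    """Stand-in for the accounting system; it only records what it is told."""

    def __init__(self):
        self.limited = {}

    def limit_cpu(self, pid, share):
        self.limited[pid] = share


def report_hogs(cpu_time_us_by_pid, interval_us, accounting, max_share=0.5):
    """Send every process above the threshold to the accounting system."""
    for pid in find_cpu_hogs(cpu_time_us_by_pid, interval_us, max_share):
        accounting.limit_cpu(pid, max_share)
    return accounting.limited
```

For example, with `{101: 700_000, 102: 200_000, 103: 100_000}` microseconds of CPU time in a one-second interval, only process 101 is reported.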

Appendix 

A.1 RTOS 

In general, the goal of an operating system (OS) is to provide a convenient environment in which a user can run programs. In other words, an OS is a system program that makes a computer system easy to use and uses the computer hardware efficiently. The OS is therefore the core software of a computer, and it plays a very important role in controlling hardware, software and data. A real-time operating system (RTOS) can be defined in many ways, but it is essentially an OS that guarantees that interrupts are processed within a bounded time, which makes it suitable for real-time applications such as embedded applications. In embedded systems, OSes can be broadly divided into real-time OSes and non-real-time OSes. VxWorks, pSOS, VRTX, QNX, OSE, Nucleus, and µC/OS-II are good examples of commercial real-time OSes. All of these real-time OSes support preemptive multitasking and the POSIX API. In a preemptive kernel, each task has a priority, and a high-priority task runs in preference to low-priority tasks. Like other OSes, a real-time OS has a kernel mode and a user mode. In addition, a real-time OS usually comes with an integrated development environment (IDE) and debugging tools, which make software development easier. However, commercial real-time OSes require royalty payments, which increase both the development cost of the system and the cost of the product.

The characteristics of an RTOS are as follows.

• It supports multithreading and preemption.

• It guarantees the priority of each process.

• It supports synchronization among threads.

• Its timing behavior must be clearly defined (interrupt latency, system call processing time, and the time for which the OS and drivers mask interrupts).

A real-time system also has deadlines, and real-time systems can be classified by their time constraints into several types, including the following.

• Hard real-time system: hardware or software that must operate within the confines of a stringent deadline. If the deadline is missed, the result is financial loss or harm to users.

• Soft real-time system: a failure to meet a deadline is not considered an application failure or a system failure. The system can tolerate occasional deadline misses.

Because of these characteristics, a real-time system places constraints on both hardware and software. The hardware must provide reliability, fault tolerance and scalability. The software, that is, the real-time OS, must provide real-time task scheduling, task synchronization, interrupt priorities and a real-time clock according to the purpose of the platform.

A.2 RT-Linux 

RT-Linux, which adds real-time capability to general-purpose Linux, was started by Victor Yodaiken at New Mexico Tech. Because the Linux kernel has poor real-time characteristics, a real-time kernel was created to run real-time applications. The Linux kernel itself was not rewritten; instead, a real-time module was added to it as a patch. This makes it possible to run real-time applications with minimal modification of the kernel sources. After the real-time module is patched in, the processes of the Linux kernel are assigned a lower priority than real-time applications.

General-purpose Linux schedules tasks by time slice. If a task with a higher priority than the current task becomes ready, the current task is not stopped; the high-priority task waits until it receives a time slice. For this reason, general-purpose Linux has poor real-time characteristics. RT-Linux has better real-time characteristics than the standard Linux kernel. RT-Linux supports both soft real-time and hard real-time; however, its hard real-time support is not perfect.

A.3 Real-time scheduling 

In a real-time system, what matters is the time the system takes to process data delivered from a sensor and deliver the result to an actuator. If the CPU processes only one task, scheduling is very simple: only the performance of the CPU affects whether the deadline is met. Because the CPU processes many tasks with different characteristics at the same time, the scheduling problem becomes complicated. The purpose of scheduling is to prevent a real-time task from missing its deadline because of tasks with weaker real-time requirements. To address the real-time scheduling problem, hundreds of algorithms have been developed for different task characteristics, as shown in Figure A.1.

Figure A.1: Classification of real-time scheduling algorithm 

In real-time scheduling, tasks are classified as periodic, aperiodic, or sporadic according to their timing characteristics. A periodic task is a task that repeats at a fixed period. An aperiodic task has no timing characteristics; a task that outputs status information at the request of an administrator is an example. A sporadic task has timing constraints like a periodic task, but it is not known in advance when it will be released. Tasks can also be divided into critical and non-critical tasks according to the seriousness of missing a deadline. Moreover, tasks can be divided into preemptive and non-preemptive tasks according to whether the task yields the CPU when a higher-priority task arrives. An ordinary task yields the CPU to a higher-priority task and is therefore preemptive. However, a task cannot be preempted while it is executing a critical section.

Priorities can be assigned to tasks statically or dynamically. With static priority, the priority assigned by the scheduler never changes; with dynamic priority, the priority changes at run time according to the state of the tasks. Dynamic priority can schedule tasks more effectively than static priority. The algorithms most used in commercial real-time OSes such as VxWorks and pSOS are fixed-priority scheduling and Earliest Deadline First (EDF). Both are preemptive scheduling methods, and both assume a single processor.

First, as the name suggests, fixed-priority scheduling assigns a fixed priority to every task. The question is how to assign the fixed priorities. In 1973, Liu and Layland proposed an algorithm called rate-monotonic scheduling, which assigns higher priority to tasks with shorter periods (assuming that every task is periodic and that the deadline of each task is equal to its period). Rate-monotonic scheduling has been widely used for decades because of its simplicity and its mathematical property that a task set is guaranteed to be schedulable whenever the Utilization Factor (UF) does not exceed a fixed bound, n(2^(1/n) - 1) for n tasks, which approaches about 0.69 as n grows.
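The utilization test for rate-monotonic scheduling can be sketched as follows. The Liu and Layland bound for n periodic tasks is n(2^(1/n) - 1), which equals 1.0 for one task, about 0.83 for two tasks, and tends to ln 2 (about 0.693) as n grows. The function names and the `(execution_time, period)` task representation are assumptions made for this illustration. The test is sufficient but not necessary: a task set that fails it may still be schedulable.

```python
def rm_utilization_bound(n):
    """Liu and Layland bound for n periodic tasks: n * (2**(1/n) - 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n * (2 ** (1.0 / n) - 1)


def utilization(tasks):
    """Total utilization of tasks given as (execution_time, period) pairs."""
    return sum(c / t for c, t in tasks)


def rm_priority_order(tasks):
    """Rate-monotonic order: the shorter the period, the higher the priority."""
    return sorted(tasks, key=lambda task: task[1])


def passes_rm_bound(tasks):
    """Sufficient (not necessary) schedulability test for rate-monotonic."""
    return utilization(tasks) <= rm_utilization_bound(len(tasks))
```

For instance, the task set `[(1, 4), (1, 5), (2, 10)]` has a utilization of 0.65, which is below the three-task bound of about 0.78, so it passes the test.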

EDF is an algorithm that gives higher priority to the task whose deadline is nearer. For this reason, it is also called the deadline-driven algorithm. It can schedule not only periodic tasks but also other kinds of tasks on a single-processor architecture, and it has been mathematically proved to be optimal for preemptive scheduling on a single processor.
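The EDF rule can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not taken from the text: time advances in unit steps, every task is periodic with its deadline equal to its period, all tasks are released at time 0, and a job that misses its deadline is dropped. The function name `edf_schedule` and the `(name, execution_time, period)` task format are hypothetical.

```python
import heapq


def edf_schedule(tasks, horizon):
    """Simulate preemptive EDF on one processor in unit time steps.

    tasks: list of (name, execution_time, period); deadline equals period.
    Returns (timeline, missed), where timeline[t] is the task run at time t
    (None when idle) and missed lists (name, deadline) pairs that were missed.
    """
    ready = []      # heap of [absolute_deadline, seq, name, remaining]
    timeline = []
    missed = []
    seq = 0         # tie-breaker so that heap entries never compare names
    for now in range(horizon):
        # Release a new job of every task whose period starts now.
        for name, exec_time, period in tasks:
            if now % period == 0:
                heapq.heappush(ready, [now + period, seq, name, exec_time])
                seq += 1
        # Drop unfinished jobs whose deadline has already passed.
        while ready and ready[0][0] <= now:
            deadline, _, name, _ = heapq.heappop(ready)
            missed.append((name, deadline))
        if not ready:
            timeline.append(None)
            continue
        # Run the job with the earliest absolute deadline for one time unit.
        job = ready[0]
        timeline.append(job[2])
        job[3] -= 1
        if job[3] == 0:
            heapq.heappop(ready)
    return timeline, missed
```

With tasks `[("A", 1, 2), ("B", 2, 5)]`, whose total utilization is 0.9, the simulation over 10 time units reports no missed deadlines, which is consistent with EDF scheduling any such task set whose utilization does not exceed 1.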


A.4 CABI 


CABI is a system that manages CPU resources in Linux. Linux does not restrict the resource consumption of its processes. For example, when malicious application programs are downloaded and executed, they can easily consume a large amount of CPU capacity. Multimedia applications, in turn, need finer-grained control and CPU reservation. Requirements for such CPU QoS are increasing even in the embedded systems area. To solve this problem, CABI (CPU Accounting and Blocking Interfaces, since renamed Common resource Accounting and Blocking Interfaces) [50] [51] was proposed. It is a general-purpose resource monitoring and restriction system that prevents a process or a group of processes from using an excessive share of resource capacity. CABI is implemented in the Linux kernel.

CABI was designed with the following three issues in mind [51].

• Simplicity

CABI should be simple and generic so that it can be used in a variety of OS services such as security enhancement, class-based accounting, overload monitoring, and processor reservation.

• Accuracy

CABI should monitor the CPU usage of each process very accurately in order to make the execution of applications more stable. A fine-grained timer is used to achieve accurate monitoring.

• Portability

CABI should be implementable in a variety of operating systems. The system confines its interface to a few hooks in the host kernel.

Figure A.2: Control the consumptions of the resources by CABI 

Figure A.2 shows an example of CABI controlling process groups. CABI groups related processes into an object and controls the CPU utilization of each object. If the CPU utilization of the Audio & Video application object is set to 60%, the MPEG, Mailer and Browser processes included in that object can together use at most 60% of the total CPU. CABI is a resource monitoring and restriction system whose purpose is to improve the reliability and security of the system. It is generic enough to support various services that require CPU resource control, such as security improvement, overload control, and class-based accounting [50].
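The group-based restriction described above can be sketched as follows. This is not the CABI interface, which is not given in this appendix; the `ResourceGroup` class, its method names and the accounting period are hypothetical. The sketch gives a group of processes one shared CPU budget per accounting period and treats the group as blocked once the budget is used up.

```python
class ResourceGroup:
    """A group of processes that shares one CPU budget per accounting period."""

    def __init__(self, name, cpu_share, period_us):
        if not 0.0 < cpu_share <= 1.0:
            raise ValueError("cpu_share must be in (0, 1]")
        self.name = name
        self.budget_us = round(cpu_share * period_us)
        self.used_us = 0
        self.pids = set()

    def add_process(self, pid):
        self.pids.add(pid)

    def charge(self, pid, run_time_us):
        """Account CPU time used by a member; return the time actually granted."""
        if pid not in self.pids:
            raise KeyError(f"pid {pid} is not in group {self.name}")
        granted = min(run_time_us, self.budget_us - self.used_us)
        self.used_us += granted
        return granted

    def is_blocked(self):
        """Members are blocked once the group budget is exhausted."""
        return self.used_us >= self.budget_us

    def start_new_period(self):
        """Replenish the budget at the start of the next accounting period."""
        self.used_us = 0
```

With a 60% share of a 100 ms period, three member processes can together consume at most 60 ms; any request beyond that is cut short and the group stays blocked until the next period begins.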


Bibliography


[1] L. Abeni, A. Goel, C. Krasic, J. Snow, and J. Walpole, “A measurement- 

based analysis of the real-time performance of the Linux kernel”, In Real- 

Time Technology and Applications Symposium (RTAS 2002), Sept. 2002. 

[2] Tim Bird, “Learning the kernel and finding performance problems with 

KFI”, In CELF International Technical Conference, 2005. 

[3] Mathieu Desnoyers and Michel R. Dagenais, “The lttng tracer: A low 

impact performance and behavior monitor for gnu/Linux”, In OLS (Ottawa 

Linux Symposium) 2006, July. 

[4] Robert W. Wisniewski, Reza Azimi, Mathieu Desnoyers, Maged M. 

Michael, Jose Moreira, Doron Shiloach, and Livio Soares, “Experiences 

Understanding Performance in a Commercial Scale-Out Environment”, 13th 

International Euro-Par Conference, 2007. 

[5] LTTng project, http://ltt.polymtl.ca/. 

[6] G. Anzinger, High resolution timers project, 

http://high-res-timers.sourceforge.net/.

[7] Yaghmour K. and Dagenais M. R., “Measuring and characterizing system 

behavior using kernel-level event logging”, In Proceedings of the USENIX 

Annual Technical Conference, pp. 13-26, 2000. 

[8] System Director Mevalet, http://www.nec.co.jp/cced/mevalet. 

[9] Mathieu Desnoyers and Michel Dagenais, “Low disturbance embedded 

system tracing with Linux Trace Toolkit Next Generation”, In ELC 

(Embedded Linux Conference) 2006. 

[10] Martin Bligh, Mathieu Desnoyers and Rebecca Schultz, “Linux Kernel 

Debugging on Google-sized clusters”, Proceedings of the Linux 

Symposium June, 2007, Ottawa, Ontario in Canada. 


[11] Linux Kernel State Tracer, http://lkst.sourceforge.net . 


[12] Debugging with Data Display Debugger, User Guide and Reference 

Manual, First Edition, for DDD Version 3.3.9, 15 January 2004. 

[13] Kernel Function Trace, http://eLinux.org/Kernel_Function_Trace. 

[14] Nucleus, “An Introduction to the Real-time OS & Nucleus PLUS 

Training Guide”, Accelerated Technology. 

[15] Yu-Chung and K.-J Lin, “Enhancing the Real-Time Capability of the 

Linux Kernel”, In Proceedings of the IEEE Real Time Computing 

Systems and Applications, Hiroshima, Japan, October 1998. 

[16] Mark Wilding and Dan Behman, “Self-Service Linux: Mastering the Art of 

Problem Determination”. 

[17] Ki Duk Kwon, Joon Mo Jung and Sang Hong Kwon. “A Dynamic 

Voltage Scaling Algorithm for Aperiodic Tasks”, The Korea Academia- 

Industrial cooperation Society (KAIS), vol.7 no.5 P.866-874, October 

2006. 

[18] Bryan M. Cantrill, Michael W. Shapiro, and Adam H. Leventhal. 

“DTrace: Dynamic instrumentation of production systems”, In USENIX04, 

2004. 

[19] SystemTAP: Vara Prasad, William Cohen, Frank Ch. Eigler, Martin 

Hunt, Jim Keniston, and Brad Chen. “Locating system problems using 

dynamic instrumentation”, In OLS05 (Ottawa Linux Symposium) , 2005. 

[20] Mathieu Desnoyers and Michel R. Dagenais, “Deploying LTTng on 

Exotic Embedded Architectures”, Embedded Linux Conference 2009.


[21] C. Yuan, N. Lao, J.-R. Wen, J. Li, Z. Zhang, Y.-M. Wang, W.-Y. Ma, 

"Automated Known Problem Diagnosis with Event Traces", Microsoft 

Research Technical Report MSR-TR-2005-81, Jun. 2005. 

[22] Stephen Atkins, IBM pSeries Technical Support, IBM Software Group, 

http://www.ibm.com/developerworks/aix/library/au-nmon_analyser/. 

[23] Lauterbach, “Integrated Run and Stop Mode Debugging for Embedded 

System”, Embedded System Conference 2007. 

[24] The Linux Advantage (Join the Linux revolution), Sage Software, Inc. 

[25] IA-PC HPET (High Precision Event Timer) Specification, Intel 

Corporation, 2004. 

[26] Embedded World, http://www.embeddedworld.co.kr/english/. 

[27] Luis Henriques. “Threaded IRQs on Linux PREEMPT-RT”, In 

International Workshop on Operating Systems Platforms for Embedded 

Real-Time Applications. Pages 23-32. Dublin, Ireland June 2009. 

[28] T. Gleixner and D. Niehaus, “Hrtimers and beyond: Transforming the 

Linux timer subsystem”, in Proc. Linux Symposium, Ottawa, Ontario, 

Canada, July 2006. 

[29] B. Srinivasan, S. Pather, R. Hill, F. Ansari, and D. Niehaus. “A firm 

real-time system implementation using commercial off-the shelf 

hardware and free software”, In 4 th Real-Time Technology and 

Applications Symposium, Denver, June 1998. 

[30] Mathieu Desnoyers and M. R. Dagenais, “Tracing for hardware, driver, 

and binary reverse engineering in Linux” , CodeBreaks Journal, vol. 1, 

no. 1, 2007. 

[31] Gabriel Matni and M. Dagenais, “Automata-based approach for kernel 

trace analysis”, in Proceedings of the 22nd IEEE Canadian Conference 


on Electrical and Computer Engineering, (St. John's, Newfoundland, 

Canada), May 2009. 

[32] Mathieu Desnoyers and M. Dagenais, “LTTng, filling the gap between 

kernel instrumentation and a widely usable kernel tracer”, in Proceedings 

of the 3rd Annual Linux Foundation Collaboration Summit, (San 

Francisco, California), April 2009. 

[33] Jean-Hughes Deschenes, Mathieu Desnoyers, and M. Dagenais, 

“Tracing Time Operating System State Determination”, The Open 

Software Engineering Journal, vol. 2, pp. 40-44, 2008. 

[34] Eric Clement and M. Dagenais, “Traces synchronization in distributed 

networks”, Journal of Computer Systems, Networks, and 

Communications, vol. 2009, 2009. 

[35] M. Dagenais, R. Moore, R. Wisniewski, K. Yaghmour, and T. Zanussi, 

“Efficient and accurate tracing of events in Linux clusters”, in 

Proceedings of the 2003 High Performance Computing Systems and 

Applications & OSCAR Symposium, (Sherbrooke, Quebec Canada), pp. 

291-294, May 2003. 

[36] Ki Duk Kwon, Midori Sugaya and Tatsuo Nakajima. “KTAS: Analysis 

of Timer Latency for Embedded Linux Kernel”, International Journal of 

Advanced Science and Technology (IJAST), vol.18, May, 2010. 

[37] E. Merlo, M. Dagenais, P. Bachand, J. S. Sormani, S. Gradara, and G. 

Antoniol, “Investigating large software system evolution: The Linux 

kernel”, in Proceedings of the 26th Annual International Computer 

Software and Applications Conference, (Oxford, England), pp. 421-426, 

August 2000.


[38] Magdalena Balazinska, E. Merlo, M. R. Dagenais, B. Laguë, and K. 

Kontogiannis, “Advanced clone analysis to support object-oriented 

system refactoring”, in Proceedings of the 7th Working Conference on 

Reverse Engineering, (Brisbane, Australia), November 2000. 

[39] Karim Yaghmour and M. R. Dagenais, “The Linux Trace Toolkit”, in 

Actes de la conférence Linux Expo, (Montreal, Quebec, Canada), April 

2000. 

[40] Y. Blaquière, M. Dagenais, and Y. Savaria, “A new accurate and 

hierarchical timing analysis approach”, in Proceedings of the IEEE 

European Design Automation Conference, February 1993. 

[41] Kwon Ki Duk, Sugaya Midori, Ohno Yuuki and Nakajima Tatsuo. 

“Performance analysis of information explosion by using LTTng”, 

Information Processing Society of Japan (IPSJ), 5-299, March 2008. 

[42] Ohno Yuuki, Sugaya Midori and Kwon Ki-Duk. “Performance 

analysis of distributed applications in the information explosion era”, 

Information Processing Society of Japan (IPSJ), 5-147, March 2008. 


[43] Kiduk Kwon, Midori Sugaya, Tatsuo Nakajima. "Analysis of High 

Resolution Timer Latency Using Kernel Analysis System in Embedded 

System”, 12th IEEE Symposium on Object/component/service-oriented 

Real-time distributed Computing Co-located with First International 

Workshop on Software Technologies for Future Dependable Distributed 

Systems(STFSSD 2009), pp.122-126, March 2009. 

[44] Martin Schulz, Brian S. White, Sally A. McKee, Hsien-Hsin Lee, and 

Jürgen Jeitner, “Owl: Next Generation System Monitoring”, In 

Proceedings of Computing Frontiers (CF'05) , Ischia, IT, May 2005. 


[45] R. Vaarandi, “Tools and Techniques for Event Log Analysis”, PhD 

Thesis, Tallinn University of Technology, 2005. 

[46] J. E. Prewett. “Analyzing cluster log files using logsurfer”, In Proc. 

Annual Conf. on Linux Clusters,2003. 

[47] Swatch, http://sial.org/howto/logging/swatch. 

[48] Syslog, http://www.loganalysis.org. 

[49] Nicholas Mc Guire, “Kernel Function Instrumentation – KFT”, 

Distributed & Embedded System Lab, Lanzhou University, December 31, 

2006. 

[50] Midori Sugaya, Shuichi Oikawa, Tatsuo Nakajima, “Accounting 

System: A fine-grained CPU resource protection mechanism for 

Embedded System”, IEEE International Symposium on Object-oriented 

Real-time distributed Computing (ISORC) 2006: 72-84. 

[51] CABI, http://osrg.dcl.info.waseda.ac.jp/~doly/cabi/. 

[52] Midori Sugaya, Yuki Ohno, Andrej van der Zee, Tatsuo Nakajima, “A 

Lightweight Anomaly Detection System for Information Appliances”, 

IEEE International Symposium on Object-oriented Real-time distributed 

Computing (ISORC) 2009 

[53] Android, http://developer.android.com/sdk/index.html. 

[54] iOS4, http://developer.apple.com/technologies/iphone/whats-new.html 

[55] Kirk Glerum, Kinshuman Kinshumann, Steve Greenberg, Gabriel Aul, 

Vince Orgovan, Greg Nichols, David Grant, Gretchen Loihle, and Galen 

Hunt, “Debugging in the (Very) Large: Ten Years of Implementation and 

Experience”, Proceedings of the 22nd ACM Symposium on Operating 

Systems Principles (SOSP '09).


[56] Balakrishnan, S.; Ravi Rajwar; Upton, M.; Lai, K., “The impact of 

performance asymmetry in emerging multicore architectures”, Computer 

Architecture, 2005. ISCA '05 Proceedings 32nd International Symposium, 

pp. 506-517, 4-8 June 2005. 

[57] Min Hong Yun, Woo Sik Kim, Jae Ho Lee, Do Hyoung Kim, Sun Ja 

Kim, “Embedded Linux Solution for Smartphone System”, Electronics 

and Telecommunications Research Institute (ETRI), vol 21, no 1, 

February 2006. 

[58] Robert W. Wisniewski , Peter F. Sweeney , Kartik Sudeep , 

Matthias Hauswirth, “Performance and environment monitoring for 

whole-system characterization and optimization”, 2004. 

[59] Mathieu Desnoyers and M. Dagenais, “LTTng: Tracing across 

execution layers, from the hypervisor to user-space”, in Proceedings of 

the 2008 Linux Symposium, (Ottawa, Canada), July 2008. 

[60] Tokuda, H., Kotera, M, “A Real-Time Tool Set for the ARTS Kernel”, 

Proceedings of 9th IEEE Real-Time Systems Symposium, Dec., 1988 . 

[61] Clifford W. Mercer and Ragunathan Rajkumar, “An Interactive 

Interface and RT-Mach Support for Monitoring and Controlling 

Resource Management”, In Proceedings of the Real-Time Technology 

and Applications Symposium, May 1995. 

[62] Edward A. Lee, “Cyber-Physical Systems – Are Computing 

Foundations Adequate?”, NSF Workshop On Cyber-Physical Systems: 

Research Motivation, Techniques and Roadmap, October 16-17, 2006. 

[63] Benjamin Poirier, R. Roy, and M. Dagenais, “Unified kernel and user 

space distributed tracing for message passing analysis”, in Proceedings of 


the First International Conference on Parallel, Distributed and Grid 

Computing for Engineering, (Pecs, Hungary), April 2009. 

[64] HP, “HP OpenView Operations for Windows Troubleshooting Guide”, 

February 2004. 

[65] IBM Corporation Software Group, “IBM Tivoli Risk Manager”, 2004. 

[66] Ethan Galstad , “Nagios® Version 2.x Documentation”, November 

2006. 

[67] Aiko Pras, João Paulo Almeida, Yohannes Albertino Ramlie, “An 

Overview - NTOP – Network TOP”, University of Twente in 

Netherlands, June 2000. 

[68] http://htop.sourceforge.net/ . 

[69] http://sourceforge.net/projects/strace/. 

[70] http://www.cs.utah.edu/dept/old/texinfo/as/gprof_toc.html. 

[71] http://www.jffnms.org/. 

[72] http://www.ethereal.com/.

Acknowledgements 

I would firstly like to thank all the members of the Distributed Computing & Ubiquitous Laboratory of Waseda University. The seminars and discussions with the members of the laboratory during my PhD studies were truly helpful. In particular, I would have had a hard time carrying out my research without Professor Nakajima’s help. I would like to give special thanks to him for answering my questions and for his help while I was writing this dissertation. I also want to thank Midori Sugaya for the valuable opinions offered on my papers and experiments.

I would like to thank Alexandre Courbot and Yuki Ohno, who gave me a great deal of help while I was writing the thesis. I also want to thank Mr. Yong-Gu Kang, who helped me personally while I was writing the PhD thesis.

I would like to thank all the great friends I made during my stay at Waseda University. I would also like to thank my parents, my wife and our lovely daughter for their moral support.

Finally, I would like to express my deep gratitude to the Hasekawa scholarship foundation, which sponsored my PhD study from April 2007 to March 2010.


Publication List

Category, title, publication venue, date of publication, co-authors (including the applicant)

Journal papers

1. Ki Duk Kwon, Midori Sugaya and Tatsuo Nakajima. “KTAS: Analysis of Timer Latency for Embedded Linux Kernel”, International Journal of Advanced Science and Technology (IJAST), vol.19, June, 2010.

International conferences

1. Kiduk Kwon, Midori Sugaya, Tatsuo Nakajima. “Analysis of Embedded Linux Using Kernel Analysis System”, The 6th IEEE International Conferences on Embedded Software and Systems (ICESS), pp.417-422, May 2009.

2. Kiduk Kwon, Midori Sugaya, Tatsuo Nakajima. “Analysis of High Resolution Timer Latency Using Kernel Analysis System in Embedded System”, 12th IEEE Symposium on Object/component/service-oriented Real-time distributed Computing Co-located with First International Workshop on Software Technologies for Future Dependable Distributed Systems (STFSSD 2009), pp.122-126, March 2009.

3. Tatsuo Nakajima, Hiroo Ishikawa, Yuki Kinebuchi, Midori Sugaya, Lei Sun, Alexandre Courbot, Andrej van der Zee, Aleksi Aalto, and Kwon Ki Duk. “An Operating System Architecture for Future Information Appliances”, The 6th IFIP WG 10.2 International Workshop, SEUS 2008, October 2008. Lecture Notes in Computer Science (LNCS), Vol. 5287 / 2008, pp. 292-303.

Domestic conferences

1. Kwon Ki Duk, Sugaya Midori, Ohno Yuuki and Nakajima Tatsuo. “Performance analysis of information explosion by using LTTng”, Information Processing Society of Japan (IPSJ), 5-299, March 2008.

2. Ohno Yuuki, Sugaya Midori and Kwon Ki-Duk. “Performance analysis of distributed applications in the information explosion era”, Information Processing Society of Japan (IPSJ), 5-147, March 2008.

3. Ohno Yuuki, Sugaya Midori, Kwon Ki-Duk and Nakajima Tatsuo. “リソースモニタリングによる異常検出システム” (An Anomaly Detection System Based on Resource Monitoring), The 6th Dependability System Workshop (DSW’08 Summer), pp.71-76, 2008.

Books

1. Ki-Duk Kwon, Je-Jung Yu, Bong-Kyu Seo, “Brew Mobile Programming”, YoungJin publish company, 08, 2003.
