Child Process

The child process updates N random pages (by updating a cache line within each page) and exits.

From: Advances in Computers, 2017

Embedded Operating Systems

Tammy Noergaard, in Embedded Systems Architecture (Second Edition), 2013

Example 3: Embedded Linux and Fork/Exec [3]

In embedded Linux, all process creation is based upon the fork/exec model:

int fork (void)

void exec (…)

In Linux, a new "child" process can be created with the fork system call (shown above), which creates an almost identical copy of the parent process. What differentiates the parent task from the child is the value returned by fork: the process ID of the child process is returned to the parent, whereas a value of "0" is returned to the child process.

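#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Application-defined routines (not shown) */
void run_childProcess(void);
void run_parentProcess(void);

void program(void)
{
    pid_t child_processId;

    /* create a duplicate: child process */
    child_processId = fork();

    if (child_processId == -1) {
        /* fork failed: no child was created */
        perror("fork");
    }
    else if (child_processId == 0) {
        /* fork returned 0: this is the child */
        run_childProcess();
    }
    else {
        /* fork returned the child's process ID: this is the parent */
        run_parentProcess();
    }
}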

The exec function call can then be used to switch to the child's program code.

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int program (char* program, char** arg_list)
{
    pid_t child_processId;

    /* Duplicate this process */
    child_processId = fork ();

    if (child_processId != 0)
        /* This is the parent process (or fork failed and returned -1) */
        return child_processId;
    else
    {
        /* Execute PROGRAM, searching for it in the path */
        execvp (program, arg_list);

        /* execvp returns only if an error occurs */
        fprintf (stderr, "Error in execvp\n");
        abort ();
    }
}

Tasks can terminate for a number of different reasons, such as normal completion, hardware problems such as lack of memory, and software problems such as invalid instructions. After a task has been terminated, it must be removed from the system so that it doesn't waste resources, or even keep the system in limbo. In deleting tasks, an OS deallocates any memory allocated for the task (TCBs, variables, executed code, etc.). In the case of a parent task being deleted, all related child tasks are also deleted or moved under another parent, and any shared system resources are released (see Figure 9-14a).

Figure 9-14a. VxWorks and Spawn task deleted. [4]

When a task is deleted in VxWorks, other tasks are not notified and any resources such as memory allocated to the task are not freed—it is the responsibility of the programmer to manage the deletion of tasks using the subroutines below.

In Linux, processes are deleted with the void exit(int status) system call, which deletes the process and removes any kernel references to the process (updates flags, removes the process from queues, releases data structures, updates parent-child relationships, etc.). Under Linux, child processes of a deleted process become children of the main init parent process (see Figure 9-14b).

Figure 9-14b. Embedded Linux and fork/exec task deleted. [3]

Because Jbed is based upon the Java model, a garbage collector (GC) is responsible for deleting a task and removing any unused code from memory once the task has stopped running. Jbed uses a non-blocking mark-and-sweep garbage collection algorithm, which marks all objects still being used by the system and deletes (sweeps) all unmarked objects in memory.

In addition to creating and deleting tasks, an OS typically provides the ability to suspend a task (meaning temporarily blocking a task from executing) and resume a task (meaning any blocking of the task's ability to execute is removed). These two additional functions are provided by the OS to support task states. A task's state is the activity (if any) that is going on with that task once it has been created, but has not been deleted. OSs usually define a task as being in one of three states:

READY: the process is ready to be executed at any time, but is waiting for permission to use the CPU.

RUNNING: the process has been given permission to use the CPU, and can execute.

BLOCKED or WAITING: the process is waiting for some external event to occur before it can be "ready" to "run."

OSs usually implement separate READY and BLOCKED/WAITING "queues" containing tasks (their TCBs) that are in the respective state (see Figure 9-15). Only one task at any one time can be in the RUNNING state, so no queue is needed for tasks in the RUNNING state.

Figure 9-15. Task states and queues. [4]

Based upon these three states (READY, BLOCKED, and RUNNING), most OSs have some process state transition model similar to the state diagram in Figure 9-16. In this diagram, the "New" state indicates a task that has been created, and the "Exit" state indicates a task that has terminated (suspended or stopped running). The other three states are defined above (READY, RUNNING, and BLOCKED). The state transitions (according to Figure 9-16) are New → READY (where a task has entered the ready queue and can be scheduled for running), READY → RUNNING (based on the kernel's scheduling algorithm, the task has been selected to run), RUNNING → READY (the task has finished its turn with the CPU and is returned to the ready queue for the next time around), RUNNING → BLOCKED (some event has occurred to move the task into the blocked queue, not to run until the event has occurred or been resolved), and BLOCKED → READY (whatever the blocked task was waiting for has occurred and the task is moved back to the ready queue).

Figure 9-16. Task state diagram. [2]
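The transitions listed for Figure 9-16 can be written down as a small lookup function. This is only an illustrative sketch: the enum and function names are hypothetical, and the RUNNING → Exit transition is an assumption, since the text names the Exit state but does not list the transition into it.

```c
#include <stdbool.h>

/* Task states named in the text; identifiers are illustrative only. */
typedef enum {
    TASK_NEW,
    TASK_READY,
    TASK_RUNNING,
    TASK_BLOCKED,
    TASK_EXIT
} task_state;

/* Returns true if the state model allows moving from 'from' to 'to'. */
bool transition_allowed(task_state from, task_state to)
{
    switch (from) {
    case TASK_NEW:
        return to == TASK_READY;            /* task enters the ready queue */
    case TASK_READY:
        return to == TASK_RUNNING;          /* selected by the scheduler */
    case TASK_RUNNING:
        /* RUNNING -> EXIT is assumed here, not listed in the text. */
        return to == TASK_READY || to == TASK_BLOCKED || to == TASK_EXIT;
    case TASK_BLOCKED:
        return to == TASK_READY;            /* awaited event has occurred */
    default:
        return false;                       /* nothing leaves EXIT */
    }
}
```

Under this model a blocked task can never move straight to RUNNING; it must first return to the ready queue.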

When a task is moved from one of the queues (READY or BLOCKED/WAITING) into the RUNNING state, it is called a context switch. Examples 4, 5, and 6 give real-world examples of OSs and their state management schemes.

URL:

https://www.sciencedirect.com/science/article/pii/B9780123821966000091

Resource Sharing

Xiaocong Fan, in Real-Time Embedded Systems, 2015

18.3.5.1 Use a named semaphore to protect a memory-mapped file

Figure 18.13 illustrates a design where the process named semstarter creates a file object which is protected by a binary semaphore named mysem.

Figure 18.13. Use a semaphore to protect a memory-mapped file object.

The code of semstarter is given in Listing 18.6. It spawns three child processes from the code given in Listing 18.7. Let us refer to the child processes by the names semclient1, semclient2, and semclient3, respectively. According to the code, each of the three child processes maps the file log.txt into its own process space, opens the named semaphore created by semstarter, and then uses the semaphore to gain exclusive access to the mapped file object. In particular, for each round of the loop, the manipulation of the file object is enclosed by a procure operation and a vacate operation. This helps to preserve the integrity of the file object.

Listing 18.6. Sample code of using a named semaphore: starter.

Listing 18.7. Sample code of using a named semaphore: client.

Figure 18.14 shows a screenshot of a run of the program. As compared with Figure 18.4, the big difference is that here another task cannot start to access the file object until after the current task has completed its use, whereas in Figure 18.4, more than one task can access the file object at the same time.

Figure 18.14. A run of the program given in Listings 18.6 and 18.7.
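As a rough sketch of the client-side pattern described above, the function below maps a file, opens a named semaphore, and brackets its update of the mapped region with a procure (sem_wait) and a vacate (sem_post). It is not the book's listing: the function name, the fixed 4096-byte region, and the append-as-text layout are assumptions made for illustration.

```c
#include <fcntl.h>
#include <semaphore.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define REGION_SIZE 4096   /* assumed size of the mapped log region */

/* Appends 'msg' to the mapped file while holding the named semaphore.
   Returns 0 on success, -1 on failure. */
int log_with_semaphore(const char *sem_name, const char *path, const char *msg)
{
    /* Open the binary semaphore, creating it with value 1 if needed. */
    sem_t *sem = sem_open(sem_name, O_CREAT, 0600, 1);
    if (sem == SEM_FAILED)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd == -1) {
        sem_close(sem);
        return -1;
    }
    if (ftruncate(fd, REGION_SIZE) == -1) {
        close(fd);
        sem_close(sem);
        return -1;
    }

    /* Map the file into this process's address space. */
    char *region = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);
    close(fd);   /* the mapping stays valid after the descriptor is closed */
    if (region == MAP_FAILED) {
        sem_close(sem);
        return -1;
    }

    sem_wait(sem);                                   /* procure */
    size_t used = strnlen(region, REGION_SIZE - 1);  /* text already stored */
    size_t room = REGION_SIZE - 1 - used;
    size_t len = strlen(msg);
    if (len > room)
        len = room;                                  /* truncate if full */
    memcpy(region + used, msg, len);
    sem_post(sem);                                   /* vacate */

    munmap(region, REGION_SIZE);
    sem_close(sem);
    return 0;
}
```

Because every client goes through the same semaphore name, only one of them can be between sem_wait() and sem_post() at a time. On some systems the program must be linked with the threads library for sem_open() to resolve.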

URL:

https://www.sciencedirect.com/science/article/pii/B9780128015070000183

Introducing Linux

Doug Abbott, in Linux for Embedded and Real-Time Applications (Fourth Edition), 2018

The execve() Function

Of course, what really happens 99% of the time is that the child process invokes a new program by calling execve() to load an executable image file from disk. Listing 3.2 shows in skeletal form a simple command line interpreter. It reads a line of text from stdin, parses it, and calls fork() to create a new process. The child then calls execve() to load a file and execute the command just entered. execve() overwrites the calling process's code, data, and stack segments.

Listing 3.2. Command line interpreter.

If this is a normal "foreground" command, the command interpreter must wait until the command completes. This is accomplished with waitpid() which blocks the calling process until the process matching the pid argument has completed. Note, by the way, that most multitasking operating systems do not have the ability to block one process or task pending the completion of another.

If execve() succeeds, it does not return. Instead, control is transferred to the newly-loaded program.
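The foreground case described above can be sketched as follows. This is not Listing 3.2: the function name run_foreground is hypothetical, command-line parsing is left out, and the caller is assumed to supply a full path and an argv array.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Runs the executable at 'path' as a foreground command. Returns its exit
   status, or -1 if it could not be started or ended abnormally. */
int run_foreground(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the new program. */
        execve(path, argv, environ);
        /* Reached only if execve() failed. */
        perror("execve");
        _exit(127);
    }

    /* Parent: block until this particular child has completed. */
    int status;
    if (waitpid(pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A background command would differ only in the parent branch, which would skip the waitpid() call and return to the prompt immediately.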

URL:

https://www.sciencedirect.com/science/article/pii/B9780128112779000031

Application Software for Industrial Control

Peng Zhang, in Industrial Control Technology, 2008

5.3.3.5 Process Creation, Evolution, and Termination

A new process is created when an existing process makes an exact copy of itself. This child process has the same environment as its parent; only the process ID number is different. This procedure is called "forking." After forking, the address space of the child process is overwritten with the new process data.

In an exceptional case, a process might finish while the parent does not wait for the completion of this process. Such an unburied process is called a "zombie" process.

When a process ends normally (it is not killed or otherwise unexpectedly interrupted), the program returns its exit status to the parent. This exit status is a number returned by the program providing the results of the program's execution.
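A minimal sketch of how a parent collects the exit status, assuming a POSIX system; the function name reap_child is hypothetical. The child ends normally with a chosen status, and the parent's waitpid() call both retrieves that number and removes the finished child, so it does not linger as a zombie.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that exits with 'code' and returns the status the parent
   observes, or -1 on error. */
int reap_child(int code)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);            /* child ends normally with this status */

    /* Until the parent waits, the finished child remains a zombie. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    /* WIFEXITED is true only for a normal end, not for a killed child. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```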

URL:

https://www.sciencedirect.com/science/article/pii/B9780815515715500062

Simple Operations in Memory to Reduce Data Movement

Vivek Seshadri, Onur Mutlu, in Advances in Computers, 2017

8.2.1 The fork System Call

fork is one of the most expensive yet frequently used system calls in modern systems [99]. Since fork triggers a large number of CoW operations (as a result of updates to shared pages from the parent or child process), RowClone can significantly improve the performance of fork.

The performance of fork depends on two parameters: (1) the size of the address space used by the parent—which determines how much data may potentially have to be copied, and (2) the number of pages updated after the fork operation by either the parent or the child—which determines how much data are actually copied. To exercise these two parameters, we create a microbenchmark, forkbench, which first creates an array of size S and initializes the array with random values. It then forks itself. The child process updates N random pages (by updating a cache line within each page) and exits; the parent process waits for the child process to complete before exiting itself.
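The structure of such a microbenchmark might look like the sketch below. It is not the authors' forkbench: the function signature is an assumption, timing and measurement are omitted, the same page may be picked more than once, and the array is not page-aligned, so each write simply lands somewhere inside a distinct page-sized slice.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Allocates S bytes, forks, lets the child dirty N random pages, and
   returns the child's exit status (0 on success) or -1 on error. */
int forkbench(size_t S, size_t N)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || S < (size_t)page)
        return -1;
    size_t pages = S / (size_t)page;

    unsigned char *array = malloc(S);
    if (array == NULL)
        return -1;
    for (size_t i = 0; i < S; i++)          /* initialize with random values */
        array[i] = (unsigned char)rand();

    pid_t pid = fork();
    if (pid == -1) {
        free(array);
        return -1;
    }
    if (pid == 0) {
        /* Child: one single-byte write per chosen page touches one cache
           line and forces a copy-on-write of that page. */
        for (size_t i = 0; i < N; i++) {
            size_t p = (size_t)rand() % pages;
            array[p * (size_t)page] ^= 1;
        }
        _exit(0);
    }

    /* Parent: wait for the child to complete before finishing. */
    int status;
    pid_t done = waitpid(pid, &status, 0);
    free(array);
    if (done != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For example, forkbench(64u << 20, 2048) would correspond to S = 64 MB and N = 2k.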

As such, we expect the number of copy operations to depend on N—the number of pages copied. Therefore, one may expect RowClone's performance benefits to be proportional to N. However, an application's performance typically depends on the overall memory access rate [100, 101], and RowClone can only improve performance by reducing the memory access rate due to copy operations. As a result, we expect the performance improvement due to RowClone to primarily depend on the fraction of memory traffic (total bytes transferred over the memory channel) generated by copy operations. We refer to this fraction as FMTC—fraction of memory traffic due to copies.
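Written as a formula, the definition above reads as follows, where the symbols B_copy (bytes transferred over the memory channel by copy operations) and B_total (all bytes transferred over the memory channel) are names introduced here for illustration and do not appear in the source:

```latex
\mathrm{FMTC} = \frac{B_{\mathrm{copy}}}{B_{\mathrm{total}}},
\qquad 0 \le \mathrm{FMTC} \le 1
```

Under this reading, an FMTC of 0.14 means that 14% of all memory traffic is caused by copies, which bounds how much traffic a faster copy mechanism can remove.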

Fig. 17 plots FMTC of forkbench for different values of S (64 MB and 128 MB) and N (2–16k) in the baseline system. As the figure shows, for both values of S, FMTC increases with increasing N. This is expected as a higher N (more pages updated by the child) leads to more CoW operations. However, because of the presence of other read/write operations (e.g., during the initialization phase of the parent), for a given value of N, FMTC is larger for S = 64 MB compared to S = 128 MB. Depending on the value of S and N, anywhere between 14% and 66% of the memory traffic arises from copy operations. This shows that accelerating copy operations using RowClone has the potential to significantly improve the performance of the fork operation.

Fig. 17. FMTC of forkbench for varying S and N.

Fig. 18 plots the performance (IPC) of FPM and PSM for forkbench, normalized to that of the baseline system. We draw two conclusions from the figure. First, FPM improves the performance of forkbench for both values of S and most values of N. The peak performance improvement is 2.2× for N = 16k (30% on average across all data points). As expected, the improvement of FPM increases as the number of pages updated increases. The trend in performance improvement of FPM is similar to that of FMTC (Fig. 17), confirming our hypothesis that FPM's performance improvement primarily depends on FMTC. Second, PSM does not provide considerable performance improvement over the baseline. This is because the large on-chip cache in the baseline system buffers the writebacks generated by the copy operations. These writebacks are flushed to memory at a later point without further delaying the copy operation. As a result, PSM, which just overlaps the read and write operations involved in the copy, does not improve latency significantly in the presence of a large on-chip cache. On the other hand, FPM, by copying all cache lines from the source row to the destination in parallel, significantly reduces the latency compared to the baseline (which still needs to read the source blocks from main memory), resulting in a large performance improvement.

Fig. 18. Performance improvement due to RowClone for forkbench with different values of S and N.

Fig. 19 shows the reduction in DRAM energy consumption (considering both the DRAM and the memory channel) of the FPM and PSM modes of RowClone compared to that of the baseline for forkbench with S = 64 MB. Similar to performance, the overall DRAM energy consumption also depends on the total memory access rate. As a result, RowClone's potential to reduce DRAM energy depends on the fraction of memory traffic generated by copy operations. In fact, our results also show that the DRAM energy reduction due to FPM and PSM correlates well with FMTC (Fig. 17). By efficiently performing the copy operations, FPM reduces DRAM energy consumption by up to 80% (average 50%, across all data points). Similar to FPM, the energy reduction of PSM also increases with increasing N, with a maximum reduction of 9% for N = 16k.

Fig. 19. Comparison of DRAM energy consumption of different mechanisms for forkbench (S = 64 MB).

In a system that is agnostic to RowClone, we expect the performance improvement and energy reduction of RowClone to be in between that of FPM and PSM. By making the system software aware of RowClone (Section 7.3), i.e., designing the system software to be aware of the topology (subarray and bank organization) of DRAM, as also advocated by various recent works [102–104], we can approximate the maximum performance and energy benefits by increasing the likelihood of the use of FPM.

URL:

https://www.sciencedirect.com/science/article/pii/S0065245817300165

POSIX and RTOS

Xiaocong Fan, in Real-Time Embedded Systems, 2015

13.1.1.1 Process creation

For a large (embedded) system consisting of multiple programs, it is common practice to have one master program, also known as a starter program, to start all the other programs. The starter itself is typically the last command of a shell script file (a batch of commands), which is automatically executed when the system is up and running.

A process (actually a thread within the process) can invoke system calls to create new processes. The newly created process is called a child process of the calling process, and the calling process is referred to as the parent of the new process. A child process and its parent process run independently, but a child inherits many attributes from its parent.

POSIX provides the following system calls for starting up new processes:

fork(). This call creates a new process which duplicates the calling process: (a) both processes execute the same application code; (b) their executions are synchronized at this call—both are about to return from the fork() call. The return value of the fork() call can be used to distinguish the calling process from the newly created child process: the return value is 0 in the newly created child process, while the return value is the child process's ID in the parent process. When the new process has just been created, it contains a single thread. If fork() is called from within a multithreaded process, the new process contains a replica of the calling thread only.

exec() family of functions. The calling process image is overlaid with the new process image, which is constructed from an executable file. There shall be no return from a successful exec. A call to any exec function from a process with more than one thread shall result in all threads being terminated.

posix_spawn() family of functions. The fork() implementation normally relies on MMU services such as memory swapping or dynamic address translation, which is generally too slow for real-time applications. The posix_spawn() function (and its variants) is a simple, fast implementation without address translation or other MMU services. Upon successful completion, posix_spawn() returns the child process's ID to the parent process through its pid argument, and the function itself returns zero.

system(). This call takes a command, starting up a shell (command language interpreter) to execute the command. This call does not return until the child process has terminated. It can be implemented by a posix_spawn() call, or alternatively by a fork() followed by an exec().
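As a minimal sketch of the posix_spawn() route, the function below starts a program and waits for it, which is roughly the job system() performs for a shell command. The name spawn_and_wait is hypothetical; no file actions or spawn attributes are set, and the caller's environment is passed through unchanged.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Starts the executable at 'path' with posix_spawn() and waits for it.
   Returns the child's exit status, or -1 on failure. */
int spawn_and_wait(const char *path, char *const argv[])
{
    pid_t pid;

    /* posix_spawn() returns 0 on success and stores the child's ID in pid;
       NULL file actions and attributes request the default behavior. */
    if (posix_spawn(&pid, path, NULL, NULL, argv, environ) != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Compared with a fork() followed by an exec(), the parent never runs as a duplicated process here; creation and program loading are requested in a single call.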

URL:

https://www.sciencedirect.com/science/article/pii/B9780128015070000134

Operating Systems

Yao-Nan Lien, in The Electrical Engineering Handbook, 2005

4.2.4 Shell: Command-Based User Interface

A user can interact with the operating system through a special process, the command interpreter, using a command language. A command language can be very simple, as in the early days, or as powerful as a regular programming language. Most UNIX systems offer one or more command interpreters called shell or a variation of that name. Unfortunately, the term shell may cause some confusion because it is also used by researchers as a generic term to denote the layer of software for users to interact with an operating system. When any user logs in, a shell is started up. The shell has the terminal as standard input and standard output. It starts out by displaying a prompt symbol, such as a dollar sign, indicating that it is waiting to accept a command. For example, if the user types:

who,

the shell creates a child process and runs the who program as the child. The shell waits until the child process terminates. When the child finishes, the shell displays the prompt again and waits for the next command.

UNIX's shell offers a regular programming capability to its users such that its users can request very rich services from the operating system. In addition to the common flow control constructs, shell offers several unique features: I/O redirection, pipelining, and background job execution.

A user can specify standard output to be redirected to a file by typing, for example:

who > outfile.

Similarly, standard input can be redirected from another file, as in:

sort < infile > outfile,

which invokes the sort program with input taken from file infile and output sent to outfile.
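A shell might implement this kind of redirection roughly as sketched below, assuming a POSIX system; the function name run_redirected is hypothetical. The child opens the two files, uses dup2() to make them its standard input and standard output, and only then executes the program, which therefore needs no knowledge of the redirection.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] with stdin taken from 'infile' and stdout sent to 'outfile'.
   Returns the program's exit status, or -1 on failure. */
int run_redirected(char *const argv[], const char *infile, const char *outfile)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        int in = open(infile, O_RDONLY);
        int out = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (in == -1 || out == -1)
            _exit(126);

        /* Make the files the child's standard input and output. */
        if (dup2(in, STDIN_FILENO) == -1 || dup2(out, STDOUT_FILENO) == -1)
            _exit(126);
        if (in > STDERR_FILENO)
            close(in);
        if (out > STDERR_FILENO)
            close(out);

        execvp(argv[0], argv);   /* searches PATH for the program */
        _exit(127);              /* reached only if execvp() failed */
    }

    /* The shell (parent) waits for the command to finish. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```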

The output of one program can be used as the input for another program by connecting them with a pipe. Thus, the command:

who | wc > /dev/tty10

invokes the who program to list all users currently logged in the system and sends the output to wc to count the number of entries. The output of wc is redirected to a file, /dev/tty10 which by convention is a special file denoting a terminal.
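The pipe itself can be sketched with pipe(), fork(), and dup2(), assuming a POSIX system; the function name run_pipeline is hypothetical, and the final redirection to /dev/tty10 is left out, so the right-hand program writes to the caller's standard output.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs 'left | right' and returns the exit status of the right-hand
   program, or -1 on failure. */
int run_pipeline(char *const left[], char *const right[])
{
    int fds[2];                 /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) == -1)
        return -1;

    pid_t lpid = fork();
    if (lpid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (lpid == 0) {
        /* Left child: standard output goes into the pipe. */
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(left[0], left);
        _exit(127);
    }

    pid_t rpid = fork();
    if (rpid == -1) {
        close(fds[0]);
        close(fds[1]);
        waitpid(lpid, NULL, 0);
        return -1;
    }
    if (rpid == 0) {
        /* Right child: standard input comes from the pipe. */
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(right[0], right);
        _exit(127);
    }

    /* Parent: close both ends so the reader sees end-of-file. */
    close(fds[0]);
    close(fds[1]);

    int status;
    waitpid(lpid, NULL, 0);
    if (waitpid(rpid, &status, 0) != rpid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Closing the unused pipe ends in every process matters: if the parent kept the write end open, the right-hand program would never see end-of-file and would wait indefinitely.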

If a user puts an ampersand after a command, the shell does not wait for it to complete. Instead, it gives a prompt immediately. For instance, the command:

cc file.c &

starts up a compiling job in the background, allowing the user to continue working as the compiler is running.

URL:

https://www.sciencedirect.com/science/article/pii/B978012170960050030X

Sendmail and IMAP Security

In E-Mail Virus Protection Handbook, 2000

Alternatives: Postfix and Qmail

Although Sendmail performs a large share of the mail routing that occurs on the Internet, competitors to Sendmail have appeared in recent years. The development of these alternative mail routers was driven by Sendmail's perceived security weaknesses and by the potential to improve the efficiency of mail routing software. Currently there are two programs that are serious contenders for a Sendmail replacement. Both were written by academic computer scientists and both are distributed as open source software.

Postfix

Postfix was developed by Wietse Venema at IBM's T. J. Watson Research Center. The development was supported by IBM and Postfix is freely distributed. Venema was also involved in writing the SATAN system security challenge software and has written about and authored a number of Internet software tools including TCP Wrapper (tcpd).

Postfix was designed for speed and security. Unlike Sendmail, it does not run as root. It is compatible with Sendmail in its support of .forward files for user control of mail forwarding to an alternate address. It uses Sendmail format alias files, and user inboxes in /var/mail or /var/spool/mail. To inject a mail message into Sendmail's outgoing mail queue, a program either has to run with root privilege or invoke Sendmail to accept the message into the queue. Postfix uses a world-writable mail drop directory, so that unprivileged programs can initiate sending mail. This causes potential problems if you have local users on the system. Local users may attempt to insert bogus files in the queue or create conditions that would delay or prevent delivery of certain messages. Postfix is fastidious enough to process queue files only with its specific format. On the other hand, this feature can be made more secure by revoking world-writable permissions and setting group permissions for the program that injects mail into the mail drop.

Sendmail can utilize a great number of resources when faced with a situation where mail cannot be delivered because of a network outage. Each time Sendmail attempts another delivery, it starts up more child processes, even if there are already a large number of Sendmail processes attempting to deliver messages. You can set a limit to the number of child processes that Sendmail can start. However, if that limit is too low, it can impede your normal delivery of messages. (This limit is set by the MaxDaemonChildren attribute in the sendmail.cf file. You can also control how quickly processes are started by setting the ConnectionRateThrottle parameter.) If the limit is high enough, your system will probably be overloaded by the time the limit takes effect. Postfix is designed to back off if it encounters shortages in memory or disk space.

In Postfix's brief existence, it has yet to stand a thorough trial. It is a bit bothersome that Postfix does not do a large amount of checking for data or command argument size, making a buffer overflow condition a possibility. Because Postfix's components do not run as root, the threat is theoretically lower.

Although Postfix is being used by several large sites, it is still untested and bears watching. You can find Postfix by visiting www.postfix.org and selecting the Web server nearest you. Along with an overview of the program and its design, there is access to multiple download sites.

Qmail

A competitor of Sendmail and Postfix is Qmail, written by Dan Bernstein, a professor in the Department of Mathematics, Statistics, and Computer Science at the University of Illinois at Chicago. Qmail's main claim to fame is its use for delivering messages for Hotmail, the free e-mail service that was acquired by Microsoft. Microsoft reportedly attempted to run Hotmail on an Exchange server, but was unsuccessful (see http://cr.yp.to/qmail/faq/orientation.html#users).

Qmail was designed with security as one of its main features. So confident (or worried) was its author that he offered a $500 prize for anyone finding a security flaw in the Qmail program. A $1000 prize was later offered, but neither prize has been claimed at the time of this book.

Bernstein assures security by not treating programs and files as addresses. Sendmail supports a form of alias delivery by appending to a file. Unless you control what files or what characteristics of files you can deliver to, this function can obviously be a security hole. In fact, Sendmail has a SafeFileEnvironment option to control this very behavior. Likewise, Sendmail can deliver to programs such as procmail or listserv. Qmail departs from this strategy and instead relies on its local delivery agent to deliver to programs as directed in the user's .qmail configuration file. Root is never a user in this context and thus you avoid the potential of delivery to an arbitrary or potentially threatening program.

Qmail avoids setting alternate user permissions and does as little as possible as root. As we've seen in our discussion of Sendmail security, running as root risks inherent vulnerabilities via a root compromise. The only program that runs with alternate user privileges is Qmail's mailqueue injector program.

Qmail uses multiple and non-trusting programs to perform the various functions. This is a bit like the classic revolutionary cabal that limits the knowledge of any one person in the organization. If one of Qmail's programs is compromised, only that one function is compromised and the other functions will continue to operate unthreatened.

Qmail uses a simple design and is smaller than Sendmail. Bernstein has adapted some of his previously developed programming libraries as a way of minimizing errors in programming (he is, after all, a computer scientist). By using programming code that he mostly designed and implemented, he can be more confident of avoiding buffer overflows and similar conditions that can introduce security problems.

Qmail has limited compatibility with Sendmail similar to that discussed for Postfix. It supports .forward files and Sendmail format aliases. It supports delivery to a central queue such as /var/mail. Sendmail can be used to inject messages into the outgoing mail queue, but Qmail has its own mail injector program as well.

Qmail documentation is not as well organized as that for Postfix, and none of the information on the Web pages, especially Bernstein's, has very much depth. There are a large number of links at the Qmail Web page, www.qmail.org, as well as a link to the distribution package. Qmail does have an impressive list of users, including, as mentioned, Hotmail, ONElist, and the University of Buffalo's listserv service. However, its installed base is not nearly as extensive as Sendmail's.

URL:

https://www.sciencedirect.com/science/article/pii/B9781928994237500148

Varied Expression Analysis of Children With ASD Using Multimodal Deep Learning Technique

S.P. Abirami ME, ... R. Karthick BOT, in Deep Learning and Parallel Computing Environment for Bioengineering Systems, 2019

14.2 State-of-the-Art

The state-of-the-art review in this chapter covers four major topics, namely (i) the motivation and driving force for the research, (ii) characteristics of autism, (iii) current screening methods, and (iv) existing computer interventions that could be incorporated in the screening mechanism. The survey does not compare existing deep learning techniques for identifying autism-based insights. This is because existing studies use deep learning to identify the expressions of human beings in general, or address the ability of autistic children to identify expressions, but do not cover the expressions shown by an autistic child [16,20]. This appears to be a major research gap, and the chapter focuses on bridging it by identifying the facial expression of an autistic child through computer intervention.

Even though ASD is characterized by major disturbances in social communication, there also exist other psychiatric conditions that are associated with impairments in social skills and communication, and with highly restricted and even repetitive behaviors [30,28]. These impairments lead to intellectual disability, specific language impairment, attention-deficit/hyperactivity disorder, anxiety disorders and disruptive behavior disorders [1]. Such compensatory abilities are caused through the neurodevelopment of the child varied in the age scale. The sensory nerves react to the state of children and reflect emotion expressed by them. Facial expressions are one of the primary signals used to detect impaired communication abilities in a social environment among children with high-functioning autism spectrum disorder [2].

The behavioral characteristics and the diagnostic probability show differences among various gender children with autism. But the sensory symptoms and the basic characteristics of an autistic child remain the same at the early identification level [3]. Hence the chapter deals with expression identification without considering the gender details and age. Also, it is important to analyze the expression and emotional behaviors in children post the clinical analysis period, as the subclinical levels of autism identification in the absence of compensatory ability identification over time may result in autism in the future [3,13,14,19].

Such early identification could be intervened through basic facial emotions that the child processes. On a very granular observation, these emotions could be identified by a mother when breast-feeding the baby. The children who are prone to autism do not show proper interest in viewing a human and that results in the non-engaged facial expression [21,25,29]. Such facial expression gaps could be identified and must be analyzed in an efficient manner so as to boost up the clinical observations and reviews.

In an experiment [4], it was observed that the target object to the stimuli might influence the emotional behavior of the children, either ASD or TD. Such emotions are to be carefully notified to improve the efficiency of the screening results and impacts [5,15,17]. Thus the expression faced by the children should be analyzed with and without the object intervention. The chapter initially focuses on exploring such expressions of the children in a contactless environment, later pertaining to the assumption of facing a camera and human under courtesy. Such in-depth analysis through human–computer interaction could be established on application of deep learning techniques to result in more sophisticated results that support the screening technique.

The early screening method initiates the process by face detection and through feature extraction. Viola et al. proposed the Viola–Jones algorithm for face detection that belongs to the class of Haar classifiers of facial detection and feature identification [10]. The algorithm undergoes Haar cascade classification, whereas Jing-Wein Wang et al. suggested an algorithm for facial feature specification that categorized the face into T shaped structure by extracting eyes, nose and position as three feature dimensions [11]. The current research chapter employs the Viola–Jones algorithm for facial detection that proves to be a better ethnic technique.

Lydia R. Whitaker et al. classified the facial expression of the children when the target object shows anger and happiness. The difference in the variance of emotion boundary suggests that the target object might be an influencing factor for an ASD positive child [12]. Exploration of such emotions identified through facial detections using machine learning algorithms will support the identification of autism much earlier when the clinical analysis process is initiated. To make some further improvements in the classification accuracy and to better rely on the screening mechanisms for early identification, the feature identification and analysis is made in a deeper sense using deep learning algorithms involved in the process.

A facial feature tracker can gather a combination of displacement factors from feature motion obtained from images or live video, which are subsequently used to train an SVM classifier [18]. Such classifications sort expressions that are unseen by humans. These SVM-based expression classifications are employed together in an ad hoc, incrementally trained architecture for person-independent expression identification and analysis [10,18].

While the real impact of deep learning became apparent in recent decades [6,7], it has been applied in a wide range of application domains, including natural language processing, automatic speech recognition, image recognition, and bioinformatics, with a major focus on the medical diagnosis field [8,9,23,25]. Among the commonly available deep learning models, the stacked auto-encoder (SAE), deep belief network (DBN), convolutional neural network (CNN), and recurrent neural network (RNN) are widely used techniques that can converge at a faster rate.

It has been observed that more advanced features can be applied in a practical face detection solution as long as false positive detections can be rejected quickly in the early stages, using features classified by a simple linear SVM. In this regard, CNNs can automatically learn features that capture complex visual variations by leveraging a large amount of training data. The chapter focuses on the major directions for implementing a CNN architecture that would progressively yield a better solution with maximum accuracy.


URL:

https://www.sciencedirect.com/science/article/pii/B978012816718200021X

Fault and timing analysis in critical multi-core systems: A survey with an avionics perspective

Andreas Löfwenmark , Simin Nadjm-Tehrani , in Journal of Systems Architecture, 2018

4 Fault injection

To verify fault tolerance mechanisms such as those reviewed in Section 3, one can use automated fault-injection tools to speed up the otherwise often time-consuming verification and validation. Most fault-injection tools focus on a particular type of fault (hardware or software fault) and on a particular component such as cache or memory, which occupy large parts of the chip area and are thereby extra vulnerable to faults. However, estimating the impact of faults on timing also needs an assumption of the frequency of faults, which is missing from the models adopted by most fault injection techniques.

Since a recent survey of fault injection methods for single-core platforms already exists (Natella et al. [10]), we complement it by selecting representative papers that address fault injection on multi-core platforms. We are interested not only in verifying the fault tolerance mechanism, but also in timeliness (Question 2).

4.1 Fault injection emulating hardware faults

Most modern processors include capabilities for detecting and reporting errors in most processor units. Cinque and Pecchia [50] use this mechanism in Core i7 from Intel to emulate machine check errors (e.g., cache, memory controller and interconnection errors) by writing to registers associated with the error-reporting capability. They target virtualized multi-core systems and support fault injection at both hypervisor and guest-OS-level. This is an easy and lightweight fault injection method, but requires support for writing to these error-reporting registers. In the works below we consider both specific and generic hardware models as a basis of injecting faults.

Wulf et al. [51] present a software-based fault injection tool, SPFI-TILE, which emulates single-bit hardware faults in registers or memory on the many-core TILE64 using a debugger (the GNU Debugger, GDB). They also present a data cache fault-injection extension called Smooth Cache Injection Per Skipping (SCIPS), which distributes fault injection probabilities evenly over all cache locations. As the cache is not directly accessible from software, they emulate faulty cache data by using the debugger to halt the selected tile (core) at a fault injection point, step to the next load instruction, inject a fault at the memory address being loaded, and then continue the execution to let the load instruction finish with the faulty data. SCIPS balances the fault injection probabilities by randomly skipping several load instructions instead of always injecting into the first load instruction after the location where the debugger halts the processor.

By using the fork() and ptrace() system calls and operating system signals (e.g., SIGSTOP and SIGCONT) Vargas et al. [52] develop a fault injection tool that is hardware-independent (but not operating system-independent) and can inject faults into general purpose registers, some selected special purpose registers and in memory regions. The parent process (after the fork() call) is used as the fault injector and the child process executes the application under analysis. This is actually similar to what GDB does under the hood (used by Wulf et al. [51]). Multi-threaded (pthreads or OpenMP) applications are supported as the fault injector queries the operating system for the number of threads in the child process and their IDs.
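The arrangement described above, with the parent as fault injector and the child as the application under analysis, can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool of Vargas et al. [52]: it assumes Linux, flips a single bit in one memory word rather than in registers, and handles a single-threaded child only. The names target_word, run_application and run_with_bit_flip are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical application state that the fault is injected into. */
static volatile long target_word = 42;

/* Child side: the "application under analysis". It asks to be traced,
   stops itself so the parent can inject, then reports the value it sees. */
static void run_application(void)
{
    ptrace(PTRACE_TRACEME, 0, NULL, NULL);
    raise(SIGSTOP);
    _exit((int)(target_word & 0xff));
}

/* Abandon an experiment: kill and reap the child, report failure. */
static int abort_child(pid_t child)
{
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return -1;
}

/* Parent side: the fault injector. Flips one bit of target_word in the
   child's address space and returns the child's exit status, or -1 on error. */
int run_with_bit_flip(unsigned bit)
{
    assert(bit < 8); /* keep the effect visible in the 8-bit exit status */

    pid_t child = fork();
    if (child == -1)
        return -1;
    if (child == 0)
        run_application(); /* does not return */

    int status;
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return abort_child(child);

    /* After fork() the child has the same address layout as the parent,
       so &target_word is valid in the child as well. */
    void *addr = (void *)&target_word;
    errno = 0;
    long word = ptrace(PTRACE_PEEKDATA, child, addr, NULL);
    if (errno != 0)
        return abort_child(child);

    word ^= 1L << bit; /* the injected single-bit fault */
    if (ptrace(PTRACE_POKEDATA, child, addr, (void *)word) == -1)
        return abort_child(child);

    /* Resume the child without delivering the pending SIGSTOP. */
    if (ptrace(PTRACE_CONT, child, NULL, NULL) == -1)
        return abort_child(child);
    if (waitpid(child, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_with_bit_flip(0) runs one injection experiment; the returned status shows whether the child observed the corrupted value. Register injection would use the register-access ptrace requests instead, which are architecture-specific and therefore omitted from this sketch.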

Software-based fault injection is easy to use and portable, but cannot be used to inject faults into parts that are not accessible from software. A full-system simulator can expose the internal state of the processor, which simplifies fault injection, and no modifications to the device under test are required.

In most simulators one can save a checkpoint containing the current state of the simulated system. This feature can be used to inject faults if modification of the saved checkpoint is possible. One can save one or more checkpoints during a fault-free simulation, which is regarded as the golden model and can be compared to checkpoints saved after a fault has been injected. Checkpoints can also be used to reduce the amount of re-executed code and speed up the simulation. This checkpointing and golden model method is used by several fault injection frameworks, of which a few ([53–56]) are described here. Carlisle et al. [53] use the Simics full-system simulator to develop DrSEUs (Dynamic robust Single-Event Upset simulator), which is used to simulate single event upsets (SEUs) and single event functional interrupts (SEFIs). They use the checkpointing capability of Simics to inject bit-flips into any of the components in the processor (e.g., general-purpose registers, program counter, Ethernet controller registers and translation-lookaside buffer entries). Caches and main memory are not targeted by DrSEUs. OVPSim-FIM is presented by Rosa et al. [54]. The golden checkpoints are created by executing the application on the original OVPSim. Faults (single bit-flips) are injected at a random time, in a random component (registers or memory), and then the simulation continues. Miele [55] adds SystemC/TLM, for modeling the architecture, to the methodology, which makes it possible to monitor system behavior at both the application and the architectural level. The emulator QEMU is used by Höller et al. [56] to inject faults in order to analyse software countermeasures against attacks.
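The checkpoint and golden model method can be illustrated with a toy example. The sketch below is not based on Simics, OVPSim or QEMU; the sim_state structure, the two-instruction workload and all function names are hypothetical stand-ins for a real simulator's checkpoint and execution engine. It saves a checkpoint during a fault-free run, flips one bit in a copy of that checkpoint, resumes from the copy, and compares the final state against the golden run.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_REGS 4

/* Hypothetical simulated machine state; a real checkpoint holds far more. */
typedef struct {
    uint32_t regs[NUM_REGS];
    uint32_t pc;
} sim_state;

/* One step of a toy deterministic workload:
   r0 accumulates r1, and r3 is overwritten with r0 on every step. */
static void sim_step(sim_state *s)
{
    s->regs[0] += s->regs[1];
    s->regs[3] = s->regs[0];
    s->pc += 4;
}

static void sim_run(sim_state *s, int steps)
{
    for (int i = 0; i < steps; i++)
        sim_step(s);
}

/* Returns 1 if a bit-flip injected into the checkpoint changes the final
   state relative to the golden run, 0 if the fault is masked. */
int fault_changes_outcome(int checkpoint_step, int total_steps,
                          int reg, unsigned bit)
{
    assert(reg >= 0 && reg < NUM_REGS && bit < 32);
    assert(checkpoint_step >= 0 && checkpoint_step <= total_steps);

    sim_state checkpoint = { {0, 3, 7, 0}, 0 };

    /* Fault-free (golden) run: save a checkpoint, then run to the end. */
    sim_run(&checkpoint, checkpoint_step);
    sim_state golden = checkpoint;
    sim_run(&golden, total_steps - checkpoint_step);

    /* Faulty run: restart from the checkpoint instead of from the beginning,
       flip one bit in the saved state, and continue the simulation. */
    sim_state faulty = checkpoint;
    faulty.regs[reg] ^= (uint32_t)1 << bit;
    sim_run(&faulty, total_steps - checkpoint_step);

    return memcmp(&golden, &faulty, sizeof golden) != 0;
}
```

In this toy workload a flip in r1 propagates into r0 and is reported as a deviation, whereas a flip in r3 is overwritten on the next step and is therefore masked. Restarting from the checkpoint also shows how the steps before the injection point need not be re-executed.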

Petersén et al. [57] present a simulation-based platform for experimenting with fault injection and fault management, which utilizes an existing IEEE 1687 network for monitoring purposes. To simplify the experimenting, several parts of the platform are implemented using VHDL to model a multi-processor system-on-chip.

4.2 Fault injection emulating software bugs

Software fault injection (SFI) is a common technique to validate fault tolerance mechanisms in systems. Natella and Cotroneo [6] investigate whether SFI really does emulate transient software faults (mandelbugs) to a satisfactory degree. They perform a case study and analyse the SFI tool G-SWFIT, finding that the injected faults do not represent mandelbugs that well. This is because the injected faults are activated early in the execution phase and all process replicas are affected in a deterministic way.

Natella et al. [10] discuss several important aspects of SFI, such as how well the injected faults represent real faults, how efficiently the experiments can be performed and how usable the methods and tools are. Such a thorough analysis would also be needed for multi-core systems and would be useful once a sizeable body of work addresses injecting faults in multi-core systems.


URL:

https://www.sciencedirect.com/science/article/pii/S1383762117304903