| SYSCALL(9) | Kernel Developer's Manual | SYSCALL(9) |
syscall - system calls overview
System calls in the kernel are implemented through a set of switch tables for each emulation type. Each table is generated from the “master” file by sys/kern/makesyscalls.sh through the appropriate rules in the Makefile.
The “master” file is a text file with one logical line per system call. A line may be continued onto the next one by ending it with a backslash. Each line is a set of fields separated by whitespace:
number type ...
Where:
The rest of the line for the STD, NODEF, NOARGS, and COMPAT_XX types is:
{ pseudo-proto } [alias]
pseudo-proto is a C-like prototype used to generate the system call argument list, and alias is an optional name alias for the call. The function in the prototype has to be defined somewhere in the kernel sources as it will be used as an entry point for the corresponding system call.
For other types the rest of the line is a comment.
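The line format described above can be sketched in C. This is only an illustration under stated assumptions: it handles the backslash continuation and the leading number and type fields, and nothing else. The names join_continuations(), parse_entry(), and struct master_entry are hypothetical, and this is not how makesyscalls.sh itself (a shell script) processes the file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for the first two fields of a "master" line. */
struct master_entry {
	int number;		/* system call number */
	char type[16];		/* e.g. STD, NODEF, NOARGS, COMPAT_XX */
	const char *rest;	/* remainder of the line, left unparsed */
};

/*
 * Join backslash-continued lines in place: every backslash that is
 * immediately followed by a newline is replaced, together with that
 * newline, by a single space.  Returns buf for convenience.
 */
char *
join_continuations(char *buf)
{
	char *src = buf, *dst = buf;

	assert(buf != NULL);
	while (*src != '\0') {
		if (src[0] == '\\' && src[1] == '\n') {
			*dst++ = ' ';
			src += 2;
		} else
			*dst++ = *src++;
	}
	*dst = '\0';
	return buf;
}

/*
 * Split one logical line into number, type, and the rest.
 * Returns 0 on success, -1 if the line does not begin with
 * "number type".
 */
int
parse_entry(const char *line, struct master_entry *e)
{
	int consumed = 0;

	if (sscanf(line, "%d %15s %n", &e->number, e->type, &consumed) != 2)
		return -1;
	e->rest = line + consumed;
	return 0;
}
```

A caller would first run join_continuations() over the buffer and then hand each logical line to parse_entry(). For the types listed above, the rest field would hold the braces-enclosed pseudo-proto and the optional alias; for other types it is a comment.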
To generate the header and code files from the “master” file, run make(1) from the directory containing it.
Entry into a system call from user space is machine dependent. Typical code in the machine-dependent sources that invokes a system call might look like this:
const struct sysent *callp;
register_t code, args[8], rval[2];
struct proc *p = curproc;
int nsys, argsize, error, orig_error;
...
/* "code" is the system call number passed from user space */
...
if (code < 0 || code >= nsys)
	callp += p->p_emul->e_nosys;	/* illegal */
else
	callp += code;

/* copy in the arguments from user space */
...
rval[0] = 0;

/* the following steps are now performed using mi_syscall() */
#ifdef SYSCALL_DEBUG
scdebug_call(p, code, args);
#endif
#ifdef KTRACE
if (KTRPOINT(p, KTR_SYSCALL))
	ktrsyscall(p, code, argsize, args);
#endif
error = (*callp->sy_call)(p, args, rval);

switch (error) {
case 0:
	/* normal return */
	...
	break;
case ERESTART:
	/*
	 * adjust the PC to point before the system call
	 * instruction in user space, so that on return
	 * there we reenter the kernel and repeat the
	 * same system call
	 */
	...
	break;
case EJUSTRETURN:
	/* just return */
	break;
default:
	/*
	 * an error was returned:
	 * call the optional emulation errno mapping
	 * routine and return to the user.
	 */
	if (p->p_emul->e_errno)
		error = p->p_emul->e_errno[error];
	...
	break;
}

/* the following steps are now performed using mi_syscall_return() */
#ifdef SYSCALL_DEBUG
scdebug_ret(p, code, orig_error, rval);
#endif
userret(p);
#ifdef KTRACE
if (KTRPOINT(p, KTR_SYSRET))
	ktrsysret(p, code, orig_error, rval[0]);
#endif
The SYSCALL_DEBUG parts of the code are explained in the Debugging section below. For the KTRACE portions of the code refer to the ktrace(9) document for further explanations.
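The table lookup at the start of the fragment above can be tried outside the kernel with a small stand-alone sketch. Everything prefixed with ex_ or EX_ is a hypothetical stand-in rather than a kernel definition; only the selection logic (out-of-range numbers are redirected to the "nosys" slot, then the handler is called through sy_call) mirrors the fragment.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
typedef long ex_register_t;
typedef int (*ex_call_t)(void *p, ex_register_t *args, ex_register_t *rval);

struct ex_sysent {
	int sy_narg;		/* number of arguments */
	ex_call_t sy_call;	/* entry point */
};

/* Handler reached for out-of-range system call numbers. */
static int
ex_nosys(void *p, ex_register_t *args, ex_register_t *rval)
{
	(void)p; (void)args; (void)rval;
	return ENOSYS;
}

/* Example handler: stores the sum of its two arguments in rval[0]. */
static int
ex_add(void *p, ex_register_t *args, ex_register_t *rval)
{
	(void)p;
	rval[0] = args[0] + args[1];
	return 0;
}

static const struct ex_sysent ex_table[] = {
	{ 0, ex_nosys },	/* slot 0: the "nosys" entry */
	{ 2, ex_add },		/* slot 1 */
};
#define EX_NOSYS	0
#define EX_NSYS		((long)(sizeof(ex_table) / sizeof(ex_table[0])))

/*
 * Pick the table entry for "code" and call it.  Illegal numbers
 * are redirected to the "nosys" slot before the call is made.
 */
int
ex_dispatch(long code, ex_register_t *args, ex_register_t *rval)
{
	const struct ex_sysent *callp = ex_table;

	assert(args != NULL && rval != NULL);
	if (code < 0 || code >= EX_NSYS)
		callp += EX_NOSYS;	/* illegal */
	else
		callp += code;
	rval[0] = 0;
	return (*callp->sy_call)(NULL, args, rval);
}
```

Routing bad numbers to a dedicated table entry, instead of returning early, keeps a single call path: the debugging and tracing hooks around the call then see illegal system calls as well.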
For debugging purposes the line
option SYSCALL_DEBUG
should be included in the kernel configuration file (see options(4)). This allows tracing for calls, returns, and arguments for both implemented and non-implemented system calls. A global integer variable scdebug contains a mask for the desired logging events:
Use ddb(4) to set scdebug to the desired value.
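How a mask variable of this kind typically gates logging can be sketched as follows. The bit names and values here (EX_LOG_CALLS and so on) are hypothetical placeholders chosen for illustration; they are not the kernel's actual definitions, and ex_scdebug merely stands in for the global variable.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical event bits; the real names and values are defined in
 * the kernel sources and are not reproduced here.
 */
#define EX_LOG_CALLS	0x01	/* log system call entry */
#define EX_LOG_RETURNS	0x02	/* log system call return */
#define EX_LOG_ARGS	0x04	/* include arguments when logging */

/* Stand-in for the global mask variable. */
int ex_scdebug = 0;

/* Return 1 if every bit in "event" is enabled in the mask, else 0. */
int
ex_should_log(int event)
{
	return (ex_scdebug & event) == event;
}

/* Log a call only when the matching bit is set; returns 1 if logged. */
int
ex_log_call(long code)
{
	assert(code >= 0);
	if (!ex_should_log(EX_LOG_CALLS))
		return 0;
	printf("call %ld\n", code);
	return 1;
}
```

Because each event has its own bit, several kinds of logging can be enabled at once by OR-ing the bits together and storing the result in the mask.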
The syscall section manual page appeared in OpenBSD 3.4.
| December 13, 2023 | Debian |