
A fork() in the road

Andrew Baumann, Microsoft Research
Jonathan Appavoo, Boston University
Orran Krieger, Boston University
Timothy Roscoe, ETH Zurich

ABSTRACT
The received wisdom suggests that Unix's unusual combination of fork() and exec() for process creation was an inspired design. In this paper, we argue that fork was a clever hack for machines and programs of the 1970s that has long outlived its usefulness and is now a liability. We catalog the ways in which fork is a terrible abstraction for the modern programmer to use, describe how it compromises OS implementations, and propose alternatives.

As the designers and implementers of operating systems, we should acknowledge that fork's continued existence as a first-class OS primitive holds back systems research, and deprecate it. As educators, we should teach fork as a historical artifact, and not the first process creation mechanism students encounter.

ACM Reference Format:
Andrew Baumann, Jonathan Appavoo, Orran Krieger, and Timothy Roscoe. 2019. A fork() in the road. In Workshop on Hot Topics in Operating Systems (HotOS '19), May 13–15, 2019, Bertinoro, Italy. ACM, New York, NY, USA, 9 pages.




Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
HotOS '19, May 13–15, 2019, Bertinoro, Italy. © 2019 Copyright held by the owner/author(s). Publication rights licensed to ACM. ISBN 978-1-4503-6727-1/19/05.

1 INTRODUCTION
When the designers of Unix needed a mechanism to create processes, they added a peculiar new system call: fork(). As every undergraduate now learns, fork creates a new process identical to its parent (the caller of fork), with the exception of the system call's return value. The Unix idiom of fork() followed by exec() to execute a different program in the child is now well understood, but still stands in stark contrast to process creation in systems developed independently of Unix [e.g., 1, 30, 33, 54].

50 years later, fork remains the default process creation API on POSIX: Atlidakis et al. [8] found 1304 Ubuntu packages calling fork, compared to only 41 uses of the more modern posix_spawn(). Fork is used by almost every Unix shell, major web and database servers (e.g., Apache, PostgreSQL, and Oracle), Google Chrome, and the Redis key-value store, among others.

The received wisdom appears to hold that fork is a good design. Every OS textbook we reviewed [4, 7, 9, 35, 75, 78] covered fork in uncritical or positive terms, often noting its simplicity compared to alternatives. Students today are taught that "the fork system call is one of Unix's great ideas" [46] and "there are lots of ways to design APIs for process creation; however, the combination of fork() and exec() are simple and immensely powerful ... the Unix designers simply got it right" [7].
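To make the contrast between the two process creation styles concrete, the following is a minimal sketch in Python, whose os module wraps the POSIX calls discussed here. It assumes a POSIX system and Python 3.9 or later (for os.waitstatus_to_exitcode); the helper names run_fork_exec and run_spawn are hypothetical and chosen only for illustration.

```python
import os
import sys


def run_fork_exec(argv):
    """Run argv with the classic idiom: duplicate this process, then replace the copy."""
    pid = os.fork()
    if pid == 0:
        # Child: fork() returned 0. Replace this copy of the parent with argv.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; leave without running parent cleanup code.
            os._exit(127)
    # Parent: fork() returned the child's PID.
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


def run_spawn(argv):
    """Run argv with posix_spawn: the child is created directly from the program."""
    pid = os.posix_spawnp(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    command = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(run_fork_exec(command), run_spawn(command))
```

Both helpers return the child's exit code. The difference is that the first briefly creates a full copy of the calling process, whereas the second never exposes such a copy to the caller.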

Our goal is to set the record straight. Fork is an anachronism: a relic from another era that is out of place in modern systems where it has a pernicious and detrimental impact. As a community, our familiarity with fork can blind us to its faults (§4). Generally acknowledged problems with fork include that it is not thread-safe, it is inefficient and unscalable, and it introduces security concerns. Beyond these limitations, fork has lost its classic simplicity; it today impacts all the other operating system abstractions with which it was once orthogonal. Moreover, a fundamental challenge with fork is that, since it conflates the process and the address space in which it runs, fork is hostile to user-mode implementation of OS functionality, breaking everything from buffered IO to kernel-bypass networking. Perhaps most problematically, fork doesn't compose: every layer of a system from the kernel to the smallest user-mode library must support it.

We illustrate the havoc fork wreaks on OS implementations using our experiences with prior research systems (§5).

Fork limits the ability of OS researchers and developers to innovate because any new abstraction must be special-cased for it. Systems that support fork and exec efficiently are forced to duplicate per-process state lazily. This encourages the centralisation of state, a major problem for systems not structured using monolithic kernels. On the other hand, research systems that avoid implementing fork are unable to run the enormous body of software that uses it.

We end with a discussion of alternatives (§6) and a call to action (§7): fork should be removed as a first-class primitive of our systems, and replaced with good-enough emulation for legacy applications. It is not enough to add new primitives to the OS; fork must be removed from the kernel.

2 HISTORY: FORK BEGAN AS A HACK
Although the term originates with Conway, the first implementation of a fork operation is widely credited to the Project Genie time-sharing system [61].

Ritchie and Thompson [70] themselves claimed that Unix fork was present "essentially as we implemented it" in Genie. However, the Genie monitor's fork call was more flexible than that of Unix: it permitted the parent process to specify the address space and machine context for the new child process [49, 71]. By default, the child shared the address space of its parent (somewhat like a modern thread); optionally, the child could be given an entirely different address space of memory blocks to which the user had access, presumably in order to run a different program. Crucially, however, there was no facility to copy the address space, as was done unconditionally by Unix.

Ritchie [69] later noted that "it seems reasonable to suppose that it exists in Unix mainly because of the ease with which fork could be implemented without changing much else." He goes on to describe how the first fork was implemented in 27 lines of PDP-7 assembly, and consisted of copying the current process out to swap and keeping the child resident in memory.¹ Ritchie also noted that a combined Unix fork-exec "would have been considerably more complicated, if only because exec as such did not exist; its function was already performed, using explicit IO, by the shell."

The TENEX operating system [18] yields a notable counter-example to the Unix approach. It was also influenced by Project Genie, but evolved independently of Unix. Its designers also implemented a fork call for process creation; however, more similarly to Genie, the TENEX fork either shared the address space between parent and child, or else created the child with an empty address space [19]. There was no Unix-style copying of the address space, likely because virtual memory hardware was available.²

Unix fork was not a necessary inevitability [61]. It was an expedient PDP-7 implementation shortcut that, for 50 years, has pervaded modern OSes and applications.

¹ Sharing memory between parent and child (as in Genie) was impractical, because the PDP-7 lacked virtual memory hardware; instead, Unix implemented multiprocessing by swapping full processes to disk.
² TENEX also supported copy-on-write memory, but this does not appear to have been used by fork [20].

3 ADVANTAGES OF THE FORK API
When Unix was rewritten for the PDP-11 (with memory translation hardware permitting multiple processes to remain resident), copying the process's entire memory only to immediately discard it in exec was already, arguably, inefficient. We suspect that copying fork survived the early years of Unix mainly because programs and memory were small (eight 8 KiB pages on the PDP-11), memory access was fast relative to instruction execution, and it provided a compelling abstraction. There are two main aspects to this:

Fork was simple. As well as being easy to implement, fork simplified the Unix API. Most obviously, fork needs no arguments, because it provides a simple default for all the state of a new process: inherit it from the parent. In stark contrast, the Windows CreateProcess() API takes explicit parameters specifying every aspect of the child's kernel state: 10 parameters and many optional flags.

More significantly, creating a process with fork is orthogonal to starting a new program, and the space between fork and exec serves a useful purpose. Since fork duplicates the parent, the same system calls that permit a process to modify its kernel state can be reused in the child prior to exec: the shell opens, closes, and remaps file descriptors prior to command execution, and programs can reduce permissions or alter the namespace of a child to run it in restricted context.

Fork eased concurrency. In the days before threads or asynchronous IO, fork without exec provided an effective form of concurrency.
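As an illustration of fork without exec used as a concurrency mechanism, here is a minimal sketch in Python on a POSIX system. The function name parallel_squares and the choice of squaring as the per-child work are hypothetical; the point is only that each child inherits the parent's code and data and handles one input.

```python
import os


def parallel_squares(inputs):
    """Fork one child per input; each child reports its result through a pipe."""
    children = []
    for value in inputs:
        read_end, write_end = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: a copy of the parent, so `value` and this code are already here.
            try:
                os.close(read_end)
                os.write(write_end, str(value * value).encode())
            finally:
                os._exit(0)  # never fall back into the parent's control flow
        # Parent: keep only the read end, so EOF arrives when the child exits.
        os.close(write_end)
        children.append((pid, read_end))

    results = []
    for pid, read_end in children:
        with os.fdopen(read_end) as pipe:
            results.append(int(pipe.read()))
        os.waitpid(pid, 0)
    return results
```

No program is loaded and no arguments are marshalled: the children simply continue from the point of the fork with the state the parent had already built up.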

In the days before shared libraries, fork also enabled a simple form of code reuse. A program could initialise, parse its configuration files, and then fork multiple copies of itself that ran either different functions from the same binary or processed different inputs. This design lives on in pre-forking servers; we return to it later.

4 FORK IN THE MODERN ERA
At first glance, fork still seems simple. We argue that this is a deceptive myth, and that fork's effects cause modern applications more harm than good.

Fork is no longer simple. Fork's semantics have infected the design of each new API that creates process state. The POSIX specification now lists 25 special cases in how the parent's state is copied to the child [63]: file locks, timers, asynchronous IO operations, tracing, etc. In addition, numerous system call flags control fork's behaviour with respect to memory mappings (Linux madvise() flags MADV_DONTFORK/DOFORK/WIPEONFORK, etc.), file descriptors (O_CLOEXEC, FD_CLOEXEC) and threads (pthread_atfork()). Any non-trivial OS facility must document its behaviour across a fork, and user-mode libraries must be prepared for their state to be forked at any time.
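The file descriptor flags mentioned above can be observed directly. The following is a minimal sketch in Python on a POSIX system (Python 3.9 or later), where os.set_inheritable() sets or clears the close-on-exec flag on a descriptor; Python creates descriptors with close-on-exec set by default. The helper name child_can_write is hypothetical.

```python
import os
import sys

# Program run by the child after exec: try to write one byte to the given descriptor.
CHILD_CODE = (
    "import os, sys\n"
    "try:\n"
    "    os.write(int(sys.argv[1]), b'x')\n"
    "except OSError:\n"
    "    sys.exit(1)\n"
)


def child_can_write(inheritable):
    """Return True if a fork+exec'd child can still use the parent's pipe descriptor."""
    read_end, write_end = os.pipe()             # both ends start as close-on-exec
    os.set_inheritable(write_end, inheritable)  # True clears FD_CLOEXEC, False sets it
    pid = os.fork()
    if pid == 0:
        try:
            os.execv(sys.executable,
                     [sys.executable, "-c", CHILD_CODE, str(write_end)])
        finally:
            os._exit(127)  # only reached if exec failed
    os.close(write_end)
    _, status = os.waitpid(pid, 0)
    data = os.read(read_end, 1)                 # b"" at EOF if nothing was written
    os.close(read_end)
    return os.waitstatus_to_exitcode(status) == 0 and data == b"x"
```

After fork the child always holds a copy of the descriptor; the flag decides only whether that copy survives the subsequent exec. Every API that hands out descriptors therefore has to take a position on this flag.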

The simplicity and orthogonality of fork is now a myth.

Fork doesn't compose. Because fork duplicates an entire address space, it is a poor fit for OS abstractions implemented in user-mode. Buffered IO is a classic example: a user must explicitly flush IO prior to fork, lest output be duplicated [73].

Fork isn't thread-safe. Unix processes today support threads, but a child created by fork has only a single thread (a copy of the calling thread). Unless the parent serialises fork with respect to its other threads, the child address space may end up as an inconsistent snapshot of the parent. A simple but common case is one thread doing memory allocation and holding a heap lock, while another thread forks. Any attempt to allocate memory in the child (and thus acquire the same lock) will immediately deadlock waiting for an unlock operation that will never happen. Programming guides advise not using fork in a multi-threaded process, or calling exec immediately afterwards [64, 76, 77].

[Figure 1: Cost of fork()+exec() vs. posix_spawn(). Axes: parent process size (MiB) against time (ms). Series: fork+exec (fragmented), fork+exec (dirty), spawn.]
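The buffered IO hazard described above can be reproduced in a few lines. This is a minimal sketch in Python on a POSIX system; the helper name copies_written is hypothetical, and the child flushes explicitly because os._exit() does not flush buffers the way a C exit() would.

```python
import os
import tempfile


def copies_written(flush_before_fork):
    """Write one line through a buffered file, fork, and count the copies on disk."""
    fd, path = tempfile.mkstemp()
    try:
        out = os.fdopen(fd, "w")   # block-buffered: data stays in user space for now
        out.write("hello\n")
        if flush_before_fork:
            out.flush()            # hand the data to the kernel before duplicating
        pid = os.fork()
        if pid == 0:
            # Child: its address space includes a copy of the unflushed buffer.
            try:
                out.flush()
            finally:
                os._exit(0)
        os.waitpid(pid, 0)
        out.close()                # parent flushes its own copy of the buffer
        with open(path) as written:
            return written.read().count("hello")
    finally:
        os.unlink(path)
```

Under these assumptions, copies_written(False) returns 2 and copies_written(True) returns 1: the kernel knows nothing about the user-mode buffer, so fork silently duplicates pending output unless the caller flushes first.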

