Transcription of OpenDataStructures(inpseudocode)
1 Open Data Structures (in pseudocode)Edition Pat MorinContentsAcknowledgmentsixWhy This Book?xi1 The Need for Efficiency.. Interfaces.. The Queue, Stack, and Deque Interfaces.. The List Interface: Linear Sequences.. The USet Interface: Unordered Sets.. The SSet Interface: Sorted Sets.. Mathematical Background.. Exponentials and Logarithms.. Factorials.. Asymptotic Notation.. Randomization and Probability.. The Model of Computation.. Correctness, Time Complexity, and Space Complexity.. Code Samples.. List of Data Structures.. Discussion and Exercises.. 232 Array-Based ArrayStack: Fast Stack Operations Using an Array.. The Basics.. Growing and Shrinking.. Summary.. FastArrayStack: An Optimized ArrayStack.. ArrayQueue: An Array-Based Queue.. Summary.. ArrayDeque: Fast Deque Operations Using an Array.. Summary.. DualArrayDeque: Building a Deque from Two Stacks.
2 Balancing.. Summary.. RootishArrayStack: A Space-Efficient Array Stack.. Analysis of Growing and Shrinking.. Space Usage.. Summary.. Discussion and Exercises.. 573 Linked SLList: A Singly-Linked List.. Queue Operations.. Summary.. DLList: A Doubly-Linked List.. Adding and Removing.. Summary.. SEList: A Space-Efficient Linked List.. Space Requirements.. Finding Elements.. Adding an Element.. Removing an Element.. Amortized Analysis of Spreading and Gathering.. Summary.. Discussion and Exercises.. 784 The Basic Structure.. SkiplistSSet: An Efficient SSet.. Summary.. SkiplistList: An Efficient Random-Access List.. Summary.. Analysis of Skiplists.. Discussion and Exercises.. 975 Hash ChainedHashTable: Hashing with Chaining.. Multiplicative Hashing.. Summary.. LinearHashTable: Linear Probing.. Analysis of Linear Probing.
3 Summary.. Tabulation Hashing.. Hash Codes.. Hash Codes for Primitive Data Types.. Hash Codes for Compound Objects.. Hash Codes for Arrays and Strings.. Discussion and Exercises.. 1226 Binary BinaryTree: A Basic Binary Tree.. Recursive Algorithms.. Traversing Binary Trees.. BinarySearchTree: An Unbalanced Binary Search Tree.. Searching.. Addition.. Removal.. Summary.. Discussion and Exercises.. 1407 Random Binary Search Random Binary Search Trees.. Proof of Lemma .. Summary.. Treap: A Randomized Binary Search Tree.. Summary.. Discussion and Exercises.. 160vContents8 Scapegoat ScapegoatTree: A Binary Search Tree with Partial Analysis of Correctness and Running-Time.. Summary.. Discussion and Exercises.. 1739 Red-Black 2-4 Trees.. Adding a Leaf.. Removing a Leaf.. RedBlackTree: A Simulated 2-4 Tree.. Red-Black Trees and 2-4 Trees.. Left-Leaning Red-Black Trees.
4 Addition.. Removal.. Summary.. Discussion and Exercises.. 19810 BinaryHeap: An Implicit Binary Tree.. Summary.. MeldableHeap: A Randomized Meldable Heap.. Analysis of merge(h1,h2).. Summary.. Discussion and Exercises.. 21411 Sorting Comparison-Based Sorting.. Merge-Sort.. Quicksort.. Heap-sort.. A Lower-Bound for Comparison-Based Sorting.. Counting Sort and Radix Sort.. Counting Sort.. Radix-Sort.. Discussion and Exercises.. 235vi12 AdjacencyMatrix: Representing a Graph by a Matrix.. AdjacencyLists: A Graph as a Collection of Lists.. Graph Traversal.. Breadth-First Search.. Depth-First Search.. Discussion and Exercises.. 25213 Data Structures for BinaryTrie: A digital search tree.. XFastTrie: Searching in Doubly-Logarithmic Time.. YFastTrie: A Doubly-Logarithmic Time SSet.. Discussion and Exercises.. 27214 External Memory The Block Store.. B-Trees.. Searching.. Addition.
5 Removal.. Amortized Analysis ofB-Trees.. Discussion and Exercises.. 296 Bibliography301 Index309viiAcknowledgmentsI am grateful to Nima Hoda, who spent a summer tirelessly proofread-ing many of the chapters in this book; to the students in the Fall 2011offering of COMP2402/2002, who put up with the first draft of this bookand spotted many typographic, grammatical, and factual errors; and toMorgan Tunzelmann at Athabasca University Press, for patiently editingseveral near-final This Book?There are plenty of books that teach introductory data structures. Someof them are very good. Most of them cost money, and the vast majorityof computer science undergraduate students will shell out at least somecash on a data structures free data structures books are available online. Some are verygood, but most of them are getting old. The majority of these books be-came free when their authors and/or publishers decided to stop updat-ing them.
6 Updating these books is usually not possible, for two reasons:(1) The copyright belongs to the author and/or publisher, either of whommay not allow it. (2) Thesource codefor these books is often not avail-able. That is, the Word, WordPerfect, FrameMaker, or LATEX source forthe book is not available, and even the version of the software that han-dles this source may not be goal of this project is to free undergraduate computer science stu-dents from having to pay for an introductory data structures book. I havedecided to implement this goal by treating this book like an Open Sourcesoftware project. The LATEX source, pseudocode source, and build scriptsfor the book are available to download from the author s website1andalso, more importantly, on a reliable source code management source code available there is released under a Creative CommonsAttribution license, meaning that anyone is free toshare: to copy, dis-tribute and transmit the work; and toremix: to adapt the work, includingthe right to make commercial use of the work.
7 The only condition onthese rights isattribution: you must acknowledge that the derived workcontains code and/or text This Book?Anyone can contribute corrections/fixes using thegitsource-codemanagement system. Anyone can also fork the book s sources to developa separate version (for example, in another programming language). Myhope is that, by doing things this way, this book will continue to be a use-ful textbook long after my interest in the project, or my pulse, (whichevercomes first) has 1 IntroductionEvery computer science curriculum in the world includes a course on datastructures and algorithms. Data structures arethatimportant; they im-prove our quality of life and even save lives on a regular basis. Manymulti-million and several multi-billion dollar companies have been builtaround data can this be? If we stop to think about it, we realize that we inter-act with data structures constantly. Open a file: File system data structures are used to locate the partsof that file on disk so they can be retrieved.
8 This isn t easy; diskscontain hundreds of millions of blocks. The contents of your filecould be stored on any one of them. Look up a contact on your phone: A data structure is used to lookup a phone number in your contact list based on partial informationeven before you finish dialing/typing. This isn t easy; your phonemay contain information about a lot of people everyone you haveever contacted via phone or email and your phone doesn t have avery fast processor or a lot of memory. Log in to your favourite social network: The network servers useyour login information to look up your account information. Thisisn t easy; the most popular social networks have hundreds of mil-lions of active users. Do a web search: The search engine uses data structures to find theweb pages containing your search terms. This isn t easy; there are1 billion web pages on the Internet and each page contains alot of potential search terms. Phone emergency services (9-1-1): The emergency services networklooks up your phone number in a data structure that maps phonenumbers to addresses so that police cars, ambulances, or fire truckscan be sent there without delay.
9 This is important; the person mak-ing the call may not be able to provide the exact address they arecalling from and a delay can mean the difference between life The Need for EfficiencyIn the next section, we look at the operations supported by the most com-monly used data structures. Anyone with a bit of programming experi-ence will see that these operations are not hard to implement can store the data in an array or a linked list and each operation canbe implemented by iterating over all the elements of the array or list andpossibly adding or removing an kind of implementation is easy, but not very efficient. Does thisreally matter? Computers are becoming faster and faster. Maybe the ob-vious implementation is good enough. Let s do some rough calculationsto find of operations:Imagine an application with a moderately-sizeddata set, say of one million (106), items. It is reasonable, in most appli-cations, to assume that the application will want to look up each itemat least once.
10 This means we can expect to do at least one million (106)searches in this data. If each of these 106searches inspects each of the106items, this gives a total of 106 106= 1012(one thousand billion) speeds:At the time of writing, even a very fast desktop com-puter can not do more than one billion (109) operations per speeds are at most a few gigahertz (billions of cycles per second), and eachoperation typically takes a few Need for Efficiency that this application will take at least 1012/109= 1000 seconds, orroughly 16 minutes and 40 seconds. Sixteen minutes is an eon in com-puter time, but a person might be willing to put up with it (if he or shewere headed out for a coffee break).Bigger data sets:Now consider a company like Google, that indexesover billion web pages. By our calculations, doing any kind of queryover this data would take at least seconds. We already know that thisisn t the case; web searches complete in much less than seconds, andthey do much more complicated queries than just asking if a particularpage is in their list of indexed pages.