Transcription of NoSQL Distilled: A Brief Guide to the Emerging World of ...
NoSQL Distilled
A Brief Guide to the Emerging World of Polyglot Persistence

Pramod J. Sadalage
Martin Fowler

Upper Saddle River, NJ Boston Indianapolis San Francisco New York Toronto Montreal London Munich Paris Madrid Capetown Sydney Tokyo Singapore Mexico City

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.

The authors and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales, which may include electronic versions and/or custom covers and content particular to your business, training goals, marketing focus, and branding interests.
For more information, Corporate and Government Sales (800) 382
sales outside the United States please contact: International
us on the Web:

Library of Congress Cataloging-in-Publication Data:
Sadalage, Pramod J.
NoSQL distilled : a brief guide to the emerging world of polyglot persistence / Pramod J Sadalage, Martin Fowler.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-321-82662-6 (pbk. : alk. paper) -- ISBN 0-321-82662-0 (pbk. : alk. paper)
1. Databases--Technological innovations. 2. Information storage and retrieval systems. I. Fowler, Martin, 1963- II. Title.
2013

Copyright 2013 Pearson Education, Inc. All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise.
To obtain permission to use material from this work, please submit a written request to Pearson Education, Inc., Permissions Department, One Lake Street, Upper Saddle River, New Jersey 07458, or you may fax your request to (201) 236

ISBN-13: 978-0-321-82662-6
ISBN-10: 0-321-82662-0

Text printed in the United States on recycled paper at RR Donnelley in Crawfordsville, printing, August 2012

For my teachers Gajanan Chinchwadkar, Dattatraya Mhaskar, and Arvind Parchure. You inspired me the most, thank you.
Pramod

For Cindy
Martin

Contents

Preface

Part I: Understand
Chapter 1: Why NoSQL? The Value of Relational Getting at Persistent A (Mostly) Standard Impedance Application and Integration Attack of the The Emergence of Key Points
Chapter 2: Aggregate Data Example of Relations and Consequences of Aggregate Key-Value and Document Data Column-Family Summarizing Aggregate-Oriented Further Key Points
Chapter 3: More Details on Data Graph Schemaless Materialized Modeling for Data Key Points
Chapter 4: Distribution Single Master-Slave Peer-to-Peer Combining Sharding and Key Points
Chapter 5: Update Read Relaxing The CAP Relaxing Further Key Points
Chapter 6: Version Business and System Version Stamps on Multiple Key Points
Chapter 7.
Basic Partitioning and Composing Map-Reduce A Two Stage Map-Reduce Incremental Further Key Points

Part II: Implement
Chapter 8: Key-Value What Is a Key-Value Key-Value Store Query Structure of Suitable Use Storing Session User Profiles, Shopping Cart When Not to Relationships among Multioperation Query by Operations by Sets
Chapter 9: Document What Is a Document Database? Query Suitable Use Event Content Management Systems, Blogging Web Analytics or Real-Time E-Commerce When Not to Complex Transactions Spanning Different Queries against Varying Aggregate Structure
Chapter 10: Column-Family What Is a Column-Family Data Store? Query Suitable Use Event Content Management Systems, Blogging Expiring When Not to Use
Chapter 11: Graph What Is a Graph Database?
Query Suitable Use Connected Routing, Dispatch, and Location-Based Recommendation When Not to Use
Chapter 12: Schema Schema Schema Changes in Migrations for Green Field Migrations in Legacy Schema Changes in a NoSQL Data Incremental Migrations in Graph Changing Aggregate Further Key Points
Chapter 13: Polyglot Disparate Data Storage Polyglot Data Store Service Usage over Direct Data Store Expanding for Better Choosing the Right Enterprise Concerns with Polyglot Deployment Key Points
Chapter 14: Beyond File Event Memory Version XML Object Key Points
Chapter 15: Choosing Your Programmer Data-Access Sticking with the Hedging Your Key Final Thoughts

Bibliography
Index

Preface

We've spent some twenty years in the world of enterprise computing.
We've seen many things change in languages, architectures, platforms, and processes. But through all this time one thing has stayed constant: relational databases store the data. There have been challengers, some of which have had success in some niches, but on the whole the data storage question for architects has been the question of which relational database to use.

There is a lot of value in the stability of this reign. An organization's data lasts much longer than its programs (at least that's what people tell us; we've seen plenty of very old programs out there). It's valuable to have a stable data storage that's well understood and accessible from many application programming platforms.

Now, however, there's a new challenger on the block under the confrontational tag of NoSQL. It's born out of a need to handle larger data volumes, which forced a fundamental shift to building large hardware platforms through clusters of commodity servers.
This need has also raised long-running concerns about the difficulties of making application code play well with the relational data model.

The term NoSQL is very ill-defined. It's generally applied to a number of recent nonrelational databases such as Cassandra, Mongo, Neo4J, and Riak. They embrace schemaless data, run on clusters, and have the ability to trade off traditional consistency for other useful properties. Advocates of NoSQL databases claim that they can build systems that are more performant, scale much better, and are easier to program with.

Is this the first rattle of the death knell for relational databases, or yet another pretender to the throne? Our answer to that is neither. Relational databases are a powerful tool that we expect to be using for many more decades, but we do see a profound change in that relational databases won't be the only databases in use.
Our view is that we are entering a world of Polyglot Persistence where enterprises, and even individual applications, use multiple technologies for data management. As a result, architects will need to be familiar with these technologies and be able to evaluate which ones to use for differing needs. Had we not thought that, we wouldn't have spent the time and effort writing this book.

This book seeks to give you enough information to answer the question of whether NoSQL databases are worth serious consideration for your future projects. Every project is different, and there's no way we can write a simple decision tree to choose the right data store. Instead, what we are attempting here is to provide you with enough background on how NoSQL databases work, so that you can make those judgments yourself without having to trawl the whole web.
We've deliberately made this a small book, so you can get this overview pretty quickly. It won't answer your questions definitively, but it should narrow down the range of options you have to consider and help you understand what questions you need to ask.

Why Are NoSQL Databases Interesting?

We see two primary reasons why people consider using a NoSQL database.

Application development productivity. A lot of application development effort is spent on mapping data between in-memory data structures and a relational database. A NoSQL database may provide a data model that better fits the application's needs, thus simplifying that interaction and resulting in less code to write, debug, and evolve.

Large-scale data. Organizations are finding it valuable to capture more data and process it more quickly. They are finding it expensive, if even possible, to do so with relational databases.
The primary reason is that a relational database is designed to run on a single machine, but it is usually more economical to run large data and computing loads on clusters of many smaller and cheaper machines. Many NoSQL databases are designed explicitly to run on clusters, so they make a better fit for big data scenarios.

What's in the Book

We've broken this book up into two parts. The first part concentrates on core concepts that we think you need to know in order to judge whether NoSQL databases are relevant for you and how they differ. In the second part we concentrate more on implementing systems with NoSQL databases.

Chapter 1 begins by explaining why NoSQL has had such a rapid rise: the need to process larger data volumes led to a shift, in large systems, from scaling vertically to scaling horizontally on clusters. This explains an important feature of the data model of many NoSQL databases: the explicit storage of a rich structure of closely related data that is accessed as a unit.
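
To make the idea of "closely related data that is accessed as a unit" concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the order, its field names, the two table shapes, and the plain dict standing in for a key-value store are all invented for this example and are not taken from any particular database. It contrasts the mapping work needed to flatten a nested in-memory structure into relational rows with storing the same structure whole under a single key.

```python
import json

# Hypothetical in-memory order; every field name and value is invented.
order = {
    "id": 99,
    "customer": "Ann",
    "lines": [
        {"product": "NoSQL Distilled", "qty": 1, "price": 32.45},
        {"product": "Refactoring", "qty": 2, "price": 40.0},
    ],
}


def to_relational_rows(order):
    """Flatten one nested order into rows for two hypothetical tables."""
    order_row = (order["id"], order["customer"])
    line_rows = [
        (order["id"], n, line["product"], line["qty"], line["price"])
        for n, line in enumerate(order["lines"], start=1)
    ]
    return order_row, line_rows


def to_aggregate(order):
    """Serialize the whole order as a single value under a single key."""
    return str(order["id"]), json.dumps(order)


# Relational style: the nested structure is split across two row sets.
order_row, line_rows = to_relational_rows(order)

# Aggregate style: one write and one read move the whole unit.
store = {}  # a plain dict standing in for a key-value store (assumption)
key, value = to_aggregate(order)
store[key] = value
restored = json.loads(store["99"])
```

In the relational form, reading the order back means joining the rows and rebuilding the nested structure; in the aggregate form, `restored` comes back in the same shape the application already uses.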