Transcription of Introduction to Database Systems - - turingMachine
1 Introduction to Database SystemsUVic C SC 370Dr. Daniel M. GermanDepartment of Computer ScienceMay 5, 2004 Version: 1 Introduction to Database Systems ( )CSC 370 is a DBMS? what is a relational DBMS?EWhy do we need them?EHow do we represent and store data in a DBMS?EHow does it support concurrent access and system failures?1 2 Introduction to Database Systems ( )CSC 370 how do we store lots of data?EAssume you work for Walmart (and Database managementsystems have not been invented) and you are asked to write acollection of programs that can store and retrieve every single sellin every store of the chain (could be a Tbyte of info)1 3 Introduction to Database Systems ( )CSC 370 How do you do it? How do you find and retrieve data?
2 How many files do you need? How many disks do you need? And if we add complexity: How do you operate on the data? How do you allow concurrent access and modifications to thedata? The problems are not trivial1 4 Introduction to Database Systems ( )CSC 370 a collection of data, typically describing theactivities of one or more related organizations. A databaseiscomposed of:FEntitiesFRelationsEADatabase Management SystemorDBMSis softwaredesigned to assist in maintaining and utilizing large collections 5 Introduction to Database Systems ( )CSC 370 are we going to cover?EDatabase design and application development: how do werepresent the world with a Database ?EData analysis: how can we answer questions about the enterpriseusing this data?
3 EConcurrency and robustness: How does a DBMS allow manyusers to access data concurrently, and how does it protect againstfailures?EEfficiency and Scalability: How does the Database cope withlarge amounts of data?1 6 Introduction to Database Systems ( )CSC 370 bit of historyEEarly 1960s:Charles Bachmanat GE creates the first generalpurpose DBMSI ntegrated Data Store. It creates the basis for thenetwork model(standardized by CODASYL)ELate 1960s:IBMdevelops theInformation Management system (IMS). It uses an alternate model, called thehierarchical datamodel. SABRE is created around :Edgar Codd, fromIBMcreates therelational data 1981 Codd receives the Turing Award for his contributionstodatabase passed away 2 weeks ago (April 2003)1 7 Introduction to Database Systems ( )CSC 370 bit of , developed byIBM, becomes the standard querylanguage for databases.
4 SQL is standardized by and 1990s, IBM, Oracle, Informix and others developpowerful the Internet Age, DBMS are showing how useful they can 8 Introduction to Database Systems ( )CSC 370 do we use a DBMS?EData independenceEEfficient data accessEData integrity and securityEData administrationEConcurrent access and crash recoveryEReduced application development timeSee textbook, section 9 Introduction to Database Systems ( )CSC 370 Relational Data Model: introductionEAdata modelis a collection of high level description constructsthat hide many low-level storage detailsEMost current DBMS use therelational data modelEThe central data description in this model is therelation(a set oftuples same as in set theory mathematics)EFor convenience, we refer to each tuple as a rowEAschemais a description of the data in terms of the data the relational model the schema looks like:RelationN ame(f ield1:type1.)
5 , f ieldn:typen)1 10 Introduction to Database Systems ( )CSC 370 Data Model: exampleEA relation of students:Students(sid:string, name:string, age:integer, gpa:real)EAn instance of the students relation can be represented 11 Introduction to Database Systems ( )CSC 370 Data Model: exampleEEach row is atuplein the relation (arecordin the DBMS)EWe can addintegrity constraints(assertions on the data) such asevery student has a differentsid1 12 Introduction to Database Systems ( )CSC 370 of Abstraction in a DBMSE xternal Schema 1 External Schema 3 External Schema 2 Conceptual SchemaPhysical SchemaDisk1 13 Introduction to Database Systems ( )CSC 370 SchemaEThethe conceptual schemadescribes the data stored in thedatabaseFIn a relational Database it describes all the relations stored inthe databaseECreating a good conceptual schema is not a simple task, and itiscalledconceptual Database design.
6 It involves:FDetermining the different relations neededFThe number of fields per relationFThe type of each 14 Introduction to Database Systems ( )CSC 370 of a Conceptual SchemaStudents(sid:string, name: string, login: string,age: integer, gpa: integer,gpa: real)Faculty(fid: string, fname: string, sal: real)Courses(cid: string, cname: string, credits: integer)Teaches(fid: string, cid: string)Enrolled(sid: string, cid: string, grade: string)..1 15 Introduction to Database Systems ( )CSC 370 SchemaEThephysical schemaspecifies how the relations are actuallystored in secondary storage devicesEIt also specifies auxiliary data structures (indexes) used to speedup the access to the relationsEDecisions about the physical schema depend upon.
7 FUnderstanding how the data is going to be accessedFThe facilities provided by the DBMS1 16 Introduction to Database Systems ( )CSC 370 of Physical SchemaEStore all relations in unsorted files of recordsECreate indexes in the first column of every relation, and in thesalcolumn 17 Introduction to Database Systems ( )CSC 370 SchemaETheexternal schemais a refinement of the conceptual schemaEAllows customized and authorized access to individual users orgroups of usersEEvery Database hasoneconceptual andonephysical schema, butit can havemanyexternal schemasEEach external schema: usersFis tailored to a particular group of usersFconsists ofone or more viewsandrelationsof the conceptualschemaEAviewis conceptually a relation, but its records are not stored inthe Database .
8 Instead, they are computed from other 18 Introduction to Database Systems ( )CSC 370 of External SchemaCourseInfo(cid: string, fname: string,enrollment: integer)1 19 Introduction to Database Systems ( )CSC 370 IndependenceEData Independencemeans that programs are isolated fromchanges in the way the data is structured and long as we maintain the external schema, we can modify theother 2 schemas of an applicationFLogical Data Independence: users are shielded from thelogical structure of the data ( a relation is split into 2ormore)FPhysical Data Independence: As long as the conceptualschema remains the same, we can change the storage detailsof the application without affecting the 20 Introduction to Database Systems ( )CSC 370 in a DBMSEWhy do we need a DBMS?
9 To answer queriesEA DBMS provides a specialized language, calledquerylanguageto ask questions to the DBMS1 21 Introduction to Database Systems ( )CSC 370 ManagementEWhat happens when a DBMS has more than one concurrent user?EWhen several users access (and possibly modify) a databaseconcurrently, the DBMS must order their request carefully toavoid conflictsEDBMS should also protect users from system failures:Fit should make sure data is not lostFit should deal with crashesin the middleof atransactionETransaction: a transaction is a conceptually indivisible group ofoperations that a user wants to perform (for example, gettingtransferring money from one account to another)1 22 Introduction to Database Systems ( )CSC 370 Execution of TransactionEAn important task of a DBMS is to schedule concurrent accessesin a way that every user can ignore the fact that others areaccessing the data at the same timeEA DBMS allows user to think that their programs are executed inisolationELocking has to be implemented to allow transactions to beinterleavedFShared Locks: allow several transactions to hold (and access)an object at the same timeFExclusive Locks.
10 Only one transaction can hold the object1 23 Introduction to Database Systems ( )CSC 370 Transactions and system CrashesEWhat happens if a DBMS crashes in the middle of a transaction?EWhen the DBMS recovers, the incomplete transaction should DBMS maintains alogof everything it writesEThe log is createdbeforethe operation is done:write-ahead log1 24 Introduction to Database Systems ( )CSC 370 of a DBMS1 25 Introduction to Database Systems ( )CSC 370 who deal with databasesEEnd user (maybe through a program):FWide range of skills and needsEProgrammers:FUsually combine code with DBMS commandsEDatabase administrator: responsible for:FDesign of the conceptual and physical schemasFSecurity and AuthorizationFData availability and failure recoveryFDatabase tuning1 26 Introduction to Database Systems ( )CSC 370 we are going to learn hereEHow to be a programmer.