Transcription of Introduction to Database Concepts
1 1 Part I: Introduction to DatabasesKostis Sagonas2 Introduction to DatabasesIntroduction to Database ConceptsQPurpose of Database SystemsQView of DataQData ModelsQData Definition LanguageQData Manipulation Language3 Introduction to DatabasesDatabase Management System (DBMS)QCollection of interrelated dataQSet of programs to access the dataQDBMS contains information about a particular enterpriseQDBMS provides an environment that is both convenient andefficient to Applications:+Banking: all transactions+Airlines: reservations, schedules+Universities: registration, grades+Sales: customers, products, purchases+Manufacturing: production, inventory, orders, supply chain+Human resources: employee records, salaries, tax deductionsQDatabases touch all aspects of our lives4 Introduction to DatabasesPurpose of Database SystemQIn the early days, Database applications were built on top offile systemsQDrawbacks of using file systems to store data:+Data redundancy and inconsistency Multiple file formats, duplication of information in different files+Difficulty in accessing data Need to write a new program to carry out each new task+Data isolation multiple files and formats+Integrity problems Integrity constraints ( account balance > 0) become partof program code Hard to add new constraints or change existing ones5 Introduction to DatabasesPurpose of Database Systems (Cont.)
2 QDrawbacks of using file systems (cont.)+Atomicity of updates Failures may leave Database in an inconsistent state with partialupdates carried out transfer of funds from one account to another should eithercomplete or not happen at all+Concurrent access by multiple users Concurrent accessed needed for performance Uncontrolled concurrent accesses can lead to inconsistencies two people reading a balance and updating it at the sametime+Security problemsQDatabase systems offer solutions to all the above problems6 Introduction to DatabasesLevels of AbstractionQPhysical level describes how a record ( , customer) is level: describes data stored in Database , and therelationships among the customer = recordname : string;street : string;city : integer;end;QView level: application programs hide details of data can also hide information ( , salary) for to DatabasesInstances and SchemasQSimilar to types and variables in programming languagesQSchema the logical structure of the Database + , the Database consists of information about a set of customers andaccounts and the relationship between them)+Analogous to type information of a variable in a program+Physical schema: Database design at the physical level+Logical schema.
3 Database design at the logical levelQInstance the actual content of the Database at a particular point in time+Analogous to the value of a variableQPhysical Data Independence the ability to modify the physical schemawithout changing the logical schema+Applications depend on the logical schema+In general, the interfaces between the various levels and components shouldbe well defined so that changes in some parts do not seriously influence to DatabasesData ModelsQA collection of tools for describing+data+data relationships+data semantics+data constraintsQEntity-Relationship modelQRelational modelQOther models:+object-oriented model+semi-structured data models+Older models: network model and hierarchical model9 Introduction to DatabasesEntity-Relationship ModelExample of schema in the entity-relationship model10 Introduction to DatabasesEntity Relationship Model (Cont.)
4 QE-R model of real world+Entities (objects) customers, accounts, bank branch+Relationships between entities Account A-101 is held by customer Johnson Relationship set depositor associates customers with accountsQWidely used for Database design+ Database design in E-R model usually converted to design in therelational model (coming up next) which is used for storage andprocessing11 Introduction to DatabasesRelational ModelQExample of tabular data in the relational modelcustomer-namecustomer-idcustomer-st reetcustomer-cityaccount-numberJohnsonSm ithJohnsonJonesSmith192-83-7465019-28-37 46192-83-7465321-12-3123019-28-3746 AlmaNorthAlmaMainNorthPalo AltoRyePalo AltoHarrisonRyeA-101A-215A-201A-217A-201 Attributes12 Introduction to DatabasesA Sample Relational Database313 Introduction to DatabasesData Definition Language (DDL)QSpecification notation for defining the Database schema+ table account ( account-number char(10), balance integer)QDDL compiler generates a set of tables stored in a datadictionaryQData dictionary contains metadata ( , data about data)
5 + Database schema+Data storage and definition language language in which the storage structure and access methodsused by the Database system are specified Usually an extension of the data definition language14 Introduction to DatabasesData Manipulation Language (DML)QLanguage for accessing and manipulating the data organized bythe appropriate data model+DML also known as query languageQTwo classes of languages+Procedural user specifies what data is required and how to getthose data+Nonprocedural user specifies what data is required withoutspecifying how to get those dataQSQL is the most widely used query language15 Introduction to DatabasesSQLQSQL: widely used non-procedural language+ find the name of the customer with customer-id 192-83-7465select customerwhere = 192-83-7465 + find the balances of all accounts held by the customer withcustomer-id 192-83-7465select depositor, accountwhere = 192-83-7465 and = programs generally access databases through+Language extensions that allow embedded SQL+Application program interfaces ( ODBC/JDBC) which allow SQLqueries to be sent to a databasePart II.
6 The Relational Model17 Introduction to DatabasesThe Relational ModelQStructure of Relational DatabasesQRelational AlgebraQTuple Relational CalculusQDomain Relational CalculusQExtended Relational-Algebra-OperationsQModificati on of the DatabaseQViews18 Introduction to DatabasesExample of a Relation419 Introduction to DatabasesBasic StructureQFormally, given sets D1, D2, .. Dn a relation r is a subset ofD1 x D2 x .. x DnThus a relation is a set of n-tuples (a1, a2, .., an) whereai DiQExample: ifcustomer-name = {Jones, Smith, Curry, Lindsay}customer-street = {Main, North, Park}customer-city = {Harrison, Rye, Pittsfield}Then r = { (Jones, Main, Harrison), (Smith, North, Rye), (Curry, North, Rye), (Lindsay, Park, Pittsfield)} is a relation over customer-name x customer-street x customer-city20 Introduction to DatabasesAttribute TypesQEach attribute of a relation has a nameQThe set of allowed values for each attribute is called the domainof the attributeQAttribute values are (normally) required to be atomic, that is,indivisible+ multivalued attribute values are not atomic+ composite attribute values are not atomic21 Introduction to DatabasesRelation SchemaQA1, A2.
7 , An are attributesQR = (A1, A2, .., An ) is a relation Customer-schema = (customer-name, customer-street, customer-city)Qr(R) is a relation on the relation schema (Customer-schema)22 Introduction to DatabasesRelation InstanceQThe current values (relation instance) of a relation arespecified by a tableQAn element t of r is a tuple, represented by a row in a tableJonesSmithCurryLindsaycustomer-name MainNorthNorthParkcustomer-streetHarriso nRyeRyePittsfieldcustomer-citycustomerat tributestuples23 Introduction to DatabasesRelations are UnorderedQ Order of tuples is irrelevant (tuples may be stored in an arbitrary order)Q account relation with unordered tuples24 Introduction to DatabasesDatabaseQA Database consists of multiple relationsQInformation about an enterprise is broken up into parts, with eachrelation storing one part of the : account : stores information about accounts depositor : stores information about which customer owns which account customer : stores information about customersQStoring all information as a single relation such as bank(account-number, balance, customer-name.)
8 Results in+repetition of information ( two customers own an account)+the need for null values ( represent a customer without anaccount)QNormalization theory deals with how to design relationalschemas525 Introduction to DatabasesThe customer Relation26 Introduction to DatabasesThe depositor Relation27 Introduction to DatabasesE-R Diagram for the Banking Enterprise28 Introduction to DatabasesKeysQLet K RQK is a superkey of R if values for K are sufficient to identify aunique tuple of each possible relation r(R) by possible r wemean a relation r that could exist in the enterprise we : {customer-name, customer-street} and {customer-name}are both superkeys of Customer, if no two customers canpossibly have the same is a candidate key if K is minimalExample: {customer-name} is a candidate key for Customer,since it is a superkey {assuming no two customers can possiblyhave the same name), and no subset of it is a to DatabasesDetermining Keys from E-R SetsQStrong entity set.}
9 The primary key of the entity set becomesthe primary key of the entity set. The primary key of the relation consists of theunion of the primary key of the strong entity set and thediscriminator of the weak entity set. The union of the primary keys of the relatedentity sets becomes a super key of the relation.+For binary many-to-one relationship sets, the primary key of the many entity set becomes the relation s primary key.+For one-to-one relationship sets, the relation s primary key can bethat of either entity set.+For many-to-many relationship sets, the union of the primary keysbecomes the relation s primary key30 Introduction to DatabasesSchema Diagram for the Banking Enterprise631 Introduction to DatabasesQuery LanguagesQLanguage in which user requests information from the of languages+procedural+non-proceduralQ Pure languages.
10 +Relational Algebra+Tuple Relational Calculus+Domain Relational CalculusQPure languages form underlying basis of query languages thatpeople to DatabasesRelational AlgebraQProcedural languageQSix basic operators+select+project+union+set difference+Cartesian product+renameQThe operators take two or more relations as inputs and give anew relation as a to DatabasesSelect Operation Example Relation rABCD 15122377310 A=B ^ D > 5 (r)ABCD 12371034 Introduction to DatabasesSelect OperationQNotation: p(r)Qp is called the selection predicateQDefined as: p(r) = {t | t r and p(t)}Where p is a formula in propositional calculus consistingof terms connected by : (and), (or), (not)Each term is one of:<attribute>op <attribute> or <constant> where op is one of: =, , >, . <. QExample of selection: branch-name= Perryridge (account)35 Introduction to DatabasesProject Operation ExampleQRelation r:ABC 102030401112AC 1112=AC 112Q A,C (r)36 Introduction to DatabasesProject OperationQNotation: A1, A2.