Example: confidence

FORMAL LANGUAGES - TUT

FORMAL LANGUAGESK eijo Ruohonen2009 Contents1I WORDS AND and REGULAR Expressions and of Words. Finite s of Machines and Transducers (A Brief Overview)18 III s Hierarchy25IV of Algorithms (A Brief Overview) and Complements of Problems. Post s Correspondence Problem42V of CS-Languages51VI Complexity Classes (A Brief Overview)56 VII Sch utzenberger s Sardinas Patterson Sums. Prefix Codes and Huffman s Algorithmiii68 VIII LINDENMAYER S L-Systems or L-Systems with Interaction73IX FORMAL POWER as a FORMAL Power General FORMAL Power FORMAL Power Series. Sch utzenberger s Representation and Hadamard s of FORMAL Power Languages90 References92 IndexForewordThese lecture notes were translated from the Finnish lecture notes for the TUT course Formaalit kielet . The notes form the base text for the course MAT-41186 FormalLanguages . They contain an introduction to the basic concepts and constructs, as seenfrom the point of view of LANGUAGES and grammars.

In formal language theory defining languages and investigating languages via their definitions is paramount. Thus only a (minuscule) portion of all possible languages enters

Tags:

  Language, Formal, Formal language, In formal language

Information

Domain:

Source:

Link to this page:

Please notify us if you found a problem with this document:

Other abuse

Transcription of FORMAL LANGUAGES - TUT

1 FORMAL LANGUAGESK eijo Ruohonen2009 Contents1I WORDS AND and REGULAR Expressions and of Words. Finite s of Machines and Transducers (A Brief Overview)18 III s Hierarchy25IV of Algorithms (A Brief Overview) and Complements of Problems. Post s Correspondence Problem42V of CS-Languages51VI Complexity Classes (A Brief Overview)56 VII Sch utzenberger s Sardinas Patterson Sums. Prefix Codes and Huffman s Algorithmiii68 VIII LINDENMAYER S L-Systems or L-Systems with Interaction73IX FORMAL POWER as a FORMAL Power General FORMAL Power FORMAL Power Series. Sch utzenberger s Representation and Hadamard s of FORMAL Power Languages90 References92 IndexForewordThese lecture notes were translated from the Finnish lecture notes for the TUT course Formaalit kielet . The notes form the base text for the course MAT-41186 FormalLanguages . They contain an introduction to the basic concepts and constructs, as seenfrom the point of view of LANGUAGES and grammars.

2 In a sister course MAT-41176 Theoryof Automata much similar material is dealt with from the point of view of automata,computational complexity and LANGUAGES have their origin in the symbolical notation formalisms of mathe-matics, and especially in combinatorics and symbolic logic. These were later joined byvarious codes needed in data encryption, transmission, and error-correction all thesehave significantly influenced also the theoretical side of things andin particular variousmathematical models of automation and was however only after Noam Chomsky s ground-breaking ideas inthe investigationof natural LANGUAGES , and the algebro-combinatorial approach of Marcel-Paul Sch utzen-berger s in the 1950 s that FORMAL language theory really got a pushforward. The stronginfluence of programming LANGUAGES should be noted, too. During the heydays of formallanguages, in the 1960 s and 1970 s, much of the foundation was created for the theoryas it is it could be said that the basis of FORMAL language theory hassettled into a fairly standard form, which is seen when old and more recent text-books inthe area are compared.

3 The theory is by no means stagnant, however, and research in thefield continues to be quite lively and these lecture notes the classical Chomskian FORMAL language theory is fairly fullydealt with, omitting however much of automata constructs and computability issues. In1 Among the top investigators in the area especially the Finnish academician Arto Salomaa might , surveys of Lindenmayer system theory and the mathematical theory of codes aregiven. As a somewhat uncommon topic, an overview of FORMAL powerseries is from being a nice algebraic alternative formalism, they give a mechanism for gen-eralizing the concept of language in numerous ways, by changing theunderlying conceptof set but not the concept of Ruohonen2 There are various ways of generalizing LANGUAGES by changing the concept of word, say, to a graph,or a picture, or a multidimensional word, or an infinite word, but these are not dealt with 1 WORDS AND LANGUAGES The words. Why did they have to exist?

4 Without them, there wouldn t be any of this. (Markus Zusak:The Book Thief) Words and AlphabetsAword(orstring) is a finite sequence of items, so-calledsymbolsorletterschosen from aspecified finite set called of common alphabets are letters inthe Finnish alphabet (+ interword space, punctuation marks, etc.), and the bits 0 and word of length one is identified with its only symbol. A special word is theempty word(ornull word) having no symbols, denoted by (or or or 1).Thelengthof the wordwis the number of symbols in it, denoted by|w|. The lengthof the empty word is 0. If there areksymbols in the alphabet, then there areknwordsof lengthn. Thus there arenXi=0ki=kn+1 1k 1words of length at mostn, ifk >1, andn+ 1 words, ifk= 1. The set of all words isdenumerably infinite, that is, they can be given as an infinite list, say,by ordering thewords first according to their basic operation of words isconcatenation, that is, writing words as a concatenation of the wordsw1andw2is denoted simply byw1w2.

5 Examples ofconcatenations in the alphabet{a, b, c}:w1=aacbba , w2=caac , w1w2=aacbbacaacw1=aacbba , w2= , w1w2=w1=aacbbaw1= , w2=caac , w1w2=w2=caacConcatenation is associative, ,w1(w2w3) = (w1w2) a consequence of this, repeated concatenations can be written without the other hand, concatenation is usually not commutative, As a rule thenw1w26=w2w1,but not always, and in the case of a unary alphabet concatenation isobviously (concatenation)powerof the wordwiswn=ww w|{z} 1. WORDS AND LANGUAGES2 Especiallyw1=wandw0= , and always n= .Themirror image(orreversal) of the wordw=a1a2 anis the word w=an a2a1,especially = . Clearly we havedw1w2= w2 w1. A worduis aprefix( ) of thewordw, ifw=uv( ) for some wordv. A worduis asubword(orsegment)of the wordw, ifw=v1uv2for some wordsv1andv2. A worduis ascattered subwordofthe wordw, ifw=w1u1w2u2 wnunwn+1whereu=u1u2 un, for somenand some wordsw1, w2, .. , wn+1andu1, u2.

6 , LanguagesAlanguageis a set words over some alphabet. Special examples of LANGUAGES arefinitelanguageshaving only a finite number of words,cofinite languagesmissing only a finitenumber of words, and theempty language having no words. Often a singleton language {w}is identified with its only wordw, and the language is denoted simply customary set-theoretic notation is used for LANGUAGES : (inclusion), (properinclusion), (union), (intersection), (difference) and(complement against the setof all words over the alphabet). Belonging of a wordwin the languageLis denoted byw L, as usual. Note also the negated relations6 ,6 and/ .The language of all words over the alphabet , in particular , is denoted by . Thelanguage of all nonempty words over the alphabet is denoted by +. ThusL= Land += { }.Theorem is a nondenumerably infinite amount of LANGUAGES over any alphabet,thus the LANGUAGES cannot be given in an infinite us assume the contrary: All LANGUAGES (over some alphabet )appear in thelistL1, L2.

7 We then define the languageLas follows: Letw1, w2, ..be a list containgall words over the alphabet . The wordwiis in the languageLif and only if it is notin the languageLi. Clearly the languageLis not then any of the LANGUAGES in the listL1, L2, ..The counter hypothesis is thus false, and the theorem holds above method of proof is an instance of the so-calleddiagonal canbe only a denumerably infinite amount of ways of defining LANGUAGES , since all such def-initions must be expressible in some natural language , and thus listable in lexicographicorder. In FORMAL language theory defining LANGUAGES and investigating LANGUAGES via theirdefinitions is paramount. Thus only a (minuscule) portion of all possible LANGUAGES entersthe investigation!There are many other operations of LANGUAGES in addition to the set-theoretic onesabove. Theconcatenationof the languagesL1andL2isL1L2={w1w2|w1 L1andw2 L2}.Thenth(concatenation)powerof the languageLisLn={w1w2 wn|w1, w2.}

8 , wn L},CHAPTER 1. WORDS AND LANGUAGES3and especiallyL1=LandL0={ }. In particular 0={ }! Theconcatenation closureorKleenean starof the languageLisL = [n=0Ln, , the set obtained by concatenating words ofLin all possible ways, including the empty concatenation giving . SimilarlyL+= [n=1Ln,which contains the empty word only if it is already inL. (Cf. and +above.) Thus ={ }, but += . Note that in factL+=L L=LL .Theleft and right quotientsof the languagesL1andL2are defined asL1\L2={w2|w1w2 L2for some wordw1 L1}(remove from words ofL2prefixes belonging inL1in all possible ways) andL1/L2={w1|w1w2 L1for some wordw2 L2}(remove from words ofL1suffixes belonging inL2in all possible ways). Note that theprefix or the suffix can be empty. Themirror image(orreversal) of the languageLis thelanguage L={ w|w L}.There are two fundamental machineries of defining LANGUAGES :grammars,which gen-erate words of the language , andautomata,which recognize words of the language .]]

9 Thereare many other ways of defining LANGUAGES , defining regular LANGUAGES using 2 REGULAR LANGUAGES Some people, when confronted witha problem, think I know, I ll useregular expressions. Now theyhave two problems. (Jamie Zawinski) Regular Expressions and LanguagesAregular expressionis a formula which defines a language using set-theoretical union,denoted here by +, concatenation and concatenation closure. These operations are com-bined according to a set of rules, adding parentheses ( and ) when necessary. The atomsof the formula are symbols of the alphabet, the empty language , and the empty word , the braces{and}indicating sets are defined by regular expressions are the so-calledregular usdenote the family of regular LANGUAGES over the alphabet byR , or simply byRif thealphabet is clear by the the family of LANGUAGES satisfying the following language is inRand the corresponding regular expression is . language { }is inRand the corresponding regular expression is.

10 Each symbola, the language {a}is inRand the corresponding regular LANGUAGES inR, andr1andr2are the corresponding regular ex-pressions, then(a)the languageL1 L2is inRand the corresponding regular expression is(r1+r2).(b)the languageL1L2is inRand the corresponding regular expression is(r1r2). a language inRandris the corresponding regular expression, thenL is inRand the corresponding regular expression is(r ). LANGUAGES obtainable by using the above rules1. order to avoid overly long expressions, certain customary abbreviations are used, (rr) =denote(r2),(r(rr)) =denote(r3) and(r(r )) =denote(r+).On the other hand, the rules produce fully parenthesized regular expressions. If the orderof precedence ,concatenation,+4 CHAPTER 2. REGULAR LANGUAGES5is agreed on, then a lot of parentheses can be omitted, and for examplea+b ccan be usedinstead of the full expression (a+((b )c)). It is also often customary to identify a regularexpression with the language it defines, means that the correspondingregular LANGUAGES are the same, even though the expressions themselves can be quitedifferent.


Related search queries