Please Enable the Desktop mode for better view experience

Compiler Design Unit 04 Complete Notes

Chapter 04: Symbol Tables & Errors – Digitaload Notes

Compiler Design

Chapter 04: Symbol Tables & Error Management – Storage, Scopes, and Error Recovery.

Understanding how compilers manage program identifiers, memory during runtime, and handle various types of errors.

Categories of all notes

πŸ“˜ Topic: Symbol Tables: Data Structure for Symbol Tables

What is a Symbol Table?

English: A symbol table is a data structure used by a compiler to keep track of all the identifiers (like variable names, function names) and their attributes (like type, scope, memory location).

Hinglish: Symbol table ek aisa data structure hai jisme compiler program ke sabhi names (variables, functions) aur unki details ko store karta hai.

Why Symbol Table is Needed?

  • πŸ’‘ To store information about identifiers during compilation.
  • πŸ’‘ To quickly find attributes of variables and functions.
  • πŸ’‘ To check for errors like duplicate declarations.
  • πŸ’‘ To help in type checking and memory allocation.
  • πŸ’‘ To manage scope and lifetime of variables.

Common Attributes Stored in Symbol Table

  • Name of identifier (variable or function).
  • Data type (int, float, char, etc.).
  • Scope (global, local, block).
  • Memory address or offset.
  • Size of the variable.
  • Other info like parameters for functions.

Data Structures Used for Symbol Tables

  • 1. Array
    • Simple, easy to implement.
    • Fast access by index.
    • Not efficient for large or dynamic symbol tables.
  • 2. Linked List
    • Dynamic size, easy to insert/delete.
    • Search is slower (linear time).
    • Good for small symbol tables.
  • 3. Hash Table
    • Most commonly used.
    • Provides fast average-time lookup, insertion, and deletion.
    • Uses hash function on identifier names.
    • Handles collisions by chaining or open addressing.
  • 4. Binary Search Tree (BST)
    • Allows sorted order traversal.
    • Average case search is efficient (log n).
    • Can degrade to linear if unbalanced.

Example of Symbol Table Entry

NameTypeScopeMemory LocationSize
xintlocal10004
sumfloatglobal20008
mainfuncglobal3000β€”

Important Points about Symbol Tables

  • πŸ’‘ Hash tables are preferred for fast access.
  • πŸ’‘ Symbol tables grow as new identifiers are found.
  • πŸ’‘ Different scopes may use different symbol tables or linked structures.
  • πŸ’‘ Errors like redeclaration are detected using symbol tables.
  • πŸ’‘ Symbol table helps during semantic analysis and code generation.

Summary in Hinglish

  • Symbol table ek important data structure hai jisme compiler variables aur functions ki information rakhta hai.
  • Hash table sabse jyada use hota hai kyunki wo fast searching deta hai.
  • Array aur linked list bhi use kar sakte hain chhote symbol tables ke liye.
  • Symbol table ki madad se compiler variables ke type, scope aur location ko manage karta hai.

πŸ“˜ Topic: Representing Scope Information

What is Scope in Programming?

English: Scope defines where a variable or identifier is visible and can be accessed in a program. It tells which part of the code can use the variable.

Hinglish: Scope matlab ek variable ya identifier program ke kis hisse mein accessible hai, ya use kiya ja sakta hai.

Why Representing Scope is Important?

  • πŸ’‘ To distinguish variables with the same name but in different parts of the program.
  • πŸ’‘ To manage lifetimes of variables.
  • πŸ’‘ To prevent errors by checking if a variable is used outside its scope.
  • πŸ’‘ To help compilers allocate memory correctly.
  • πŸ’‘ To support nested blocks and functions.

Types of Scope

  • Global Scope: Variable accessible throughout the program.
  • Local Scope: Variable accessible only within a function or block.
  • Block Scope: Variable accessible only within a particular block (like {} braces).
  • Nested Scope: When a scope exists inside another scope.

How to Represent Scope Information?

  • 1. Using Symbol Tables with Scope
    • Each scope can have its own symbol table or a chain of symbol tables.
    • When entering a new scope (like a function or block), create a new symbol table or scope record.
    • When leaving the scope, discard or hide that symbol table.
  • 2. Using Stack of Symbol Tables
    • Maintain a stack where each element is a symbol table for a scope.
    • When entering a new scope, push a new symbol table on the stack.
    • When exiting, pop it off.
    • Lookup starts from top of the stack (current scope) downwards.
  • 3. Using Display or Static/Dynamic Links
    • Display: Array or table that points to the symbol tables of active scopes, for quick access.
    • Static Link: Pointer in activation records to the symbol table of the static parent scope.
    • Dynamic Link: Pointer to caller’s activation record (for dynamic scoping).

Example of Scope Representation

Consider this C-like code:


int x = 10; // global scope

void func() {
   int x = 5; // local scope inside func
   {
      int x = 3; // block scope inside func
   }
}
                
  • Global x visible everywhere except where local x shadows it.
  • Local x inside func() shadows global x.
  • Block x shadows local x inside the block.
  • Compiler uses scope info to resolve which x is used at each point.

Important Points about Scope Representation

  • πŸ’‘ Scope information helps resolve variable names correctly.
  • πŸ’‘ Symbol tables and scope chains keep track of active variables.
  • πŸ’‘ Nested scopes mean symbol tables can be nested or linked.
  • πŸ’‘ Stack-based approach is most common for managing scope during parsing/compilation.

Summary in Hinglish

  • Scope batata hai ki variable program ke kis hissa mein accessible hai.
  • Compiler har scope ke liye alag symbol table ya stack use karta hai.
  • Jab naya scope start hota hai to naya symbol table banta hai, aur end hone par wo hata diya jata hai.
  • Isse compiler ko pata chalta hai ki kis variable ko use karna hai jab same naam ke variables ho.

πŸ“˜ Topic: Run-Time Administration: Implementation of Simple Stack Allocation Scheme

What is Run-Time Administration?

English: Run-Time Administration means managing memory and resources while a program is running, especially keeping track of variables, function calls, and their data.

Hinglish: Run-Time Administration matlab program chalate waqt memory aur resources ko manage karna, jaise variables aur functions ka data sambhalna.

What is Stack Allocation Scheme?

English: Stack allocation means using a stack data structure to manage memory for variables and function calls during program execution.

Hinglish: Stack allocation matlab ek stack (LIFO structure) ka use karke memory allocate karna jab program chal raha ho.

Why Use Stack Allocation?

  • πŸ’‘ Function calls need temporary memory (for parameters, local variables).
  • πŸ’‘ Easy to allocate and deallocate memory when function starts and ends.
  • πŸ’‘ Supports nested and recursive function calls.
  • πŸ’‘ Keeps track of return addresses.
  • πŸ’‘ Efficient memory usage.

How Simple Stack Allocation Works?

  • When a function is called, its Activation Record (AR) or Stack Frame is pushed onto the stack.
  • Activation record stores function’s parameters, local variables, return address, saved registers, etc.
  • When function ends, the activation record is popped off (removed) from the stack.
  • The stack pointer moves up and down to allocate or free memory.

Activation Record (Stack Frame) Structure

PartDescription
Return AddressAddress where control returns after function ends.
ParametersArguments passed to the function.
Local VariablesVariables declared inside the function.
Saved RegistersRegisters saved before function call.

Example: Stack Allocation Flow

Suppose a program calls function A(), which calls B():

  • Stack before calls: empty
  • Call A(): Push A’s activation record
  • Inside A(), call B(): Push B’s activation record on top
  • After B() ends, pop B’s activation record
  • After A() ends, pop A’s activation record

Stack grows downwards or upwards depending on system.

Important Points about Stack Allocation

  • πŸ’‘ Stack allocation is LIFO (Last In First Out).
  • πŸ’‘ Stack pointer keeps track of top of stack.
  • πŸ’‘ Allows efficient function calls and returns.
  • πŸ’‘ Supports recursion naturally.
  • πŸ’‘ Simple and widely used in modern compilers.

Summary in Hinglish

  • Run-time pe function ke liye memory stack mein allocate hoti hai jise activation record kehte hain.
  • Jab function call hota hai to uska data stack mein push hota hai, aur function ke khatam ΫΩˆΩ†Ϋ’ par pop ho jata hai.
  • Is tarah se stack pointer se memory manage karte hain.
  • Yeh scheme simple, fast aur recursive functions ke liye perfect hai.

πŸ“˜ Topic: Storage Allocation in Block-Structured Languages

What is Storage Allocation?

English: Storage allocation means assigning memory locations to variables, constants, and other data during program execution.

Hinglish: Storage allocation matlab program ke variables ya data ke liye memory ka jagah allocate karna.

What is a Block-Structured Language?

English: A block-structured language allows code to be divided into blocks, where each block can have its own variables and scope. Examples: C, Pascal, Ada.

Hinglish: Block-structured language wo language hoti hai jisme code ko blocks mein divide kiya jata hai, jahan har block ke apne variables aur scope hote hain.

How Storage Allocation Works in Block-Structured Languages?

  • Each block has its own scope and variables declared inside the block live only during the execution of that block.
  • When program enters a block, memory is allocated for its local variables.
  • When program leaves the block, memory is deallocated (freed).
  • This is usually managed using a stack allocation scheme (activation records).

Types of Storage Allocation

  • 1. Static Allocation
    • Memory for variables is allocated before program starts and remains fixed.
    • Used for global variables or static variables.
  • 2. Stack (Dynamic) Allocation
    • Memory for local variables in blocks/functions is allocated on stack when block is entered and freed on exit.
    • Supports recursion and nested blocks.
  • 3. Heap Allocation
    • Memory allocated dynamically during runtime using pointers (not directly related to block scope).

Example: Block-Structured Allocation


begin
   int x;       // Allocated when block starts
   begin
       int y;   // Allocated when inner block starts
       // use x, y
   end          // y is deallocated here
   // x still exists here
end            // x is deallocated here
                
  • x lives for the outer block duration.
  • y lives only inside the inner block.
  • Memory allocated and freed as blocks execute.

Important Points about Storage Allocation in Block-Structured Languages

  • πŸ’‘ Block-structured languages use stack allocation to manage storage of local variables.
  • πŸ’‘ Nested blocks mean nested activation records on stack.
  • πŸ’‘ Variables from outer blocks are accessible inside inner blocks (if language allows).
  • πŸ’‘ Helps in managing variable lifetime and memory efficiently.
  • πŸ’‘ Reduces memory wastage by freeing memory when blocks end.

Summary in Hinglish

  • Block-structured language mein har block ke variables ke liye memory allocate hoti hai jab block start hota hai, aur free hoti hai block khatam ΫΩˆΩ†Ϋ’ pe.
  • Stack allocation sabse common technique hai iske liye.
  • Is tarah se nested blocks ke liye nested memory management hota hai.
  • Global variables ke liye static allocation hoti hai jo pura program chalne tak rehti hai.

πŸ“˜ Topic: Error Detection & Recovery: Lexical, Syntactic, and Semantic Errors

What is Error Detection & Recovery?

English: Error detection is finding mistakes in the source code during compilation. Recovery means handling those errors so the compiler can continue working or give meaningful messages.

Hinglish: Error detection matlab program mein galtiyan dhundhna, aur recovery matlab un galtiyon ko sambhalna taaki compiler aage badh sake ya sahi message de.

Types of Errors in Compiler

  • 1. Lexical Errors
    • Mistakes detected during the lexical analysis phase.
    • Related to invalid tokens or characters.
    • Example: Using an illegal symbol or invalid identifier like @var or 123abc.
  • 2. Syntactic Errors
    • Detected during syntax analysis (parsing).
    • Related to wrong grammar or structure of the program.
    • Example: Missing semicolon, unmatched parentheses, incorrect statement order.
  • 3. Semantic Errors
    • Detected during semantic analysis.
    • Related to meaning or logic of the code, even if syntax is correct.
    • Example: Using a variable before declaration, type mismatches, incompatible operations.

Lexical Phase Errors

  • Occur when the input contains characters or tokens that the lexer can’t recognize.
  • Example: int x = 100$; β†’ $ is invalid character.
  • Recovery: Skip invalid characters or report error and continue scanning.

Syntactic Phase Errors

  • Occur if token sequence doesn’t follow grammar rules.
  • Example: if (x > 0 { missing closing parenthesis.
  • Recovery: Methods include panic mode (skip tokens until a safe point), phrase level recovery (insert/delete tokens).

Semantic Errors

  • Occur when code is syntactically correct but meaning is wrong.
  • Example: Trying to add integer and string without conversion.
  • Recovery: Report error with line number and try to continue checking other parts.

Example of Each Error Type

Error TypeCode ExampleExplanation
Lexical Errorint @x = 5;@ is invalid token
Syntactic Errorif (x > 0 {Missing closing )
Semantic Errorint x = "hello" + 5;Adding string and integer invalid

Important Points about Error Handling

  • πŸ’‘ Error detection happens in phases: lexical β†’ syntax β†’ semantic.
  • πŸ’‘ Early detection improves error recovery and user feedback.
  • πŸ’‘ Recovery techniques aim to continue compilation after an error to find more errors.
  • πŸ’‘ Good compilers give clear, precise error messages.
  • πŸ’‘ Sometimes errors in one phase lead to cascading errors in next phases.

Summary in Hinglish

  • Lexical errors tab hote hain jab invalid character ya token mile.
  • Syntactic errors tab hote hain jab grammar rule break ho jaye, jaise missing semicolon.
  • Semantic errors tab hote hain jab code ka matlab galat ho, jaise type mismatch.
  • Compiler alag alag technique se errors detect karta hai aur recover karne ki koshish karta hai.
Scroll to Top