## Documentation

Technical Deep Dives

For implementation details and visual guides:

Development

Archive

Historical planning documents are in archive/ for reference.

High-Level Architecture

Input (C Header) → Scanner → Declarations → Dependency Resolver → CodeGen → Output (Zig)

Core Components

1. Scanner (src/patterns.zig)

Purpose: Parse C header files into structured declarations

Process:

  1. Reads header file line by line
  2. Tries to match each line against known patterns
  3. Extracts type information, comments, and structure
  4. Returns array of Declaration structures

Supported Patterns:

  • Opaque types: typedef struct SDL_X SDL_X;
  • Typedefs: typedef Uint32 SDL_PropertiesID;
  • Enums: typedef enum { ... } SDL_Type;
  • Structs: typedef struct { int x, y; } SDL_Rect;
  • Flags: typedef Uint32 SDL_Flags; + #define values
  • Functions: extern SDL_DECLSPEC void SDLCALL SDL_Func(...);
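
As an illustration of what matching a single pattern can look like, here is a minimal sketch for the opaque-type case. `matchOpaqueType` is a hypothetical helper written for this document, not the function in src/patterns.zig, and it assumes Zig 0.11 or newer and a declaration that fits on one line.

```zig
const std = @import("std");

/// Hypothetical helper: returns the type name if `line` has the form
/// `typedef struct NAME NAME;`, otherwise null.
fn matchOpaqueType(line: []const u8) ?[]const u8 {
    const trimmed = std.mem.trim(u8, line, " \t\r\n");
    const prefix = "typedef struct ";
    if (!std.mem.startsWith(u8, trimmed, prefix)) return null;
    if (!std.mem.endsWith(u8, trimmed, ";")) return null;

    // The body is what sits between the prefix and the trailing semicolon.
    const body = trimmed[prefix.len .. trimmed.len - 1];
    var parts = std.mem.tokenizeScalar(u8, body, ' ');
    const tag = parts.next() orelse return null;
    const alias = parts.next() orelse return null;
    if (parts.next() != null) return null; // extra tokens: a different pattern

    // An opaque typedef repeats the same name as struct tag and alias.
    if (!std.mem.eql(u8, tag, alias)) return null;
    return alias;
}

test "opaque typedef usage" {
    const name = matchOpaqueType("typedef struct SDL_Window SDL_Window;").?;
    try std.testing.expectEqualStrings("SDL_Window", name);
}
```

Returning null instead of an error lets the caller fall through to the next pattern in the chain.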

2. Dependency Resolver (src/dependency_resolver.zig)

Purpose: Automatically find and extract missing type definitions

Process:

  1. Scans all declarations to find referenced types
  2. Compares referenced types against defined types
  3. Identifies missing types
  4. Parses #include directives from source
  5. Searches included headers for missing types
  6. Extracts and clones matching declarations

Key Features:

  • Type string normalization (strips *, const, etc.)
  • Deduplication using HashMaps
  • Deep cloning for safe ownership
  • Selective extraction (only types needed)

3. Code Generator (src/codegen.zig)

Purpose: Convert C declarations to idiomatic Zig code

Process:

  1. Groups functions by first parameter type (method categorization)
  2. Generates type declarations
  3. Generates function wrappers
  4. Applies naming conventions
  5. Performs type conversion

Features:

  • Method organization for opaque types
  • Inline function wrappers
  • Automatic type conversion
  • Doc comment preservation

4. Type Converter (src/types.zig)

Purpose: Convert C types to Zig equivalents

Conversions:

"bool" → "bool"
"Uint32" → "u32"
"int" → "c_int"
"SDL_Type *" → "?*Type"
"const SDL_Type *" → "*const Type"

5. Naming Convention Handler (src/naming.zig)

Purpose: Convert C names to idiomatic Zig

Rules:

  • Strip SDL_ prefix: SDL_GPUDevice → GPUDevice
  • Remove first underscore: SDL_GPU_TYPE → GPUType
  • camelCase functions: SDL_CreateDevice → createDevice
  • Lowercase first letter for values
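
The prefix and first-letter rules can be sketched as follows. Both function names are hypothetical and the real src/naming.zig may differ; the sketch assumes Zig 0.11 or newer and only lowercases the first character, so it does not handle names that start with an acronym.

```zig
const std = @import("std");

/// Drops a leading "SDL_" if present; the result is a slice of the input.
fn stripSdlPrefix(c_name: []const u8) []const u8 {
    const prefix = "SDL_";
    return if (std.mem.startsWith(u8, c_name, prefix)) c_name[prefix.len..] else c_name;
}

/// Writes the Zig function name into `buf` and returns the used part.
/// Returns error.NoSpaceLeft if `buf` is too small.
fn functionNameToZig(c_name: []const u8, buf: []u8) ![]const u8 {
    const base = stripSdlPrefix(c_name);
    if (base.len > buf.len) return error.NoSpaceLeft;
    @memcpy(buf[0..base.len], base);
    // Lowercase only the first letter: CreateDevice becomes createDevice.
    if (base.len > 0) buf[0] = std.ascii.toLower(buf[0]);
    return buf[0..base.len];
}

test "naming usage" {
    var buf: [64]u8 = undefined;
    const name = try functionNameToZig("SDL_CreateDevice", &buf);
    try std.testing.expectEqualStrings("createDevice", name);
}
```

Writing into a caller-provided buffer keeps the sketch allocation-free; a real implementation would more likely take an allocator.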

Data Flow

1. Parsing Phase

C Header File
    ↓
Scanner.scan()
    ↓
[]Declaration {
    .opaque_type,
    .typedef_decl,
    .enum_decl,
    .struct_decl,
    .flag_decl,
    .function_decl,
}

2. Dependency Analysis Phase

[]Declaration
    ↓
DependencyResolver.analyze()
    ├─ collectDefinedTypes() → defined_types HashMap
    └─ collectReferencedTypes() → referenced_types HashMap
    ↓
getMissingTypes()
    ↓
missing_types = referenced - defined
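
The set difference in the last step can be sketched with two `std.StringHashMap(void)` sets. The signature below is an assumption made for illustration; the real `getMissingTypes()` in src/dependency_resolver.zig may be a method with a different shape.

```zig
const std = @import("std");

const TypeSet = std.StringHashMap(void);

/// Returns every key of `referenced` that is absent from `defined`.
/// The caller owns the returned slice, but not the strings it points to.
fn getMissingTypes(
    allocator: std.mem.Allocator,
    referenced: *const TypeSet,
    defined: *const TypeSet,
) ![]const []const u8 {
    // First pass: count, so the result can be allocated exactly once.
    var count: usize = 0;
    var it = referenced.keyIterator();
    while (it.next()) |key| {
        if (!defined.contains(key.*)) count += 1;
    }

    // Second pass: copy the missing names (borrowed from `referenced`).
    const out = try allocator.alloc([]const u8, count);
    var i: usize = 0;
    it = referenced.keyIterator();
    while (it.next()) |key| {
        if (!defined.contains(key.*)) {
            out[i] = key.*;
            i += 1;
        }
    }
    return out;
}

test "missing types usage" {
    const a = std.testing.allocator;
    var referenced = TypeSet.init(a);
    defer referenced.deinit();
    var defined = TypeSet.init(a);
    defer defined.deinit();
    try referenced.put("SDL_Window", {});
    try referenced.put("SDL_Rect", {});
    try defined.put("SDL_Window", {});

    const missing = try getMissingTypes(a, &referenced, &defined);
    defer a.free(missing);
    try std.testing.expectEqualStrings("SDL_Rect", missing[0]);
}
```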

3. Dependency Resolution Phase

For each missing_type:
    Parse #include directives
        ↓
    For each included header:
        Read header file
            ↓
        Scanner.scan()
            ↓
        Search for matching type
            ↓
        If found: cloneDeclaration()

4. Code Generation Phase

[]Declaration (primary + dependencies)
    ↓
CodeGen.generate()
    ├─ categorizeDeclarations() (group methods)
    ├─ writeHeader()
    └─ writeDeclarations()
        ├─ writeOpaqueWithMethods()
        ├─ writeTypedef()
        ├─ writeEnum()
        ├─ writeStruct()
        ├─ writeFlags()
        └─ writeFunction()
    ↓
Zig source code (string)

5. Validation Phase

Generated Zig code
    ↓
std.zig.Ast.parse()
    ↓
Check for syntax errors
    ↓
ast.renderAlloc() (format)
    ↓
Write to file or stdout
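
The syntax check can be sketched with the standard library parser. This assumes Zig 0.11 or newer, where `std.zig.Ast.parse` takes a mode argument; `isValidZig` is a hypothetical wrapper and the formatting step is left out.

```zig
const std = @import("std");

/// Returns true when `source` parses as Zig without syntax errors.
/// Ast.parse only fails on allocation failure; syntax problems are
/// recorded in `tree.errors` instead of being returned as an error.
fn isValidZig(gpa: std.mem.Allocator, source: [:0]const u8) !bool {
    var tree = try std.zig.Ast.parse(gpa, source, .zig);
    defer tree.deinit(gpa);
    return tree.errors.len == 0;
}

test "validation usage" {
    const ok = try isValidZig(std.testing.allocator, "pub const Window = opaque {};");
    try std.testing.expect(ok);
}
```

Because parse errors are collected rather than thrown, the caller can print them and still write the output, which matches the non-fatal handling of syntax errors described under Error Handling.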

Key Algorithms

Type Extraction

Purpose: Strip pointer/const decorators to get base type

"SDL_Window *" → "SDL_Window"
"?*const SDL_Rect" → "SDL_Rect"
"SDL_Buffer *const *" → "SDL_Buffer"

Algorithm:

  1. Trim whitespace
  2. Remove leading qualifiers (const, *, ?)
  3. Remove trailing qualifiers (*, *const, const)
  4. Handle special patterns ([*c])
  5. Return base type string
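
A minimal sketch of these steps, assuming the decorators are limited to the ones listed above. `extractBaseType` here is an illustrative stand-in, not the resolver's actual function; it returns a slice of its input and allocates nothing.

```zig
const std = @import("std");

/// Strips pointer, optional and const decorators from a C or Zig type string.
fn extractBaseType(type_str: []const u8) []const u8 {
    const leading = [_][]const u8{ "const ", "[*c]", "*", "?" };
    const trailing = [_][]const u8{ "*const", " const", "*" };

    var s = std.mem.trim(u8, type_str, " \t");
    var changed = true;
    // Repeat until a full pass removes nothing; every removal shortens `s`,
    // so the loop always terminates.
    while (changed) {
        changed = false;
        for (leading) |p| {
            if (std.mem.startsWith(u8, s, p)) {
                s = std.mem.trim(u8, s[p.len..], " \t");
                changed = true;
            }
        }
        for (trailing) |p| {
            if (std.mem.endsWith(u8, s, p)) {
                s = std.mem.trim(u8, s[0 .. s.len - p.len], " \t");
                changed = true;
            }
        }
    }
    return s;
}

test "type extraction usage" {
    try std.testing.expectEqualStrings("SDL_Window", extractBaseType("SDL_Window *"));
    try std.testing.expectEqualStrings("SDL_Rect", extractBaseType("?*const SDL_Rect"));
    try std.testing.expectEqualStrings("SDL_Buffer", extractBaseType("SDL_Buffer *const *"));
}
```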

Multi-Field Parsing

Purpose: Handle C compact syntax like int x, y;

Algorithm:

  1. Detect comma in field declaration
  2. Extract common type (before first field name)
  3. Split remaining part on commas
  4. Create separate FieldDecl for each name
  5. Return array of fields

Example:

int x, y;  →  [FieldDecl{.name="x", .type="int"},
               FieldDecl{.name="y", .type="int"}]
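
The splitting logic might be sketched like this. The `FieldDecl` below is a simplified stand-in for the real struct, `parseMultiField` is a hypothetical name, and per-name pointer declarators such as `float *a, *b;` are deliberately not handled.

```zig
const std = @import("std");

const FieldDecl = struct {
    name: []const u8,
    type: []const u8,
};

/// Splits a declaration such as "int x, y;" into one FieldDecl per name.
/// Fills `out` and returns the used part; all strings are slices of `line`.
fn parseMultiField(line: []const u8, out: []FieldDecl) ![]FieldDecl {
    const decl = std.mem.trim(u8, line, " \t;");

    // The shared type ends at the last space before the first field name.
    const first_end = std.mem.indexOfScalar(u8, decl, ',') orelse decl.len;
    const head = std.mem.trimRight(u8, decl[0..first_end], " \t");
    const split_at = std.mem.lastIndexOfScalar(u8, head, ' ') orelse return error.MissingType;
    const field_type = std.mem.trim(u8, decl[0..split_at], " \t");

    var n: usize = 0;
    var names = std.mem.splitScalar(u8, decl[split_at + 1 ..], ',');
    while (names.next()) |raw| {
        const name = std.mem.trim(u8, raw, " \t");
        if (name.len == 0) continue;
        if (n == out.len) return error.TooManyFields;
        out[n] = .{ .name = name, .type = field_type };
        n += 1;
    }
    return out[0..n];
}

test "multi-field usage" {
    var buf: [4]FieldDecl = undefined;
    const fields = try parseMultiField("int x, y;", &buf);
    try std.testing.expectEqual(@as(usize, 2), fields.len);
    try std.testing.expectEqualStrings("y", fields[1].name);
    try std.testing.expectEqualStrings("int", fields[1].type);
}
```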

Method Categorization

Purpose: Determine if function should be a method

Algorithm:

  1. Check if function has parameters
  2. Get type of first parameter
  3. Check if type is an opaque type pointer
  4. If yes, add to opaque type's methods
  5. If no, write as standalone function

Example:

void SDL_Destroy(SDL_GPUDevice *d)  →  Method of GPUDevice
void SDL_Init(void)                 →  Standalone function
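
The decision itself can be sketched as a small lookup. `methodOwner` and its parameters are hypothetical; in this sketch the known opaque type names are passed as a plain slice rather than whatever structure src/codegen.zig actually uses.

```zig
const std = @import("std");

/// Returns the opaque type a function should be attached to, or null
/// when it should stay a standalone function.
fn methodOwner(first_param_type: ?[]const u8, opaque_types: []const []const u8) ?[]const u8 {
    const param = first_param_type orelse return null; // no parameters
    const trimmed = std.mem.trim(u8, param, " \t");
    if (!std.mem.endsWith(u8, trimmed, "*")) return null; // not a pointer

    // Drop the pointer marker and an optional leading const.
    var base = std.mem.trim(u8, trimmed[0 .. trimmed.len - 1], " \t");
    if (std.mem.startsWith(u8, base, "const ")) {
        base = std.mem.trim(u8, base["const ".len..], " \t");
    }

    for (opaque_types) |name| {
        if (std.mem.eql(u8, name, base)) return name;
    }
    return null;
}

test "method categorization usage" {
    const opaque_types = [_][]const u8{"SDL_GPUDevice"};
    const owner = methodOwner("SDL_GPUDevice *", &opaque_types).?;
    try std.testing.expectEqualStrings("SDL_GPUDevice", owner);
    try std.testing.expect(methodOwner(null, &opaque_types) == null);
}
```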

Memory Management

Ownership Rules

  1. Scanner owns strings during parsing (allocated from its allocator)
  2. Parser owns declarations after scanning (freed at end of main)
  3. Resolver owns HashMap keys (duped when inserted, freed in deinit)
  4. Cloned declarations own strings (allocated explicitly, freed by caller)

Allocation Strategy

GPA (General Purpose Allocator)
  ├─ Primary header source (freed at end)
  ├─ Primary declarations (freed with deep free)
  ├─ DependencyResolver
  │   ├─ referenced_types HashMap (keys owned)
  │   └─ defined_types HashMap (keys borrowed)
  ├─ Missing types array (freed explicitly)
  ├─ Includes array (freed explicitly)
  ├─ Dependency declarations (freed with deep free)
  └─ Generated output (freed after writing)

Cleanup Pattern

defer {
    for (decls) |decl| {
        freeDeclDeep(allocator, decl);
    }
    allocator.free(decls);
}

Error Handling

Fatal Errors (Exit Immediately)

  • File not found (primary header)
  • Out of memory
  • Cannot write output file

Non-Fatal Errors (Continue with Warnings)

  • Dependency header not readable → Skip, try next
  • Type not found in any header → Print warning, continue
  • Struct parsing error → Generate partial, continue
  • Syntax errors in output → Print errors, write anyway

Error Recovery

The parser uses graceful degradation:

  1. Try to extract as much as possible
  2. Warn about issues
  3. Continue processing
  4. Generate best-effort output

This allows partial success even with problematic headers.

Extension Points

Adding New Pattern Support

  1. Add new variant to Declaration union in patterns.zig
  2. Implement scan*() function to match pattern
  3. Add to pattern matching chain in Scanner.scan()
  4. Update all switch statements:
    • Cleanup code in parser.zig
    • cloneDeclaration() in dependency_resolver.zig
    • freeDeclaration() in dependency_resolver.zig
  5. Implement write*() in codegen.zig

Adding Type Conversions

Edit src/types.zig:

pub fn convertType(c_type: []const u8, allocator: Allocator) ![]const u8 {
    // Add new conversion here
    if (std.mem.eql(u8, c_type, "MyType")) {
        return try allocator.dupe(u8, "MyZigType");
    }
    // ...
}

Adding Naming Rules

Edit src/naming.zig:

pub fn typeNameToZig(c_name: []const u8) []const u8 {
    // Add custom naming logic
}

Performance Characteristics

Time Complexity

  • Primary parsing: O(n) where n = source lines
  • Dependency analysis: O(d) where d = declarations
  • Type extraction: O(h × d) where h = headers, d = declarations per header
  • Code generation: O(d) where d = total declarations

Overall: O(n + h×d) - Linear for typical use

Space Complexity

  • Declarations: O(d) where d = declaration count
  • HashMaps: O(t) where t = unique type names
  • Output: O(d) where d = declaration count

Peak memory: ~2-5MB for SDL_gpu.h (169 declarations)

Optimization Points

Current optimizations:

  • HashMap-based deduplication
  • Early exit when type found
  • Selective parsing (only missing types)
  • String interning for type names

Potential improvements:

  • Cache parsed headers (avoid re-parsing)
  • Parallel header processing
  • Lazy header loading

Testing Strategy

Unit Tests (test/)

  • Pattern matching tests (each C pattern)
  • Type conversion tests
  • Naming convention tests
  • Dependency resolution tests
  • Multi-field parsing tests

Integration Tests

  • Real SDL headers (SDL_gpu.h)
  • Dependency chain resolution
  • End-to-end parsing and generation

Validation

  • AST parsing of generated code
  • Memory leak detection (GPA)
  • No regressions (all tests must pass)

Code Organization

src/
├── parser.zig              # Main entry point, CLI handling
├── patterns.zig            # Pattern matching and scanning
├── types.zig               # C to Zig type conversion
├── naming.zig              # Naming convention handling
├── codegen.zig             # Zig code generation
├── mock_codegen.zig        # C mock generation
└── dependency_resolver.zig # Dependency analysis and extraction

test/
└── (various test files)

docs/
├── GETTING_STARTED.md      # Getting started guide
├── ARCHITECTURE.md         # Architecture overview (this file)
├── DEPENDENCY_RESOLUTION.md # Dependency system details
└── ...

Next Steps


Related Documents: