## Documentation
- README - Project overview and quick start
- Getting Started - Installation and first steps
- Architecture - How the parser works
- Dependency Resolution - Automatic type extraction
- API Reference - Command-line options and features
- Known Issues - Limitations and workarounds
- Quickstart Guide - Quick reference
- Roadmap - Future plans and priorities
Technical Deep Dives
For implementation details and visual guides:
- Dependency Flow - Complete technical walkthrough
- Visual Flow Diagrams - Quick reference diagrams
- Multi-Field Structs - Struct parsing details
- Typedef Support - Typedef implementation
- Multi-Header Testing - Test results
Development
- Development Guide - Contributing and extending the parser
Archive
Historical planning documents are in archive/ for reference.
High-Level Architecture
Input (C Header) → Scanner → Declarations → Dependency Resolver → CodeGen → Output (Zig)
Core Components
1. Scanner (src/patterns.zig)
Purpose: Parse C header files into structured declarations
Process:
- Reads header file line by line
- Tries to match each line against known patterns
- Extracts type information, comments, and structure
- Returns an array of `Declaration` structures
Supported Patterns:
- Opaque types: `typedef struct SDL_X SDL_X;`
- Typedefs: `typedef Uint32 SDL_PropertiesID;`
- Enums: `typedef enum { ... } SDL_Type;`
- Structs: `typedef struct { int x, y; } SDL_Rect;`
- Flags: `typedef Uint32 SDL_Flags;` + `#define` values
- Functions: `extern SDL_DECLSPEC void SDLCALL SDL_Func(...);`
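As an illustration of the line-by-line matching described above, here is a minimal sketch of a matcher for the opaque-type pattern only. The name `matchOpaqueType` is hypothetical and is not taken from `src/patterns.zig`; the real scanner may tokenize differently.

```zig
const std = @import("std");

/// Hypothetical helper (not the project's actual scanner code).
/// Returns the type name when `line` has the form
/// `typedef struct NAME NAME;`, otherwise null.
fn matchOpaqueType(line: []const u8) ?[]const u8 {
    const trimmed = std.mem.trim(u8, line, " \t\r\n");
    const prefix = "typedef struct ";
    if (!std.mem.startsWith(u8, trimmed, prefix)) return null;
    if (!std.mem.endsWith(u8, trimmed, ";")) return null;

    // Everything between the prefix and the trailing semicolon.
    const body = trimmed[prefix.len .. trimmed.len - 1];
    var parts = std.mem.tokenizeScalar(u8, body, ' ');
    const tag = parts.next() orelse return null;
    const alias = parts.next() orelse return null;
    if (parts.next() != null) return null; // extra tokens: a different pattern
    if (!std.mem.eql(u8, tag, alias)) return null;
    return tag;
}

test "opaque typedef is recognized" {
    const name = matchOpaqueType("typedef struct SDL_X SDL_X;") orelse return error.NoMatch;
    try std.testing.expectEqualStrings("SDL_X", name);
}
```

A struct with a body, such as `typedef struct { int x, y; } SDL_Rect;`, falls through to null here, which is what lets the next pattern in the chain try it.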
2. Dependency Resolver (src/dependency_resolver.zig)
Purpose: Automatically find and extract missing type definitions
Process:
- Scans all declarations to find referenced types
- Compares referenced types against defined types
- Identifies missing types
- Parses `#include` directives from source
- Searches included headers for missing types
- Extracts and clones matching declarations
Key Features:
- Type string normalization (strips `*`, `const`, etc.)
- Deduplication using HashMaps
- Deep cloning for safe ownership
- Selective extraction (only types needed)
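The `#include` parsing step in the process above could look roughly like the following sketch. `parseIncludeLine` is a hypothetical name, and the sketch assumes one directive per line with no comments or macros in the path.

```zig
const std = @import("std");

/// Hypothetical helper: extracts the header path from an `#include` line.
/// Handles both `<...>` and `"..."` forms; returns null for anything else.
fn parseIncludeLine(line: []const u8) ?[]const u8 {
    const trimmed = std.mem.trim(u8, line, " \t\r\n");
    const directive = "#include";
    if (!std.mem.startsWith(u8, trimmed, directive)) return null;

    const rest = std.mem.trim(u8, trimmed[directive.len..], " \t");
    if (rest.len < 2) return null;

    // Pick the closing delimiter that matches the opening one.
    const close: u8 = switch (rest[0]) {
        '<' => '>',
        '"' => '"',
        else => return null,
    };
    const end = std.mem.indexOfScalarPos(u8, rest, 1, close) orelse return null;
    return rest[1..end];
}

test "angle-bracket include" {
    const path = parseIncludeLine("#include <SDL3/SDL_stdinc.h>") orelse return error.NoMatch;
    try std.testing.expectEqualStrings("SDL3/SDL_stdinc.h", path);
}
```

The returned slice points into the input line, so a caller that keeps it beyond the lifetime of the source buffer would need to duplicate it first.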
3. Code Generator (src/codegen.zig)
Purpose: Convert C declarations to idiomatic Zig code
Process:
- Groups functions by first parameter type (method categorization)
- Generates type declarations
- Generates function wrappers
- Applies naming conventions
- Performs type conversion
Features:
- Method organization for opaque types
- Inline function wrappers
- Automatic type conversion
- Doc comment preservation
4. Type Converter (src/types.zig)
Purpose: Convert C types to Zig equivalents
Conversions:
"bool" → "bool"
"Uint32" → "u32"
"int" → "c_int"
"SDL_Type *" → "?*Type"
"const SDL_Type *" → "*const Type"
5. Naming Convention Handler (src/naming.zig)
Purpose: Convert C names to idiomatic Zig
Rules:
- Strip `SDL_` prefix: `SDL_GPUDevice` → `GPUDevice`
- Remove first underscore: `SDL_GPU_TYPE` → `GPUType`
- CamelCase functions: `SDL_CreateDevice` → `createDevice`
- Lowercase first letter for values
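Two of the rules above (prefix stripping and lowercasing the first letter of function names) can be sketched as follows. Both function names are hypothetical and the signatures are assumptions; `src/naming.zig` may allocate instead of writing into a caller-supplied buffer. The underscore rule is left out of this sketch.

```zig
const std = @import("std");

/// Hypothetical helper: drops a leading `SDL_` if present.
fn stripPrefix(c_name: []const u8) []const u8 {
    const prefix = "SDL_";
    if (std.mem.startsWith(u8, c_name, prefix)) return c_name[prefix.len..];
    return c_name;
}

/// Hypothetical helper: `SDL_CreateDevice` becomes `createDevice`.
/// Writes into `buf` so that no allocator is needed.
fn functionNameToZig(buf: []u8, c_name: []const u8) error{NoSpaceLeft}![]const u8 {
    const bare = stripPrefix(c_name);
    if (bare.len > buf.len) return error.NoSpaceLeft;
    @memcpy(buf[0..bare.len], bare);
    if (bare.len > 0) buf[0] = std.ascii.toLower(buf[0]);
    return buf[0..bare.len];
}

test "function name conversion" {
    var buf: [64]u8 = undefined;
    try std.testing.expectEqualStrings("createDevice", try functionNameToZig(&buf, "SDL_CreateDevice"));
}
```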
Data Flow
1. Parsing Phase
C Header File
↓
Scanner.scan()
↓
[]Declaration {
.opaque_type,
.typedef_decl,
.enum_decl,
.struct_decl,
.flag_decl,
.function_decl,
}
2. Dependency Analysis Phase
[]Declaration
↓
DependencyResolver.analyze()
├─ collectDefinedTypes() → defined_types HashMap
└─ collectReferencedTypes() → referenced_types HashMap
↓
getMissingTypes()
↓
missing_types = referenced - defined
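The set difference at the end of this phase can be sketched with two string sets. `TypeSet` and `missingTypes` are hypothetical names for illustration; the resolver's actual `getMissingTypes()` may return an array rather than a set.

```zig
const std = @import("std");

const TypeSet = std.StringHashMap(void);

/// Hypothetical sketch of `missing = referenced - defined`.
/// Keys in the result borrow from `referenced`; the caller owns the map
/// itself and must call `deinit()` on it.
fn missingTypes(
    allocator: std.mem.Allocator,
    referenced: *const TypeSet,
    defined: *const TypeSet,
) !TypeSet {
    var missing = TypeSet.init(allocator);
    errdefer missing.deinit();

    var it = referenced.keyIterator();
    while (it.next()) |name| {
        if (!defined.contains(name.*)) try missing.put(name.*, {});
    }
    return missing;
}

test "referenced minus defined" {
    const a = std.testing.allocator;
    var referenced = TypeSet.init(a);
    defer referenced.deinit();
    var defined = TypeSet.init(a);
    defer defined.deinit();

    try referenced.put("SDL_Window", {});
    try referenced.put("SDL_Rect", {});
    try defined.put("SDL_Rect", {});

    var missing = try missingTypes(a, &referenced, &defined);
    defer missing.deinit();
    try std.testing.expect(missing.contains("SDL_Window"));
}
```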
3. Dependency Resolution Phase
For each missing_type:
Parse #include directives
↓
For each included header:
Read header file
↓
Scanner.scan()
↓
Search for matching type
↓
If found: cloneDeclaration()
4. Code Generation Phase
[]Declaration (primary + dependencies)
↓
CodeGen.generate()
├─ categorizeDeclarations() (group methods)
├─ writeHeader()
└─ writeDeclarations()
├─ writeOpaqueWithMethods()
├─ writeTypedef()
├─ writeEnum()
├─ writeStruct()
├─ writeFlags()
└─ writeFunction()
↓
Zig source code (string)
5. Validation Phase
Generated Zig code
↓
std.zig.Ast.parse()
↓
Check for syntax errors
↓
ast.renderAlloc() (format)
↓
Write to file or stdout
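The syntax check in this phase can be sketched as below. `isValidZig` is a hypothetical wrapper, and the exact `std.zig.Ast` API has shifted between Zig releases, so treat the call shapes as an assumption to verify against the toolchain in use. The formatting step is left out.

```zig
const std = @import("std");

/// Hypothetical wrapper: true when `source` parses without syntax errors.
/// The parser expects a null-terminated source slice.
fn isValidZig(gpa: std.mem.Allocator, source: [:0]const u8) !bool {
    var tree = try std.zig.Ast.parse(gpa, source, .zig);
    defer tree.deinit(gpa);
    return tree.errors.len == 0;
}

test "well-formed declaration parses" {
    try std.testing.expect(try isValidZig(std.testing.allocator, "pub const GPUDevice = opaque {};"));
}
```

Parse errors are collected in the tree rather than returned as a Zig error, which fits the "print errors, write anyway" behaviour described under Error Handling.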
Key Algorithms
Type Extraction
Purpose: Strip pointer/const decorators to get base type
"SDL_Window *" → "SDL_Window"
"?*const SDL_Rect" → "SDL_Rect"
"SDL_Buffer *const *" → "SDL_Buffer"
Algorithm:
- Trim whitespace
- Remove leading qualifiers (`const`, `*`, `?`)
- Remove trailing qualifiers (`*`, `*const`, `const`)
- Handle special patterns (`[*c]`)
- Return base type string
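A minimal sketch of these steps, assuming the type string is plain ASCII and contains no array or function-pointer syntax. `baseType` is a hypothetical name; the resolver's real function may differ in both name and edge-case handling.

```zig
const std = @import("std");

/// Hypothetical sketch: strips pointer/const/optional decorators from either
/// end of a C or Zig type string and returns the remaining base type.
/// The result is a sub-slice of the input; nothing is allocated.
fn baseType(type_str: []const u8) []const u8 {
    const leading = [_][]const u8{ "[*c]", "const ", "?", "*" };
    var s = std.mem.trim(u8, type_str, " \t");

    // Repeat until a full pass strips nothing, so nested decorators
    // such as `*const *` are peeled off one layer at a time.
    var changed = true;
    while (changed) {
        changed = false;
        for (leading) |q| {
            if (std.mem.startsWith(u8, s, q)) {
                s = std.mem.trim(u8, s[q.len..], " \t");
                changed = true;
            }
        }
        if (std.mem.endsWith(u8, s, "*")) {
            s = std.mem.trim(u8, s[0 .. s.len - 1], " \t");
            changed = true;
        } else if (std.mem.endsWith(u8, s, "*const") or std.mem.endsWith(u8, s, " const")) {
            // Only strip `const` when it is a separate qualifier,
            // not the tail of an identifier.
            s = std.mem.trim(u8, s[0 .. s.len - "const".len], " \t");
            changed = true;
        }
    }
    return s;
}

test "pointer to base type" {
    try std.testing.expectEqualStrings("SDL_Window", baseType("SDL_Window *"));
}
```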
Multi-Field Parsing
Purpose: Handle C's compact multi-declarator syntax, such as `int x, y;`
Algorithm:
- Detect comma in field declaration
- Extract common type (before first field name)
- Split remaining part on commas
- Create separate `FieldDecl` for each name
- Return array of fields
Example:
int x, y; → [FieldDecl{.name="x", .type="int"},
FieldDecl{.name="y", .type="int"}]
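The splitting step can be sketched as follows. This `FieldDecl` is a simplified stand-in with a `type_name` field (an assumption, since `type` cannot be used as a bare field name in Zig), and `splitFields` is a hypothetical helper. Per-declarator pointers such as `int *x, *y;` are not handled.

```zig
const std = @import("std");

/// Simplified stand-in for the project's field record (assumed layout).
const FieldDecl = struct {
    name: []const u8,
    type_name: []const u8,
};

/// Hypothetical sketch: expands `int x, y;` into one FieldDecl per name.
/// Results are written into `out`; the returned slice is the filled prefix.
fn splitFields(decl: []const u8, out: []FieldDecl) error{ InvalidField, NoSpaceLeft }![]FieldDecl {
    var body = std.mem.trim(u8, decl, " \t\r\n");
    if (std.mem.endsWith(u8, body, ";")) {
        body = std.mem.trim(u8, body[0 .. body.len - 1], " \t");
    }

    // The common type is everything before the first field name, i.e. up to
    // the last space in the text that precedes the first comma.
    const first_comma = std.mem.indexOfScalar(u8, body, ',') orelse body.len;
    const head = std.mem.trim(u8, body[0..first_comma], " \t");
    const split = std.mem.lastIndexOfScalar(u8, head, ' ') orelse return error.InvalidField;
    const type_name = std.mem.trim(u8, head[0..split], " \t");

    var count: usize = 0;
    var names = std.mem.splitScalar(u8, body[split + 1 ..], ',');
    while (names.next()) |raw| {
        const name = std.mem.trim(u8, raw, " \t");
        if (name.len == 0) return error.InvalidField;
        if (count == out.len) return error.NoSpaceLeft;
        out[count] = .{ .name = name, .type_name = type_name };
        count += 1;
    }
    return out[0..count];
}

test "two names share one type" {
    var buf: [4]FieldDecl = undefined;
    const fields = try splitFields("int x, y;", &buf);
    try std.testing.expectEqualStrings("y", fields[1].name);
    try std.testing.expectEqualStrings("int", fields[1].type_name);
}
```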
Method Categorization
Purpose: Determine if function should be a method
Algorithm:
- Check if function has parameters
- Get type of first parameter
- Check if type is an opaque type pointer
- If yes, add to opaque type's methods
- If no, write as standalone function
Example:
void SDL_Destroy(SDL_GPUDevice *d) → Method of GPUDevice
void SDL_Init(void) → Standalone function
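The decision above can be sketched as a single lookup. `methodOwner` is a hypothetical name, and the list of opaque type names is passed in as a plain slice for illustration; the code generator presumably works on parsed declarations rather than raw strings.

```zig
const std = @import("std");

/// Hypothetical sketch: returns the opaque type a function should attach to,
/// or null when it should be emitted as a standalone function.
/// `first_param_type` is null for functions with no parameters.
fn methodOwner(first_param_type: ?[]const u8, opaque_types: []const []const u8) ?[]const u8 {
    const raw = first_param_type orelse return null;
    const t = std.mem.trim(u8, raw, " \t");

    // Only a pointer to an opaque type qualifies.
    if (!std.mem.endsWith(u8, t, "*")) return null;
    var base = std.mem.trim(u8, t[0 .. t.len - 1], " \t");
    if (std.mem.startsWith(u8, base, "const ")) {
        base = std.mem.trim(u8, base["const ".len..], " \t");
    }

    for (opaque_types) |name| {
        if (std.mem.eql(u8, name, base)) return name;
    }
    return null;
}

test "pointer to a known opaque type becomes a method" {
    const known = [_][]const u8{"SDL_GPUDevice"};
    const owner = methodOwner("SDL_GPUDevice *", &known) orelse return error.NoOwner;
    try std.testing.expectEqualStrings("SDL_GPUDevice", owner);
}
```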
Memory Management
Ownership Rules
- Scanner owns strings during parsing (allocated from its allocator)
- Parser owns declarations after scanning (freed at end of main)
- Resolver owns HashMap keys (duped when inserted, freed in deinit)
- Cloned declarations own strings (allocated explicitly, freed by caller)
Allocation Strategy
GPA (General Purpose Allocator)
├─ Primary header source (freed at end)
├─ Primary declarations (freed with deep free)
├─ DependencyResolver
│ ├─ referenced_types HashMap (keys owned)
│ └─ defined_types HashMap (keys borrowed)
├─ Missing types array (freed explicitly)
├─ Includes array (freed explicitly)
├─ Dependency declarations (freed with deep free)
└─ Generated output (freed after writing)
Cleanup Pattern
defer {
    for (decls) |decl| {
        freeDeclDeep(allocator, decl);
    }
    allocator.free(decls);
}
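To make the deep-free idea concrete, here is a sketch using a reduced two-variant union. The `Decl` type and its fields are assumptions made for illustration; the project's `Declaration` union has more variants and its `freeDeclDeep` may differ.

```zig
const std = @import("std");

/// Reduced stand-in for the project's declaration union (assumed shape).
const Decl = union(enum) {
    opaque_type: struct { name: []const u8 },
    typedef_decl: struct { name: []const u8, underlying: []const u8 },
};

/// Frees every string a single declaration owns.
fn freeDeclDeep(allocator: std.mem.Allocator, decl: Decl) void {
    switch (decl) {
        .opaque_type => |d| allocator.free(d.name),
        .typedef_decl => |d| {
            allocator.free(d.name);
            allocator.free(d.underlying);
        },
    }
}

/// Frees each declaration's strings, then the array that held them.
fn freeDecls(allocator: std.mem.Allocator, decls: []Decl) void {
    for (decls) |decl| freeDeclDeep(allocator, decl);
    allocator.free(decls);
}

test "no leaks after deep free" {
    const a = std.testing.allocator; // reports leaks when the test ends
    const decls = try a.alloc(Decl, 1);
    decls[0] = .{ .opaque_type = .{ .name = try a.dupe(u8, "SDL_GPUDevice") } };
    freeDecls(a, decls);
}
```

Because the `switch` must be exhaustive, adding a new union variant produces a compile error here until its cleanup branch is written.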
Error Handling
Fatal Errors (Exit Immediately)
- File not found (primary header)
- Out of memory
- Cannot write output file
Non-Fatal Errors (Continue with Warnings)
- Dependency header not readable → Skip, try next
- Type not found in any header → Print warning, continue
- Struct parsing error → Generate partial, continue
- Syntax errors in output → Print errors, write anyway
Error Recovery
The parser uses graceful degradation:
- Try to extract as much as possible
- Warn about issues
- Continue processing
- Generate best-effort output
This allows partial success even with problematic headers.
Extension Points
Adding New Pattern Support
- Add new variant to `Declaration` union in `patterns.zig`
- Implement `scan*()` function to match pattern
- Add to pattern matching chain in `Scanner.scan()`
- Update all switch statements:
  - Cleanup code in `parser.zig`
  - `cloneDeclaration()` in `dependency_resolver.zig`
  - `freeDeclaration()` in `dependency_resolver.zig`
- Implement `write*()` in `codegen.zig`
Adding Type Conversions
Edit src/types.zig:
pub fn convertType(c_type: []const u8, allocator: Allocator) ![]const u8 {
    // Add new conversion here
    if (std.mem.eql(u8, c_type, "MyType")) {
        return try allocator.dupe(u8, "MyZigType");
    }
    // ...
}
Adding Naming Rules
Edit src/naming.zig:
pub fn typeNameToZig(c_name: []const u8) []const u8 {
    // Add custom naming logic
}
Performance Characteristics
Time Complexity
- Primary parsing: O(n) where n = source lines
- Dependency analysis: O(d) where d = declarations
- Type extraction: O(h × d) where h = headers, d = declarations per header
- Code generation: O(d) where d = total declarations
Overall: O(n + h×d) - Linear for typical use
Space Complexity
- Declarations: O(d) where d = declaration count
- HashMaps: O(t) where t = unique type names
- Output: O(d) where d = declaration count
Peak memory: ~2-5MB for SDL_gpu.h (169 declarations)
Optimization Points
Current optimizations:
- HashMap-based deduplication
- Early exit when type found
- Selective parsing (only missing types)
- String interning for type names
Potential improvements:
- Cache parsed headers (avoid re-parsing)
- Parallel header processing
- Lazy header loading
Testing Strategy
Unit Tests (test/)
- Pattern matching tests (each C pattern)
- Type conversion tests
- Naming convention tests
- Dependency resolution tests
- Multi-field parsing tests
Integration Tests
- Real SDL headers (SDL_gpu.h)
- Dependency chain resolution
- End-to-end parsing and generation
Validation
- AST parsing of generated code
- Memory leak detection (GPA)
- No regressions (all tests must pass)
Code Organization
src/
├── parser.zig # Main entry point, CLI handling
├── patterns.zig # Pattern matching and scanning
├── types.zig # C to Zig type conversion
├── naming.zig # Naming convention handling
├── codegen.zig # Zig code generation
├── mock_codegen.zig # C mock generation
└── dependency_resolver.zig # Dependency analysis and extraction
test/
└── (various test files)
docs/
├── GETTING_STARTED.md # Installation and first steps
├── ARCHITECTURE.md # Architecture overview (this file)
├── DEPENDENCY_RESOLUTION.md # Dependency system details
└── ...
Next Steps
- Read Dependency Resolution for details on automatic type extraction
- See API Reference for all command-line options
- Check Known Issues for current limitations
- Review the Development Guide to contribute
Related Documents:
- Technical deep dive: docs/DEPENDENCY_FLOW.md
- Visual diagrams: docs/VISUAL_FLOW.md