## Documentation

- **[README](../README.md)** - Project overview and quick start
- **[Getting Started](GETTING_STARTED.md)** - Installation and first steps
- **[Architecture](ARCHITECTURE.md)** - How the parser works
- **[Dependency Resolution](DEPENDENCY_RESOLUTION.md)** - Automatic type extraction
- **[API Reference](API_REFERENCE.md)** - Command-line options and features
- **[Known Issues](KNOWN_ISSUES.md)** - Limitations and workarounds
- **[Quickstart Guide](QUICKSTART.md)** - Quick reference
- **[Roadmap](ROADMAP.md)** - Future plans and priorities

## Technical Deep Dives

For implementation details and visual guides:

- **[Dependency Flow](DEPENDENCY_FLOW.md)** - Complete technical walkthrough
- **[Visual Flow Diagrams](VISUAL_FLOW.md)** - Quick reference diagrams
- **[Multi-Field Structs](MULTI_FIELD_IMPLEMENTATION.md)** - Struct parsing details
- **[Typedef Support](TYPEDEF_IMPLEMENTATION.md)** - Typedef implementation
- **[Multi-Header Testing](MULTI_HEADER_TEST_RESULTS.md)** - Test results

## Development

- **[Development Guide](DEVELOPMENT.md)** - Contributing and extending the parser

## Archive

Historical planning documents are in `archive/` for reference.

## High-Level Architecture

```
Input (C Header) → Scanner → Declarations → Dependency Resolver → CodeGen → Output (Zig)
```

## Core Components

### 1. Scanner (`src/patterns.zig`)

**Purpose**: Parse C header files into structured declarations

**Process**:

1. Reads the header file line by line
2. Tries to match each line against known patterns
3. Extracts type information, comments, and structure
4. Returns an array of `Declaration` structures

**Supported Patterns**:

- Opaque types: `typedef struct SDL_X SDL_X;`
- Typedefs: `typedef Uint32 SDL_PropertiesID;`
- Enums: `typedef enum { ... } SDL_Type;`
- Structs: `typedef struct { int x, y; } SDL_Rect;`
- Flags: `typedef Uint32 SDL_Flags;` + `#define` values
- Functions: `extern SDL_DECLSPEC void SDLCALL SDL_Func(...);`

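
The "try each pattern in turn" step can be sketched as a small classifier. This is a minimal illustration only: `Kind` and `classifyLine` are hypothetical names, the sketch looks at a single line, and the real scanner in `src/patterns.zig` also has to handle multi-line bodies, comments, and flag `#define` values.

```zig
const std = @import("std");

/// Hypothetical declaration kinds (the real `Declaration` union carries payloads).
const Kind = enum { opaque_type, typedef_decl, enum_decl, struct_decl, function_decl, unknown };

/// Classify one line by testing patterns in order.
/// The specific `typedef enum` / `typedef struct` forms are tested before
/// the generic `typedef` so that they are not swallowed by it.
fn classifyLine(raw: []const u8) Kind {
    const line = std.mem.trim(u8, raw, " \t\r\n");
    if (std.mem.startsWith(u8, line, "extern ")) return .function_decl;
    if (std.mem.startsWith(u8, line, "typedef enum")) return .enum_decl;
    if (std.mem.startsWith(u8, line, "typedef struct")) {
        // A `{` on the line means a struct body; otherwise treat it as opaque.
        return if (std.mem.indexOfScalar(u8, line, '{') != null) .struct_decl else .opaque_type;
    }
    if (std.mem.startsWith(u8, line, "typedef ")) return .typedef_decl;
    return .unknown;
}
```

Lines that match no pattern fall through to `.unknown`, which mirrors the idea that unrecognized input is skipped rather than treated as an error.
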
### 2. Dependency Resolver (`src/dependency_resolver.zig`)

**Purpose**: Automatically find and extract missing type definitions

**Process**:

1. Scans all declarations to find referenced types
2. Compares referenced types against defined types
3. Identifies missing types
4. Parses `#include` directives from the source
5. Searches included headers for missing types
6. Extracts and clones matching declarations

**Key Features**:

- Type string normalization (strips `*`, `const`, etc.)
- Deduplication using HashMaps
- Deep cloning for safe ownership
- Selective extraction (only the types that are needed)

### 3. Code Generator (`src/codegen.zig`)

**Purpose**: Convert C declarations to idiomatic Zig code

**Process**:

1. Groups functions by first parameter type (method categorization)
2. Generates type declarations
3. Generates function wrappers
4. Applies naming conventions
5. Performs type conversion

**Features**:

- Method organization for opaque types
- Inline function wrappers
- Automatic type conversion
- Doc comment preservation

### 4. Type Converter (`src/types.zig`)

**Purpose**: Convert C types to Zig equivalents

**Conversions**:

```zig
"bool" → "bool"
"Uint32" → "u32"
"int" → "c_int"
"SDL_Type *" → "?*Type"
"const SDL_Type *" → "*const Type"
```

### 5. Naming Convention Handler (`src/naming.zig`)

**Purpose**: Convert C names to idiomatic Zig

**Rules**:

- Strip `SDL_` prefix: `SDL_GPUDevice` → `GPUDevice`
- Remove first underscore: `SDL_GPU_TYPE` → `GPUType`
- camelCase functions: `SDL_CreateDevice` → `createDevice`
- Lowercase first letter for values

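
Two of these rules (prefix stripping and camelCase function names) can be sketched as follows. The names `stripPrefix` and `functionNameToZig` and the caller-provided buffer are assumptions made for this illustration; the functions in `src/naming.zig` may have different names and signatures.

```zig
const std = @import("std");

/// Strip the `SDL_` prefix if present. Returns a slice of the input, so nothing is allocated.
fn stripPrefix(c_name: []const u8) []const u8 {
    const prefix = "SDL_";
    return if (std.mem.startsWith(u8, c_name, prefix)) c_name[prefix.len..] else c_name;
}

/// Write the camelCase form of a C function name into `buf` and return the used part.
fn functionNameToZig(buf: []u8, c_name: []const u8) error{NoSpaceLeft}![]const u8 {
    const base = stripPrefix(c_name);
    if (base.len > buf.len) return error.NoSpaceLeft;
    @memcpy(buf[0..base.len], base);
    // Only the first letter is lowered; the rest of the name is kept as-is.
    if (base.len > 0) buf[0] = std.ascii.toLower(buf[0]);
    return buf[0..base.len];
}
```

Writing into a caller-provided buffer keeps the sketch allocation-free; an allocator-based variant would work equally well.
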
## Data Flow

### 1. Parsing Phase

```
C Header File
      ↓
Scanner.scan()
      ↓
[]Declaration {
    .opaque_type,
    .typedef_decl,
    .enum_decl,
    .struct_decl,
    .flag_decl,
    .function_decl,
}
```

### 2. Dependency Analysis Phase

```
[]Declaration
      ↓
DependencyResolver.analyze()
  ├─ collectDefinedTypes() → defined_types HashMap
  └─ collectReferencedTypes() → referenced_types HashMap
      ↓
getMissingTypes()
      ↓
missing_types = referenced - defined
```

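
The set difference `referenced - defined` can be sketched with `std.StringHashMap(void)` used as a string set. `collectMissing` is a hypothetical helper written for this illustration, not the resolver's actual `getMissingTypes()`; keys in the result are borrowed from `referenced`.

```zig
const std = @import("std");

const StringSet = std.StringHashMap(void);

/// Insert into `missing` every name that is referenced but not defined.
/// Keys are borrowed from `referenced`, so `missing` must not outlive it.
fn collectMissing(
    referenced: *const StringSet,
    defined: *const StringSet,
    missing: *StringSet,
) !void {
    var it = referenced.keyIterator();
    while (it.next()) |key| {
        if (!defined.contains(key.*)) try missing.put(key.*, {});
    }
}
```

Using hash sets makes each membership check constant time on average, which is what keeps this phase linear in the number of referenced types.
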
### 3. Dependency Resolution Phase

```
For each missing_type:
    Parse #include directives
          ↓
    For each included header:
        Read header file
              ↓
        Scanner.scan()
              ↓
        Search for matching type
              ↓
        If found: cloneDeclaration()
```

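
The "Parse #include directives" step can be sketched as a single-line parser. `parseInclude` is a hypothetical helper for illustration; it assumes one directive per line with no trailing comment handling, and the resolver's own parsing may differ.

```zig
const std = @import("std");

/// Return the header name from `#include <name>` or `#include "name"`,
/// or null when the line is not a well-formed include directive.
/// The result is a slice of the input line.
fn parseInclude(raw: []const u8) ?[]const u8 {
    const line = std.mem.trim(u8, raw, " \t\r\n");
    if (!std.mem.startsWith(u8, line, "#include")) return null;
    const rest = std.mem.trim(u8, line["#include".len..], " \t");
    if (rest.len < 2) return null;
    // Pick the closing delimiter that matches the opening one.
    const close: u8 = switch (rest[0]) {
        '<' => '>',
        '"' => '"',
        else => return null,
    };
    const end = std.mem.indexOfScalarPos(u8, rest, 1, close) orelse return null;
    return rest[1..end];
}
```
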
### 4. Code Generation Phase

```
[]Declaration (primary + dependencies)
      ↓
CodeGen.generate()
  ├─ categorizeDeclarations() (group methods)
  ├─ writeHeader()
  └─ writeDeclarations()
       ├─ writeOpaqueWithMethods()
       ├─ writeTypedef()
       ├─ writeEnum()
       ├─ writeStruct()
       ├─ writeFlags()
       └─ writeFunction()
      ↓
Zig source code (string)
```

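
As a rough illustration of what one `write*()` step might emit, the sketch below formats a typedef into a caller-provided buffer. The free function `writeTypedef` and its signature are assumptions made for this example; the method of the same name in `src/codegen.zig` may take different arguments and write to a different sink.

```zig
const std = @import("std");

/// Format `pub const <name> = <type>;` into `buf` and return the written slice.
/// Both names are expected to be already converted to their Zig forms.
fn writeTypedef(buf: []u8, zig_name: []const u8, zig_type: []const u8) ![]const u8 {
    return std.fmt.bufPrint(buf, "pub const {s} = {s};\n", .{ zig_name, zig_type });
}
```

For the typedef `typedef Uint32 SDL_PropertiesID;`, the converted inputs would be `PropertiesID` and `u32`.
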
### 5. Validation Phase

```
Generated Zig code
      ↓
std.zig.Ast.parse()
      ↓
Check for syntax errors
      ↓
ast.renderAlloc() (format)
      ↓
Write to file or stdout
```

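
The syntax check in this phase can be sketched as below. It assumes a Zig version in which `std.zig.Ast.parse` takes an allocator, a null-terminated source, and a mode; `isValidZig` is a hypothetical helper name, and the formatting step is left out.

```zig
const std = @import("std");

/// Return true when `source` parses as Zig without syntax errors.
fn isValidZig(gpa: std.mem.Allocator, source: [:0]const u8) !bool {
    var ast = try std.zig.Ast.parse(gpa, source, .zig);
    defer ast.deinit(gpa);
    // The parser records syntax problems in `errors` instead of failing.
    return ast.errors.len == 0;
}
```

Because the parser collects errors rather than returning them, the caller can print every problem and still decide to write the output, which matches the non-fatal handling described under Error Handling.
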
## Key Algorithms

### Type Extraction

**Purpose**: Strip pointer/const decorators to get the base type

```zig
"SDL_Window *" → "SDL_Window"
"?*const SDL_Rect" → "SDL_Rect"
"SDL_Buffer *const *" → "SDL_Buffer"
```

**Algorithm**:

1. Trim whitespace
2. Remove leading qualifiers (`const`, `*`, `?`)
3. Remove trailing qualifiers (`*`, `*const`, ` const`)
4. Handle special patterns (`[*c]`)
5. Return base type string

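
One way to realize these steps is to peel decorators off both ends until nothing changes. This is a sketch under the assumption that the decorator lists below are complete for the inputs of interest; `extractBaseType` is an illustrative name and the real implementation may order or handle the cases differently.

```zig
const std = @import("std");

const leading = [_][]const u8{ "[*c]", "const ", "?", "*" };
const trailing = [_][]const u8{ "*const", " const", "*" };

/// Strip pointer/const decoration from a C or Zig type string.
/// Returns a slice of the input, so nothing is allocated.
fn extractBaseType(type_str: []const u8) []const u8 {
    var s = std.mem.trim(u8, type_str, " \t");
    var changed = true;
    // Repeat until a full pass removes nothing, so nested decoration is handled.
    while (changed) {
        changed = false;
        for (leading) |prefix| {
            if (std.mem.startsWith(u8, s, prefix)) {
                s = std.mem.trim(u8, s[prefix.len..], " \t");
                changed = true;
            }
        }
        for (trailing) |suffix| {
            if (std.mem.endsWith(u8, s, suffix)) {
                s = std.mem.trim(u8, s[0 .. s.len - suffix.len], " \t");
                changed = true;
            }
        }
    }
    return s;
}
```
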
### Multi-Field Parsing

**Purpose**: Handle C compact syntax like `int x, y;`

**Algorithm**:

1. Detect comma in field declaration
2. Extract common type (before first field name)
3. Split remaining part on commas
4. Create separate `FieldDecl` for each name
5. Return array of fields

**Example**:

```c
int x, y;  →  [FieldDecl{.name="x", .type="int"},
               FieldDecl{.name="y", .type="int"}]
```

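
The splitting steps can be sketched as follows. This `FieldDecl` is a simplified stand-in (the field is called `type_name` here because `type` is a Zig keyword), `parseFields` is a hypothetical name, and pointer declarators, arrays, and bitfields are deliberately not handled.

```zig
const std = @import("std");

/// Simplified stand-in for the real field declaration type.
const FieldDecl = struct { name: []const u8, type_name: []const u8 };

/// Split a declaration such as `int x, y;` into one FieldDecl per name.
/// Results are written into `out`; the filled prefix is returned.
fn parseFields(decl: []const u8, out: []FieldDecl) error{ InvalidField, NoSpaceLeft }![]FieldDecl {
    const body = std.mem.trim(u8, decl, " \t;");
    // Everything before the first name is the shared type.
    const first_comma = std.mem.indexOfScalar(u8, body, ',') orelse body.len;
    const head = std.mem.trim(u8, body[0..first_comma], " \t");
    const type_end = std.mem.lastIndexOfAny(u8, head, " \t") orelse return error.InvalidField;
    const type_name = std.mem.trim(u8, body[0..type_end], " \t");

    var count: usize = 0;
    var names = std.mem.splitScalar(u8, body[type_end + 1 ..], ',');
    while (names.next()) |raw| {
        const name = std.mem.trim(u8, raw, " \t");
        if (name.len == 0) return error.InvalidField;
        if (count == out.len) return error.NoSpaceLeft;
        out[count] = .{ .name = name, .type_name = type_name };
        count += 1;
    }
    return out[0..count];
}
```

A single-name declaration such as `float w;` goes through the same path and yields one field, so the comma check in step 1 does not need a separate branch in this sketch.
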
### Method Categorization

**Purpose**: Determine whether a function should be a method

**Algorithm**:

1. Check if function has parameters
2. Get type of first parameter
3. Check if type is an opaque type pointer
4. If yes, add to opaque type's methods
5. If no, write as standalone function

**Example**:

```c
void SDL_Destroy(SDL_Device *d)  → Method of Device
void SDL_Init(void)              → Standalone function
```

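
The decision above can be sketched as a predicate over the first parameter's C type. `isMethodOf` and the plain slice of opaque type names are assumptions for this illustration; the generator itself presumably works on parsed declarations rather than raw strings.

```zig
const std = @import("std");

/// Return true when the first parameter is a pointer to a known opaque type.
/// Pass null for functions that take no parameters.
fn isMethodOf(opaque_types: []const []const u8, first_param_type: ?[]const u8) bool {
    const param = first_param_type orelse return false; // no parameters: standalone
    const trimmed = std.mem.trim(u8, param, " \t");
    if (!std.mem.endsWith(u8, trimmed, "*")) return false; // not a pointer: standalone
    var base = std.mem.trim(u8, trimmed[0 .. trimmed.len - 1], " \t");
    if (std.mem.startsWith(u8, base, "const ")) base = base["const ".len..];
    for (opaque_types) |name| {
        if (std.mem.eql(u8, name, base)) return true;
    }
    return false;
}
```
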
## Memory Management

### Ownership Rules

1. **Scanner owns strings** during parsing (allocated from its allocator)
2. **Parser owns declarations** after scanning (freed at end of main)
3. **Resolver owns HashMap keys** (duped when inserted, freed in deinit)
4. **Cloned declarations own strings** (allocated explicitly, freed by caller)

### Allocation Strategy

```
GPA (General Purpose Allocator)
  ├─ Primary header source (freed at end)
  ├─ Primary declarations (freed with deep free)
  ├─ DependencyResolver
  │   ├─ referenced_types HashMap (keys owned)
  │   └─ defined_types HashMap (keys borrowed)
  ├─ Missing types array (freed explicitly)
  ├─ Includes array (freed explicitly)
  ├─ Dependency declarations (freed with deep free)
  └─ Generated output (freed after writing)
```

### Cleanup Pattern

```zig
defer {
    for (decls) |decl| {
        freeDeclDeep(allocator, decl);
    }
    allocator.free(decls);
}
```

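
The "cloned declarations own their strings" rule can be illustrated with a reduced declaration type. `TypedefDecl`, `cloneTypedef`, and `freeTypedef` are hypothetical names used only for this sketch; the real `cloneDeclaration()` has to cover every variant of the `Declaration` union.

```zig
const std = @import("std");

/// Hypothetical, simplified declaration holding two owned strings.
const TypedefDecl = struct {
    name: []const u8,
    underlying: []const u8,
};

/// Deep clone: each string is duplicated so the clone outlives the source buffer.
fn cloneTypedef(allocator: std.mem.Allocator, src: TypedefDecl) !TypedefDecl {
    const name = try allocator.dupe(u8, src.name);
    errdefer allocator.free(name); // do not leak `name` if the second dupe fails
    const underlying = try allocator.dupe(u8, src.underlying);
    return .{ .name = name, .underlying = underlying };
}

/// Deep free: release every string the declaration owns.
fn freeTypedef(allocator: std.mem.Allocator, decl: TypedefDecl) void {
    allocator.free(decl.name);
    allocator.free(decl.underlying);
}
```

Running such code under `std.testing.allocator` reports any string that is cloned but never freed, which is a convenient way to check that clone and free stay in sync.
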
## Error Handling

### Fatal Errors (Exit Immediately)

- File not found (primary header)
- Out of memory
- Cannot write output file

### Non-Fatal Errors (Continue with Warnings)

- Dependency header not readable → Skip, try next
- Type not found in any header → Print warning, continue
- Struct parsing error → Generate partial, continue
- Syntax errors in output → Print errors, write anyway

### Error Recovery

The parser uses graceful degradation:

1. Try to extract as much as possible
2. Warn about issues
3. Continue processing
4. Generate best-effort output

This allows partial success even with problematic headers.

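
The "skip and continue" behavior for unreadable dependency headers can be sketched like this. `scanHeader` is a stand-in that fakes a failure for one path and `Summary` is a hypothetical result type; neither is part of the parser.

```zig
const std = @import("std");

const Summary = struct { found: usize = 0, warnings: usize = 0 };

/// Stand-in for reading and scanning one header. It fails for paths ending
/// in `missing.h` and otherwise pretends to find one declaration.
fn scanHeader(path: []const u8) error{FileNotFound}!usize {
    if (std.mem.endsWith(u8, path, "missing.h")) return error.FileNotFound;
    return 1;
}

/// Scan every header, counting failures as warnings instead of aborting.
fn scanAll(paths: []const []const u8) Summary {
    var summary = Summary{};
    for (paths) |path| {
        const count = scanHeader(path) catch {
            summary.warnings += 1; // non-fatal: record it and try the next header
            continue;
        };
        summary.found += count;
    }
    return summary;
}
```

Only errors that make any output impossible, such as a missing primary header, would be propagated with `try` instead of being caught.
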
## Extension Points

### Adding New Pattern Support

1. Add new variant to `Declaration` union in `patterns.zig`
2. Implement `scan*()` function to match pattern
3. Add to pattern matching chain in `Scanner.scan()`
4. Update all switch statements:
   - Cleanup code in `parser.zig`
   - `cloneDeclaration()` in `dependency_resolver.zig`
   - `freeDeclaration()` in `dependency_resolver.zig`
5. Implement `write*()` in `codegen.zig`

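
Step 4 is largely enforced by the compiler: a `switch` over a tagged union without an `else` branch must cover every variant. The trimmed-down `Declaration` and the `kindName` function below are hypothetical and exist only to show the effect.

```zig
const std = @import("std");

/// Hypothetical, trimmed-down version of the `Declaration` union.
const Declaration = union(enum) {
    opaque_type: []const u8,
    typedef_decl: []const u8,
    // Adding a variant here (for example `union_decl: []const u8`) makes every
    // `switch` without an `else` branch fail to compile until it is handled.
};

/// Exhaustive switch: there is deliberately no `else` branch.
fn kindName(decl: Declaration) []const u8 {
    return switch (decl) {
        .opaque_type => "opaque",
        .typedef_decl => "typedef",
    };
}
```

Avoiding `else` in the clone, free, and write switches means a new variant surfaces as compile errors at each place that still needs updating.
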
### Adding Type Conversions

Edit `src/types.zig`:

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

pub fn convertType(c_type: []const u8, allocator: Allocator) ![]const u8 {
    // Add new conversion here
    if (std.mem.eql(u8, c_type, "MyType")) {
        // The caller owns the returned string and must free it.
        return try allocator.dupe(u8, "MyZigType");
    }
    // ...
}
```

### Adding Naming Rules

Edit `src/naming.zig`:

```zig
pub fn typeNameToZig(c_name: []const u8) []const u8 {
    // Add custom naming logic here, ahead of the existing rules
    return c_name; // placeholder so the snippet compiles; existing rules go here
}
```

## Performance Characteristics

### Time Complexity

- **Primary parsing**: O(n) where n = source lines
- **Dependency analysis**: O(d) where d = declarations
- **Type extraction**: O(h × d) where h = headers, d = declarations per header
- **Code generation**: O(d) where d = total declarations

**Overall**: O(n + h × d), which is effectively linear in input size for typical use

### Space Complexity

- **Declarations**: O(d) where d = declaration count
- **HashMaps**: O(t) where t = unique type names
- **Output**: O(d) where d = declaration count

**Peak memory**: ~2-5 MB for SDL_gpu.h (169 declarations)

### Optimization Points

Current optimizations:

- HashMap-based deduplication
- Early exit when type found
- Selective parsing (only missing types)
- String interning for type names

Potential improvements:

- Cache parsed headers (avoid re-parsing)
- Parallel header processing
- Lazy header loading

## Testing Strategy

### Unit Tests (`test/`)

- Pattern matching tests (each C pattern)
- Type conversion tests
- Naming convention tests
- Dependency resolution tests
- Multi-field parsing tests

### Integration Tests

- Real SDL headers (SDL_gpu.h)
- Dependency chain resolution
- End-to-end parsing and generation

### Validation

- AST parsing of generated code
- Memory leak detection (GPA)
- No regressions (all tests must pass)

## Code Organization

```
src/
├── parser.zig              # Main entry point, CLI handling
├── patterns.zig            # Pattern matching and scanning
├── types.zig               # C to Zig type conversion
├── naming.zig              # Naming convention handling
├── codegen.zig             # Zig code generation
├── mock_codegen.zig        # C mock generation
└── dependency_resolver.zig # Dependency analysis and extraction

test/
└── (various test files)

docs/
├── GETTING_STARTED.md       # Installation and first steps
├── ARCHITECTURE.md          # Architecture overview (this file)
├── DEPENDENCY_RESOLUTION.md # Dependency system details
└── ...
```

## Next Steps

- Read [Dependency Resolution](DEPENDENCY_RESOLUTION.md) for details on automatic type extraction
- See [API Reference](API_REFERENCE.md) for all command-line options
- Check [Known Issues](KNOWN_ISSUES.md) for current limitations
- Review [Development](DEVELOPMENT.md) to contribute

---

**Related Documents**:

- Technical deep dive: [docs/DEPENDENCY_FLOW.md](DEPENDENCY_FLOW.md)
- Visual diagrams: [docs/VISUAL_FLOW.md](VISUAL_FLOW.md)