Projects
Research and open-source projects in chip design, computer architecture, and hardware engineering. Most are open-source; each entry is marked active, completed, or archived.
RISC-V Out-of-Order Core
Status: active
A parameterized out-of-order RISC-V processor core with 6-wide issue, branch prediction, and non-blocking L1 cache.
- 6-wide superscalar out-of-order execution
- TAGE branch predictor with 95%+ accuracy
- Non-blocking L1 data cache with MSHRs
- Verified with riscv-dv random instruction generator
- Synthesized at 1.2 GHz on a 28 nm process
NPU Performance Simulator
Status: active
Cycle-accurate simulator for systolic-array-based NPU architectures, supporting various dataflows and memory hierarchies.
- Configurable array size (16x16 to 128x128)
- Weight-stationary and output-stationary dataflows
- Multi-level memory hierarchy modeling
- DRAM bandwidth bottleneck analysis
- Integrated with PyTorch for workload traces
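As a rough illustration of the DRAM bandwidth bottleneck analysis listed above, the sketch below computes a roofline-style bound for a single matrix multiply. It is not taken from the simulator: the function name `matmul_bound`, its parameters, and the assumption that each operand crosses DRAM exactly once are all hypothetical simplifications.

```python
def matmul_bound(m, k, n, array_dim, freq_hz, dram_bytes_per_s, bytes_per_elem=1):
    """Roofline-style estimate for an (m x k) @ (k x n) matmul on a square systolic array.

    Assumptions (hypothetical, for illustration only):
    - the array sustains array_dim * array_dim MACs per cycle at peak
    - inputs and outputs each cross DRAM exactly once (ideal on-chip reuse)
    """
    macs = m * k * n
    peak_macs_per_s = array_dim * array_dim * freq_hz
    compute_s = macs / peak_macs_per_s

    traffic_bytes = (m * k + k * n + m * n) * bytes_per_elem
    memory_s = traffic_bytes / dram_bytes_per_s

    # Whichever time is larger limits the layer
    bound = "memory" if memory_s > compute_s else "compute"
    return compute_s, memory_s, bound


# A matrix-vector-like layer has little reuse and tends to be memory-bound
print(matmul_bound(1, 1024, 1024, array_dim=128, freq_hz=1e9, dram_bytes_per_s=1e10))
```

A cycle-accurate model would replace both terms with simulated timing; the bound is only a quick sanity check on which resource is likely to dominate.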
FPGA-Based Radar Signal Processor
Status: completed
Real-time FMCW radar signal processing pipeline on Xilinx Zynq, including FFT, CFAR detection, and angle estimation.
- 1024-point FFT pipeline with 4-cycle throughput
- 2D CFAR detector with configurable guard cells
- MUSIC algorithm for angle-of-arrival estimation
- AXI-Stream interface for data movement
- Real-time processing at 100 MSPS
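To show what configurable guard cells do in a CFAR detector, here is a minimal one-dimensional cell-averaging CFAR in software. The project itself uses a 2D detector in FPGA logic; this 1D Python version, along with the name `ca_cfar_1d` and its parameters, is a simplified assumption for illustration.

```python
def ca_cfar_1d(power, num_train, num_guard, scale):
    """Return indices whose power exceeds scale * mean of the surrounding training cells.

    num_train: training cells on each side of the cell under test (must be >= 1)
    num_guard: guard cells on each side, excluded from the noise estimate
    """
    detections = []
    half = num_train + num_guard
    for i in range(half, len(power) - half):
        # Training windows sit outside the guard cells on both sides
        leading = power[i - half : i - num_guard]
        trailing = power[i + num_guard + 1 : i + half + 1]
        noise = (sum(leading) + sum(trailing)) / (2 * num_train)
        if power[i] > scale * noise:
            detections.append(i)
    return detections


# Flat noise floor with a single strong target at bin 16
power = [1.0] * 32
power[16] = 20.0
print(ca_cfar_1d(power, num_train=4, num_guard=2, scale=5.0))
```

Guard cells keep energy that leaks from a target into neighboring bins out of the noise estimate, so the threshold is not inflated by the target itself.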
Cache Coherence Verification Framework
Status: completed
Formal verification framework for cache coherence protocols using SystemVerilog Assertions and JasperGold.
- Support for MSI, MESI, and MOESI protocols
- Automated litmus test generation
- Coverage-driven verification methodology
- Integration with TileLink/ACE interfaces
- Reported and fixed 3 protocol-level bugs
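The kind of property such a framework checks can be sketched with a toy model. The example below is not the SVA/JasperGold flow: it swaps in explicit-state exploration in Python over a heavily simplified MESI model (one cache line, atomic transactions, no transient states) and checks the single-writer/multiple-reader invariant. All names are hypothetical.

```python
from itertools import product

OPS = ("read", "write", "evict")


def step(states, cache, op):
    """Apply one atomic operation by `cache` to a tuple of per-cache MESI states."""
    s = list(states)
    others = [j for j in range(len(s)) if j != cache]
    if op == "read":
        if s[cache] == "I":
            if any(s[j] != "I" for j in others):
                # Another cache holds the line: owners downgrade, everyone shares
                for j in others:
                    if s[j] in ("M", "E"):
                        s[j] = "S"
                s[cache] = "S"
            else:
                s[cache] = "E"
    elif op == "write":
        # Writer gains exclusive ownership; all other copies are invalidated
        for j in others:
            s[j] = "I"
        s[cache] = "M"
    elif op == "evict":
        s[cache] = "I"
    return tuple(s)


def swmr_holds(states):
    """If any cache is in M or E, it must be the only valid copy."""
    exclusive = sum(1 for x in states if x in ("M", "E"))
    valid = sum(1 for x in states if x != "I")
    return exclusive == 0 or valid == 1


def explore(num_caches=2):
    """Enumerate every state reachable from all-Invalid."""
    start = ("I",) * num_caches
    seen = {start}
    frontier = [start]
    while frontier:
        cur = frontier.pop()
        for cache, op in product(range(num_caches), OPS):
            nxt = step(cur, cache, op)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


reachable = explore(2)
print(all(swmr_holds(s) for s in reachable))
```

A formal tool proves the same style of invariant over the RTL, including the transient states and message races that this toy model leaves out.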
LLVM Backend for Custom AI Accelerator
Status: active
Custom LLVM backend for a proprietary AI accelerator ISA, including instruction selection, scheduling, and code generation.
- Custom ISA with vector and matrix extensions
- TableGen-based instruction definitions
- MLIR dialect for high-level operations
- Loop tiling and fusion optimizations
- Achieved 78% of peak theoretical throughput
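Loop tiling, one of the optimizations listed above, can be illustrated independently of the compiler. The sketch below tiles a plain matrix multiply in Python so that each block of work touches a small, reusable region of the operands. It does not reflect the backend's actual passes or the accelerator's ISA; `matmul_tiled` and the `tile` parameter are hypothetical.

```python
def matmul_tiled(a, b, tile=2):
    """Multiply a (m x k) by b (k x n) using tile x tile x tile blocks."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    # Outer loops walk over tiles; inner loops stay inside one tile
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, k, tile):
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
print(matmul_tiled(a, b))  # [[58, 64], [139, 154]]
```

On an accelerator the tile size would typically be chosen to match on-chip buffer capacity, so that each tile is loaded once and reused across the inner loops.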
Open-Source AMBA Testbench
Status: archived
Comprehensive UVM-based testbench for AMBA AXI4/AHB protocols with scoreboarding and coverage collection.
- AXI4, AXI4-Lite, and AHB-Lite VIP components
- Constrained-random transaction generation
- Functional coverage model with 100% coverage target
- Reusable agent architecture
- Open-source under MIT license
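Constrained-random generation means drawing random stimulus that still obeys protocol rules. The testbench does this with SystemVerilog constraints in UVM; the Python sketch below is only an analogy, using one real AXI4 rule (a burst must not cross a 4 KB address boundary) and INCR bursts of 1 to 256 beats. The function name, the dictionary fields, and the aligned-address simplification are assumptions.

```python
import random


def random_incr_burst(rng, max_size_log2=3):
    """Draw a random AXI4 INCR burst that stays inside one 4 KB region."""
    size_log2 = rng.randint(0, max_size_log2)  # AxSIZE: bytes per beat = 2**size_log2
    beat_bytes = 1 << size_log2
    beats = rng.randint(1, 256)                # AXI4 INCR bursts carry 1..256 beats

    # Simplification: start addresses are always aligned to the beat size
    addr = rng.randrange(0, 1 << 32, beat_bytes)

    # Clamp the burst so its last byte stays below the next 4 KB boundary
    room = (4096 - addr % 4096) // beat_bytes
    beats = min(beats, room)

    return {"addr": addr, "size_log2": size_log2, "len": beats - 1}  # AxLEN = beats - 1


rng = random.Random(0)
print(random_incr_burst(rng))
```

Clamping after the draw skews the length distribution toward short bursts near page ends; a constraint solver instead samples from the set of legal combinations directly.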