Project Overview

Note on Project Description

The initial project description submitted to Iowa State University by our client (see the approved projects list) stated: "This project will break up an existing U-Net model into code segments that can then be pipelined. The result will be slightly higher latency but also higher throughput of the algorithm." This description is the client's original proposal, reproduced here without editing or interpretation by our team. Our actual research findings, detailed below, differ significantly from this initial description because of the realities of the target hardware and the restrictions imposed by the client.

Our project, Machine Learning: Semantic Segmentation Optimization, focuses on optimizing semantic segmentation algorithms for eye tracking in assistive technology applications. We aim to improve the performance of these algorithms for individuals with mobility disabilities, particularly those with conditions such as cerebral palsy.

Our original approach was to pipeline the U-Net neural network into four equal segments that run concurrently. However, through rigorous testing and output validation, we discovered that the Vitis AI model compiler incorrectly scaled the last two split segments by a factor of two, so their input tensors had to be rescaled to produce correct output. Our performance analysis showed that the single model is 9.20x faster than the four-segment split model, demonstrating that the DPU architecture does not benefit from model splitting in this use case: the main processing bottleneck is the single-model inference constraint imposed by the DPU synthesized on the FPGA fabric. The client also stipulated that neither the fabric nor the model could be changed.
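
To check the split segments' outputs against the original model, the segments can be chained with a compensating rescale on the inputs of the affected segments. The sketch below is a simplified illustration only, assuming the segments are exported as ONNX files and run host-side with ONNX Runtime; the file names, the 0.5 factor (rather than 2.0), and the single-input/single-output chaining (which ignores U-Net skip connections) are assumptions for clarity, not the team's production code.

    # Hypothetical host-side validation: chain the four U-Net segments and
    # apply a compensating scale to the inputs of the segments the compiler
    # scaled by 2x. File names and the 0.5 factor are illustrative only.
    import numpy as np
    import onnxruntime as ort

    SEGMENTS = ["unet_seg1.onnx", "unet_seg2.onnx", "unet_seg3.onnx", "unet_seg4.onnx"]
    INPUT_SCALE = {2: 0.5, 3: 0.5}   # compensate the last two segments (indices 2 and 3)

    sessions = [ort.InferenceSession(p) for p in SEGMENTS]

    def run_split_model(frame: np.ndarray) -> np.ndarray:
        """Feed one frame through the four segments sequentially."""
        x = frame.astype(np.float32)
        for i, sess in enumerate(sessions):
            x = x * INPUT_SCALE.get(i, 1.0)             # rescale where needed
            (x,) = sess.run(None, {sess.get_inputs()[0].name: x})
        return x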

Key contributions from our research include:

  • Comprehensive analysis of U-Net model splitting for DPU-based inference
  • Discovery of Vitis AI compiler scaling issues with split model segments
  • Performance benchmarking methodology for split vs. single model inference (a timing sketch appears below)
  • Optimization for the AMD Kria KV260 development board

Technologies and tools used:

  • ONNX model decomposition and analysis (a decomposition sketch appears below)
  • Proprietary machine learning models
  • AMD development tools: PetaLinux Tools, Vitis AI, Vitis AI Model Optimizer
  • Docker development environment
  • AMD Kria KV260 development board
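
The 9.20x figure above comes from timing both configurations over the same set of frames. The harness below is a minimal sketch of that methodology, not the team's actual benchmark code: the frame shape is illustrative, and run_single_model / run_split_model stand in for whichever inference entry points are used (for example, the chaining function sketched earlier).

    # Minimal latency harness: time an inference callable over many frames,
    # excluding a few warm-up runs, and report the mean per-frame latency.
    import time
    import statistics
    import numpy as np

    def benchmark(infer, frames, warmup=5):
        """Return mean per-frame latency in milliseconds."""
        for f in frames[:warmup]:                 # warm-up runs, not timed
            infer(f)
        samples = []
        for f in frames:
            start = time.perf_counter()
            infer(f)
            samples.append((time.perf_counter() - start) * 1000.0)
        return statistics.mean(samples)

    # Illustrative input shape; the real eye-tracking frames may differ.
    frames = [np.random.rand(1, 1, 256, 256).astype(np.float32) for _ in range(100)]

    # Example usage, with run_single_model / run_split_model as the two entry points:
    # single_ms = benchmark(run_single_model, frames)
    # split_ms  = benchmark(run_split_model, frames)
    # print(f"split/single latency ratio: {split_ms / single_ms:.2f}x")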
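
For the ONNX decomposition itself, the onnx package can cut an exported graph at named intermediate tensors. The sketch below assumes an exported file unet.onnx and placeholder cut-tensor names; a real U-Net split must also carry the skip-connection tensors as extra inputs and outputs of the later segments, which is omitted here for brevity.

    # Hypothetical decomposition of an exported U-Net into four segments by
    # cutting the graph at three intermediate tensors. Tensor names are
    # placeholders; inspect the real graph to choose actual cut points.
    import onnx
    from onnx.utils import extract_model

    model_path = "unet.onnx"
    model = onnx.load(model_path)
    print(onnx.helper.printable_graph(model.graph))    # list candidate cut tensors

    cuts = ["enc2_out", "bottleneck_out", "dec2_out"]  # placeholder tensor names
    segment_inputs = [["input"]] + [[c] for c in cuts]
    segment_outputs = [[c] for c in cuts] + [["output"]]
    for i, (ins, outs) in enumerate(zip(segment_inputs, segment_outputs), start=1):
        extract_model(model_path, f"unet_seg{i}.onnx", ins, outs)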

Project Information

Team Number

sddec25-01

Client

JR Spidell

Faculty Advisor

Dr. Namrata Vaswani

Team Members

Tyler Schaefer

ML Algorithm Analyst

Specializes in algorithm optimization and mathematical validation for ML models. Focuses on maintaining accuracy during the optimization and model splitting process.

Conner Ohnesorge

ML Integration HWE

Specializes in hardware optimization for ML models with experience in FPGA implementation. Also serves as the development environment manager.

Aidan Perry

Multithreaded Program Developer

Leads threading implementation and synchronization for parallel processing. Experienced in real-time systems and FPGA programming.

Joey Metzen

Kria Board Manager

Responsible for hardware management and memory optimization. Leads testing and benchmarking efforts for the system.

Project Timeline

Project Initialization

Jan 2025 - Feb 2025

  • Team formation
  • Problem definition
  • Initial research

Design Phase

Mar 2025 - Apr 2025

  • Architecture planning
  • Algorithm selection
  • Hardware requirements analysis

Implementation

May 2025 - Aug 2025

  • Mathematical division of U-Net
  • Thread management system
  • Memory allocation strategy

Testing & Validation

Sep 2025 - Oct 2025

  • Comprehensive testing
  • Performance benchmarking
  • Accuracy validation

Final Delivery

Nov 2025 - Dec 2025

  • System refinement
  • Documentation completion
  • Final presentation & handover

Project Documentation

Project documents are available as PDF downloads.