ECE 408 - Applied Parallel Programming
Spring 2024
Title | Rubric | Section | CRN | Type | Hours | Times | Days | Location | Instructor |
---|---|---|---|---|---|---|---|---|---|
Applied Parallel Programming | CS483 | AB | 56564 | LAB | 0 | - | | | |
Applied Parallel Programming | CS483 | AL | 56562 | LEC | 4 | 0930 - 1050 | T R | 3031 Campus Instructional Facility | Volodymyr Kindratenko |
Applied Parallel Programming | CS483 | OLB | 68235 | OLB | 0 | - | | | |
Applied Parallel Programming | CS483 | OLC | 68236 | OLC | 4 | 0930 - 1050 | T R | | Volodymyr Kindratenko |
Applied Parallel Programming | CSE408 | AB | 67939 | LAB | 0 | - | | | |
Applied Parallel Programming | CSE408 | AL | 67940 | LEC | 4 | 0930 - 1050 | T R | 3031 Campus Instructional Facility | Volodymyr Kindratenko |
Applied Parallel Programming | CSE408 | OLB | 68237 | OLB | 0 | - | | | |
Applied Parallel Programming | CSE408 | OLC | 68238 | OLC | 4 | 0930 - 1050 | T R | | Volodymyr Kindratenko |
Applied Parallel Programming | ECE408 | AB | 56563 | LAB | 0 | - | | | |
Applied Parallel Programming | ECE408 | AL | 56561 | LEC | 4 | 0930 - 1050 | T R | 3031 Campus Instructional Facility | Volodymyr Kindratenko |
Applied Parallel Programming | ECE408 | OLB | 68233 | OLB | 0 | - | | | |
Applied Parallel Programming | ECE408 | OLC | 68234 | OLC | 4 | 0930 - 1050 | T R | | Volodymyr Kindratenko |
Official Description
Subject Area
- Computer Engineering
Course Director
Detailed Description and Outline
Parallel programming with emphasis on developing applications for processors with many computation cores. Computational thinking, forms of parallelism, programming model features, mapping computations to parallel hardware, efficient data structures, paradigms for efficient parallel algorithms, hardware features and limitations, and application case studies. Same as CS 483.
Computer Usage
Extensive usage for all programming assignments and final project
Reports
A final project report is required
Lab Projects
- Lab 0 - Installation and test of programming environment
- Lab 1 - Parallel Vector Addition
- Lab 2 - Parallel Matrix Multiplication
- Lab 3 - Tiled Parallel Matrix Multiplication
- Lab 4 - Parallel Reduction
- Lab 5 - Parallel Scan
- Lab 6 - Tiled Parallel Convolution
- Lab 7 - Sparse Matrix-Vector Multiplication
- Final Project, which involves a Project Proposal, Project Workshop, Project Presentation, and Project Report
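To give a feel for the kind of work the early labs involve, here is a minimal sketch of the index mapping behind parallel vector addition (Lab 1). It is not the course's lab code. It is plain host-side C that emulates a one-dimensional CUDA launch with two nested loops, so it runs without a GPU. The function names vec_add_thread and vec_add_emulated are hypothetical and chosen only for illustration. In a real CUDA kernel the two loops disappear: the hardware runs the threads, and the loop variables correspond to the built-ins blockIdx.x and threadIdx.x, with the block size given by blockDim.x.

```c
#include <assert.h>
#include <stddef.h>

/* Work done by one logical thread. This mirrors the body of a CUDA
 * vector-addition kernel for a single (block_idx, thread_idx) pair.
 * Hypothetical helper, for illustration only. */
static void vec_add_thread(const float *a, const float *b, float *out,
                           size_t n, size_t block_idx, size_t block_dim,
                           size_t thread_idx)
{
    /* Global element index, as in blockIdx.x * blockDim.x + threadIdx.x. */
    size_t i = block_idx * block_dim + thread_idx;

    /* Boundary guard: the last block may cover indices past the end. */
    if (i < n) {
        out[i] = a[i] + b[i];
    }
}

/* Host-side emulation of a 1D launch. On a GPU the iterations of both
 * loops would execute as independent threads rather than sequentially. */
static void vec_add_emulated(const float *a, const float *b, float *out,
                             size_t n, size_t block_dim)
{
    assert(block_dim > 0);

    /* Ceiling division: enough blocks to cover all n elements. */
    size_t grid_dim = (n + block_dim - 1) / block_dim;

    for (size_t block_idx = 0; block_idx < grid_dim; ++block_idx) {
        for (size_t thread_idx = 0; thread_idx < block_dim; ++thread_idx) {
            vec_add_thread(a, b, out, n, block_idx, block_dim, thread_idx);
        }
    }
}
```

For example, calling vec_add_emulated(a, b, out, 10, 4) covers 10 elements with 3 blocks of 4 logical threads each; the two surplus threads in the last block fail the i < n check and write nothing. The block size of 4 is arbitrary here and far smaller than what would typically be used on real hardware.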
Lab Equipment
Linux based cluster system
Lab Software
C Programming Language and CUDA Software Development Kit, WebGPU for labs, RAI for final project
Topical Prerequisites
C programming, Basic data structures, Introduction to computer organization
Texts
D. Kirk and W. Hwu, Programming Massively Parallel Processors, Morgan Kaufmann, 3rd Edition.
Required, Elective, or Selected Elective
Elective
Course Goals
The aim of this course is to provide students with knowledge and hands-on experience in developing application software for processors with massively parallel computing resources. In general, we refer to a processor as massively parallel if it can complete more than 64 arithmetic operations per clock cycle. Many commercial processors from NVIDIA, AMD, and Intel already provide this level of concurrency. Programming these processors effectively requires in-depth knowledge of parallel programming principles, as well as the parallelism models, communication models, hardware organizations, and resource limitations of these processors. The target audience of the course is students who want to develop exciting applications for these processors, as well as those who want to develop programming tools and future designs for them.
Instructional Objectives
A. After the seven machine problems (after approximately 20 seventy-five-minute lectures) the student should be able to:
1. Analyze and implement common parallel algorithm patterns in a parallel programming model such as CUDA. (1, 2)
2. Design experiments to analyze the performance bottlenecks in their parallel code. (6)
3. Apply common parallel techniques to improve performance given hardware constraints. (1, 2, 6)
4. Learn about the features of a parallel debugger and use them to identify and repair code defects. (6, 7)
5. Learn about the features of a parallel profiler and use them to identify performance bottlenecks in their code. (6, 7)
B. By examination 2 (after approximately 29 seventy-five-minute lectures) the student should be able to:
6. Understand and apply common parallel algorithm patterns. (1, 7)
7. Understand the major types of hardware limitations that limit parallel program performance. (1, 6, 7)
8. Understand and apply common parallel programming interface features. (1, 6, 7)
9. Review a parallel code segment and identify its behavior and potential problems. (b, e)
C. By the end of the final project (with proposal, workshop discussions, presentation, and report) the student should be able to:
10. Identify and solve a computational problem through parallel algorithm design and programming. (1, 2, 6, 7)
11. Learn the necessary domain knowledge in order to solve the identified problem. (7)
12. Work with domain experts and teammates from different disciplines to maximize the effectiveness of solutions. (3, 5)
13. Properly divide up the responsibilities among teammates and support each other towards success. (3, 4, 5)
14. Identify design space and explore optimization opportunities for the solutions. (1, 2, 6, 7)
15. Motivate the problem and approach in a presentation. (3)
16. Properly explain the solutions experimented with and justify the final decision and outcome. (1, 2, 3, 4, 6)
17. Identify limitations of the solutions and future directions. (1, 2, 4, 6, 7)