CSCE 4643/5693 GPU Programming (Spring 2021)

Course Description: This course introduces massively parallel programming using Graphics Processing Units (GPUs). It covers the basic programming model, the GPU thread hierarchy, and the GPU memory architecture. Performance optimization techniques and parallel patterns are discussed in the context of real-life applications.
   
Credit hours: 3
   
Meetings:

M/W/F: 2:00pm - 2:50pm

Online via Blackboard

   
Instructor:

Miaoqing Huang

Office: JBHT 526

Phone: 479-575-7578

Email: mqhuang AT uark.edu

   
Office Hours:

Monday: 3-4pm, Wednesday: 1-2pm

   
Course Management: Blackboard (learn.uark.edu)
   
Textbook:

1. Programming Massively Parallel Processors: A Hands-on Approach (3rd edition), by David B. Kirk and Wen-mei W. Hwu, Morgan Kaufmann, 2016, ISBN: 978-0128119860

2. NVIDIA CUDA C Programming Guide

3. CUDA Runtime API Reference Manual

   
Course Syllabus: Please download here

 

 

Class Schedule: (subject to change)

 

Week  Date  Content                           Lecture     Note
1     1/11  Syllabus and course introduction  Lecture 1   Access GPU cluster: steps
      1/13  CUDA Basics                       Lecture 2
      1/15
2     1/18                                                Martin Luther King Jr. Day
      1/20
      1/22  Tiled Matrix Multiplication       Lecture_3
3     1/25  CUDA Memories                     Lecture_4
      1/27
      1/29
4     2/1
      2/3
      2/5   Convolution                       Lecture_5
5     2/8
      2/10
      2/12
6     2/15  Performance Considerations        Lecture_6
      2/17
      2/19
7     2/22                                                Spring Break, no class.
      2/24
      2/26
8     3/1
      3/3   Parallel Patterns                 Lecture_7   Parallel Prefix Sum (Scan) with CUDA
      3/5
9     3/8
      3/10
      3/12
10    3/15
      3/17  Histogramming                     Lecture_8
      3/19
11    3/22
      3/24  Floating-point Consideration      Lecture_9
      3/26                                                Spring Break, no class.
12    3/29
      3/31
      4/2                                                 Spring Break, no class.
13    4/5
      4/7   Data Transfer                     Lecture_10
      4/9
14    4/12
      4/14
      4/16
15    4/19
      4/21
      4/23
16    4/26
      4/28
      4/30                                                Dead day
17    5/x                                                 Final exam

Lecture Slides:

 

No.  Content                       Lecture     Note
     Access GPU cluster            Steps
1    Course introduction           Lecture 1
2    CUDA Basics                   Lecture 2
3    Tiled Matrix Multiplication   Lecture_3
4    CUDA Memories                 Lecture_4
5    Convolution                   Lecture_5
6    Performance Considerations    Lecture_6
7    Parallel Patterns             Lecture_7   Parallel Prefix Sum (Scan) with CUDA
8    Histogramming                 Lecture_8
9    Floating-point Consideration  Lecture_9
10   Data Transfer                 Lecture_10
11   OpenCL Introduction           Lecture_11
12   OpenACC Introduction          Lecture_12

 

 

 

Homework:

 

    No homework.   

 

Exam:

 

    Final exam.

 

Labs:

 

    Labs (programming assignments) will be assigned in class. Solutions will be presented and discussed during lecture time. Labs count toward the final grade.

    Please note that no lecture hours are reserved for labs. Students are expected to complete the lab exercises on their own time.

 

Projects:

 

    No project.

 

 

 

Course Grading:

 

    The final course grade is based on quizzes, lab assignments, and the final exam.

    Quizzes:                 10% 

    Lab assignments:     60%

    Final exam:             30%
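
    The weights above amount to a weighted average. As an illustrative sketch only, the overall percentage could be computed as follows. The function name weighted_total and the sample scores are hypothetical, and each component is assumed to be scored on a 0-100 scale, which this page does not state.

```python
# Hypothetical helper: combines component scores using the weights listed above.
# Assumption: each component score is a percentage in the range 0-100.
WEIGHTS = {"quizzes": 10, "labs": 60, "final_exam": 30}  # percent, from this page


def weighted_total(quizzes, labs, final_exam):
    """Return the overall course percentage as a weighted average."""
    scores = {"quizzes": quizzes, "labs": labs, "final_exam": final_exam}
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    # Sample scores are made up for illustration.
    print(weighted_total(quizzes=80, labs=90, final_exam=70))  # 83.0
```

    Because lab assignments carry 60% of the weight, a ten-point change in the lab score moves the overall percentage by six points.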

 

Grade Distribution:

 

    A: over 90%

    B: 80% - 89%

    C: 70% - 79%

    D: 60% - 69%

    F: below 60%

 

Acknowledgment:

    (1) Much of the GPU material is borrowed from the corresponding course taught at UIUC by Hwu and Kirk, and from presentation slides by NVIDIA.