| Line 1: | Line 1: |
| = High-Performance Computing = | #acl MarshaBerger:read,write,admin All:read |
| Line 3: | Line 3: |
| || ''Class Time/Location'' || Wednesday 5-7pm, room TBD || || ''Instructor'' || Marsha Berger, Andreas Kloeckner || || ''Email'' || berger@cims.nyu.edu, kloeckner@cims.nyu.edu || || ''Office'' || Courant Institute, Warren Weaver Hall, Rooms 1121, 1105A || || ''Office Hours'' || Monday, Wednesday 4pm-5pm || |
{{attachment:flyer.png|Flyer thumbnail|align="right"}} = High-Performance Scientific Computing = || ''Class Time/Location'' || Wednesday 5:10-7pm, Room 101 Warren Weaver Hall || || ''Instructor'' || [[http://www.cims.nyu.edu/~kloeckner|Andreas Kloeckner]], [[http://www.cims.nyu.edu/~berger|Marsha Berger]] || || ''Email'' || kloeckner@cims.nyu.edu, berger@cims.nyu.edu || || ''Office'' || Courant Institute, Warren Weaver Hall, Rooms 1105A, 1121 || || ''Office Hours'' || Andreas: Wednesdays, 2-4pm WWH 1105A || |
| Line 9: | Line 13: |
| || ''Email Listserv'' || http://lists.tiker.net/listinfo/discretefall11, discretefall11@tiker.net, [[http://lists.tiker.net/pipermail/discretefall11/|archive]] || | || ''Email Listserv'' || [[http://lists.tiker.net/listinfo/hpc12|Info page]] || [[attachment:flyer.pdf|Class advertisement]] <<TableOfContents>> == Lecture Material == || '''Lecture #''' || '''Date''' || '''Topics''' || '''Slides''' || '''Video''' || '''Code''' || '''Extra Info''' || || 1 || Sep 5 || Intro, why parallel, Vector-add in seq, OpenMP, CL || [[attachment:lec1.pdf|Slides]] || [[http://video.cims.nyu.edu/courses/fall12/CSCI-GA.2945-001/video/upload/html/player.html?descriptor=metadata/2012-09-05.json|Video]] || [[https://github.com/hpc12/lec1-demo|Code]] || || || 2 || Sep 12 || Vector-add in MPI, Intro to OpenMP || [[attachment:lec2.pdf|Slides]] || [[http://video.cims.nyu.edu/courses/fall12/CSCI-GA.2945-001/video/upload/video/2012-09-12-room.webm|Video]] || [[https://github.com/hpc12/lec2-demo|Code]] || || || 3 || Sep 19 || HW2, OMP subtleties || [[attachment:lec3.pdf|Slides]] || [[http://video.cims.nyu.edu/courses/fall12/CSCI-GA.2945-001/video/upload/html/player.html?descriptor=metadata/2012-09-19.json|Video]] || || || || 4 || Sep 26 || Make, Intro to OpenCL || [[attachment:lec4.pdf|Slides]] || [[http://video.cims.nyu.edu/courses/fall12/CSCI-GA.2945-001/video/upload/html/player.html?descriptor=metadata/2012-09-26.json|Video]] || [[https://github.com/hpc12/lec4-demo|Code]] || || || 5 || Oct 3 || Git, OpenCL sync/local, Intro to MPI || [[attachment:lec5.pdf|Slides]] || [[http://video.cims.nyu.edu/courses/fall12/CSCI-GA.2945-001/video/upload/html/player.html?descriptor=metadata/2012-10-03.json|Video]] || [[https://github.com/hpc12/lec5-demo|Code]] || || || 6 || Oct 10 || Gdb, MPI point-to-point || [[attachment:lec6.pdf|Slides]] || [[http://video.cims.nyu.edu/courses/fall12/CSCI-GA.2945-001/video/upload/html/player.html?descriptor=metadata/2012-10-10.json|Video]] || 
[[https://github.com/hpc12/lec6-demo|Code]] || || || 7 || Oct 17 || Valgrind, MPI collectives, Intro perf. || [[attachment:lec7.pdf|Slides]] || [[http://video.cims.nyu.edu/courses/fall12/CSCI-GA.2945-001/video/upload/html/player.html?descriptor=metadata/2012-10-17.json|Video]] || [[https://github.com/hpc12/lec7-demo|Code]] || || || 8 || Oct 24 || Software installation, tmux, single-thread perf. || [[attachment:lec8.pdf|Slides]] || [[http://video.cims.nyu.edu/courses/fall12/CSCI-GA.2945-001/video/upload/html/player.html?descriptor=metadata/2012-10-24.json|Video]] || [[https://github.com/hpc12/lec8-demo|Code]] || || || || Oct 31 ||<-5> NYU closed, no class because of Hurricane Sandy aftermath || || 9 || Nov 7 || Shell scripting, single/multi-thread perf. || [[attachment:lec9.pdf|Slides]] || [[http://video.cims.nyu.edu/courses/fall12/CSCI-GA.2945-001/video/upload/html/player.html?descriptor=metadata/2012-11-07.json|Video]] || [[https://github.com/hpc12/lec9-demo|Code]] || || || 10 || Nov 14 || Profilers, parallel perf. || [[attachment:lec10.pdf|Slides]] || [[http://video.cims.nyu.edu/courses/fall12/CSCI-GA.2945-001/video/upload/html/player.html?descriptor=metadata/2012-11-14.json|Video]] || [[https://github.com/hpc12/lec10-demo|Code]] || || || 11 || Nov 21 || Advanced git, GPU perf. || [[attachment:lec11.pdf|Slides]] || [[http://video.cims.nyu.edu/courses/fall12/CSCI-GA.2945-001/video/upload/html/player.html?descriptor=metadata/2012-11-21.json|Video]] || [[https://github.com/hpc12/lec11-demo|Code]] || || || 12 || Nov 28 || GPU perf., patterns || [[attachment:lec12.pdf|Slides]] || [[http://video.cims.nyu.edu/courses/fall12/CSCI-GA.2945-001/video/upload/html/player.html?descriptor=metadata/2012-11-28.json|Video]] || [[https://github.com/hpc12/lec12-demo|Code]] || [[/Lecture12/GPUMemoryAccess|GPU mem access patterns]] || || 12 || Dec 5 || Parallel patterns, 3D vis. 
|| [[attachment:lec13.pdf|Slides]] || [[http://video.cims.nyu.edu/courses/fall12/CSCI-GA.2945-001/video/upload/html/player.html?descriptor=metadata/2012-12-05.json|Video]] || [[https://github.com/hpc12/lec13-demo|Code]] || || || || Dec 12 ||<-5> No class, NYU legislative day. Runs on a Monday schedule. || || 13 || Dec 18 || Project Presentations (part 1) ||<|2> [[/Projects]] || [[http://video.cims.nyu.edu/courses/fall12/CSCI-GA.2945-001/video/upload/html/player-with-audio.html?descriptor=metadata/2012-12-18.json|Video]] ||<|2> [[/Projects]] ||<|2> || || 14 || Dec 19 || Project Presentations (part 2) || [[http://video.cims.nyu.edu/courses/fall12/CSCI-GA.2945-001/video/upload/html/player-with-audio.html?descriptor=metadata/2012-12-19.json|Video]] || (You'll need an up-to-date version of [[http://google.com/chrome|Google Chrome]] to play the videos. You'll also need decent internet bandwidth to do streaming (2 MBit/s should be sufficient). If your internet accesss is too slow, you can always right click and download the video. {{{#!wiki note As far as the videos are concerned, Internet Explorer and Safari are not supported, because they do not understand the video format we're using. And Chrome behaves ''much'' better than Firefox with the new lecture video player, to the point of Firefox not even starting up properly. We're currently [[https://bugzilla.mozilla.org/show_bug.cgi?id=795686|investigating]]. In the meantime, please use Chrome. }}} == Updates == Dec 13, 2012:: I've posted the [[/ProjectPresentationsSchedule]], now in its final form. See you next week! Sep 9, 2012:: We're moving to a bigger room! We'll be meeting in room 101 of Warren Weaver Hall from September 12 onward. Aug 21, 2012:: If you're from outside of Courant, you may encounter some difficulty registering for the class. We're fighting with the NYU administration to make this better. In the meantime, please get in touch with us. 
This class is most definitely open to students from other departments, NYU Albert apparently just hasn't gotten the memo. Aug 9, 2012:: Less than a month to go! Class starts on September 5, 2012, from 5-7pm. We've also been assigned a room. We will be meeting in Warren Weaver Hall, room 512 (but check back here just in case there are changes in the meantime). See you then! == Grading/Evaluation == If you will be taking the class for credit, there will be * Weekly homework (60% of your grade) * A more ambitious final project, which may be inspired by your own research needs (40% of your grade) (also see [[/ProjectSubmissionGuidelines]]) If you're planning on auditing or just sitting in, you are more than welcome. == Homework == * [[attachment:hw1.pdf|Homework 1]] due September 12 * [[attachment:hw2.pdf|Homework 2]] due --(September 19)-- September 23 * [[attachment:hw3.pdf|Homework 3]] due October 3 ('''updated 9/30''' for sign bug in formula) * [[attachment:hw4.pdf|Homework 4]] due October 10 * [[attachment:hw5.pdf|Homework 5]] due October 17 * [[attachment:hw6.pdf|Homework 6]] due --(November 1)-- --(November 4 because of the storm)-- November 7 because of protracted power outage * Final project presentations ''in class around Dec 17--19'' == Material == === Books === * [[http://www.springerlink.com/content/978-3-642-04817-3/|Parallel Programming: for Multicore and Cluster Systems]] (available for in PDF form free online from within the NYU network, also from off-campus via this [[https://ezproxy.library.nyu.edu/login?url=http://www.springerlink.com/content/978-3-642-04817-3/|EZProxy link]]) For OpenCL and GPU programming, we will also be referring to the following sources: || OpenCL in Action: How to Accelerate Graphics and Computation || [[http://proquestcombo.safaribooksonline.com/book/-/9781617290176|from NYU net]] || [[https://ezproxy.library.nyu.edu/login?url=http://proquestcombo.safaribooksonline.com/book/-/9781617290176|ezproxy]] || 
[[http://my.safaribooksonline.com/9781617290176|via publisher]] || || OpenCL Programming Guide || [[http://proquestcombo.safaribooksonline.com/book/-/9780132488006|from NYU net]] || [[https://ezproxy.library.nyu.edu/login?url=http://proquestcombo.safaribooksonline.com/book/-/9780132488006|ezproxy]] || [[http://my.safaribooksonline.com/9780132488006|via publisher]] || || Heterogeneous Computing with OpenCL || [[http://proquestcombo.safaribooksonline.com/book/-/9780123877666|from NYU net]] || [[https://ezproxy.library.nyu.edu/login?url=http://proquestcombo.safaribooksonline.com/book/-/9780123877666|ezproxy]] || [[http://my.safaribooksonline.com/9780123877666|via publisher]] || '''Update 9/12:''' Fixed ezproxy links. === Primary source material === These are the technical standards on which this class will be based. While sometimes a bit technical, these documents define whether the programs you write are ''correct'' (or not) or, perhaps result in ''undefined behavior'': * [[http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf|C99 specification]] * [[http://www.openmp.org/mp-documents/OpenMP3.1.pdf|OpenMP 3.1 specification]] ([[https://computing.llnl.gov/tutorials/openMP/|tutorial]]) * [[http://www.mpi-forum.org/docs/mpi-3.0/mpi30-report.pdf|MPI 3.0 specification]] ([[https://computing.llnl.gov/tutorials/mpi/|tutorial]] /!\ not up-to-date, teaches functions ''removed'' from MPI 3) * [[http://www.khronos.org/registry/cl/specs/opencl-1.2.pdf|OpenCL 1.2 specification]] === Secondary Sources === * [[https://www.kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html|Is Parallel Programming Hard, And, If So, What Can You Do About It?]], edited by Paul Mc``Kenney * [[http://www-users.cs.umn.edu/~karypis/parbook/index.html|Parallel Programming]] Lecture slides (and book) by Karypis et al. More theory-heavy (less practical? 
:) ) than this class * [[http://www.cs.berkeley.edu/~demmel/cs267_Spr12/|CS267 Spring '12]] Lecture by Jim Demmel et al, Berkeley * [[http://z.cs.utexas.edu/wiki/rvdg.wiki/HowToOptimizeGemm|DGEMM optimization]] === Collected Wisdom === * [[ToolCheatSheet]] === Virtual Machine Images === * [[https://www.virtualbox.org/wiki/Downloads|VirtualBox Downloads]] * [[http://bit.ly/hpc12-vm|32-bit VM image]] << '''Recommended''' unless you're ''certain'' you've got a 64-bit OS * [[http://bit.ly/hpc12-vm-64|64-bit VM image]] ==== Installing MPE into the virtual machine ==== [[http://www.mcs.anl.gov/research/projects/perfvis/download/index.htm|MPE]] and [[http://www.mcs.anl.gov/research/projects/perfvis/software/viewers/index.htm#Jumpshot-4|Jumpshot]] for visualization of MPI execution were demonstrated during lecture 7. If you'd like to have those in your own virtual machine, download the following script: [[attachment:install-mpe.sh]] Change into the directory where the script resides and start the installation: {{{ sudo bash install-mpe.sh }}} If the script says {{{ *** SUCCESSFULLY INSTALLED MPE }}} then you should be able to use `mpecc` and `jumpshot` from now on. Note that `jumpshot` seems to have a habit of creating its main window title bar underneath the top panel, so that you can't move it. A solution is to right-click the task bar entry for Jumpshot, and click "Move". Notes: * You need to be online for the entire run time of the script. * Depending on your machine and internet connection, the script may take around half an hour to finish. (10 minutes just now on my laptop, but that's a fairly fast machine on a fast connection) * It is best to leave the computer alone while it is processing the script. * This script is ''only'' intended for the class virtual machine, and even then you are using it at your own risk. I highly recommend that you use Virtualbox to create a system restore point before you attempt the installation, in case something goes awry. 
Do not attempt to use this on a Mac or another Linux machine. == Links == * [[http://github.com/hpc12|Code from lectures and homework]] * [[http://forge.tiker.net|Class collaboration space]] ("forge") * [[https://wikis.nyu.edu/display/NYUHPC|NYU HPC systems group]] == Previous Editions of the Class == * [[http://cs.nyu.edu/courses/fall10/G22.2945-001/index.html|Fall '10]] * [[http://cs.nyu.edu/courses/fall08/G22.2945-001/index.html|Fall '08]] |
High-Performance Scientific Computing
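|| Class Time/Location || Wednesday 5:10-7pm, Room 101 Warren Weaver Hall ||
|| Instructor || Andreas Kloeckner, Marsha Berger ||
|| Email || kloeckner@cims.nyu.edu, berger@cims.nyu.edu ||
|| Office || Courant Institute, Warren Weaver Hall, Rooms 1105A, 1121 ||
|| Office Hours || Andreas: Wednesdays, 2-4pm WWH 1105A ||
|| Class Webpage || ||
|| Email Listserv || Info page: http://lists.tiker.net/listinfo/hpc12 ||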
Lecture Material
|| Lecture # || Date || Topics || Slides || Video || Code || Extra Info ||
|| 1 || Sep 5 || Intro, why parallel, Vector-add in seq, OpenMP, CL || Slides || Video || Code || ||
|| 2 || Sep 12 || Vector-add in MPI, Intro to OpenMP || Slides || Video || Code || ||
|| 3 || Sep 19 || HW2, OMP subtleties || Slides || Video || || ||
|| 4 || Sep 26 || Make, Intro to OpenCL || Slides || Video || Code || ||
|| 5 || Oct 3 || Git, OpenCL sync/local, Intro to MPI || Slides || Video || Code || ||
|| 6 || Oct 10 || Gdb, MPI point-to-point || Slides || Video || Code || ||
|| 7 || Oct 17 || Valgrind, MPI collectives, Intro perf. || Slides || Video || Code || ||
|| 8 || Oct 24 || Software installation, tmux, single-thread perf. || Slides || Video || Code || ||
|| || Oct 31 ||<-5> NYU closed, no class because of Hurricane Sandy aftermath ||
|| 9 || Nov 7 || Shell scripting, single/multi-thread perf. || Slides || Video || Code || ||
|| 10 || Nov 14 || Profilers, parallel perf. || Slides || Video || Code || ||
|| 11 || Nov 21 || Advanced git, GPU perf. || Slides || Video || Code || ||
|| 12 || Nov 28 || GPU perf., patterns || Slides || Video || Code || GPU mem access patterns ||
|| 12 || Dec 5 || Parallel patterns, 3D vis. || Slides || Video || Code || ||
|| || Dec 12 ||<-5> No class, NYU legislative day. Runs on a Monday schedule. ||
|| 13 || Dec 18 || Project Presentations (part 1) ||<|2> /Projects || Video ||<|2> /Projects ||<|2> ||
|| 14 || Dec 19 || Project Presentations (part 2) || Video ||
You'll need an up-to-date version of Google Chrome to play the videos. You'll also need decent internet bandwidth for streaming (2 MBit/s should be sufficient). If your internet access is too slow, you can always right-click and download the video.
As far as the videos are concerned, Internet Explorer and Safari are not supported, because they do not understand the video format we're using.
And Chrome behaves much better than Firefox with the new lecture video player, to the point of Firefox not even starting up properly. We're currently investigating. In the meantime, please use Chrome.
Updates
- Dec 13, 2012
- I've posted the /ProjectPresentationsSchedule, now in its final form. See you next week!
- Sep 9, 2012
- We're moving to a bigger room! We'll be meeting in room 101 of Warren Weaver Hall from September 12 onward.
- Aug 21, 2012
- If you're from outside of Courant, you may encounter some difficulty registering for the class. We're fighting with the NYU administration to make this better. In the meantime, please get in touch with us. This class is most definitely open to students from other departments; NYU Albert apparently just hasn't gotten the memo.
- Aug 9, 2012
- Less than a month to go! Class starts on September 5, 2012, from 5-7pm. We've also been assigned a room. We will be meeting in Warren Weaver Hall, room 512 (but check back here just in case there are changes in the meantime). See you then!
Grading/Evaluation
If you will be taking the class for credit, there will be
- Weekly homework (60% of your grade)
- A more ambitious final project, which may be inspired by your own research needs (40% of your grade) (also see /ProjectSubmissionGuidelines)
If you're planning on auditing or just sitting in, you are more than welcome.
Homework
- Homework 1 due September 12
- Homework 2 due September 23 (originally September 19)
- Homework 3 due October 3 (updated 9/30 for sign bug in formula)
- Homework 4 due October 10
- Homework 5 due October 17
- Homework 6 due November 7 because of protracted power outage (originally November 1, then November 4 because of the storm)
- Final project presentations in class around Dec 17-19
Material
Books
- Parallel Programming: for Multicore and Cluster Systems (available in PDF form free online from within the NYU network, and from off-campus via this EZProxy link: https://ezproxy.library.nyu.edu/login?url=http://www.springerlink.com/content/978-3-642-04817-3/)
For OpenCL and GPU programming, we will also be referring to the following sources:
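|| OpenCL in Action: How to Accelerate Graphics and Computation || from NYU net || ezproxy || via publisher ||
|| OpenCL Programming Guide || from NYU net || ezproxy || via publisher ||
|| Heterogeneous Computing with OpenCL || from NYU net || ezproxy || via publisher ||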
Update 9/12: Fixed ezproxy links.
Primary source material
These are the technical standards on which this class will be based. While sometimes a bit technical, these documents define whether the programs you write are correct or, perhaps, result in undefined behavior:
- C99 specification
- OpenMP 3.1 specification (tutorial)
- MPI 3.0 specification (tutorial; warning: not up-to-date, teaches functions removed from MPI 3)
- OpenCL 1.2 specification
Secondary Sources
- Is Parallel Programming Hard, And, If So, What Can You Do About It?, edited by Paul McKenney
- Parallel Programming: lecture slides (and book) by Karypis et al. More theory-heavy (less practical? :) ) than this class
- CS267 Spring '12: lecture by Jim Demmel et al., Berkeley
- DGEMM optimization
Collected Wisdom
- ToolCheatSheet
Virtual Machine Images
- VirtualBox Downloads
- 32-bit VM image (recommended unless you're certain you've got a 64-bit OS)
- 64-bit VM image
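If you're unsure which image applies to you and your host runs Linux or macOS, the output of `uname -m` is a reasonable first hint. The sketch below only interprets that string; `suggest_image` is a made-up helper name for illustration, and the check is an assumption about your host rather than a guarantee that the 64-bit image will run.

```shell
#!/bin/sh
# Map a machine-architecture string (as printed by `uname -m`) to a VM image hint.
# suggest_image is a hypothetical helper used only for this illustration.
suggest_image() {
    case "$1" in
        x86_64|amd64)
            echo "64-bit OS: the 64-bit VM image is an option" ;;
        *)
            echo "32-bit or unknown: use the recommended 32-bit VM image" ;;
    esac
}

# Ask the running kernel for its architecture and print the hint.
suggest_image "$(uname -m)"
```

On Windows hosts `uname` is usually not available, so check the system properties there instead.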
Installing MPE into the virtual machine
MPE and Jumpshot for visualization of MPI execution were demonstrated during lecture 7. If you'd like to have those in your own virtual machine, download the following script: install-mpe.sh
Change into the directory where the script resides and start the installation:
sudo bash install-mpe.sh
If the script says
*** SUCCESSFULLY INSTALLED MPE
then you should be able to use mpecc and jumpshot from now on. Note that jumpshot seems to have a habit of creating its main window title bar underneath the top panel, so that you can't move it. A solution is to right-click the task bar entry for Jumpshot, and click "Move".
Notes:
- You need to be online for the entire run time of the script.
- Depending on your machine and internet connection, the script may take around half an hour to finish. (10 minutes just now on my laptop, but that's a fairly fast machine on a fast connection)
- It is best to leave the computer alone while it is processing the script.
- This script is only intended for the class virtual machine, and even then you are using it at your own risk. I highly recommend that you use VirtualBox to create a system restore point before you attempt the installation, in case something goes awry. Do not attempt to use this on a Mac or another Linux machine.
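After the script reports success, a quick way to confirm that both commands actually ended up on your `PATH` is sketched below. `check_tool` is a made-up helper name for illustration, not something the install script provides; only the command names `mpecc` and `jumpshot` come from the instructions above.

```shell
#!/bin/sh
# Report whether a command can be found on the PATH.
# check_tool is a hypothetical helper, not part of MPE or install-mpe.sh.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# The two programs that should be usable after a successful install.
for tool in mpecc jumpshot; do
    check_tool "$tool" || true
done
```

If either line says "missing", opening a fresh terminal before re-running the check is worth a try, in case the installation changed your `PATH`.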
Links
