
This project forked from udacity/cs344


Introduction to Parallel Programming class code



About this fork

This is my fork of the GitHub repository supplied by Udacity for their cs344 course, "Introduction to Parallel Programming".

At the time of updating this README, I've only attempted the first two problem sets. If I have time and think it's relevant to my research, I'll return to attempt the others at a later date.

My changes to CMakeLists.txt for including OpenCV are not ideal: something about my built-from-source Windows 10 OpenCV installation apparently doesn't match what find_package() expects. Since I don't currently have time to debug that, I went with a hard-coded path for now. The way CUDA_NVCC_FLAGS is set could probably also be improved, but I don't have time to research that at the moment.

For other notes/info, see the "otherPics" directory, as well as some .txt notes I've made in the various Lesson Code Snippets and Problem Sets directories. There are also notes/files I've left for my own reference in the "stuffToGitignore" directory, but I have not committed them, primarily out of IP/copyright concerns.

TODO

My current HW2 solution makes pixels near any of the four block edges responsible for copying the "halo" pixels outside of the block into shared memory. Another solution I found online shifts things instead: most pixels copy just one pixel, but it's the one offset by (-halfFilterWidth, -halfFilterWidth) from their own coordinates rather than the one exactly matching them. Then the pixels within filterWidth - 1 of the two "max" edges copy extra pixels to the sides as needed. After doing some reading about what warps are, I realize this way is probably better for minimizing warp divergence, so I'd probably rewrite mine to match.
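To make the shifted scheme concrete, here is a rough CPU-side sketch in C++ that emulates one thread block filling its shared-memory tile. It is not the code from either solution: the names loadTileShifted and readClamped, the clamp-to-edge border handling, the square block, and the requirement that the block be at least filterWidth - 1 wide are all assumptions made for illustration. In a real kernel the two loops would be replaced by threadIdx.x / threadIdx.y, the tile would live in shared memory, and a __syncthreads() would follow the loads.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: read a pixel from a row-major image, clamping
// out-of-range coordinates to the nearest edge (an assumed border policy).
inline float readClamped(const std::vector<float>& img, int cols, int rows,
                         int x, int y) {
    x = std::min(std::max(x, 0), cols - 1);
    y = std::min(std::max(y, 0), rows - 1);
    return img[y * cols + x];
}

// Hypothetical CPU emulation of one blockDim x blockDim thread block loading
// a (blockDim + filterWidth - 1)^2 tile using the "shifted" scheme.
std::vector<float> loadTileShifted(const std::vector<float>& img,
                                   int cols, int rows,
                                   int blockX, int blockY,
                                   int blockDim, int filterWidth) {
    const int half = filterWidth / 2;      // halfFilterWidth
    const int extra = filterWidth - 1;     // total halo width per axis
    const int tileDim = blockDim + extra;
    // Assumptions of this sketch: odd filter, block wide enough to cover the halo.
    assert(filterWidth % 2 == 1 && blockDim >= extra);

    std::vector<float> tile(tileDim * tileDim, 0.0f);

    // Each (tx, ty) iteration stands in for one thread of the block.
    for (int ty = 0; ty < blockDim; ++ty) {
        for (int tx = 0; tx < blockDim; ++tx) {
            // Every thread loads the pixel shifted by (-half, -half).
            const int srcX = blockX * blockDim + tx - half;
            const int srcY = blockY * blockDim + ty - half;
            tile[ty * tileDim + tx] = readClamped(img, cols, rows, srcX, srcY);

            // Threads within filterWidth - 1 of the two "max" edges load extras.
            const bool nearMaxX = tx >= blockDim - extra;
            const bool nearMaxY = ty >= blockDim - extra;
            if (nearMaxX) {
                tile[ty * tileDim + (tx + extra)] =
                    readClamped(img, cols, rows, srcX + extra, srcY);
            }
            if (nearMaxY) {
                tile[(ty + extra) * tileDim + tx] =
                    readClamped(img, cols, rows, srcX, srcY + extra);
            }
            if (nearMaxX && nearMaxY) {
                tile[(ty + extra) * tileDim + (tx + extra)] =
                    readClamped(img, cols, rows, srcX + extra, srcY + extra);
            }
        }
    }
    return tile;
}
```

With this layout, tile cell (c, r) ends up holding the image pixel at (blockX * blockDim + c - half, blockY * blockDim + r - half), so the output pixel of thread (tx, ty) sits at tile cell (tx + half, ty + half). The branching is confined to threads near the two "max" edges, which is the property that seems likely to help with divergence, though I haven't measured it.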

And then, of course, I need to finish all of the problems after HW2 at some point.

I'd also like to improve CMakeLists.txt, especially the CUDA_ARCHITECTURES part.

Good other forks to look at:

The README for https://github.com/ernestyalumni/cs344 looks super comprehensive, especially in its discussion of HW2. It also links a paper that analyzes optimal block sizes for stencils: "Demystifying the 16 x 16 thread-block for stencils on the GPU" by Tabik et al.

The README of https://github.com/ilyakava/cs344 seems to contain a really good list of other resources.

The contents of Udacity's original README are below.


cs344
=====

Introduction to Parallel Programming class code

Building on OS X

These instructions are for OS X 10.9 "Mavericks".

  • Step 1. Build and install OpenCV. The best way to do this is with Homebrew. However, you must slightly alter the Homebrew OpenCV installation; you must build it with libstdc++ (instead of the default libc++) so that it will properly link against the nVidia CUDA dev kit. This entry in the Udacity discussion forums describes exactly how to build a compatible OpenCV.

  • Step 2. You can now create 10.9-compatible makefiles, which will allow you to build and run your homework on your own machine:

mkdir build
cd build
cmake ..
make

