Changes

From Genome Analysis Wiki
Jump to navigationJump to search
m
Line 1: Line 1:  
== Objective ==
 
== Objective ==
   −
In this winter, Biostatistics 615/815 aims for providing students with a practical understanding of computational aspects in implementing statistical methods. Although C++ language will be used throughout the course, using Java programming language for homework and project will be acceptable.
+
In Fall 2011, Biostatistics 615/815 aims for providing students with a practical understanding of computational aspects in implementing statistical methods. Although C++ language will be used throughout the course, using Java programming language for homework and project will be acceptable.
    
== Target Audience ==
 
== Target Audience ==
Line 10: Line 10:     
== Textbook ==
 
== Textbook ==
* Required Textbook : Cormen, Leiserson, Rivest, and Stein, "Introduction to Algorithms", Third Edition, The MIT Press, 2009 [http://mitpress.mit.edu/algorithms/ [Official Book Web Site]]
+
* Recommended Textbook : Cormen, Leiserson, Rivest, and Stein, "Introduction to Algorithms", Third Edition, The MIT Press, 2009 [http://mitpress.mit.edu/algorithms/ [Official Book Web Site]]
 
* Optional Textbook : Press, Teukolsky, Vetterling, Flannery, "Numerical Recipes", 3rd Edition, Cambridge University Press, 2007 [http://www.nr.com/ [Official Book Web Site]]
 
* Optional Textbook : Press, Teukolsky, Vetterling, Flannery, "Numerical Recipes", 3rd Edition, Cambridge University Press, 2007 [http://www.nr.com/ [Official Book Web Site]]
    
== Class Schedule ==
 
== Class Schedule ==
   −
Classes are scheduled for Tuesday and Thursdays, 8:30 - 10:00 am at SPH II M4318
+
Classes are scheduled for Tuesday and Thursdays, 8:30 - 10:00 am at SPH II M4332
    
== Topics ==
 
== Topics ==
Line 21: Line 21:  
The following contents are planned to be covered.  
 
The following contents are planned to be covered.  
   −
=== Part I : Algorithms 101 ===
+
=== Part I : C++ Basics and Introductory Algorithms ===
* Understanding of Computational Time Complexity
+
* Computational Time Complexity
 
* Sorting
 
* Sorting
 
* Divide and Conquer Algorithms
 
* Divide and Conquer Algorithms
Line 28: Line 28:  
* Key Data Structure
 
* Key Data Structure
 
* Dynamic Programming
 
* Dynamic Programming
 +
* Hidden Markov Models
   −
=== Part II : Matrix Operations and Numerical Optimizations ===
+
=== Part II : Numerical Methods and Randomized Algorithms ===
* Matrix decomposition (LU, QR, SVD)
+
* Random Numbers
* Implementation of Linear Models
+
* Matrix Operations and Least Square Methods
* Numerical Optimizations
+
* Importance Sampling
 
  −
=== Part III : Advanced Statistical Methods ===
  −
* Hidden Markov Models
   
* Expectation Maximization
 
* Expectation Maximization
 
* Markov-Chain Monte Carlo Methods
 
* Markov-Chain Monte Carlo Methods
 +
* Simulated Annealing
 +
* Gibbs Sampling
    
== Class Notes ==
 
== Class Notes ==
* Lecture 1 : Statistical Computing -- [[Media:Biostat615-lecture1-handout.pdf | (Handout mode - PDF)]] [[Media:Biostat615-lecture1-presentation.pdf | (Presentation mode - PDF)]]
+
* Lecture 1 : Introduction to Statistical Computing -- [[Media:Biostat615-Fall2011-lecture01-handout.pdf | (Handout PDF)]] [[Media:Biostat615-Fall2011-lecture01-presentation.pdf | (Presentation PDF)]]
* Lecture 2 : C++ Basics and Precisions -- [[Media:Biostat615-lecture2-handout-nup.pdf | (Handout mode - PDF)]] [[Media:Biostat615-lecture2-presentation.pdf | (Presentation mode - PDF)]]
+
* Lecture 2 : Introduction to C++ Programming -- [[Media:Biostat615-Fall2011-lecture02-handout.pdf | (Handout PDF)]] [[Media:Biostat615-Fall2011-lecture02-presentation.pdf | (Presentation PDF)]]
* Lecture 3 : Implementing Fisher's Exact Test -- [[Media:Biostat615-lecture3-nup.pdf | (Handout mode - PDF)]] [[Media:Biostat615-lecture3.pdf | (Presentation mode - PDF)]]
+
* Lecture 3 : C++ Basics and Fisher's Exact Test -- [[Media:Biostat615-Fall2011-lecture03-handout.pdf | (Handout PDF)]] [[Media:Biostat615-Fall2011-lecture03-presentation.pdf | (Presentation PDF)]]
* Lecture 4 : Classes and STLs -- [[Media:Biostat615-lecture4-nup.pdf | (Handout mode - PDF)]] [[Media:Biostat615-lecture4-presentation.pdf | (Presentation mode - PDF)]]
+
* Lecture 4 : Standard Template Libraries + Divide and Conquer Algorithms-- [[Media:Biostat615-Fall2011-lecture04-handout.pdf | (Handout PDF)]] [[Media:Biostat615-Fall2011-lecture04-presentation.pdf | (Presentation PDF)]]
* Lecture 5 : Divide and Conquer Algorithms -- [[Media:Biostat615-lecture5-nup.pdf | (Handout mode - PDF)]] [[Media:Biostat615-lecture5.pdf | (Presentation mode - PDF)]]
+
* Lecture 5 : Sorting Algorithms -- [[Media:Biostat615-Fall2011-lecture05-handout.pdf | (Handout PDF)]] [[Media:Biostat615-Fall2011-lecture05-presentation.pdf | (Presentation PDF)]]
* Lecture 6 : Linear Sorting Algorithms and Elementary Data Structures -- [[Media:Biostat615-lecture6-nup.pdf | (Handout mode - PDF)]] [[Media:Biostat615-lecture6-presentation.pdf | (Presentation mode - PDF)]]
+
* Lecture 6 : Sorting Algorithms & Arrays -- [[Media:Biostat615-Fall2011-lecture06-handout.pdf | (Handout PDF)]] [[Media:Biostat615-Fall2011-lecture06-presentation.pdf | (Presentation PDF)]]
* Lecture 7 : Data Structures -- [[Media:Biostat615-lecture7-nup.pdf | (Handout mode - PDF)]] [[Media:Biostat615-lecture7-presentation.pdf | (Presentation mode - PDF)]]
+
* Lecture 7 : List and Binary Search Trees -- [[Media:Biostat615-Fall2011-lecture07-handout.pdf | (Handout PDF)]] [[Media:Biostat615-Fall2011-lecture07-presentation.pdf | (Presentation PDF)]]
* Lecture 8 : Hash Tables -- [[Media:Biostat615-lecture8-nup.pdf | (Handout mode - PDF)]] [[Media:Biostat615-lecture8-presentation.pdf | (Presentation mode - PDF)]]
+
* Lecture 8 : Hash Tables -- [[Media:Biostat615-Fall2011-lecture08-handout.pdf | (Handout PDF)]] [[Media:Biostat615-Fall2011-lecture08-presentation.pdf | (Presentation PDF)]]
* Lecture 9 : Dyamic Programming -- [[Media:Biostat615-lecture9-nup.pdf | (Handout mode - PDF)]] [[Media:Biostat615-lecture9-presentation.pdf | (Presentation mode - PDF)]]
+
* Lecture 9 : Dynamic Programming -- [[Media:Biostat615-Fall2011-lecture09-handout.pdf | (Handout PDF)]] [[Media:Biostat615-Fall2011-lecture09-presentation.pdf | (Presentation PDF)]]
* Lecture 10 : Boost Libraries and Graphical Algorithms -- [[Media:Biostat615-lecture10-nup.pdf | (Handout mode - PDF)]] [[Media:Biostat615-lecture10-presentation.pdf | (Presentation mode - PDF)]]
+
* Review : Dynamic Programming & Midterm Review -- [[Media:Biostat615-Fall2011-midterm-2011-winter-.pdf | (PDF)]]
* Lecture 11 : Hidden Markov Models -- [[Media:Biostat615-lecture11-nup.pdf | (Handout mode - PDF)]] [[Media:Biostat615-lecture11.pdf | (Presentation mode - PDF)]]
+
* Lecture 10 : Hidden Markov Model-- [[Media:Biostat615-Fall2011-lecture10-handout.pdf | (PDF)]] '''(UPDATED on Oct 29th at 12:51PM)'''
 +
* Lecture 11 : Hidden Markov Model (cont'd) -- [[Media:Biostat615-lecture11-2011-10-20.pdf | (PDF)]] '''(UPDATED on Oct 28th at 1:42PM)'''
 +
* Lecture 12 : Boost Library & Random Numbers -- [[Media:Biostat615-lecture12-2011-10-25.pdf | (PDF)]]
 +
* Lecture 13 : Single dimensional optimization -- [[Media:Biostat615-lecture13-2011-10-27.pdf | (PDF)]]
 +
* Lecture 14 : Single and multi dimensional optimizations -- [[Media:Biostat615-fall2011-lecture14.pdf | (PDF)]] (Updated on Nov 3rd 1:25AM)
 +
* Lecture 15 : Multi dimensional optimizations -- [[Media:Biostat615-fall2011-lecture15.pdf | (PDF)]] (Updated Nov 8 10:35AM)
 +
* Lecture 16 : E-M algorithm -- [[Media:Biostat615-fall2011-lecture16.pdf | (PDF)]] (Updated Nov 8 10:35AM)
 +
* Lecture 17 : Simulated Annealing -- [[Media:Biostat615-fall2011-lecture17.pdf | (PDF)]]
 +
* Lecture 18 : Gibbs Sampling -- [[Media:Biostat615-fall2011-lecture18.pdf | (PDF)]] (Updated Nov 16 10:00PM)
 +
* Lecture 19 : Importace Sampling -- [[Media:Biostat615-fall2011-lecture19.pdf | (PDF)]]
 +
* Lecture 20 : Advanced Hidden Markov Models -- [[Media:Biostat615-fall2011-lecture20.pdf | (PDF)]]
 +
* Lecture 21 : Linear Algebra in C++ -- [[Media:Biostat615-fall2011-lecture21.pdf | (PDF)]]
 +
* Lecture 22 : More Linear Algebra in C++ -- [[Media:Biostat615-fall2011-lecture22.pdf | (PDF)]]
 +
* Lecture 23 : Interfacing between C++ and R -- [[Media:Biostat615-fall2011-lecture23.pdf | (PDF)]]
 +
* Review : Final Review -- [[Media:Biostat615-winter2011-final.pdf | (PDF)]] [[Media:Biostat615-homework-review.pdf | (Homework)]]
    
== Problem Sets ==
 
== Problem Sets ==
* Problem Set 1 : Due on January 20th, 2011 -- [[Media:Biostat615-homework1.pdf | (Problem Set 1 - PDF)]]
+
* Problem Set 0 - Running screenshots of helloWorld.cpp and towerOfHanoi.cpp - Due before the submission of Problem Set 1
* Problem Set 2 : Due on February 8th, 2011 -- [[Media:Biostat615-homework2.pdf | (Problem Set 2 - PDF)]]
+
* Problem Set 1 -- Due on Tuesday September 27th, 2011 [[Media:Biostat615-Fall2011-homework01.pdf | (PDF)]] [[Media:Biostat615-Fall2011-homework01-solutions.pdf | (PDF-SOLUTIONS)]]
 +
* Problem Set 2 -- Due on Thursday October 6th, 2011 [[Media:Biostat615-Fall2011-homework02.pdf | (PDF)]] [[Media:Biostat615-Fall2011-homework02-solutions.pdf | (PDF-SOLUTIONS)]]
 +
** (Update Oct 2, 2011 : Note that the problem 1 and 3 are slightly updated for clarification)
 +
** (If you can't decompress the files above properly, use this alternative link by [http://dl.dropbox.com/u/1850834/biostat615-homework02-datasets.tar.gz CLICKING HERE] )
 +
* Problem Set 3 -- Due on Tuesday November 1st, 2011 [[Media:Biostat615-homework03.pdf | (PDF)]] (UPDATED on Oct 25th at 11:10AM)
 +
* Problem Set 4 -- Due on Tuesday November 15th, 2011 [[Media:Biostat615-homework04.pdf | (PDF)]]
 +
* Problem Set 5 -- Due on Tuesday November 29th, 2011 [[Media:Biostat615-fall2011-homework05.pdf | (PDF)]]
 +
* Problem Set 6 -- Due on Tuesday December 13th, 2011 [[Media:Biostat615-fall2011-homework06.pdf | (PDF)]]  
 +
** [http://www.sph.umich.edu/csg/abecasis/class/2006/ModelFittingData.txt DOWNLOAD DATA FOR PROBLEM 1]
 +
** [http://dl.dropbox.com/u/1850834/zip_01.zip DOWNLOAD DATA FOR PROBLEM 3]
 +
 
 +
=== Supplementary Data sets for Problem Sets ===
 +
* Problem Set 2
 
** [[Media:Shuf-1M.txt.gz| (Example data - shuf-1M.txt.gz)]] 1,000,000 randomly shuffled data (gzipped)
 
** [[Media:Shuf-1M.txt.gz| (Example data - shuf-1M.txt.gz)]] 1,000,000 randomly shuffled data (gzipped)
 
** [[Media:Rand-1M-3digits.txt.gz| (Example data - Rand-1M-3digits.txt.gz)]] 1,000,000 random data from 1 to 1,000]] (gzipped)  
 
** [[Media:Rand-1M-3digits.txt.gz| (Example data - Rand-1M-3digits.txt.gz)]] 1,000,000 random data from 1 to 1,000]] (gzipped)  
 
** [[Media:Rand-50k.txt.gz | (Example data - Rand-50k.txt.gz)]] 50,000 random data from 1 to 1,000,000)]] (gzippd)
 
** [[Media:Rand-50k.txt.gz | (Example data - Rand-50k.txt.gz)]] 50,000 random data from 1 to 1,000,000)]] (gzippd)
* Problem Set 3 : Due on February 17th, 2011 -- [[Media:Biostat615-homework3.pdf | (Problem Set 3 - PDF)]]
+
* Problem Set 3
** For problem 2 - Refer to the complete code for Visual C++ users [[Biostatistics 615/815: Main Page#Code_for_Problem_3_-_Problem_2]]
+
** Example output data for problem 3-1 (input is the second column) '''(NOTE : ADDED on Oct 25 11:45PM)''' -- This is also reflected in lecture 11 class note.
 
+
  TIME TOSS P(FAIR) P(BIAS) MLSTATE
=== Source code from Homework 2 - Problem 3 ===
+
1 H 0.5950 0.4050 FAIR
#include <iostream>
+
2 T 0.8118 0.1882 FAIR
#include <vector>
+
  3 H 0.8071 0.1929 FAIR
#include <ctime>
+
  4 T 0.8584 0.1416 FAIR
#include <fstream>
+
  5 H 0.7613 0.2387 FAIR
#include <set>
+
  6 H 0.7276 0.2724 FAIR
#include "mySortedArray.h"
+
  7 T 0.7495 0.2505 FAIR
#include "myTree.h"
+
  8 H 0.5413 0.4587 BIASED
  #include "myList.h"                                           
+
  9 H 0.4187 0.5813 BIASED
+
  10 H 0.3533 0.6467 BIASED
int main(int argc, char** argv) {
+
  11 H 0.3301 0.6699 BIASED
  int tok;
+
  12 H 0.3436 0.6564 BIASED
  std::vector<int> v;
+
  13 H 0.3971 0.6029 BIASED
  if ( argc > 1 ) {
+
  14 T 0.5028 0.4972 BIASED
    std::ifstream fin(argv[1]);
+
  15 H 0.3725 0.6275 BIASED
    while( fin >> tok ) { v.push_back(tok); }
+
  16 H 0.2985 0.7015 BIASED
    fin.close();
+
  17 H 0.2635 0.7365 BIASED
  }
+
  18 H 0.2596 0.7404 BIASED
  else {
+
  19 H 0.2858 0.7142 BIASED
    while( std::cin >> tok ) { v.push_back(tok); }
+
  20 H 0.3482 0.6518 BIASED
  }                                           
+
** Example output data for problem 3-2 (input is the second column) '''(NOTE : UPDATED on Oct 25 11:23PM)'''
  mySortedArray<int> c1;
+
  TIME TOSS Pr(F) Pr(HB) Pr(TB) MLSTATE
  myList<int> c2;
+
  1 T 0.8844 0.0326 0.0830 FAIR
  myTree<int> c3;
+
  2 H 0.9012 0.0791 0.0198 FAIR
  std::set<int> s;
+
  3 H 0.9075 0.0735 0.0189 FAIR
   
+
  4 T 0.9091 0.0145 0.0764 FAIR
  clock_t start = clock();
+
  5 T 0.9068 0.0114 0.0818 FAIR
  for(int i=0; i < (int)v.size(); ++i) {
+
  6 H 0.9058 0.0440 0.0502 FAIR
    c1.insert(v[i]);
+
  7 T 0.8834 0.0275 0.0891 FAIR
  }
+
  8 H 0.8520 0.0698 0.0783 FAIR
  clock_t finish = clock();
+
  9 T 0.7713 0.0347 0.1940 FAIR
  double duration = (double)(finish-start)/CLOCKS_PER_SEC;  
+
  10 T 0.6927 0.0823 0.2249 FAIR
  std::cout << "Sorted Array (Insert) " << duration << std::endl;
+
  11 H 0.4730 0.4984 0.0286 HEAD-BIASED
   
+
  12 H 0.3227 0.6706 0.0066 HEAD-BIASED
  start = clock();
+
  13 H 0.2236 0.7726 0.0037 HEAD-BIASED
  for(int i=0; i < (int)v.size(); ++i) {
+
  14 H 0.1589 0.8381 0.0031 HEAD-BIASED
    c2.insert(v[i]);
+
  15 H 0.1169 0.8803 0.0028 HEAD-BIASED
  }
+
  16 H 0.0902 0.9072 0.0026 HEAD-BIASED
  finish = clock();
+
  17 H 0.0740 0.9235 0.0025 HEAD-BIASED
  duration = (double)(finish-start)/CLOCKS_PER_SEC;  
+
  18 H 0.0654 0.9321 0.0025 HEAD-BIASED
  std::cout << "List (Insert) " << duration << std::endl;
+
  19 H 0.0630 0.9346 0.0025 HEAD-BIASED
 
+
20 H 0.0661 0.9314 0.0025 HEAD-BIASED
  start = clock();
+
  21 H 0.0755 0.9219 0.0026 HEAD-BIASED
  for(int i=0; i < (int)v.size(); ++i) {
+
22 H 0.0926 0.9038 0.0036 HEAD-BIASED
    c3.insert(v[i]);
+
  23 H 0.1204 0.8684 0.0113 HEAD-BIASED
  }
+
24 H 0.1603 0.7586 0.0811 HEAD-BIASED
  finish = clock();
+
  25 T 0.1904 0.0858 0.7238 TAIL-BASED
  duration = (double)(finish-start)/CLOCKS_PER_SEC;  
+
  26 T 0.1819 0.0118 0.8063 TAIL-BASED
  std::cout << "Tree (Insert) " << duration << std::endl;
+
  27 T 0.1797 0.0036 0.8167 TAIL-BASED
   
+
  28 T 0.1894 0.0028 0.8077 TAIL-BASED
  start = clock();
+
  29 T 0.2136 0.0038 0.7826 TAIL-BASED
  for(int i=0; i < (int)v.size(); ++i) {
+
  30 T 0.2561 0.0123 0.7317 TAIL-BASED
    s.insert(v[i]);
+
** Example input/output data for problem 3-3 (Applying 2-state HMM in Problem 3-1): Download using [http://dl.dropbox.com/u/1850834/biostat615-homework3-3-20k-examples.zip THIS LINK]
  }
  −
  finish = clock();
  −
  duration = (double)(finish-start)/CLOCKS_PER_SEC;  
  −
  std::cout << "std::set (Insert) " << duration << std::endl;
  −
   
  −
  start = clock();
  −
  for(int i=0; i < (int)v.size(); ++i) {
  −
    c1.search(v[i]);
  −
  }
  −
  finish = clock();
  −
  duration = (double)(finish-start)/CLOCKS_PER_SEC;  
  −
  std::cout << "Sorted Array (Search) " << duration << std::endl;
  −
   
  −
  start = clock();
  −
  for(int i=0; i < (int)v.size(); ++i) {
  −
    c2.search(v[i]);
  −
  }
  −
  finish = clock();
  −
  duration = (double)(finish-start)/CLOCKS_PER_SEC;  
  −
  std::cout << "List (Search) " << duration << std::endl;
  −
   
  −
  start = clock();
  −
  for(int i=0; i < (int)v.size(); ++i) {
  −
    c3.search(v[i]);
  −
  }
  −
  finish = clock();
  −
  duration = (double)(finish-start)/CLOCKS_PER_SEC;  
  −
  std::cout << "Tree (Search) " << duration << std::endl;
  −
   
  −
  start = clock();
  −
  for(int i=0; i < (int)v.size(); ++i) {
  −
    s.find(v[i]);
  −
  }
  −
  finish = clock();
  −
  duration = (double)(finish-start)/CLOCKS_PER_SEC;  
  −
  std::cout << "std::set (Search) " << duration << std::endl;
  −
  }
  −
 
  −
=== The header file of mySortedArray.h (For Homework 2 Problem 3)===
  −
 
  −
  #include <iostream>
  −
  #define DEFAULT_ALLOC 1024
  −
template <class T> // template supporting a generic type
  −
class mySortedArray {
  −
  protected:    // member variables hidden from outside
  −
    T *data;    // array of the genetic type
  −
    int size;  // number of elements in the container
  −
    int nalloc; // # of objects allocated in the memory
  −
  mySortedArray(mySortedArray& a) {};
  −
  int search(const T& x, int begin, int end);
  −
  public:      // abstract interface visible to outside
  −
    mySortedArray();        // default constructor
  −
    ~mySortedArray();        // desctructor
  −
    void insert(const T& x); // insert an element x
  −
    int search(const T& x); // search for an element x and return its location
  −
    bool remove(const T& x); // delete a particular element
  −
  };
  −
   
  −
  template <class T>
  −
  mySortedArray<T>::mySortedArray() {    // default constructor
  −
  size = 0;              // array do not have element initially
  −
  nalloc = DEFAULT_ALLOC;
  −
  data = new T[nalloc]; // allocate default # of objects in memory
  −
  }
  −
  −
template <class T>
  −
  mySortedArray<T>::~mySortedArray() {    // destructor
  −
  if ( data != NULL ) {
  −
    delete [] data;      // delete the allocated memory before destorying
  −
  }                      // the object. otherwise, memory leak happens
  −
  }
  −
  −
template <class T>
  −
  void mySortedArray<T>::insert(const T& x) {
  −
  if ( size >= nalloc ) { // if container has more elements than allocated
  −
    T* newdata = new T[nalloc*2];  // make an array at doubled size
  −
    for(int i=0; i < nalloc; ++i) {
  −
      newdata[i] = data[i];        // copy the contents of array
  −
    }
  −
    delete [] data;                // delete the original array
  −
    data = newdata;                // and reassign data ptr
  −
  }
  −
  −
  int i;
  −
  for(i=size-1; (i >= 0) && (data[i] > x); --i) {
  −
    data[i+1] = data[i];            // insert the list into right position
  −
  }
  −
  data[i+1] = x;
  −
  ++size;                          // increase the size
  −
  }
  −
  −
template <class T>
  −
int mySortedArray<T>::search(const T& x) {
  −
  return search(x, 0, size-1);
  −
  }
  −
  −
template <class T>
  −
int mySortedArray<T>::search(const T& x, int begin, int end) {
  −
  if ( begin > end ) {
  −
    return -1;
  −
  }
  −
  else {
  −
    int mid = (begin+end)/2;
  −
    if ( data[mid] == x ) {
  −
      return mid;
  −
    }
  −
    else if ( data[mid] < x ) {
  −
      return search(x, mid+1, end);
  −
    }
  −
    else {
  −
      return search(x, begin, mid);
  −
    }
  −
  }
  −
  }
  −
  −
template <class T>
  −
bool mySortedArray<T>::remove(const T& x) {
  −
  int i = search(x);  // try to find the element
  −
  if ( i >= 0 ) {      // if found
  −
    for(int j=i; j < size-1; ++j) {
  −
      data[i] = data[i+1]; // shift all the elements by one
  −
    }
  −
    --size;          // and reduce the array size
  −
    return true;      // successfully removed the value
  −
  }
  −
  else {
  −
    return false;    // cannot find the value to remove
  −
  }
  −
  }
  −
 
  −
=== Example code to generare all possible permutations ===
  −
 
  −
Note that http://www.cplusplus.com/reference/algorithm/next_permutation/ provides much simpler solution for generating permutation. (Thanks to Matthew Snyder)
  −
 
  −
  // Code to generate permutations from 1..n
  −
// These code was adopted from
  −
// http://geeksforgeeks.org/?p=767 by Hyun Min Kang
  −
#include <iostream>
  −
#include <vector>
  −
  −
// swaps two elements
  −
void swap(std::vector<int>& v, int i, int j) {
  −
  int tmp = v[i];
  −
  v[i] = v[j];
  −
  v[j] = tmp;
  −
  return;
  −
}
  −
  −
// actual engine - recursively calls permutation
  −
  void permute(std::vector<int>& v, int from, int to) {
  −
  if ( from == to ) {          // print the permutation
  −
    for(int i=0; i < v.size(); ++i) {
  −
      std::cout << " " << v[i];
  −
    }
  −
    std::cout << std::endl;
  −
  }
  −
  else {
  −
    for(int i = from; i <= to; ++i) {
  −
      swap(v,from,i);          // swaps two elements
  −
      permute(v, from+1, to);  // permute the rest
  −
      swap(v,from,i);          // recover the vector to the original state
  −
    }
  −
  }
  −
}
  −
  −
// Permutation Example
  −
int main(int argc, char** argv) {
  −
  if ( argc != 2 ) {
  −
    std::cerr << "Usage: " << argv[0] << " " << "[# of samples]" << std::endl;
  −
    return -1;
  −
  }
  −
   
  −
  // takes input from 1..n
  −
  int n = atoi(argv[1]);
  −
  std::vector<int> input;
  −
  for(int i=1; i <= n; ++i) {
  −
    input.push_back(i);
  −
  }
  −
  permute(input, 0, n-1); // print all possible permutations
  −
  return 0;
  −
}
  −
 
  −
#include <iostream>                      // for input/output
  −
#include <boost/graph/adjacency_list.hpp> // for using graph type
  −
#include <boost/graph/dijkstra_shortest_paths.hpp> // for dijkstra algorithm
  −
using namespace std;  // allow to omit prefix 'std::'
  −
using namespace boost; // allow to omit prefix 'boost::'
  −
  −
int getNodeID(int node) {
  −
  return ((node/5)+1)*10+(node%5+1);
  −
}
  −
 
  −
=== Code for Problem 3 - Problem 2 ===
  −
   
  −
int main(int argc, char** argv) {
  −
  // defining a graph type
  −
  // 1. edges are stored as std::list internally
  −
  // 2. verticies are stored as std::vector internally
  −
  // 3. the graph is directed (undirectedS, bidirectionalS can be used)
  −
  // 4. vertices do not carry particular properties
  −
  // 5. edges contains weight property as integer value
  −
  typedef adjacency_list< listS, vecS, directedS, no_property, property< edge_weight_t, int> > graph_t;
  −
  −
  // vertex_descriptor is a type for representing vertices
  −
  typedef graph_traits< graph_t >::vertex_descriptor vertex_descriptor;
  −
  // a nodes is represented as an integer, and an edge is a pair of integers
  −
  typedef std::pair<int, int> E;
  −
  −
  // Connect between verticies as in the Manhattan Tourist Problem
  −
  // Each node is numbered as a two-digit integer of [#row] and [#col]
  −
  enum { N11, N12, N13, N14, N15,
  −
N21, N22, N23, N24, N25,
  −
N31, N32, N33, N34, N35,
  −
N41, N42, N43, N44, N45,
  −
N51, N52, N53, N54, N55 };
  −
   
  −
  E edges [] = { E(N11,N12), E(N12,N13), E(N13,N14), E(N14,N15),
  −
E(N21,N22), E(N22,N23), E(N23,N24), E(N24,N25),
  −
E(N31,N32), E(N32,N33), E(N33,N34), E(N34,N35),
  −
E(N41,N42), E(N42,N43), E(N43,N44), E(N44,N45),
  −
E(N51,N52), E(N52,N53), E(N53,N54), E(N54,N55),
  −
E(N11,N21), E(N12,N22), E(N13,N23), E(N14,N24), E(N15,N25),
  −
E(N21,N31), E(N22,N32), E(N23,N33), E(N24,N34), E(N25,N35),
  −
E(N31,N41), E(N32,N42), E(N33,N43), E(N34,N44), E(N35,N45),
  −
E(N41,N51), E(N42,N52), E(N43,N53), E(N44,N54), E(N45,N55) };
  −
  // Assign weights for each edge
  −
  int weight [] = { 4, 2, 0, 7,    // horizontal weights
  −
    7, 4, 5, 9,
  −
    6, 8, 1, 0,
  −
    1, 6, 4, 7,
  −
    1, 5, 8, 5,
  −
    0, 6, 6, 2, 4,  // vertical weights
  −
    9, 7, 1, 0, 6,
  −
    1, 8, 4, 8, 9,
  −
    3, 6, 6, 0, 7 };
  −
   
  −
  // define a graph as an array of edges and weights
  −
#if defined(BOOST_MSVC) && BOOST_MSVC <= 1300 // special routine for MS VC++
  −
  graph_t g(25);
  −
  for (std::size_t j = 0; j < sizeof(edges)/sizeof(E); ++j) {
  −
    edge_descriptor e;
  −
    bool inserted;
  −
    boost::tie(e, inserted) = add_edge(edge_array[j].first, edge_array[j].second, g);
  −
  }
  −
  #else  // for regulat compilers
  −
  graph_t g(edges, edges + sizeof(edges) / sizeof(E), weight, 25);
  −
  #endif
  −
   
  −
  std::vector<vertex_descriptor> p(num_vertices(g));
  −
  std::vector<int> d(num_vertices(g));
  −
  vertex_descriptor s = vertex(N11, g);
  −
   
  −
#if defined(BOOST_MSVC) && BOOST_MSVC <= 1300 // for VC++
  −
  property_map<graph_t, vertex_index_t>::type indexmap = get(vertex_index, g);
  −
  dijkstra_shortest_paths(g, s, &p[0], &d[0], weightmap, indexmap,
  −
                          std::less<int>(), closed_plus<int>(),
  −
                          (std::numeric_limits<int>::max)(), 0,
  −
                          default_dijkstra_visitor());
  −
  #else  // for regular compilers
  −
  dijkstra_shortest_paths(g, s, predecessor_map(&p[0]).distance_map(&d[0]));
  −
#endif
  −
  graph_traits < graph_t >::vertex_iterator vi, vend;
  −
  −
  std::cout << "Backtracking the optimal path from the destination to source" << std::endl;
  −
  for(int node = N55; node != N11; node = p[node]) {
  −
    std::cout << "Path: N" << getNodeID(p[node]) << " -> N" << getNodeID(node) << ", Distance from origin is " << d[node] << std::endl;
  −
  }
  −
  return 0;
  −
}
      
== Office Hours ==
 
== Office Hours ==
* Friday 9:30AM-12:30PM
+
* Friday 9:00AM-10:30PM
 
  −
== Information on Biostatistics Cluster ==
  −
* You may want to request an account to the biostatistics cluster. Refer to https://webservices.itcs.umich.edu/mediawiki/Biostatistics/index.php/Cluster for further information.
      
== Standards of Academic Conduct ==
 
== Standards of Academic Conduct ==
Line 399: Line 153:     
== Course History ==
 
== Course History ==
 
+
* Winter 2011 Course Web Site [[Biostatistics_615/815_Winter_2011]]
Goncalo Abecasis taught it in several academic years previously. For previous course notes, see [[http://www.sph.umich.edu/csg/abecasis/class Goncalo's older class notes]].
+
* Goncalo Abecasis taught it in several academic years previously. For previous course notes, see [[http://www.sph.umich.edu/csg/abecasis/class Goncalo's older class notes]].

Navigation menu