Greedy Algorithms

Interval Scheduling

The Problem

Given a set of jobs with a start and end time. Find the largest set of jobs such that no two jobs are running at the same time (the jobs are compatible).

Input: A Set of jobs with a start and end time

Output: The largest set of jobs such that all are compatible with each other

Strategy

  • Sort the jobs by end time in ascending order
  • Create an empty set of selected jobs
  • Iterate though the set of jobs
    • If a job is compatible with the current selected jobs then add it
Checking For compatibility
  • keep a record of the last job added
  • if the start time is after, or at the finish time of the last job added the jobs are compatible

The set is compatible with itself and the last job added has the last finish time so any job that occurs after the finish is compatible with the rest fo the set.

Pseudocode

jobs <- The available jobs with .start and .end

jobs = jobs.sort(job.end) // Sort the job by end time in increasing order O(nlogn)
selected = [] // empty set
for( job : jobs ){
    // the first job is added by default
    if (selected.length ==0 || last.finish <= job.start){
        selected.add(job)
        last=job
    }
}
return selected // An array of selected Jobs

Proof

It is necessary to prove that this algorithm generates the most efficient set.

proof by contradiction

  • Assume the set (\(i_1,i_2,i_3...i_n\)) generated by the algorithm is not the most efficient
  • The most efficient set (\(j_1,j_2,j_3...j_m\)) also exists such that the largest \(r\) can be found
  • Select an \(r\) such that \(\forall v < r : i_v = j_v \land i_r \neq j_r\) (r is the largest value such that all pairs before r are equal)

As $i_r$ is selected by the finish first algorithm; no job exists that is compatible with \(\{ i_1...i_{r-1} \}\) and finishes before \(i_r\) so \(j_r\) finishes after or with \(i_r\)

This implies that \(i_r\) can replace \(j_r\) this is contradictory as it goes against the maximality of \(r\)

A more efficient solution can not be found, so the algorithm provides the most efficient solution

Interval Partitioning Problem

Given a set of jobs (intervals) schedule the jobs into the least number of rooms such that at any given time a room only has one job in it.

Input: A set of jobs with a start and end time

Output: A set of schedules for each room

Strategy

  • Sort the jobs by start time
  • Going through each job in turn
    • check if the new schedule and add the job

**Checking For compatibility

  • Store the schedules in a priority queue
  • use the end time of the last job added as the key
  • Check compatibility with queue.Find_Min
  • if is not compatible is not compatible with any
  • if is compatible add to it

Pseudocode

rooms= new PriorityQueue
jobs = jobs.sort(job.start)
for (job in jobs){
    // the first job is added by default
    if (rooms.length>0 && rooms.find_Min_Key <= job.start){
        room=rooms.pop_Min()
        room.add(job)
        rooms.add(room,job.end)
    }else{
        room=[job]
        rooms.add(room,job.end)
    }
}
return rooms.values();

Proof

terminology

The depth of a set of jobs is the maximum number of jobs running at the same time.

equally

The depth of a set of intervals is the maximum number of intervals that contain any given point

When rooms.find_Min_Key is called the room with the earliest finish time is found, there are no rooms that has a finish time earlier. This shows the depth = roomCount. Adding the job means depth+=1 and roomCount+=1, roomCount matches depth. A more efficient solution can’t be found as depth \(\nless\) roomCount

Minimizing Lateness

Given a set of jobs with a running time and deadline, order the jobs to result in the smallest maximum lateness.

Input: Jobs with running time and deadline

Output: Schedule of Jobs

Strategy

  • Sort the jobs by the deadline

This is the optimal strategy adding start and end times can be added

Pseudocode

jobs <- The available jobs with .time and .deadline

jobs = jobs.sort(job.deadline) // Sort the job by deadline time in increasing order O(nlogn)

//not needed for correct order just being explicit
time=0
for (job in jobs){
    job.setStart(time);
    job.setEnd(time+job.time);
    time+=job.time;
}
return jobs

Proof

Prove that the strategy produces an optimal solution

Observation 1: There exists an optimal solution with no idle time

Observation 2: The earliest-deadline-first strategy produces no idle time

Definition: Given a schedule S an inversion is a pair of jobs \(i\) , \(j\) where \(i \lt j\) (\(i\)’s deadline is before \(j\)’s) however \(j\) appears before \(i\)

Observation 3: The earliest-deadline-first strategy produces the unique schedule with no inversions

Observation 4: if an idle-free schedule has an inversion it has an adjacent inversion

Proof:

  • Let \(i , j\) be the closest inversion
  • Let \(k\) be the element to the right of \(j\)
  • Case 1 [\(j \gt k\)] then \(j ,k\) is an adjacent inversion
  • Case 2 [\(j \lt k\)] then \(i,k\) is a closer inversion as \(i \lt j \lt K\) this is contradictory as \(i,j\) is a closer inversion

Key Claim: Exchanging two adjacent inverted jobs reduces the number of inverses by 1 and does not increase the max lateness. When applied to an optimal solution the result will also be an optimal solution

Proof of claim

let \(l_x\) be the lateness before exchange and \(l'_x\) after for job \(x\).

let \(i,j\) be jobs s.t \(i \lt j\) but \(j\) is before \(i\) in the solution \(S\) (inversion)

\(\forall k \neq i,j : l'_k = l_k\)(1)

\(l'_j \leq l_j\) (2)

if job \(j\) is late (start time is getting later) \(l'_j = f'_j -d_j\)(3) \(l'_j = f_i -d_j\)(4) \(l'_j \leq f_i -d_i\)(5) \(l'_j \leq l_i\)(6)

  1. There is no change from the exchange to the lateness of any other jobs.
  2. The start time of \(j\) was moved forward so its new lateness has improved or remained the same
  3. The new lateness of \(j\) is \(f'_j-d_j\) definition (finish - delay)
  4. The new finish time of \(j\) is equal to the old finish time of \(i\)
  5. as \(i \lt j\) by definition \(d_i \lt d_j\) so \(f_i -d_j \leq f_i-d_i\)
  6. relying on the definition \(l_i = f_i-d_i\)

Shows that the new lateness of \(j)\) is no worse than the old lateness of \(i\) so the solution is still just as optimal.

Final Proof

Let \(S\) be the solution generated bu the earliest-deadline-first strategy

Let \(S'\) be an optimal solution with the fewest inversionsGoodVibes

  • Can assume \(S'\) has no idle time (claim 1)
  • Case 1 [\(S'\) has no inversions]
    • Then \(S'=S\) the optimal solution was found
  • Case 2 [\(S'\)] has an inversion
    • let \(i-j\) be an adjacent inversion
    • exchange \(i,j\), is still an optimal solution
    • contradicts the “optimal solution with the fewest inversions”

So \(S =S'\) the earliest-deadline-first strategy is optimal

Generic Proof Strategies

Greedy Algorithm Stays Ahead

Show that at every stage of the the greedy algorithm produces a solution that is better then or as good as an optimal solution.

Structural

Discover structural bounds asserting that every solution must have a set value. Then show your algorithm achieves this bound.

Exchange

Gradually transform any optimal solution to the one found by the greedy algorithm without hurting the solution quality.