Greedy Algorithms
Interval Scheduling
The Problem
Given a set of jobs with a start and end time. Find the largest set of jobs such that no two jobs are running at the same time (the jobs are compatible).
Input: A Set of jobs with a start and end time
Output: The largest set of jobs such that all are compatible with each other
Strategy
- Sort the jobs by end time in ascending order
- Create an empty set of selected jobs
- Iterate though the set of jobs
-
- If a job is compatible with the current selected jobs then add it
Checking For compatibility
- keep a record of the last job added
- if the start time is after, or at the finish time of the last job added the jobs are compatible
The set is compatible with itself and the last job added has the last finish time so any job that occurs after the finish is compatible with the rest fo the set.
Pseudocode
jobs <- The available jobs with .start and .end
jobs = jobs.sort(job.end) // Sort the job by end time in increasing order O(nlogn)
selected = [] // empty set
for( job : jobs ){
// the first job is added by default
if (selected.length ==0 || last.finish <= job.start){
selected.add(job)
last=job
}
}
return selected // An array of selected Jobs
Proof
It is necessary to prove that this algorithm generates the most efficient set.
proof by contradiction
- Assume the set (\(i_1,i_2,i_3...i_n\)) generated by the algorithm is not the most efficient
- The most efficient set (\(j_1,j_2,j_3...j_m\)) also exists such that the largest \(r\) can be found
- Select an \(r\) such that \(\forall v < r : i_v = j_v \land i_r \neq j_r\) (r is the largest value such that all pairs before r are equal)
As $i_r$ is selected by the finish first algorithm; no job exists that is compatible with \(\{ i_1...i_{r-1} \}\) and finishes before \(i_r\) so \(j_r\) finishes after or with \(i_r\)
This implies that \(i_r\) can replace \(j_r\) this is contradictory as it goes against the maximality of \(r\)
A more efficient solution can not be found, so the algorithm provides the most efficient solution
Interval Partitioning Problem
Given a set of jobs (intervals) schedule the jobs into the least number of rooms such that at any given time a room only has one job in it.
Input: A set of jobs with a start and end time
Output: A set of schedules for each room
Strategy
- Sort the jobs by start time
- Going through each job in turn
-
- check if the new schedule and add the job
**Checking For compatibility
- Store the schedules in a priority queue
- use the end time of the last job added as the key
- Check compatibility with queue.Find_Min
- if is not compatible is not compatible with any
- if is compatible add to it
Pseudocode
rooms= new PriorityQueue
jobs = jobs.sort(job.start)
for (job in jobs){
// the first job is added by default
if (rooms.length>0 && rooms.find_Min_Key <= job.start){
room=rooms.pop_Min()
room.add(job)
rooms.add(room,job.end)
}else{
room=[job]
rooms.add(room,job.end)
}
}
return rooms.values();
Proof
terminology
The
depth
of a set of jobs is the maximum number of jobs running at the same time.
equally
The
depth
of a set of intervals is the maximum number of intervals that contain any given point
When rooms.find_Min_Key is called the room with the earliest finish time is found, there are no rooms that has a finish time earlier. This shows the depth = roomCount
. Adding the job means depth+=1
and roomCount+=1
, roomCount matches depth. A more efficient solution can’t be found as depth
\(\nless\) roomCount
Minimizing Lateness
Given a set of jobs with a running time and deadline, order the jobs to result in the smallest maximum lateness.
Input: Jobs with running time and deadline
Output: Schedule of Jobs
Strategy
- Sort the jobs by the deadline
This is the optimal strategy adding start and end times can be added
Pseudocode
jobs <- The available jobs with .time and .deadline
jobs = jobs.sort(job.deadline) // Sort the job by deadline time in increasing order O(nlogn)
//not needed for correct order just being explicit
time=0
for (job in jobs){
job.setStart(time);
job.setEnd(time+job.time);
time+=job.time;
}
return jobs
Proof
Prove that the strategy produces an optimal solution
Observation 1: There exists an optimal solution with no idle time
Observation 2: The earliest-deadline-first strategy produces no idle time
Definition: Given a schedule S an inversion is a pair of jobs \(i\) , \(j\) where \(i \lt j\) (\(i\)’s deadline is before \(j\)’s) however \(j\) appears before \(i\)
Observation 3: The earliest-deadline-first strategy produces the unique schedule with no inversions
Observation 4: if an idle-free schedule has an inversion it has an adjacent inversion
Proof:
- Let \(i , j\) be the closest inversion
- Let \(k\) be the element to the right of \(j\)
- Case 1 [\(j \gt k\)] then \(j ,k\) is an adjacent inversion
- Case 2 [\(j \lt k\)] then \(i,k\) is a closer inversion as \(i \lt j \lt K\) this is contradictory as \(i,j\) is a closer inversion
Key Claim: Exchanging two adjacent inverted jobs reduces the number of inverses by 1 and does not increase the max lateness. When applied to an optimal solution the result will also be an optimal solution
Proof of claim
let \(l_x\) be the lateness before exchange and \(l'_x\) after for job \(x\).
let \(i,j\) be jobs s.t \(i \lt j\) but \(j\) is before \(i\) in the solution \(S\) (inversion)
\(\forall k \neq i,j : l'_k = l_k\)(1)
\(l'_j \leq l_j\) (2)
if job \(j\) is late (start time is getting later) \(l'_j = f'_j -d_j\)(3) \(l'_j = f_i -d_j\)(4) \(l'_j \leq f_i -d_i\)(5) \(l'_j \leq l_i\)(6)
- There is no change from the exchange to the lateness of any other jobs.
- The start time of \(j\) was moved forward so its new lateness has improved or remained the same
- The new lateness of \(j\) is \(f'_j-d_j\) definition (finish - delay)
- The new finish time of \(j\) is equal to the old finish time of \(i\)
- as \(i \lt j\) by definition \(d_i \lt d_j\) so \(f_i -d_j \leq f_i-d_i\)
- relying on the definition \(l_i = f_i-d_i\)
Shows that the new lateness of \(j)\) is no worse than the old lateness of \(i\) so the solution is still just as optimal.
Final Proof
Let \(S\) be the solution generated bu the earliest-deadline-first strategy
Let \(S'\) be an optimal solution with the fewest inversionsGoodVibes
- Can assume \(S'\) has no idle time (claim 1)
- Case 1 [\(S'\) has no inversions]
-
- Then \(S'=S\) the optimal solution was found
- Case 2 [\(S'\)] has an inversion
-
- let \(i-j\) be an adjacent inversion
-
- exchange \(i,j\), is still an optimal solution
-
- contradicts the “optimal solution with the fewest inversions”
So \(S =S'\) the earliest-deadline-first strategy is optimal
Generic Proof Strategies
Greedy Algorithm Stays Ahead
Show that at every stage of the the greedy algorithm produces a solution that is better then or as good as an optimal solution.
Structural
Discover structural bounds asserting that every solution must have a set value. Then show your algorithm achieves this bound.
Exchange
Gradually transform any optimal solution to the one found by the greedy algorithm without hurting the solution quality.