Scheduling a Single Machine with Multiple Due Dates per Job: A Comprehensive Analysis

Problem Statement

In many real-world systems — like manufacturing units, server queues, or printer management — we often deal with the problem of scheduling multiple jobs on a single machine. The complication arises when each job consists of multiple tasks, each with its own due date. Our aim is to schedule these tasks one at a time on the machine so that the overall lateness is minimized.

Introduction

In the realm of operations research and production management, scheduling plays a crucial role in optimizing performance, reducing delays, and ensuring the efficient use of limited resources. One of the fundamental problems in this space is the Single Machine Scheduling Problem — a simplified yet powerful model that reflects real-world scenarios where tasks must be processed one at a time on a single resource.

Imagine a scenario where a printer processes print jobs, a machine shop schedules machining tasks, or a server handles user requests — all one at a time, in a queue. But what happens when each of these jobs isn't just a single task, but a collection of subtasks, each with its own deadline or due date? This introduces a layer of complexity that demands careful planning.

This project explores the Single Machine Scheduling Problem with Multiple Due Dates per Job — a variant where each job comprises multiple tasks, and each task has a processing time and a specific due date. The challenge is to determine the optimal sequence of tasks such that a performance metric — in our case, total lateness — is minimized.

The beauty of this problem lies in its simplicity of setup but complexity of solution. To address this challenge, we implement the Earliest Due Date (EDD) rule — a greedy algorithm where tasks are sorted based on their due dates and scheduled sequentially. The algorithm prioritizes jobs with earlier due dates, aiming to minimize lateness by completing the tasks that are most urgent first.

Despite its simplicity, this approach is highly effective in certain scenarios and serves as a valuable starting point for understanding how scheduling rules impact performance. However, as the study reveals, scheduling with multiple due dates per job introduces significant complexity. When processing times vary between jobs, the problem becomes strongly NP-hard, indicating that finding an optimal solution is computationally intensive. Even when processing times are uniform across jobs, the problem remains as challenging as the longstanding open problem of weighted tardiness with equal processing times and release dates.

This study provides the perfect foundation for experimenting with classic scheduling rules such as the EDD algorithm, as well as data processing techniques and visualization strategies that help interpret scheduling outcomes. The insights gained through this project will help understand how algorithms like EDD perform under varying conditions and how they can be adapted or extended to solve more complex scheduling problems.

Key Findings

Optimal for Tardiness Minimization: EDD is ideal for minimizing tardiness in single-machine scheduling problems with known due dates.
Efficient and Easy to Implement: The algorithm runs in O(n log n) time and is straightforward to implement.
Limitations in Complex Scheduling: EDD is not suitable for multi-machine environments or problems with highly variable processing times.
Greedy but Effective: While the greedy approach often leads to optimal solutions, it may not always work well in complex scheduling environments.
Not Multi-Objective: EDD focuses only on tardiness minimization and does not address other potential scheduling goals.

Objective

Implement the EDD Algorithm: Apply the EDD rule and assess its ability to minimize tardiness in job scheduling based on due dates, comparing its performance across different job sets.
Evaluate Computational Complexity: Explore the computational complexity of the problem, especially in cases with varying processing times, and analyze the efficiency of the EDD algorithm.
Compare Scheduling Rules: Compare the performance of the EDD algorithm with other heuristics, such as SPT and LPT, to identify the most effective strategy for minimizing lateness.
Extend to Multi-Machine Problems: Investigate how the EDD algorithm could be adapted or extended for more complex multi-machine scheduling problems.
Real-World Application Insights: Apply the findings to practical scenarios, such as manufacturing or service industries, where multiple due dates need to be managed on a single resource.
Develop Visualization Tools: Create tools and visualizations to help analyze and interpret scheduling outcomes, aiding in the understanding of how changes in parameters affect performance.

Methodology
The Earliest Due Date (EDD) algorithm is a greedy scheduling strategy used to minimize tardiness in single-machine scheduling problems where each job has a processing time and a due date. The basic principle behind EDD is to prioritize jobs with the earliest due dates to minimize the likelihood of missing deadlines.

Here's a detailed explanation of the methodology of how the EDD algorithm works:

1. Identify the Jobs

The first step is to define the set of jobs that need to be scheduled. Each job should have two key parameters:
- Processing Time ( $p_j$ ): The amount of time it takes to complete the job.
- Due Date ( $d_j$ ): The deadline by which the job should ideally be completed.

2. Sort Jobs by Due Date

Once all jobs are identified, the next step is to sort the jobs in ascending order based on their due dates ( $d_j$ ).
Sorting is the key step that sets up the scheduling order.
- This ensures that jobs with the earliest deadlines are prioritized.

Why Sort by Due Date?
Sorting the jobs by due date helps minimize tardiness because it allows jobs with early due dates to be completed as soon as possible, reducing the chance that their tardiness will be significant. Jobs with later due dates are scheduled later, ensuring that early jobs are given priority.

Example:
Suppose we have the following jobs:

Job	Processing Time ( $p_j$ )	Due Date ( $d_j$ )
A	3	4
B	5	2
C	2	6
D	4	3
E	6	7

After sorting by due date, the jobs are ordered as follows:

Job	Processing Time ( $p_j$ )	Due Date ( $d_j$ )
B	5	2
D	4	3
A	3	4
C	2	6
E	6	7

3. Schedule the Jobs Sequentially

Once the jobs are sorted by due dates, the algorithm proceeds to schedule them in the sorted order.
The key idea here is to schedule jobs sequentially based on the sorted due dates.
- Start Time of Each Job: The first job starts at time 0, and each subsequent job starts immediately after the previous one finishes.
- Completion Time: The completion time of a job is the start time plus its processing time.
- Update Current Time: After scheduling each job, the current time is updated to reflect the finish time of the current job.

Example:
If we schedule the jobs in the sorted order (B, D, A, C, E), their start and finish times would look as follows:

Job	Start Time	Finish Time	Tardiness
B	0	5	3
D	5	9	6
A	9	12	8
C	12	14	8
E	14	20	13

4. Calculate Tardiness

Tardiness is the difference between the finish time and the due date of a job, but only if the finish time exceeds the due date.
For each job, calculate the tardiness using the formula:
$\text{Tardiness of Job } j = \max(0, \text{Finish Time of Job } j - \text{Due Date of Job } j)$
If a job completes before its due date, its tardiness is 0.

Example:

From the table above, the tardiness of each job is calculated as follows:

Job B: Finish time = 5, Due date = 2 → Tardiness = 5 - 2 = 3
Job D: Finish time = 9, Due date = 3 → Tardiness = 9 - 3 = 6
Job A: Finish time = 12, Due date = 4 → Tardiness = 12 - 4 = 8
Job C: Finish time = 14, Due date = 6 → Tardiness = 14 - 6 = 8
Job E: Finish time = 20, Due date = 7 → Tardiness = 20 - 7 = 13

The total tardiness is the sum of the individual tardiness values:

Total Tardiness = 3 + 6 + 8 + 8 + 13 = 38

5. Output the Schedule and Total Tardiness

After scheduling all the jobs and calculating the tardiness for each, the algorithm outputs:
1. The schedule of jobs (with start times, finish times).
2. The total tardiness, which is the sum of the tardiness values for all jobs.

In this case, the output is:

Scheduled Jobs: B → D → A → C → E
Total Tardiness: 38

Result and Discussion

We implement the Earliest Due Date (EDD) rule — a greedy algorithm where tasks are sorted based on their due dates and scheduled sequentially.

Here's the basic pseudocode for the EDD algorithm:

1. Sort all jobs by their due dates in ascending order.
2. Initialize the current time as 0 (the machine starts idle).
3. For each job in the sorted list:
   a. If the current time is less than or equal to the job's due date, schedule the job.
   b. If the job is completed after its due date, compute the tardiness.
   c. Update the current time after completing the job.
4. Output the schedule and total tardiness.



Dataset Description

We use a CSV file (data.csv) with the following columns:

job_id	task_id	processing_time	due_date
J1	T1	3	5
J1	T2	2	10
J2	T1	4	7
...	...	...	...

Sample Data

data.csv

job_id,task_id,processing_time,due_date
J1,T1,3,5
J1,T2,2,10
J2,T1,4,7
J2,T2,3,12
J3,T1,1,4
J3,T2,2,9
J4,T1,5,6
J5,T1,2,8


We used Python with pandas for scheduling logic. Here's the main logic:
import pandas as pd

# Load data
df = pd.read_csv('data.csv')

# Sort all tasks by due date (Earliest Due Date rule)
df_sorted = df.sort_values(by='due_date')

# Schedule each task
current_time = 0
schedule = []

for index, row in df_sorted.iterrows():
    start_time = current_time
    end_time = start_time + row['processing_time']
    schedule.append({
        'job_id': row['job_id'],
        'task_id': row['task_id'],
        'start_time': start_time,
        'end_time': end_time,
        'due_date': row['due_date'],
        'lateness': max(0, end_time - row['due_date'])
    })
    current_time = end_time

# Show schedule
schedule_df = pd.DataFrame(schedule)
print(schedule_df)

# Calculate total lateness
total_lateness = schedule_df['lateness'].sum()
print(f"\nTotal Lateness: {total_lateness}")
OUTPUT -

job_id task_id start_time end_time due_date lateness 0 J3 T1 0 1 4 0 1 J1 T1 1 4 5 0 2 J4 T1 4 9 6 3 3 J2 T1 9 13 7 6 4 J5 T1 13 15 8 7 5 J3 T2 15 17 9 8 6 J1 T2 17 19 10 9 7 J2 T2 19 22 12 10 Total Lateness: 43

Gantt Chart Code (using matplotlib)

import pandas as pd
import matplotlib.pyplot as plt

# Read or reuse the result DataFrame (example data)
schedule = [
    {'job_id': 'J3', 'task_id': 'T1', 'start_time': 0, 'end_time': 1},
    {'job_id': 'J1', 'task_id': 'T1', 'start_time': 1, 'end_time': 4},
    {'job_id': 'J4', 'task_id': 'T1', 'start_time': 4, 'end_time': 9},
    {'job_id': 'J2', 'task_id': 'T1', 'start_time': 9, 'end_time': 13},
    {'job_id': 'J5', 'task_id': 'T1', 'start_time': 13, 'end_time': 15},
    {'job_id': 'J3', 'task_id': 'T2', 'start_time': 15, 'end_time': 17},
    {'job_id': 'J1', 'task_id': 'T2', 'start_time': 17, 'end_time': 19},
    {'job_id': 'J2', 'task_id': 'T2', 'start_time': 19, 'end_time': 22},
]

df = pd.DataFrame(schedule)

# Assign a unique color for each job_id
job_colors = {job_id: f"C{i}" for i, job_id in enumerate(df['job_id'].unique())}

# Create a figure
plt.figure(figsize=(10, 5))

for idx, row in df.iterrows():
    plt.barh(
        y=row['job_id'],
        width=row['end_time'] - row['start_time'],
        left=row['start_time'],
        height=0.4,
        color=job_colors[row['job_id']],
        edgecolor='black'
    )
    plt.text(
        x=(row['start_time'] + row['end_time']) / 2,
        y=row['job_id'],
        s=row['task_id'],
        va='center',
        ha='center',
        color='white',
        fontsize=9,
        fontweight='bold'
    )

plt.xlabel('Time')
plt.ylabel('Jobs')
plt.title('Single Machine Scheduling - Gantt Chart')
plt.grid(True, axis='x', linestyle='--', alpha=0.7)
plt.tight_layout()
plt.show()

OUTPUT-

Implications and Future Directions

Implications

Improved Operations Management: The study highlights how efficient scheduling, such as using the Earliest Due Date (EDD) algorithm, can reduce lateness and improve on-time performance in resource-limited industries like manufacturing or services.

Decision-Making Support: Insights gained from this research can guide managers in prioritizing jobs and allocating resources more effectively, leading to better productivity and decision-making.

Computational Efficiency: While EDD is computationally efficient for small-scale problems, the study underscores the challenges of scaling to larger systems, especially with multiple tasks and machines.

Advancements in Scheduling Theory: The research contributes to the field by analyzing classic scheduling algorithms and offering insights that can shape future scheduling strategies.

Future Directions

Multi-Machine Scheduling: Future work could focus on adapting EDD for multi-machine systems, addressing resource contention and task dependencies.

Handling Uncertain Processing Times: Incorporating variable or uncertain processing times would make the model more applicable to real-world scenarios, such as manufacturing with unpredictable machine performance.

Metaheuristic Approaches: Hybrid algorithms or metaheuristics like Genetic Algorithms could be explored for solving NP-hard scheduling problems more efficiently in large, complex environments.

Real-Time and Dynamic Scheduling: Developing algorithms that can adapt to dynamic changes in due dates or processing times would be valuable in environments with constantly changing job requirements.

Multi-Objective Optimization: Future research could focus on optimizing multiple metrics (e.g., lateness, throughput, machine utilization) simultaneously, providing a more holistic approach to scheduling.

Conclusion

The Single Machine Scheduling with Multiple Due Dates per Job problem might appear simple at first glance, but it reflects a core challenge faced in many real-world systems — managing limited resources under time constraints. Whether it's a manufacturing unit with a single workstation or a server processing user requests sequentially, efficient scheduling is key to minimizing delays and optimizing throughput.

In this project, we explored this scheduling problem using the Earliest Due Date (EDD) rule — a greedy algorithm that prioritizes tasks with the earliest due dates. By implementing the logic in Python and visualizing the results with a Gantt chart, we gained insights into how scheduling affects overall performance, particularly total lateness.

Code Link - https://github.com/VaishnaviPujari04/single-machine-scheduling

References

Research paper

Kühn, R., Weiß, C., Ackermann, H. et al. Scheduling a single machine with multiple due dates per job. J Sched 27, 565–585 (2024). https://doi.org/10.1007/s10951-024-00825-w
Lenstra, J. K., Rinnooy Kan, A. H. G., & Brucker, P. (1977). Complexity of machine scheduling problems (Vol. 1, pp. 343–362). Elsevier. https://doi.org/10.1016/S0167-5060(08)70743-X
Moore, J. M. (1968). An n Job, One Machine Sequencing Algorithm for Minimizing the Number of Late Jobs. Management Science, 15(1), 102–109. https://doi.org/10.1287/MNSC.15.1.102
Chen, Rubing & Dong, Xinyu & Yuan, Jinjiang & Ng, C.T. & Cheng, T.C.E., 2025. "Single-machine preemptive scheduling with assignable due dates or assignable weights to minimize total weighted late work," European Journal of Operational Research, Elsevier, vol. 322(2), pages 467-479.

Online Resources and Tutorials

Wikipedia

Search This Blog

Scheduling a single machine with multiple due dates per job