# Problem description

Prezi uses virtual machines (VMs) to complete three different kind of jobs:

• export

• url

• default

These jobs have pre-determined processing times, which are distributed according to some general distribution $$G$$ (see figures). Each job type has a corresponding type of VM. VM running costs are charged in hourly increments. This can be represented using a cost function $$C(t)$$ where $$t$$ is the running time of a server in seconds

$C(t) = \left\lceil \frac{t}{3600}\right\rceil$

VMs can be turned on and off, but it takes 2 minutes (120 seconds) for a virtual machine to start. This time is charged.

The objective is to minimize the total running cost of the servers, while ensuring that all jobs are handled within 5 seconds. If a job waits for more than 5 seconds, a penalty is incurred. A job is not allowed to wait for more than 2 minutes. The holding cost of a job $$H(t)$$ is defined

$H(t) = \max\left(\left\{ 0, \frac{t - 5}{40} \right\} \right)$

Two datasets are given, each spanning a period of 48 hours, starting at 10pm GMT (so 2/5pm on the West/East coast). They are of the form timestamp job-duration unique-id queue , for example 1378072800 2.133 9ea24acd-cee3-4248-bf43-fb33e7e9ac4b url with 475858 and 669642 jobs in the respective datasets.

# Modelling

## Arrival rate

We can assume our job arrivals to be governed by a Poisson process with parameter $$\lambda$$. However, our arrival times differ depending on the time of day (e.g. les