org.apache.hadoop.mapred
Class JobInProgress

java.lang.Object
  extended by org.apache.hadoop.mapred.JobInProgress

public class JobInProgress
extends Object

JobInProgress maintains all the information needed to keep a Job on track. It holds the job's JobProfile and its latest JobStatus, plus a set of tables for bookkeeping its Tasks.


Nested Class Summary
static class JobInProgress.Counter
           
 
Constructor Summary
protected JobInProgress(JobID jobid, JobConf conf)
          Create an almost empty JobInProgress, which can be used only for tests
  JobInProgress(JobID jobid, JobTracker jobtracker, JobConf default_conf)
          Create a JobInProgress with the given job file, plus a handle to the tracker.
  JobInProgress(JobID jobid, JobTracker jobtracker, JobConf default_conf, int rCount)
           
 
Method Summary
 void cleanUpMetrics()
          Called when the job is complete
 boolean completedTask(TaskInProgress tip, TaskStatus status)
          A taskid assigned to this JobInProgress has reported in successfully.
 int desiredMaps()
           
 int desiredReduces()
           
 void failedTask(TaskInProgress tip, TaskAttemptID taskid, String reason, TaskStatus.Phase phase, TaskStatus.State state, String trackerName)
          Fail a task with a given reason, but without a status object.
 TaskStatus findFinishedMap(int mapId)
          Find the task status of a map that has finished, given its id
 int finishedMaps()
           
 int finishedReduces()
           
 TaskInProgress[] getCleanupTasks()
          Get the list of cleanup tasks
 Counters getCounters()
          Returns the total job counters, by adding together the job, the map and the reduce counters.
 long getFinishTime()
           
 Counters getJobCounters()
          Returns the job-level counters.
 JobID getJobID()
           
 long getLaunchTime()
           
 Counters getMapCounters()
          Returns map phase counters by summing over all map tasks in progress.
 TaskInProgress[] getMapTasks()
          Get the list of map tasks
 JobPriority getPriority()
           
 JobProfile getProfile()
           
 Counters getReduceCounters()
          Returns reduce phase counters by summing over all reduce tasks in progress.
 TaskInProgress[] getReduceTasks()
          Get the list of reduce tasks
 Object getSchedulingInfo()
           
 TaskInProgress[] getSetupTasks()
          Get the list of setup tasks
 long getStartTime()
           
 JobStatus getStatus()
           
 TaskCompletionEvent[] getTaskCompletionEvents(int fromEventId, int maxEvents)
           
 TaskInProgress getTaskInProgress(TaskID tipid)
          Return the TaskInProgress that matches the tipid.
 boolean inited()
          Check if the job has been initialized.
 void initTasks()
          Construct the splits, etc.
 void kill()
          Kill the job and all its component tasks.
 org.apache.hadoop.mapred.Task obtainJobCleanupTask(TaskTrackerStatus tts, int clusterSize, int numUniqueHosts, boolean isMapSlot)
          Return a CleanupTask, if appropriate, to run on the given tasktracker
 org.apache.hadoop.mapred.Task obtainJobSetupTask(TaskTrackerStatus tts, int clusterSize, int numUniqueHosts, boolean isMapSlot)
          Return a SetupTask, if appropriate, to run on the given tasktracker
 org.apache.hadoop.mapred.Task obtainNewLocalMapTask(TaskTrackerStatus tts, int clusterSize, int numUniqueHosts)
           
 org.apache.hadoop.mapred.Task obtainNewMapTask(TaskTrackerStatus tts, int clusterSize, int numUniqueHosts)
          Return a MapTask, if appropriate, to run on the given tasktracker
 org.apache.hadoop.mapred.Task obtainNewNonLocalMapTask(TaskTrackerStatus tts, int clusterSize, int numUniqueHosts)
           
 org.apache.hadoop.mapred.Task obtainNewReduceTask(TaskTrackerStatus tts, int clusterSize, int numUniqueHosts)
          Return a ReduceTask, if appropriate, to run on the given tasktracker.
 org.apache.hadoop.mapred.Task obtainTaskCleanupTask(TaskTrackerStatus tts, boolean isMapSlot)
           
 int pendingMaps()
           
 int pendingReduces()
           
 Vector<TaskInProgress> reportCleanupTIPs(boolean shouldBeComplete)
          Return a vector of cleanup TaskInProgress objects
 Vector<TaskInProgress> reportSetupTIPs(boolean shouldBeComplete)
          Return a vector of setup TaskInProgress objects
 Vector<TaskInProgress> reportTasksInProgress(boolean shouldBeMap, boolean shouldBeComplete)
          Return a vector of map or reduce TaskInProgress objects (selected by shouldBeMap) whose completion state matches shouldBeComplete
 int runningMaps()
           
 int runningReduces()
           
 boolean scheduleReduces()
           
 void setPriority(JobPriority priority)
           
 void setSchedulingInfo(Object schedulingInfo)
           
 void updateMetrics()
          Called periodically by JobTrackerMetrics to update the metrics for this job.
 void updateTaskStatus(TaskInProgress tip, TaskStatus status)
          Assumes the JobTracker is locked on entry.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

JobInProgress

protected JobInProgress(JobID jobid,
                        JobConf conf)
Create an almost empty JobInProgress, which can be used only for tests


JobInProgress

public JobInProgress(JobID jobid,
                     JobTracker jobtracker,
                     JobConf default_conf)
              throws IOException
Create a JobInProgress with the given job file, plus a handle to the tracker.

Throws:
IOException

JobInProgress

public JobInProgress(JobID jobid,
                     JobTracker jobtracker,
                     JobConf default_conf,
                     int rCount)
              throws IOException
Throws:
IOException
Method Detail

updateMetrics

public void updateMetrics()
Called periodically by JobTrackerMetrics to update the metrics for this job.


cleanUpMetrics

public void cleanUpMetrics()
Called when the job is complete


inited

public boolean inited()
Check if the job has been initialized.

Returns:
true if the job has been initialized, false otherwise

initTasks

public void initTasks()
               throws IOException,
                      org.apache.hadoop.mapred.JobInProgress.KillInterruptedException
Construct the splits, etc. This is invoked from an async thread so that split-computation doesn't block anyone.

Throws:
IOException
org.apache.hadoop.mapred.JobInProgress.KillInterruptedException
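
The note above says initialization runs on an asynchronous thread so that split computation does not block callers. A minimal sketch of that pattern follows. It does not use the Hadoop classes: AsyncInitSketch, its tasksInited flag, and initInBackground are hypothetical names chosen for illustration, and the real JobInProgress may track initialization differently.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a job object; this is not the Hadoop class.
final class AsyncInitSketch {
    // volatile so a reader on another thread sees the flag once it is set
    private volatile boolean tasksInited = false;

    boolean inited() {
        return tasksInited;
    }

    void initTasks() {
        // A real job would compute its input splits here.
        tasksInited = true;
    }

    // Runs initTasks() on a separate thread and waits for it to finish.
    static AsyncInitSketch initInBackground() {
        AsyncInitSketch job = new AsyncInitSketch();
        ExecutorService initThread = Executors.newSingleThreadExecutor();
        try {
            Future<?> done = initThread.submit(job::initTasks);
            done.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            initThread.shutdown();
        }
        return job;
    }

    public static void main(String[] args) {
        System.out.println("inited = " + initInBackground().inited());
    }
}
```

A caller that must not block would skip the wait on the Future and poll inited() instead.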

getProfile

public JobProfile getProfile()

getStatus

public JobStatus getStatus()

getLaunchTime

public long getLaunchTime()

getStartTime

public long getStartTime()

getFinishTime

public long getFinishTime()

desiredMaps

public int desiredMaps()

finishedMaps

public int finishedMaps()

desiredReduces

public int desiredReduces()

runningMaps

public int runningMaps()

runningReduces

public int runningReduces()

finishedReduces

public int finishedReduces()

pendingMaps

public int pendingMaps()

pendingReduces

public int pendingReduces()

getPriority

public JobPriority getPriority()

setPriority

public void setPriority(JobPriority priority)

getMapTasks

public TaskInProgress[] getMapTasks()
Get the list of map tasks

Returns:
the raw array of maps for this job

getCleanupTasks

public TaskInProgress[] getCleanupTasks()
Get the list of cleanup tasks

Returns:
the array of cleanup tasks for the job

getSetupTasks

public TaskInProgress[] getSetupTasks()
Get the list of setup tasks

Returns:
the array of setup tasks for the job

getReduceTasks

public TaskInProgress[] getReduceTasks()
Get the list of reduce tasks

Returns:
the raw array of reduce tasks for this job

reportTasksInProgress

public Vector<TaskInProgress> reportTasksInProgress(boolean shouldBeMap,
                                                    boolean shouldBeComplete)
Return a vector of map or reduce TaskInProgress objects (selected by shouldBeMap) whose completion state matches shouldBeComplete
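
The two boolean parameters suggest a simple filter: pick the map or the reduce table, then keep the entries whose completion state equals shouldBeComplete. The sketch below illustrates that reading under stated assumptions. Tip is a hypothetical stand-in for TaskInProgress, and report is not the Hadoop method.

```java
import java.util.Vector;

final class ReportTipsSketch {
    // Hypothetical stand-in for TaskInProgress; only the completion flag is modeled.
    static final class Tip {
        final String id;
        final boolean complete;

        Tip(String id, boolean complete) {
            this.id = id;
            this.complete = complete;
        }

        boolean isComplete() {
            return complete;
        }
    }

    // Selects the map or reduce table, then filters by completion state.
    static Vector<Tip> report(Tip[] maps, Tip[] reduces,
                              boolean shouldBeMap, boolean shouldBeComplete) {
        Tip[] tips = shouldBeMap ? maps : reduces;
        Vector<Tip> results = new Vector<>();
        for (Tip tip : tips) {
            if (tip.isComplete() == shouldBeComplete) {
                results.add(tip);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Tip[] maps = { new Tip("m0", true), new Tip("m1", false) };
        Tip[] reduces = { new Tip("r0", false) };
        for (Tip tip : report(maps, reduces, true, false)) {
            System.out.println("incomplete map: " + tip.id);
        }
    }
}
```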


reportCleanupTIPs

public Vector<TaskInProgress> reportCleanupTIPs(boolean shouldBeComplete)
Return a vector of cleanup TaskInProgress objects


reportSetupTIPs

public Vector<TaskInProgress> reportSetupTIPs(boolean shouldBeComplete)
Return a vector of setup TaskInProgress objects


updateTaskStatus

public void updateTaskStatus(TaskInProgress tip,
                             TaskStatus status)
Assumes the JobTracker is locked on entry.


getJobCounters

public Counters getJobCounters()
Returns the job-level counters.

Returns:
the job-level counters.

getMapCounters

public Counters getMapCounters()
Returns map phase counters by summing over all map tasks in progress.


getReduceCounters

public Counters getReduceCounters()
Returns reduce phase counters by summing over all reduce tasks in progress.


getCounters

public Counters getCounters()
Returns the total job counters, by adding together the job, the map and the reduce counters.
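
The aggregation described here (job counters plus map counters plus reduce counters) can be sketched with plain maps. This is an illustration only: it models a counter set as a Map from name to value rather than using the Hadoop Counters class, and the counter names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

final class CounterSumSketch {
    // Adds every counter in 'part' into 'total', summing values that share a name.
    private static void addAll(Map<String, Long> total, Map<String, Long> part) {
        for (Map.Entry<String, Long> e : part.entrySet()) {
            total.merge(e.getKey(), e.getValue(), Long::sum);
        }
    }

    // Total = job-level counters + map counters + reduce counters.
    static Map<String, Long> total(Map<String, Long> job,
                                   Map<String, Long> map,
                                   Map<String, Long> reduce) {
        Map<String, Long> result = new TreeMap<>();
        addAll(result, job);
        addAll(result, map);
        addAll(result, reduce);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> job = new TreeMap<>();
        job.put("TASKS_LAUNCHED", 3L);      // hypothetical counter name
        Map<String, Long> map = new TreeMap<>();
        map.put("RECORDS", 10L);            // hypothetical counter name
        Map<String, Long> reduce = new TreeMap<>();
        reduce.put("RECORDS", 4L);
        System.out.println(total(job, map, reduce));
    }
}
```

The inputs are left unmodified, so the per-phase counters can still be reported separately after the total is computed.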


obtainNewMapTask

public org.apache.hadoop.mapred.Task obtainNewMapTask(TaskTrackerStatus tts,
                                                      int clusterSize,
                                                      int numUniqueHosts)
                                               throws IOException
Return a MapTask, if appropriate, to run on the given tasktracker

Throws:
IOException

obtainTaskCleanupTask

public org.apache.hadoop.mapred.Task obtainTaskCleanupTask(TaskTrackerStatus tts,
                                                           boolean isMapSlot)
                                                    throws IOException
Throws:
IOException

obtainNewLocalMapTask

public org.apache.hadoop.mapred.Task obtainNewLocalMapTask(TaskTrackerStatus tts,
                                                           int clusterSize,
                                                           int numUniqueHosts)
                                                    throws IOException
Throws:
IOException

obtainNewNonLocalMapTask

public org.apache.hadoop.mapred.Task obtainNewNonLocalMapTask(TaskTrackerStatus tts,
                                                              int clusterSize,
                                                              int numUniqueHosts)
                                                       throws IOException
Throws:
IOException
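
The three map-task methods above differ in locality: one variant is restricted to local tasks, one to non-local tasks, and obtainNewMapTask returns a task "if appropriate". The sketch below shows one plausible way such calls could relate, assuming the general call tries a local task first and that "if appropriate" means the result may be null. Both points are assumptions, and PendingMap and the obtain* methods here are hypothetical stand-ins, not the Hadoop implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

final class MapTaskPickSketch {
    // Hypothetical pending map task with the hosts that hold its input.
    static final class PendingMap {
        final String id;
        final Set<String> hosts;

        PendingMap(String id, String... hosts) {
            this.id = id;
            this.hosts = new HashSet<>(Arrays.asList(hosts));
        }
    }

    private final List<PendingMap> pending = new ArrayList<>();

    void add(PendingMap m) {
        pending.add(m);
    }

    // Removes and returns the first pending map whose locality matches, or null.
    private String take(String trackerHost, boolean wantLocal) {
        for (Iterator<PendingMap> it = pending.iterator(); it.hasNext();) {
            PendingMap m = it.next();
            if (m.hosts.contains(trackerHost) == wantLocal) {
                it.remove();
                return m.id;
            }
        }
        return null;
    }

    String obtainLocal(String trackerHost) {
        return take(trackerHost, true);
    }

    String obtainNonLocal(String trackerHost) {
        return take(trackerHost, false);
    }

    // Assumed policy: prefer a local task, fall back to a non-local one.
    String obtain(String trackerHost) {
        String local = obtainLocal(trackerHost);
        return local != null ? local : obtainNonLocal(trackerHost);
    }

    public static void main(String[] args) {
        MapTaskPickSketch job = new MapTaskPickSketch();
        job.add(new PendingMap("m0", "hostA"));
        job.add(new PendingMap("m1", "hostB"));
        System.out.println(job.obtain("hostB"));
    }
}
```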

obtainJobCleanupTask

public org.apache.hadoop.mapred.Task obtainJobCleanupTask(TaskTrackerStatus tts,
                                                          int clusterSize,
                                                          int numUniqueHosts,
                                                          boolean isMapSlot)
                                                   throws IOException
Return a CleanupTask, if appropriate, to run on the given tasktracker

Throws:
IOException

obtainJobSetupTask

public org.apache.hadoop.mapred.Task obtainJobSetupTask(TaskTrackerStatus tts,
                                                        int clusterSize,
                                                        int numUniqueHosts,
                                                        boolean isMapSlot)
                                                 throws IOException
Return a SetupTask, if appropriate, to run on the given tasktracker

Throws:
IOException

scheduleReduces

public boolean scheduleReduces()

obtainNewReduceTask

public org.apache.hadoop.mapred.Task obtainNewReduceTask(TaskTrackerStatus tts,
                                                         int clusterSize,
                                                         int numUniqueHosts)
                                                  throws IOException
Return a ReduceTask, if appropriate, to run on the given tasktracker. We don't have cache-sensitivity for reduce tasks, as they work on temporary MapRed files.

Throws:
IOException

completedTask

public boolean completedTask(TaskInProgress tip,
                             TaskStatus status)
A taskid assigned to this JobInProgress has reported in successfully.


kill

public void kill()
Kill the job and all its component tasks. This method should be called from the JobTracker and should return quickly, because it locks the JobTracker.


failedTask

public void failedTask(TaskInProgress tip,
                       TaskAttemptID taskid,
                       String reason,
                       TaskStatus.Phase phase,
                       TaskStatus.State state,
                       String trackerName)
Fail a task with a given reason, but without a status object. Assumes the JobTracker is locked on entry.

Parameters:
tip - The task's tip
taskid - The task id
reason - The reason that the task failed
trackerName - The task tracker the task failed on

getTaskInProgress

public TaskInProgress getTaskInProgress(TaskID tipid)
Return the TaskInProgress that matches the tipid.


findFinishedMap

public TaskStatus findFinishedMap(int mapId)
Find the task status of a map that has finished, given its id

Parameters:
mapId - the id of the map
Returns:
the task status of the completed task

getTaskCompletionEvents

public TaskCompletionEvent[] getTaskCompletionEvents(int fromEventId,
                                                     int maxEvents)
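
This method carries no description, but its parameter names suggest a paged read: start at fromEventId and return at most maxEvents events. The sketch below illustrates that windowing under that assumption. Events are modeled as strings, window is a hypothetical helper, and the behavior for out-of-range arguments (an empty result) is a choice made here, not documented behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CompletionEventWindowSketch {
    // Returns up to maxEvents entries starting at index fromEventId.
    static List<String> window(List<String> events, int fromEventId, int maxEvents) {
        if (fromEventId < 0 || maxEvents <= 0 || fromEventId >= events.size()) {
            return new ArrayList<>();
        }
        int count = Math.min(maxEvents, events.size() - fromEventId);
        return new ArrayList<>(events.subList(fromEventId, fromEventId + count));
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList("e0", "e1", "e2", "e3", "e4");
        int from = 0;
        List<String> page;
        // Page through the events two at a time until none remain.
        while (!(page = window(events, from, 2)).isEmpty()) {
            System.out.println(page);
            from += page.size();
        }
    }
}
```

A client polling for new events would keep the index of the last event it saw and pass it back as fromEventId on the next call.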

getJobID

public JobID getJobID()
Returns:
The JobID of this JobInProgress.

getSchedulingInfo

public Object getSchedulingInfo()

setSchedulingInfo

public void setSchedulingInfo(Object schedulingInfo)


Copyright © 2009 The Apache Software Foundation