check output folder
calculate splits
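application master receives progress and completion reports from tasks. it also requests containers for map tasks and reduce tasks from the resource manager. once a container is assigned to a task, the application master starts it by contacting the node manager.
if uber tasks are enabled (mapreduce.job.ubertask.enable), a small job runs its tasks inside the application master's JVM. a job counts as small when it has fewer than 10 mappers, at most one reducer, and an input size within one HDFS block (limits: mapreduce.job.ubertask.maxmaps, mapreduce.job.ubertask.maxreduces, mapreduce.job.ubertask.maxbytes).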
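A minimal sketch of the uber-task eligibility rule, assuming all three limits must hold together. The class and method names are hypothetical and this is not Hadoop's own code; the exact boundary on input size (here: no larger than one block) is also an assumption.

```java
// Sketch only: models the uber-task decision described in these notes.
final class UberTaskCheck {
    // Hypothetical helper. A job qualifies only if uber mode is enabled AND
    // every size limit is satisfied.
    static boolean isUberEligible(boolean enabled, int numMaps, int numReduces,
                                  long inputBytes, long blockSizeBytes) {
        return enabled
                && numMaps < 10                   // fewer than 10 mappers
                && numReduces <= 1                // at most one reducer
                && inputBytes <= blockSizeBytes;  // input fits within one block
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024;  // assumed 128 MB block size
        long tenMb = 10L * 1024 * 1024;
        System.out.println(isUberEligible(true, 3, 1, tenMb, blockSize));   // prints true
        System.out.println(isUberEligible(true, 12, 1, tenMb, blockSize));  // prints false
    }
}
```

Failing any one condition is enough to send the job down the normal path, where each task gets its own container.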
all map tasks must be completed before the sort phase of the reduce tasks can start.
resource requests are configurable on a per-job basis, see mapreduce.map.memory.mb, mapreduce.reduce.memory.mb, mapreduce.map.cpu.vcores and mapreduce.reduce.cpu.vcores.
when the job is completed, delete temp files, commit the job and archive the job history
- task failure (heartbeats to AM)
caused by a user function error or a JVM error
a failed task is attempted up to four times by default. configurable via mapreduce.map.maxattempts and mapreduce.reduce.maxattempts
mapreduce.map.failures.maxpercent and mapreduce.reduce.failures.maxpercent set the percentage of tasks that may fail without failing the whole job
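The two settings above can be read as two separate checks: one per task (attempts used up) and one per job (too many failed tasks). The sketch below models that reading; the class and method names are hypothetical, and the strict "greater than" comparison for the percentage is an assumption rather than a quote of Hadoop's code.

```java
// Sketch only: models the task-failure rules from these notes.
final class FailureTolerance {
    // A task is declared failed once all of its attempts have failed
    // (default limit in the notes: 4 attempts).
    static boolean taskFailed(int failedAttempts, int maxAttempts) {
        return failedAttempts >= maxAttempts;
    }

    // The job fails when the share of failed tasks exceeds the allowed
    // percentage. long arithmetic avoids overflow on very large jobs.
    static boolean jobFails(int failedTasks, int totalTasks, int maxFailuresPercent) {
        return (long) failedTasks * 100 > (long) maxFailuresPercent * totalTasks;
    }

    public static void main(String[] args) {
        System.out.println(taskFailed(4, 4));     // prints true
        System.out.println(jobFails(1, 100, 0));  // prints true: no failures tolerated
        System.out.println(jobFails(5, 100, 5));  // prints false: exactly at the limit
    }
}
```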
- application master failure (heartbeats to RM)
default number of attempts is 2. set per job by mapreduce.am.max-attempts, capped cluster-wide by yarn.resourcemanager.am.max-attempts
use job history to recover completed tasks
- node manager failure (heartbeats to RM)
a node manager can be blacklisted if the number of task failures for the application on that node exceeds the configured maximum, mapreduce.job.maxtaskfailures.per.tracker.
- resource manager failure (HA, standby resource manager)
all application info is persisted in ZooKeeper or another shared state store.
after the active resource manager fails, the standby reads that state and restarts all application masters
- shuffle and sort
- Map
number of partitions is the same as the number of reducer tasks
multiple spill files, one per spill. before each spill, a background thread sorts the buffer contents by partition and key, then runs the combiner function (if any) on the sorted output
a single output file remains after the map task is completed. the spill files are merged into one partitioned, sorted file.
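A sketch of the two map-side ideas above: assigning a key to one of the reducer partitions, and merging several sorted spills into one sorted output. The class and method names are hypothetical. The partition formula follows the usual hash-partitioning approach, and the merge is simplified: it handles plain sorted string lists held in memory, whereas real spill files are on disk and merged per partition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Sketch only: in-memory model of map-side partitioning and spill merging.
final class MapSideShuffle {
    // One partition per reduce task. Masking with Integer.MAX_VALUE clears the
    // sign bit so the result is never negative.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // K-way merge of already-sorted spills into a single sorted list.
    static List<String> mergeSpills(List<List<String>> spills) {
        // Each heap entry is {spill index, position in that spill}, ordered by
        // the record it points at.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
                (a, b) -> spills.get(a[0]).get(a[1]).compareTo(spills.get(b[0]).get(b[1])));
        for (int i = 0; i < spills.size(); i++) {
            if (!spills.get(i).isEmpty()) {
                heap.add(new int[] {i, 0});
            }
        }
        List<String> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] head = heap.poll();
            List<String> spill = spills.get(head[0]);
            merged.add(spill.get(head[1]));
            if (head[1] + 1 < spill.size()) {
                heap.add(new int[] {head[0], head[1] + 1});  // advance within the same spill
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> spills = Arrays.asList(
                Arrays.asList("apple", "pear"),
                Arrays.asList("banana", "cherry"));
        System.out.println(mergeSpills(spills));  // prints [apple, banana, cherry, pear]
        System.out.println(partitionFor("apple", 4));
    }
}
```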
- Reduce
- Configuration (tuning on different parameters, buffer size, spill percentage, background processes...)
- task execution
- speculative task
- output commit
// org.apache.hadoop.mapreduce.OutputCommitter (abridged)
public abstract class OutputCommitter {
    // job-level hooks: run once per job
    public abstract void setupJob(JobContext jobContext) throws IOException;
    public void commitJob(JobContext jobContext) throws IOException { }
    public void abortJob(JobContext jobContext, JobStatus.State state)
            throws IOException { }

    // task-level hooks: run once per task attempt
    public abstract void setupTask(TaskAttemptContext taskContext)
            throws IOException;
    public abstract boolean needsTaskCommit(TaskAttemptContext taskContext)
            throws IOException;
    public abstract void commitTask(TaskAttemptContext taskContext)
            throws IOException;
    public abstract void abortTask(TaskAttemptContext taskContext)
            throws IOException;
}
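To illustrate how the task-level hooks fit together, here is a small in-memory model of the commit protocol, assuming that at most one attempt per task may commit (which matters when speculative attempts run side by side). It does not use the Hadoop classes; the class, the string ids and the map-based "storage" are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: in-memory model of task commit, not a Hadoop OutputCommitter.
final class CommitSketch {
    private final Map<String, String> pending = new HashMap<>();    // attempt id -> temp output
    private final Map<String, String> committed = new HashMap<>();  // task id -> final output

    // An attempt writes to its own temporary location.
    void write(String attemptId, String output) {
        pending.put(attemptId, output);
    }

    // Only attempts that produced output need a commit.
    boolean needsTaskCommit(String attemptId) {
        return pending.containsKey(attemptId);
    }

    // Promote the attempt's output to the final location. If another attempt
    // of the same task already committed, this attempt is aborted instead.
    boolean commitTask(String taskId, String attemptId) {
        if (!needsTaskCommit(attemptId) || committed.containsKey(taskId)) {
            abortTask(attemptId);
            return false;
        }
        committed.put(taskId, pending.remove(attemptId));
        return true;
    }

    // Discard the attempt's temporary output.
    void abortTask(String attemptId) {
        pending.remove(attemptId);
    }

    String committedOutput(String taskId) {
        return committed.get(taskId);
    }

    public static void main(String[] args) {
        CommitSketch c = new CommitSketch();
        c.write("task0_attempt0", "result from original attempt");
        c.write("task0_attempt1", "result from speculative attempt");
        System.out.println(c.commitTask("task0", "task0_attempt0"));  // prints true
        System.out.println(c.commitTask("task0", "task0_attempt1"));  // prints false
    }
}
```

Writing to a per-attempt location and promoting it on commit is what keeps a failed or duplicate attempt from leaving partial output behind.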