Evaluation of Results using Mean Average Precision
There are several reasons why the evaluation of results on datasets like the Pascal-VOC and ILSRVC is hard. It is well described in Pascal VOC 2009 challenge paper. Here are some of these: Images may contain instances of multiple classes so it is not sufficient to simply ask, “Which one of the m classes does this image belong to?” and then use the predicted result to compare with the actual....