evaluate.evaluate_topk

pai4sk.simsearch.evaluate.evaluate_topk(D, K, labels_X, labels_Y=None)

Evaluates the search accuracy for the top K samples.

Parameters
  • D (ndarray, shape (n_samples_x, n_samples_y)) – A two-dimensional distance matrix with distances between two datasets (X and Y)

  • K (int) – Number of top samples for which we want to calculate the accuracy of the similarity search algorithm

  • labels_X (array-like, shape (n_samples_x,)) – Labels corresponding to dataset X

  • labels_Y (array-like, shape (n_samples_y,), optional) – Labels corresponding to dataset Y. Defaults to None

Notes

Set labels_Y = None to evaluate the precision of the similarity search algorithm for documents/images within a single dataset.

Returns

  • k_vec (ndarray, shape (log2(K)+1,)) – The top-K cut-offs at which the precision values are calculated, in powers of 2 (for example 1, 2, 4, 8, 16, 32 when the input argument K is 32)

  • prec_vec (ndarray, shape (log2(K)+1,)) – The precision value corresponding to each entry of k_vec

  • topk_indices (ndarray, shape (n_samples_x, K)) – Indicates the number K for which we store the precision values

  • topk_values (ndarray, shape (n_samples_x, K)) – Indicates the precision values corresponding to the topk_indices
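
Example

The following is a minimal NumPy sketch of how return values with these shapes could be computed from a distance matrix. It is an illustration under stated assumptions, not the pai4sk implementation. The helper name topk_precision_sketch is hypothetical. The sketch assumes that smaller distances mean closer matches, that precision at k is the fraction of the k nearest neighbours sharing the query's label, and (inferred from the names and shapes only) that topk_indices and topk_values hold the neighbour indices and their distances. It also assumes that labels_Y = None means X is compared against itself, and it does not exclude self-matches in that case.

```python
import numpy as np


def topk_precision_sketch(D, K, labels_X, labels_Y=None):
    """Hypothetical helper: illustrative only, NOT the pai4sk implementation."""
    D = np.asarray(D, dtype=float)
    labels_X = np.asarray(labels_X)
    # Assumption: labels_Y=None means the search is within a single dataset.
    labels_Y = labels_X if labels_Y is None else np.asarray(labels_Y)

    # Indices of the K smallest distances in each row, nearest first.
    topk_indices = np.argsort(D, axis=1, kind="stable")[:, :K]
    # Distances that correspond to those indices.
    topk_values = np.take_along_axis(D, topk_indices, axis=1)

    # hits[i, j] is True when the j-th nearest neighbour of X[i] shares its label.
    hits = labels_Y[topk_indices] == labels_X[:, None]

    # Cut-offs 1, 2, 4, ... up to K (assumes K >= 1).
    k_vec = 2 ** np.arange(int(np.floor(np.log2(K))) + 1)
    # Mean fraction of label matches among the first k neighbours.
    prec_vec = np.array([hits[:, :k].mean() for k in k_vec])
    return k_vec, prec_vec, topk_indices, topk_values


# Toy data: 2 query samples (X) against 4 candidate samples (Y).
D = np.array([[0.1, 0.9, 0.5, 0.7],
              [0.8, 0.2, 0.6, 0.3]])
labels_X = [0, 1]
labels_Y = [0, 1, 1, 0]

k_vec, prec_vec, topk_indices, topk_values = topk_precision_sketch(
    D, 4, labels_X, labels_Y
)
print(k_vec)     # cut-offs in powers of 2
print(prec_vec)  # precision at each cut-off
```

With K = 4 the sketch returns three cut-offs (1, 2, 4), matching the log2(K)+1 length given for k_vec and prec_vec above.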