evaluate.evaluate_topk¶

pai4sk.simsearch.evaluate.evaluate_topk(D, K, labels_X, labels_Y=None)¶

Evaluates search accuracy for the top K samples

Parameters:

D (ndarray, shape (n_samples_x, n_samples_y)) – A two-dimensional distance matrix with distances between two datasets (X and Y)
K (int) – Number of top samples for which we want to calculate the accuracy of the similarity search algorithm
labels_X (array-like, shape (n_samples_x,)) – Labels corresponding to dataset X
labels_Y (array-like, shape (n_samples_y,)) – Labels corresponding to dataset Y

Notes

labels_Y = None if we want to evaluate the precision of the similarity search algorithm for documents/images within a: single dataset

Returns:

k_vec (ndarray, shape (log(K)+1,)) – Indicates the top-K number for which we are calculating the precision values (in powers of 2: 1,2,4,8,16,32 if the value of input argument K is 32)
prec_vec (ndarray, shape (log(K)+1,)) – Indicates the corresponding precision values
topk_indices (ndarray, shape (n_samples_x, K)) – Indicates the number K for which we store the precision values
topk_values (ndarray, shape (n_samples_x, K)) – Indicates the precision values corresponding to the topk_indices