All Questions

11 votes
3 answers
10k views

How to get cosine distance between two vectors in postgres?

I am wondering if there is a way to get cosine distance of two vectors in postgres. For storing vectors I am using CUBE data type. Below is my table definition: test=# \d vectors ...
Anant's user avatar
  • 3,077
5 votes
1 answer
3k views

Best way to identify dissimilarity: Euclidean Distance, Cosine Distance, or Simple Subtraction?

I'm new to data science and am currently learning different techniques that I can do with Python. Currently, I'm trying it out with Spotify's API for my own playlists. The goal is to find the most ...
Mustafa's user avatar
  • 337
4 votes
1 answer
2k views

Calculating similarity based on attributes

My objective is to calculate the degree of similarity between two users based on their attributes. For instance let's consider a player and consider age, salary, and points as attributes. Also I ...
user1010101's user avatar
  • 2,088
3 votes
4 answers
7k views

How to find most optimal number of clusters with K-Means clustering in Python

I am new to clustering algorithms. I have a movie dataset with more than 200 movies and more than 100 users. All the users rated at least one movie. A value of 1 for good, 0 for bad and blank if the ...
ToBeEXP's user avatar
  • 61
3 votes
1 answer
2k views

Does Euclidean Distance measure the semantic similarity?

I want to measure the similarity between sentences. Can I use sklearn and Euclidean distance to measure the semantic similarity between sentences? I have also read about cosine similarity. Can someone ...
jenyK's user avatar
  • 71
3 votes
1 answer
1k views

Distance calculation in mongodb aggregate using cosine

I am saving face embeddings as numpy arrays in mongodb and using this aggregate to find the distance between two arrays using the euclidean algorithm. Can someone please help to calculate the distance using cosine? ...
Archish's user avatar
  • 870
3 votes
1 answer
617 views

Calculate Distance Metric between Homomorphic Encrypted Vectors

Is there a way to calculate a distance metric (euclidean or cosine similarity or manhattan) between two homomorphically encrypted vectors? Specifically, I'm looking to generate embeddings of documents ...
Brian Behe's user avatar
3 votes
1 answer
2k views

How to calculate weighted similarity with scipy.spatial.distance.cosine?

From the function definition: https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cosine.html scipy.spatial.distance.cosine(u, v, w=None) but my code raised some errors: from ...
DataHolic's user avatar
3 votes
0 answers
3k views

Normalising Data to use Cosine Distance in Kmeans (Python)

I am currently solving a problem where I have to use Cosine distance as the similarity measure for Kmeans clustering. However, the standard Kmeans clustering package (from Sklearn package) uses ...
MSalty's user avatar
  • 4,344
2 votes
0 answers
631 views

Find top five similar image using cosine similarity

I have a feature list of images with length n. feature_list -> [array[img1], array[img2]....n] I can find top 5 using sklearn.neighbors.NearestNeighbors. by following neighbors = NearestNeighbors(...
siam's user avatar
  • 109
2 votes
1 answer
4k views

Euclidean Distance or cosine similarity? [closed]

I was reading Similarity Measure and suddenly my whole world was falling apart. I have implemented a search engine using a clustering technique. For clustering, I used K-Means, which has distance ...
Hooli's user avatar
  • 721
1 vote
1 answer
3k views

How to calculate Cosine similarity and Euclidean distance between two tensors in TF2.0?

I have two tensors (OQ, OA) with shapes as below at the end of last layers in my model. OQ shape: (1, 600) OA shape: (1, 600) These tensors are of type 'tensorflow.python.framework.ops.Tensor' How ...
Raghu's user avatar
  • 457
1 vote
2 answers
2k views

Get indices of results from scipy.pdist(myArray,metric="jaccard") to map back to original array?

I am trying to calculate jaccard similarity y= 1 - scipy.spatial.distance.pdist(X,metric="jaccard") X is a m x n matrix and I get a 1-D array of size m choose 2 as a result of this function. How ...
anonuser0428's user avatar
1 vote
1 answer
1k views

Finding most similar items by euclidean and cosine

How do I go about finding similarities in R? In particular, the similarity metrics I care most about are cosine and a KNN-# value. I guess the key aspect of this is so that the data comes out in a ...
runningbirds's user avatar
  • 6,425
1 vote
2 answers
1k views

Correctly interpreting Cosine Angular Distance Similarity & Euclidean Distance Similarity

As an example, let's say I have a very simple data set. I am given a csv with three columns, user_id, book_id, rating. The rating can be any number 0-5, where 0 means the user has NOT rated the book. ...
Wendell Blatt's user avatar
1 vote
1 answer
183 views

How to convert TS-SS result to similarity measure between 0 - 1?

I'm currently developing a question plugin for some LMS that auto grade the answer based on the similarity between the answer and answer key with cosine similarity. But lately, I found that there is a ...
newtocoding's user avatar
1 vote
0 answers
221 views

Pyspark Euclidean and Cosine distance between 2 arrays

I have a pyspark data frame with data shaped like the following (data made up): Dataframe I would like to calculate various distance metrics (such as cosine, euclidean) between the 2 vectors, vec1 ...
mlman's user avatar
  • 11
1 vote
0 answers
60 views

Distance calculation when NaN is the maximum possible distance

I really tried my best to find a solution to my problem. Given that I have 2 customers with several attributes as given below; cust1 = [4.0, 75.0, 2.0, 155.0, 58.0, 3.0, 7.0, 4.0, 0.0, 4.0, 0.0, 1.0, ...
Hande's user avatar
  • 21
1 vote
1 answer
782 views

Using relative frequency for euclidean distance

How do I calculate the euclidean distance (similarity) between two documents, e.g. D1 and D2, using relative frequency? Below is an example of both cosine and euclidean distance between two documents ...
1 vote
2 answers
225 views

Similarity of documents function

I am trying to create matrices for the cosine and euclidean distances of a document. I am not too sure how I would approach this question. Any advice would be appreciated. Thanks. The function takes the ...
nickp's user avatar
  • 43
0 votes
1 answer
450 views

Weighted Euclidean Distance while Merging Feature Vectors?

I have two groups of features (describing an image, in a machine learning context). The first group A, consisting of 3 features, and group B consisting of 15 features. A = [f1, f2, f3] B = [f4, f5, ..,...
Franc Weser's user avatar
0 votes
1 answer
874 views

Euclidean vs Cosine for text data

If I use a tf-idf feature representation (or just document length normalization), are euclidean distance and (1 - cosine similarity) basically the same? All textbooks I have read and other forums, ...
Soumyajit's user avatar
  • 445
0 votes
1 answer
429 views

R studio: Is there a way to calculate the cosine & euclidean distance between 2 time series with a single & multiple variables of interest?

Let's say I have time series data of City A, City B, City C & City D that looks like this: +------------+--------+--------+--------+--------+ | Dates | City A | City B | City C | City D | +---...
DPatrick's user avatar
  • 431
0 votes
1 answer
373 views

Similarity Metrics

I am trying to research different metrics and found many similarity metrics: Euclidean distance, Dynamic Time Warping, Edit Distance with Real Penalty, DISSIM, Sequence Weighted Alignment model, ...
user2359877's user avatar
0 votes
1 answer
189 views

Why do my t-SNE plots with euclidean and cosine distances look similar

I have a question about two t-SNE plots I made. I have a set of 850 articles for which I wanted to check which articles are similar to each other. This was done by pre-processing the articles first, ...
HenkieTee's user avatar
0 votes
0 answers
166 views

Smart Semantic Category Clustering Using R

Got 2 data frames, did the below: library(tm) v<- Corpus(VectorSource(as.vector(bothsources[,1]))) inspect(head(v,3)) v <- tm_map(v, removeWords, stopwords("english")) v <- tm_map(v, ...
Wiam Nasr's user avatar
0 votes
0 answers
816 views

Is there any package in R to use jaccard or cosine distance for k-medoid clustering?

I am using function pam in package cluster for partitioning around medoids. pam(x, k, diss = inherits(x, "dist"), metric = "euclidean", medoids = NULL, stand = FALSE, cluster.only = FALSE, ...
Hadij's user avatar
  • 4,082
0 votes
1 answer
76 views

Good similarity measure for comparing users

I want to compare users based on responses to 10 questions. My original idea was to resolve each question to an integer [1, 5], but this idea won't work all the time. For example: vec1 = [1,1,1,1,1,1,...
Jeremy Fisher's user avatar
0 votes
1 answer
1k views

Are there other useful similarity or distance metrics?

I'm developing an approximate computation system. Defining how similar two objects are is a basic operation in such a system. Usually in computer science and math, similarity is a synonym of ...
justHelloWorld's user avatar
0 votes
2 answers
2k views

Measuring distance between vectors

I have a set of 300,000 or so vectors which I would like to compare in some way, and given one vector I want to be able to find the closest vector. I have thought of three methods. Simple Euclidean ...
halfdanr's user avatar
  • 393