All Questions
Tagged with euclidean-distance similarity
32
questions
44
votes
2
answers
21k
views
Compare similarity algorithms
I want to use string similarity functions to find corrupted data in my database.
I came upon several of them:
Jaro,
Jaro-Winkler,
Levenshtein,
Euclidean and
Q-gram,
I wanted to know what is ...
5
votes
1
answer
3k
views
Best way to identify dissimilarity: Euclidean Distance, Cosine Distance, or Simple Subtraction?
I'm new to data science and am currently learning different techniques that I can do with Python. Currently, I'm trying it out with Spotify's API for my own playlists.
The goal is to find the most ...
5
votes
2
answers
3k
views
r distance between rows
I apologize; this is my attempt at redeeming myself after a disastrous earlier attempt. Now I have a bit more clarity, so here I go again.
My goal is to find rows that are similar. So first I am ...
4
votes
1
answer
2k
views
Calculating similarity based on attributes
My objective is to calculate the degree of similarity between two users based on their attributes. For instance let's consider a player and consider age, salary, and points as attributes.
Also I ...
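One possible approach to the attribute-based similarity described above, as a minimal sketch only: rescale each attribute to the range [0, 1] so that a large-valued attribute such as salary does not dominate, then turn the Euclidean distance into a score with 1 / (1 + d). The player values, the helper names `min_max_scale` and `similarity`, and the choice of min-max scaling are all assumptions made for illustration, not taken from the question.

```python
import math

def min_max_scale(rows):
    """Rescale each column of rows to [0, 1] so no attribute dominates."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def similarity(a, b):
    """Map Euclidean distance into (0, 1]; 1.0 means identical vectors."""
    return 1.0 / (1.0 + math.dist(a, b))

# Hypothetical players, each described as (age, salary, points).
players = [(25, 50000, 10), (30, 60000, 20), (35, 70000, 30)]
scaled = min_max_scale(players)

print(similarity(scaled[0], scaled[1]))  # closer pair, higher score
print(similarity(scaled[0], scaled[2]))  # farther pair, lower score
```

Without the scaling step, the salary difference would swamp age and points in the distance, which is why some normalization is usually worth considering before comparing mixed-unit attributes.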
3
votes
3
answers
8k
views
Measuring similarity between two RGB images in Python
I have two RGB images of the same size, and I would like to compute a similarity metric. I thought of starting out with Euclidean distance:
import scipy.spatial.distance as dist
import cv2
im1 = cv2....
2
votes
4
answers
9k
views
How do I create a similarity matrix in MATLAB?
I am working towards comparing multiple images. I have this image data as column vectors of a matrix called "images." I want to assess the similarity of images by first computing their Euclidean ...
1
vote
3
answers
919
views
Find Euclidean distance of two arrays of different lengths
I want to find the Euclidean distance to check the similarity of strings.
As the image above shows, the painting object field holds many image types in the database. Images are displayed using this paining_object ...
1
vote
1
answer
1k
views
Extract distances after running scipy.spatial.distance.pdist
I have a Pandas data frame (see small example below). I want to calculate Euclidean distances between observations (rows) based on their values in 3 columns (features). I am using scipy.spatial....
1
vote
1
answer
926
views
I just started using the Eigen matrix algebra library and aim to create a similarity matrix of a dataset; suggestions?
I am trying to create a similarity matrix for a dataset with the Eigen library. I have read the CSV file into an Eigen matrix, but now, as a MATLAB user, I am looking for something like bsxfun or something to ...
1
vote
1
answer
625
views
Javascript Clusterfck Metric
So I am converting an old data visualization to a new platform and I am a little bit stuck on their community sorting feature. In the original code, it looks like the author uses agglomerative ...
1
vote
1
answer
2k
views
Pearson vs Euclidean vs Manhattan Results
Using Python 3.6. I am not getting logical results when using Manhattan distance for similarity measurement. Even comparing to the results from Pearson and Euclidean correlation, the units for ...
1
vote
1
answer
1k
views
Finding most similar items by euclidean and cosine
How do I go about finding similarities in R? In particular, the similarity metrics I care most about are cosine and a KNN-# value. I guess the key aspect of this is so that the data comes out in a ...
1
vote
2
answers
1k
views
Correctly interpreting Cosine Angular Distance Similarity & Euclidean Distance Similarity
As an example, let's say I have a very simple data set. I am given a csv with three columns, user_id, book_id, rating. The rating can be any number 0-5, where 0 means the user has NOT rated the book.
...
1
vote
2
answers
362
views
Euclidean distance between posts based on tags
I am playing with the Euclidean distance example from the book Programming Collective Intelligence,
# Returns a distance-based similarity score for person1 and person2
def sim_distance(prefs,person1,...
1
vote
1
answer
183
views
How to convert TS-SS result to similarity measure between 0 - 1?
I'm currently developing a question plugin for some LMS that auto grade the answer based on the similarity between the answer and answer key with cosine similarity. But lately, I found that there is a ...
1
vote
0
answers
36
views
Why does the result of ItemSimilarityJob lack some similarities of itemId-pair?
Given that I have the following ratings.csv
userId,itemId,rating
1,1,1
1,2,2
1,3,3
2,2,4
2,3,2
2,5,4
2,6,5
3,1,5
3,3,1
3,6,2
4,4,4
Using org.apache.mahout.cf.taste.hadoop.item.RecommenderJob, we have
...
1
vote
0
answers
139
views
How to calculate the similarity of two images
I'm trying to examine two images for similarity using SIFT. The result should be a percentage.
I have understood how to extract the features and descriptors from the images using OpenCV ...
1
vote
1
answer
2k
views
Euclidean distance and similarity
My teacher has given me this set of questions as homework and I don't know if I'm understanding it right.
The following customers have rated a number of DVDs as shown in the table. Calculate the ...
1
vote
0
answers
63
views
How to find similar wiki pages with n-gram?
Let's suppose there's a wiki, and for every wiki page I'd like to show a widget - with the list of similar pages.
It could be done in two steps:
Step 1 - convert each page into feature vector with ...
1
vote
1
answer
782
views
Using relative frequency for Euclidean distance
How do I calculate the Euclidean distance (similarity) between two documents, e.g. D1 and D2, using relative frequency?
Below is an example of both cosine and euclidean distance between two documents ...
1
vote
2
answers
225
views
Similarity of documents function
I am trying to create matrices for the cosine and Euclidean distances of a document. I'm not too sure how I would approach this question. Any advice would be appreciated. Thanks.
The function takes the ...
0
votes
1
answer
160
views
How to go from a vector to a similarity matrix?
I would like to reconstruct a similarity matrix between two vectors from a vector containing the similarity between each pair of elements in the two vectors. Does anyone know how I could do it?
To ...
0
votes
1
answer
2k
views
How to change the code to find the Euclidean distance (not cosine) between words in a word2vec implementation?
The following code when run gives the cosine distance between two words.
model.wv.distance('word1','word2')
How do I find the euclidean distance between two words?
I am using gensim for word2vec ...
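A minimal sketch of one way to get a Euclidean distance here, assuming each word's vector can be looked up directly. In gensim, `model.wv['word1']` returns the vector for a word; the toy `vectors` dict below is a hypothetical stand-in for that lookup, and its values are made up for illustration rather than taken from any trained model.

```python
import math

# Hypothetical stand-in for model.wv: maps each word to its embedding.
vectors = {
    "word1": [1.0, 2.0, 3.0],
    "word2": [4.0, 6.0, 3.0],
}

def euclidean_distance(word_a, word_b, lookup=vectors):
    """Return the straight-line (L2) distance between two word vectors."""
    return math.dist(lookup[word_a], lookup[word_b])

print(euclidean_distance("word1", "word2"))  # 5.0
```

With a real model, passing `lookup=model.wv` should give the same kind of result, since the function only needs item access by word. Note that Euclidean distance is sensitive to vector length, whereas cosine distance is not, so the two can rank word pairs differently.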
0
votes
1
answer
567
views
Measuring the distance between two relative frequency vectors
I am having a problem choosing an adequate distance function to measure the similarity (dissimilarity) between two relative frequency vectors.
More specifically, I am using shape feature vectors ...
0
votes
1
answer
582
views
Proper similarity measure for clustering
I have problems in finding a proper similarity measure for clustering. I have around 3000 arrays of sets, where each set contains features of certain domain (e.g., number, color, days, alphabets, etc)....
0
votes
1
answer
2k
views
Find the most similar row to user input from pandas dataframe
I want to find the most similar row to user input from my dataset.
My dataset looks like this:
And this is the user input:
I used scipy and sklearn with a lot of distance metrics (euclidean, ...
0
votes
0
answers
149
views
Coefficient of Euclidean Distance
I have been trying to calculate correlation coefficient (say r) and euclidean distance (say d) between two random variables X and Y. It is known that -1 <= r <= 1, whereas d >= 0. To compare ...
0
votes
0
answers
34
views
How can I look for similarities across an entire python dataframe?
Suppose I have the following dataframe:
FG% FT% 3P%
Player A .56 .80 .45
Player B .22 .60 .20
Player C .48 .71 .39
etc...
I'd like to iterate over each row (player) to find out ...
0
votes
0
answers
816
views
Is there any package in R to use jaccard or cosine distance for k-medoid clustering?
I am using function pam in package cluster for partitioning around medoids.
pam(x, k, diss = inherits(x, "dist"), metric = "euclidean",
medoids = NULL, stand = FALSE, cluster.only = FALSE,
...
0
votes
0
answers
43
views
nested transformations apache spark
I need some help with my code, which doesn't respond.
I need to compute similarities between items mutually based on their ratings, these similarities will be used to construct the similarity matrix....
0
votes
1
answer
3k
views
Detecting a black/blank frame in video using OpenCV
I'm using OpenCV 2.4.2 VideoCapture class to grab frames from multiple videos and my aim is to compare the frames between videos to retrieve similar videos (visually similar).
I'm facing two issues. ...
-1
votes
1
answer
463
views
Item Based Similarity Metric
I am using Apache Mahout to write an item-based recommender (based on similar item ratings by users) and I was wondering which of the following two similarity metrics would be the best to use:
...