All Questions

21 votes
3 answers
42k views

Distance calculation between rows in Pandas Dataframe using a distance matrix

I have the following Pandas DataFrame: In [31]: import pandas as pd sample = pd.DataFrame({'Sym1': ['a','a','a','d'],'Sym2':['a','c','b','b'],'Sym3':['a','c','b','d'],'Sym4':['b','b','b','a']},index=[...
Clayton • 1,545
11 votes
2 answers
18k views

Find euclidean distance from a point to rows in pandas dataframe

I have a dataframe id lat long 1 12.654 15.50 2 14.364 25.51 3 17.636 32.53 5 12.334 25.84 9 32.224 15.74 I want to find the euclidean distance of these ...
Shubham R • 7,562
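
One way to approach this kind of question is to subtract the reference point from the coordinate columns and take a row-wise norm. The sketch below is a minimal illustration, not the accepted answer: the reference point `point` is a hypothetical value (the excerpt is truncated before naming one), and treating `lat`/`long` as plane coordinates is an assumption that ignores the Earth's curvature.

```python
import numpy as np
import pandas as pd

# Rows copied from the excerpt above.
df = pd.DataFrame({
    "id": [1, 2, 3, 5, 9],
    "lat": [12.654, 14.364, 17.636, 12.334, 32.224],
    "long": [15.50, 25.51, 32.53, 25.84, 15.74],
})

# Hypothetical reference point (lat, long); not given in the excerpt.
point = np.array([12.654, 15.50])

# Row-wise Euclidean distance from every row to the reference point.
coords = df[["lat", "long"]].to_numpy()
df["dist"] = np.linalg.norm(coords - point, axis=1)
```
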
10 votes
1 answer
13k views

Compute Euclidean distance between rows of two pandas dataframes

I have two pandas dataframes d1 and d2 that look like these: d1 looks like: output value1 value2 value2 1 100 103 87 1 201 97.5 88.9 1 ...
j1897 • 1,547
8 votes
2 answers
16k views

Calculating pairwise Euclidean distance between all the rows of a dataframe

How can I calculate the Euclidean distance between all the rows of a dataframe? I am trying this code, but it is not working: zero_data = data distance = lambda column1, column2: pd.np.linalg.norm(...
Quicklearner.gk
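
A hedged sketch of one possible approach: the `pd.np` alias used in the excerpt is not available in recent pandas releases, so NumPy is imported directly here. The DataFrame `data` below is made-up illustration data, and broadcasting is just one option (it builds an n × n × k intermediate array, so it suits small frames).

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data; the original `data` is not shown in the excerpt.
data = pd.DataFrame({"a": [0.0, 3.0, 0.0], "b": [0.0, 4.0, 1.0]})

values = data.to_numpy(dtype=float)
# diff[i, j] holds row i minus row j for every pair of rows.
diff = values[:, None, :] - values[None, :, :]
# Square, sum over the feature axis, and take the root.
dist = np.sqrt((diff ** 2).sum(axis=-1))

# Label the matrix with the original row index on both axes.
distance = pd.DataFrame(dist, index=data.index, columns=data.index)
```
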
5 votes
1 answer
3k views

Best way to identify dissimilarity: Euclidean Distance, Cosine Distance, or Simple Subtraction?

I'm new to data science and am currently learning different techniques that I can do with Python. Currently, I'm trying it out with Spotify's API for my own playlists. The goal is to find the most ...
Mustafa • 337
4 votes
2 answers
4k views

calculating average distance of nearest neighbours in pandas dataframe

I have a set of objects and their positions over time. I would like to get the distance between each car and their nearest neighbour, and calculate an average of this for each time point. An example ...
UserR6 • 503
4 votes
4 answers
4k views

How to apply euclidean distance function to a groupby object in pandas dataframe?

I have a set of objects and their positions over time. I would like to get the average distance between objects for each time point. An example dataframe is as follows: time = [0, 0, 0, 1, 1, 2, 2] x ...
UserR6 • 503
3 votes
3 answers
1k views

Calculate euclidean distance between groups in a data frame

I have weekly data for various stores in the following form: pd.DataFrame({'Store':['S1', 'S1', 'S1', 'S2','S2','S2','S3','S3','S3'], 'Week':[1, 2, 3,1,2,3,1,2,3], 'Sales' :...
bakas • 323
3 votes
2 answers
2k views

Fastest way to calculate the shortest (euclidean) distance between points, in pandas dataframe

Consider the following pandas dataframe: print(df) Id X Y Type X of Closest Y of Closest 0 201 73.91 34.84 A NaN NaN 1 201 74.67 32.64 A ...
MRHarv • 503
2 votes
2 answers
2k views

Pairwise Euclidean distance with pandas ignoring NaNs

I start with a dictionary, which is the way my data was already formatted: import pandas as pd dict2 = {'A': {'a':1.0, 'b':2.0, 'd':4.0}, 'B':{'a':2.0, 'c':2.0, 'd':5.0}, 'C':{'b':1.0,'c':2.0, 'd':4....
Jabernet • 401
2 votes
2 answers
436 views

How to get minimum values in dataframe below a certain threshold?

I have 2 dataframes in pandas containing locational information of cars and trees. df1 x y car 3 216 13 4 218 12 ...
UserR6 • 503
2 votes
1 answer
111 views

Average distance within group in pandas

I have a dataframe like this df = pd.DataFrame({ 'id': ['A','A','B','B','B'], 'x': [1,1,2,2,3], 'y': [1,2,2,3,3] }) The output I want is the average distance for each point in the group, ...
d_frEak • 440
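
The excerpt is cut off before the expected output, so the sketch below assumes one reading: for every point, the mean Euclidean distance to the other points sharing its `id`. The column name `avg_dist` is made up for illustration.

```python
import numpy as np
import pandas as pd

# DataFrame copied from the excerpt above.
df = pd.DataFrame({
    "id": ["A", "A", "B", "B", "B"],
    "x": [1, 1, 2, 2, 3],
    "y": [1, 2, 2, 3, 3],
})

# Hypothetical output column, filled group by group.
df["avg_dist"] = np.nan
for _, group in df.groupby("id"):
    pts = group[["x", "y"]].to_numpy(dtype=float)
    n = len(pts)
    if n < 2:
        continue  # a lone point has no neighbours; leave NaN
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Each row sum includes the zero self-distance, so divide by n - 1.
    df.loc[group.index, "avg_dist"] = dist.sum(axis=1) / (n - 1)
```
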
2 votes
2 answers
524 views

Get most similar words for matrix of word vectors

So I computed a matrix of word vectors manually using keras which looks like this: >>> word_embeddings 0 1 2 3 movie 0.007964 0.004251 -0....
Zwiebak • 354
2 votes
1 answer
716 views

How to calculate sum of Euclidean distances from one datapoint to all other datapoints from pandas dataframe?

I have the following pandas dataframe: import pandas as pd import math df = pd.DataFrame() df['x'] = [2, 1, 3] df['y'] = [2, 5, 6] df['weight'] = [11, 12, 13] print(df) x y weight 0 ...
arizamoona
2 votes
1 answer
228 views

Applying Euclidean distance between two separate pandas dataframes

I have two dataframes of the same size, 100 rows in each. I want to calculate the Euclidean distance between the two dataframes and return the results ("the distance") in another ...
MohammedE
2 votes
0 answers
557 views

How can I improve the silhouette score of my k-means clustering

I have a dataset with 18000 lines about some Customers, like this: and I am trying to do some clustering using k-means algorithm. Since I have both categorical and continuous variables I created some ...
Fábio Pires
2 votes
1 answer
3k views

Calculate Euclidean Distance for Latitude and Longitude - Pandas DataFrame Python [duplicate]

I have a pandas df of origin and destination latitude and longitude. df = pd.DataFrame({'orig_lat': [32.8111, 34.3424], 'orig_long': [-122.2221,-132.2133], 'dest_lat': [33.2231, 35.3394], '...
Jos Butler
1 vote
2 answers
660 views

How to optimize my code to calculate Euclidean distance

I am trying to find the Euclidean distance between two points. I have around 13000 rows in a DataFrame. I have to find the Euclidean distance for each row against all 13000 rows and ...
Mahsaga • 31
1 vote
2 answers
2k views

Finding pairs of latitude and longitude within a certain radius in Python

Given a dataframe df as follows: id location lon lat 0 1 Onyx Spire 116.35425 39.87760 1 2 Unison Lookout 116.44333 39.93237 2 3 ...
ah bon • 9,697
1 vote
3 answers
1k views

Find all shortest Euclidean distances between two groups of point coordinates

I have a Pandas DataFrame, where columns X1, Y1 have point coordinates for the first group of coordinates and columns X2, Y2 have point coordinates for the second group of coordinates. Both groups are ...
Dimon • 436
1 vote
2 answers
444 views

Finding euclidean distance from multiple mean vectors

This is what I am trying to do - I was able to do steps 1 to 4. Need help with steps 5 onward Basically for each data point I would like to find euclidean distance from all mean vectors based upon ...
user2543622 • 6,228
1 vote
1 answer
1k views

Extract distances after running scipy.spatial.distance.pdist

I have a Pandas data frame (see small example below). I want to calculate Euclidean distances between observations (rows) based on their values in 3 columns (features). I am using scipy.spatial....
user3245256 • 1,918
1 vote
3 answers
2k views

Calculating euclidean distance from a dataframe with several column features

I have a dataframe like below and I need to calculate the euclidean distance. a,b,c,d,e 10,11,13,14,9 11,12,14,15,10 12,13,15,16,11 13,14,16,17,12 14,15,17,18,13 15,16,18,19,14 16,17,19,20,15 17,18,20,...
GKC • 447
1 vote
1 answer
191 views

Calculating and using Euclidean Distance in Python

I am trying to calculate the Euclidean Distance between two datasets in python. I can do this using the following: np.linalg.norm(df-signal) With df and signal being my two datasets. This returns a ...
Darragh MacKenna
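
For context, a small sketch with made-up data: called without an `axis` argument on a 2-D input, `np.linalg.norm` returns a single number (the Frobenius norm of the whole difference), whereas `axis=1` gives one distance per row. The frames `df` and `signal` below are hypothetical stand-ins with matching shape and labels.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the two datasets named in the excerpt.
df = pd.DataFrame({"a": [1.0, 4.0], "b": [2.0, 6.0]})
signal = pd.DataFrame({"a": [1.0, 1.0], "b": [2.0, 2.0]})

# Element-wise difference as a plain NumPy array.
diff = (df - signal).to_numpy()

total = np.linalg.norm(diff)            # one scalar for the whole table
per_row = np.linalg.norm(diff, axis=1)  # one distance per row
```
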
1 vote
1 answer
461 views

How to adjust this code to also Return second and third "Nearest Neighbors"?

Based on this code from calculating average distance of nearest neighbours in pandas dataframe, how can I adjust it so that it returns the second and third nearest neighbor into new columns? (Or ...
1 vote
1 answer
428 views

Find distance between rows in pandas dataframe but with reference to 1 row

In this pandas dataframe: y_train feat1 feat2 0 9.596113 -7.900107 1 -1.384157 2.685313 2 -8.211954 5.214797 How do I go about adding a "distance from Class 0" column at ...
Joe • 387
1 vote
0 answers
54 views

Manual kth Nearest Neighbor Euclidean Distance

I have to modify the following code in order to use 1, 3, and 5 neighbors and print the accuracy of each one. I can not use the sklearn library KNeighborsClassifier so I am stuck because I don't know ...
Jaime • 25
1 vote
2 answers
161 views

A windowed operation on a pandas data frame to list Euclidean distance from the previous n entries

I have a sorted (on 'values') dataframe that looks like the following. The unnamed column is the index. x_cord y_cord value 3384209 1650 1741 0.009752 3382265 1650 1740 0.009481 ...
Dharmender Tathgur
1 vote
1 answer
2k views

Euclidean distance between a single point to multiple points in a pandas data frame

My data frame has 16 x and y coordinates for positions{x1,x2...x16,y1,y2...y16} continuous from 0 to 566(m). I want to calculate Euclidean distance between x1,y1 wrt the remaining 15 coordinates and ...
user132605
1 vote
0 answers
58 views

vectorise nested iterations by using groupby methods

I have written code to iterate through a dataset that has a demarcation column. This column consists of a value shared by all equally demarcated rows. The code iterates through each demarcated section ...
Eb_J • 13
1 vote
0 answers
215 views

pandas daily moving windows Euclidean distance

I have a pandas dataframe which has two columns (time-series) that I need to compare. These time-series are hourly based, but I need to compare them every day (24h => compare, then I move the window ...
marcodena • 570
0 votes
2 answers
1k views

Calculating Euclidean distance with a lot of pairs of points is too slow in Python

The main goal is to generate the customer similarity based on Euclidean distance, and find the 5 most similar customers for each customer. I have 400,000 customers data, each of them has 40 attributes....
ZhaiShang • 123
0 votes
2 answers
909 views

euclidean distance between two big pandas dataframes

I have three dataframes df1 with 1 160 164 rows and 4 variables,df2 with 11241 rows and 4 variables, and df3 with 1 630 644 rows and 6 variables df1 looks like : df2 looks like : The observations in ...
Cocogne • 11
0 votes
1 answer
234 views

How to incorporate elevation into euclidean distance matrix in pandas?

I have the following dataframe in pandas: import pandas as pd df = pd.DataFrame({ "CityId": { "0": 0, "1": 1, "2": 2, "3": 3, "4": 4 }, "X": {...
ZeroStack • 1,069
0 votes
1 answer
42 views

Finding Distant Pairs in Python taking advantage of pandas

I have this file: This can be read as: data = np.loadtxt('test_2_stack_overflow.csv', delimiter=',') or dfr = pd.read_csv('test_2_stack_overflow.csv',header=None) The column index represent the x ...
diedro • 563
0 votes
1 answer
42 views

Replace a 2D point in one dataframe with a 2D point in another dataframe if the Euclidean between them is the lowest

I have a data frame df1 with two columns V1 and V2 representing two coordinates of a point. df1 V1 V2 1.30344679 0.060199021 1.256628917 0.095897457 0.954959945 0.237514922 1.240081297 0....
vp_050 • 510
0 votes
1 answer
637 views

Calculate the distance between each GPS points of two lists in python

Let's assume that we have a pandas dataframe containing two columns ("longitude" and "latitude"), each split by commas, for example: longitude latitude [116.415642, 116.41832, ...
M_Fatih89
0 votes
1 answer
31 views

How to find from a pandas DataFrame the three closest values?

I have a dataframe which contains different emotions; every emotion is a category and has three different float values. I would like to find the closest emotion given three values. Example: ...
Y4RD13 • 966
0 votes
1 answer
253 views

Calculation looping through dataframe of lists and list of arrays

I want to calculate the Euclidean distance using a list of arrays. import numpy as np import pandas as pd from scipy.spatial import distance #Dataframe data = [np.array([[1, 2], [1, 3], [1, 1]]), ...
Sp_95 • 133
0 votes
1 answer
307 views

Find the distance between a list of points in two columns with one list comprehension in Python

I need to create a column that will be made up of a list of lists that represents the distance between points. I am trying to create this list of distances in one list comprehension or the most ...
Dre • 713
0 votes
1 answer
2k views

Pandas - convert columns to grouped array coordinates

I have a DataFrame of (x, y) coordinates that I would like to transform into arrays to perform pairwise distance calculations on. df = pd.DataFrame({'type': ['a', 'a', 'a', 'b', 'b', 'c', 'c', '...
chris_l
0 votes
1 answer
73 views

Computing values for a column in pandas using other columns

I have a data-frame containing 3 columns: 'longitude', 'latitude', and 'country'. For some longitude and latitudes, the value in the country columns is 'unknown'. Here is an overview of the data-frame:...
Rifly • 29
0 votes
1 answer
125 views

calculate the minimum distance between 2 dataframe and estimate the missing points location in one dataframe

Estimate dataframe: index x y 1 0.47 0.46 2 0.44 0.46 3 0.41 0.45 4 0.38 0.45 5 0.35 0.45 6 0.33 0.44 7 0.30 0.43 8 0.30 0.39 real_dataframe: index x y 1 0.46 0.463 4 0.40 0.453 5 0.37 0....
Yumeng Xu • 213
0 votes
1 answer
136 views

Pandas: concatenate dataframe with distance matrix

I tried to concatenate two Pandas DataFrames, but it concatenates wrong. Initial dataset looks like: df >>> well qoil cum_oil wct top_perf bot_perf st x ...
Alex • 1
0 votes
0 answers
68 views

Calculate Euclidean distance between two points in a pandas dataframe [duplicate]

My data in a pandas dataframe looks like this: Sample Point Axis Value1 Value2 S1 P1 X 1.2 1.28 S1 P1 Y 3.4 3.6 S1 P1 Z 1.4 1.6 ...
anaz8 • 115
0 votes
1 answer
1k views

Distance Between Two Vectors As Columns Of Pandas DataFrame

I have a DataFrame which has two vectors as columns. I want to produce a third column that is the Euclidean distance between the two vectors. I've been using np.linalg.norm, but I've been getting ...
ben890 • 1,083
0 votes
0 answers
86 views

Compute pairwise euclidean distance values in pandas DataFrame using itertools.combinations [duplicate]

I have a pandas dataframe after reading in a .csv file that resembles: import itertools as it import pandas as pd import numpy as np import scipy as sp x = np.random.randn(5) y = np.sin(x) z = np....
lf208 • 77
0 votes
1 answer
2k views

Given specific lat/lon calculate closest point from csv list of lat/lon

Need help with efficient python code(using pandas) to find which vehicle at what time passed closest to incident_sw =(35.7158, -120.7640). I'm having trouble formulating a Euclidean distance to sort ...
CatLady • 31
0 votes
1 answer
700 views

Measuring the distance between points and groups

I am trying to measure the distance between points inside a pandas dataframe. I am first looking to measure the distance between points that are in a sub region and get the average distance for that ...
rontho1992
-1 votes
1 answer
8k views

Distance between two Points in pandas csv Data-frame

I want to calculate the distance between two coordinates points(Lat1,long1, and Lat2,Long2) for the below data frame. `name_x rnc_x lat1 long1 scrambling_code name_y rnc_y lat2 long2 ...
DHANANJAY CHAUBEY