Questions tagged [weighted]
Questions about problems that use a weight function, e.g. weighted mean, weighted sampling.
weighted
626
questions
122
votes
7
answers
85k
views
Weighted standard deviation in NumPy
numpy.average() has a weights option, but numpy.std() does not. Does anyone have suggestions for a workaround?
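One possible workaround, sketched here under the assumption that a population-style (not bias-corrected) standard deviation is wanted, is to call numpy.average() twice: once for the weighted mean and once for the weighted mean of squared deviations. The function name weighted_std is hypothetical, not part of NumPy.

```python
import numpy as np

def weighted_std(values, weights):
    """Weighted standard deviation without bias correction (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Weighted mean, using the weights option that numpy.average() provides.
    mean = np.average(values, weights=weights)
    # Weighted mean of the squared deviations from that mean.
    variance = np.average((values - mean) ** 2, weights=weights)
    return np.sqrt(variance)
```

With equal weights this reduces to numpy.std() with its default ddof=0.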
55
votes
12
answers
33k
views
Weighted percentile using numpy
Is there a way to use the numpy.percentile function to compute a weighted percentile? Or is anyone aware of an alternative Python function to compute a weighted percentile?
thanks!
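A minimal sketch of one common definition, assuming no interpolation is needed: sort the values, accumulate the weights, and return the first value whose cumulative weight reaches q percent of the total. Several other definitions of a weighted percentile exist, and the name weighted_percentile is hypothetical.

```python
import numpy as np

def weighted_percentile(values, weights, q):
    """Smallest value whose cumulative weight reaches q% of the total weight."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Sort values and carry the weights along in the same order.
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cumulative = np.cumsum(weights)
    cutoff = q / 100.0 * cumulative[-1]
    # First position where the cumulative weight is >= cutoff.
    idx = np.searchsorted(cumulative, cutoff, side="left")
    return values[min(idx, len(values) - 1)]
```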
40
votes
6
answers
53k
views
Calculating weighted mean and standard deviation
I have a time series x_0 ... x_t. I would like to compute the exponentially weighted variance of the data. That is:
V = SUM{w_i*(x_i - x_bar)^2, i=1 to T} where SUM{w_i} = 1 and x_bar=SUM{w_i*x_i}
...
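The formula above can be sketched in NumPy as follows. The excerpt does not say how the weights decay, so the decay parameter and the choice to give the newest observation the largest weight are assumptions, and the function name is hypothetical.

```python
import numpy as np

def exp_weighted_variance(x, decay=0.9):
    """V = sum(w_i * (x_i - x_bar)^2) with sum(w_i) = 1 and x_bar = sum(w_i * x_i)."""
    x = np.asarray(x, dtype=float)
    # Assumption: the newest observation has exponent 0, the oldest the largest exponent.
    exponents = np.arange(len(x) - 1, -1, -1)
    w = np.power(float(decay), exponents)
    w /= w.sum()  # normalize so the weights sum to 1
    x_bar = np.sum(w * x)
    return np.sum(w * (x - x_bar) ** 2)
```

With decay=1.0 all weights are equal and the result matches numpy.var().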
38
votes
6
answers
50k
views
How can I make a random choice according to probabilities stored in a list (weighted random distribution)?
Given a list of probabilities like:
P = [0.10, 0.25, 0.60, 0.05]
(I can ensure that the sum of all the variables in P is always 1)
How can I write a function that randomly returns a valid index, ...
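One way this is often done, shown as a sketch rather than a definitive answer: build the running sum of the probabilities, draw a uniform number, and use binary search to find which interval it lands in. The function name and the rng parameter are hypothetical, added so the generator can be seeded.

```python
import bisect
import itertools
import random

def weighted_index(probabilities, rng=random):
    """Return index i with probability proportional to probabilities[i]."""
    # Running totals, e.g. [0.10, 0.35, 0.95, 1.00] for the list P above.
    cumulative = list(itertools.accumulate(probabilities))
    # Scale by the total so the input does not have to sum to exactly 1.
    r = rng.random() * cumulative[-1]
    # Number of running totals <= r is the chosen index; clamp to guard against rounding.
    return min(bisect.bisect_right(cumulative, r), len(probabilities) - 1)
```

For repeated draws from the same list, the cumulative list could be computed once and reused.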
22
votes
6
answers
35k
views
Frequency tables with weighted data in R
I need to calculate the frequency of individuals by age and marital status so normally I'd use:
table(age, marital_status)
However each individual has a different weight after the sampling of ...
17
votes
6
answers
3k
views
How can I get a weighted random pick from Python's Counter class?
I have a program where I'm keeping track of the success of various things using collections.Counter — each success of a thing increments the corresponding counter:
import collections
scoreboard = ...
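A short sketch of one approach, assuming Python 3.6 or later: pass the Counter's keys and counts to random.choices(), which accepts a weights argument. The function name weighted_pick and the rng parameter are hypothetical.

```python
import collections
import random

def weighted_pick(counter, rng=random):
    """Pick one key with probability proportional to its count."""
    items = list(counter.keys())
    weights = list(counter.values())
    # random.choices returns a list of k picks; take the single element.
    return rng.choices(items, weights=weights, k=1)[0]
```

This assumes at least one count is positive and none is negative.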
16
votes
3
answers
1k
views
Maximize number of subgraphs with a given minimum weight
I have an undirected planar graph where each node has a weight. I want to split the graph into as many connected disjoint subgraphs as possible (EDIT: or to reach a minimum mean weight of the ...
15
votes
3
answers
17k
views
Weighted Pearson's Correlation?
I have a 2396x34 double matrix named y wherein each row (2396) represents a separate situation consisting of 34 consecutive time segments.
I also have a numeric[34] named x that represents a single ...
15
votes
1
answer
11k
views
Adding a weighted least squares trendline in ggplot2
I am preparing a plot using ggplot2, and I want to add a trendline that is based on a weighted least squares estimation.
In base graphics this can be done by sending a WLS model to abline:
mod0 <...
14
votes
1
answer
12k
views
"weighted" regression in R
I have created a script like the one below to do something I call "weighted" regression:
library(plyr)
set.seed(100)
temp.df <- data.frame(uid=1:200,
bp=sample(x=c(100:...
13
votes
2
answers
8k
views
More efficient weighted Gini coefficient in Python
Per https://stackoverflow.com/a/48981834/1840471, this is an implementation of the weighted Gini coefficient in Python:
import numpy as np
def gini(x, weights=None):
if weights is None:
...
13
votes
4
answers
9k
views
Select a random item from a weighted list
I am trying to write a program to select a random name from the US Census last name list. The list format is
Name Weight Cumulative line
----- ----- ----- -
SMITH 1....
13
votes
3
answers
1k
views
Plot weighted frequency matrix
This question is related to two different questions I have asked previously:
1) Reproduce frequency matrix plot
2) Add 95% confidence limits to cumulative plot
I wish to reproduce this plot in R:
...
12
votes
6
answers
11k
views
Gomoku array-based AI-algorithm?
Way way back (think 20+ years) I encountered a Gomoku game source code in a magazine that I typed in for my computer and had a lot of fun with.
The game was difficult to win against, but the core ...
11
votes
4
answers
11k
views
Select element from array with probability proportional to its value
I have an array of doubles and I want to select a value from it with the probability of each value being selected being inversely proportional to its value. For example:
arr[0] = 100
arr[1] = 200
In ...
11
votes
5
answers
3k
views
What's the most concise way to pick a random element by weight in C#?
Let's assume:
A List<Element>, where Element is:
public class Element {
int Weight { get; set; }
}
What I want to achieve is to select an element randomly by weight.
For example:
Element_1....
11
votes
3
answers
3k
views
Weighted random sampling in Elasticsearch
I need to obtain a random sample from an ElasticSearch index, i.e. to issue a query that retrieves some documents from a given index with weighted probability Wj/ΣWi (where Wj is a weight of row j and ...
10
votes
6
answers
699
views
How to weight a list of ranks by a numeric value by individual in R
In R I want to allocate projects to people based on their rank preferences but also their performance. Say I have 5 projects and 3 people. In this case, all three people want project A because it's ...
10
votes
2
answers
7k
views
weighted mean in dplyr for multiple columns
I'm trying to calculate the weighted mean for multiple columns using dplyr. At the moment I'm stuck with summarize_each, which to me seems to be part of the solution. Here's some example code:
library(...
10
votes
1
answer
9k
views
How to use weights in a logistic regression
I want to calculate (weighted) logistic regression in Python. The weights were calculated to adjust the distribution of the sample regarding the population. However, the results don't change if I use ...
10
votes
4
answers
6k
views
Weighted win percentage by number of games played
I'm looking to create a ranking system for users on a gaming site.
The system should be based on a weighted win percentage, with the weighted element being the number of games played.
For instance:
...
9
votes
2
answers
10k
views
Fastest way to take the weighted sum of the columns of a matrix in R
I need the weighted sum of each column of a matrix.
data <- matrix(1:2e7,1e7,2) # warning large number, will eat up >100 megs of memory
weights <- 1:1e7/1e5
system.time(colSums(data*weights))...
8
votes
2
answers
1k
views
C++. Weighted std::shuffle
Is there a way to do nice and elegant weighted shuffling using the standard library?
There is std::discrete_distribution.
What I want is something like this:
std::vector<T> data { N elements };
...
8
votes
1
answer
484
views
Elasticsearch random selection based on weighting out of 100
I have been running a Rails site for a couple of years and some articles are being pulled from the DB based on a weight field. The data structure is:
{name: 'Content Piece 1', weight: 50}
{name: '...
7
votes
1
answer
3k
views
Which algorithm/implementation for weighted similarity between users by their selected, distanced attributes?
Data Structure:
User has many Profiles
(Limit - no more than one of each profile type per user, no duplicates)
Profiles has many Attribute Values
(A user can have as many or few attribute ...
7
votes
1
answer
4k
views
Weighted sum of variables by groups with data.table
I am looking for a solution to compute the weighted sum of some variables by groups with data.table. I hope the example is clear enough.
require(data.table)
dt <- data.table(matrix(1:200, nrow = 10))
...
6
votes
1
answer
4k
views
Weighted linear regression in R with lm() and svyglm(). Same model, different results
I want to do a linear regression applying survey weights in RStudio. I have seen that it is possible to do this with the lm() function, which enables me to specify the weights I want to use. However, ...
6
votes
2
answers
9k
views
PostgreSQL - making ts_rank take the ts_vector position as-is or defining a custom ts_rank function
I'm performing weighted search on a series of items in an e-commerce platform. The problem I have is ts_rank is giving me the exact same value for different combinations of words, even if the ...
6
votes
3
answers
1k
views
Weighted Shuffle of an Array or Arrays?
What is a good algorithm that shuffles an array or arrays using weights from the nested arrays?
Example:
$array = array(
array("name"=>"John", "rank"=>3),
array("name"=>"Bob", "rank"=>...
6
votes
1
answer
3k
views
XGBRegressor with weights and base_margin: out of sample validation possible?
I have an old linear model which I wish to improve using XGBoost. I have the predictions from the old model, which I wish to use as a base margin. Also, due to the nature of what I'm modeling, I need ...
5
votes
2
answers
3k
views
In Ruby, how can one make a weighted random selection by least weight?
If I have the array:
ar = [1,3,5,3,6,1,4,6,7,6,6,6,6,6]
I could reduce this to the number of occurrences:
counts = {1=>2, 3=>2, 5=>1, 6=>7, 4=>1, 7=>1}
Now I would like to ...
5
votes
1
answer
2k
views
Creating a weighted undirected graph in "igraph" in C/C++
Problem:
I want to make a weighted undirected graph from an adjacency matrix stored in a .csv file using igraph, and then compute the minimum spanning tree and run some other algorithms on it.
I started with ...
4
votes
4
answers
694
views
How to pick 4 unique items from a weighted list?
So I've got a list of weighted items, and I'd like to pick 4 non-duplicate items from this list.
Item Weight
Apple 5
Banana 7
Cherry 12
...
Orange 8
Pineapple 50
What is the most ...
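One technique that fits this kind of problem is the Efraimidis-Spirakis key method: give each item the key u ** (1 / weight), with u drawn uniformly from [0, 1), and keep the k items with the largest keys. The sketch below assumes strictly positive weights; the function name is hypothetical, and the dictionary in the usage lines contains only the items visible in the excerpt.

```python
import heapq
import random

def weighted_sample_without_replacement(weights, k, rng=random):
    """weights: dict mapping item -> positive weight. Returns up to k distinct items."""
    # Each item gets a random key; heavier items tend to get keys closer to 1.
    keyed = ((rng.random() ** (1.0 / w), item) for item, w in weights.items())
    # The k largest keys form a weighted sample without replacement.
    return [item for _, item in heapq.nlargest(k, keyed)]

fruit = {"Apple": 5, "Banana": 7, "Cherry": 12, "Orange": 8, "Pineapple": 50}
picked = weighted_sample_without_replacement(fruit, 4)
```

Because every item appears at most once in the keyed sequence, duplicates cannot occur.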
4
votes
1
answer
801
views
Weighted mean across several matrices - element by element
I have 'mylist' - a list of same-size matrices:
mylist <- vector("list", 5)
set.seed(123)
for(i in 1:5){
mylist[[i]] <- matrix(rnorm(9), nrow = 3)
}
I also have a vector of weights 'mywgts' ...
4
votes
5
answers
3k
views
Weighted Randomized Ordering
The problem:
I have items that have weights. The higher the weight, the greater the chance that the item will go first. I need to have a clean, simple way of doing this that is based on core Java (...
4
votes
1
answer
4k
views
How to use the R survey package to analyze multiple response questions in a weighted sample?
I'm relatively new to R. I am wondering how to use the 'survey' package (http://r-survey.r-forge.r-project.org/survey/) to analyze a multiple response question for a weighted sample? The tricky bit is ...
4
votes
1
answer
4k
views
How to find probability of path in a directed graph?
I have a directed weighted graph G=(V,E).
In this graph the weight of edge(v[i],v[j]) is the count of transition between v[i] and v[j].
I am trying to determine the best way to accomplish task: how ...
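A rough sketch of one interpretation, assuming the graph is treated as a first-order Markov chain: divide each transition count by the total count leaving its source node to get a transition probability, then multiply those probabilities along the path. The nested-dict representation and the function name are assumptions made for illustration.

```python
def path_probability(counts, path):
    """counts[u][v] is the number of observed transitions from u to v."""
    probability = 1.0
    # Walk consecutive node pairs (u, v) along the path.
    for src, dst in zip(path, path[1:]):
        outgoing = counts.get(src, {})
        total = sum(outgoing.values())
        if total == 0 or dst not in outgoing:
            return 0.0  # no observed transition for this step
        probability *= outgoing[dst] / total
    return probability

counts = {"a": {"b": 3, "c": 1}, "b": {"c": 2}}
p = path_probability(counts, ["a", "b", "c"])  # (3/4) * (2/2)
```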
4
votes
1
answer
897
views
Weighted sampling in Fortran
In a Fortran program I would like to choose at random a specific variable (specifically its index) by using weights. The weights would be provided in a separate vector (element 1 would contain weight ...
4
votes
2
answers
2k
views
Algorithm to modify the weights of the edges of a graph, given a shortest path
Given a graph with edges having positive weights, a pair of nodes, and a path between the nodes, what's the best algorithm that will tell me how to modify the edge weights of the graph to the minimum ...
4
votes
1
answer
637
views
Subtracting from random values in a weighted matrix in R
and thanks in advance for your help!
This question is related to one I posted before, but I think it deserves its own post because it is a separate challenge.
Last time I asked about randomly ...
4
votes
1
answer
686
views
How to weight stations in Ordinary Least Squares in Python?
I have precipitation data for 10 climate stations, along with their DEM values.
I ran a linear regression as follows:
DEM = [200, 300, 400, 500, 600, 300, 200, 100, 50, 200]
Prep = [50, 95, 50, 59, 99, 50, 23, ...
4
votes
2
answers
4k
views
Algorithm for optimal pairing strategy of items in two ordered lists while maintaining order
I have the following two ordered lists of items:
A = ["apples","oranges","bananas","blueberries"]
B = ["apples","blueberries","oranges","bananas"]
Each item has a score which is equal to the ...
4
votes
2
answers
955
views
SQL Server 2008 Containstable generates negative rank with weighted_term
I have a table with full-text search enabled on the Title column. I am trying to run a weighted search with containstable, but I get an arithmetic overflow for the Rank value. The query is as follows:
SELECT ...
4
votes
0
answers
846
views
Keras "class_weights" parameter usable for regression?
I am training a Sequential model for regression. I noticed there is a "class_weights" parameter in the fitting function.
As I understand one can give a different importance to a class during ...
4
votes
0
answers
550
views
fractional frequency weights in R's lm()
I understand that lm treats weights as "analytic" weights, meaning that observations are just weighted against each other (e.g. lm will weigh an observation with weight= 2 twice as much as one with ...
4
votes
3
answers
163
views
Weighted distribution among buckets with no look-ahead
I have N workers that need to process incoming batches of data. Each worker is configured so that it knows that it is "worker X of N".
Each incoming batch of data has a random unique ID (being random,...
4
votes
0
answers
156
views
Is there a faster method of doing a random weighted choice for a large list of items [duplicate]
Assume I have a list of items each with a weight and I want to pick a random item. The easiest way to implement this is to keep the list of items and a running sum of the weights sorted by the sum. ...
3
votes
6
answers
2k
views
How to weight a random number based on an array
I've been thinking about how to implement something that, frankly, is beyond my mathematical skills. So here goes, feel free to try and point me in the right direction rather than complete code ...
3
votes
3
answers
10k
views
Adding Class Weights for imbalanced dataset in Convolutional Neural Network
I have a dataset of images that has the following distribution:
Class 0: 73.5%
Class 1: 7%
Class 2: 15%
Class 3: 2.5%
Class 4: 2%
I think I need to add Class Weights to make up for the low amount ...
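A common heuristic, sketched here as an assumption rather than as the asker's method, is to make each class weight inversely proportional to the class share, scaled so that a perfectly balanced dataset would give every class a weight of 1. The function name is hypothetical; the shares are the percentages listed above.

```python
def balanced_class_weights(shares):
    """shares: dict mapping class index -> share of the dataset (any common unit)."""
    n_classes = len(shares)
    total = sum(shares.values())
    # Rare classes get large weights, frequent classes get small ones.
    return {c: total / (n_classes * s) for c, s in shares.items()}

shares = {0: 73.5, 1: 7.0, 2: 15.0, 3: 2.5, 4: 2.0}
class_weights = balanced_class_weights(shares)
```

The result is a plain dictionary from class index to weight; whether and how a given training API accepts such a mapping should be checked in that library's documentation.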
3
votes
2
answers
5k
views
weighted regression sklearn
I'd like to add weights to my training data based on its recency.
If we look at a simple example:
import matplotlib.pyplot as plt
import numpy as np
from sklearn.preprocessing import ...