Normalizing a list of numbers in Python

Question

I need to normalize a list of values to fit in a probability distribution, i.e. between 0.0 and 1.0.

I understand how to normalize, but was curious if Python had a function to automate this.

I'd like to go from:

raw = [0.07, 0.14, 0.07]

to

normed = [0.25, 0.50, 0.25]

@Joran Because OP wants sum(normed) == 1.0 (ignoring floating point errors). — Kevin, Nov 6, 2014 at 17:16
See this post if you would like to normalize between a different range. How to normalize a list of positive and negative decimal number to a specific range — salomonvh, Sep 22, 2015 at 7:57

shivank01 · Accepted Answer · 2018-06-19 10:07:10Z

112

Use:

norm = [float(i)/sum(raw) for i in raw]

to normalize against the sum, so that the result always sums to 1.0 (or as close to it as floating point allows).

Use:

norm = [float(i)/max(raw) for i in raw]

to normalize against the maximum, so that the largest value becomes 1.0.
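
As a minimal sketch of the sum-based version, the total can be computed once before the comprehension instead of once per element. Using `math.fsum` for the total and `math.isclose` for the check are choices made here for illustration, not part of the original answer.

```python
import math

raw = [0.07, 0.14, 0.07]

# Compute the total once, outside the comprehension.
total = math.fsum(raw)
norm = [x / total for x in raw]

print(norm)
# Compare with a tolerance rather than ==, because of floating-point rounding.
print(math.isclose(sum(norm), 1.0))
```

Comparing with a tolerance avoids relying on the sum being exactly 1.0.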

edited Jun 19, 2018 at 10:07

shivank01

answered Nov 6, 2014 at 17:17

Tony Suffolk 66

35

Nice. It's maybe worth noting that computing the sum in advance, rather than for each element in the comprehension, would be more efficient. So: s = sum(raw); norm = [float(i)/s for i in raw]
– mattsilver
May 5, 2015 at 23:43
Is that the same as (np.array(x) / np.array(x).sum()) / np.array(x).max() ?
– alvas
Feb 21, 2018 at 2:40
1

@alvas sorry - I can't be sure about numpy - but assuming dividing an array by a single value divides each value in the array; then it looks right.
– Tony Suffolk 66
Feb 21, 2018 at 14:18
1

Note that in Python 3, float(i) can be replaced with just i to get the same effect, because / always performs true division.
– Inertial Ignorance
Jan 20 at 19:39

blaylockbk · Accepted Answer · 2018-05-02 19:05:09Z

18

If your list has negative numbers, you can rescale it to the range 0.0 to 1.0 using its minimum and maximum:

a = range(-30,31,5)
norm = [(float(i)-min(a))/(max(a)-min(a)) for i in a]
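
The same min-max idea can be sketched as a function that computes the minimum and maximum once and rejects input where all values are equal (the denominator would be zero). The name `min_max` and the choice to raise `ValueError` are assumptions made for this illustration.

```python
def min_max(values):
    """Scale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    values = list(values)  # accept ranges and other iterables
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined when all values are equal")
    span = hi - lo
    return [(v - lo) / span for v in values]


scaled = min_max(range(-30, 31, 5))
print(scaled[0], scaled[6], scaled[-1])  # 0.0 0.5 1.0
```

Note that the result lies between 0.0 and 1.0 but does not sum to 1.0, so it is a rescaling rather than a probability distribution.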

answered May 2, 2018 at 19:05

blaylockbk

Anh-Thi DINH · Accepted Answer · 2020-08-26 13:48:16Z

11

For those who want to use scikit-learn, you can use:

from sklearn.preprocessing import normalize

x = [1,2,3,4]
normalize([x]) # array([[0.18257419, 0.36514837, 0.54772256, 0.73029674]])
normalize([x], norm="l1") # array([[0.1, 0.2, 0.3, 0.4]])
normalize([x], norm="max") # array([[0.25, 0.5 , 0.75, 1.]])
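
To make the three options concrete, here is a pure-Python sketch of what each norm divides by. The helper `normalize_row` is hypothetical and written only for illustration; it is not scikit-learn's implementation.

```python
import math


def normalize_row(values, norm="l2"):
    """Divide each value by the chosen norm of the whole list."""
    if norm == "l1":
        scale = sum(abs(v) for v in values)            # sum of absolute values
    elif norm == "l2":
        scale = math.sqrt(sum(v * v for v in values))  # Euclidean length
    elif norm == "max":
        scale = max(abs(v) for v in values)            # largest absolute value
    else:
        raise ValueError("unknown norm: %r" % norm)
    return [v / scale for v in values]


x = [1, 2, 3, 4]
print(normalize_row(x, "l2"))
print(normalize_row(x, "l1"))
print(normalize_row(x, "max"))
```

For non-negative input, only the `"l1"` result sums to 1.0, which is what the question asks for.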

answered Aug 26, 2020 at 13:48

Anh-Thi DINH

Or for a completely different kind of normalization: from sklearn.utils.extmath import softmax or from scipy.special import softmax
– Stef
Dec 8, 2021 at 13:55
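
For readers unfamiliar with softmax, a minimal pure-Python sketch follows. It is an illustration of the idea only, not the scikit-learn or SciPy implementation; subtracting the maximum before exponentiating is a common trick to avoid overflow.

```python
import math


def softmax(values):
    """Map arbitrary real numbers to positive weights that sum to 1."""
    m = max(values)  # shift by the max so exp() never sees a large positive number
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
print(probs)
```

Unlike dividing by the sum, softmax also accepts negative inputs, but it does not preserve the ratios between the original values.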

gboffi · Accepted Answer · 2014-11-06 17:32:36Z

How long is the list you're going to normalize?

def psum(it):
    "This function makes explicit how many calls to sum() are done."
    print("Another call!")
    return sum(it)

raw = [0.07, 0.14, 0.07]
print("How many calls to sum()?")
print([r / psum(raw) for r in raw])

print("\nAnd now?")
s = psum(raw)
print([r / s for r in raw])

# If one doesn't want auxiliary variables, it can be done inside
# a list comprehension, but in my opinion it's quite Baroque.
print("\nAnd now?")
print([r / s for s in [psum(raw)] for r in raw])

Output

# How many calls to sum()?
# Another call!
# Another call!
# Another call!
# [0.25, 0.5, 0.25]
# 
# And now?
# Another call!
# [0.25, 0.5, 0.25]
# 
# And now?
# Another call!
# [0.25, 0.5, 0.25]

Anzel · Accepted Answer · 2014-11-06 17:18:14Z

6

Try:

normed = [i/sum(raw) for i in raw]

normed
[0.25, 0.5, 0.25]

answered Nov 6, 2014 at 17:18

Anzel

wnnmaw · Accepted Answer · 2014-11-06 17:19:34Z

4

There isn't any function in the standard library (to my knowledge) that will do it, but there are absolutely modules out there which have such functions. However, it's easy enough that you can just write your own function:

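def normalize(lst):
    s = sum(lst)
    # A list comprehension returns a list on both Python 2 and 3;
    # map() would return a lazy iterator on Python 3.
    return [float(x) / s for x in lst]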

Sample output:

>>> normed = normalize(raw)
>>> normed
[0.25, 0.5, 0.25]

answered Nov 6, 2014 at 17:19

wnnmaw

This is one of the two answers that extract sum() from the loop... I still prefer mine but I think this is a + exactly for the auxiliary variable s = sum(lst).
– gboffi
Nov 6, 2014 at 17:37
4

normalize([1,0,-1]) will raise ZeroDivisionError :)
– Yan Foto
Nov 14, 2015 at 14:06
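
One way to handle the case raised in the comment above is to validate the input before dividing. This is only a sketch: the name `normalize_checked` and the decision to reject negative values and zero totals with `ValueError` are assumptions, and other policies (for example returning a uniform distribution) may suit some uses better.

```python
def normalize_checked(values):
    """Divide by the sum, failing with a clear message when that is not meaningful."""
    values = list(values)
    if any(v < 0 for v in values):
        raise ValueError("negative values cannot be turned into probabilities")
    total = sum(values)
    if total == 0:
        raise ValueError("values sum to zero; nothing to normalize")
    return [v / total for v in values]


print(normalize_checked([1, 1, 2]))  # [0.25, 0.25, 0.5]

try:
    normalize_checked([1, 0, -1])
except ValueError as exc:
    print(exc)
```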

Tengerye · Accepted Answer · 2018-09-18 12:16:13Z

4

If you consider using numpy, you can get a faster solution.

import random, time
import numpy as np

a = random.sample(range(1, 20000), 10000)
since = time.time(); b = [i/sum(a) for i in a]; print(time.time()-since)
# 0.7956490516662598

since = time.time(); c=np.array(a);d=c/sum(a); print(time.time()-since)
# 0.001413106918334961
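
Note that the list version above calls sum(a) once per element, so part of the gap may come from that rather than from numpy itself. A sketch for checking this with the standard library's `timeit` follows; the function names and the smaller list size are choices made here, and the timings will vary by machine.

```python
import random
import timeit

a = random.sample(range(1, 20000), 1000)


def per_element_sum():
    # sum(a) is recomputed for every element: quadratic work overall.
    return [i / sum(a) for i in a]


def hoisted_sum():
    # sum(a) is computed once: linear work overall.
    s = sum(a)
    return [i / s for i in a]


print("per-element sum:", timeit.timeit(per_element_sum, number=10))
print("hoisted sum:    ", timeit.timeit(hoisted_sum, number=10))
```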

answered Sep 18, 2018 at 12:16

Tengerye

Are you sure this equation is right? I am getting values in d < 0. Not sure if this should happen. Maybe I did something wrong. I am inputting values from about -0.5 to 0.5.
– ScipioAfricanus
Sep 2, 2019 at 21:43
@ScipioAfricanus random.sample only works on integers here. If floats are required, check `np.random.uniform` or something similar instead.
– Tengerye
Sep 3, 2019 at 1:59

Nurul Akter Towhid · Accepted Answer · 2018-03-29 07:51:49Z

Try this:

from __future__ import division

raw = [0.07, 0.14, 0.07]  

def norm(input_list):
    norm_list = list()

    if isinstance(input_list, list):
        # Sum once, then divide every value by that total.
        sum_list = sum(input_list)

        for value in input_list:
            tmp = value / sum_list
            norm_list.append(tmp)

    return norm_list

print(norm(raw))

This will do what you asked, but I suggest trying min-max normalization.

Min-max normalization:

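def min_max_norm(dataset):
    # Defined up front so non-list input returns [] instead of raising.
    norm_list = list()

    if isinstance(dataset, list):
        min_value = min(dataset)
        max_value = max(dataset)

        for value in dataset:
            # Raises ZeroDivisionError if all values are equal.
            tmp = (value - min_value) / (max_value - min_value)
            norm_list.append(tmp)

    return norm_list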

Thanks for the code for min-max normalization.
– mightyandweakcoder
Feb 4, 2022 at 3:21

vespertine venus · Accepted Answer · 2020-03-02 00:01:36Z

When working with data, pandas is often the simplest option.

This particular code puts raw into a single column, then divides each value by that column's total. (The data could also go into a single row and be normalized row-wise; that only requires swapping the axis values, where 0 refers to rows and 1 to columns.)

import pandas as pd


raw = [0.07, 0.14, 0.07]  

raw_df = pd.DataFrame(raw)
normed_df = raw_df.div(raw_df.sum(axis=0), axis=1)
normed_df

Here normed_df holds the normalized values (0.25, 0.50 and 0.25) in a single column, and you can keep working with the data from there.

Jeff Hykin · Accepted Answer · 2021-03-02 14:20:25Z

1

Here is a not-terribly-inefficient one-liner similar to the top answer (it performs the summation only once):

norm = (lambda the_sum:[float(i)/the_sum for i in raw])(sum(raw))

A similar approach works for a list with negative numbers (min-max scaling, with max and min each computed once):

norm = (lambda the_max, the_min: [(float(i)-the_min)/(the_max-the_min) for i in raw])(max(raw),min(raw))

answered Mar 2, 2021 at 14:20

Jeff Hykin

keramat · Accepted Answer · 2020-12-24 08:01:31Z

0

Use scikit-learn:

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# MinMaxScaler expects a 2-D array: one column, one row per value.
data = np.array([1, 2, 3]).reshape(-1, 1)
scaler = MinMaxScaler()
scaler.fit(data)
print(scaler.transform(data))

answered Dec 24, 2020 at 8:01

keramat
