Are there any functions (as part of a math library) that will calculate the mean, median, mode and range of a set of numbers?
7 Answers
Yes, there do seem to be third-party libraries (there is nothing in Java's Math class). Two that have come up are:
http://www.iro.umontreal.ca/~simardr/ssj/indexe.html
However, it is actually not that difficult to write your own methods to calculate the mean, median, mode and range.
MEAN
public static double mean(double[] m) {
double sum = 0;
for (int i = 0; i < m.length; i++) {
sum += m[i];
}
return sum / m.length;
}
MEDIAN
// the array double[] m MUST BE SORTED
public static double median(double[] m) {
int middle = m.length/2;
if (m.length%2 == 1) {
return m[middle];
} else {
return (m[middle-1] + m[middle]) / 2.0;
}
}
MODE
public static int mode(int a[]) {
// both locals must be initialized, otherwise the code does not compile
int maxValue = 0, maxCount = 0;
for (int i = 0; i < a.length; ++i) {
int count = 0;
for (int j = 0; j < a.length; ++j) {
if (a[j] == a[i]) ++count;
}
if (count > maxCount) {
maxCount = count;
maxValue = a[i];
}
}
return maxValue;
}
UPDATE
As has been pointed out by Neelesh Salpe, the above does not cater for multi-modal collections. We can fix this quite easily:
public static List<Integer> mode(final int[] numbers) {
final List<Integer> modes = new ArrayList<Integer>();
final Map<Integer, Integer> countMap = new HashMap<Integer, Integer>();
int max = -1;
for (final int n : numbers) {
int count = 0;
if (countMap.containsKey(n)) {
count = countMap.get(n) + 1;
} else {
count = 1;
}
countMap.put(n, count);
if (count > max) {
max = count;
}
}
for (final Map.Entry<Integer, Integer> tuple : countMap.entrySet()) {
if (tuple.getValue() == max) {
modes.add(tuple.getKey());
}
}
return modes;
}
ADDITION
If you are using Java 8 or higher, you can also determine the modes like this:
public static List<Integer> getModes(final List<Integer> numbers) {
final Map<Integer, Long> countFrequencies = numbers.stream()
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
final long maxFrequency = countFrequencies.values().stream()
.mapToLong(count -> count)
.max().orElse(-1);
return countFrequencies.entrySet().stream()
.filter(tuple -> tuple.getValue() == maxFrequency)
.map(Map.Entry::getKey)
.collect(Collectors.toList());
}
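The answer mentions range but does not show it. A minimal sketch, in the same style as the methods above, might look like the following. The class name RangeSketch and the choice to throw on an empty array are assumptions made for illustration, not part of the original answer.

```java
final class RangeSketch {
    // Range = largest value minus smallest value. The input does not need to be sorted.
    static double range(double[] m) {
        if (m.length == 0) {
            // Assumption: an empty input is treated as an error rather than returning 0.
            throw new IllegalArgumentException("range of an empty array is undefined");
        }
        double min = m[0];
        double max = m[0];
        for (double v : m) {
            if (v < min) min = v;
            if (v > max) max = v;
        }
        return max - min;
    }
}
```

For example, RangeSketch.range(new double[] {3, 1, 5, 2, 4}) evaluates to 4.0.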
-
thanks, but I would prefer to use something out of the box if possible Nov 16, 2010 at 6:47
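For readers on Java 8 or later who share this preference: the JDK itself covers part of the request through java.util.DoubleSummaryStatistics, which reports the count, sum, min, max and average in one pass. It does not provide median or mode. The sketch below is only an illustration; the class name JdkStatsSketch and its helper methods are made up for this example.

```java
import java.util.DoubleSummaryStatistics;
import java.util.stream.DoubleStream;

final class JdkStatsSketch {
    // One pass over the data; the result exposes getAverage(), getMin(), getMax(), getSum(), getCount().
    static DoubleSummaryStatistics summarize(double[] data) {
        return DoubleStream.of(data).summaryStatistics();
    }

    // Range derived from the built-in min and max.
    // Note: for an empty array getMax() is -Infinity and getMin() is +Infinity.
    static double range(double[] data) {
        DoubleSummaryStatistics s = summarize(data);
        return s.getMax() - s.getMin();
    }
}
```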
-
This class will have issues if you have a very large array or have to calculate values on the fly. It can be written without an array for mean and standard deviation; I am not as certain about median and mode. – duffymo Nov 16, 2010 at 10:30
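One way to do what this comment describes is Welford's online algorithm, which updates the mean and the sum of squared deviations one value at a time, so no array is kept. The sketch below is an illustration under that assumption; the class name RunningStats and its methods are hypothetical, and it computes the population standard deviation (dividing by n).

```java
final class RunningStats {
    private long count;
    private double mean;
    private double m2; // running sum of squared deviations from the current mean

    // Feed one value at a time; nothing is stored except three numbers.
    void add(double x) {
        count++;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean); // uses the updated mean, as in Welford's method
    }

    long count() {
        return count;
    }

    // Assumption: an empty stream reports 0 for both statistics.
    double mean() {
        return count == 0 ? 0.0 : mean;
    }

    // Population standard deviation (divides by n, not n - 1).
    double standardDeviation() {
        return count == 0 ? 0.0 : Math.sqrt(m2 / count);
    }
}
```

The median and mode are harder to maintain this way, because an exact answer generally needs to remember the values (or at least their counts).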
-
The MODE algorithm does not consider cases with more than one mode (bimodal, trimodal, ...), which occur when more than one number appears maxCount times. To handle this, it should return an array instead of a single int value. Jan 13, 2011 at 19:54
-
As mentioned in my comment on Adeel's answer, sorting the whole array to get the median is pretty inefficient. Aug 24, 2012 at 18:47
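A common alternative to sorting is a selection algorithm such as quickselect, which finds the k-th smallest element in expected linear time. The following is a sketch of that idea, not the commenter's own code; the class name QuickselectMedian, the random pivot choice and the decision to work on a copy are all assumptions.

```java
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

final class QuickselectMedian {
    static double median(double[] data) {
        if (data.length == 0) {
            throw new IllegalArgumentException("median of an empty array is undefined");
        }
        double[] a = Arrays.copyOf(data, data.length); // leave the caller's array untouched
        int mid = a.length / 2;
        double upper = select(a, mid);
        if (a.length % 2 == 1) {
            return upper;
        }
        // After select, every element left of mid is <= a[mid],
        // so the lower middle value is the maximum of that left part.
        double lower = a[0];
        for (int i = 1; i < mid; i++) {
            lower = Math.max(lower, a[i]);
        }
        return (lower + upper) / 2.0;
    }

    // Rearranges a so that a[k] holds the k-th smallest value (0-based) and returns it.
    static double select(double[] a, int k) {
        int lo = 0;
        int hi = a.length - 1;
        while (lo < hi) {
            int p = partition(a, lo, hi, ThreadLocalRandom.current().nextInt(lo, hi + 1));
            if (p == k) {
                return a[k];
            }
            if (k < p) {
                hi = p - 1;
            } else {
                lo = p + 1;
            }
        }
        return a[k];
    }

    // Lomuto partition: values smaller than the pivot end up left of the returned index.
    private static int partition(double[] a, int lo, int hi, int pivotIndex) {
        double pivot = a[pivotIndex];
        swap(a, pivotIndex, hi);
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] < pivot) {
                swap(a, i, store);
                store++;
            }
        }
        swap(a, store, hi);
        return store;
    }

    private static void swap(double[] a, int i, int j) {
        double t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

The expected running time is linear, but this simple partition scheme can degrade to quadratic time on inputs with many equal values, so for small arrays sorting a copy remains a reasonable choice.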
-
The median function throws an ArrayIndexOutOfBoundsException if the array has only one entry. Aug 22, 2014 at 21:13
Check out Apache Commons Math. There is quite a lot there.
-
See comment on Adeel's answer: Apache Commons Math appears to use a pretty inefficient median algorithm. Aug 24, 2012 at 18:47
public static Set<Double> getMode(double[] data) {
if (data.length == 0) {
return new TreeSet<>();
}
TreeMap<Double, Integer> map = new TreeMap<>(); //Map Keys are array values and Map Values are how many times each key appears in the array
for (int index = 0; index != data.length; ++index) {
double value = data[index];
if (!map.containsKey(value)) {
map.put(value, 1); //first time, put one
}
else {
map.put(value, map.get(value) + 1); //seen it again increment count
}
}
Set<Double> modes = new TreeSet<>(); //result set of modes, min to max sorted
int maxCount = 1;
Iterator<Integer> modeApperance = map.values().iterator();
while (modeApperance.hasNext()) {
maxCount = Math.max(maxCount, modeApperance.next()); //go through all the value counts
}
for (double key : map.keySet()) {
if (map.get(key) == maxCount) { //if this key's value is max
modes.add(key); //get it
}
}
return modes;
}
//std dev function for good measure
public static double getStandardDeviation(double[] data) {
final double mean = getMean(data);
double sum = 0;
for (int index = 0; index != data.length; ++index) {
sum += Math.pow(Math.abs(mean - data[index]), 2);
}
return Math.sqrt(sum / data.length);
}
public static double getMean(double[] data) {
if (data.length == 0) {
return 0;
}
double sum = 0.0;
for (int index = 0; index != data.length; ++index) {
sum += data[index];
}
return sum / data.length;
}
//by creating a copy array and sorting it, this function can take any data.
public static double getMedian(double[] data) {
double[] copy = Arrays.copyOf(data, data.length);
Arrays.sort(copy);
return (copy.length % 2 != 0) ? copy[copy.length / 2] : (copy[copy.length / 2] + copy[(copy.length / 2) - 1]) / 2;
}
If you only care about unimodal distributions, consider something like this:
public static Optional<Integer> mode(Stream<Integer> stream) {
Map<Integer, Long> frequencies = stream
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
return frequencies.entrySet().stream()
.max(Comparator.comparingLong(Map.Entry::getValue))
.map(Map.Entry::getKey);
}
import java.util.*;
import java.util.Map.Entry;
public class Mode {
public static void main(String[] args) {
int[] unsortedArr = new int[] { 3, 1, 5, 2, 4, 1, 3, 4, 3, 2, 1, 3, 4, 1 ,-1,-1,-1,-1,-1};
Map<Integer, Integer> countMap = new HashMap<Integer, Integer>();
for (int i = 0; i < unsortedArr.length; i++) {
Integer value = countMap.get(unsortedArr[i]);
if (value == null) {
countMap.put(unsortedArr[i], 1); // first occurrence counts as 1, not 0
} else {
int intval = value.intValue();
intval++;
countMap.put(unsortedArr[i], intval);
}
}
System.out.println(countMap.toString());
int max = getMaxFreq(countMap.values());
List<Integer> modes = new ArrayList<Integer>();
for (Entry<Integer, Integer> entry : countMap.entrySet()) {
int value = entry.getValue();
if (value == max)
modes.add(entry.getKey());
}
System.out.println(modes);
}
public static int getMaxFreq(Collection<Integer> valueSet) {
int max = 0;
boolean setFirstTime = false;
for (Iterator<Integer> iterator = valueSet.iterator(); iterator.hasNext();) {
Integer integer = iterator.next(); // typed iterator, so no cast is needed
if (!setFirstTime) {
max = integer;
setFirstTime = true;
}
if (max < integer) {
max = integer;
}
}
return max;
}
}
Test data
Modes {1,3} for { 3, 1, 5, 2, 4, 1, 3, 4, 3, 2, 1, 3, 4, 1 };
Modes {-1} for { 3, 1, 5, 2, 4, 1, 3, 4, 3, 2, 1, 3, 4, 1 ,-1,-1,-1,-1,-1};
As already pointed out by Nico Huysamen, finding multiple modes in Java 1.8 can alternatively be done as below.
import java.util.ArrayList;
import java.util.List;
import java.util.HashMap;
import java.util.Map;
public static void mode(List<Integer> numArr) {
Map<Integer, Integer> freq = new HashMap<Integer, Integer>();
Map<Integer, List<Integer>> mode = new HashMap<Integer, List<Integer>>();
int modeFreq = 1; //record the highest frequency
for(int x=0; x<numArr.size(); x++) { //1st for loop to record mode
Integer curr = numArr.get(x); //O(1)
freq.merge(curr, 1, (a, b) -> a + b); //increment the frequency for existing element, O(1)
int currFreq = freq.get(curr); //get frequency for current element, O(1)
//lazy instantiate a list if no existing list, then
//record mapping of frequency to element (frequency, element), overall O(1)
mode.computeIfAbsent(currFreq, k -> new ArrayList<>()).add(curr);
if(modeFreq < currFreq) modeFreq = currFreq; //update highest frequency
}
//second pass: print every element that reached the highest frequency
mode.get(modeFreq).forEach(x -> System.out.println("Mode = " + x));
}
Happy coding!
Here's the complete, clean and optimised code in Java 8:
import java.io.*;
import java.util.*;
public class Solution {
public static void main(String[] args) {
/*Take input from user*/
Scanner sc = new Scanner(System.in);
int n =0;
n = sc.nextInt();
int arr[] = new int[n];
//////////////mean code starts here//////////////////
int sum = 0;
for(int i=0;i<n; i++)
{
arr[i] = sc.nextInt();
sum += arr[i];
}
System.out.println((double)sum/n);
//////////////mean code ends here//////////////////
//////////////median code starts here//////////////////
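Arrays.sort(arr);
int val = arr.length/2;
// odd length: the middle element; even length: the mean of the two middle elements
if (n % 2 == 1) {
System.out.println((double) arr[val]);
} else {
System.out.println((arr[val]+arr[val-1])/2.0);
}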
//////////////median code ends here//////////////////
//////////////mode code starts here//////////////////
int maxValue=0;
int maxCount=0;
for(int i=0; i<n; ++i)
{
int count=0;
for(int j=0; j<n; ++j)
{
if(arr[j] == arr[i])
{
++count;
}
if(count > maxCount)
{
maxCount = count;
maxValue = arr[i];
}
}
}
System.out.println(maxValue);
//////////////mode code ends here//////////////////
}
}