I have a list of Python objects that I want to sort by a specific attribute of each object:
[Tag(name="toe", count=10), Tag(name="leg", count=2), ...]
How do I sort the list by .count
in descending order?
I have a list of Python objects that I want to sort by a specific attribute of each object:
[Tag(name="toe", count=10), Tag(name="leg", count=2), ...]
How do I sort the list by .count
in descending order?
To sort the list in place:
orig_list.sort(key=lambda x: x.count, reverse=True)
To return a new list, use sorted
:
new_list = sorted(orig_list, key=lambda x: x.count, reverse=True)
Explanation:
key=lambda x: x.count
sorts by count.reverse=True
sorts in descending order.More on sorting by keys.
A way that can be fastest, especially if your list has a lot of records, is to use operator.attrgetter("count")
. However, this might run on an pre-operator version of Python, so it would be nice to have a fallback mechanism. You might want to do the following, then:
try: import operator
except ImportError: keyfun= lambda x: x.count # use a lambda if no operator module
else: keyfun= operator.attrgetter("count") # use operator since it's faster than lambda
ut.sort(key=keyfun, reverse=True) # sort in-place
self.__dict__ = {'some':'dict'}
after the __init__
method). I don't know why it sould be different, though.
__dict__
. Note that "an object having dynamically added attributes" and "setting an object's __dict__
attribute" are almost orthogonal concepts. I'm saying that because your comment seems to imply that setting the __dict__
attribute is a requirement for dynamically adding attributes.
operator.attrgetter
, I could supply a function with any property name and return a sorted collection.
Feb 24, 2016 at 18:16
Readers should notice that the key= method:
ut.sort(key=lambda x: x.count, reverse=True)
is many times faster than adding rich comparison operators to the objects. I was surprised to read this (page 485 of "Python in a Nutshell"). You can confirm this by running tests on this little program:
#!/usr/bin/env python
import random
class C:
def __init__(self,count):
self.count = count
def __cmp__(self,other):
return cmp(self.count,other.count)
longList = [C(random.random()) for i in xrange(1000000)] #about 6.1 secs
longList2 = longList[:]
longList.sort() #about 52 - 6.1 = 46 secs
longList2.sort(key = lambda c: c.count) #about 9 - 6.1 = 3 secs
My, very minimal, tests show the first sort is more than 10 times slower, but the book says it is only about 5 times slower in general. The reason they say is due to the highly optimizes sort algorithm used in python (timsort).
Still, its very odd that .sort(lambda) is faster than plain old .sort(). I hope they fix that.
__cmp__
is equivalent to calling .sort(cmp=lambda)
, not .sort(key=lambda)
, so it isn't odd at all.
longList2.sort(cmp = cmp)
. I tried this out and it performed nearly the same as .sort()
. (Also: note that the "cmp" sort parameter was removed in Python 3.)
Oct 29, 2019 at 4:56
Object-oriented approach
It's good practice to make object sorting logic, if applicable, a property of the class rather than incorporated in each instance the ordering is required.
This ensures consistency and removes the need for boilerplate code.
At a minimum, you should specify __eq__
and __lt__
operations for this to work. Then just use sorted(list_of_objects)
.
class Card(object):
def __init__(self, rank, suit):
self.rank = rank
self.suit = suit
def __eq__(self, other):
return self.rank == other.rank and self.suit == other.suit
def __lt__(self, other):
return self.rank < other.rank
hand = [Card(10, 'H'), Card(2, 'h'), Card(12, 'h'), Card(13, 'h'), Card(14, 'h')]
hand_order = [c.rank for c in hand] # [10, 2, 12, 13, 14]
hand_sorted = sorted(hand)
hand_sorted_order = [c.rank for c in hand_sorted] # [2, 10, 12, 13, 14]
__eq__
and __lt__
are the minimum implementation requirements?
__eq__
, __lt__
, __le__
, __gt__
, __ge__
and __ne__
and inside just forward to the attributes special function?
__eq__
and __lt__
and then use @functools.total_ordering
to get the rest.
from operator import attrgetter
ut.sort(key = attrgetter('count'), reverse = True)
It looks much like a list of Django ORM model instances.
Why not sort them on query like this:
ut = Tag.objects.order_by('-count')
If the attribute you want to sort by is a property, then you can avoid importing operator.attrgetter
and use the property's fget
method instead.
For example, for a class Circle
with a property radius
we could sort a list of circles
by radii as follows:
result = sorted(circles, key=Circle.radius.fget)
This is not the most well-known feature but often saves me a line with the import.
Circle.radius
. I don't understand why it doesn't work. Thanks for the pointer of using fget
. I don't like defining lambdas if something else can do the job, and I don't like passing names of attributes as strings to attrgetter
, it just feels wrong, to me.
Jul 5, 2023 at 10:46
sorted
takes each item to be sorted, passes each of them in turn to the key, and sorts the values returned as proxies for the items themselves. Circle.radius
doesn't work because property
objects themselves aren't callable. But they do have an attribute called fget
which is.
Add rich comparison operators to the object class, then use sort() method of the list.
See rich comparison in python.
Update: Although this method would work, I think solution from Triptych is better suited to your case because way simpler.
Also if someone wants to sort list that contains strings and numbers for e.g.
eglist=[
"some0thing3",
"some0thing2",
"some1thing2",
"some1thing0",
"some3thing10",
"some3thing2",
"some1thing1",
"some0thing1"]
Then here is the code for that:
import re
def atoi(text):
return int(text) if text.isdigit() else text
def natural_keys(text):
return [ atoi(c) for c in re.split(r'(\d+)', text) ]
eglist=[
"some0thing3",
"some0thing2",
"some1thing2",
"some1thing0",
"some3thing10",
"some3thing2",
"some1thing1",
"some0thing1"
]
eglist.sort(key=natural_keys)
print(eglist)
@Jose M Vidal's answer mentions something important: using rich comparisons (__lt__
, __eq__
etc. as in @jpp's answer) makes sorting much slower than passing a key function (as in the accepted answer). However, they end up showing a test using __cmp__
method which doesn't exist in Python 3 (which tbf was indeed so much slower than passing key in Python 2).
I want to point out that even in Python 3.12.0, using rich comparisons as in @jpp's answer makes sorting much slower than passing a key function.
The results of a little timeit test is shown below where sorting using rich comparisons vs a lambda key function vs operator.attrgetter
as key are compared. For a list with 10k items, it took about 18.8 ms when rich comparisons were used whereas it took 2.93 ms when a lambda key function was used and 2.47 ms when operator.attrgetter
was used.
So, as @tzot mentioned, operator.attrgetter
is faster than a lambda; however, using a key function in the first place instead of rich comparisons makes sorting over 5 times faster.
import timeit
import random
from operator import attrgetter
class Card(object):
def __init__(self, rank):
self.rank = rank
def __eq__(self, other):
return self.rank == other.rank
def __lt__(self, other):
return self.rank < other.rank
n = 100
random.seed(0)
lst = [Card(random.randrange(10000)) for _ in range(10000)]
min(timeit.repeat(lambda: sorted(lst), number=n))/n
# 0.018813106999732553
min(timeit.repeat(lambda: sorted(lst, key=lambda card: card.rank), number=n))/n
# 0.0029304770001908763
min(timeit.repeat(lambda: sorted(lst, key=attrgetter('rank')), number=n))/n
# 0.00247172600007616