List comprehension vs. lambda + filter

Question

I have a list that I want to filter by an attribute of the items.

Which of the following is preferred (readability, performance, other reasons)?

xs = [x for x in xs if x.attribute == value]

xs = filter(lambda x: x.attribute == value, xs)

A better example would be a case where you already had a nicely named function to use as your predicate. In that case, I think a lot more people would agree that filter was more readable. When you have a simple expression that can be used as-is in a listcomp, but has to be wrapped in a lambda (or similarly constructed out of partial or operator functions, etc.) to pass to filter, that's when listcomps win. — abarnert, Jul 31, 2013 at 19:15
It should be said that in Python 3 at least, the return value of filter is a filter object (a lazy iterator), not a list. — Matteo Ferla, Aug 29, 2019 at 8:59
More readable? I guess it is a matter of personal taste but to me, the list comprehension solution looks like plain English: "for each element in my_list, take it only if its attribute equals value" (!?). I guess even a non-programmer might try to understand what's going on, more or less. In the second solution... well... what's that strange "lambda" word, to start with? Again, it is probably a matter of personal taste but I would go for the list comprehension solution all the time, regardless of potential tiny differences in performance that are basically only of interest to researchers. — Sal Borrelli, Apr 29, 2021 at 7:10
The specific implementation of filter is not all that readable (not surprising since Python is not really a functional programming language). Switching the parameter order probably would have been a better choice in the language's development history, i.e. filter(xs, lambda x: ...) would then read left-to-right like "filter xs to keep only values satisfying ...". Arguably, the comprehension should be considered more readable since it is left-to-right comprehensible (see what I did there?) and more "Pythonic" based on the not-FP-language attribute of Python and the not-so-readable implementation of filter, etc. — Ezekiel Victor, Sep 25, 2022 at 15:11

CertainPerformance · Accepted Answer · 2019-09-23 01:17:48Z

714

It is strange how much beauty varies for different people. I find the list comprehension much clearer than filter+lambda, but use whichever you find easier.

There are two things that may slow down your use of filter.

The first is the function call overhead: as soon as you use a Python function (whether created by def or lambda) it is likely that filter will be slower than the list comprehension. It almost certainly is not enough to matter, and you shouldn't think much about performance until you've timed your code and found it to be a bottleneck, but the difference will be there.

The other overhead that might apply is that the lambda is being forced to access a scoped variable (value). That is slower than accessing a local variable, and in Python 2.x the list comprehension only accesses local variables. In Python 3.x up to 3.11, the list comprehension runs in a separate function, so it will also be accessing value through a closure and this difference won't apply. (Python 3.12 and later inline comprehensions instead of creating a separate function.)
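
A minimal sketch for measuring this on your own interpreter. The Item class, the list size, and the repeat count are assumptions chosen for illustration; absolute numbers depend on the Python version and the machine, so no timings are shown.

```python
import timeit
from collections import namedtuple

# Hypothetical stand-in for the objects in the question.
Item = namedtuple("Item", ["attribute"])

xs = [Item(i % 10) for i in range(10_000)]
value = 3


def with_comprehension():
    return [x for x in xs if x.attribute == value]


def with_filter():
    # list() is needed on Python 3, where filter() returns a lazy iterator.
    return list(filter(lambda x: x.attribute == value, xs))


# Both approaches select the same items.
assert with_comprehension() == with_filter()

for func in (with_comprehension, with_filter):
    seconds = timeit.timeit(func, number=100)
    print(f"{func.__name__}: {seconds:.4f} s for 100 runs")
```

Timing both variants on the real data, as the answer suggests, is more reliable than reasoning about the overhead in the abstract.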

The other option to consider is to use a generator instead of a list comprehension:

def filterbyvalue(seq, value):
    for el in seq:
        if el.attribute == value:
            yield el

Then in your main code (which is where readability really matters) you've replaced both list comprehension and filter with a hopefully meaningful function name.
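
A short usage sketch of the generator function above. The Item record and the sample data are hypothetical and only serve to make the example runnable.

```python
from collections import namedtuple

# Hypothetical record type; only the `attribute` field matters here.
Item = namedtuple("Item", ["name", "attribute"])


def filterbyvalue(seq, value):
    """Yield the elements of seq whose attribute equals value."""
    for el in seq:
        if el.attribute == value:
            yield el


items = [Item("a", 1), Item("b", 2), Item("c", 1)]

# The generator is lazy: nothing is filtered until it is consumed.
matching = filterbyvalue(items, 1)
names = [item.name for item in matching]
print(names)  # ['a', 'c']
```

Like any generator, the result can be consumed only once; wrap the call in list() if the filtered items are needed more than once.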

edited Sep 23, 2019 at 1:17

CertainPerformance


answered Jun 10, 2010 at 10:52

Duncan


82

+1 for the generator. I have a link at home to a presentation that shows how amazing generators can be. You can also replace the list comprehension with a generator expression just by changing [] to (). Also, I agree that the list comp is more beautiful.
– Wayne Werner
Jun 10, 2010 at 13:03
2

Actually, no - filter is faster. Just run a couple of quick benchmarks using something like stackoverflow.com/questions/5998245/…
– skqr
Jun 15, 2015 at 17:47
4

@skqr better to just use timeit for benchmarks, but please give an example where you find filter to be faster using a Python callback function.
– Duncan
Jun 17, 2015 at 10:32
13

@tnq177 It's David Beazley's presentation on generators - dabeaz.com/generators
– Wayne Werner
Jun 8, 2016 at 13:03
2

"...which is where readability really matters...". Sorry, but readability always matters, even in the (rare) cases when you -- crying -- have to give it up.
– Victor Schröder
Jan 17, 2019 at 8:45


Tendayi Mawushe · Accepted Answer · 2010-11-04 17:41:40Z

292

This is a somewhat religious issue in Python. Even though Guido considered removing map, filter and reduce from Python 3, there was enough of a backlash that in the end only reduce was moved from built-ins to functools.reduce.

Personally I find list comprehensions easier to read. It is more explicit what is happening from the expression [i for i in list if i.attribute == value] as all the behaviour is on the surface not inside the filter function.

I would not worry too much about the performance difference between the two approaches as it is marginal. I would really only optimise this if it proved to be the bottleneck in your application which is unlikely.

Also since the BDFL wanted filter gone from the language then surely that automatically makes list comprehensions more Pythonic ;-)

edited Nov 4, 2010 at 17:41

answered Jun 10, 2010 at 10:58

Tendayi Mawushe


4

Thanks for the links to Guido's input, if nothing else for me it means I will try not to use them any more, so that I won't get the habit, and I won't become supportive of that religion :)
– dashesy
Jun 12, 2013 at 1:17
2

but reduce is the most complex to do with simple tools! map and filter are trivial to replace with comprehensions!
– njzk2
May 30, 2014 at 20:22
11

didn't know reduce was demoted in Python3. thanks for the insight! reduce() is still quite helpful in distributed computing, like PySpark. I think that was a mistake..
– Tagar
Jun 28, 2015 at 16:10
2

@Tagar you can still use reduce you just have to import it from functools
– icc97
Oct 11, 2017 at 11:58
2

+1 for "I would really only optimise this if it proved to be the bottleneck in your application which is unlikely." – It may be off-topic but there is so much unreadable code out there just because developers want to save a few microseconds or 20 KB of memory. Unless the marginally higher memory consumption or the 2 or 5 microseconds are really an issue, clean code should always be preferred. (In this scenario, using filter is as much clean code as using list comprehension. Personally, I consider list comprehension more pythonic.)
– Thomas
Oct 2, 2021 at 0:54


Mateen Ulhaq · Accepted Answer · 2016-08-27 08:05:49Z

92

Since any speed difference is bound to be minuscule, whether to use filters or list comprehensions comes down to a matter of taste. In general I'm inclined to use comprehensions (which seems to agree with most other answers here), but there is one case where I prefer filter.

A very frequent use case is pulling out the values of some iterable X subject to a predicate P(x):

[x for x in X if P(x)]

but sometimes you want to apply some function to the values first:

[f(x) for x in X if P(f(x))]

As a specific example, consider

primes_cubed = [x*x*x for x in range(1000) if prime(x)]

I think this looks slightly better than using filter. But now consider

prime_cubes = [x*x*x for x in range(1000) if prime(x*x*x)]

In this case we want to filter against the post-computed value. Besides the issue of computing the cube twice (imagine a more expensive calculation), there is the issue of writing the expression twice, violating the DRY aesthetic. In this case I'd be apt to use

prime_cubes = filter(prime, [x*x*x for x in range(1000)])
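
A sketch of three ways to test the computed value while writing the cube only once. Because a cube is never prime, the hypothetical predicate is_palindrome stands in for prime() here so that the output is non-empty; the range is kept small for the same reason. The assignment-expression form needs Python 3.8 or later.

```python
def is_palindrome(n):
    """Hypothetical predicate standing in for prime(): True if n reads the same reversed."""
    text = str(n)
    return text == text[::-1]


# filter over a generator expression: the cube is written and computed once.
with_filter = list(filter(is_palindrome, (x * x * x for x in range(12))))

# Nested generator expression: the same idea without filter.
with_nested = [cube for cube in (x * x * x for x in range(12)) if is_palindrome(cube)]

# Assignment expression (Python 3.8+): bind the cube inside the condition.
with_walrus = [cube for x in range(12) if is_palindrome(cube := x * x * x)]

print(with_filter)  # [0, 1, 8, 343, 1331]
```

All three produce the same list; which one reads best is, as the answer says, a matter of taste.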

edited Aug 27, 2016 at 8:05

Mateen Ulhaq


answered Nov 13, 2014 at 20:00

I. J. Kennedy


11

Would you not consider using the prime via another list comprehension? Such as [prime(i) for i in [x**3 for x in range(1000)]]
– viki.omega9
Mar 12, 2015 at 2:22
39

x*x*x cannot be a prime number, as it has x^2 and x as factors. The example doesn't really make sense in a mathematical way, but maybe it's still helpful. (Maybe we could find something better though?)
– Zelphir Kaltstahl
Sep 16, 2015 at 12:06
6

Note that we may use a generator expression instead for the last example if we don't want to eat up memory: prime_cubes = filter(prime, (x*x*x for x in range(1000)))
– Mateen Ulhaq
Aug 27, 2016 at 8:13
7

@MateenUlhaq this can be optimized to prime_cubes = [1] to save both memory and cpu cycles ;-)
– Dennis Krupenik
Mar 12, 2018 at 10:21
12

@DennisKrupenik Or rather, []
– Mateen Ulhaq
Mar 12, 2018 at 15:11


Umang · Accepted Answer · 2010-06-10 10:22:36Z

36

Although filter may be the "faster way", the "Pythonic way" would be not to care about such things unless performance is absolutely critical (in which case you wouldn't be using Python!).

answered Jun 10, 2010 at 10:22

Umang


16

Late comment to an often-seen argument: Sometimes it makes a difference to have an analysis run in 5 hours instead of 10, and if that can be achieved by taking one hour optimizing python code, it can be worth it (especially if one is comfortable with python and not with faster languages).
– bli
Jan 23, 2017 at 16:44
2

But more important is how much the source code slows us down trying to read and understand it!
– thoni56
Dec 13, 2019 at 7:07
3

Basically, Pythonic way is a secret weapon that you can use when you want to say my idea is better than yours.
– user1663023
Sep 29, 2021 at 13:11
1

filter is not faster (any longer): see stackoverflow.com/a/74432106/1864294
– Michael Dorner
Nov 14, 2022 at 13:04


Jim50 · Accepted Answer · 2016-09-06 06:48:17Z

I thought I'd just add that in python 3, filter() is actually an iterator object, so you'd have to pass your filter method call to list() in order to build the filtered list. So in python 2:

lst_a = range(25) #arbitrary list
lst_b = [num for num in lst_a if num % 2 == 0]
lst_c = filter(lambda num: num % 2 == 0, lst_a)

Lists b and c have the same values and were completed in about the same time, as filter() was equivalent to [x for x in y if z]. However, in Python 3 this same code would leave lst_c holding a filter object, not a filtered list. To produce the same values in Python 3:

lst_a = range(25) #arbitrary list
lst_b = [num for num in lst_a if num % 2 == 0]
lst_c = list(filter(lambda num: num %2 == 0, lst_a))

The problem is that list() takes an iterable as its argument and creates a new list from that argument. The result is that using filter in this way in Python 3 can take up to twice as long as the [x for x in y if z] method. filter() itself is lazy, so list() drives a single pass over the original data; the extra cost comes mainly from calling the lambda once per element.
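
A small sketch of the Python 3 behaviour described above, reusing the names from the snippet. It shows that the filter object is a one-shot iterator and that list(filter(...)) and the comprehension give the same values.

```python
lst_a = range(25)

evens = filter(lambda num: num % 2 == 0, lst_a)
print(type(evens).__name__)  # filter

first_pass = list(evens)   # consumes the iterator
second_pass = list(evens)  # nothing is left to yield

lst_b = [num for num in lst_a if num % 2 == 0]
print(first_pass == lst_b)  # True
print(second_pass)          # []
```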

Adeynack · Accepted Answer · 2015-01-31 14:27:35Z

18

An important difference is that list comprehension will return a list while the filter returns a filter, which you cannot manipulate like a list (ie: call len on it, which does not work with the return of filter).

My own self-learning brought me to some similar issue.

That being said, if there is a way to have the resulting list from a filter, a bit like you would do in .NET when you do lst.Where(i => i.something()).ToList(), I am curious to know it.

EDIT: This is the case for Python 3, not 2 (see discussion in comments).
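
As a comment on this answer notes, the Python 3 counterpart of .ToList() is to pass the filter object to list() (or tuple(), set(), and so on). A minimal sketch with made-up sample data:

```python
a = [1, 2, 3, 4, 5, 6, 7, 8]
f = filter(lambda x: x % 2 == 0, a)

error = None
try:
    len(f)  # a filter object has no length
except TypeError as exc:
    error = str(exc)
    print("len() failed:", error)

# Materialize the iterator into whichever container is needed.
as_list = list(filter(lambda x: x % 2 == 0, a))
as_tuple = tuple(filter(lambda x: x % 2 == 0, a))
print(len(as_list))  # 4
```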

edited Jan 31, 2015 at 14:27

answered Oct 15, 2014 at 23:50

Adeynack


4

filter returns a list and we can use len on it. At least in my Python 2.7.6.
– thiruvenkadam
Jan 29, 2015 at 7:33
9

It is not the case in Python 3. a = [1, 2, 3, 4, 5, 6, 7, 8] f = filter(lambda x: x % 2 == 0, a) lc = [i for i in a if i % 2 == 0] >>> type(f) <class 'filter'> >>> type(lc) <class 'list'>
– Adeynack
Jan 29, 2015 at 17:33
5

"if there is a way to have the resulting list ... I am curious to know it". Just call list() on the result: list(filter(my_func, my_iterable)). And of course you could replace list with set, or tuple, or anything else that takes an iterable. But to anyone other than functional programmers, the case is even stronger to use a list comprehension rather than filter plus explicit conversion to list.
– Steve Jessop
Apr 26, 2016 at 10:54


unbeli · Accepted Answer · 2010-06-10 10:19:27Z

10

I find the second way more readable. It tells you exactly what the intention is: filter the list.
PS: do not use 'list' as a variable name

answered Jun 10, 2010 at 10:19

unbeli



John La Rooy · Accepted Answer · 2010-06-10 10:17:47Z

9

generally filter is slightly faster if using a builtin function.

I would expect the list comprehension to be slightly faster in your case
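
A sketch for checking this claim yourself, assuming made-up string data and str.isdigit as the builtin predicate. The three variants return the same result; the timings vary by interpreter version, so none are quoted here.

```python
import timeit

# Made-up sample data: half of the strings consist only of digits.
words = ["42", "abc", "7", "x1", "2024", ""] * 1_000


def with_builtin_filter():
    # str.isdigit is implemented in C, so filter() calls no Python-level function.
    return list(filter(str.isdigit, words))


def with_lambda_filter():
    # The lambda adds one Python function call per element.
    return list(filter(lambda w: w.isdigit(), words))


def with_comprehension():
    return [w for w in words if w.isdigit()]


assert with_builtin_filter() == with_lambda_filter() == with_comprehension()

for func in (with_builtin_filter, with_lambda_filter, with_comprehension):
    print(func.__name__, timeit.timeit(func, number=100))
```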

answered Jun 10, 2010 at 10:17

John La Rooy


1

python -m timeit 'filter(lambda x: x in [1,2,3,4,5], range(10000000))'
10 loops, best of 3: 1.44 sec per loop
python -m timeit '[x for x in range(10000000) if x in [1,2,3,4,5]]'
10 loops, best of 3: 860 msec per loop
Not really?!
– giaosudau
Nov 27, 2014 at 16:27
@sepdau, lambda functions are not builtins. List comprehensions have improved over the past 4 years - now the difference is negligible anyway even with builtin functions
– John La Rooy
Nov 27, 2014 at 21:08


thiruvenkadam · Accepted Answer · 2015-01-29 07:38:11Z

9

Filter is just that: it filters out elements of a list. The definition in the official docs says the same. A list comprehension, by contrast, produces a new list after acting on the elements of the previous list. (Both filter and a list comprehension create a new list rather than modifying the old one in place. "New list" here means something like a list with an entirely new data type, for example integers converted to strings.)

In your example, it is better to use filter than a list comprehension, as per the definition. However, if you want to retrieve, say, other_attribute from the list elements as a new list, then you can use a list comprehension.

return [item.other_attribute for item in my_list if item.attribute==value]

This is how I remember the difference between filter and list comprehension: to remove a few things from a list and keep the other elements intact, use filter; to apply some logic of your own to the elements and create a watered-down list suitable for some purpose, use a list comprehension.
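
The distinction can be sketched as follows. The Item record and its field values are hypothetical; the last line shows the map + filter spelling of the same projection for comparison.

```python
from collections import namedtuple

# Hypothetical record with the two fields used in the answer.
Item = namedtuple("Item", ["attribute", "other_attribute"])

my_list = [Item(1, "a"), Item(2, "b"), Item(1, "c")]
value = 1

# Keep matching elements unchanged: filter expresses this directly.
kept = list(filter(lambda item: item.attribute == value, my_list))

# Keep matching elements and transform them: one comprehension does both.
projected = [item.other_attribute for item in my_list if item.attribute == value]

# The same projection with map and filter needs two lambdas.
projected_fp = list(
    map(lambda item: item.other_attribute,
        filter(lambda item: item.attribute == value, my_list))
)

print(projected)  # ['a', 'c']
```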

edited Jan 29, 2015 at 7:38

answered Jan 29, 2015 at 7:32

thiruvenkadam


2

I will be happy to know the reason for down voting so that I will not repeat it again anywhere in the future.
– thiruvenkadam
Jan 29, 2015 at 7:41
1

the definition of filter and list comprehension were not necessary, as their meaning was not being debated. That a list comprehension should be used only for “new” lists is presented but not argued for.
– Agos
Feb 2, 2015 at 11:14
I used the definition to say that filter gives you list with same elements which are true for a case but with list comprehension we can modify the elements themselves, like converting int to str. But point taken :-)
– thiruvenkadam
Feb 2, 2015 at 14:02


rharder · Accepted Answer · 2015-08-28 19:31:41Z

7

Here's a short piece I use when I need to filter on something after the list comprehension. Just a combination of filter, lambda, and lists (otherwise known as the loyalty of a cat and the cleanliness of a dog).

In this case I'm reading a file, stripping out blank lines, commented out lines, and anything after a comment on a line:

# Throw out blank lines and comments
with open('file.txt', 'r') as lines:        
    # From the inside out:
    #    [s.partition('#')[0].strip() for s in lines]... Throws out comments
    #   filter(lambda x: x!= '', [s.part... Filters out blank lines
    #  y for y in filter... Converts filter object to list
    file_contents = [y for y in filter(lambda x: x != '', [s.partition('#')[0].strip() for s in lines])]
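
The same logic can be split into two named steps, which may be easier to read. This is a sketch, not the author's code: io.StringIO and the sample text stand in for the real file so the example runs on its own.

```python
import io

# Stand-in for open('file.txt', 'r'); the sample text is made up for illustration.
lines = io.StringIO(
    "# full-line comment\n"
    "alpha = 1  # trailing comment\n"
    "\n"
    "beta = 2\n"
)

# Step 1: drop everything after '#' and trim whitespace (lazy generator).
stripped = (s.partition('#')[0].strip() for s in lines)

# Step 2: keep only the non-empty results.
file_contents = [s for s in stripped if s]
print(file_contents)  # ['alpha = 1', 'beta = 2']
```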

answered Aug 28, 2015 at 19:31

rharder


This achieves a lot in very little code indeed. I think it might be a bit too much logic in one line to easily understand and readability is what counts though.
– Zelphir Kaltstahl
Sep 16, 2015 at 11:50
1

You could write this as file_contents = list(filter(None, (s.partition('#')[0].strip() for s in lines)))
– Steve Jessop
Apr 26, 2016 at 10:59


user1767754 · Accepted Answer · 2017-11-28 00:27:01Z

It took me some time to get familiar with the higher-order functions filter and map. Once I got used to them I actually liked filter, as it was explicit that it filters by keeping whatever is truthy, and I felt cool that I knew some functional programming terms.

Then I read this passage (Fluent Python Book):

The map and filter functions are still builtins in Python 3, but since the introduction of list comprehensions and generator expressions, they are not as important. A listcomp or a genexp does the job of map and filter combined, but is more readable.

And now I think, why bother with the concept of filter / map if you can achieve it with already widely spread idioms like list comprehensions. Furthermore maps and filters are kind of functions. In this case I prefer using Anonymous functions lambdas.

Finally, just for the sake of having it tested, I've timed both methods (map and listComp) and I didn't see any relevant speed difference that would justify making arguments about it.

from timeit import Timer

timeMap = Timer(lambda: list(map(lambda x: x*x, range(10**7))))
print(timeMap.timeit(number=100))

# Caution: this builds a list of lambda objects and never calls them, so no squares are computed.
timeListComp = Timer(lambda:[(lambda x: x*x) for x in range(10**7)])
print(timeListComp.timeit(number=100))

#Map:                 166.95695265199174
#List Comprehension   177.97208347299602
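
In the snippet above, the comprehension creates lambda objects rather than calling them, so the two timings do not measure the same work. A like-for-like sketch follows; the size N and the repeat count are assumptions chosen so it finishes quickly, and no timings are quoted because they depend on the machine.

```python
from timeit import Timer

N = 10**4  # deliberately small so the sketch finishes quickly


def with_map():
    return list(map(lambda x: x * x, range(N)))


def with_listcomp():
    # Computes the squares directly, with no function call per element.
    return [x * x for x in range(N)]


# Both variants must produce the same squares before timing them.
assert with_map() == with_listcomp()

for func in (with_map, with_listcomp):
    print(func.__name__, Timer(func).timeit(number=100))
```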

Michael Dorner · Accepted Answer · 2022-11-14 13:04:23Z

5

I would come to this conclusion: use a list comprehension over filter, since it is

more readable
more pythonic
faster (for Python 3.11, see the benchmark below)

Keep in mind that filter returns an iterator, not a list.

python3 -m timeit '[x for x in range(10000000) if x % 2 == 0]'

1 loop, best of 5: 270 msec per loop

python3 -m timeit 'list(filter(lambda x: x % 2 == 0, range(10000000)))'

1 loop, best of 5: 432 msec per loop

answered Nov 14, 2022 at 13:04

Michael Dorner



Nathaniel Ford · Accepted Answer · 2018-08-13 19:55:39Z

4

In addition to the accepted answer, there is a corner case when you should use filter instead of a list comprehension. If the list is unhashable you cannot directly process it with a list comprehension. A real-world example is using pyodbc to read results from a database. The fetchall() results from the cursor are an unhashable list. In this situation, to directly manipulate the returned results, filter should be used:

cursor.execute("SELECT * FROM TABLE1;")
data_from_db = cursor.fetchall()
processed_data = filter(lambda s: 'abc' in s.field1 or s.StartTime >= start_date_time, data_from_db)

If you use list comprehension here you will get the error:

TypeError: unhashable type: 'list'
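
As the comments on this answer point out, a list comprehension with the same condition does work on such data. A minimal sketch, assuming a hypothetical Row namedtuple and made-up values in place of real pyodbc rows:

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for the row objects returned by cursor.fetchall().
Row = namedtuple("Row", ["field1", "StartTime"])

data_from_db = [
    Row("abc-1", datetime(2018, 1, 1)),
    Row("xyz", datetime(2018, 3, 1)),
    Row("xyz", datetime(2017, 1, 1)),
]
start_date_time = datetime(2018, 2, 1)

with_filter = list(
    filter(lambda s: 'abc' in s.field1 or s.StartTime >= start_date_time, data_from_db)
)
with_comprehension = [
    s for s in data_from_db
    if 'abc' in s.field1 or s.StartTime >= start_date_time
]

print(with_filter == with_comprehension)  # True
```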

edited Aug 13, 2018 at 19:55

Nathaniel Ford


answered Feb 28, 2018 at 21:16

C.W.praen


2

All lists are unhashable:
>>> hash(list())  # TypeError: unhashable type: 'list'
Secondly, this works fine:
processed_data = [s for s in data_from_db if 'abc' in s.field1 or s.StartTime >= start_date_time]
– Thomas Grainger
Jan 29, 2020 at 10:22
2

"If the list is unhashable you cannot directly process it with a list comprehension." This is not true, and all lists are unhashable anyway.
– juanpa.arrivillaga
May 7, 2020 at 17:30


Enrique · Accepted Answer · 2022-05-27 19:50:28Z

In terms of performance, it depends.

filter does not return a list but an iterator. If you need the list immediately, filtering plus list conversion is slower than a list comprehension by about 40% for very large lists (>1M elements). Up to 100K elements there is almost no difference; from about 600K onwards differences start to appear.

If you don't convert to a list, filter is practically instantaneous.
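
The reason is that creating the filter object does no filtering at all; the predicate runs only as items are requested. A small sketch, using a hypothetical is_even predicate that records each call:

```python
call_log = []  # records every value the predicate is asked about


def is_even(n):
    call_log.append(n)
    return n % 2 == 0


lazy = filter(is_even, range(1_000_000))
print(len(call_log))  # 0: creating the filter object does no filtering

first = next(lazy)
print(first, len(call_log))  # 0 1: one item requested, one predicate call
```

The cost is deferred rather than avoided: consuming the whole iterator still calls the predicate once per element.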

More info at: https://blog.finxter.com/python-lists-filter-vs-list-comprehension-which-is-faster/

Rod Senra · Accepted Answer · 2018-10-03 19:13:25Z

Curiously on Python 3, I see filter performing faster than list comprehensions.

I always thought that the list comprehensions would be more performant. Something like: [name for name in brand_names_db if name is not None] The bytecode generated is a bit better.

>>> def f1(seq):
...     return list(filter(None, seq))
>>> def f2(seq):
...     return [i for i in seq if i is not None]
>>> disassemble(f1.__code__)
2         0 LOAD_GLOBAL              0 (list)
          2 LOAD_GLOBAL              1 (filter)
          4 LOAD_CONST               0 (None)
          6 LOAD_FAST                0 (seq)
          8 CALL_FUNCTION            2
         10 CALL_FUNCTION            1
         12 RETURN_VALUE
>>> disassemble(f2.__code__)
2           0 LOAD_CONST               1 (<code object <listcomp> at 0x10cfcaa50, file "<stdin>", line 2>)
          2 LOAD_CONST               2 ('f2.<locals>.<listcomp>')
          4 MAKE_FUNCTION            0
          6 LOAD_FAST                0 (seq)
          8 GET_ITER
         10 CALL_FUNCTION            1
         12 RETURN_VALUE

But they are actually slower:

   >>> timeit(stmt="f1(range(1000))", setup="from __main__ import f1,f2")
   21.177661532000116
   >>> timeit(stmt="f2(range(1000))", setup="from __main__ import f1,f2")
   42.233950221000214

Invalid comparison. First, you are not passing a lambda function to the filter version, which makes it default to the identity function. When defining if not None in the list comprehension you are defining a lambda function (notice the MAKE_FUNCTION statement). Second, the results are different, as the list comprehension version will remove only None value, whereas the filter version will remove all "falsy" values. Having that said, the whole purpose of microbenchmarking is useless. Those are one million iterations, times 1k items! The difference is negligible. — Victor Schröder, Jan 17, 2019 at 9:27
list(filter(None, seq)) is equal to [i for i in seq if i] not i is not None. docs.python.org/3/library/functions.html#filter — Sole Sensei, Mar 23, 2021 at 13:40
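
The difference the comments describe can be seen on a small made-up sequence containing several falsy values:

```python
seq = [0, 1, None, "", "a", [], False]

truthy = list(filter(None, seq))              # drops every falsy value
truthy_comp = [i for i in seq if i]           # equivalent comprehension
not_none = [i for i in seq if i is not None]  # drops only None

print(truthy)    # [1, 'a']
print(not_none)  # [0, 1, '', 'a', [], False]
```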

ingofreyer · Accepted Answer · 2021-02-05 07:45:07Z

Summarizing other answers

Looking through the answers, we have seen a lot of back and forth, whether or not list comprehension or filter may be faster or if it is even important or pythonic to care about such an issue. In the end, the answer is as most times: it depends.

I just stumbled across this question while optimizing code where this exact question (albeit combined with an in expression, not ==) is very relevant - the filter + lambda expression is taking up a third of my computation time (of multiple minutes).

My case

In my case, the list comprehension is much faster (twice the speed). But I suspect that this varies strongly based on the filter expression as well as the Python interpreter used.

Test it for yourself

Here is a simple code snippet that should be easy to adapt. If you profile it (most IDEs can do that easily), you will be able to easily decide for your specific case which is the better option:

whitelist = set(range(0, 100000000, 27))

input_list = list(range(0, 100000000))

proximal_list = list(filter(
        lambda x: x in whitelist,
        input_list
    ))

proximal_list2 = [x for x in input_list if x in whitelist]

print(len(proximal_list))
print(len(proximal_list2))

If you do not have an IDE that lets you profile easily, try this instead (extracted from my codebase, so a bit more complicated). This code snippet will create a profile for you that you can easily visualize using e.g. snakeviz:

import cProfile
from time import time


class BlockProfile:
    def __init__(self, profile_path):
        self.profile_path = profile_path
        self.profiler = None
        self.start_time = None

    def __enter__(self):
        self.profiler = cProfile.Profile()
        self.start_time = time()
        self.profiler.enable()

    def __exit__(self, *args):
        self.profiler.disable()
        exec_time = int((time() - self.start_time) * 1000)
        self.profiler.dump_stats(self.profile_path)


whitelist = set(range(0, 100000000, 27))
input_list = list(range(0, 100000000))

with BlockProfile("/path/to/create/profile/in/profile.pstat"):
    proximal_list = list(filter(
            lambda x: x in whitelist,
            input_list
        ))

    proximal_list2 = [x for x in input_list if x in whitelist]

print(len(proximal_list))
print(len(proximal_list2))
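
If snakeviz is not available, the dumped profile can also be read with the standard-library pstats module. The sketch below is an assumption-laden variant of the snippet above: it uses much smaller inputs, hypothetical function names, and a temporary directory for the profile file.

```python
import cProfile
import os
import pstats
import tempfile

# Smaller inputs than in the answer so the sketch runs quickly.
allowed = set(range(0, 100_000, 27))
input_list = list(range(100_000))


def with_filter():
    return list(filter(lambda x: x in allowed, input_list))


def with_comprehension():
    return [x for x in input_list if x in allowed]


profile_path = os.path.join(tempfile.mkdtemp(), "profile.pstat")

profiler = cProfile.Profile()
profiler.enable()
result_filter = with_filter()
result_comprehension = with_comprehension()
profiler.disable()
profiler.dump_stats(profile_path)

# Print the five entries with the highest cumulative time.
stats = pstats.Stats(profile_path)
stats.sort_stats("cumulative").print_stats(5)
```

Putting each variant in its own function makes the two show up as separate rows in the report.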

Mitul · Accepted Answer · 2022-03-11 06:22:46Z

Your question is simple yet interesting. It shows how flexible Python is as a programming language. One may use any logic and write the program according to one's talent and understanding. It is fine as long as we get the answer.

In your case, it is just a simple filtering task that can be done either way, but I would prefer the first one, my_list = [x for x in my_list if x.attribute == value], because it seems simple and does not need any special syntax. Anyone can understand this statement and change it if needed. (The second method is also simple, but it is still more complex than the first for beginner-level programmers.)

Vlad Havriuk · Accepted Answer · 2023-12-15 19:45:25Z

0

As mentioned in the accepted answer, filter() may incur unnecessary function call overhead, but you can use a generator expression, written with parentheses:

xs = (x for x in xs if x.attribute == value)

This way you take the best of both worlds: you get nice syntax and lazy evaluation. And if you don't need the latter, just replace () with [].
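
A minimal sketch of the difference between the two bracket styles, with a hypothetical Item record and made-up data:

```python
from collections import namedtuple

# Hypothetical stand-in for the objects in the question.
Item = namedtuple("Item", ["attribute"])

xs = [Item(1), Item(2), Item(1)]
value = 1

lazy = (x for x in xs if x.attribute == value)   # generator: nothing computed yet
eager = [x for x in xs if x.attribute == value]  # list: computed immediately

print(next(lazy))       # Item(attribute=1)
print(len(eager))       # 2
print(len(list(lazy)))  # 1: the generator resumes after the item already taken
```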

answered Dec 15, 2023 at 19:45

Vlad Havriuk


