Most efficient way to create a list of dictionaries in Python

Suppose I have a list of objects, and for each object I want to create a dictionary and collect all the generated dictionaries in a list.

What I am doing is the following:

def f(x):
    # some function f that returns a dictionary given x
    ...

list_of_dict = []

xlist = [x1, x2, ..., xN]

for x in xlist:
    list_of_dict.append(f(x))

I am wondering whether there is a more efficient (faster) way to create a list of dictionaries than the one I am proposing.

Thank you.

asked Nov 11 '18 at 0:02

mgiom

226

Faster to type in, or faster in execution? Did you time your function? Did it take a perceptibly long time? For how many items?

– usr2564301
Nov 11 '18 at 0:07

Faster in execution. The xlist contains around 2,000,000 elements, and on top of making the function f more efficient I am wondering whether I can make the list creation faster.

– mgiom
Nov 11 '18 at 0:09

1

The append loop can be replaced with list_of_dict = [f(x) for x in xlist] but you'll have to time it to see if it's any faster. Without seeing what f(x) does, we cannot reasonably recommend anything. Perhaps cache the result, if you get lots of repeats.

– usr2564301
Nov 11 '18 at 0:12
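
The caching suggestion in the comment above could be sketched with functools.lru_cache from the standard library. This is only an illustration under assumptions: the body of f below is a hypothetical stand-in (the real f is not shown in the question), the calls counter exists purely to show the effect, and the elements of xlist are assumed to be hashable.

```python
from functools import lru_cache

calls = 0  # counts how often the body of f actually runs


@lru_cache(maxsize=None)
def f(x):
    # Hypothetical stand-in for the real f, which is not shown in the question.
    global calls
    calls += 1
    return {"value": x, "square": x * x}


# Example input with repeated elements (assumed data, for illustration only).
xlist = [1, 2, 2, 3, 1, 1]
list_of_dict = [f(x) for x in xlist]

# The body of f ran once per distinct input, not once per element.
print(calls, len(list_of_dict))
```

Note that with this kind of cache, repeated inputs share the same dictionary object, so a copy would be needed before mutating any individual entry.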

Thank you. The xlist is a list of links, and the function f requests and parses an HTML page from a link with BeautifulSoup, creates two lists of strings, and then zips them together into a dictionary.

– mgiom
Nov 11 '18 at 0:18

1

map is usually one of the faster methods to apply a function to each element of a list. So you would do: list_of_dict = list(map(f, xlist))

– dawg
Nov 11 '18 at 0:41


python list dictionary


2 Answers

Since you are dealing with HTTP requests (which is not obvious from your question), the rest of this answer is largely irrelevant: communication time will dominate the computation by a wide margin. I'll leave the answer here anyway.

The original approach seems to be the slowest:

In [20]: %%timeit
    ...: list_of_dict = []
    ...: for x in xlist:
    ...:     list_of_dict.append(f(x))
    ...:
13.5 µs ± 39.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Mapping is the fastest of the three:

In [21]: %timeit list(map(f, xlist))
8.45 µs ± 17 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

List comprehension is somewhere in the middle:

In [22]: %timeit [f(x) for x in xlist] 
10.2 µs ± 22.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
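
For readers without IPython, roughly the same comparison can be run with the standard timeit module. This is a minimal sketch, not the setup used for the timings above: the f and xlist below are hypothetical stand-ins, since the answer does not show the ones it measured, and the absolute numbers will differ by machine and Python version.

```python
import timeit


def f(x):
    # Hypothetical stand-in; the f that was actually timed is not shown.
    return {"value": x}


xlist = list(range(100))  # assumed input size, for illustration only


def with_append():
    list_of_dict = []
    for x in xlist:
        list_of_dict.append(f(x))
    return list_of_dict


def with_comprehension():
    return [f(x) for x in xlist]


def with_map():
    return list(map(f, xlist))


RUNS = 1000
for func in (with_append, with_comprehension, with_map):
    seconds = timeit.timeit(func, number=RUNS)
    # Convert total seconds for all runs into microseconds per call.
    print(f"{func.__name__}: {seconds / RUNS * 1e6:.2f} µs per call")
```

All three variants build the same list, so any difference in the printed numbers comes only from loop overhead, which is small compared with whatever f itself does.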

edited Nov 11 '18 at 2:57

answered Nov 11 '18 at 2:45

DYZ

26.1k61948


-1

Since you mentioned in the comments that you want faster execution overall, you could issue the requests concurrently with asyncio.

Check out this example adapted from this blog post: http://skipperkongen.dk/2016/09/09/easy-parallel-http-requests-with-python-and-asyncio/

import asyncio
import requests

async def main():
    xlist = [...]
    list_of_dict = []
    loop = asyncio.get_event_loop()
    # Run each blocking requests.get call in the default thread pool executor
    futures = [
        loop.run_in_executor(
            None,
            requests.get,
            i
        )
        for i in xlist
    ]
    # gather returns the responses in the same order as xlist
    for response in await asyncio.gather(*futures):
        response_dict = your_parser(response)  # Your parsing to dict from before
        list_of_dict.append(response_dict)
    return list_of_dict

loop = asyncio.get_event_loop()
list_of_dict = loop.run_until_complete(main())

I think this can help you run the requests in parallel. Just make sure you aren't flooding a server with more requests than its owner would appreciate.
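
One way to avoid flooding a server is to cap how many requests are in flight at once. The sketch below shows the idea with asyncio.Semaphore. It is an illustration under assumptions: fake_fetch is a hypothetical stand-in that sleeps instead of making a real HTTP request, the example.com URLs and the MAX_CONCURRENT value are made up, and the state dictionary exists only to record the peak concurrency.

```python
import asyncio

MAX_CONCURRENT = 5  # assumed limit; choose one the target server tolerates


async def fake_fetch(url, semaphore, state):
    # Hypothetical stand-in for a real HTTP request plus parsing.
    async with semaphore:  # at most MAX_CONCURRENT coroutines pass this point
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # simulates network latency
        state["active"] -= 1
        return {"url": url}


async def main(xlist):
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    state = {"active": 0, "peak": 0}
    tasks = [fake_fetch(url, semaphore, state) for url in xlist]
    # gather preserves the order of xlist in its result list
    list_of_dict = await asyncio.gather(*tasks)
    return list_of_dict, state["peak"]


# Hypothetical URLs, for illustration only.
xlist = [f"https://example.com/page/{i}" for i in range(20)]
list_of_dict, peak = asyncio.run(main(xlist))
print(len(list_of_dict), peak)
```

With the run_in_executor approach shown earlier, a similar cap could be obtained by passing a concurrent.futures.ThreadPoolExecutor with a chosen max_workers instead of None.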

answered Nov 11 '18 at 0:47

Charles Landau

2,1851215

Does this answer respond to the OP's question?

– Chiheb Nexus
Nov 11 '18 at 1:45

In general, async does not speed up computational tasks.

– DYZ
Nov 11 '18 at 2:47

In the example above I'm using it to make the requests in parallel, not the computation.

– Charles Landau
Nov 11 '18 at 2:48

The OP never mentions requests in their post.

– DYZ
Nov 11 '18 at 2:51

1

Correct, they mentioned it in the comments @DYZ

– Charles Landau
Nov 11 '18 at 2:52
