Most efficient way to create a list of dictionaries in Python
Suppose I have a list of objects, and for each of these objects I want to create a dictionary and collect all the generated dictionaries in a list.
What I am doing is the following:
    def f(x):
        # some function f that returns a dictionary given x
        ...

    list_of_dict = []
    xlist = [x1, x2, ..., xN]
    for x in xlist:
        list_of_dict.append(f(x))
I am wondering whether there is a more efficient (faster) way to create a list of dictionaries than the one I am proposing.
Thank you.
python list dictionary
asked Nov 11 '18 at 0:02
mgiom
Faster to type in, or faster in execution? Did you time your function? Did it take a perceptually long time? For how many items?
– usr2564301
Nov 11 '18 at 0:07
Faster in execution. The xlist contains around 2,000,000 elements, and on top of making the function f more efficient I am wondering whether I can make the list creation faster.
– mgiom
Nov 11 '18 at 0:09
The append loop can be replaced with list_of_dict = [f(x) for x in xlist], but you'll have to time it to see if it's any faster. Without seeing what f(x) does, we cannot reasonably recommend anything. Perhaps cache the result, if you get lots of repeats.
– usr2564301
Nov 11 '18 at 0:12
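The caching suggestion in the comment above could be sketched roughly as follows. This assumes the inputs are hashable (for example URL strings) and that the same input always yields the same result. The function f and the sample xlist here are hypothetical stand-ins, not the asker's code.

```python
from functools import lru_cache

call_count = 0  # counts how often the underlying work actually runs


@lru_cache(maxsize=None)
def f(x):
    """Hypothetical stand-in for an expensive function that builds a dict from x."""
    global call_count
    call_count += 1
    return {"value": x, "length": len(x)}


xlist = ["a", "bb", "a", "ccc", "bb"]  # sample input that contains repeats
list_of_dict = [f(x) for x in xlist]

print(call_count)  # number of times the body of f ran
```

Note that repeated inputs receive the very same dictionary object from the cache, so mutating one entry changes every repeat of it; copy the result if that matters.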
Thank you. The xlist is a list of links, and the function f requests and parses an HTML page with BeautifulSoup from a link, creates two lists of strings, and then zips them together as a dictionary.
– mgiom
Nov 11 '18 at 0:18
map is usually one of the faster ways to apply a function to each element of a list. So you would do: list_of_dict = list(map(f, xlist))
– dawg
Nov 11 '18 at 0:41
2 Answers
Since you are dealing with HTTP requests (which is not obvious from your question), the rest of this answer is largely irrelevant: communication time will dominate computation by a wide margin. I'll leave the answer here anyway.
The original approach seems to be the slowest:
    In [20]: %%timeit
        ...: list_of_dict = []
        ...: for x in xlist:
        ...:     list_of_dict.append(f(x))
        ...:
    13.5 µs ± 39.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Mapping is the fastest of the three:
    In [21]: %timeit list(map(f, xlist))
    8.45 µs ± 17 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
The list comprehension is somewhere in the middle:
    In [22]: %timeit [f(x) for x in xlist]
    10.2 µs ± 22.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
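To repeat this comparison outside IPython, a minimal sketch with the standard timeit module might look like the following. The function f and the size of xlist are assumptions made for illustration; the timings above came from the answerer's own (unshown) f and xlist, so absolute numbers and even the ranking may differ on your machine.

```python
import timeit


def f(x):
    """Hypothetical cheap function that returns a small dictionary."""
    return {"x": x, "square": x * x}


xlist = list(range(1000))  # assumed sample input


def with_append():
    result = []
    for x in xlist:
        result.append(f(x))
    return result


def with_comprehension():
    return [f(x) for x in xlist]


def with_map():
    return list(map(f, xlist))


# Take the best of a few repeats to reduce noise from other processes.
timings = {
    func.__name__: min(timeit.repeat(func, number=50, repeat=3))
    for func in (with_append, with_comprehension, with_map)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s for 50 runs")
```

All three variants build the same list, so any difference is purely loop overhead, which becomes negligible once f does real work such as a network request.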
edited Nov 11 '18 at 2:57
answered Nov 11 '18 at 2:45
DYZ
Since you mentioned in the comments that you want faster execution overall, maybe make the requests asynchronously?
Check out this example, adapted from this blog post: http://skipperkongen.dk/2016/09/09/easy-parallel-http-requests-with-python-and-asyncio/
    import asyncio
    import requests

    async def main():
        xlist = [...]  # your list of links
        list_of_dict = []
        loop = asyncio.get_running_loop()
        # Run each blocking requests.get call in the default thread pool executor
        futures = [
            loop.run_in_executor(
                None,
                requests.get,
                i
            )
            for i in xlist
        ]
        for response in await asyncio.gather(*futures):
            response_dict = your_parser(response)  # Your parsing to dict from before
            list_of_dict.append(response_dict)
        return list_of_dict

    list_of_dict = asyncio.run(main())
I think this can help you implement parallelism. Just make sure you aren't flooding somebody with requests that they wouldn't appreciate.
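One way to keep the number of simultaneous requests bounded, in the spirit of the warning above, is a thread pool with an explicit worker limit. The sketch below is an assumption-laden illustration: fetch_and_parse is a hypothetical placeholder that does no network access, the example URLs are made up, and max_workers=8 is an arbitrary choice.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_and_parse(url):
    """Hypothetical stand-in for the real download-and-parse step."""
    # A real version would request the page here and parse the HTML.
    return {"url": url, "length": len(url)}


def build_list_of_dict(urls, max_workers=8):
    # max_workers caps how many calls run at the same time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input.
        return list(pool.map(fetch_and_parse, urls))


xlist = ["https://example.com/a", "https://example.com/bb"]  # made-up links
list_of_dict = build_list_of_dict(xlist)
```

Threads suit this case because the work is dominated by waiting on the network rather than by computation.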
answered Nov 11 '18 at 0:47
Charles Landau
Does this answer respond to the OP's question?
– Chiheb Nexus
Nov 11 '18 at 1:45
In general, async does not speed up computational tasks.
– DYZ
Nov 11 '18 at 2:47
In the example above I'm using it to make the requests in parallel, not the computation.
– Charles Landau
Nov 11 '18 at 2:48
The OP never mentions requests in their post.
– DYZ
Nov 11 '18 at 2:51
Correct, they mentioned it in the comments @DYZ
– Charles Landau
Nov 11 '18 at 2:52