How do I write a futures::Stream to disk without storing it entirely in memory first?

There's an example of downloading a file with Rusoto S3 here:
How to save a file downloaded from S3 with Rusoto to my hard drive?

The problem is that it looks like it's downloading the whole file into memory and then writing it to disk, because it uses the write_all method which takes an array of bytes, not a stream. How can I use the StreamingBody, which implements futures::Stream to stream the file to disk?

edited Nov 11 '18 at 4:26

Shepmaster

151k13292431

asked Nov 11 '18 at 2:46

Nicholas Bishop

572413

for x in stream file.write_all(&x) something like that...

– Stargateur
Nov 11 '18 at 4:04

That would require StreamingBody to be an iterator, which it is not.

– Nicholas Bishop
Nov 11 '18 at 4:09

add a comment |

There's an example of downloading a file with Rusoto S3 here:
How to save a file downloaded from S3 with Rusoto to my hard drive?

edited Nov 11 '18 at 4:26

Shepmaster

151k13292431

asked Nov 11 '18 at 2:46

Nicholas Bishop

572413

for x in stream file.write_all(&x) something like that...

– Stargateur
Nov 11 '18 at 4:04

That would require StreamingBody to be an iterator, which it is not.

– Nicholas Bishop
Nov 11 '18 at 4:09

add a comment |

There's an example of downloading a file with Rusoto S3 here:
How to save a file downloaded from S3 with Rusoto to my hard drive?

edited Nov 11 '18 at 4:26

Shepmaster

151k13292431

asked Nov 11 '18 at 2:46

Nicholas Bishop

572413

There's an example of downloading a file with Rusoto S3 here:
How to save a file downloaded from S3 with Rusoto to my hard drive?

stream rust future

edited Nov 11 '18 at 4:26

Shepmaster

151k13292431

asked Nov 11 '18 at 2:46

Nicholas Bishop

572413

edited Nov 11 '18 at 4:26

Shepmaster

151k13292431

asked Nov 11 '18 at 2:46

Nicholas Bishop

572413

edited Nov 11 '18 at 4:26

Shepmaster

151k13292431

edited Nov 11 '18 at 4:26

Shepmaster

151k13292431

edited Nov 11 '18 at 4:26

Shepmaster

151k13292431

asked Nov 11 '18 at 2:46

Nicholas Bishop

572413

asked Nov 11 '18 at 2:46

Nicholas Bishop

572413

asked Nov 11 '18 at 2:46

Nicholas Bishop

572413

for x in stream file.write_all(&x) something like that...

– Stargateur
Nov 11 '18 at 4:04

That would require StreamingBody to be an iterator, which it is not.

– Nicholas Bishop
Nov 11 '18 at 4:09

add a comment |

for x in stream file.write_all(&x) something like that...

– Stargateur
Nov 11 '18 at 4:04

That would require StreamingBody to be an iterator, which it is not.

– Nicholas Bishop
Nov 11 '18 at 4:09

for x in stream file.write_all(&x) something like that...

– Stargateur
Nov 11 '18 at 4:04

That would require StreamingBody to be an iterator, which it is not.

– Nicholas Bishop
Nov 11 '18 at 4:09

add a comment |

1 Answer
1

active

oldest

votes

Since StreamingBody implements Stream<Item = Vec<u8>, Error = Error>, we can construct a MCVE that represents that:

extern crate futures; // 0.1.25

use futures::prelude::*, stream;

type Error = Box<std::error::Error>;

fn streaming_body() -> impl Stream<Item = Vec<u8>, Error = Error> b.to_owned());
 stream::iter_ok(iter_of_owned_bytes)

We can then get a "streaming body" somehow and use Stream::for_each to process each element in the Stream. Here, we just call write_all with some provided output location:

use std::fs::File, io::Write;

fn save_to_disk(mut file: impl Write) -> impl Future<Item = (), Error = Error> file.write_all(&chunk).map_err(Into::into))

We can then write a little testing main:

fn main() 
 let mut file = Vec::new();

 
 let fut = save_to_disk(&mut file);
 fut.wait().expect("Could not drive future");
 

 assert_eq!(file, b"0123456789ABCDEF");

Important notes about the quality of this naïve implementation:

The call to write_all may potentially block, which you should not do in an asynchronous program. It would be better to hand off that blocking work to a threadpool.

The usage of Future::wait forces the thread to block until the future is done, which is great for tests but may not be correct for your real use case.

See also:

What is the best approach to encapsulate blocking I/O in future-rs?

How do I synchronously return a value calculated in an asynchronous Future in stable Rust?

edited Nov 12 '18 at 14:12

answered Nov 11 '18 at 20:11

Shepmaster

151k13292431

One question about this. Where you call streaming_body().for_each(...), is that more or less equivalent to doing for chunk in streaming_body().wait() ... , other than one using a closure and the other using an iterator?

– Nicholas Bishop
Nov 12 '18 at 2:45

1

@NicholasBishop There is a relevant difference between the two. The for loop you suggest blocks the current thread, so the thread can't do any other work until the complete stream is resolved. The stream combinator for_each(), on the other hand, yields control to the event loop whenever it would block. (Of course the test code in this answer does not use an event loop, and also blocks until the future is resolved. However, the whole point of asynchronous code is not to unnecessarily block the current thread, so you wouldn't do this in real code.)

– Sven Marnach
Nov 12 '18 at 10:12

For my use case I do actually want to block the thread -- just because the API provided is async doesn't mean that the calling code is.

– Nicholas Bishop
Nov 12 '18 at 15:11

@NicholasBishop Then both versions are fine. I'd probably use Stream::wait() in that case, since working with stream combinators can be cumbersome.

– Sven Marnach
Nov 12 '18 at 15:27

1

Stream::wait is fine for now, but it's being removed in the futures rework (just like Future::wait). There will be a direct replacement for Future::wait, but I don't know of one for Stream::wait.

– Shepmaster
Nov 12 '18 at 16:29

add a comment |

Your Answer

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53245412%2fhow-do-i-write-a-futuresstream-to-disk-without-storing-it-entirely-in-memory-f%23new-answer', 'question_page');

);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

Since StreamingBody implements Stream<Item = Vec<u8>, Error = Error>, we can construct a MCVE that represents that:

extern crate futures; // 0.1.25

use futures::prelude::*, stream;

type Error = Box<std::error::Error>;

fn streaming_body() -> impl Stream<Item = Vec<u8>, Error = Error> b.to_owned());
 stream::iter_ok(iter_of_owned_bytes)

We can then get a "streaming body" somehow and use Stream::for_each to process each element in the Stream. Here, we just call write_all with some provided output location:

use std::fs::File, io::Write;

fn save_to_disk(mut file: impl Write) -> impl Future<Item = (), Error = Error> file.write_all(&chunk).map_err(Into::into))

We can then write a little testing main:

fn main() 
 let mut file = Vec::new();

 
 let fut = save_to_disk(&mut file);
 fut.wait().expect("Could not drive future");
 

 assert_eq!(file, b"0123456789ABCDEF");

Important notes about the quality of this naïve implementation:

The call to write_all may potentially block, which you should not do in an asynchronous program. It would be better to hand off that blocking work to a threadpool.

The usage of Future::wait forces the thread to block until the future is done, which is great for tests but may not be correct for your real use case.

See also:

What is the best approach to encapsulate blocking I/O in future-rs?

How do I synchronously return a value calculated in an asynchronous Future in stable Rust?

edited Nov 12 '18 at 14:12

answered Nov 11 '18 at 20:11

Shepmaster

151k13292431

One question about this. Where you call streaming_body().for_each(...), is that more or less equivalent to doing for chunk in streaming_body().wait() ... , other than one using a closure and the other using an iterator?

– Nicholas Bishop
Nov 12 '18 at 2:45

1

@NicholasBishop There is a relevant difference between the two. The for loop you suggest blocks the current thread, so the thread can't do any other work until the complete stream is resolved. The stream combinator for_each(), on the other hand, yields control to the event loop whenever it would block. (Of course the test code in this answer does not use an event loop, and also blocks until the future is resolved. However, the whole point of asynchronous code is not to unnecessarily block the current thread, so you wouldn't do this in real code.)

– Sven Marnach
Nov 12 '18 at 10:12

For my use case I do actually want to block the thread -- just because the API provided is async doesn't mean that the calling code is.

– Nicholas Bishop
Nov 12 '18 at 15:11

@NicholasBishop Then both versions are fine. I'd probably use Stream::wait() in that case, since working with stream combinators can be cumbersome.

– Sven Marnach
Nov 12 '18 at 15:27

1

Stream::wait is fine for now, but it's being removed in the futures rework (just like Future::wait). There will be a direct replacement for Future::wait, but I don't know of one for Stream::wait.

– Shepmaster
Nov 12 '18 at 16:29

add a comment |

Since StreamingBody implements Stream<Item = Vec<u8>, Error = Error>, we can construct a MCVE that represents that:

extern crate futures; // 0.1.25

use futures::prelude::*, stream;

type Error = Box<std::error::Error>;

fn streaming_body() -> impl Stream<Item = Vec<u8>, Error = Error> b.to_owned());
 stream::iter_ok(iter_of_owned_bytes)

We can then get a "streaming body" somehow and use Stream::for_each to process each element in the Stream. Here, we just call write_all with some provided output location:

use std::fs::File, io::Write;

fn save_to_disk(mut file: impl Write) -> impl Future<Item = (), Error = Error> file.write_all(&chunk).map_err(Into::into))

We can then write a little testing main:

fn main() 
 let mut file = Vec::new();

 
 let fut = save_to_disk(&mut file);
 fut.wait().expect("Could not drive future");
 

 assert_eq!(file, b"0123456789ABCDEF");

Important notes about the quality of this naïve implementation:

The call to write_all may potentially block, which you should not do in an asynchronous program. It would be better to hand off that blocking work to a threadpool.

The usage of Future::wait forces the thread to block until the future is done, which is great for tests but may not be correct for your real use case.

See also:

What is the best approach to encapsulate blocking I/O in future-rs?

How do I synchronously return a value calculated in an asynchronous Future in stable Rust?

edited Nov 12 '18 at 14:12

answered Nov 11 '18 at 20:11

Shepmaster

151k13292431

One question about this. Where you call streaming_body().for_each(...), is that more or less equivalent to doing for chunk in streaming_body().wait() ... , other than one using a closure and the other using an iterator?

– Nicholas Bishop
Nov 12 '18 at 2:45

1

@NicholasBishop There is a relevant difference between the two. The for loop you suggest blocks the current thread, so the thread can't do any other work until the complete stream is resolved. The stream combinator for_each(), on the other hand, yields control to the event loop whenever it would block. (Of course the test code in this answer does not use an event loop, and also blocks until the future is resolved. However, the whole point of asynchronous code is not to unnecessarily block the current thread, so you wouldn't do this in real code.)

– Sven Marnach
Nov 12 '18 at 10:12

For my use case I do actually want to block the thread -- just because the API provided is async doesn't mean that the calling code is.

– Nicholas Bishop
Nov 12 '18 at 15:11

@NicholasBishop Then both versions are fine. I'd probably use Stream::wait() in that case, since working with stream combinators can be cumbersome.

– Sven Marnach
Nov 12 '18 at 15:27

1

Stream::wait is fine for now, but it's being removed in the futures rework (just like Future::wait). There will be a direct replacement for Future::wait, but I don't know of one for Stream::wait.

– Shepmaster
Nov 12 '18 at 16:29

add a comment |

Since StreamingBody implements Stream<Item = Vec<u8>, Error = Error>, we can construct a MCVE that represents that:

extern crate futures; // 0.1.25

use futures::prelude::*, stream;

type Error = Box<std::error::Error>;

fn streaming_body() -> impl Stream<Item = Vec<u8>, Error = Error> b.to_owned());
 stream::iter_ok(iter_of_owned_bytes)

We can then get a "streaming body" somehow and use Stream::for_each to process each element in the Stream. Here, we just call write_all with some provided output location:

use std::fs::File, io::Write;

fn save_to_disk(mut file: impl Write) -> impl Future<Item = (), Error = Error> file.write_all(&chunk).map_err(Into::into))

We can then write a little testing main:

fn main() 
 let mut file = Vec::new();

 
 let fut = save_to_disk(&mut file);
 fut.wait().expect("Could not drive future");
 

 assert_eq!(file, b"0123456789ABCDEF");

Important notes about the quality of this naïve implementation:

The call to write_all may potentially block, which you should not do in an asynchronous program. It would be better to hand off that blocking work to a threadpool.

The usage of Future::wait forces the thread to block until the future is done, which is great for tests but may not be correct for your real use case.

See also:

What is the best approach to encapsulate blocking I/O in future-rs?

How do I synchronously return a value calculated in an asynchronous Future in stable Rust?

edited Nov 12 '18 at 14:12

answered Nov 11 '18 at 20:11

Shepmaster

151k13292431

Since StreamingBody implements Stream<Item = Vec<u8>, Error = Error>, we can construct a MCVE that represents that:

extern crate futures; // 0.1.25

use futures::prelude::*, stream;

type Error = Box<std::error::Error>;

fn streaming_body() -> impl Stream<Item = Vec<u8>, Error = Error> b.to_owned());
 stream::iter_ok(iter_of_owned_bytes)

We can then get a "streaming body" somehow and use Stream::for_each to process each element in the Stream. Here, we just call write_all with some provided output location:

use std::fs::File, io::Write;

fn save_to_disk(mut file: impl Write) -> impl Future<Item = (), Error = Error> file.write_all(&chunk).map_err(Into::into))

We can then write a little testing main:

fn main() 
 let mut file = Vec::new();

 
 let fut = save_to_disk(&mut file);
 fut.wait().expect("Could not drive future");
 

 assert_eq!(file, b"0123456789ABCDEF");

Important notes about the quality of this naïve implementation:

The call to write_all may potentially block, which you should not do in an asynchronous program. It would be better to hand off that blocking work to a threadpool.

The usage of Future::wait forces the thread to block until the future is done, which is great for tests but may not be correct for your real use case.

See also:

What is the best approach to encapsulate blocking I/O in future-rs?

How do I synchronously return a value calculated in an asynchronous Future in stable Rust?

edited Nov 12 '18 at 14:12

answered Nov 11 '18 at 20:11

Shepmaster

151k13292431

edited Nov 12 '18 at 14:12

answered Nov 11 '18 at 20:11

Shepmaster

151k13292431

answered Nov 11 '18 at 20:11

Shepmaster

151k13292431

answered Nov 11 '18 at 20:11

Shepmaster

151k13292431

One question about this. Where you call streaming_body().for_each(...), is that more or less equivalent to doing for chunk in streaming_body().wait() ... , other than one using a closure and the other using an iterator?

– Nicholas Bishop
Nov 12 '18 at 2:45

1

@NicholasBishop There is a relevant difference between the two. The for loop you suggest blocks the current thread, so the thread can't do any other work until the complete stream is resolved. The stream combinator for_each(), on the other hand, yields control to the event loop whenever it would block. (Of course the test code in this answer does not use an event loop, and also blocks until the future is resolved. However, the whole point of asynchronous code is not to unnecessarily block the current thread, so you wouldn't do this in real code.)

– Sven Marnach
Nov 12 '18 at 10:12

For my use case I do actually want to block the thread -- just because the API provided is async doesn't mean that the calling code is.

– Nicholas Bishop
Nov 12 '18 at 15:11

@NicholasBishop Then both versions are fine. I'd probably use Stream::wait() in that case, since working with stream combinators can be cumbersome.

– Sven Marnach
Nov 12 '18 at 15:27

1

Stream::wait is fine for now, but it's being removed in the futures rework (just like Future::wait). There will be a direct replacement for Future::wait, but I don't know of one for Stream::wait.

– Shepmaster
Nov 12 '18 at 16:29

add a comment |

One question about this. Where you call streaming_body().for_each(...), is that more or less equivalent to doing for chunk in streaming_body().wait() ... , other than one using a closure and the other using an iterator?

– Nicholas Bishop
Nov 12 '18 at 2:45

1

@NicholasBishop There is a relevant difference between the two. The for loop you suggest blocks the current thread, so the thread can't do any other work until the complete stream is resolved. The stream combinator for_each(), on the other hand, yields control to the event loop whenever it would block. (Of course the test code in this answer does not use an event loop, and also blocks until the future is resolved. However, the whole point of asynchronous code is not to unnecessarily block the current thread, so you wouldn't do this in real code.)

– Sven Marnach
Nov 12 '18 at 10:12

For my use case I do actually want to block the thread -- just because the API provided is async doesn't mean that the calling code is.

– Nicholas Bishop
Nov 12 '18 at 15:11

@NicholasBishop Then both versions are fine. I'd probably use Stream::wait() in that case, since working with stream combinators can be cumbersome.

– Sven Marnach
Nov 12 '18 at 15:27

1

Stream::wait is fine for now, but it's being removed in the futures rework (just like Future::wait). There will be a direct replacement for Future::wait, but I don't know of one for Stream::wait.

– Shepmaster
Nov 12 '18 at 16:29

One question about this. Where you call streaming_body().for_each(...), is that more or less equivalent to doing for chunk in streaming_body().wait() ... , other than one using a closure and the other using an iterator?

– Nicholas Bishop
Nov 12 '18 at 2:45

@NicholasBishop There is a relevant difference between the two. The for loop you suggest blocks the current thread, so the thread can't do any other work until the complete stream is resolved. The stream combinator for_each(), on the other hand, yields control to the event loop whenever it would block. (Of course the test code in this answer does not use an event loop, and also blocks until the future is resolved. However, the whole point of asynchronous code is not to unnecessarily block the current thread, so you wouldn't do this in real code.)

– Sven Marnach
Nov 12 '18 at 10:12

For my use case I do actually want to block the thread -- just because the API provided is async doesn't mean that the calling code is.

– Nicholas Bishop
Nov 12 '18 at 15:11

@NicholasBishop Then both versions are fine. I'd probably use Stream::wait() in that case, since working with stream combinators can be cumbersome.

– Sven Marnach
Nov 12 '18 at 15:27

Stream::wait is fine for now, but it's being removed in the futures rework (just like Future::wait). There will be a direct replacement for Future::wait, but I don't know of one for Stream::wait.

– Shepmaster
Nov 12 '18 at 16:29

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Dfyjkt