finding the details of incomplete tasks from a list




I have a recursive descent function that takes all of the files in the parent directory, plus all of the files from any number of child directories, and sends them to AWS S3. Based on this post, I set a 5-minute timeout to let all of the files in a folder get pushed to S3; if it takes longer than that, I want to cancel any remaining tasks. I set the cancel flag on the token regardless of whether the Delay or the Wait on the WhenAny hit the timeout. Afterwards, I want to take all of the tasks in the list that didn't complete and pull the details of each request for logging. Microsoft says the Id and CurrentId of a task can't be considered unique.





How can I get the request object that created the task from the task object?


    private static void ProcessDirectory(System.IO.DirectoryInfo di)
    {
        int _timeOut = 5 * 60 * 1000;
        foreach (var item in di.GetDirectories())
        {
            ProcessDirectory(item);
        }

        using (Amazon.S3.AmazonS3Client _client = new Amazon.S3.AmazonS3Client())
        {
            System.Threading.CancellationTokenSource _cancellationTokenSource = new System.Threading.CancellationTokenSource();
            System.Collections.Generic.List<System.Threading.Tasks.Task<Amazon.S3.Model.PutObjectResponse>> _responses = new List<System.Threading.Tasks.Task<Amazon.S3.Model.PutObjectResponse>>(1000);
            foreach (var item in di.GetFiles())
            {
                _responses.Add(_client.PutObjectAsync(new Amazon.S3.Model.PutObjectRequest
                {
                    BucketName = SiteSettings.Bucket,
                    CannedACL = Amazon.S3.S3CannedACL.PublicRead,
                    FilePath = item.FullName,
                    Key = item.FullName.Replace(SiteSettings.OutputRoot, string.Empty).Replace(@"\", "/")
                }, _cancellationTokenSource.Token));
            }

            // Wait 5 Mins + 1 sec
            System.Threading.Tasks.Task.WhenAny(System.Threading.Tasks.Task<Amazon.S3.Model.PutObjectResponse>.WhenAll(_responses)
                , System.Threading.Tasks.Task.Delay(_timeOut)).Wait(_timeOut + 1000);

            _cancellationTokenSource.Cancel(); //Cancel the remaining pushes for this folder.
            foreach (var item in _responses)
            {
                if (!item.IsCompleted)
                {
                    //Pull the key value to log
                }
            }
        }
    }








Your code would be a lot simpler and faster if you changed ProcessDirectory into an asynchronous method and just awaited the individual PUTs in the loop. Trying to execute, e.g., 100 PUTs at the same time will result in 100 slow uploads. For large files, a single upload at a time would be faster than 100 concurrent uploads. Other options would be to use Parallel.For with a limit on how many concurrent tasks can run at a time, or an ActionBlock<T> with a DOP > 1.
– Panagiotis Kanavos
Sep 5 '18 at 15:36
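One way to cap how many uploads run at once is a `SemaphoreSlim` gate. This is a different mechanism from the `Parallel.For` and `ActionBlock<T>` options named in the comment, used here only because it needs no extra packages. The following is a minimal sketch, not the commenter's code: `FakeUploadAsync`, `UploadAllAsync`, the file names, and the 20 ms delay are all hypothetical stand-ins for a real `PutObjectAsync` call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledUploadDemo
{
    private static readonly object Sync = new object();
    private static int _running;
    private static int _peak;

    // Highest number of simulated uploads observed running at the same time.
    public static int Peak => _peak;

    // Hypothetical stand-in for a real upload: waits briefly and tracks concurrency.
    private static async Task FakeUploadAsync(string key)
    {
        lock (Sync)
        {
            _running++;
            if (_running > _peak) _peak = _running;
        }
        await Task.Delay(20);
        lock (Sync)
        {
            _running--;
        }
    }

    // Starts one task per key, but lets at most maxConcurrency uploads run at once.
    public static async Task UploadAllAsync(IEnumerable<string> keys, int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = keys.Select(async key =>
            {
                await gate.WaitAsync();
                try
                {
                    await FakeUploadAsync(key);
                }
                finally
                {
                    gate.Release();
                }
            }).ToList();

            await Task.WhenAll(tasks);
        }
    }

    public static void Main()
    {
        var keys = Enumerable.Range(1, 20).Select(i => "file" + i + ".txt");
        UploadAllAsync(keys, 4).Wait();
        Console.WriteLine("Peak concurrency: " + Peak);
    }
}
```

With a gate like this, the limit becomes a single number that can be tuned against the available bandwidth, rather than being fixed by the number of files in the folder.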







@PanagiotisKanavos Can you explain why threading one item would be faster than multiple items? It seems to me that would just create a thread for the sake of creating a thread, and instead of running in the foreground it gets put into the background to complete whenever the CPU has time to get to it, versus having the context and CPU ready to run the code block.
– Aaron
Sep 5 '18 at 20:25




1 Answer



You can save a unique key for every work item when you create it and then use that key for logging. In this example I used item.FullName as the key. I have also taken the liberty of removing some of the long namespaces before the types for better readability; I hope you don't mind:




    private static void ProcessDirectory(System.IO.DirectoryInfo di)
    {
        int _timeOut = 5 * 60 * 1000;
        foreach (var item in di.GetDirectories())
        {
            ProcessDirectory(item);
        }

        using (Amazon.S3.AmazonS3Client _client = new Amazon.S3.AmazonS3Client())
        {
            CancellationTokenSource _cancellationTokenSource = new CancellationTokenSource();
            Dictionary<string, Task<Amazon.S3.Model.PutObjectResponse>> _responses =
                new Dictionary<string, Task<Amazon.S3.Model.PutObjectResponse>>(1000);

            foreach (var item in di.GetFiles())
            {
                // use any unique information about your item here
                var itemName = item.FullName;
                _responses[itemName] = _client.PutObjectAsync(new Amazon.S3.Model.PutObjectRequest
                {
                    BucketName = SiteSettings.Bucket,
                    CannedACL = Amazon.S3.S3CannedACL.PublicRead,
                    FilePath = itemName,
                    Key = item.FullName.Replace(SiteSettings.OutputRoot, string.Empty).Replace(@"\", "/")
                }, _cancellationTokenSource.Token);
            }

            // Wait 5 Mins + 1 sec
            Task.WhenAny(Task<Amazon.S3.Model.PutObjectResponse>.WhenAll(_responses.Values)
                , Task.Delay(_timeOut)).Wait(_timeOut + 1000);

            _cancellationTokenSource.Cancel(); //Cancel the remaining pushes for this folder.
            foreach (var item in _responses)
            {
                if (!item.Value.IsCompleted)
                {
                    //Pull the key value to log
                    var keyValue = item.Key;
                }
            }
        }
    }






As you can see, I replaced List<Task<Amazon.S3.Model.PutObjectResponse>> with Dictionary<string, Task<Amazon.S3.Model.PutObjectResponse>>, where the key is the full name of the file. So if some task in the dictionary doesn't finish within 5 minutes, you will be able to get the name of the file that was not uploaded.
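The keyed-dictionary idea can also be tried without AWS. The following is a minimal sketch under stated assumptions: `FakeUploadAsync` is a hypothetical stand-in for `PutObjectAsync` that simply delays, and the file names, durations, and the 500 ms timeout are made up for illustration. It differs from the code above in one respect: it checks `Status != TaskStatus.RanToCompletion` rather than `!IsCompleted`, so that cancelled and faulted uploads are reported along with those still running.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class IncompleteTaskDemo
{
    // Hypothetical stand-in for PutObjectAsync: "uploads" by waiting for durationMs.
    private static async Task<string> FakeUploadAsync(string key, int durationMs, CancellationToken token)
    {
        await Task.Delay(durationMs, token);
        return key;
    }

    // Returns the keys of the uploads that did not finish successfully within the timeout.
    public static List<string> FindIncomplete(IDictionary<string, int> durations, int timeoutMs)
    {
        using (var cts = new CancellationTokenSource())
        {
            // The key (here a file name) is what ties each task back to its request.
            var uploads = new Dictionary<string, Task<string>>();
            foreach (var pair in durations)
            {
                uploads[pair.Key] = FakeUploadAsync(pair.Key, pair.Value, cts.Token);
            }

            // Wait for all uploads or for the timeout, whichever comes first.
            Task.WhenAny(Task.WhenAll(uploads.Values), Task.Delay(timeoutMs)).Wait();

            // Cancel whatever is still in flight.
            cts.Cancel();

            // Only RanToCompletion means success; everything else gets reported.
            return uploads
                .Where(pair => pair.Value.Status != TaskStatus.RanToCompletion)
                .Select(pair => pair.Key)
                .OrderBy(key => key)
                .ToList();
        }
    }

    public static void Main()
    {
        var durations = new Dictionary<string, int>
        {
            { "fast.txt", 10 },
            { "slow.txt", 5000 }
        };

        foreach (var key in FindIncomplete(durations, 500))
        {
            Console.WriteLine("Did not finish: " + key);
        }
    }
}
```

If the full request is needed for logging rather than just the file name, the dictionary value could hold both the request and the task, for example as a tuple.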





Hope it helps.


