Split tensorflow dataset in dataset per class



I have a dataset created from one tfrecord file. This dataset contains 5 different classes.



Now I want to create batches with a fixed number of elements (8, for example) from each class. Each batch should therefore contain 40 elements: 8 from each of the 5 classes.



Is this possible with tf.data?




1 Answer



The easiest approach (though perhaps not the most convenient) is:



a) Prepare 5 different TFRecord files, each containing elements of only one specific class.



b) Create 5 different tf.data.TFRecordDataset instances (one per file), each batched to 8 elements, and hence 5 different iterators.



c) Then in the main code :


iterators = [....] # Store your iterators in a list
data = list(map(lambda x : x.get_next(), iterators))
data_to_use = tf.concat(....) # Concat your data in one single batch of `40` elements.
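Steps a) to c) can be sketched end to end as follows. This is a minimal sketch assuming TensorFlow 2.x with eager execution, and it swaps the explicit iterator list for `tf.data.Dataset.zip`, which pairs up one batch from each per-class dataset. The feature names `value` and `label`, the toy records, and the helpers `make_example`, `parse` and `merge` are hypothetical stand-ins for your own data and parsing code.

```python
import os
import tempfile

import tensorflow as tf

NUM_CLASSES = 5
PER_CLASS = 8

def make_example(value, label):
    # Serialize one toy (value, label) record; feature names are hypothetical.
    return tf.train.Example(features=tf.train.Features(feature={
        "value": tf.train.Feature(float_list=tf.train.FloatList(value=[value])),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    })).SerializeToString()

def parse(serialized):
    # Decode one serialized record back into a (value, label) pair.
    spec = {
        "value": tf.io.FixedLenFeature([], tf.float32),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    parsed = tf.io.parse_single_example(serialized, spec)
    return parsed["value"], parsed["label"]

# Step a) one TFRecord file per class (16 toy records each).
tmp_dir = tempfile.mkdtemp()
paths = []
for c in range(NUM_CLASSES):
    path = os.path.join(tmp_dir, f"class_{c}.tfrecord")
    with tf.io.TFRecordWriter(path) as writer:
        for i in range(16):
            writer.write(make_example(float(i), c))
    paths.append(path)

# Step b) one dataset per file, each yielding batches of exactly 8 elements.
per_class = tuple(
    tf.data.TFRecordDataset(p)
    .map(parse)
    .repeat()
    .batch(PER_CLASS, drop_remainder=True)
    for p in paths
)

# Step c) take one batch from every class and concatenate them.
def merge(*parts):
    # Each part is a (values, labels) pair holding 8 elements of one class.
    values = tf.concat([v for v, _ in parts], axis=0)
    labels = tf.concat([l for _, l in parts], axis=0)
    return values, labels

balanced = tf.data.Dataset.zip(per_class).map(merge)
values, labels = next(iter(balanced))
```

Each element of `balanced` holds 40 records, ordered class by class. If the order within a batch matters for training, the elements can be shuffled inside `merge`.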



Another approach (without creating separate TFRecord files):



a) Use only one TFRecord file, but create 5 different dataset instances from it.



b) On each instance, use the filter(predicate) method of tf.data.Dataset to keep only the records that belong to one specific class. For that you will have to write a predicate function that checks the class of each record.



c) Then follow step c) as in the previous solution.
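The filter-based variant might look like the sketch below, again assuming TensorFlow 2.x with eager execution. An in-memory dataset built with `from_tensor_slices` stands in for the parsed TFRecord dataset, and `only_class` and `merge` are hypothetical helper names. As in the sketch for the first approach, `tf.data.Dataset.zip` replaces the explicit list of iterators.

```python
import tensorflow as tf

NUM_CLASSES = 5
PER_CLASS = 8

# Stand-in for the parsed single-file dataset: 100 (feature, label) pairs
# whose labels cycle through 0..4.
features = tf.range(100, dtype=tf.float32)
labels = tf.range(100, dtype=tf.int64) % NUM_CLASSES
full = tf.data.Dataset.from_tensor_slices((features, labels))

def only_class(c):
    # Build a predicate that keeps only records whose label equals c.
    return lambda x, y: tf.equal(y, c)

# Steps a) and b): 5 views of the same dataset, each filtered to one class
# and batched to exactly 8 elements.
per_class = tuple(
    full.filter(only_class(c)).repeat().batch(PER_CLASS, drop_remainder=True)
    for c in range(NUM_CLASSES)
)

# Step c): concatenate one batch per class into a single batch of 40.
def merge(*parts):
    xs = tf.concat([x for x, _ in parts], axis=0)
    ys = tf.concat([y for _, y in parts], axis=0)
    return xs, ys

balanced = tf.data.Dataset.zip(per_class).map(merge)
batch_x, batch_y = next(iter(balanced))
```

Note that each filtered dataset iterates over the source independently, so with a real TFRecord file the data is read once per class. For large files, separate per-class files may be cheaper.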




