Questions tagged [tfrecord]

TensorFlow Record Format. A TFRecord file represents a sequence of (binary) strings. The format is not random access, so it is suitable for streaming large amounts of data but not suitable if fast sharding or other non-sequential access is desired.
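
The sequential nature of the format follows from how each record is framed: a little-endian 64-bit length, a masked CRC-32C of that length, the payload bytes, and a masked CRC-32C of the payload. The sketch below is a minimal pure-Python illustration of that framing, assuming an uncompressed file; the helper names (`crc32c`, `masked_crc`, `write_record`, `read_records`) are hypothetical and not part of TensorFlow, and the output has not been validated against TensorFlow here. In practice the payloads are usually serialized `tf.train.Example` protos.

```python
import io
import struct

# CRC-32C (Castagnoli) lookup table, reflected polynomial 0x82F63B78.
_CRC32C_TABLE = []
for _i in range(256):
    _c = _i
    for _ in range(8):
        _c = (_c >> 1) ^ 0x82F63B78 if _c & 1 else _c >> 1
    _CRC32C_TABLE.append(_c)


def crc32c(data: bytes) -> int:
    """Plain CRC-32C of `data` as an unsigned 32-bit integer."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _CRC32C_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def masked_crc(data: bytes) -> int:
    """Rotate the CRC right by 15 bits and add a constant, modulo 2**32."""
    crc = crc32c(data)
    return (((crc >> 15) | (crc << 17)) + 0xA282EAD8) & 0xFFFFFFFF


def write_record(stream, payload: bytes) -> None:
    """Append one framed record to a binary stream."""
    header = struct.pack("<Q", len(payload))
    stream.write(header)
    stream.write(struct.pack("<I", masked_crc(header)))
    stream.write(payload)
    stream.write(struct.pack("<I", masked_crc(payload)))


def read_records(stream):
    """Yield payloads one by one; there is no index, so reading is sequential."""
    while True:
        header = stream.read(8)
        if not header:
            return  # clean end of file
        if len(header) != 8:
            raise ValueError("truncated length header")
        (length,) = struct.unpack("<Q", header)
        (length_crc,) = struct.unpack("<I", stream.read(4))
        if length_crc != masked_crc(header):
            raise ValueError("corrupt length field")
        payload = stream.read(length)
        (data_crc,) = struct.unpack("<I", stream.read(4))
        if data_crc != masked_crc(payload):
            raise ValueError("corrupt payload")
        yield payload


# Round trip through an in-memory buffer.
buffer = io.BytesIO()
for item in (b"first", b"", b"third record"):
    write_record(buffer, item)
buffer.seek(0)
records = list(read_records(buffer))
```

Because a record's position depends on the lengths of all records before it, reaching the N-th record means walking past the first N-1 headers, which is why the format suits streaming rather than random access.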

85 votes
8 answers
71k views

How to inspect a Tensorflow .tfrecord file?

I have a .tfrecord but I don't know how it is structured. How can I inspect the schema to understand what the .tfrecord file contains? All Stackoverflow answers or documentation seem to assume I know ...
Bob van Luijt
65 votes
8 answers
73k views

How do I convert a directory of jpeg images to TFRecords file in tensorflow?

I have training data that is a directory of jpeg images and a corresponding text file containing the file name and the associated category label. I am trying to convert this training data into a ...
Nadav Ben-Haim
35 votes
7 answers
42k views

TensorFlow - Read all examples from a TFRecords at once?

How do you read all examples from a TFRecords at once? I've been using tf.parse_single_example to read out individual examples using code similar to that given in the method read_and_decode in the ...
golmschenk
  • 12.1k
33 votes
3 answers
17k views

how to store numpy arrays as tfrecord?

I am trying to create a dataset in tfrecord format from numpy arrays. I am trying to store 2D and 3D coordinates. The 2D coordinates are a numpy array of shape (2,10) and type float64; the 3D coordinates are ...
csbk
  • 569
32 votes
5 answers
25k views

Obtaining total number of records from .tfrecords file in Tensorflow

Is it possible to obtain the total number of records from a .tfrecords file? Related to this, how does one generally keep track of the number of epochs that have elapsed while training models? While ...
HuckleberryFinn
21 votes
1 answer
33k views

TensorFlow strings: what they are and how to work with them

When I read a file with tf.read_file I get something with type tf.string. The documentation says only that it is "Variable length byte arrays. Each element of a Tensor is a byte array." (https://www....
ckorzhik
  • 778
20 votes
2 answers
25k views

Tensorflow TFRecord: Can't parse serialized example

I am trying to follow this guide in order to serialize my input data into the TFRecord format but I keep hitting this error when trying to read it: InvalidArgumentError: Key: my_key. Can't parse ...
Stewart_R
  • 14.1k
19 votes
3 answers
6k views

How to efficiently save a Pandas Dataframe into one/more TFRecord file?

First I want to quickly give some background. What I want to achieve eventually is to train a fully connected neural network for a multi-class classification problem under tensorflow framework. The ...
Ling Gu
  • 249
17 votes
1 answer
11k views

how to convert numpy to tfrecords and then generate batches?

My question is about how to get batch inputs from multiple (or sharded) tfrecords. I've read the example https://github.com/tensorflow/models/blob/master/inception/inception/image_processing.py#L410. ...
mining
  • 3,619
14 votes
6 answers
8k views

Split .tfrecords file into many .tfrecords files

Is there any way to split a .tfrecords file into many .tfrecords files directly, without writing back each Dataset example?
christk
  • 854
14 votes
2 answers
19k views

AttributeError: 'Tensor' object has no attribute 'numpy' in Tensorflow 2.1

I am trying to convert the shape property of a Tensor in Tensorflow 2.1 and I get this error: AttributeError: 'Tensor' object has no attribute 'numpy' I already checked that the output of tf....
Nick Skywalker
12 votes
3 answers
9k views

Tensorflow/models uses COCO 90 class ids although COCO has only 80 categories

The label maps of TensorFlow's object_detection project contain 90 classes, although COCO has only 80 categories. Therefore the parameter num_classes in all sample configs is set to 90. If I now ...
gustavz
  • 3,034
12 votes
3 answers
5k views

How to visualize a TFRecord?

I was asked this on another forum but thought I'd post it here for anyone that is having trouble with TFRecords. TensorFlow's Object Detection API can produce strange behavior if the labels in the ...
Steve Goley
11 votes
1 answer
17k views

Numpy array to TFrecord

I'm trying to train a custom dataset through tensorflow object detection api. Dataset contains 40k training images and labels which are in numpy ndarray format (uint8). training dataset shape=2 ([...
Govinda Malavipathirana
11 votes
0 answers
2k views

How to decode Unicode string in Tensorflow's graph pipeline

I have created a TFRecord file to store data. I have to store Hindi text, so I saved it as bytes using string.encode('utf-8'). But I am stuck when reading the data. I am reading ...
lifeisshubh
11 votes
0 answers
846 views

TensorFlow Example vs SequenceExample

There's not much information given in the TensorFlow documentation: https://www.tensorflow.org/api_docs/python/tf/train/Example https://www.tensorflow.org/api_docs/python/tf/train/SequenceExample ...
pilz2985
  • 257
10 votes
1 answer
16k views

Proper way to iterate tf.data.Dataset in session for 2.0

I have downloaded some *.tfrecord data from the youtube-8m project. You can download a 'small' portion of the data with this command: curl data.yt8m.org/download.py | shard=1,100 partition=2/video/...
leonard
  • 815
10 votes
1 answer
3k views

How to download Sentinel images from Google Earth Engine using the Python API in TFRecord format

While trying to download a Sentinel image for a specific location, the tif file is generated by default in Drive, but it's not readable by OpenCV or PIL.Image(). Below is the code for the same. If I use ...
Mohit Anand
9 votes
2 answers
8k views

TensorFlow - Read video frames from TFRecords file

TLDR; my question is on how to load compressed video frames from TFRecords. I am setting up a data pipeline for training deep learning models on a large video dataset (Kinetics). For this I am using ...
verified.human
9 votes
3 answers
7k views

Tensorflow object detection API killed - OOM. How to reduce shuffle buffer size?

System information OS Platform and Distribution: CentOS 7.5.1804 TensorFlow installed from: pip install tensorflow-gpu TensorFlow version: tensorflow-gpu 1.8.0 CUDA/cuDNN version: 9.0/7.1.2 GPU model ...
dpaddon
  • 91
9 votes
1 answer
12k views

Tensorflow 2.0: how to transform from MapDataset (after reading from TFRecord) to some structure that can be input to model.fit

I've stored my training and validation data on two separate TFRecord files, in which I store 4 values: signal A (float32 shape (150,)), signal B (float32 shape (150,)), label (scalar int64), id (...
Alberto A
  • 1,230
9 votes
1 answer
7k views

Tensorflow: Modern way to load large data

I want to train a convolutional neural network (using tf.keras from Tensorflow version 1.13) using numpy arrays as input data. The training data (which I currently store in a single >30GB '.npz' ...
Adomas Baliuka
9 votes
0 answers
1k views

Tensorflow get_single_element not working with tf.data.TFRecordDataset.batch()

I am trying to perform ZCA whitening on a Tensorflow Dataset. In order to do this, I am trying to extract my data from my Dataset as a Tensor, perform the whitening, then create another Dataset after. ...
takeoffs_alex
8 votes
2 answers
6k views

How to use Dataset API to read TFRecords file of lists of variant length?

I want to use Tensorflow's Dataset API to read TFRecords file of lists of variant length. Here is my code. def _int64_feature(value): # value must be a numpy array. return tf.train.Feature(...
Lion Lai
  • 2,005
8 votes
1 answer
3k views

How to import tfrecord files in a pandas dataframe?

I have a tfrecord file and would like to import it in a pandas dataframe or numpy array. I found tools to read tfrecords but they only work inside a tensorflow session, which is not the use case I ...
maxk
  • 161
8 votes
1 answer
4k views

Read data from TFRecord file used in Object Detection API

I want to read the data stored in a TFRecord file that I've used as a train record in TF Object Detection API. However, I get an InvalidArgumentError: Input to reshape is a tensor with 91090 values, ...
Thomas Fauskanger
8 votes
2 answers
7k views

In TensorFlow 2.0, how to feed TFRecord data to keras model?

I'm trying to solve a classification problem whose input data has 32 features and 16 labels using a deep neural network (DNN). The data looks like this: # Input data shape=(32,), dtype=float32, np.array([-0....
구마왕
  • 488
8 votes
0 answers
235 views

Writing from TF Dataset to tfrecords file

It's really easy to read a TFRecords file into a TF Dataset by using a TFRecordDataset, but is there a similar way to write into a TFRecord file given that I already have the info in a Dataset, or do ...
Amartya
  • 83
7 votes
1 answer
12k views

TensorFlow Dataset.shuffle - large dataset [duplicate]

I'm using TensorFlow 1.2 with a dataset in a 20G TFRecord file. There are about half a million samples in that TFRecord file. Looks like if I choose a value smaller than the amount of records in the ...
rodrigo-silveira
7 votes
3 answers
5k views

Tensorflow: Count number of examples in a TFRecord file -- without using deprecated `tf.python_io.tf_record_iterator`

Please read post before marking Duplicate: I was looking for an efficient way to count the number of examples in a TFRecord file of images. Since a TFRecord file does not save any metadata about the ...
krishnab
  • 9,630
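
Since the file stores no record count, one TensorFlow-free way to count is to walk the length headers and skip over each payload. The following is a minimal sketch under stated assumptions: the file is uncompressed, the CRC fields are skipped rather than verified, and `count_records` and `fake_record` are hypothetical helper names, not TensorFlow APIs. Seeking past the end of a file does not raise an error, so a truncated final record would still be counted.

```python
import io
import struct


def count_records(stream) -> int:
    """Count records by reading each 8-byte length and skipping the rest."""
    count = 0
    while True:
        header = stream.read(8)
        if not header:
            return count  # clean end of file
        if len(header) != 8:
            raise ValueError("truncated length header")
        (length,) = struct.unpack("<Q", header)
        # Skip the 4-byte length CRC, the payload, and the 4-byte payload CRC.
        stream.seek(4 + length + 4, io.SEEK_CUR)
        count += 1


def fake_record(payload: bytes) -> bytes:
    """Build a record with zeroed CRC fields, for demonstration only."""
    return struct.pack("<Q", len(payload)) + b"\x00" * 4 + payload + b"\x00" * 4


# In-memory stand-in for a file opened with open(path, "rb").
demo = io.BytesIO(b"".join(fake_record(p) for p in (b"a", b"bb", b"")))
n = count_records(demo)
```

This still touches every record header, so the cost grows with the number of records, but it avoids reading or parsing the payloads themselves.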
7 votes
1 answer
5k views

Writing and Reading lists to TFRecord example

I want to write a list of integers (or any multidimensional numpy matrix) to one TFRecords example. For both a single value or a list of multiple values I can create the TFRecord file without error. ...
Shahriar49
7 votes
4 answers
16k views

How to load tfrecord in pytorch?

How to use tfrecord with pytorch? I have downloaded the "Youtube8M" dataset with video-level features, but it is stored in tfrecord format. I tried to read some samples from these files to convert them to numpy ...
Whisht
  • 723
7 votes
2 answers
6k views

tensorflow ValueError: features should be a dictionary of `Tensor`s. Given type: <class 'tensorflow.python.framework.ops.Tensor'>

This is my code! My tensorflow version is 1.6.0, python version is 3.6.4. If I use a dataset to read the csv file directly, I can train without errors. But when I convert the csv file to a tfrecords file, it fails. I ...
LIN
  • 141
7 votes
1 answer
5k views

how can I save a string data to TFRecord?

When saving to TFRecord, I use: def _int64_feature(value): return tf.train.Feature(int64_list=tf.train.Int64List(value=[value])) def _bytes_feature(value): return tf.train.Feature(bytes_list=...
Shouyu Chen
6 votes
2 answers
653 views

Chunk tensorflow dataset records into multiple records

I have an unbatched tensorflow dataset that looks like this: ds = ... for record in ds.take(3): print('data shape={}'.format(record['data'].shape)) -> data shape=(512, 512, 87) -> data ...
Ollie
  • 624
5 votes
1 answer
4k views

Shuffling tfrecords files

I have 5 tfrecords files, one for each object. While training I want to read data equally from all the 5 tfrecords i.e. if my batch size is 50, I should get 10 samples from 1st tfrecord file, 10 ...
deep_jandu
5 votes
1 answer
3k views

TFRecord vs RecordIO

TensorFlow Object Detection API prefers TFRecord file format. MXNet and Amazon Sagemaker seem to use RecordIO format. How are these two binary file formats different, or are they the same thing?
Austin
  • 7,101
5 votes
2 answers
3k views

Tensorflow: read variable length data, via Dataset (tfrecord)

I would like to read some TFRecords data. This works, but only for fixed-length data; now I would like to do the same thing with variable-length data (VarLenFeature). def load_tfrecord_fixed(...
Dieter
  • 2,579
5 votes
1 answer
3k views

Reading a TFRecord file where features that were used to encode is not known

I am very new to TensorFlow and this might be a very beginner question. I have seen examples where custom datasets are converted to TFRecord files using the knowledge of the features one wants to use (...
Sherine Brahma
5 votes
3 answers
2k views

TFRecord format for multiple instances of the same or different classes on one training image

I am trying to train a Faster R-CNN on grocery dataset detection using the new Object Detection API, but I do not quite understand the process of creating a TFRecord file for that. I am aware of the ...
Ivan Shelonik
5 votes
1 answer
1k views

Generating TFRecord format data from C++

I'm trying to use TFRecord format to record data from C++ and then use it in python to feed TensorFlow model. TLDR; Simply serializing proto messages into a stream doesn't satisfy .tfrecord format ...
khkarens
  • 1,325
5 votes
2 answers
3k views

How to add class to existing model?

I have trained a model using tensorflow object detection/SSD mobilenet. It works great! I'd like to add a class to it - just to detect pens or something. How can I do this? I have created my image ...
Simon Kiely
  • 5,950
5 votes
2 answers
5k views

How to convert multiple parquet files into TFrecord files using SPARK?

I would like to produce stratified TFrecord files from a large DataFrame based on a certain condition, for which I use write.partitionBy(). I'm also using the tensorflow-connector in SPARK, but this ...
Kristof
  • 144
5 votes
3 answers
3k views

How to create multiple TFRecord files instead of making a big one and then splitting it up?

I'm dealing with quite a big time series dataset, which is prepared as SequenceExamples and then written to a TFRecord. This results in quite a large file (over 100GB) but I'd like to have it stored in ...
Coldark
  • 445
5 votes
1 answer
568 views

What do the arguments for TFRecordOptions actually mean (wrt tf.io.TFRecordWriter)?

I export some fairly large Pandas dataframes to Tensorflow's serialized format. And I do it often and it's really slow. Which is probably because I have to serialize the individual examples idk. Also, ...
grofte
  • 1,999
5 votes
0 answers
227 views

tfRecords shown faulty in TF2

I have a couple of tfrecord files that I made myself. They work perfectly in TF1; I have used them in several projects. However, if I want to use them in the TensorFlow Object Detection API with TF2 (...
Nemes Gyula Ádám
4 votes
2 answers
5k views

Tensorflow Object Detection, error while generating tfrecord [TypeError: None has type NoneType, but expected one of: int, long]

When checking across different solutions available on the net, most people (including datitran) pointed out that it might be a missing class or a misspelling of a class in the train csv file. I am not able ...
Pais
  • 63
4 votes
2 answers
17k views

as_list() is not defined on an unknown TensorShape

Update: After implementing @jdehesa's answer, my code looks like this: from __future__ import absolute_import, division, print_function, unicode_literals import tensorflow as tf import numpy as np ...
mangawy
  • 65
4 votes
2 answers
2k views

TFrecords occupy more space than original JPEG images

I'm trying to convert my JPEG image set into TFRecords. But the TFRecord file is taking almost 5x more space than the image set. After a lot of googling, I learned that when JPEGs are written into ...
Uchiha Madara
4 votes
2 answers
4k views

Write and Read SparseTensor to and from a tfrecord file

Is it possible to do this elegantly? Right now the only thing I can think of is to save the indices (tf.int64), values (tf.float32), and shape (tf.int64) of the SparseTensor in 3 separate Features (the ...
Maosi Chen
  • 1,491
