Questions tagged [tfrecord]
TensorFlow Record Format. A TFRecord file represents a sequence of (binary) strings. The format is not random access, so it is suitable for streaming large amounts of data but not suitable if fast sharding or other non-sequential access is desired.
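To make the "sequence of binary strings, no random access" point concrete, here is a minimal pure-Python sketch of the record framing as I understand it from the TensorFlow TFRecord guide: an 8-byte little-endian length, a masked CRC-32C of that length, the payload, and a masked CRC-32C of the payload. The helper names (`crc32c`, `masked_crc`, `write_record`, `read_records`) are made up for illustration and are not TensorFlow APIs. In real code you would normally use `tf.io.TFRecordWriter` and `tf.data.TFRecordDataset` instead.

```python
import io
import struct

# CRC-32C (Castagnoli) lookup table, reflected polynomial 0x82F63B78.
_CRC_TABLE = []
for _i in range(256):
    _crc = _i
    for _ in range(8):
        _crc = (_crc >> 1) ^ 0x82F63B78 if _crc & 1 else _crc >> 1
    _CRC_TABLE.append(_crc)


def crc32c(data: bytes) -> int:
    """Plain table-driven CRC-32C of data."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def masked_crc(data: bytes) -> int:
    """CRC-32C rotated right by 15 bits plus a constant (the assumed TFRecord mask)."""
    crc = crc32c(data)
    return (((crc >> 15) | (crc << 17)) + 0xA282EAD8) & 0xFFFFFFFF


def write_record(stream, payload: bytes) -> None:
    """Append one framed record: length, CRC of length, payload, CRC of payload."""
    header = struct.pack("<Q", len(payload))
    stream.write(header)
    stream.write(struct.pack("<I", masked_crc(header)))
    stream.write(payload)
    stream.write(struct.pack("<I", masked_crc(payload)))


def _read_exact(stream, n: int) -> bytes:
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("truncated record stream")
    return data


def read_records(stream):
    """Yield payloads in order. There is no index, so record k can only be
    reached by walking past records 0..k-1."""
    while True:
        header = stream.read(8)
        if header == b"":
            return  # clean end of stream
        if len(header) != 8:
            raise ValueError("truncated record stream")
        (length,) = struct.unpack("<Q", header)
        (length_crc,) = struct.unpack("<I", _read_exact(stream, 4))
        if length_crc != masked_crc(header):
            raise ValueError("corrupt length field")
        payload = _read_exact(stream, length)
        (payload_crc,) = struct.unpack("<I", _read_exact(stream, 4))
        if payload_crc != masked_crc(payload):
            raise ValueError("corrupt payload")
        yield payload


# Round trip through an in-memory buffer.
buffer = io.BytesIO()
for item in [b"first", b"", b"third record"]:
    write_record(buffer, item)
buffer.seek(0)
records = list(read_records(buffer))
print(len(records), records)
```

Because each record's offset depends on the lengths of all records before it, counting records or jumping to the n-th one requires a full sequential scan, which is why several questions under this tag ask about counting, splitting, and shuffling.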
430
questions
85
votes
8
answers
71k
views
How to inspect a Tensorflow .tfrecord file?
I have a .tfrecord but I don't know how it is structured. How can I inspect the schema to understand what the .tfrecord file contains?
All Stackoverflow answers or documentation seem to assume I know ...
65
votes
8
answers
73k
views
How do I convert a directory of jpeg images to TFRecords file in tensorflow?
I have training data that is a directory of jpeg images and a corresponding text file containing the file name and the associated category label. I am trying to convert this training data into a ...
35
votes
7
answers
42k
views
TensorFlow - Read all examples from a TFRecords at once?
How do you read all examples from a TFRecords at once?
I've been using tf.parse_single_example to read out individual examples using code similar to that given in the method read_and_decode in the ...
33
votes
3
answers
17k
views
how to store numpy arrays as tfrecord?
I am trying to create a dataset in tfrecord format from numpy arrays. I am trying to store 2d and 3d coordinates.
2d coordinates are numpy array of shape (2,10) of type float64
3d coordinates are ...
32
votes
5
answers
25k
views
Obtaining total number of records from .tfrecords file in Tensorflow
Is it possible to obtain the total number of records from a .tfrecords file? Related to this, how does one generally keep track of the number of epochs that have elapsed while training models? While ...
21
votes
1
answer
33k
views
TensorFlow strings: what they are and how to work with them
When I read a file with tf.read_file I get something with type tf.string. The documentation says only that it is "Variable length byte arrays. Each element of a Tensor is a byte array." (https://www....
20
votes
2
answers
25k
views
Tensorflow TFRecord: Can't parse serialized example
I am trying to follow this guide in order to serialize my input data into the TFRecord format but I keep hitting this error when trying to read it:
InvalidArgumentError: Key: my_key. Can't parse ...
19
votes
3
answers
6k
views
How to efficiently save a Pandas Dataframe into one/more TFRecord file?
First I want to quickly give some background. What I want to achieve eventually is to train a fully connected neural network for a multi-class classification problem under tensorflow framework.
The ...
17
votes
1
answer
11k
views
how to convert numpy to tfrecords and then generate batches?
My question is about how to get batch inputs from multiple (or sharded) tfrecords. I've read the example https://github.com/tensorflow/models/blob/master/inception/inception/image_processing.py#L410. ...
14
votes
6
answers
8k
views
Split .tfrecords file into many .tfrecords files
Is there any way to split a .tfrecords file into many .tfrecords files directly, without writing back each Dataset example?
14
votes
2
answers
19k
views
AttributeError: 'Tensor' object has no attribute 'numpy' in Tensorflow 2.1
I am trying to convert the shape property of a Tensor in Tensorflow 2.1 and I get this error:
AttributeError: 'Tensor' object has no attribute 'numpy'
I already checked that the output of tf....
12
votes
3
answers
9k
views
Tensorflow/models uses COCO 90 class ids although COCO has only 80 categories
The labelmaps of TensorFlow's object_detection project contain 90 classes, although COCO has only 80 categories.
Therefore the parameter num_classes in all sample configs is set to 90.
If I now ...
12
votes
3
answers
5k
views
How to visualize a TFRecord?
I was asked this on another forum but thought I'd post it here for anyone that is having trouble with TFRecords.
TensorFlow's Object Detection API can produce strange behavior if the labels in the ...
11
votes
1
answer
17k
views
Numpy array to TFrecord
I'm trying to train a custom dataset through tensorflow object detection api. Dataset contains 40k training images and labels which are in numpy ndarray format (uint8). training dataset shape=2 ([...
11
votes
0
answers
2k
views
How to decode Unicode string in Tensorflow's graph pipeline
I have created a tfRecord file to store data. I have to store Hindi text so, I have saved it in the bytes using string.encode('utf-8').
But, I am stuck at the time of reading the data. I am reading ...
11
votes
0
answers
846
views
TensorFlow Example vs SequenceExample
There's not that much information given in the TensorFlow documentation:
https://www.tensorflow.org/api_docs/python/tf/train/Example
https://www.tensorflow.org/api_docs/python/tf/train/SequenceExample
...
10
votes
1
answer
16k
views
Proper way to iterate tf.data.Dataset in session for 2.0
I have downloaded some *.tfrecord data from the youtube-8m project. You can download a 'small' portion of the data with this command:
curl data.yt8m.org/download.py | shard=1,100 partition=2/video/...
10
votes
1
answer
3k
views
How to download a sentinel images from google earth engine using python API in tfrecord
While trying to download a Sentinel image for a specific location, the tif file is generated by default in Drive, but it's not readable by OpenCV or PIL.Image(). Below is the code for the same. If I use ...
9
votes
2
answers
8k
views
TensorFlow - Read video frames from TFRecords file
TLDR; my question is on how to load compressed video frames from TFRecords.
I am setting up a data pipeline for training deep learning models on a large video dataset (Kinetics). For this I am using ...
9
votes
3
answers
7k
views
Tensorflow object detection API killed - OOM. How to reduce shuffle buffer size?
System information
OS Platform and Distribution: CentOS 7.5.1804
TensorFlow installed from: pip install tensorflow-gpu
TensorFlow version: tensorflow-gpu 1.8.0
CUDA/cuDNN version: 9.0/7.1.2
GPU model ...
9
votes
1
answer
12k
views
Tensorflow 2.0: how to transform from MapDataset (after reading from TFRecord) to some structure that can be input to model.fit
I've stored my training and validation data on two separate TFRecord files, in which I store 4 values: signal A (float32 shape (150,)), signal B (float32 shape (150,)), label (scalar int64), id (...
9
votes
1
answer
7k
views
Tensorflow: Modern way to load large data
I want to train a convolutional neural network (using tf.keras from Tensorflow version 1.13) using numpy arrays as input data. The training data (which I currently store in a single >30GB '.npz' ...
9
votes
0
answers
1k
views
Tensorflow get_single_element not working with tf.data.TFRecordDataset.batch()
I am trying to perform ZCA whitening on a Tensorflow Dataset. In order to do this, I am trying to extract my data from my Dataset as a Tensor, perform the whitening, then create another Dataset after. ...
8
votes
2
answers
6k
views
How to use Dataset API to read TFRecords file of lists of variant length?
I want to use Tensorflow's Dataset API to read TFRecords file of lists of variant length. Here is my code.
def _int64_feature(value):
    # value must be a numpy array.
    return tf.train.Feature(...
8
votes
1
answer
3k
views
How to import tfrecord files in a pandas dataframe?
I have a tfrecord file and would like to import it in a pandas dataframe or numpy array.
I found tools to read tfrecords but they only work inside a tensorflow session, which is not the use case I ...
8
votes
1
answer
4k
views
Read data from TFRecord file used in Object Detection API
I want to read the data stored in a TFRecord file that I've used as a train record in TF Object Detection API.
However, I get an InvalidArgumentError: Input to reshape is a tensor with 91090 values, ...
8
votes
2
answers
7k
views
In TensorFlow 2.0, how to feed TFRecord data to keras model?
I've tried to solve a classification problem whose input data has 32 features and 16 labels using a Deep Neural Network (DNN).
They look like,
# Input data
shape=(32,), dtype=float32,
np.array([-0....
8
votes
0
answers
235
views
Writing from TF Dataset to tfrecords file
It's really easy to read a TFRecords file into a TF Dataset by using a TFRecordDataset, but is there a similar way to write into a TFRecord file given that I already have the info in a Dataset, or do ...
7
votes
1
answer
12k
views
TensorFlow Dataset.shuffle - large dataset [duplicate]
I'm using TensorFlow 1.2 with a dataset in a 20G TFRecord file. There are about half a million samples in that TFRecord file.
Looks like if I choose a value smaller than the amount of records in the ...
7
votes
3
answers
5k
views
Tensorflow: Count number of examples in a TFRecord file -- without using deprecated `tf.python_io.tf_record_iterator`
Please read post before marking Duplicate:
I was looking for an efficient way to count the number of examples in a TFRecord file of images. Since a TFRecord file does not save any metadata about the ...
7
votes
1
answer
5k
views
Writing and Reading lists to TFRecord example
I want to write a list of integers (or any multidimensional numpy matrix) to one TFRecords example. For both a single value and a list of multiple values I can create the TFRecord file without error. ...
7
votes
4
answers
16k
views
How to load tfrecord in pytorch?
How to use tfrecord with pytorch?
I have downloaded "Youtube8M" datasets with video-level features, but it is stored in tfrecord.
I tried to read some samples from these files to convert them to numpy ...
7
votes
2
answers
6k
views
tensorflow ValueError: features should be a dictionary of `Tensor`s. Given type: <class 'tensorflow.python.framework.ops.Tensor'>
This is my code!
My tensorflow version is 1.6.0, python version is 3.6.4.
If I use the dataset to read the CSV file directly, I can train without errors. But when I convert the CSV file to a tfrecords file, it fails. I ...
7
votes
1
answer
5k
views
how can I save a string data to TFRecord?
When saving to TFRecord, I use:
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
def _bytes_feature(value):
    return tf.train.Feature(bytes_list=...
6
votes
2
answers
653
views
Chunk tensorflow dataset records into multiple records
I have an unbatched tensorflow dataset that looks like this:
ds = ...
for record in ds.take(3):
    print('data shape={}'.format(record['data'].shape))
-> data shape=(512, 512, 87)
-> data ...
5
votes
1
answer
4k
views
Shuffling tfrecords files
I have 5 tfrecords files, one for each object. While training I want to read data equally from all the 5 tfrecords i.e. if my batch size is 50, I should get 10 samples from 1st tfrecord file, 10 ...
5
votes
1
answer
3k
views
TFRecord vs RecordIO
TensorFlow Object Detection API prefers TFRecord file format. MXNet and Amazon Sagemaker seem to use RecordIO format. How are these two binary file formats different, or are they the same thing?
5
votes
2
answers
3k
views
Tensorflow: read variable length data, via Dataset (tfrecord)
Best
I would like to read some TF records data.
This works, but only for Fixed length data, but now I would like to do the same thing with variable length data VarLenFeature
def load_tfrecord_fixed(...
5
votes
1
answer
3k
views
Reading a TFRecord file where features that were used to encode is not known
I am very new to TensorFlow and this might be a very beginner question. I have seen examples where custom datasets are converted to TFRecord files using the knowledge of the features one wants to use (...
5
votes
3
answers
2k
views
TFRecord format for multiple instances of the same or different classes on one training image
I am trying to train a Faster R-CNN on grocery dataset detection using the new Object Detection API, but I do not quite understand the process of creating a TFRecord file for that. I am aware of the ...
5
votes
1
answer
1k
views
Generating TFRecord format data from C++
I'm trying to use TFRecord format to record data from C++ and then use it in python to feed TensorFlow model.
TLDR; Simply serializing proto messages into a stream doesn't satisfy .tfrecord format ...
5
votes
2
answers
3k
views
How to add class to existing model?
I have trained a model using tensorflow object detection/SSD mobilenet. It works great!
I'd like to add a class to it - just to detect pens or something.
How can I do this?
I have created my image ...
5
votes
2
answers
5k
views
How to convert multiple parquet files into TFrecord files using SPARK?
I would like to produce stratified TFrecord files from a large DataFrame based on a certain condition, for which I use write.partitionBy(). I'm also using the tensorflow-connector in SPARK, but this ...
5
votes
3
answers
3k
views
How to create multiple TFRecord files instead of making a big one and then splitting it up?
I'm dealing with quite a big time series dataset, which is prepared as SequenceExamples and then written to a TFRecord. This results in quite a large file (over 100GB), but I'd like to have it stored in ...
5
votes
1
answer
568
views
What do the arguments for TFRecordOptions actually mean (wrt tf.io.TFRecordWriter)?
I export some fairly large Pandas dataframes to Tensorflow's serialized format. And I do it often and it's really slow. Which is probably because I have to serialize the individual examples idk. Also, ...
5
votes
0
answers
227
views
tfRecords shown faulty in TF2
I have a couple of tfrecord files that I made myself.
They work perfectly in tf1; I have used them in several projects.
However, if I want to use them in the Tensorflow Object Detection API with tf2 (...
4
votes
2
answers
5k
views
Tensorflow Object Detection, error while generating tfrecord [TypeError: None has type NoneType, but expected one of: int, long]
When checking across different solutions available on the net, most people (including datitran) pointed out that it might be a missing class or a misspelling of a class in the train csv file. I am not able ...
4
votes
2
answers
17k
views
as_list() is not defined on an unknown TensorShape
Update
After implementing @jdehesa answer: My code looks like this:
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
import numpy as np
...
4
votes
2
answers
2k
views
TFrecords occupy more space than original JPEG images
I'm trying to convert my JPEG image set to TFRecords. But the TFRecord file is taking almost 5x more space than the image set. After a lot of googling, I learned that when JPEGs are written into ...
4
votes
2
answers
4k
views
Write and Read SparseTensor to and from a tfrecord file
Is it possible to do this elegantly?
Right now the only thing I can think of is to save the indices (tf.int64), values (tf.float32), and shape (tf.int64) of the SparseTensor in 3 separate Features (the ...