Why does slicing lazy-loaded AudioIOTensor fail with stereo FLAC?

The TensorFlow I/O tutorial on audio data preparation (https://www.tensorflow.org/io/tutorials/audio) provides the following example:

import tensorflow as tf
import tensorflow_io as tfio
audio = tfio.audio.AudioIOTensor('gs://cloud-samples-tests/speech/brooklyn.flac')
print(audio)

...and then states: "The content of the audio clip will only be read as needed, either by converting AudioIOTensor to Tensor through to_tensor(), or through slicing [emphasis added]. Slicing is especially useful when only a small portion of a large audio clip is needed:"

audio_slice = audio[100:]
# remove last dimension
audio_tensor = tf.squeeze(audio_slice, axis=[-1])
print(audio_tensor)

This works as advertised, and it prints:

<AudioIOTensor: shape=[28979     1], dtype=<dtype: 'int16'>, rate=16000>
tf.Tensor([16 39 66 ... 56 81 83], shape=(28879,), dtype=int16)

So far, so good. Now I try this with a stereo FLAC:

audio = tfio.audio.AudioIOTensor('./audio/stereo_file.flac')
print(audio.shape)

...which prints

tf.Tensor([12371520        2], shape=(2,), dtype=int64)

I see here there are two channels as expected. Again, so far, so good.

Now I'd like to extract one channel only and take some number of samples, say 512. So I try:

audio_slice = audio[0:512, 0:1]

...and this fails, crashing the Python process:

Check failed: 1 == NumElements() (1 vs. 2)Must have a one element tensor
...
Process finished with exit code 134 (interrupted by signal 6: SIGABRT)

However, if I first read the whole clip with empty slice notation and then slice the result, everything works as I'd hope:

audio_slice = audio[:]
print(audio_slice.shape)
audio_slice = audio_slice[0:512, 0:1]
print(audio_slice.shape)
audio_tensor = tf.squeeze(audio_slice, axis=[-1])
print(audio_tensor.shape)

...which prints:

tf.Tensor([12371520        2], shape=(2,), dtype=int64)
(12371520, 2)
(512, 1)
(512,)
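Taken together, the results above (single-axis slices such as audio[100:] and audio[:] succeed, while the two-axis slice aborts) suggest a cheaper variant of the workaround: slice the lazy tensor along the sample axis only, then select the channel from the small tensor that comes back. The following is a minimal sketch under that assumption, not a confirmed fix. The helper name take_channel_window is hypothetical, and the single-axis slice audio[0:512] has not been verified against the stereo file here.

```python
def take_channel_window(audio, start, stop, channel=0):
    """Return samples [start:stop) of one channel, keeping a trailing axis of size 1.

    `audio` is assumed to be something like tfio.audio.AudioIOTensor:
    an object that accepts a single slice along the sample axis and
    returns an ordinary tensor of shape (samples, channels).
    """
    # Step 1: slice along the sample axis ONLY. This mirrors the
    # tutorial's audio[100:] call, which is the form known to work.
    window = audio[start:stop]

    # Step 2: pick the channel from the already-loaded window.
    # This is ordinary tensor slicing, the same operation as
    # audio_slice[0:512, 0:1] in the working example above.
    return window[:, channel:channel + 1]


# Hypothetical usage with the file from the question (not run here):
#
#   import tensorflow as tf
#   import tensorflow_io as tfio
#
#   audio = tfio.audio.AudioIOTensor('./audio/stereo_file.flac')
#   audio_slice = take_channel_window(audio, 0, 512, channel=0)
#   audio_tensor = tf.squeeze(audio_slice, axis=[-1])
```

If this behaves like the audio[:] workaround, only the requested 512 samples would need to be read rather than all 12371520, which is the point of lazy loading in the first place.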

I assume the first attempt fails because the audio is lazy-loaded, but I'm not sure why slicing works in the tutorial yet fails with my stereo file. Shouldn't slicing load the data as needed?

tensorflow>=2.8.0
tensorflow-io>=0.25.0

source https://stackoverflow.com/questions/71991390/why-does-slicing-lazy-loaded-audioiotensor-fail-with-stereo-flac
