
Why does my LSTM model predict wrong values although the loss is decreasing?

I am trying to build a machine learning model which predicts a single number from a series of numbers. I am using an LSTM model with TensorFlow.

You can imagine my dataset looking something like this:

Index | x data                       | y data
------|------------------------------|--------------
0     | np.array of shape (10000, 1) | numpy.float32
1     | np.array of shape (10000, 1) | numpy.float32
2     | np.array of shape (10000, 1) | numpy.float32
...   | ...                          | ...
56    | np.array of shape (10000, 1) | numpy.float32

Simply put, I just want my model to predict a number (y data) from a sequence of numbers (x data).

For example like this:

  • array([3.59280851, 3.60459062, 3.60459062, ...]) => 2.8989773
  • array([3.54752101, 3.56740332, 3.56740332, ...]) => 3.0893357
  • ...

x and y data

From my x data I created a numpy array x_train which I want to use to train the network. Because I am using an LSTM network, x_train should be of shape (samples, time_steps, features). I reshaped my x_train array to be of shape (57, 10000, 1), because I have 57 samples, each of which is a sequence of length 10000 with a single number per time step.

The y data was created similarly and is of shape (57, 1) because, once again, I have 57 samples, each of which contains a single number as the desired y output.
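For illustration, here is a minimal sketch of arrays with exactly these shapes (the values are random placeholders; the real arrays come from my actual dataset):

    import numpy as np

    n_samples, time_steps, features = 57, 10000, 1

    # Placeholder values -- only the shapes matter for this illustration.
    x_train = np.random.rand(n_samples, time_steps, features).astype(np.float32)
    y_train = np.random.rand(n_samples, 1).astype(np.float32)

    print(x_train.shape)  # (57, 10000, 1) -> (samples, time_steps, features)
    print(y_train.shape)  # (57, 1)        -> one target value per sample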

Current model attempt

My model summary looks like this:

[image: current model summary]

The model was compiled with model.compile(loss="mse", optimizer="adam"), so my loss function is simply the mean squared error, and as the optimizer I'm using Adam.
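For context, a minimal model of this kind looks roughly like the sketch below. The layer stack and unit count here are placeholders, not my exact architecture (which is in the summary image above), but the compile call is exactly the one quoted:

    from tensorflow import keras
    from tensorflow.keras import layers

    # Hypothetical architecture -- the actual one is in the summary image.
    model = keras.Sequential([
        layers.Input(shape=(10000, 1)),  # (time_steps, features)
        layers.LSTM(32),                 # assumed number of units
        layers.Dense(1),                 # single regression output per sample
    ])
    model.compile(loss="mse", optimizer="adam")  # as stated above
    model.summary()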

Current results

Training of the model works fine and I can see that the loss and validation loss decrease after some epochs. The actual problem occurs when I want to predict some data y_verify from some data x_verify, which I do after training is finished to determine how well the model is trained. In the following example I simply used the training data for verification (I know about overfitting and that verifying with the training set is not the right way of doing it, but that is not the problem I want to demonstrate right now).

In the following graph you can see the y data I provided to the model in blue. The orange line is the result of calling model.predict(x_verify), where x_verify has the same shape as x_train.

[image: current results -- actual y data (blue) vs. model.predict output (orange)]
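That plot can be reproduced with something like the following sketch, assuming the model, x_train and y_train from the snippets above (matplotlib's default color cycle gives the blue and orange lines):

    import matplotlib.pyplot as plt

    # Verify on the training data, as described above.
    x_verify, y_verify = x_train, y_train
    y_pred = model.predict(x_verify)  # shape (57, 1)

    plt.plot(y_verify, label="actual y data")       # blue
    plt.plot(y_pred, label="model.predict output")  # orange
    plt.legend()
    plt.show()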

I also calculated the mean absolute percentage error (MAPE) between my prediction and the actual data, and it came out to around 4%, which is not bad given that I only trained for 40 epochs. But this result is still not helpful at all, because as you can see in the graph above the curves do not match.
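A MAPE like this can be computed with numpy as follows (this assumes none of the true targets are zero):

    import numpy as np

    def mape(y_true, y_pred):
        # Mean absolute percentage error, in percent.
        y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
        return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)

    # e.g. with the predictions from the snippet above:
    # print(mape(y_verify, y_pred))  # around 4% in my case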

Question:

What is going on here?

Am I using an incorrect loss function?

Why does it seem like the model tries to predict a single value for all samples rather than a different value for each sample, as it's supposed to?

Ideally the prediction should reproduce the y data I provided, so the curves should look (more or less) the same.

Do you have any ideas?

Thanks! :)



source https://stackoverflow.com/questions/73457069/why-does-my-lstm-model-predict-wrong-values-although-the-loss-is-decreasing
