[Lab 1 Part 2] - Missing softmax argument in Dense layer #147

Open
ksadura opened this issue Jan 4, 2024 · 0 comments
ksadura commented Jan 4, 2024

In the solution for this task, the final RNN model is created as follows:

import tensorflow as tf

def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
        # Map each character index to a dense embedding vector
        tf.keras.layers.Embedding(vocab_size, embedding_dim, batch_input_shape=[batch_size, None]),
        # LSTM here is the lab's helper wrapping tf.keras.layers.LSTM
        LSTM(rnn_units),
        # One output unit per vocabulary character; no activation specified
        tf.keras.layers.Dense(vocab_size)
    ])

    return model

model = build_model(len(vocab), embedding_dim=256, rnn_units=1024, batch_size=32)

Why is there no activation function (softmax) defined in the Dense layer? The task says:

The final output of the LSTM is then fed into a fully connected Dense layer where we'll output a softmax over each character in the vocabulary, and then sample from this distribution to predict the next character.

According to the docs, the activation is None (i.e., linear) if not explicitly declared, so the Dense layer as written outputs raw logits rather than a softmax distribution.
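
For reference, here is a minimal sketch of the distinction the docs describe (the `logits` tensor below is a made-up example, not output from the lab's model): a Dense layer with no activation emits raw logits, and TensorFlow can either normalize them explicitly with a softmax or consume them directly.

import tensorflow as tf

# Hypothetical raw output of Dense(vocab_size) with no activation,
# for a vocabulary of 4 characters and a batch of 1:
logits = tf.constant([[2.0, 0.5, -1.0, 0.0]])

# Explicit softmax, as the task description suggests:
probs = tf.nn.softmax(logits)  # rows sum to 1 over the vocabulary

# Sampling the next character directly from logits;
# tf.random.categorical normalizes internally:
next_char = tf.random.categorical(logits, num_samples=1)

# Training on logits; the loss applies the softmax internally
# when from_logits=True:
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)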
