Some questions about the Hyperparameter settings #13

@hzzzzzhappy

Great work! Could you share the window_size and sequence_length used for the model during pre-training, fine-tuning, and testing of large-seer? These settings seem important for training. I have also observed that the loss fluctuates during the early stages of training. Roughly how many epochs does it take for the loss to stabilise?
