When training from scratch on my own dataset, the AE loss decreases gradually from 1.02×10⁵ to −3.41×10⁵ over roughly the first 100k steps, and then starts producing NaN values.
However, when loading the author’s pretrained model, the AE loss starts at −3.59×10⁵ and becomes NaN after only a few hundred steps.
I have two main questions:
- Why is the loss scale so large? Is this normal?
- What could be the possible causes of the NaN issue?
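To at least localize the NaN issue, one simple approach is to guard the loss each step and fail fast on the first non-finite value, so the offending step (and the values just before it) can be inspected. A minimal sketch in plain Python; the `check_finite` helper name is hypothetical and not part of the original codebase:

```python
import math

def check_finite(name, value, step):
    """Raise as soon as a tracked scalar becomes NaN/Inf, instead of
    letting training continue silently with corrupted values."""
    if math.isnan(value) or math.isinf(value):
        raise FloatingPointError(f"{name} became {value} at step {step}")
    return value

# Example: simulate a loss trace that turns NaN.
losses = [1.02e5, -3.41e5, float("inf") - float("inf")]  # last entry is NaN
for step, loss in enumerate(losses):
    try:
        check_finite("ae_loss", loss, step)
    except FloatingPointError as err:
        print(err)  # reports the first non-finite step
        break
```

In a PyTorch training loop, the same idea can be applied to the loss tensor before `backward()`, or `torch.autograd.set_detect_anomaly(True)` can be enabled temporarily to trace which backward op first produces the NaN.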