Gradient Accumulation
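Imagine you're pushing a car up a hill. Gradient accumulation is like giving the car several small pushes, noting the effort of each one, and adjusting your direction only *after* combining what you learned from all of them. In training terms, the model computes gradients on several small batches (often called micro-batches), adds them together, and then updates its weights once. That single update behaves like one computed on a larger batch than your hardware's memory would normally permit, which tends to give less noisy gradient estimates and more stable training.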
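To make the idea concrete, here is a minimal sketch in plain Python, not tied to any particular framework. It assumes a toy one-parameter model `y = w * x` with a mean squared error loss, and the names `grad_mse`, `accumulated_grad`, and `train` are made up for this illustration. It shows that averaging the gradients of equal-sized micro-batches gives the same gradient as processing the whole batch at once.

```python
# Minimal sketch of gradient accumulation for a toy model y = w * x
# with a mean squared error loss. All names here are illustrative.

def grad_mse(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w for one batch."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def accumulated_grad(w, xs, ys, micro_batch_size):
    """Process the batch one micro-batch at a time and accumulate the gradient."""
    assert len(xs) % micro_batch_size == 0, "sketch assumes equal-sized micro-batches"
    num_micro = len(xs) // micro_batch_size
    total = 0.0
    for i in range(num_micro):
        lo = i * micro_batch_size
        hi = lo + micro_batch_size
        # Scale each micro-batch gradient by 1 / num_micro so the running
        # sum ends up as the mean over the full batch.
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / num_micro
    return total

def train(xs, ys, micro_batch_size, lr=0.01, steps=200):
    """Gradient descent with one weight update per accumulated gradient."""
    w = 0.0
    for _ in range(steps):
        g = accumulated_grad(w, xs, ys, micro_batch_size)
        w -= lr * g  # the update happens only after all micro-batches are seen
    return w

# Toy data generated from y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x for x in xs]

full = grad_mse(0.5, xs, ys)                                # whole batch at once
accum = accumulated_grad(0.5, xs, ys, micro_batch_size=2)   # four micro-batches
w_fit = train(xs, ys, micro_batch_size=2)
```

Here `full` and `accum` agree up to floating-point rounding, and `w_fit` ends up close to the true slope of 2. In a real deep learning framework the same pattern usually means running several forward and backward passes, letting the gradients add up, and calling the optimizer step once at the end.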