techniques

Gradient Accumulation

Imagine you're pushing a car up a hill. Gradient accumulation is like giving the car several small pushes and noting the effort each time, then adjusting your direction only once, based on the combined effort. In training terms, the model computes gradients on several small mini-batches, adds them together, and only then updates its weights. This simulates a larger batch size than your hardware's memory would normally permit, which tends to give smoother gradient estimates and more stable training.
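To make the idea concrete, here is a minimal sketch in plain Python, assuming a toy one-weight model (y ≈ w * x) trained with mean squared error. The names `grad_mse` and `accumulated_step`, the data, and the learning rate are all made up for illustration and are not from any particular library. It shows that summing properly scaled gradients from small micro-batches gives the same gradient as processing the whole batch at once.

```python
# Illustrative sketch only: gradient accumulation for a toy model y ≈ w * x.
# All names and values here are hypothetical, chosen for the example.

def grad_mse(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_step(w, xs, ys, micro_batch_size, lr):
    """Take one weight update built from several micro-batches.

    Each micro-batch gradient is scaled by its share of the full batch,
    so the accumulated sum matches the full-batch gradient.
    """
    total = len(xs)
    accumulated = 0.0
    for start in range(0, total, micro_batch_size):
        mx = xs[start:start + micro_batch_size]
        my = ys[start:start + micro_batch_size]
        accumulated += grad_mse(w, mx, my) * (len(mx) / total)
    # The weight changes only here, after every micro-batch has been seen.
    return w - lr * accumulated, accumulated


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x for x in xs]  # toy data generated with a true weight of 2

w0 = 0.0
full_grad = grad_mse(w0, xs, ys)  # gradient from the whole batch at once
w1, acc_grad = accumulated_step(w0, xs, ys, micro_batch_size=2, lr=0.01)

print(full_grad, acc_grad, w1)
```

Only two examples are processed at a time here, yet the update is the one a batch of eight would produce. In a real deep learning framework the same pattern usually means running several forward and backward passes before a single optimizer step, though the exact calls depend on the framework.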
