This is an awesome feature!
But it’s very hard to find. The user needs to hover over the image to see the button and then click it (and most people won’t know about that).
Even though I knew about the feature and was looking for it, I had to check the video to realize that I needed to hover.
IMO it should be “in your face” so that people actually use it from the start. I’d even make it create the captions by default, without the user having to click anything.