sometimes it does not matter how many variants you test; what matters is the architecture; so it is possible that, until a certain point, all testing is nothing but a shot in the dark

there are more tolerance parameters on a mill than on a lathe :
... lathe : droop & rapid droop
... mill : in pos width, in pos 2, clamp, cycle , point r , retract , return

g60 is udp / unidirectional positioning, one shot / not modal; i recommend avoiding it at this stage

as for which code is active, g61 or g64 : it may be shown somewhere on your screen; for example, on p300 it is in the bottom right corner; but i wouldn't look for that, because i have noticed that changing vinp on mills has an effect even when there is no g61 in the code; this means that either g61 is the power-on default, or g61 and vinp target different accuracy functions; to test this, simply change the approach :
... rough approach : you have a set of codes with different vinp values; put g61 at the beginning and run them, then replace g61 with g64 and run them again; maybe you will notice a difference
... direct approach : use file assign & system variables to record the real tolerance and the duration of each block; when testing codes, some differences are easy to spot ( like how you noticed that there is a stop ), but others you can not spot ( for example something faster than your reaction time ); you can really shorten your trials using fwritc, vdin and vapa* ( hoping that those can run on your machine )
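
once the run times are written down ( by hand or through a logged file ), comparing the g61 and g64 runs for each vinp value can be done off the machine; below is a minimal python sketch of that comparison; the log layout ( mode, vinp, seconds per line ), the function names and the 0.05 s threshold are my assumptions, and the numbers are placeholders, not measurements :

```python
from collections import defaultdict

# hypothetical log: mode, vinp value, cycle time in seconds
# (placeholder numbers for illustration, not real measurements)
SAMPLE_LOG = """\
G61,0.001,12.40
G61,0.010,12.10
G64,0.001,12.40
G64,0.010,11.20
"""

def parse_log(text):
    """Return {vinp: {mode: seconds}} from comma-separated lines."""
    table = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        mode, vinp, seconds = line.split(",")
        table[float(vinp)][mode.strip().upper()] = float(seconds)
    return dict(table)

def mode_differences(table, threshold=0.05):
    """List (vinp, g61_time - g64_time) where the gap exceeds threshold seconds."""
    diffs = []
    for vinp in sorted(table):
        times = table[vinp]
        if "G61" in times and "G64" in times:
            gap = times["G61"] - times["G64"]
            if abs(gap) > threshold:
                diffs.append((vinp, round(gap, 3)))
    return diffs

if __name__ == "__main__":
    for vinp, gap in mode_differences(parse_log(SAMPLE_LOG)):
        print(f"vinp={vinp}: g61 time minus g64 time = {gap} s")
```

if the list comes back empty for every vinp value, the two modes behave the same within the threshold on that code, which points towards g61 being the default; a non-empty list says the modal code does matter.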

i have recently been looking into such parameters by collecting data, pls check the attachment; the 1st task is to build a code that delivers a time decrease as vinp increases, then use it as a template / kindly
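
to decide whether a code qualifies as such a template, the collected ( vinp, time ) pairs can be checked for a strict decrease; a minimal python sketch, where the function name and the input layout are hypothetical and the values in the usage line are placeholders :

```python
def is_decreasing_with_vinp(samples):
    """samples: list of (vinp, seconds) pairs, in any order.

    Returns True only if the time strictly drops each time vinp grows.
    """
    ordered = sorted(samples)              # sort by vinp, ascending
    times = [seconds for _, seconds in ordered]
    return all(later < earlier for earlier, later in zip(times, times[1:]))

if __name__ == "__main__":
    # placeholder values, not measurements
    print(is_decreasing_with_vinp([(0.001, 12.4), (0.01, 11.2), (0.1, 10.5)]))
```

a code that returns false here ( flat or rising times ) is not yet sensitive to vinp, so it is not a good template.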