  1. #1
    Join Date
    Nov 2014
    Posts
    12

    Intermittent Axis Position Issue

    Hi Tom,

    I'm hoping you can give us some guidance on an intermittent issue we are experiencing with our machine. The problem started about a week ago; before that we had operated the machine for about 2-3 weeks with no similar issues.

    Generally, after first turning on the machine it functions normally for a few minutes, and then suddenly all of the axes start moving with violent jerks and vibrations. The issue occurs on all axes simultaneously for a varying period of time, after which they behave normally again. An episode can last from a few seconds to a few minutes. Once it passes, the machine operates normally for a random period of time (up to 10-15 minutes) and then another episode occurs. In a few rare cases the issue occurred immediately upon turning on the machine, but generally it runs for a minute or two before we see it.

    We have found no discernible pattern that triggers an episode. It can occur while jogging, cutting, or executing move commands, while moving in any direction along any axis, and at any location on the table.

    Our setup:
    2 SnapAmps with a Kanalog card
    4 axes, each with its own power supply
    DC servo motors, differentially encoded
    Encoder cables are shielded and grounded
    Dedicated 5V power supply for the kflop.

    Our init.c file: http://www.filedropper.com/init (click Download This File)


    We have nothing flashed to the kflop at the moment. We have tried running the kflop with 4.32, 4.33d, and 4.33j; this has no effect on the issue.


    Since the issue began we have tried to eliminate noise as the culprit by ensuring there are no ground loops. We have isolated the kflop from ground by running it off a laptop and by powering the kflop's 5V supply from a UPS battery. There is still some noise on the 5V supply during normal operation of the machine that can't be avoided. Since the machine worked normally before we made any effort to reduce the noise, I'm doubtful that our issue is noise related, but just in case, this is what the 5VDC looks like while the axes are moving:

    5VDC_zpsa57b6350.jpg Photo by David_Piccolo | Photobucket

    I've managed to gather step response data during an episode. As you will see, there are blips in the encoder position that cause this violent reaction.

    Step Response Data:
    http://www.filedropper.com/x1_2 (click Download This File)

    We have tested our encoders separately by powering them individually and pushing the axis; the pulses are consistent and show no variation or malformation. Since the issue occurs on all axes, it seems unlikely that this is an encoder issue.

    At the moment we are at a loss as to the root cause of this problem; the common denominators between all the axes are few. I would greatly appreciate any insight you can provide to point us in the right direction.

    Sincerely,
    David.

    P.S. This is my first time using file dropper. It seems to work, but if there are any issues let me know.

  2. #2
    Join Date
    Mar 2007
    Posts
    137

    Re: Intermittent Axis Position Issue

    Maybe you're too technical; here are some simple thoughts. The first thing the service guy is going to do is determine whether the problem is mechanical or electrical. Can you hand crank each axis in both directions for full travel (even if you have to grab the ball screw to do it)? Maybe there is a mechanical issue, such as the oiler quit and the ways are dry as a bone, or a gib is loose and becoming real tight in one direction, etc.

    Are ALL the cooling fans working? (I fixed 3 machines over the phone, when the owner said "it runs good for 15 minutes, then it screws up", by making them check the cooling fans, which were not working.) Take your screwdriver and check EVERY wire connection, and pull on the wires to make sure they're tight, INCLUDING the incoming wires by the on/off switch (unplug first so you don't get zapped). I find loose wires even on new controls! (If you have a capacitor for DC voltage, make sure it's bled down before you touch anything connected to it.)

    Are your neighbors TIG welding? The machine is well grounded, right? Remove and check fuses; the new AC brushless drives may still run if you dropped one leg of your 3 phase, but maybe not so well. Check all voltages, starting with your shop voltage, then to and into the machine. Did you just get this machine 3 weeks ago? Used? Run each axis by itself. If 2 run good and the third jerks, maybe the two good ones are simply trying to match up with the jerky one. I even go as far as wiggling wires and pounding on electrical enclosures to try to make the machine screw up when it's running good but has been acting up.

    I started using the CONTROL CONCEPTS ACTIVE TRACKING FILTER for all the 115 volt components (computer, logic power supply, etc.) in the CNCs I build, and that has eliminated occasional weird stuff (they are cheap on eBay). Does your machine have fiber optics in it? If so, properly clean the ends.

    Maybe you won't find the problem, but if you do all this yourself, you don't have to pay a service guy $190 per hour to do it. Call the previous owner and ask them about this! I hope this helps.

  3. #3
    Join Date
    Nov 2014
    Posts
    12
    springlakecnc,

    There is no mechanical problem; the machine works perfectly fine between episodes and, as I stated, the problem occurs at different positions moving in different directions. The temperature never exceeds 30 degrees C. We double checked all of our connections and soldering. We shook the wires to see if it would change our scope measurements or cause the issue; it did not. As I stated in my post, we are using DC motors; there is no 3 phase component in the system. We didn't buy the machine 3 weeks ago; we built it a month ago. It's our machine and there were no previous owners. We checked all of our voltages and they are at the same levels they were when the machine worked properly. The machine does not have fiber optics.

    I'm glad you brought up the point of running the machine with axes disabled; I forgot to mention this test in my previous post. We removed each axis and its power supply from the system and tested them individually. The problem still occurs with any single axis in the system.

    I appreciate the time you took to respond,
    David.

  4. #4
    Join Date
    May 2006
    Posts
    4045

    Re: Intermittent Axis Position Issue

    Hi David,

    I don't know what that could be. The Step Response Screen Plot should show some clues. Your file dropper links don't work. The best thing would be to upload the raw data from the Step Response Screen. This allows us to zoom in and display it in various ways ourselves.

    Regards
    TK
    http://dynomotion.com

  5. #5
    Join Date
    Nov 2014
    Posts
    12

  6. #6
    Join Date
    May 2006
    Posts
    4045

    Re: Intermittent Axis Position Issue

    Hi David,

    Thanks for the links. You forgot to include your CustomDef.h file. Please post it.

    It seems clear there are discontinuities in the encoder position: typically an apparent sudden jump of varying size in the 20-60 count range.

    Communication noise can cause a glitch in the reading but the next reading would normally be correct as KFLOP basically reads the absolute position from SnapAmp. This doesn't seem to be the case here.

    Is the position drifting and incorrect after this occurs? For example if you return back to some commanded position before and after all these jerks is the physical position the same?

    One possibility is a software bug causing KFLOP to lock up for a few milliseconds, so that instead of sampling every 90us it stalls and the next sample comes much later for some reason. It's like KFLOP goes to sleep and when it wakes up things have changed a lot. The discontinuities seem to always be in the forward direction, which would fit this scenario. Shifting the red curve after the discontinuity ~4ms to the right would fit the expected position.
    Attachment 259070
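
    To make the reasoning above concrete, here is a minimal sketch in plain C (host-side, not a KFLOP program) of how a forward jump in logged encoder positions could be located and converted into an implied stall time. The function names, the threshold, and the sample numbers are illustrative assumptions, not values taken from the posted data.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the first sample whose forward step from the previous sample
   exceeds `threshold` counts, or -1 if no such step exists. */
long first_forward_jump(const double *pos, size_t n, double threshold)
{
    size_t i;
    assert(pos != NULL);
    for (i = 1; i < n; i++)
        if (pos[i] - pos[i - 1] > threshold)
            return (long)i;
    return -1;
}

/* If the controller slept while the axis kept moving at a steady speed,
   the jump equals speed * stall time, so stall = jump / speed. */
double implied_stall_ms(double jump_counts, double speed_counts_per_s)
{
    assert(speed_counts_per_s > 0.0);
    return 1000.0 * jump_counts / speed_counts_per_s;
}
```

    For example, under these assumptions a 40 count jump on an axis moving at 10000 counts/s would imply a stall of about 4 ms.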

    Here is a KFLOP C Program (also attached as a text file) that will time KFLOP Time Slices and print the maximum value seen about every 1 second on the Console. Please have it running and do a Step Response that shows discontinuities and post the results.
    Code:
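    #include "KMotionDef.h"
    
    // Measure the time from one time slice to the next and report the
    // largest interval seen so far (tmax is never reset)
    
    main()
    {
        int i=0;
        double t0,t1,dt,tmax=0.0;
    
        t0=WaitNextTimeSlice();  // sync to the start of a time slice
        for (;;)
        {
            t1=WaitNextTimeSlice();
    
            dt=t1-t0;  // interval since this thread's previous slice
            t0=t1;
    
            if (dt>tmax) tmax=dt;  // track the running maximum
    
            if (++i==10000)  // report once every 10000 slices
            {
                printf("Max Time Slice = %8.2f us\n",tmax*1e6);  // seconds to us
                t0=WaitNextTimeSlice();  // re-sync so the printf time is not counted
                i=0;
            }
        }
    }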
    Regards
    TK
    http://dynomotion.com

  7. #7
    Join Date
    Nov 2014
    Posts
    12
    Thanks Tom, I will post the test results tomorrow.
    In the meantime, here is the CustomDef.h file: https://www.dropbox.com/s/5knz2d8ipj...stomDef.h?dl=0

  8. #8
    Join Date
    Nov 2014
    Posts
    12
    Hi Tom,

    I ran several tests using your Max Time Slice program.

    Initially I simply jogged all the axes opposite to home on the table and then ran my home script. Before running home.c the Max Time Slice was around 180us; during and afterwards it was around 270us, as shown here:
    https://www.dropbox.com/s/l1vx41t1l3...a%201.txt?dl=0

    During the next tests I simply ran repeated step responses until I caused the error. The Max Time Slice was around 180us until an error occurred and then it doubled, as shown in the following tests:
    https://www.dropbox.com/s/z8fjbk0xf8...a%202.txt?dl=0
    https://www.dropbox.com/s/p1sbftnxs3...a%203.txt?dl=0
    https://www.dropbox.com/s/dfygi43hbx...a%204.txt?dl=0

    I attempted to measure the axis physical position during these tests using a sensitive measurement tool, and it appears the axis returns to the same spot each time to within 1/100 of an inch.

    Here are the various step response data collections gathered during these tests:
    https://www.dropbox.com/s/3nwmvb2n0c...a%201.txt?dl=0
    https://www.dropbox.com/s/twfo9k7i6j...a%202.txt?dl=0
    https://www.dropbox.com/s/mpdi8mfdqs...a%203.txt?dl=0
    https://www.dropbox.com/s/cigdptgai4...a%204.txt?dl=0
    https://www.dropbox.com/s/pwh1yvpf94...a%205.txt?dl=0


    Thanks,
    David.

  9. #9
    Join Date
    May 2006
    Posts
    4045

    Re: Intermittent Axis Position Issue

    Hi David,

    Thanks for all the data. That provides some additional clues but we're still not able to figure out what the problem is.

    The 180us and 270us time slices are correct and make perfect sense. With only the diagnostic program running (with the system Thread always also running) the time should be 2x90us=180us. When your homing Thread is also running the time should be 3x90=270us.
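
    The arithmetic above can be written down as a small plain C sketch (host-side, not KFLOP firmware). It assumes, as described in this post, that the System Thread and each running User Thread get one 90us tick in turn; the function names and the tolerance parameter are hypothetical.

```c
#include <assert.h>

#define SERVO_TICK_US 90.0 /* servo sample period quoted in the thread */

/* Expected time between successive slices of one User Thread, assuming the
   System Thread and every running User Thread each get one tick in turn. */
double expected_slice_period_us(int user_threads)
{
    assert(user_threads >= 1);
    return (1 + user_threads) * SERVO_TICK_US; /* +1 for the System Thread */
}

/* Nonzero if a measured period differs from the expected one by more
   than tol_us microseconds. */
int slice_period_is_unexpected(double measured_us, int user_threads,
                               double tol_us)
{
    double diff = measured_us - expected_slice_period_us(user_threads);
    if (diff < 0.0)
        diff = -diff;
    return diff > tol_us;
}
```

    With one User Thread this gives 180us and with two it gives 270us, so a 360us reading taken while only the diagnostic is running would be flagged as unexpected.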

    The 360us time is incorrect and should not occur. I realize now that there is a flaw in the diagnostic program. It basically computes elapsed time based on the number of 90us servo ticks plus an offset from the last servo tick. But if a servo tick is not 90us then the time will be incorrect. I've attached a new diagnostic that measures the time as before and also using a raw 50MHz timer, which should give a correct result regardless of whether a servo tick is delayed. Both results should agree if the servo ticks are really 90us.
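
    As a rough illustration of why the two measurements can disagree, here is a plain C sketch (not the attached diagnostic, whose source is not shown in this thread). It assumes the raw timer is a free-running 32-bit up-counter; the counter width and all function names are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define SERVO_TICK_US 90.0 /* nominal servo tick */
#define RAW_TIMER_MHZ 50.0 /* raw timer rate quoted above */

/* Tick-based estimate: whole servo ticks plus the offset since the last
   tick. It is only right if every tick really lasted 90 us. */
double tick_based_us(uint32_t servo_ticks, double offset_us)
{
    assert(offset_us >= 0.0);
    return servo_ticks * SERVO_TICK_US + offset_us;
}

/* Raw-timer estimate from two counter readings. Unsigned subtraction
   stays correct across a single counter wraparound. */
double raw_timer_us(uint32_t start_count, uint32_t end_count)
{
    uint32_t counts = end_count - start_count;
    return counts / RAW_TIMER_MHZ;
}
```

    Under these assumptions a slice spanning 4 servo ticks always reads as 360us on the tick-based clock, while the raw counter reports the real elapsed time; 140000 raw counts, for instance, correspond to 2800us.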

    I'm not sure how interested you are but here is my current theory on what is occurring (but I don't yet understand why). It's a somewhat complicated matter but basically KFLOP's DSP scans the Axis Channels' configurations and determines which of 32 registers in the two SnapAmps need to be read. It then forms a 32-bit word (with one bit for each register) and writes the word to KFLOP's FPGA. KFLOP's FPGA then takes over and reads all those registers from the SnapAmps through the 16 wire ribbon communication cable. This could be up to 512 bits and require 10~20us. The results are stored into a FIFO in KFLOP's FPGA. KFLOP's DSP then reads the data from the FIFO. Normally by the time the data is attempted to be read it should already be available. If it is not available it loops up to 2048 times (~1ms) until it is or times out. I believe for some reason the data is not available and this stalls processing for ~1ms (or multiples of this depending on how many timeouts occur). This of course greatly disrupts the 90us servo samples.

    The communication between KFLOP and SnapAmp is timing deterministic (it is basically like reading memory). Regardless of noise or communication errors the data should be read in the same amount of time (it could be corrupted data but should never hold off putting something in the FIFO to be available to be read by KFLOP's DSP).

    One scenario that might fit is that the 32-bit word that determines which registers are to be read is corrupted when it is written locally to KFLOP's FPGA. If some bits were altered from 1 to 0 then those registers would not be read and placed in the FIFO as expected, causing a timeout. However, I can't think of any reason why a noise problem like this would be suddenly switched on by some event. Why would millions of servo ticks work flawlessly and then suddenly failures occur randomly every few thousand servo ticks? I suppose temperature could do something like that?
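
    The scenario can be modelled with a small plain C sketch. This is not KFLOP firmware: the function names are hypothetical, the register numbers in the usage note are arbitrary, and the model assumes each result that never reaches the FIFO costs one full 2048-poll (~1ms) timeout, which is a simplification of the description above.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_POLLS 2048 /* poll limit quoted in the post */
#define TIMEOUT_MS 1.0 /* approximate cost of one full timeout */

/* Request word: bit r set means register r (0..31) should be read. */
uint32_t build_request_word(const int *regs, int n)
{
    uint32_t word = 0;
    int i;
    for (i = 0; i < n; i++) {
        assert(regs[i] >= 0 && regs[i] < 32);
        word |= (uint32_t)1 << regs[i];
    }
    return word;
}

/* Number of set bits, i.e. how many results the reader expects. */
int count_registers(uint32_t word)
{
    int count = 0;
    while (word) {
        word &= word - 1; /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Polls wasted when the FPGA acts on `received` instead of `intended`:
   every requested bit that was dropped is a result that never arrives. */
long wasted_polls(uint32_t intended, uint32_t received)
{
    return (long)count_registers(intended & ~received) * MAX_POLLS;
}

/* The same loss expressed as an approximate stall in milliseconds. */
double modelled_stall_ms(uint32_t intended, uint32_t received)
{
    return count_registers(intended & ~received) * TIMEOUT_MS;
}
```

    For instance, requesting registers 0, 2, 4 and 6 gives the word 0x55. If two of those bits were flipped to 0 on the way to the FPGA, this model predicts 4096 wasted polls, or a stall of roughly 2ms, compared with the 90us servo period.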

    One odd thing about your configuration is that you have 2 SnapAmps plus a Kanalog. Is it possible to run your system without Kanalog connected to see if that is somehow related and triggering a bug in our code?

    Another odd thing is that you are using KFLOP Axis Channels 0,2,4,6 for some reason. Normally Axis channels 0,1,2,3 would be used. That should not make a difference but you might try it to see if it does. Also you might set unused Axis channels to Input Mode "No Input" (KFLOP reads and maintains encoder positions for all Axis channels set to encoder input mode regardless of whether the channels are used or enabled).

    There is an example Measure IRQ.c that measures the Servo Calculation Time for all the Axes and other operations that occur every 90us servo sample (which should be significantly less than 90us). You might run that before you observe the problem and after to see if there is a difference. If there is, that might be a clue. Note: no other User Threads should be running to get a valid measurement.

    I suspect this is a software bug or noise issue. But we could send a loaner set of boards to try to see if this is some hardware problem with the boards. Contact our support if you would like to do this.

    Are the 4 boards stacked together with standoffs and short cables? Are there any earth ground connections to the boards?

    Sorry for more questions than answers.

    Regards
    TK
    http://dynomotion.com

  10. #10
    Join Date
    Nov 2014
    Posts
    12
    Hi Tom,

    I appreciate all of the detail you provided, it absolutely helps clarify what's happening behind the curtain.

    Today I configured our axis as channels 0,1,2,3, and I added channels 4-7 as No Input and No Output in our init.c file. I also disconnected the kanalog card from the circuit. Unfortunately I must report that the problem persisted despite normalizing our setup.

    I monitored the snap amp temperature closely, it did not exceed 30°C.

    We have the 4 boards stacked with the standoffs provided and with short ribbons. The only ground connection on the kflop exists via the USB cable, we've tried running the kflop off a laptop (on battery power), and off of a PC powered by a UPS battery to eliminate the possibility of ground noise -- it had no effect.

    It's difficult to know exactly what noise the encoder signals are seeing during normal operation as my oscilloscope's impedance seems to cause a cacophony of interference if measuring while everything is powered up.

    I ran the measureIRQ.c file and gathered the following data:
    https://www.dropbox.com/s/sz3jx5sayb...0Data.txt?dl=0
    You'll notice that near the bottom, where an episode occurred, there is a spike in the IRQ time measurement. However, I ran this test several times and the spike never occurred again.

    I ran a test with your new Max Time Slice code and gathered the following data:
    https://www.dropbox.com/s/dzguuqizwp...0data.txt?dl=0

    My team and I are keen on the idea of trying a loaner set of boards to see if the problem could be with the controller. We may bark up that tree soon unless the information provided today spawns any new theories.

    Thanks again,
    David.

  11. #11
    Join Date
    Jun 2013
    Posts
    1041

    Re: Intermittent Axis Position Issue

    Is it cold in your shop or does the machine stay a stable temperature? Also is your power supply for the k-flop run to a common ground with the high voltage supply in any way?

    Ben

  12. #12
    Join Date
    Dec 2006
    Posts
    10

    Re: Intermittent Axis Position Issue

    Bhurts,

    As I am part of dPiccolo's team for this project and responsible for the supplies, I can answer this one.

    The machine enjoys 4 separate, traditional, "brute force" power supplies using a toroidal transformer, a full wave diode bridge and capacitors each (along with a common soft starter, snubbers across contacts and bleeders on each cap bank). The supplies are built by us, on a panel in a metallic box and provide an independent source for each amplifier, around the tune of 40vdc and 12amps each for 3 motors, 50vdc and 7 amps for the last one. They are fed from 240vac mains. The kflop and the encoders share a 5vdc supply fed from 120vac mains. This supply we have not built, and have mounted in our "power box". We also have a store bought 12vdc supply, also fed from 120vac, however we are not using it yet, so it has been left disconnected. Note that during our testing we have substituted the 5vdc supplies with no effect.

    The output connection from each supply is a direct run via a conduit to each snap amp "side" or supply input via RW-90 14awg, and the 5vdc supply is a direct run to the kflop power inputs. We are using shielded, twisted cabling (22 awg I believe) for the 5vdc connection; however, we have substituted this with no effect. The encoders use the snapamp 5vdc terminals for power. There is no deliberate connection from any rail of the supplies to earth ground; however, as dPiccolo mentioned, we have metered an earth connection through the USB cable. It seems that the snapamps/kflops create a common connection among all the return lines of the supplies on the board, including the 5v supply that independently powers the kflop and our encoders. Ultimately, this connects to the USB shield, and the motherboard of the PC creates a connection to earth ground. It is not a connection we can control at the moment. We wondered about this, so we ran it from a laptop, with no noticeable difference.

    Our only earth ground connections are a) cable shields, one side only, and b) bonding required by the Canadian Electrical Code to the cases, the frame of the machine and each independent axis along with our current tool (a 2.2 hp 120vac router), etc., using a "star" pattern, where they meet at our ground bar in the box. Our feeder's bare wire is attached here and goes directly to the panel. The PC gets its power from our "power box", where we have provided a fused 120vac 15amp receptacle tapped from our main incoming feeder (240vac 3 wire on a 30 amp breaker). Being on a residential service, our neutral connection is also solidly connected to earth, at the panel in this case, by the utility company. I have witnessed the ground rod driven into the ground and the service feeder installed myself for this building. All this being said, we are very confident there is no ground loop caused by conductors with this setup. Also, when we began tackling the issue, the circuits were all isolated and metered out in search of a loop, to no avail.

    As for the shop, it is continually heated to a comfortable 18 degrees C +/- 2. I too wondered if temperature was a factor due to when the symptoms manifested themselves; however, the temperature in the shop is always constant and the timing of the errors appears more and more random as time goes on. The owner (who is also our client) assures us that he does not touch the thermostat, and it is a non programmable mercury switch type.

    The temperature of the snapamps seems normal according to the readout on KMotion; however, we have not inspected the boards with a FLIR camera to be absolutely sure. Unfortunately, we do not have access to one either. Even if we did, the boards are stacked very much like it is shown on the dynomotion website, so it would be very difficult to get a proper picture.

    The team has asked the question if we could force the snapamp fans through software, however we were unsuccessful in our attempts - it would seem they are hardwired, however we have not been able to confirm this. We did not force the temperature with a heat gun to start the fans nor are we willing to try. The other way is to force the fans to spin by giving them 5v directly from the supply however we do not wish to modify the controller in any way physically for the purpose of this troubleshooting unless absolutely necessary.

    Hope this helps.

    -Frate

  13. #13
    Join Date
    May 2006
    Posts
    4045

    Re: Intermittent Axis Position Issue

    Hi David/Frate,

    Thanks for all the detailed information. I can't see anything being done incorrectly.

    Are there 18 gauge ground wires between the KFLOP and each SnapAmp?

    Thanks for running the tests to eliminate the Kanalog or Channel numbering.

    Another thought to reduce possibilities further, would be to use only one SnapAmp and disconnect the other. I believe I recall you have tried only one axis enabled and still observed the problem.

    The extended Time slice measurement with the newer diagnostic is consistent with my Theory that somehow the KFLOP FPGA FIFO incorrectly runs empty and times out. But we still don't have a plausible scenario how this could ever occur.

    Regarding Temperature: My thinking was there might be some marginal Memory/DSP/FPGA timing issue where a higher or lower temperature might change the timing slightly to make things work or not. I can't think of what else would vary over several minutes to make the problem come and go. Not so much that the SnapAmp MosFETs are overheating. But after power up, heat from the DSP, FPGA, MosFETs, or whatever may start increasing the temperature slightly. Not so much the room temperature. The SnapAmp's FPGA turns on the SnapAmp Fans when either side reaches 40C.

    You might try to see if the TimeSlice Diagnostic can detect an extended (2.8ms) Time Slice with no motor connection or motor power at all. If you could demonstrate this it would be a big clue. I believe by leaving motor power off and opening up the following error to a huge value you should be able to perform Step Response Plots. Of course the encoder position will not change because there will be no real motion but if the TimeSlice Diagnostic still shows a failure that would be a big clue.

    Otherwise let us send a set of replacement boards to see if it is just a strange hardware problem. Where are you located? Please contact Dynomotion Support and let us know where to ship them. Do you have another KFLOP to try?

    Regards
    TK
    http://dynomotion.com

  14. #14
    Join Date
    Dec 2006
    Posts
    10

    Re: Intermittent Axis Position Issue

    Tom,

    Yes, there are wires between the KFLOP and each SnapAmp. It is how we received it.

    We have tried running only 1 SnapAmp (the terminated one), with no difference. David will be trying the new test without motors attached soon.

    Unfortunately, we do not have another KFLOP to try, this set of boards is all we have.

    I will be emailing Dynomotion Support shortly with my info so we may discuss a swap.

    -Frate

  15. #15
    Join Date
    Nov 2014
    Posts
    12
    Hi Tom,

    I ran some tests with the motor supplies off using a MaxFollowError of 1e+009.
    I can now confirm that the issue occurs with no axis movement or supply voltage input.

    Firstly I "jogged" the axis in kmotioncnc and after 30 seconds or so an episode occurred. The ~180us/180us reading became ~360us/2800us as we saw in previous tests.

    Secondly I reset the controller and ran step response tests for 15 minutes with no episode. I switched back to kmotioncnc and jogged for several seconds, then switched back to step response, where a second episode occurred shortly thereafter.

    Here is the step response data collected during an episode:
    https://www.dropbox.com/s/i3ipop9vvx...a%205.txt?dl=0

    Hopefully this sheds some light on the problem.

    Thank you,
    David.
