(SKR 1.3) Marlin 2.0 M112 (EM Halt) issue

Hi there again,

I’m currently writing a new UI (forked from the BIGTreetech TFT software) for the widely used TFTs around here, designed for routing/milling instead of 3D printing. It already has some cool features, such as direct jog control (the machine moves while you press, with dynamic speed, and stops on release), and much more aimed specifically at milling and routing.

However, I’m currently running into problems with the Marlin 2.0 emergency stop routine. After a forced stop (a direct “M112\n” over the serial line) I receive the “printer halted” echo. But then a random number of axes start running away, presumably to the home point or something. Marlin is unresponsive afterwards, as it should be, but this can’t be the intention of M112.

Is this a known bug or something? Or am I overlooking something?

I haven’t heard of that. Do you have a serial log? Maybe post that in the Marlin issues.

I’ve never worked with debug mode before. The thing is, it only seems to happen during a print, possibly only when the machine is zeroed somewhere other than its origin. Hopefully I can narrow the scenario down.

It took me some time to get it working in the first place, since the G-code sender in the TFT firmware does not provide a way to push an emergency command directly, so I had to write that function myself. The first tests were OK (with the CNC idle). During printing, however, the CNC starts moving even though it directly confirms that kill() was called.
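For readers wondering what "pushing an emergency command directly" could look like, here is a minimal stand-alone sketch. It is not the TFT firmware's code: `uart_write`, `queue_cmd`, `send_emergency_stop`, and the `wire` buffer are all hypothetical names that only model the idea of bypassing the normal command queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define WIRE_SIZE   64
#define QUEUE_SLOTS 4

static char wire[WIRE_SIZE];             /* stand-in for the UART TX line */
static const char *pending[QUEUE_SLOTS]; /* commands waiting to be sent */
static int pending_count = 0;

/* Hypothetical low-level write: appends raw bytes to the "wire". */
static void uart_write(const char *s) {
    assert(strlen(wire) + strlen(s) < WIRE_SIZE);
    strcat(wire, s);
}

/* Normal path: commands wait in a queue until the main loop sends them. */
bool queue_cmd(const char *cmd) {
    if (pending_count == QUEUE_SLOTS) return false;
    pending[pending_count++] = cmd;
    return true;
}

/* Emergency path: skip the queue and write M112 straight to the UART. */
void send_emergency_stop(void) {
    uart_write("M112\n");
}
```

In this model, calling `send_emergency_stop()` while moves are still queued puts only "M112\n" on the wire and leaves the queued commands untouched, which is the behaviour an emergency stop needs.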

Marlin 2.0 should be the de facto standard for Marlin-based 32-bit machines, right? Then this must have been noticed by many more people, I’d assume…

Never heard of this problem either and it doesn’t happen with my board so it’s hard to tell.

What stepper drivers are you using? Will M112 turn off power to steppers?

My best guess here is EMI. Does it only happen if the router (spindle) is on? What if you turn the router off, will the random walking stop?

It’s a board I’m using for development: a minimal setup without a spindle, with 3 steppers on the standard A4988 drivers. So EMI should be ruled out. The board should not power the steppers down after an M112; they should freeze in position, also to prevent the gantry from falling down and doing more damage in an emergency situation. And it does not power them down. However, it seems to trigger a homing action after the M112 while printing.

OK, looks like we can rule out gravity :slight_smile: The A4988s are simple drivers that take pulse inputs, so I would not rule out some weird electrical noise on the board. However, in that case the steppers should walk randomly the whole time they are powered up, not just after M112. It is theoretically possible that the code the MCU is running affects the noise, and that it hits the “sweet spot” in the tight watchdog loop it enters after kill. It would take a crazy set of circumstances, but stranger things have happened. I would try swapping out the power supply, if you have a spare, just to be sure.

Triggering of homing routine after “kill” is unlikely. It may be possible if the firmware somehow gets automatically reset and then the host (TFT) sends it a homing command but if Marlin became unresponsive then this is unlikely as well.

Here’s the function that is called at M112 if that helps.

EDIT: One thing to try that should rule out a homing sequence as a cause would be to set DISABLE_X or DISABLE_Y to true. M112 should then disable that stepper. If the stepper re-energizes and starts moving it’s more likely to be a software issue. If it stays disabled you’re more likely to have an electrical issue.
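As a rough illustration of that test, the change would be a one-line edit in Marlin's Configuration.h. This is only a config fragment under the assumption that your Marlin 2.0 configuration still ships with these options set to false; check your own file before editing.

```c
/* Configuration.h fragment (assumed default: false).
 * With this set to true, the X stepper should be disabled when M112 is handled. */
#define DISABLE_X true
```

If the X stepper de-energizes on M112 and then powers up and moves again, that points toward firmware rather than electrical noise.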

BTW I absolutely love the idea of a CNC UI for these displays.

I’d pretty much rule out an electrical issue. For one, the setup really does not allow for much EMI. Second, and more importantly, it’d be pretty unlikely for EMI to trigger X movement in one direction for a while, followed by Y motion for longer, and then a period of Z motion ;-). At least the EMI would not be causing the “homing”. Of course, a dangling kill pin might restart the entire sequence; however, that would not trigger a home, and it would most likely cause problems in normal operation too. The setup runs fine for hours without any issues, running G-code from the SD.

I’m working my way through the code to see where I can find any possible mishaps…

What I find most peculiar at the moment is that running with EMERGENCY_PARSER enabled (which I am) disables most of the standard M112 handling. kill() is also not called explicitly. Basically all it does is set emergency_parser.killed_by_M112 to true. The only function acting on this is Temperature::manage_heater(), which calls kill() when it sees the flag set, but that’s not a logical location if you ask me.

In my setup it might not even get to that point (looking at the call hierarchy).
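The flow described here can be sketched as a small stand-alone model. This is not Marlin's source: `serial_rx_char`, `manage_heater_poll`, `kill_machine`, and the two flags are hypothetical stand-ins, used only to show that receiving M112 merely sets a flag and that the halt happens later, when the polling function next runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the flag-based flow; not Marlin source. */
static volatile bool killed_by_m112 = false; /* set from the serial RX path */
static bool kill_called = false;             /* records that the halt ran */

static char rx_line[16];
static size_t rx_len = 0;

/* Called per received byte (in real firmware, from the UART interrupt). */
void serial_rx_char(char c) {
    if (c == '\n') {
        assert(rx_len < sizeof(rx_line));
        rx_line[rx_len] = '\0';
        if (strcmp(rx_line, "M112") == 0)
            killed_by_m112 = true;   /* only sets the flag, does not halt */
        rx_len = 0;
    } else if (rx_len < sizeof(rx_line) - 1) {
        rx_line[rx_len++] = c;       /* overlong lines are truncated */
    }
}

void kill_machine(void) { kill_called = true; }

/* Polled from the main loop; the halt happens only when this runs. */
void manage_heater_poll(void) {
    if (killed_by_m112) kill_machine();
}
```

In this model there is a window between the flag being set and `manage_heater_poll()` running, during which anything driven by interrupts rather than by the main loop keeps going.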

Is it really that consistent? E.g. it moves the same amount in the same order every single time this happens?

EMERGENCY_PARSER didn’t make any difference for me. I can confirm that the kill behavior is slightly inconsistent, as one would expect from the code: with EP enabled the steppers are not disabled (even if DISABLE_? is set); without EP they are. We should probably report this bug.

You can try to disable EMERGENCY_PARSER and slip an M112 in the middle of a gcode file to see if this changes things.

Another setting worth checking: you have EXTRUDERS = 1, right? E=0 is known to be broken.

Yeah, not the best name as it does a few other things in addition to thermal management. However I am pretty sure it will be called. This function is called from all over the place like safe_delay and the main program loop (via the idle function). More importantly, this is one of the few functions that resets the hardware watchdog timer. Thus unless you’ve disabled USE_WATCHDOG, the board would reset unless this is called periodically.

Good to know! I did some more testing (by trial and error) and found a few extra clues. It only seems to happen during a long move, such as a distant travel move that takes a while. During short moves (normal milling moves like holes etc.) it works fine and stops the machine correctly.

Also, when it works, it does not respond on the serial connection, whereas when I get a runaway I get the serial message “Printer halted. kill() called!” just before it starts moving away.

Ha!

I can confirm this now. If I enable EP and send two long moves to the queue:

G1 X50 Y50 F500
G1 X100 Y50 F500

Then M112 will stop the machine, the steppers pause for a moment, and then they continue until the buffer is empty, i.e. both moves are executed before it comes to a stop. This is probably why you don’t see this with short moves: you just don’t realize that M112 didn’t take effect immediately.

My wild guess is that because stepper pulses are generated from interrupts, kill() is not able to disable all of the interrupts on LPC1768 or something re-enables the stepper ISR.
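To make that guess concrete, here is a toy model of the suspected mechanism. It is a sketch under stated assumptions, not Marlin's planner or stepper code: the ring buffer, `stepper_isr`, `kill_main_loop_only`, and `kill_and_stop_steppers` are hypothetical names, and one ISR call stands in for finishing one queued move.

```c
#include <assert.h>
#include <stdbool.h>

#define QUEUE_SIZE 8

typedef struct { int target_x, target_y; } move_t;

static move_t queue[QUEUE_SIZE];
static int head = 0, tail = 0;           /* ring buffer indices */
static bool stepper_isr_enabled = true;
static bool main_loop_halted = false;
static int moves_executed = 0;

/* Main loop side: add a move to the buffer. */
bool queue_move(int x, int y) {
    assert(head >= 0 && head < QUEUE_SIZE);
    int next = (head + 1) % QUEUE_SIZE;
    if (next == tail) return false;      /* buffer full */
    queue[head] = (move_t){ x, y };
    head = next;
    return true;
}

/* Interrupt side: one call models the ISR finishing one queued move. */
void stepper_isr(void) {
    if (!stepper_isr_enabled || head == tail) return;
    tail = (tail + 1) % QUEUE_SIZE;
    moves_executed++;
}

/* Suspected behaviour: only the main loop stops, the ISR keeps running. */
void kill_main_loop_only(void) { main_loop_halted = true; }

/* Expected behaviour: also stop the ISR and drop the buffered moves. */
void kill_and_stop_steppers(void) {
    main_loop_halted = true;
    stepper_isr_enabled = false;
    tail = head;
}
```

With two moves queued, `kill_main_loop_only()` followed by further `stepper_isr()` calls still executes both moves in this model, while `kill_and_stop_steppers()` executes none. That matches the observed "pause, then finish the buffer" pattern, but it remains a guess about the real firmware.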

This should be reported to Marlin. Thank you for debugging.

Thanks for the confirmation. I added an issue to the Marlin 2.0 tracker:

Thank you. Can you drop a link here as well so I can keep an eye on this?

BTW, if anybody uses the BigTreeTech firmware and prints from the SD card, I’ve got a tip for you (wow, the firmware is badly designed…).

The main loop that sends the G-code during a print is also used for rendering the screen. So each line of G-code is separated (in time) from the next by the rendering and all the other actions on the board. This causes pretty choppy motion with the standard Marlin settings. If you’d like an easy quick fix that improves this significantly, give some priority to G-code handling. To do so, you can do the following:

In the menu.c file, in loopProcess, add a little for loop around this section:
getGcodeFromFile(); //Get Gcode command from the file to be printed
sendQueueCmd(); //Parse and send Gcode commands in the queue
parseACK(); //Parse the received slave response information
parseRcvGcode(); //Parse the received Gcode from other UART, such as: ESP3D, etc…

Change it to:
for (u8 i = 0; i < 32; i++) {
    getGcodeFromFile(); //Get Gcode command from the file to be printed
    sendQueueCmd(); //Parse and send Gcode commands in the queue
    parseACK(); //Parse the received slave response information
    parseRcvGcode(); //Parse the received Gcode from other UART, such as: ESP3D, etc…
}

The CPU is fast enough that you won’t notice the delay this causes in the UI handling. Also, you might want to increase the command buffer in Marlin a bit to smooth out printing (BLOCK_BUFFER_SIZE=32 in Configuration_adv.h).
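A back-of-the-envelope model shows why batching helps. The timings below are made-up illustration values, not measurements from the TFT, and `gcode_lines_per_second` is a hypothetical helper. It also assumes a line is always ready to send, which ignores waiting for Marlin's "ok" replies.

```c
#include <assert.h>

/* Upper bound on G-code lines handled per second when each pass of the
 * main loop spends render_us microseconds on the UI and then handles
 * `batch` lines at gcode_us microseconds each. */
unsigned long gcode_lines_per_second(unsigned long render_us,
                                     unsigned long gcode_us,
                                     unsigned long batch) {
    unsigned long pass_us = render_us + batch * gcode_us;
    assert(batch > 0 && pass_us > 0);
    return (batch * 1000000UL) / pass_us;
}
```

With the assumed values of 2000 µs per render pass and 50 µs per line, a batch of 1 gives at most 487 lines per second while a batch of 32 gives at most 8888, at the cost of each UI pass arriving 1.55 ms later.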

I’ll probably change this piece of code before release. If it’s usable and, IMHO, significantly better for milling than the original, I’ll add it to GitHub.
