Bug #5334
"dT is NaN! ..." skuttlebutt in log associated with broken Program Resource display in Flight scene
100%
Description
I wish I could get more of a handle with this one. I had my Program Resources (i.e. Funds, Reputation, Science) display break in the Flight scene shortly before the type of crash I typically associate with scene changes. When I checked the crash folder, the output log was unusually large considering I had not been in the VAB much (I launched a "Satellite 3" for a contract, a design I'd already used and didn't change, and it has no parachutes.) As near as I can tell, the log file doesn't have a single entry of #5169 skuttlebutt, but page after page of this junk:
(Filename: C:/buildslave/unity/build/artifacts/StandalonePlayerGenerated/UnityEngineDebug.cpp Line: 56)
dT is NaN! tA: NaN, E: NaN, M: NaN, T: NaN
It was generated while I was in the Flight scene punctuated by normal messages of the occasional staging event. In the attached log, I removed most of it, with those removals denoted by /* comments */. I clicked through hundreds of screens scrolling exactly five entries at a time, making blinks to other types of messages extremely obvious when they occurred. I thought that the broken Program Resource tab might have been a result of changing through craft, but I couldn't reproduce it this way. Whether it even has anything to do with "dT is NaN!" stuff I have no idea yet (as I look at the log, I find myself doubtful, since "dT is NaN!" skuttlebutt happened during the Satellite 3 launch, while the broken resource tab happened while I was flying Mun 3.) I'd have to see one or the other happen again to get a better idea of what's going on.
History
#1 Updated by Squelch over 9 years ago
- Status changed from New to Need More Info
I notice you have mods installed. Are you able to reproduce without them?
I too have been chasing the cause of this, and have attributed it to setting maneuver on exactly circular, or orthogonal orbits. These types of orbits are incredibly hard to achieve during normal gameplay, but mods like HyperEdit and MechJeb can. My suspicion is there is a "divide by zero" error being introduced by "exact" orbits.
A copy of your log and the steps to reproduce together with a savegame would be most welcome, however, we cannot reproduce and verify issues that use any kind of addon due to the possible "other" variables they might introduce.
#2 Updated by featherwinglove over 9 years ago
I do have Kerbal Engineer as you can see in the screenshot, and the orbit is at very low eccentricity, but I didn't realize this until after you mentioned it since it was not an objective of my flying at the time. I wouldn't classify this sort of flying as "incredibly hard" without mods since jitteringly low eccentricities can easily be done by equalizing apoapsis and periapsis altitude from the Map view. Polar orbits are a bit harder since the last little bit can only be done by observing Free camera and Navball rotation speeds at polar passage.
#3 Updated by featherwinglove over 9 years ago
- File KSP.log KSP.log added
- File persistent.sfs persistent.sfs added
I went over my report again and remembered that most of these messages were generated while Satellite 3 was in powered flight and none while I was logged into the low-eccentricity Mun 1 (I wanted to put it in a polar orbit, but I missed it deliberately to avoid #5305.) I just checked my latest run (no crash) log, an I got a crap-tonne of the stuff in the powered flight of suborbital tourism vehicle "Naddie 1", attaching files.
It has time stamps, and I notice that there are four messages with tightly grouped times every 1/4 to 1/3 second during the initial powered flight phase. It appears to be happening on a per-frame basis (Note: I normally get frame rates of about 3.5fps during powered flight with the camera pointed towards the ground.) In place of these, for two non-consecutive frames at 22:10:23.5 and 22:10:40.7 during high frame rate flight ("high" being in the 15fps ballpark), I got a bunch of messages
Actor::setAngularDamping: The angular damping must be nonnegative!
At that about that time, I was moving the craft slightly from a nose-first ascent attitude to a tail-first descent attitude, which involved several SAS toggles. I usually toggle SAS on at high angular rates, toggle it back off at low angular rates an high torque values to knock out its control inputs, then toggle it back on for final stabilization. Maybe I got really lucky this time and stopped the craft at such low angular rates that it had a fit.
During the descent, the messages indicate a fairly typical frame rate of 10-11fps, again four messages per frame. During these periods, the game is GPU limited, and it appears that the game is trying to advance the physics integration while the frame is still being rendered. If dT is flushed, and doesn't return the usual 0.04s until the frame is complete, then basically the physics thread is twiddling its thumbs writing these log entries while waiting for the GPU.
To reproduce this, you'll probably have to rip out your 980 and hope the computer has GMA on the mobo, crank your graphics settings to their maximums, and then launch a three-part hopper with the camera aimed west. I have enough to make and test a prediction:
1. Three part hopper: tonnes of skuttlebutt in the log.
2. 300 part monstrosity: nuthin'. I'll need to power it with a Mammoth to keep from generating so many exhaust particles that the GPU gets floored again. Task Manager gives some good indications as to whether it's the GPU or CPU that is floored.
Be back later to report if I'm correct.
#4 Updated by featherwinglove over 9 years ago
Prediction 1 has failed so far, which implicates a mod. ...a rude one that doesn't identify itself in these messages. Leading suspects are the ones that probably try to make course predictions faster than my frame rate: Kerbal Engineer and Trajectories. I'm still going to try unmodded high-frame high-part first. Please don't kill the issue just yet.
#5 Updated by featherwinglove over 9 years ago
The actual 227 part monstrosity: nuthin'. I'll start testing the mods maybe tomorrow. I don't know if we should kill it, since most of the problems I've observed in Kerbal Engineer to date have been fed to it from the API (the most irritating being bad horizontal speed.)
#6 Updated by Squelch over 9 years ago
Thanks for your work on this, it really is appreciated.
From your trials and tribulations during testing this I can't help infer that the problems are in fact arising from near exact orbits, and therefore floating point jitter or divide by zero errors. As I pointed out initially, these are extremely hard to achieve during normal gameplay in stock. The progression of this issue does appear to confirm the use of mods can introduce variables that are hard to reproduce in a stock game.
The behaviour is unwanted and potentially game breaking (even if it is rare in stock) so ideally needs some attention. This issue will remain open as it serves the dual purpose of highlighting both points above. This will also be forwarded to the developers as feedback.
As a footnote:
We are extremely interested in any reports of this problem in a stock game. Logs and (clean) savegames are very welcome, and can be used as compelling evidence for a higher priority.
#7 Updated by Kerbas_ad_astra over 9 years ago
Featherwinglove, your comment about suspecting Kerbal Engineer led me to do further testing (I've been encountering it on my installs as well), and it is KER -- in particular, the "time to atmosphere" readout. It's not installed on the orbital panel (or any panel) by default, so these errors are so rare because it's only people with custom HUD or panel settings that are at risk. Furthermore, the error only occurs under these conditions:
1. Time to atmosphere is installed on a panel or HUD which is open/viewed.
2. The "orbit" is entirely within the atmosphere (the error spam stops as soon as the apoapsis is above the atmosphere).
I see from the code (https://github.com/CYBUTEK/KerbalEngineer/blob/master/KerbalEngineer/Flight/Readouts/Orbital/TimeToAtmosphere.cs) that KER checks to make sure that the periapsis is inside the atmosphere, to stop it from asking for a time that doesn't exist (i.e. it asks when the orbit reaches the radius equal to the edge of the planet's atmosphere), so it probably also needs to check that apoapsis is outside the atmosphere. I'll pass the word along to the KER devs, and we'll see if this clears things up. EDIT 2: Made the change on my end, it works, pull-request is in.
If there is a "stock bug" here, it's probably the fact that orbit.TrueAnomalyAtRadius(radius) isn't happy when the orbit doesn't actually contain that radius (EDIT: and maybe doesn't have the built-in sanity checks to test for that case?), but in that case, NaN sounds like a perfectly reasonable answer -- probably the only thing that needs to change is a clearer log entry explaining what's going on. Maybe something like "requested TrueAnomalyAtRadius for radius not in orbit", with details on the orbit, requested radius, and a stack trace to see where the offending call was made?
#8 Updated by featherwinglove over 9 years ago
From your trials and tribulations during testing this I can't help infer that the problems are in fact arising from near exact orbits, and therefore floating point jitter or divide by zero errors.
This is clearly not the case here, as I stated in comment # 2 This issue is occurring before I have achieve any kind of orbit, let alone a precise one.
As I pointed out initially, these are extremely hard to achieve during normal gameplay in stock.
This is a myth, and you are doing a disservice to the KSP community and developers by perpetuating it. A perfectly circular orbit can easily be achieved by bringing your vertical speed to zero and making your periapis and apoapsis equal with no mods and a level 1 tracking station in career mode. If you find it "extremely hard to achieve", it's because you're not a very good pilot.
The progression of this issue does appear to confirm the use of mods can introduce variables that are hard to reproduce in a stock game.
That I agree with. I'm almost surprised you didn't mark this issue as mod-related. I guess you didn't because of so many issues that mods have exposed (horizontal speed and latitudes reported by the stock API showing up in MechJeb and KER.)
I'm in a break from KSP at the moment, which is why I haven't updated this issue recently.
#9 Updated by featherwinglove over 9 years ago
Kerbas_ad_astra wrote:
Featherwinglove, your comment about suspecting Kerbal Engineer led me to do further testing...
You're awesome! I'm tempted to ask for your Reddit, Voat, and Youtube overviews so I can throw you a bunch of random votes. Thanks for your hard work.
#10 Updated by TriggerAu over 8 years ago
- Status changed from Need More Info to Needs Clarification
#11 Updated by Kerbas_ad_astra over 8 years ago
It's been a while, but if I recall correctly, this particular issue was caused by a call (in Kerbal Engineer) to orbit.TrueAnomalyAtRadius(radius) with a "radius" that was higher than the orbit's apoapsis (which does not exist). I contributed a sanity check to KER to stop it from making the call, but I just made a custom version with the fix removed, and the "dT is NaN" messages are still made. I don't consider this an unreasonable answer, but maybe TrueAnomalyAtRadius should have some sanity checks of its own, and put a more verbose message in the log than "dT is NaN!" if they fail. Maybe something like "Orbit.TrueAnomalyAtRadius(radius): radius > orbit.apoapsis" (or "radius < orbit.periapsis") and a stack trace?
#12 Updated by featherwinglove over 8 years ago
- Status changed from Needs Clarification to Resolved
- % Done changed from 0 to 100
It was fixed in KER 1.0. ...er... 1.0.18 I think.
#13 Updated by Kerbas_ad_astra over 8 years ago
It's true that it's been fixed on KER's end (since 1.0.19.1), but it may be worthwhile (I'd certainly suggest Squad consider it, at least) to add some logic to the stock method to provide more information in the log when bad calls get made.
#14 Updated by featherwinglove over 8 years ago
I agree. This KER bug being reported and resolved here is a good indication of that. The lag and log output had me confused enough that I didn't think it had anything to do with mods
#15 Updated by TriggerAu about 8 years ago
- Status changed from Resolved to Closed