Back when I was managing large engineering projects, particularly anything for the government, we had to rate every risk item on a two-axis grid. The two axes amounted to probability and pain: the likelihood of a risk item causing a problem, and the level of impact that problem would have on the project.
Extending this thinking to the ROPS, I have had a half-dozen problems created by having the ROPS up, and zero problems ever caused by keeping it down. But on the flip side, the problems created by having it up are usually low impact compared to the pain of having it down during a rollover event.
So, based on my own usage history, I'd rate this as:
Risk of ROPS up: High probability, low impact
Risk of ROPS down: Low probability, high impact
Other usage profiles and terrain will obviously shift the probability more than the impact.