Maximum Elapsed Time Exceeded

Message boards : Number crunching : Maximum Elapsed Time Exceeded
Conan
Joined: 9 Apr 15
Posts: 29
Credit: 850,239
RAC: 7
Message 228 - Posted: 26 May 2015, 6:56:35 UTC
Last modified: 26 May 2015, 7:00:54 UTC

All my Windows XP 32 bit work units are failing with this error message.

<message>
Maximum elapsed time exceeded
</message>
<stderr_txt>
Operation time:1000000.000000
Diferential Time:0.002000
Frequency:50
Initial Time:999000.000000


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x7c90120e

See WU 4064725
Also WU 4071990

So far 64 bit Linux seems unaffected.

Conan

jcastro
Joined: 16 Mar 15
Posts: 219
Credit: 14,859
RAC: 0
Message 229 - Posted: 26 May 2015, 9:38:44 UTC - in response to Message 228.  

Hi!

It's really strange behaviour, because the time limit is set when the WU is created and is the same for every platform. We will try to isolate the bug as soon as possible and find out what is happening.

Thanks for the attention, Joel.

morgan
Joined: 10 Apr 15
Posts: 8
Credit: 2,135,746
RAC: 2
Message 230 - Posted: 26 May 2015, 10:40:23 UTC - in response to Message 229.  

Well, all the Carro-Rodriguez-Laguna-Pueyo Epicardial Model (Carro et al. 2011) for human ventricular cells v1.02 tasks run OK.

But all v1.03 tasks fail here, like Conan's (Win 32).

jcastro
Joined: 16 Mar 15
Posts: 219
Credit: 14,859
RAC: 0
Message 231 - Posted: 26 May 2015, 11:08:07 UTC - in response to Message 230.  

morgan wrote:
Well, all the Carro-Rodriguez-Laguna-Pueyo Epicardial Model (Carro et al. 2011) for human ventricular cells v1.02 tasks run OK.
But all v1.03 tasks fail here, like Conan's (Win 32).

v1.02 also tends to fail, like v1.02 on x64 systems, but it fails less often than v1.03. We will deprecate v1.03 x86 until we find the error.

Thanks for the patience, Joel.

jcastro
Joined: 16 Mar 15
Posts: 219
Credit: 14,859
RAC: 0
Message 233 - Posted: 26 May 2015, 11:41:01 UTC
Last modified: 26 May 2015, 11:41:49 UTC

To illustrate the strange behaviour of the bug: the same v1.03 application has both succeeded and failed on the same PC with WUs of the same length.
Correct: http://denis.usj.es/denisathome/result.php?resultid=4053805
Fail: http://denis.usj.es/denisathome/result.php?resultid=4063184

We are working on it. Thanks for the patience, Joel.

jcastro
Joined: 16 Mar 15
Posts: 219
Credit: 14,859
RAC: 0
Message 235 - Posted: 26 May 2015, 14:17:36 UTC

Hi!

The problem could be that the server isn't correctly calculating the FLOPS of each machine; some issues with v1.02 could have generated an unrealistic value (205.99 GFLOPS on an x86 machine). We have therefore reset the statistics to try to solve this problem.

We hope this solves the error.
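
For anyone wondering how a wrong speed estimate turns into "Maximum elapsed time exceeded": as far as I understand the BOINC client, a task is aborted once its elapsed time passes the work unit's `rsc_fpops_bound` divided by the estimated speed of the app version. Below is a rough sketch of that arithmetic, not the actual client code. The function name, the operation bound of 1e15 and the 2 GFLOPS "realistic" speed are made-up values for illustration; only the 205.99 GFLOPS figure comes from this thread.

```python
def max_elapsed_seconds(rsc_fpops_bound: float, estimated_flops: float) -> float:
    """Time limit: the operation bound divided by the estimated speed."""
    if estimated_flops <= 0:
        raise ValueError("estimated_flops must be positive")
    return rsc_fpops_bound / estimated_flops


# Hypothetical work unit bound (not taken from the project): 1e15 operations.
FPOPS_BOUND = 1e15

# Assumed realistic speed for an older x86 host (illustrative only).
realistic = max_elapsed_seconds(FPOPS_BOUND, 2e9)

# Inflated estimate quoted in this thread: 205.99 GFLOPS.
inflated = max_elapsed_seconds(FPOPS_BOUND, 205.99e9)

print(f"limit at 2 GFLOPS:      {realistic / 3600:.1f} h")
print(f"limit at 205.99 GFLOPS: {inflated / 3600:.1f} h")
```

The point is only the ratio: an estimate that is about 100 times too high makes the allowed run time about 100 times too short, so a task that runs at normal speed can hit the limit long before it finishes.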

Thanks for your patience, Joel.

Megan-Louise
Joined: 17 May 15
Posts: 5
Credit: 11,764
RAC: 0
Message 245 - Posted: 28 May 2015, 3:28:58 UTC

I suspended the task for 1 minute after 10 min 50 s of runtime, with the progress bar at 10 percent. After resuming, the estimated time rose rapidly (from 1 hour 34 minutes to 6 hours 18 minutes) within an hour, whilst the progress bar barely changed.

http://denis.usj.es/denisathome/result.php?resultid=4190003

Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x7c90120e


Had to abort the task in the end. Sorry :(

jcastro
Joined: 16 Mar 15
Posts: 219
Credit: 14,859
RAC: 0
Message 246 - Posted: 28 May 2015, 12:40:27 UTC - in response to Message 245.  

Hi!

No problem. Some bugs caused the estimation system to go crazy: it uses a wrong estimation factor, and this produces those errors. We have deprecated the version that was affected by the bug and we are trying to build a new bug-free version.
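
To give a feel for why the remaining-time figure climbs when progress stalls, here is a small sketch using plain linear extrapolation from the fraction done. This is an assumption for illustration only: it is not the estimator BOINC actually uses, and the function name and the 11 percent value are hypothetical. The 10 min 50 s and 10 percent figures are the ones reported above.

```python
def naive_remaining_seconds(elapsed_s: float, fraction_done: float) -> float:
    """Linear extrapolation: assume the rest runs at the average pace so far."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_s * (1.0 - fraction_done) / fraction_done


# Reported state when the task was suspended: 10 min 50 s elapsed, 10% done.
at_suspend = naive_remaining_seconds(10 * 60 + 50, 0.10)

# Hypothetical state one hour of run time later, with progress barely moved.
one_hour_later = naive_remaining_seconds(10 * 60 + 50 + 3600, 0.11)

print(f"remaining at suspend:     {at_suspend / 3600:.2f} h")
print(f"remaining one hour later: {one_hour_later / 3600:.2f} h")
```

With this kind of estimate, every extra minute of run time without matching progress pushes the projected remaining time up, which matches the general pattern described in the report even if the exact numbers differ.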

Thanks for your patience, Joel.