Long Running Task

Message boards : Number crunching : Long Running Task

Conan
Joined: 9 Apr 15
Posts: 25
Credit: 496,457
RAC: 0
Message 465 - Posted: 14 Sep 2015, 8:21:32 UTC
Last modified: 14 Sep 2015, 8:27:28 UTC

I have a DENIS work unit that has been running for over 9 hours so far, still using a full core.

Are there a few really long running work units in the current batch?

The longest I have had recently has been 9,500 seconds (2.64 hours).

The work unit is running on a 32-bit Windows XP computer with an AMD Phenom 955.

Thanks
Conan

jcastro
Volunteer developer
Volunteer tester
Project scientist
Joined: 16 Mar 15
Posts: 218
Credit: 14,859
RAC: 0
Message 467 - Posted: 14 Sep 2015, 10:08:03 UTC - in response to Message 465.

Could you send me the name of the task?

Best regards, Joel

Conan
Joined: 9 Apr 15
Posts: 25
Credit: 496,457
RAC: 0
Message 468 - Posted: 14 Sep 2015, 11:10:07 UTC

It is GD_jcarro_20150903001429000000_CRLP_Exp_Design_1_conf_4619_2

See Task 14737515

It is at 9 hours 35 minutes and counting at the moment.

Thanks
Conan

Conan
Joined: 9 Apr 15
Posts: 25
Credit: 496,457
RAC: 0
Message 470 - Posted: 14 Sep 2015, 22:03:39 UTC - in response to Message 468.
Last modified: 14 Sep 2015, 22:05:01 UTC

It is GD_jcarro_20150903001429000000_CRLP_Exp_Design_1_conf_4619_2

See Task 14737515

It is at 9 hours 35 minutes and counting at the moment.

Thanks
Conan


It eventually reported fine after 11 hours, but gave very poor credit (10 cr/h), probably because the other returned result recorded only 1,655 seconds of run time.
Mine recorded Run Time 40,465.05, CPU Time 39,788.40, Credit 115.46.

No idea why it took so long to run. I'm glad it was a successful result, but for the effort at least double the points would have been nice.

Conan
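
As a quick check of the figures quoted in the post above, the credit rate can be recomputed from the reported run time and granted credit. This is only a worked example of the arithmetic; the variable names are illustrative and it says nothing about how the project actually awards credit.

```python
# Figures quoted in the post above.
run_time_s = 40_465.05   # reported run time, in seconds
credit = 115.46          # credit granted for the task

run_time_h = run_time_s / 3600          # convert seconds to hours
credit_per_hour = credit / run_time_h   # credit earned per hour of run time

print(f"run time:    {run_time_h:.2f} h")          # run time:    11.24 h
print(f"credit rate: {credit_per_hour:.2f} cr/h")  # credit rate: 10.27 cr/h
```

The result agrees with the "11 hours" and "10 cr/h" figures given in the post.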

ritterm
Joined: 8 Apr 15
Posts: 4
Credit: 111,560
RAC: 0
Message 573 - Posted: 19 Oct 2015, 14:10:00 UTC

I have a host that seems to be having problems with long running, "never ending" tasks. The three in-progress tasks below have about 15 hours of run time and have been showing 100% progress for several hours:

2XP_18101350_3600_1608_1
2XP_18101350_3600_1653_1
2XP_18101350_3600_2893_0

Should I let them run or abort? Credit is not a big concern, but I don't want to hold up my wingmen if my work is a lost cause.

Crystal Pellet
Joined: 16 Jul 15
Posts: 4
Credit: 3,556,285
RAC: 0
Message 574 - Posted: 19 Oct 2015, 20:08:38 UTC - in response to Message 573.

Hi Mr. M.

On that system with the stock application they should last about 10 hours.
It's up to you what to do with the running tasks.

To speed up processing, have a look at the optimized applications: http://denis.usj.es/denisathome/forum_thread.php?id=53

ritterm
Joined: 8 Apr 15
Posts: 4
Credit: 111,560
RAC: 0
Message 575 - Posted: 19 Oct 2015, 22:35:47 UTC - in response to Message 574.

On that system with the stock application they should last about 10 hours...

Thanks for the feedback, CP. The 5 tasks that it has completed finished in about 3 to 6 hours. Unless I hear something from the admin soon, I'll just abort (they are suspended right now). I'll look into running the optimized app when I return for a longer run of work.

jcastro
Volunteer developer
Volunteer tester
Project scientist
Joined: 16 Mar 15
Posts: 218
Credit: 14,859
RAC: 0
Message 576 - Posted: 20 Oct 2015, 9:42:59 UTC - in response to Message 575.

Hi ritterm!

In this project we have increased the length of each simulation. The three you have are the longest ones (the 3600 in the middle of the WU name is not a coincidence), which is why they take more time for you.
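
For readers who want to spot these longer work units in their own task list, here is a minimal sketch that pulls the middle field out of a task name. It assumes the `2XP_..._..._..._...` layout seen in the names posted in this thread; `length_field` is a hypothetical helper, and the meaning of the other fields is not stated here, so the code does not interpret them.

```python
def length_field(task_name: str) -> str:
    """Return the third underscore-separated field of a 2XP task name.

    Assumption: names look like '2XP_18101350_3600_1608_1'. Other batches
    (for example the GD_jcarro_... names) use a different layout and are
    rejected rather than guessed at.
    """
    fields = task_name.split("_")
    if fields[0] != "2XP" or len(fields) != 5:
        raise ValueError(f"unexpected task name layout: {task_name!r}")
    return fields[2]


# Task names quoted earlier in this thread.
for name in ("2XP_18101350_3600_1608_1",
             "2XP_18101350_3600_1653_1",
             "2XP_18101350_3600_2893_0"):
    print(name, "->", length_field(name))
```

All three names yield `3600`, the value referred to above.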

The differences between the stock app and the optimized one could cause some errors in the calculation of the task's cost.

Also, the last part of the simulation is when the results are stored in a file. This is a critical part of the program: if that part is interrupted, the task will need to restart from the last checkpoint.

Best regards, Joel.
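
To illustrate what "restart from the last checkpoint" means in general, here is a small generic sketch. It is not the project's application code: the file format, function names and the toy "simulation" are all assumptions made for the example. State is saved every few steps, and a run that stops between checkpoints repeats only the work done since the last save.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then atomically replace the old checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # the old file stays valid until this succeeds


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"step": 0, "total": 0.0}


def run(path, n_steps, checkpoint_every=100, stop_at=None):
    """Toy simulation: sums 0..n_steps-1, checkpointing periodically.

    stop_at simulates an interruption; the function then returns None.
    """
    state = load_checkpoint(path)
    while state["step"] < n_steps:
        if stop_at is not None and state["step"] == stop_at:
            return None  # interrupted: progress since last checkpoint is lost
        state["total"] += state["step"]  # stand-in for real simulation work
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    return state["total"]


with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "state.json")
    run(ckpt, 1000, stop_at=250)                 # stops at step 250
    resumed_from = load_checkpoint(ckpt)["step"]  # last saved step
    result = run(ckpt, 1000)                     # resumes and finishes
    print(resumed_from, result)
```

In this sketch the interrupted run resumes from step 200 rather than 250, so 50 steps are repeated, and the final result matches an uninterrupted run.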

ritterm
Joined: 8 Apr 15
Posts: 4
Credit: 111,560
RAC: 0
Message 577 - Posted: 20 Oct 2015, 10:37:04 UTC - in response to Message 576.

In this project we have increased the length of each simulation. The three you have are the longest ones (the 3600 in the middle of the WU name is not a coincidence), which is why they take more time for you...

Thanks, Joel. So, that might explain the long run time, but they have been "stuck" at 100% progress for several hours and they still use a full core while they are running. I'll be happy to let them finish, but I don't want to waste my time or further delay wingmen getting validated.

ritterm
Joined: 8 Apr 15
Posts: 4
Credit: 111,560
RAC: 0
Message 578 - Posted: 20 Oct 2015, 15:12:22 UTC - in response to Message 577.

...I don't want to waste my time or further delay wingmen getting validated.

These tasks ran far, far longer than they should have based on the run times of shorter tasks. I've aborted them.

jcastro
Volunteer developer
Volunteer tester
Project scientist
Joined: 16 Mar 15
Posts: 218
Credit: 14,859
RAC: 0
Message 579 - Posted: 20 Oct 2015, 16:33:18 UTC - in response to Message 578.

...I don't want to waste my time or further delay wingmen getting validated.

These tasks ran far, far longer than they should have based on the run times of shorter tasks. I've aborted them.


We will look at them to see if something is going wrong.

Regards, Joel.

HolgerXXX
Joined: 9 Apr 15
Posts: 4
Credit: 3,068
RAC: 0
Message 588 - Posted: 26 Oct 2015, 16:09:07 UTC

Should I let it run or abort?
It is at 97.434% after 15:27:27!

Name 2XP_16101430_3600_4300_1


Workunit 13820670
Created 16 Oct 2015, 12:34:40 UTC
Sent 16 Oct 2015, 14:13:24 UTC
Report deadline 27 Oct 2015, 3:33:24 UTC


Application version Carro-Rodriguez-Laguna-Pueyo Epicardial Model (Carro et al. 2011) for human ventricular cells v1.05
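
One rough way to weigh "run or abort" is to extrapolate the remaining time from the progress and elapsed time reported above and compare it with the time left before the report deadline. The sketch below assumes progress is linear in time and that the reading was taken when the message was posted (26 Oct 2015, 16:09:07 UTC). Both are assumptions, and earlier posts in this thread describe tasks sitting at 100% for hours, so treat the estimate with caution.

```python
from datetime import datetime, timedelta

# Values reported in the post above.
elapsed = timedelta(hours=15, minutes=27, seconds=27)
progress = 0.97434                       # 97.434 % complete

# Linear extrapolation (assumption): total = elapsed / fraction done.
estimated_total = elapsed / progress
remaining = estimated_total - elapsed

# Assumption: the reading was taken at the time the message was posted.
posted = datetime(2015, 10, 26, 16, 9, 7)
deadline = datetime(2015, 10, 27, 3, 33, 24)
time_to_deadline = deadline - posted

print("estimated remaining:", remaining)
print("time to deadline:   ", time_to_deadline)   # 11:24:17
print("fits before deadline:", remaining < time_to_deadline)
```

Under these assumptions the task would need roughly 24 more minutes, well inside the 11 hours 24 minutes left before the deadline.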

Membran
Joined: 25 Apr 15
Posts: 2
Credit: 57,962
RAC: 0
Message 649 - Posted: 6 Nov 2015, 9:04:19 UTC

Whenever I suspend the project and then resume it, the tasks run longer than normal.



Copyright © 2020 Universidad San Jorge