New version 0.05 of DENIS-Fiber_beta
Author | Message |
---|---|
Joined: 8 Jul 22 Posts: 44 Credit: 1,009,628 RAC: 1,011 |
Just got some version 0.05, not started yet. Should version 0.04 units be aborted? Paul. |
Joined: 5 Apr 25 Posts: 53 Credit: 290,503 RAC: 11,761 |
I wouldn't think so, but I'm curious and waiting for an official update from the research team. I'm seeing similar runtimes, even though BOINC's estimate before starting is a bit lower than for v0.04 tasks. |
Joined: 6 Mar 23 Posts: 71 Credit: 2,442,785 RAC: 1,158 |
"Just got some version 0.05, not started yet." That does not happen to me. When I get a supply of work units, I almost always complete them long before another set becomes available. In the old days, they expired in less than a week, but I could almost always complete them in about three days, so I had no Denis work more than half the time. These new (0.04 and 0.05) work units have a later expiry date (about three weeks), so I get far fewer units than before. And they take about 7.5 hours each to run. My app_config file allows three of these tasks to run at a time, so I still run out long before the next batch arrives. So it looks as though I will never have the problem implied by your question. It makes me wonder how you can get so much work on this project as to not finish a set before the next set arrives. |
Joined: 25 Dec 23 Posts: 6 Credit: 2,275,795 RAC: 6,500 |
"Just got some version 0.05, not started yet." I wouldn't abort any work units... Because the work is intermittent I suspect people hog the work units and download what they think will be enough to get them through the next scarcity period. Given the long deadline to these workunits people will be hogging even more. It would be nice if there were a more even distribution of work for all... I just crank away at other projects when there is no Denis to be had. |
Joined: 5 Apr 25 Posts: 53 Credit: 290,503 RAC: 11,761 |
Indeed, I'm one of the hoggers too. I don't have access to sci-fi computers that crunch these long tasks in 7-8 hours, so it's a long grind to get through my reserves. The exception is my dedicated miniPC that ran out of v0.04 just as v0.05 got sent. The laptops are taking much longer and they have 4 or 8 threads instead of 16. |
Joined: 6 Mar 23 Posts: 71 Credit: 2,442,785 RAC: 1,158 |
"I wouldn't abort any work units..." I agree, unless they have expired already. "Because the work is intermittent I suspect people hog the work units and download what they think will be enough to get them through the next scarcity period." For earlier tasks (0.01 to 0.02) this would not have worked for me: I ran through a batch in about 3 days but got new work only about once a week, and any extra work I fetched would have timed out, since those tasks expired in about three days. "Given the long deadline to these workunits people will be hogging even more. It would be nice if there were a more even distribution of work for all... I just crank away at other projects when there is no Denis to be had." Except for WCG, if I diddle the amount of work the BOINC client fetches, the setting applies to all projects, so if I set it to get more work for Denis, it would get more for all the others too. That is no problem for ClimatePrediction, which does not send out work, but for the others it would be a mistake. |
Joined: 8 Jul 22 Posts: 44 Credit: 1,009,628 RAC: 1,011 |
"It makes me wonder how you can get so much work on this project as to not finish a set before the next set arrives." I got some SRbase tasks with a 3-day deadline and an estimated runtime of 30 seconds that run ~20 hours on these old PCs. Paul. |
Joined: 29 Sep 24 Posts: 10 Credit: 35,700 RAC: 918 |
"Just got some version 0.05, not started yet." The admins have the ability to cancel tasks on the project side, which will result in them being aborted with a "Server Cancelled" status the next time your client checks in. Until they do this, I would assume that the admins want to receive the results of the WUs that they've sent out. |
Joined: 27 Mar 24 Posts: 8 Credit: 27,624 RAC: 248 |
@Jean-David Beyer. It's not necessarily about grabbing lots of tasks. BOINC is made up of whatever computers people have available to make a contribution. Personally I have a Raspberry Pi working 24/7 that is currently working on two 0.05 Denis tasks, and which recently returned a 0.04 task after several days. I also have my work PC working on two 0.04 tasks and one 0.05 task. It's fast enough for my needs, but very far from the latest and greatest hardware. It's further slowed because I throttle it with TThrottle so I don't have to listen to the fan screaming its head off whilst I work, which also prolongs the life of the CPU of a computer that doesn't belong to me (by keeping the CPU temperature down). Also, electricity where I live is very expensive, so there is a financial incentive to minimising its use. Some of the tasks I picked up were part of a large number of tasks that someone had aborted, so the tortoises will be beating the hare. |
Joined: 8 Jul 22 Posts: 44 Credit: 1,009,628 RAC: 1,011 |
Any official word on whether version 0.04 units should be aborted? Paul. |
Joined: 21 Nov 24 Posts: 21 Credit: 63,260 RAC: 774 |
Hi! No need to abort. The only change from the 0.04 to the 0.05 beta version is additional output to the stderr.txt file, so that we can see where some checkpoints are failing (only 9 across all tasks so far). Best regards, Iván. |
Joined: 6 Mar 23 Posts: 71 Credit: 2,442,785 RAC: 1,158 |
These are all from my Linux machine (my biggest and fastest). 0.01, 0.02, and 0.03 tasks ran very fast and I did a lot of them. I could do them in about half a week, and then there were none until the next week. I seem to remember these tasks took about 20 minutes each. 0.04 and 0.05 tasks were larger and seemed to take about 7 1/2 hours each, so the server gave me fewer of these. But since they had a much later expiry date, I could have completed at least twice as many as the server chose to give me. The server seems to think I could do around 500 tasks per day, but gives me only about 15 per week. Very disappointing. Beta of DENIS-fiber 0.01 x86_64-pc-linux-gnu: Number of tasks completed: 3293; Max tasks per day: 3786; Number of tasks today: 0; Consecutive valid tasks: 3286; Average processing rate: 6.52 GFLOPS; Average turnaround time: 1.09 days. Beta of DENIS-fiber 0.02 x86_64-pc-linux-gnu: Number of tasks completed: 4008; Max tasks per day: 4502; Number of tasks today: 0; Consecutive valid tasks: 4002; Average processing rate: 6.70 GFLOPS; Average turnaround time: 0.99 days. Beta of DENIS-fiber 0.03 x86_64-pc-linux-gnu: Number of tasks completed: 2076; Max tasks per day: 2576; Number of tasks today: 0; Consecutive valid tasks: 2076; Average processing rate: 6.21 GFLOPS; Average turnaround time: 1.47 days. Beta of DENIS-fiber 0.04 x86_64-pc-linux-gnu: Number of tasks completed: 15; Max tasks per day: 515; Number of tasks today: 0; Consecutive valid tasks: 15; Average processing rate: 6.33 GFLOPS; Average turnaround time: 0.97 days. Beta of DENIS-fiber 0.05 x86_64-pc-linux-gnu: Number of tasks completed: 16; Max tasks per day: 516; Number of tasks today: 0; Consecutive valid tasks: 16; Average processing rate: 6.31 GFLOPS; Average turnaround time: 0.97 days. |
Joined: 12 Mar 23 Posts: 3 Credit: 4,935,469 RAC: 2,280 |
A little report from my side: To test the resilience of checkpointing on the 75 v0.05 tasks I got, I ran a loop that restarted BOINC via "systemctl restart boinc-client.service" every 21 minutes, with checkpointing configured for every 10 minutes. So the tasks cycled through loading the last checkpoint, running for 2 x 10 minutes, and saving a checkpoint; at minute 21 BOINC shut down, terminating all running tasks, including their processes. Immediately afterwards BOINC started up again and all tasks were restarted, loading the last checkpoint, having lost only around 30 seconds of CPU time since the last save. This repeated for around 7.5 hours. It is logged in the stderr.txt visible in the task view, e.g. https://denis.usj.es/denisathome/result.php?resultid=60550862 until purge. Result: no errors, checkpoints worked perfectly and all result files had the same hash, so no random miscalculation between any of them! Hashes: 280ea3d133fb61739e7673ca2b0172bc DENIS_Fiber_Beta_20250603094325541131_InitialTest_k_1-Test_0-conf_672.xml; 6b77f91bf95eea017891fb1e167a99f0 DENIS_Fiber_Beta_20250603094325541131_InitialTest_k_1-Test_0-conf_672_1_r718134832_0; bcb751d6c140566a52fccd7da3d5452e DENIS_Fiber_Beta_20250603094325541131_InitialTest_k_1-Test_0-conf_672_1_r718134832_1 (I blocked the upload of results until all were finished so that I could run md5sum * \| sort on them.) |
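
For anyone who wants to repeat the hash comparison from the last post on a machine without md5sum, here is a minimal Python sketch of the same idea. It is not the poster's script: the function names `md5_of`, `hash_listing` and `group_by_digest` are made up for illustration, and it assumes the result files have been collected into a single directory. It prints one line per file, sorted by digest, roughly like the md5sum-and-sort pipeline, so files with identical content end up next to each other.

```python
import hashlib
import sys
from collections import defaultdict
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_listing(directory: Path) -> list[tuple[str, str]]:
    """Return (digest, file name) pairs for regular files, sorted by digest."""
    pairs = [(md5_of(p), p.name) for p in directory.iterdir() if p.is_file()]
    return sorted(pairs)


def group_by_digest(listing: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each digest to the names of all files that share it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for digest, name in listing:
        groups[digest].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical usage: pass the directory holding the collected result files.
    target = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    for digest, name in hash_listing(target):
        print(f"{digest}  {name}")
```

If several tasks are expected to produce identical output, `group_by_digest` shows at a glance whether they collapsed into one digest or split into several.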