Myocyte v0.15 Beta

rjs5
Joined: 3 Nov 15
Posts: 22
Credit: 1,136,105
RAC: 0
Message 1789 - Posted: 29 Jul 2022, 17:11:35 UTC
Last modified: 29 Jul 2022, 17:12:04 UTC

v0.15 Beta WUs have been running on my Windows system, host 214476 (clx10980xe-rtx3090), for 4 hours each and indicate they are going to take another 3 days.

Seems like there might be a problem with the long run times.
ID: 1789 · Rating: 0
grumpy
Joined: 17 Jul 22
Posts: 9
Credit: 431,030
RAC: 2
Message 1790 - Posted: 29 Jul 2022, 18:38:16 UTC

v0.15 Beta WUs: had to abort all of those. Too aggressive on CPU; they would take forever on a Ryzen 9 3950.
ID: 1790 · Rating: 0
Penguin
Joined: 2 Jan 18
Posts: 4
Credit: 1,056,163
RAC: 1
Message 1792 - Posted: 30 Jul 2022, 1:46:29 UTC

If you look closer and deeper into Task Manager you will find the Windows Defender Malware app going bonkers and eating up CPU and affecting these DENIS v0.15 tasks. Linux crunching remains unaffected.
ID: 1792 · Rating: 0
grumpy
Joined: 17 Jul 22
Posts: 9
Credit: 431,030
RAC: 2
Message 1793 - Posted: 30 Jul 2022, 3:52:03 UTC - in response to Message 1792.  
Last modified: 30 Jul 2022, 3:56:59 UTC

Nothing to do with Windows Defender, because I'm not running it on my Win 11 machine.
Anyway, the WUs I have are ending in computation errors.
They would overrun the deadline!
ID: 1793 · Rating: 0
grumpy
Joined: 17 Jul 22
Posts: 9
Credit: 431,030
RAC: 2
Message 1794 - Posted: 30 Jul 2022, 4:23:52 UTC
Last modified: 30 Jul 2022, 4:24:21 UTC

These WUs are not behaving correctly. Had to force terminate them after BOINC had exited; they were still running in the background!
AVs will react if the tasks look like malware or virus activity.
ID: 1794 · Rating: 0
[VENETO] boboviz
Joined: 9 Apr 15
Posts: 155
Credit: 644,645
RAC: 0
Message 1795 - Posted: 30 Jul 2022, 4:41:48 UTC - in response to Message 1794.  

AVs will react if the tasks look like malware or virus activity.


+1
I'll try to stop Windows Defender to stop the scan...
ID: 1795 · Rating: 0
Dr Who Fan
Joined: 8 Apr 15
Posts: 30
Credit: 181,937
RAC: 0
Message 1796 - Posted: 30 Jul 2022, 6:26:30 UTC - in response to Message 1794.  

These WUs are not behaving correctly. Had to force terminate them after BOINC had exited; they were still running in the background!
AVs will react if the tasks look like malware or virus activity.

Had to do the SAME thing on all my Windows PCs (Vista, Win 7, Win 8.1 - all 64 bit). The tasks were still using memory & disk space, but not CPU, after aborting them in BOINC.

The Vista PC did not have any MS AV software to "interfere".
The Win 7 & Win 8.1 PCs do have MS AV, but I have the BOINC data folders specifically EXCLUDED from AV scans, and according to Process Explorer MS AV did not have the files "open" for scan, etc.

Have set all PC's to NO NEW WORK ON THE DENIS PROJECT UNTIL THINGS GET FIXED (probably next week some time since it's early Saturday AM in Spain).

ID: 1796 · Rating: 0
DaveW
Joined: 7 Jul 22
Posts: 7
Credit: 300,007
RAC: 0
Message 1797 - Posted: 30 Jul 2022, 8:09:36 UTC
Last modified: 30 Jul 2022, 8:21:53 UTC

Same problem here. Turn Defender off: https://support.microsoft.com/en-us/windows/turn-off-defender-antivirus-protection-in-windows-security-99e6004f-c54c-8509-773c-a4d776b77960

Further units are stalling after about 1.4%. Is it one particular batch of units that is affected?

NNT - back to PG for me. It was going so well.
ID: 1797 · Rating: 0
Jim1348
Joined: 28 Apr 15
Posts: 29
Credit: 1,426,883
RAC: 1
Message 1798 - Posted: 30 Jul 2022, 8:51:34 UTC - in response to Message 1789.  

They are running OK for me under Ubuntu 20.04.4.
They complete in the usual 51 minutes.
https://denis.usj.es/denisathome/results.php?hostid=215226&offset=0&show_names=0&state=4&appid=
ID: 1798 · Rating: 0
SEARCHER
Joined: 12 Apr 15
Posts: 4
Credit: 317,686
RAC: 0
Message 1799 - Posted: 30 Jul 2022, 11:23:43 UTC

Good Day and Hello,

Sorry, I now have many problems with the new WUs too. My Avira antivirus scanner makes big trouble about the checkpoints and reports them as a virus. Most WUs then break off after 6 hours. I am stopping Project DENIS now, because I am crunching with 4 machines for nothing and only get errors and virus alarms. I am moving to Project RAKE SEARCH.

Greetz SEARCHER
Member of CHARITY TEAM
Member of Team FREE TIBET/ TIBET LIBRE
ID: 1799 · Rating: 0
[VENETO] boboviz
Joined: 9 Apr 15
Posts: 155
Credit: 644,645
RAC: 0
Message 1800 - Posted: 30 Jul 2022, 13:46:49 UTC

I disabled the MS antivirus, but still very slow.
After 2hrs the wus are at 38% (other versions finished after 70 minutes)
ID: 1800 · Rating: 0
grumpy
Joined: 17 Jul 22
Posts: 9
Credit: 431,030
RAC: 2
Message 1801 - Posted: 30 Jul 2022, 14:28:52 UTC
Last modified: 30 Jul 2022, 14:30:36 UTC

Those (still BETA) 0.15 WUs show up with an estimated time of 24 minutes but then grow to hours on this computer. Anyway, they are terminated... next batch, please!
I use Kaspersky AV (not Defender) and it did react, but that was less than 4% of the total activity, no big deal.
As for the Unix-like OSes, that would be different code and may not react the same way.
I have no option selected to leave BOINC running in the background (as a service) or idling in memory, so the tasks should terminate with BOINC.
ID: 1801 · Rating: 0
Dirk Broer
Joined: 13 Apr 15
Posts: 10
Credit: 1,020,654
RAC: 0
Message 1802 - Posted: 30 Jul 2022, 15:26:46 UTC

Aborted mine (Beta of DENIS-myocyte v0.15 windows_x86_64) after they had come to a standstill (estimated running times up to 97 days!)
ID: 1802 · Rating: 0
rjs5
Joined: 3 Nov 15
Posts: 22
Credit: 1,136,105
RAC: 0
Message 1803 - Posted: 31 Jul 2022, 14:31:31 UTC - in response to Message 1798.  

They are running OK for me under Ubuntu 20.04.4.
They complete in the usual 51 minutes.
https://denis.usj.es/denisathome/results.php?hostid=215226&offset=0&show_names=0&state=4&appid=


Fedora 36 is working OK too. The two Windows machines are both failing. I have the Boinc Data directory exempted from Norton antivirus so I am pretty sure there is no antivirus involvement.

I ran the free version of Intel VTune on my system while it was running multiple DENIS WUs and did not see anything obvious. I will try that again tonight and look again. There has to be something that is different between Windows and Linux. The thing that comes to mind for me is the difference in the file systems. Linux will generally allow multiple opens on a file in cases where Windows will not.
ID: 1803 · Rating: 0
grumpy
Joined: 17 Jul 22
Posts: 9
Credit: 431,030
RAC: 2
Message 1804 - Posted: 31 Jul 2022, 17:12:13 UTC

My results show large stderr files saying "problem saving checkpoints"! Probably the problem.
ID: 1804 · Rating: 0
Jesús Carro
Project administrator
Project developer
Project scientist
Help desk expert
Joined: 18 Mar 15
Posts: 198
Credit: 452,469
RAC: 0
Message 1805 - Posted: 1 Aug 2022, 6:57:49 UTC

Hi!
We are experiencing problems with the checkpoint on Windows hosts. We tried to add a temporary file for the checkpoint to avoid corrupted checkpoints, but it fails when it tries to rename it. Because the checkpoint fails, it tries again on every iteration... which is why the application runs so slowly. I will upload a new version that solves this as fast as possible.
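
For readers unfamiliar with the pattern, here is a minimal sketch of "write to a temporary file, then rename it over the checkpoint". It is written in Python for brevity and is not the DENIS code; the function name `save_checkpoint` and the JSON format are assumptions made for illustration. One well-known pitfall, which may or may not be what happened here, is that a plain rename on Windows refuses to overwrite an existing destination file, so the sketch uses `os.replace`, which overwrites on both Windows and POSIX.

```python
import json
import os
import tempfile


def save_checkpoint(state, path):
    """Hypothetical helper: write `state` so that `path` is never left half-written."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final rename stays on one volume.
    fd, tmp_path = tempfile.mkstemp(prefix=".ckpt-", dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        # The temp file is closed here; os.replace overwrites an existing destination.
        os.replace(tmp_path, path)
    except BaseException:
        # Never leave a stray temp file behind if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A caller that sees this raise would normally skip that checkpoint and wait for the next scheduled one, rather than retrying on every iteration.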

Many thanks for the comments. Checking your tasks makes it easier to find the problem.

Best,
Jesús.
Jesús Carro
Universidad San Jorge
@InSilicoHeart
ID: 1805 · Rating: 0
rjs5
Joined: 3 Nov 15
Posts: 22
Credit: 1,136,105
RAC: 0
Message 1806 - Posted: 2 Aug 2022, 4:48:18 UTC - in response to Message 1805.  

Hi!
We are experiencing problems with the checkpoint on Windows hosts. We tried to add a temporary file for the checkpoint to avoid corrupted checkpoints, but it fails when it tries to rename it. Because the checkpoint fails, it tries again on every iteration... which is why the application runs so slowly. I will upload a new version that solves this as fast as possible.

Many thanks for the comments. Checking your tasks makes it easier to find the problem.

Best,
Jesús.



Myocyte v0.16 Beta runs as expected, and in the expected time, on both Windows 11 and Fedora Linux on my machines.

Checkpointing seems to be happening every 2 minutes which seems to be a little too frequent.
ID: 1806 · Rating: 0
[VENETO] boboviz
Joined: 9 Apr 15
Posts: 155
Credit: 644,645
RAC: 0
Message 1808 - Posted: 2 Aug 2022, 6:30:19 UTC - in response to Message 1806.  

Checkpointing seems to be happening every 2 minutes which seems to be a little too frequent.


With strange behaviour.
I restarted my PC with 4 WUs at 88%.
After the restart, 1 WU was at 88%; the other 3 restarted from 0%.
ID: 1808 · Rating: 0
DaveW
Joined: 7 Jul 22
Posts: 7
Credit: 300,007
RAC: 0
Message 1809 - Posted: 2 Aug 2022, 7:33:38 UTC - in response to Message 1808.  

After the restart, 1 WU was at 88%; the other 3 restarted from 0%.


Ouch. I'll continue to wait.
ID: 1809 · Rating: 0
Jesús Carro
Project administrator
Project developer
Project scientist
Help desk expert
Joined: 18 Mar 15
Posts: 198
Credit: 452,469
RAC: 0
Message 1811 - Posted: 2 Aug 2022, 11:17:49 UTC - in response to Message 1808.  

Checkpointing seems to be happening every 2 minutes which seems to be a little too frequent.


With strange behaviour.
I restarted my PC with 4 WUs at 88%.
After the restart, 1 WU was at 88%; the other 3 restarted from 0%.


Hi!
This is due to the checkpoint issue, but at least now it is detected and the task is reset, so you get credit for all the compute time. In some cases, on Windows, the checkpoint is not completely saved, which leaves a corrupted checkpoint. In previous versions, the process continued with the erroneous data and your result was discarded at validation. Now the program detects it and restarts the simulation so that the results you send are valid and you are given credit.
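
One common way to detect an incompletely saved checkpoint is to store a checksum alongside the data and verify it on load. The sketch below is in Python and is only an illustration of that idea; how DENIS actually detects corruption is not stated in this thread, and the names `encode_checkpoint` and `decode_checkpoint` and the file layout are hypothetical.

```python
import hashlib
import json


def encode_checkpoint(state):
    """Serialize `state` with a SHA-256 digest of the payload on the first line."""
    payload = json.dumps(state, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return digest + "\n" + payload


def decode_checkpoint(text):
    """Return the saved state, or None if the checkpoint is truncated or corrupted."""
    digest, sep, payload = text.partition("\n")
    if not sep:
        return None  # not even the digest line is complete
    if hashlib.sha256(payload.encode("utf-8")).hexdigest() != digest:
        return None  # payload does not match its digest
    return json.loads(payload)
```

When `decode_checkpoint` returns None, the caller restarts the simulation from the beginning instead of continuing with bad data, which matches the behaviour described above.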

We know that this behavior is not ideal, because in some cases it acts as if there were no checkpoint, but it is an improvement over previous versions. The tasks are not very long, so it is not a big problem, but we want to improve it. In the next version we will try a double checkpoint file system (while we create a new checkpoint, the previous one is kept, so the program always has at least one valid checkpoint). We tried this in a very simple way in version 0.15 and it didn't work, but we will improve it to make it more robust.
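
A double checkpoint scheme of this kind can be sketched as two alternating slots, each carrying a sequence number and a checksum. This is a Python illustration under stated assumptions, not the planned DENIS implementation: the slot file names, the JSON layout and the functions `load_latest` and `save_next` are all hypothetical.

```python
import hashlib
import json
import os

SLOTS = ("checkpoint_a.json", "checkpoint_b.json")  # hypothetical file names


def _read_slot(path):
    """Return (sequence, state) from one slot, or None if missing or corrupted."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            digest, sep, payload = f.read().partition("\n")
    except OSError:
        return None
    if not sep or hashlib.sha256(payload.encode("utf-8")).hexdigest() != digest:
        return None
    record = json.loads(payload)
    return record["seq"], record["state"]


def load_latest(directory):
    """Return (sequence, state) from the newest valid slot, or None if neither is valid."""
    valid = [r for r in (_read_slot(os.path.join(directory, s)) for s in SLOTS) if r]
    return max(valid, key=lambda r: r[0]) if valid else None


def save_next(directory, state):
    """Write to the slot that does NOT hold the newest valid checkpoint."""
    latest = load_latest(directory)
    seq = 0 if latest is None else latest[0] + 1
    # Even sequence numbers go to slot A, odd ones to slot B, so the newest
    # valid checkpoint is never the file being overwritten.
    target = os.path.join(directory, SLOTS[seq % 2])
    payload = json.dumps({"seq": seq, "state": state}, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    with open(target, "w", encoding="utf-8") as f:
        f.write(digest + "\n" + payload)
        f.flush()
        os.fsync(f.fileno())
    return seq
```

If a write is interrupted, the damaged slot fails its checksum and `load_latest` falls back to the other slot, so at most one checkpoint interval of work is lost and no rename is needed.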

The frequency of checkpointing is decided by the BOINC client. We do not control it. We only control at which points in the program it can happen.
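
The division of labour can be sketched as follows: the application reaches a safe point, asks whether a checkpoint is due, and only then saves. In the BOINC C/C++ API the corresponding calls are `boinc_time_to_checkpoint()` and `boinc_checkpoint_completed()`; the Python below is only a stand-in to show the control flow, and `FakeClient`, `run`, `save` and `advance` are hypothetical names, not BOINC bindings.

```python
import time


class FakeClient:
    """Hypothetical stand-in for the client-side checkpoint timer."""

    def __init__(self, interval_seconds, clock=time.monotonic):
        self.interval = interval_seconds  # chosen by the client/user, not the app
        self.clock = clock
        self.last = clock()

    def time_to_checkpoint(self):
        return self.clock() - self.last >= self.interval

    def checkpoint_completed(self):
        self.last = self.clock()


def run(steps, client, save, advance):
    """Run `steps` iterations, checkpointing only when the client says it is due."""
    saved_at = []
    for step in range(steps):
        advance(step)
        # Safe point: the state is consistent here, so a checkpoint is allowed.
        if client.time_to_checkpoint():
            save(step)
            client.checkpoint_completed()
            saved_at.append(step)
    return saved_at
```

The application decides where the safe point sits in the loop; how often the condition becomes true is outside its control.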

Best,
Jesús.
Jesús Carro
Universidad San Jorge
@InSilicoHeart
ID: 1811 · Rating: 0