Lots of WUs "Error While Computing"
Author | Message |
---|---|
Send message Joined: 8 Mar 16 Posts: 2 Credit: 5,523,829 RAC: 0 |
I hope it's just my computers (though I wouldn't know why), but nearly half of the WUs I received have returned "Error While Computing". The ones that do validate run for only a second. Could someone please tell me what's going on? |
Send message Joined: 18 Mar 15 Posts: 284 Credit: 2,748,608 RAC: 0 |
Hi Steve! Thank you for your feedback and sorry for the problems. We will analyze what is happening. Best, Jesús Jesús Carro Universidad San Jorge @InSilicoHeart |
Send message Joined: 5 Oct 15 Posts: 17 Credit: 1,335,501 RAC: 0 |
One of my computers successfully completed a couple units which took a couple of hours, as expected. Its stderr.txt output is over 2400 lines long. However, this computer also successfully completed units which took seconds, and their stderr.txt is only about 50 lines. Here is an example: Name GD_jcarro_20160708110503000000_SecondSimulations_SteadyState4000_conf_917.xml_0 Workunit 18800550 Created 8 Jul 2016, 9:05:04 UTC Sent 8 Jul 2016, 11:33:53 UTC Report deadline 22 Jul 2016, 11:33:53 UTC Received 8 Jul 2016, 11:34:25 UTC Server state Over Outcome Success Client state Done Exit status 0 (0x0) Computer ID 62844 Run time 2 sec CPU time Validate state Valid Credit 0.01 Device peak FLOPS 2.60 GFLOPS Application version Carro-Rodriguez-Laguna-Pueyo Epicardial Model (Carro et al. 2011) for human ventricular cells v1.06 Stderr output <core_client_version>7.2.47</core_client_version> <![CDATA[ <stderr_txt> MName:CRLP2011_EPI MID:0 OpT:12000000.000000 DT:0.002000 OutFreq:50 InT:11996000.000000 NumConstToChange:15 NumStatesToPrint:1 NumAlgToPrint:0 CC ID:16 NAME: G_Na in component Fast_Na_Current VALUE:13.5582 CC ID:17 NAME: G_Na_B in component Background_Na_Current VALUE:0.000690863 CC ID:23 NAME: G_Kr in component Rapidly_Activating_K_Current VALUE:0.0269904 CC ID:24 NAME: G_Ks in component Slowly_Activating_K_Current VALUE:0.00349788 CC ID:25 NAME: G_Kp in component Plateau_K_Current VALUE:0.00157695 CC ID:26 NAME: G_to in component Transient_Outward_K_Current VALUE:0.124306 CC ID:28 NAME: G_K1 in component Inward_Rectifier_K_Current VALUE:0.609042 CC ID:29 NAME: G_ClCa in component Ca_Activated_Cl_Current VALUE:0.0414283 CC ID:31 NAME: G_Cl_B in component Background_Cl_Current VALUE:0.0064337 CC ID:34 NAME: G_Ca in component L_Type_Calcium_Current VALUE:0.000175819 CC ID:47 NAME: G_Ca_B in component Background_Ca_Current VALUE:0.000676662 CC ID:19 NAME: Ibar_NaK in component Na_K_Pump_Current VALUE:0.907297 CC ID:44 NAME: Ibar_NCX in component Na_Ca_Exchanger_Current 
VALUE:5.70955 CC ID:46 NAME: Ibar_PMCA in component Sarcolemmal_Ca_Pump_Current VALUE:0.0681847 CC ID:7 NAME: I_Stim_CL in component membrane VALUE:4000 STP ID:0 - V in component membrane CONFIG END SolveModel 388 SolveModel 406 NUMC2CHANGE: 15 SolveModel 410 ITER: 0 , 16 --- 1.355820e+001 SolveModel 410 ITER: 1 , 17 --- 6.908630e-004 SolveModel 410 ITER: 2 , 23 --- 2.699040e-002 SolveModel 410 ITER: 3 , 24 --- 3.497880e-003 SolveModel 410 ITER: 4 , 25 --- 1.576950e-003 SolveModel 410 ITER: 5 , 26 --- 1.243060e-001 SolveModel 410 ITER: 6 , 28 --- 6.090420e-001 SolveModel 410 ITER: 7 , 29 --- 4.142830e-002 SolveModel 410 ITER: 8 , 31 --- 6.433700e-003 SolveModel 410 ITER: 9 , 34 --- 1.758190e-004 SolveModel 410 ITER: 10 , 47 --- 6.766620e-004 SolveModel 410 ITER: 11 , 19 --- 9.072970e-001 SolveModel 410 ITER: 12 , 44 --- 5.709550e+000 SolveModel 410 ITER: 13 , 46 --- 6.818470e-002 SolveModel 410 ITER: 14 , 7 --- 4.000000e+003 SolveModel 413 PRINTABLE_STATE ID:0 07:33:49 (5300): called boinc_finish(0) </stderr_txt> ]]> |
Send message Joined: 18 Mar 15 Posts: 284 Credit: 2,748,608 RAC: 0 |
Yes, we have detected a problem with the SteadyState2000 and SteadyState4000 files. We are working to find out what is happening. It works properly when run locally. Best, Jesús. Jesús Carro Universidad San Jorge @InSilicoHeart |
Send message Joined: 18 Mar 15 Posts: 284 Credit: 2,748,608 RAC: 0 |
Hi! We have uploaded a new version in which the bug is fixed. If you see it again, please report it here. Thank you very much! Best, Jesús. Jesús Carro Universidad San Jorge @InSilicoHeart |
Send message Joined: 1 Jul 15 Posts: 2 Credit: 243,560 RAC: 0 |
I also got one result with an error while computing: GD_jcarro_20160714201100000000_ThirdSimulations_SteadyState1000Schmidt98_conf_474.xml_0 CPU time: 23 hours <message> exceeded elapsed time limit 92043.99 (368000.00G/3.15G) </message> Matthias |
Send message Joined: 11 Jul 16 Posts: 1 Credit: 78,376 RAC: 0 |
I am having a lot of very long wu end with error while computing. If you need any info let me know. <core_client_version>7.2.42</core_client_version> <![CDATA[ <message> process exited with code 193 (0xc1, -63) </message> <stderr_txt> MName:CRLP2011_EPI MID:0 OpT:24000000.000000 DT:0.002000 OutFreq:50 InT:23992000.000000 NumConstToChange:21 NumStatesToPrint:2 NumAlgToPrint:0 CC ID:16 NAME: G_Na in component Fast_Na_Current VALUE:19.49890136 CC ID:17 NAME: G_Na_B in component Background_Na_Current VALUE:0.000473911734 CC ID:23 NAME: G_Kr in component Rapidly_Activating_K_Current VALUE:0.0333725 CC ID:24 NAME: G_Ks in component Slowly_Activating_K_Current VALUE:0.003911159 CC ID:25 NAME: G_Kp in component Plateau_K_Current VALUE:0.002040128 CC ID:26 NAME: G_to in component Transient_Outward_K_Current VALUE:0.13697164 CC ID:28 NAME: G_K1 in component Inward_Rectifier_K_Current VALUE:0.7089200967 CC ID:29 NAME: G_ClCa in component Ca_Activated_Cl_Current VALUE:0.069126099438 CC ID:31 NAME: G_Cl_B in component Background_Cl_Current VALUE:0.01069641 CC ID:34 NAME: G_Ca in component L_Type_Calcium_Current VALUE:0.0001851241056 CC ID:47 NAME: G_Ca_B in component Background_Ca_Current VALUE:0.0006057563114 CC ID:19 NAME: Ibar_NaK in component Na_K_Pump_Current VALUE:1.05791202 CC ID:44 NAME: Ibar_NCX in component Na_Ca_Exchanger_Current VALUE:4.823514 CC ID:46 NAME: Ibar_PMCA in component Sarcolemmal_Ca_Pump_Current VALUE:0.0557848354 CC ID:12 NAME: J_Ca_juncsl in component membrane VALUE:8.0706556422e-13 CC ID:13 NAME: J_Ca_slmyo in component membrane VALUE:4.0001812468e-12 CC ID:60 NAME: k_SR_leak in component SR_Fluxes VALUE:6.513019016e-06 CC ID:55 NAME: ks in component SR_Fluxes VALUE:28.42225 CC ID:58 NAME: V_max_SR_CaP in component SR_Fluxes VALUE:0.0050332844732 CC ID:7 NAME: I_Stim_CL in component membrane VALUE:8000 CC ID:33 NAME: Ca_o in component Calcium_Concentrations VALUE:2.5 STP ID:0 - V in component membrane STP ID:23 - Ca_i in component 
Calcium_Concentrations CONFIG END Sniped a lot of CP Doing CP It:6132953918.000000 Doing CP It:6138905580.000000 Doing CP It:6144935804.000000 Doing CP It:6151107187.000000 Doing CP It:6157221981.000000SIGSEGV: segmentation violation Stack trace (8 frames): ../../projects/denis.usj.es_denisathome/CRLP2011EPI_107_x86_64-pc-linux-gnu(boinc_catch_signal+0x57)[0x485117] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7ff571a65330] /lib/x86_64-linux-gnu/libc.so.6(_IO_vfprintf+0x2d)[0x7ff5716d9c6d] /lib/x86_64-linux-gnu/libc.so.6(_IO_fprintf+0x87)[0x7ff5716e4337] ../../projects/denis.usj.es_denisathome/CRLP2011EPI_107_x86_64-pc-linux-gnu[0x40ff61] ../../projects/denis.usj.es_denisathome/CRLP2011EPI_107_x86_64-pc-linux-gnu[0x41d309] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7ff5716b1f45] ../../projects/denis.usj.es_denisathome/CRLP2011EPI_107_x86_64-pc-linux-gnu[0x4048b9] Exiting... </stderr_txt> ]]> |
Send message Joined: 16 Mar 15 Posts: 219 Credit: 14,859 RAC: 0 |
Hi! This bug should be fixed in version 1.08 of the CRLP2011 applications. Please refresh your jobs. Best regards, Joel. |
Send message Joined: 21 May 15 Posts: 5 Credit: 1,874,902 RAC: 0 |
A lot of wasted hours of CPU time... I'm not sure I should trust this project. My work was listed as an error, but 2 Linux users completed theirs in a much shorter time frame. There should be a way to keep the OSes separate. |
Send message Joined: 5 Oct 15 Posts: 17 Credit: 1,335,501 RAC: 0 |
I don't think it's quite fixed. On a Linux box: GD_jcarro_20160714201445000000_ThirdSimulations_SteadyState2000Schmidt98_conf_93.xml_2 Outcome Computation error Client state Compute error Exit status 193 (0xc1) EXIT_SIGNAL Computer ID 62747 Run time 1 days 1 hours 5 min 53 sec CPU time 1 days 0 hours 20 min 58 sec Validate state Invalid Credit 0.00 Device peak FLOPS 0.77 GFLOPS Application version Carro-Rodriguez-Laguna-Pueyo Epicardial Model (Carro et al. 2011) for human ventricular cells v1.08 Peak working set size 4.36 MB Peak swap size 14.12 MB Peak disk usage 0.05 MB Doing CP It:2185491915.000000 Doing CP It:2187622491.000000 Doing CP It:2189763663.000000 Doing CP It:2191921198.000000 Doing CP It:2194083377.000000 Doing CP It:2196235201.000000SIGSEGV: segmentation violation Stack trace (8 frames): ../../projects/denis.usj.es_denisathome/CRLP2011EPI_108_x86_64-pc-linux-gnu(boinc_catch_signal+0x57)[0x4ca917] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10d10)[0x7f8fe7dadd10] /lib/x86_64-linux-gnu/libc.so.6(_IO_vfprintf+0x24)[0x7f8fe7a1cc44] /lib/x86_64-linux-gnu/libc.so.6(_IO_fprintf+0x87)[0x7f8fe7a27b97] ../../projects/denis.usj.es_denisathome/CRLP2011EPI_108_x86_64-pc-linux-gnu[0x41b225] ../../projects/denis.usj.es_denisathome/CRLP2011EPI_108_x86_64-pc-linux-gnu[0x462b0f] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f8fe79f3a40] ../../projects/denis.usj.es_denisathome/CRLP2011EPI_108_x86_64-pc-linux-gnu[0x4048b9] Exiting... </stderr_txt> ]]> There are 4 related units, 2 are In Progress, and 2 also errored. One of the errors was also on version 1.08. |
Send message Joined: 16 Mar 15 Posts: 219 Credit: 14,859 RAC: 0 |
Hi, we will take it into account. It seems to be related to something in the calculations inside the simulation. We will upgrade our app as soon as possible. Best regards, Joel. |
Send message Joined: 21 May 15 Posts: 5 Credit: 1,874,902 RAC: 0 |
http://denis.usj.es/denisathome/result.php?resultid=38323222 Why do all the results lately seem to fail after a couple of days on a Windows PC? If Windows is going to fail, do not provide WUs for it. That is on your end. I am using 7.6.22 and I have updated my PC's OS and video drivers, so I am current. I am letting one of the 1.08 tasks continue to see if it completes as expected, but it still says 1 day and 7 hours to go after 8 hours of work. Not at all like the Linux boxes. |
Send message Joined: 12 Apr 15 Posts: 1 Credit: 178,729 RAC: 0 |
Task 38376354 failed on a Linux system, app v1.08: Compute error for a SteadyState3000 simulation. Exit status 193 (0xc1) EXIT_SIGNAL Run time 1 days 0 hours 32 min 10 sec CPU time 17 hours 57 min SIGSEGV: segmentation violation Task 38355439 also failed after 4 days 21 hours, albeit on app v1.07. That system still performs reasonably well: 36 Valid, 3 Invalid and only 2 ended in an error. |
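
For anyone comparing failed tasks, the stderr dumps quoted in this thread can be summarized with a short script. The following is only a minimal sketch in Python, not anything provided by the project: the function name `summarize_stderr` and the shortened `SAMPLE` text are hypothetical, and it assumes that `OpT` is the total simulated time, `DT` the time step, and `Doing CP It:` the iteration count at a checkpoint. None of those meanings is confirmed by the project staff here. The exit code handling only reproduces the arithmetic already shown in the reports, where 193 is printed as 0xc1 and, read as a signed byte, as -63.

```python
import re

# Shortened, hypothetical excerpt in the same format as the dumps quoted above.
SAMPLE = """<message> process exited with code 193 (0xc1, -63) </message>
MName:CRLP2011_EPI MID:0 OpT:24000000.000000 DT:0.002000 OutFreq:50
CC ID:16 NAME: G_Na in component Fast_Na_Current VALUE:19.49890136
CC ID:7 NAME: I_Stim_CL in component membrane VALUE:8000
Doing CP It:6151107187.000000 Doing CP It:6157221981.000000SIGSEGV: segmentation violation
"""


def summarize_stderr(text):
    """Extract a few fields from a stderr dump in the format seen in this thread."""
    summary = {}

    # Exit code as printed by the client: "process exited with code 193 (...)".
    match = re.search(r"exited with code (\d+)", text)
    if match:
        code = int(match.group(1))
        summary["exit_code"] = code
        summary["exit_code_hex"] = hex(code)
        # The same byte read as a signed 8-bit value (193 becomes -63).
        summary["exit_code_signed"] = code - 256 if code > 127 else code

    # Did the application report a segmentation fault?
    summary["segfault"] = "SIGSEGV" in text

    # Changed constants: "NAME: <name> in component <component> VALUE:<number>".
    pattern = r"NAME:\s*(\S+) in component (\S+)\s+VALUE:\s*([-+0-9.eE]+)"
    summary["constants"] = {
        name: float(value) for name, _component, value in re.findall(pattern, text)
    }

    # Assumption: iterations * DT / OpT approximates the fraction completed.
    iterations = re.findall(r"Doing CP It:(\d+(?:\.\d+)?)", text)
    total = re.search(r"\bOpT:([0-9.]+)", text)
    step = re.search(r"\bDT:([0-9.]+)", text)
    if iterations and total and step:
        last_iteration = float(iterations[-1])
        summary["last_checkpoint_iteration"] = last_iteration
        summary["approx_fraction_done"] = (
            last_iteration * float(step.group(1)) / float(total.group(1))
        )

    return summary


summary = summarize_stderr(SAMPLE)
print(summary)
```

If the assumptions about `OpT` and `DT` hold, the `approx_fraction_done` value gives a rough idea of how far a task had progressed before it crashed, which may help when comparing reports across hosts and application versions.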