Problems in the server

Message boards : News : Problems in the server
Jesús Carro
Project administrator
Project developer
Project scientist
Joined: 18 Mar 15
Posts: 264
Credit: 493,295
RAC: 73
Message 1235 - Posted: 22 Jun 2017, 15:27:37 UTC

Dear volunteers,
Over the past few days we have had problems with the database because of an update. A lot of information was corrupted in the process. We have recovered a backup copy, but it is a bit old.
We are working to get everything running properly again as soon as possible.

Please be patient, and thank you in advance.

Best regards.
Jesús Carro
Universidad San Jorge
@InSilicoHeart
ID: 1235 · Rating: 0

TheFiend
Joined: 7 Nov 15
Posts: 10
Credit: 23,581,533
RAC: 57,155
Message 1236 - Posted: 22 Jun 2017, 15:46:01 UTC - in response to Message 1235.  

My stats have dropped back to 2016 levels... I had 3.186 million, now it shows 1.925 million :(
ID: 1236 · Rating: 0

Pakal
Joined: 18 Aug 15
Posts: 3
Credit: 106,678
RAC: 0
Message 1237 - Posted: 22 Jun 2017, 21:16:48 UTC

I had one task running and a few more waiting to run. I clicked "Update" in BOINC Manager and all tasks from DENIS were cancelled.
ID: 1237 · Rating: 0

Michael H.W. Weber
Joined: 9 Apr 15
Posts: 11
Credit: 327,393
RAC: 3,050
Message 1238 - Posted: 22 Jun 2017, 21:55:21 UTC

You have deleted months of work.

Michael.
President of Rechenkraft.net, Principal Investigator of the RNA World distributed supercomputer.

ID: 1238 · Rating: 0

Dr Who Fan
Joined: 8 Apr 15
Posts: 32
Credit: 332,000
RAC: 66
Message 1239 - Posted: 22 Jun 2017, 22:22:31 UTC

I HAD 94,882 credits the day before the crash and lost at least 66,781 according to my BOINCstats tracking page.

ID: 1239 · Rating: 0

Dirk Broer
Joined: 13 Apr 15
Posts: 13
Credit: 1,435,790
RAC: 4,022
Message 1240 - Posted: 22 Jun 2017, 22:33:16 UTC - in response to Message 1239.  

Some people have lost more than 1,000,000 credits; look it up on BOINCStats and/or FreeDC...
ID: 1240 · Rating: 0

Dirk Broer
Joined: 13 Apr 15
Posts: 13
Credit: 1,435,790
RAC: 4,022
Message 1241 - Posted: 23 Jun 2017, 1:07:57 UTC - in response to Message 1235.  

Dear volunteers,
Over the past few days we have had problems with the database because of an update. A lot of information was corrupted in the process. We have recovered a backup copy, but it is a bit old.
We are working to get everything running properly again as soon as possible.

Please be patient, and thank you in advance.

Best regards.


Next time you update, do a backup first...
In the meantime: Good luck with the restore, I hope the project hasn't lost data.
ID: 1241 · Rating: 0

UBT - Timbo
Joined: 9 Apr 15
Posts: 5
Credit: 523,971
RAC: 986
Message 1242 - Posted: 23 Jun 2017, 1:27:07 UTC - in response to Message 1241.  

Hear, hear!!!

Lesson no. 1: make a backup BEFORE any update, routine OR important.
Lesson no. 2: make sure the backup can be restored (onto a spare machine if need be).
Lesson no. 3: rotate backups on a very regular basis. If your last backup is 6+ months old, your disaster recovery procedures need rewriting and your IT dept head should be changed.
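
To make the three lessons a bit more concrete, here is a minimal Python sketch. It is not the project's actual procedure: it assumes the database has already been exported to a single dump file, and the function names and the `.bak` naming scheme are made up for illustration. A checksum only proves that the copy is intact, so a real test restore onto a spare machine is still needed.

```python
import hashlib
import shutil
import time
from pathlib import Path
from typing import Optional


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_before_update(source: Path, backup_dir: Path) -> Path:
    """Lessons 1 and 2: copy the dump aside and verify the copy before updating.

    `source` is a hypothetical dump file; the naming scheme is an assumption.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%dT%H%M%S", time.gmtime())
    target = backup_dir / f"{source.name}.{stamp}.bak"
    # copyfile does not preserve timestamps, so the copy's mtime is "now".
    shutil.copyfile(source, target)
    if sha256_of(source) != sha256_of(target):
        raise RuntimeError(f"backup {target} does not match {source}")
    return target


def newest_backup_age_days(backup_dir: Path, now: Optional[float] = None) -> float:
    """Lesson 3: report how old the newest backup is, so staleness gets noticed."""
    if not backup_dir.is_dir():
        return float("inf")
    backups = list(backup_dir.glob("*.bak"))
    if not backups:
        return float("inf")
    newest = max(path.stat().st_mtime for path in backups)
    current = time.time() if now is None else now
    return (current - newest) / 86400.0
```

A daily job could call `newest_backup_age_days` and raise an alarm when the result is above, say, 1.5 days, which would catch a silently failing backup script long before 6 months have passed.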

I hope people will be understanding... but losing many people's time and money (electricity bills, hardware etc.) makes recommending this project more difficult if the data is lost due to negligence.

Such a shame :-(

Hope you can resolve the issues.

Tim
Founder, UK BOINC Team
ID: 1242 · Rating: 0

Michael H.W. Weber
Joined: 9 Apr 15
Posts: 11
Credit: 327,393
RAC: 3,050
Message 1243 - Posted: 23 Jun 2017, 4:28:52 UTC
Last modified: 23 Jun 2017, 4:31:25 UTC

You need a DAILY backup of your entire data, and you need to rotate the storage of these backups so that potentially bad data doesn't overwrite a good backup. You did not even check your server status for at least 2 days; otherwise you would have noticed much earlier that your entire website was gone, too.

Obviously, you have not made backups for more than 6 months, as can be derived from the stats loss. All OUR energy resources which we invested in your research progress have therefore been wasted if you do not come up with a backup.
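
One common way to rotate daily backups so that a bad dump cannot push out every good one is a tiered retention policy: keep all recent dailies, plus the newest backup of each recent week and month. The Python sketch below only illustrates that idea; the function names and the default tier sizes (7 daily, 4 weekly, 6 monthly) are assumptions, not anything the project uses.

```python
from datetime import date


def _newest_per_period(ordered, key_fn, limit):
    """Pick the newest backup in each of the `limit` most recent periods.

    `ordered` must be sorted newest first.
    """
    chosen = {}
    for day in ordered:
        key = key_fn(day)
        if key not in chosen:
            if len(chosen) == limit:
                break
            chosen[key] = day
    return set(chosen.values())


def backups_to_keep(backup_dates, today, daily=7, weekly=4, monthly=6):
    """Return the set of backup dates to retain; everything else may be pruned."""
    ordered = sorted(set(backup_dates), reverse=True)
    keep = set()
    # Daily tier: every backup taken within the last `daily` days.
    keep.update(d for d in ordered if 0 <= (today - d).days < daily)
    # Weekly tier: newest backup of each recent ISO week (year, week number).
    keep.update(_newest_per_period(ordered, lambda d: tuple(d.isocalendar()[:2]), weekly))
    # Monthly tier: newest backup of each recent calendar month.
    keep.update(_newest_per_period(ordered, lambda d: (d.year, d.month), monthly))
    return keep
```

With these defaults a corrupted dump can only replace the last few days; the weekly and monthly copies stay untouched, so there is always an older, known-good state to go back to.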

The least one can expect, however, is a clean restoration of the credits plus badges which all have gone.

And one more thing: I expect you to make the optimized apps the STANDARD for all platforms to stop wasting our energy resources! The changes to your original code and their testing are well documented, and they consist of profoundly more professional code than your original (e.g. GROMACS floating point operation routines, if I remember correctly).

For many years I have witnessed this ignorance of what I like to call "ivory tower researchers" working in multiple research fields, who on the one hand believe they have all the wisdom in the world while, in fact, they are not even capable of evaluating a code optimization which is suggested to them free of charge and would boost their research throughput profoundly while saving our energy.
I met high energy physicists who laugh at the distributed computing community and continue to waste taxpayers' money on high performance cluster setup and maintenance, not realizing that these HPC systems, while suitable (and sometimes even required) for some (limited types of) tasks, are basically a "fly's fart in the wind" compared to the compute power the DC community offers for free.
The same is true for many biotechnologists.

If you don't comply with the above-mentioned requirements, you will probably need a vast amount of money in the very near future to fund the re-gathering of all the data that has now been lost, then using a costly HPC system.
Because we will be out.

Michael.
President of Rechenkraft.net, Principal Investigator of the RNA World distributed supercomputer.

ID: 1243 · Rating: 0

Dr Who Fan
Joined: 8 Apr 15
Posts: 32
Credit: 332,000
RAC: 66
Message 1244 - Posted: 23 Jun 2017, 4:36:49 UTC

Until things have finally been FULLY RESTORED to where they were pre-database/server crash I have decided to QUIT SUPPORTING THIS PROJECT with my time and electricity.

As said by UBT - Timbo in Message 1242
"I hope people will be understanding... but losing many people's time and money (electricity bills, hardware etc.) makes recommending this project more difficult if the data is lost due to negligence."


ID: 1244 · Rating: 0

ArcSedna
Joined: 22 May 15
Posts: 2
Credit: 8,123,864
RAC: 193
Message 1245 - Posted: 23 Jun 2017, 5:27:29 UTC

2 days ago, I had approximately 8,190,000 credits.
Now I have only 3,659,616 ...

Right now my computers have about 200 work units pending upload or report,
but they do not seem to appear on the task list web page.

I'm not sure whether I can continue to contribute to this project...
ID: 1245 · Rating: 0

[B@P] Daniel
Joined: 2 Oct 16
Posts: 7
Credit: 1,038,237
RAC: 0
Message 1246 - Posted: 23 Jun 2017, 7:30:22 UTC

I had about 900k and all of it has been wiped out completely :( I hope you will be able to fix this.
ID: 1246 · Rating: 0

Jesús Carro
Project administrator
Project developer
Project scientist
Joined: 18 Mar 15
Posts: 264
Credit: 493,295
RAC: 73
Message 1247 - Posted: 23 Jun 2017, 8:21:49 UTC

Dear all,
Here is some more information to clarify a few issues regarding this problem.
All the results of the simulations are safe. Every hour they are uploaded to Google Drive. Only the workunits from the last three days may have been lost.
Your statistics are stored each day in an XML file. We'll be able to solve the problem with your credit, but we need time to develop a script to do it.
We have a script that makes a backup every day and stores it on two different servers. I don't know yet why it has not been working. I have to find out. In parallel, I'm talking with the IT department to improve the backup system.
I fully understand everything you are saying and how you feel. Please also understand my surprise when I discovered that things that should have been running fine were not working. Once again, I ask for your patience. Rest assured that I will do everything in my power to fix it.
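
For readers wondering what such a credit restore script might look like, here is a rough Python sketch. It is not the script the project will use. It assumes each daily snapshot is an XML file with one `<user>` element per volunteer containing `<id>` and `<total_credit>` children; that layout, the function names and the sample values are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def read_credit_snapshot(xml_text):
    """Map user id -> total credit from one daily snapshot (assumed layout)."""
    root = ET.fromstring(xml_text)
    credits = {}
    for user in root.iter("user"):
        user_id = user.findtext("id")
        total = user.findtext("total_credit")
        if user_id is None or total is None:
            continue  # skip incomplete records rather than guessing values
        credits[int(user_id)] = float(total)
    return credits


def plan_credit_restore(current, snapshot):
    """Return {user_id: credit} for users whose current credit is below the snapshot.

    Total credit never decreases in normal operation, so a lower current
    value is treated as a sign that the restored database is out of date.
    """
    return {
        user_id: credit
        for user_id, credit in snapshot.items()
        if current.get(user_id, 0.0) < credit
    }


# Made-up sample data, only to show the expected shape of the input.
SAMPLE_SNAPSHOT = """
<users>
  <user><id>1</id><name>alice</name><total_credit>3186000</total_credit></user>
  <user><id>2</id><name>bob</name><total_credit>900000</total_credit></user>
  <user><id>3</id><name>carol</name><total_credit>500</total_credit></user>
</users>
"""

current_credit = {1: 1925000.0, 3: 750.0}
restore_plan = plan_credit_restore(current_credit, read_credit_snapshot(SAMPLE_SNAPSHOT))
```

In this sample, users 1 and 2 would get their snapshot credit back, while user 3 keeps the higher current value. Credit earned after the last snapshot would still have to be added separately.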

Best regards,
Jesús
Jesús Carro
Universidad San Jorge
@InSilicoHeart
ID: 1247 · Rating: 0

Michael H.W. Weber
Joined: 9 Apr 15
Posts: 11
Credit: 327,393
RAC: 3,050
Message 1248 - Posted: 23 Jun 2017, 9:20:23 UTC

Thank you very much for this more detailed feedback.
After this, I think, the situation might be much better than I had initially thought.

I hope you will be able to solve the problems fast.

But I stick to my request to deliver the improved apps as standard apps directly from the DENIS@home website: for you, a speed increase means an increase in results throughput. For us, it means a reduction in electricity costs and hardware stress. And as far as I remember, that speed increase was more than just significant.

Michael.
President of Rechenkraft.net, Principal Investigator of the RNA World distributed supercomputer.

ID: 1248 · Rating: 0

ArcSedna
Joined: 22 May 15
Posts: 2
Credit: 8,123,864
RAC: 193
Message 1249 - Posted: 23 Jun 2017, 10:56:35 UTC - in response to Message 1247.  

Thank you for the information.

As the simulation result files are safe, I feel relieved.
Good luck with the recovery, and I hope we can continue the science!
ID: 1249 · Rating: 0

[B@P] Daniel
Joined: 2 Oct 16
Posts: 7
Credit: 1,038,237
RAC: 0
Message 1250 - Posted: 23 Jun 2017, 11:13:28 UTC

Thanks for the update!

BTW, if you cannot recover the stats from the XML files, as a last resort you could try to load the data from external stats sites.
ID: 1250 · Rating: 0

UBT - Timbo
Joined: 9 Apr 15
Posts: 5
Credit: 523,971
RAC: 986
Message 1251 - Posted: 23 Jun 2017, 14:47:11 UTC - in response to Message 1247.  

Dear all,
Here is some more information to clarify a few issues regarding this problem.
All the results of the simulations are safe. Every hour they are uploaded to Google Drive. Only the workunits from the last three days may have been lost.


Hi Jesus

Excellent news! If only a small fraction of the data has been lost, then this makes for a better situation. A 3-day loss is much, much better than 6+ months!!




Your statistics are stored each day in an XML file. We'll be able to solve the problem with your credit, but we need time to develop a script to do it.
We have a script that makes a backup every day and stores it on two different servers. I don't know yet why it has not been working. I have to find out. In parallel, I'm talking with the IT department to improve the backup system.
I fully understand everything you are saying and how you feel. Please also understand my surprise when I discovered that things that should have been running fine were not working. Once again, I ask for your patience. Rest assured that I will do everything in my power to fix it.

Best regards,
Jesús


For me, I think the stats are less important. Sure, it is nice to know how much each member contributes, but it is the science that is important.

If you can fix some of the "knock on" effects from the server crash then all the better. This will at least restore some people's faith in the project.

I would even suggest that you suspend the project for a week, so you have time to get any fixes in place and apply them.

regards and good luck with all of this !
Tim
ID: 1251 · Rating: 0

DRSMT
Joined: 10 Oct 16
Posts: 18
Credit: 10,374,522
RAC: 36,189
Message 1252 - Posted: 23 Jun 2017, 16:15:31 UTC
Last modified: 23 Jun 2017, 16:16:25 UTC

Real crunchers are not here to get credits, so just stop crying about something as unimportant as that. Real crunchers are here to support science, and in the case of DENIS@home (at least in my opinion) we support the most important section of science: medical research. That the project has maybe lost some data and was down for a few days, well, such things just happen. As soon as the servers are on again, we will push this project further than it ever was before :-)
ID: 1252 · Rating: 0

TheFiend
Joined: 7 Nov 15
Posts: 10
Credit: 23,581,533
RAC: 57,155
Message 1253 - Posted: 23 Jun 2017, 16:52:32 UTC - in response to Message 1252.  

Real crunchers are not here to get credits, so just stop crying about something as unimportant as that. Real crunchers are here to support science, and in the case of DENIS@home (at least in my opinion) we support the most important section of science: medical research. That the project has maybe lost some data and was down for a few days, well, such things just happen. As soon as the servers are on again, we will push this project further than it ever was before :-)


Well said!!!!

My contribution will continue..... the science matters, the BOINC credit does not!!!
ID: 1253 · Rating: 0

Tex1954
Joined: 7 Jul 15
Posts: 28
Credit: 29,677,004
RAC: 28,080
Message 1254 - Posted: 23 Jun 2017, 17:23:47 UTC - in response to Message 1253.  
Last modified: 23 Jun 2017, 17:30:22 UTC

To Thomas as well as TheFiend...

Ever notice that items seem to bring a better price at auctions rather than a simple flea market sale? More participation and contests promote more crunching!!!

BOINC Points are a measure of contribution relative to other contributors. Having points and teams and competitions ONLY HELP the cause!

I think it's rather narrow-minded and unfair to say "the science matters, the BOINC credit does not!!!" when it does in fact promote MORE participation.

"Real crunchers are not here to get credits, so just stop crying about something as unimportant as that." WOW and WOW! So I am not a real cruncher? Others are not real crunchers? Judge, jury, executioner you are? I have 13 setups running at the moment and I care about points, and I am not a real cruncher? Good grief, I hope nobody believes that statement!!

Y'all suppose anybody who cares about points should just shut up and BOINC or not without any comments? Doesn't that viewpoint promote alienation and possibly cause folks to participate less?

Points matter; maybe not as much as the science involved, but it's far from zero and is a valid complaint subject.

8-)
ID: 1254 · Rating: 0