Problems in the server

Chus Carro
Project administrator
Project developer
Project tester
Project scientist
Joined: 18 Mar 15
Posts: 82
Credit: 424,346
RAC: 0
Message 1235 - Posted: 22 Jun 2017, 15:27:37 UTC

Dear volunteers,
Over the past few days we have had problems with the database caused by an update. A lot of information was corrupted in the process. We have recovered a backup copy, but it is somewhat old.
We are working to get everything running properly again as soon as possible.

Please be patient, and thank you in advance.

Best regards.
____________
Jesús Carro
San Jorge University
@ChusCarro

TheFiend
Joined: 7 Nov 15
Posts: 8
Credit: 3,291,400
RAC: 0
Message 1236 - Posted: 22 Jun 2017, 15:46:01 UTC - in response to Message 1235.

My stats have dropped down to 2016 level..... did have 3.186 million, now showing 1.925 million :(

Pakal
Joined: 18 Aug 15
Posts: 3
Credit: 106,678
RAC: 0
Message 1237 - Posted: 22 Jun 2017, 21:16:48 UTC

I had one task running and a few more waiting to run. I clicked "Update" in BOINC Manager and all the tasks from DENIS were cancelled.

Michael H.W. Weber
Joined: 9 Apr 15
Posts: 9
Credit: 110,490
RAC: 0
Message 1238 - Posted: 22 Jun 2017, 21:55:21 UTC

You have deleted months of work.

Michael.
____________
President of Rechenkraft.net, Principal Investigator of the RNA World distributed supercomputer.

Dr Who Fan
Joined: 8 Apr 15
Posts: 26
Credit: 101,303
RAC: 0
Message 1239 - Posted: 22 Jun 2017, 22:22:31 UTC

I HAD 94,882 credits the day before the crash and lost at least 66,781 according to my BOINCSTATS TRACKING PAGE.

Dirk Broer
Joined: 13 Apr 15
Posts: 6
Credit: 704,192
RAC: 0
Message 1240 - Posted: 22 Jun 2017, 22:33:16 UTC - in response to Message 1239.

There are people who have lost more than 1,000,000 credits, look it up in BOINCStats and/or FreeDC...

Dirk Broer
Joined: 13 Apr 15
Posts: 6
Credit: 704,192
RAC: 0
Message 1241 - Posted: 23 Jun 2017, 1:07:57 UTC - in response to Message 1235.

Dear volunteers,
Over the past few days we have had problems with the database caused by an update. A lot of information was corrupted in the process. We have recovered a backup copy, but it is somewhat old.
We are working to get everything running properly again as soon as possible.

Please be patient, and thank you in advance.

Best regards.


Next time you update, do a backup first...
In the meantime: Good luck with the restore, I hope the project hasn't lost data.

UBT - Timbo
Joined: 9 Apr 15
Posts: 3
Credit: 291,969
RAC: 0
Message 1242 - Posted: 23 Jun 2017, 1:27:07 UTC - in response to Message 1241.

hear hear !!!

lesson no. 1: make a backup BEFORE any routine OR important updates.
lesson no. 2: make sure the backup can be restored (onto a spare machine if need be).
lesson no. 3: rotate backups on a very regular basis - if your last backup is 6+ months old, your disaster recovery procedures need rewriting and your IT dept head should be changed.
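
To make the three lessons concrete, here is a minimal Python sketch, not the project's actual tooling. It assumes the database has already been dumped to a single file; the names `make_backup`, `rotate_backups` and the dated file-name pattern are hypothetical. The checksum only confirms that the copy is byte-identical, so a real test restore onto a spare machine is still needed.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_backup(dump: Path, backup_dir: Path, stamp: str) -> Path:
    """Copy a dump file under a dated name and verify the copy.

    `stamp` is assumed to be an ISO date (YYYY-MM-DD) so that names
    sort chronologically.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / f"{dump.stem}-{stamp}{dump.suffix}"
    shutil.copy2(dump, target)
    if sha256_of(dump) != sha256_of(target):
        target.unlink()
        raise IOError(f"backup verification failed for {target}")
    return target


def rotate_backups(backup_dir: Path, keep: int = 7) -> list[Path]:
    """Delete all but the newest `keep` backups; return the removed paths."""
    backups = sorted(backup_dir.glob("*-*"), key=lambda p: p.name)
    removed = backups[:-keep] if keep > 0 else backups
    for old in removed:
        old.unlink()
    return removed
```

Running `make_backup` before an update and `rotate_backups` afterwards keeps several dated copies, so one bad dump cannot overwrite the only good backup.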

I hope people will be understanding...but losing many people's time and money (electricity bills, hardware etc) makes recommending this project more difficult if the data is lost due to negligence.

Such a shame :-(

Hope you can resolve the issues.

Tim
Founder, UK BOINC Team

Michael H.W. Weber
Joined: 9 Apr 15
Posts: 9
Credit: 110,490
RAC: 0
Message 1243 - Posted: 23 Jun 2017, 4:28:52 UTC
Last modified: 23 Jun 2017, 4:31:25 UTC

You need a DAILY backup of all your data and you need to rotate storage of these backups such that potentially bad data doesn't overwrite a good backup. You did not even check your server status for at least 2 days because otherwise you would have noticed much earlier that your entire website was gone, too.

Obviously, you have not made backups for more than 6 months, as can be derived from the stats loss. All OUR energy resources which we invested in your research progress have therefore been wasted if you do not come up with a backup.

The least one can expect, however, is a clean restoration of the credits plus badges which all have gone.

And one more thing: I expect you to implement the optimized apps AS STANDARD for all platforms to stop wasting our energy resources! The changes to your original code and their testing are well documented and consist of the implementation of profoundly more professional code compared to your original (e.g. GROMACS floating point operations routines, if I remember correctly).

For many years I have witnessed this ignorance of, what I like to call, "ivory tower researchers" working in multiple research fields who on the one hand believe they have all the wisdom in the world while - in fact - they are not even capable of evaluating a code optimization which is suggested to them free of charge and would boost their research throughput profoundly while saving our energy.
I met high energy physicists who laugh at the distributed computing community and continue to waste taxpayers' money on high performance cluster setup and maintenance, not realizing that these HPC, while suitable (and sometimes even required) for some (limited types of) tasks, are basically a "fly's fart in the wind" compared to the compute power the DC community offers for free.
The same is true for many biotechnologists.

If you don't comply with the above mentioned requirements, you will probably need a vast amount of money in the very near future to fund the re-gathering of all the data that was now lost - then using a costly HPC system.
Because we will be out.

Michael.
____________
President of Rechenkraft.net, Principal Investigator of the RNA World distributed supercomputer.

Dr Who Fan
Joined: 8 Apr 15
Posts: 26
Credit: 101,303
RAC: 0
Message 1244 - Posted: 23 Jun 2017, 4:36:49 UTC

Until things have finally been FULLY RESTORED to where they were pre-database/server crash I have decided to QUIT SUPPORTING THIS PROJECT with my time and electricity.

As said by UBT - Timbo in Message 1242

"I hope people will be understanding...but losing many people's time and money (electricity bills, hardware etc) makes recommending this project more difficult if the data is lost due to negligence."


ArcSedna
Joined: 22 May 15
Posts: 2
Credit: 8,120,954
RAC: 0
Message 1245 - Posted: 23 Jun 2017, 5:27:29 UTC

2 days ago, I had approximately 8,190,000 credits.
Now I have only 3,659,616 ...

Right now my computers have about 200 work units pending upload or report,
but they do not seem to appear on the Task List web page.

I doubt whether I can continue to contribute to this project ...

[B@P] Daniel
Joined: 2 Oct 16
Posts: 7
Credit: 1,038,237
RAC: 0
Message 1246 - Posted: 23 Jun 2017, 7:30:22 UTC

I had about 900k and it has all been wiped out completely :( I hope you will be able to fix this.

Chus Carro
Project administrator
Project developer
Project tester
Project scientist
Joined: 18 Mar 15
Posts: 82
Credit: 424,346
RAC: 0
Message 1247 - Posted: 23 Jun 2017, 8:21:49 UTC

Dear all,
Here is some more information to clarify a few points about this problem.
All the simulation results are safe. They are uploaded to Google Drive every hour. Only the workunits from the last three days may have been lost.
Your statistics are stored each day in an XML file. We will be able to fix the problem with your credit, but we need time to develop a script to do it.
We have a script that makes a backup every day and stores it on two different servers. I don't know yet why it has not been working; I have to find out. In parallel, I'm talking with the IT department about improving the backup system.
I fully understand everything you are saying and how you feel. Please also understand my surprise when I discovered that things that should have been working were not. Again, I ask for your patience. Rest assured that I will do everything in my power to fix it.
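
As an illustration only, a credit-restoration script of the kind described above might look like the following Python sketch. The layout of the project's daily XML file is not given in this thread, so the element names `user`, `id` and `total_credit` are assumptions modelled on the usual BOINC stats export, and `load_credits` and `plan_restore` are hypothetical helpers. The sketch only computes which values to write back; it does not touch a database.

```python
import xml.etree.ElementTree as ET


def load_credits(xml_text: str) -> dict[int, float]:
    """Map user id -> total credit from a stats snapshot (assumed layout)."""
    root = ET.fromstring(xml_text)
    credits = {}
    for user in root.iter("user"):
        uid = user.findtext("id")
        credit = user.findtext("total_credit")
        if uid is None or credit is None:
            continue  # skip incomplete records rather than guess
        credits[int(uid)] = float(credit)
    return credits


def plan_restore(snapshot: dict[int, float],
                 current: dict[int, float]) -> dict[int, float]:
    """Return the credit values to write back.

    Only users whose snapshot credit exceeds what the restored database
    currently holds are included, so credit is never lowered.
    """
    return {
        uid: credit
        for uid, credit in snapshot.items()
        if credit > current.get(uid, 0.0)
    }
```

Users who appear in the snapshot but not in the restored database are included in the plan as well; a real script would need to decide whether to recreate those accounts.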

Best regards,
Jesús
____________
Jesús Carro
San Jorge University
@ChusCarro

Michael H.W. Weber
Joined: 9 Apr 15
Posts: 9
Credit: 110,490
RAC: 0
Message 1248 - Posted: 23 Jun 2017, 9:20:23 UTC

Thank you very much for this more detailed feedback.
After this, I think, the situation might be much better than I had initially thought.

I hope you will be able to solve the problems fast.

But I stick to my request to deliver the improved apps as standard apps directly from the DENIS@home website: A speed increase to you means an increase in results throughput. To us it means a reduction in electricity costs and hardware stressing. And as far as I remember, that speed increase was more than just significant.

Michael.
____________
President of Rechenkraft.net, Principal Investigator of the RNA World distributed supercomputer.

ArcSedna
Joined: 22 May 15
Posts: 2
Credit: 8,120,954
RAC: 0
Message 1249 - Posted: 23 Jun 2017, 10:56:35 UTC - in response to Message 1247.

Thank you for the information.

As the simulation result files are safe, I feel relieved.
Good luck with the recovery, and I hope we can continue the science!

[B@P] Daniel
Joined: 2 Oct 16
Posts: 7
Credit: 1,038,237
RAC: 0
Message 1250 - Posted: 23 Jun 2017, 11:13:28 UTC

Thanks for the update!

BTW, if you cannot recover the stats from the XML files, as a last resort you could try loading the data from external stats sites.

UBT - Timbo
Joined: 9 Apr 15
Posts: 3
Credit: 291,969
RAC: 0
Message 1251 - Posted: 23 Jun 2017, 14:47:11 UTC - in response to Message 1247.

Dear all,
Here is some more information to clarify a few points about this problem.
All the simulation results are safe. They are uploaded to Google Drive every hour. Only the workunits from the last three days may have been lost.


Hi Jesus

Excellent news - if only a small fraction of the data has been lost then this makes for a better situation. 3 days loss is much much better than 6+ months !!




Your statistics are stored each day in an XML file. We will be able to fix the problem with your credit, but we need time to develop a script to do it.
We have a script that makes a backup every day and stores it on two different servers. I don't know yet why it has not been working; I have to find out. In parallel, I'm talking with the IT department about improving the backup system.
I fully understand everything you are saying and how you feel. Please also understand my surprise when I discovered that things that should have been working were not. Again, I ask for your patience. Rest assured that I will do everything in my power to fix it.

Best regards,
Jesús


For me, I think the stats are less important - sure they are nice to know how much each member contributes, but it is the science that is important.

If you can fix some of the "knock on" effects from the server crash then all the better. This will at least restore some people's faith in the project.

I would even suggest that you suspend the project for a week, so you have time to get any fixes in place and apply them.

regards and good luck with all of this !
Tim

Thomas
Joined: 10 Oct 16
Posts: 17
Credit: 960,009
RAC: 0
Message 1252 - Posted: 23 Jun 2017, 16:15:31 UTC
Last modified: 23 Jun 2017, 16:16:25 UTC

Real crunchers are not here to get credits, so just stop crying about something as unimportant as that. Real crunchers are here to support science and in the case of DENIS@home - at least in my opinion - we support the most important section of science: medical research. That the project has maybe lost some data and was down for a few days, well, such things just happen. As soon as the servers are on again, we will push this project further than it ever was before :-)

TheFiend
Joined: 7 Nov 15
Posts: 8
Credit: 3,291,400
RAC: 0
Message 1253 - Posted: 23 Jun 2017, 16:52:32 UTC - in response to Message 1252.

Real crunchers are not here to get credits, so just stop crying about something as unimportant as that. Real crunchers are here to support science and in the case of DENIS@home - at least in my opinion - we support the most important section of science: medical research. That the project has maybe lost some data and was down for a few days, well, such things just happen. As soon as the servers are on again, we will push this project further than it ever was before :-)


Well said!!!!

My contribution will continue..... the science matters, the BOINC credit does not!!!

Tex1954
Joined: 7 Jul 15
Posts: 28
Credit: 28,032,141
RAC: 0
Message 1254 - Posted: 23 Jun 2017, 17:23:47 UTC - in response to Message 1253.
Last modified: 23 Jun 2017, 17:30:22 UTC

To Thomas as well as TheFiend...

Ever notice that items seem to bring a better price at auctions rather than a simple flea market sale? More participation and contests promote more crunching!!!

BOINC Points are a measure of contribution relative to other contributors. Having points and teams and competitions ONLY HELP the cause!

I think it's rather narrow minded and unfair to say "the science matters, the BOINC credit does not!!!" when it does in fact promote MORE participation.

"Real crunchers are not here to get credits, so just stop crying about something such unimportant as that." WOW and WOW! So I am not a real cruncher? Others are not real crunchers? Judge Jury Executioner you are? I have 13 setups running at the moment and I care about points and I am not a real cruncher? Good grief I hope nobody believes that statement!!

Y'all suppose anybody who cares about points should just shut up and BOINC or not without any comments? Doesn't that viewpoint promote alienation and possibly cause folks to participate less?

Points matter; maybe not as much as the science involved, but it's far from zero and is a valid complaint subject.

8-)



Copyright © 2019 Universidad San Jorge