Feature #2915

a way to archive old leginon session database by project so that they can be removed from the production one

Added by Anchi Cheng over 9 years ago. Updated over 9 years ago.

Status:
In Test
Priority:
Normal
Assignee:
Sargis Dallakyan
Category:
-
Target version:
-
Start date:
08/27/2014
Due date:
% Done:

0%

Estimated time:
Deliverable:

Description

leginondata is getting too big. The queries are slow.


Related issues (1 open, 0 closed)

Related to Leginon - Bug #2963: archive project database by project (Assigned, Anchi Cheng, 10/15/2014)

Actions #1

Updated by Anchi Cheng over 9 years ago

  • Subject changed from archiving old leginon session database by project so that they can be removed from the working one to a way to archive old leginon session database by project so that they can be removed from the production one

r18540 is done in the myami-dbcopy branch. This feature is not possible in the main branch.

Sargis, if you want to look at the code, you should only need the sinedon subpackage.

I will upload an example on how I use it later.

Actions #2

Updated by Anchi Cheng over 9 years ago

  • Status changed from New to Assigned
Actions #3

Updated by Anchi Cheng over 9 years ago

r18544 adds the script for archiving a project by project id.

To use:

1. Create a database, readable and writable by the MySQL user that Leginon uses, to import the data into.
2. Add a module [importdata] in sinedon.cfg and set the archive database from step 1 as its database (a sketch is shown after these steps).
3. Make sure myami-dbcopy/sinedon will be used when the Python file archive_project.py is run.
4. Run the script in myami/dbschema like this (the project id in this case is 1):

python archive_project.py 1
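
A minimal sketch of the sinedon.cfg entry from step 2; the database name leginon_archive is a placeholder for illustration, not a value taken from this issue:

# sinedon.cfg (sketch) -- point the importdata module at the archive
# database created in step 1; the name below is a placeholder.
[importdata]
db: leginon_archive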

Actions #4

Updated by Sargis Dallakyan over 9 years ago

Thank you Anchi. I tried this on one of the projects and I got the following error:

In [34]: %run /home/sargis/ami-workspace/myami-trunk/dbschema/archive_project.py 231
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/usr/lib/python2.7/site-packages/IPython/utils/py3compat.pyc in execfile(fname, *where)
    176             else:
    177                 filename = fname
--> 178             __builtin__.execfile(filename, *where)

/home/sargis/ami-workspace/myami-trunk/dbschema/archive_project.py in <module>()
    708 
    709         checkSinedon()
--> 710         archiveProject(projectid)

/home/sargis/ami-workspace/myami-trunk/dbschema/archive_project.py in archiveProject(projectid)
    688         p = projectdata.projects().direct_query(projectid)
    689         source_sessions = projectdata.projectexperiments(project=p).query()
--> 690         session_names = map((lambda x:x['session']['name']),source_sessions)
    691         session_names.reverse()  #oldest first
    692         for session_name in session_names:

/home/sargis/ami-workspace/myami-trunk/dbschema/archive_project.py in <lambda>(x)
    688         p = projectdata.projects().direct_query(projectid)
    689         source_sessions = projectdata.projectexperiments(project=p).query()
--> 690         session_names = map((lambda x:x['session']['name']),source_sessions)
    691         session_names.reverse()  #oldest first
    692         for session_name in session_names:

TypeError: 'NoneType' object has no attribute '__getitem__'

Working on debugging this...

Actions #5

Updated by Anchi Cheng over 9 years ago

Sargis

r18552 fixed the bug that caused the error. It also improves speed a lot by using a sinedon timelimit query rather than comparing timestamps in Python, which had to iterate over all results and was slow.

The cause of the bug was that some early imported calibrations have no session field (it defaults to None).
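
For illustration only, a minimal sketch of the kind of guard that avoids the TypeError above, built from the lines shown in the traceback; it is not the actual r18552 diff, and the timelimit-query speedup is not shown:

# Sketch: skip projectexperiments rows whose session reference is
# missing (None), so the ['session']['name'] lookup no longer fails.
p = projectdata.projects().direct_query(projectid)
source_sessions = projectdata.projectexperiments(project=p).query()
session_names = [s['session']['name'] for s in source_sessions
                 if s['session'] is not None]
session_names.reverse()  # oldest first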

Actions #6

Updated by Sargis Dallakyan over 9 years ago

Thank you Anchi. This time it made good progress and I'm now getting this result:

[sargis@dewey ~]$ python /home/sargis/ami-workspace/myami-trunk/dbschema/archive_project.py 231
['09may25a', '09may25b', '09may25c', '09jul11a']
****Session 09may25a ****
Importing session....
Importing instrument....
Importing calibrations....
Importing C2ApertureSize....
number of images in the session = 278
. . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . 
imported 278 images
Importing image ddinfo....
Importing image stats....
Importing queuing....
Importing mosaic tiles....
Importing dequeuing....
Importing drift....
Importing focus results....
importing TransformManager Settings....
Traceback (most recent call last):
  File "/home/sargis/ami-workspace/myami-trunk/dbschema/archive_project.py", line 742, in <module>
    archiveProject(projectid)
  File "/home/sargis/ami-workspace/myami-trunk/dbschema/archive_project.py", line 726, in archiveProject
    app.run()
  File "/home/sargis/ami-workspace/myami-trunk/dbschema/archive_project.py", line 710, in run
    self.runStep3()
  File "/home/sargis/ami-workspace/myami-trunk/dbschema/archive_project.py", line 696, in runStep3
    self.importSettings()
  File "/home/sargis/ami-workspace/myami-trunk/dbschema/archive_project.py", line 623, in importSettings
    self.importSettingsByClassAndAlias(allalias)
  File "/home/sargis/ami-workspace/myami-trunk/dbschema/archive_project.py", line 590, in importSettingsByClassAndAlias
    results = self.researchSettings(settingsname,name=node_name)
  File "/home/sargis/ami-workspace/myami-trunk/dbschema/archive_project.py", line 188, in researchSettings
    r2 = q.query()
  File "/home/sargis/ami-workspace/myami-dbcopy/sinedon/data.py", line 429, in query
    results = db.query(self, **kwargs)
  File "/home/sargis/ami-workspace/myami-dbcopy/sinedon/dbdatakeeper.py", line 99, in query
    result = self._query(*args, **kwargs)
  File "/home/sargis/ami-workspace/myami-dbcopy/sinedon/dbdatakeeper.py", line 118, in _query
    result  = self.dbd.multipleQueries(queryinfo, readimages=readimages)
  File "/home/sargis/ami-workspace/myami-dbcopy/sinedon/sqldict.py", line 250, in multipleQueries
    return _multipleQueries(self.db, queryinfo, readimages)
  File "/home/sargis/ami-workspace/myami-dbcopy/sinedon/sqldict.py", line 525, in __init__
    self.queries = setQueries(queryinfo)
  File "/home/sargis/ami-workspace/myami-dbcopy/sinedon/sqldict.py", line 983, in setQueries
    query = queryFormatOptimized(queryinfo,value['alias'])
  File "/home/sargis/ami-workspace/myami-dbcopy/sinedon/sqldict.py", line 1020, in queryFormatOptimized
    joinTable = queryinfo[id]
KeyError: 41493216

Will look into this...

Actions #7

Updated by Anchi Cheng over 9 years ago

  • Status changed from Assigned to In Test
  • Assignee changed from Anchi Cheng to Sargis Dallakyan

r18561 fixes this problem. Sinedon thinks "source_session" is in the destination_dbname even though the original database is source_dbname.
Worked around that by redoing the query for source_session in the get.

Actions #8

Updated by Anchi Cheng over 9 years ago

r18645, r18646, and r18647 create a working pipeline.

Usage

Using any myami >= 3.2

  1. initArchiveTables.php is similar to initTablesReport.php; it initializes the tables in the archive databases so that myamiweb works even if data is missing in the archive.
    1. Set the databases in myamiweb/config.php to the archive databases.
    2. Go to http://your_myamiweb/setup/initArchiveTables.php
  2. dbschema/tools/copy_refimages.py makes a local copy of the reference images that are not in the imaging session.

Using non-dbcopy myami

  1. archive_initialize.py makes sure that the administrator, the project owner, and the users that sessions are shared with are included in the archive.
    • This produces archive_users_1.cfg, which will be used in the later steps.

Using dbcopy myami: run archiverun.py from dbcopy myami to do the real work.

  1. Edit the file to give it the right databases, the paths to both the dbcopy and non-dbcopy versions of myami, and self.projectids in the function autoSet so it can switch as needed.
The scripts it runs include:
  1. archive_leginondb.py (renamed from the previous archive_project.py) archives the leginon database.
  2. archive_projectdb.py archives the project database.
  3. archive_activate.py forces AUTO_INCREMENT on the `DEF_id` column of all sinedon tables.
A sketch of the overall run order is shown after this list.
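
A rough sketch of that run order, assuming the scripts live in the same dbschema directories as archive_project.py above and that archive_initialize.py takes the project id on the command line (both are assumptions for illustration, using project 1 here):

# Hypothetical invocation from the non-dbcopy myami checkout; assumed to
# take the project id and to produce archive_users_1.cfg for the later steps.
python myami/dbschema/archive_initialize.py 1

# From the dbcopy myami checkout, after editing archiverun.py (databases,
# myami paths, and self.projectids in autoSet); the path to archiverun.py
# is also an assumption here.
python myami-dbcopy/dbschema/archiverun.py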
Actions #9

Updated by Anchi Cheng over 9 years ago

  • Related to Bug #2963: archive project database by project added
Actions #10

Updated by Anchi Cheng over 9 years ago

r18698 and r18799 give more options for leginon archiving.
