Bug #4709
closed
apDatabase.py update buggy
0%
Description
Neil, I think your changes to apDatabase.py in revision 03ef58ad don't search the aligned & dose-weighted versions of the preset correctly. I've got a session with over 2000 "ed" images, and if I set the preset to "ed" then appionloop finds them all. If I set the preset to "ed-a" or "ed-a-DW" it returns only 14 images.
Updated by Neil Voss almost 8 years ago
Interesting, I tried it on a data set with en, en-a, and en-a-DW and did not see this. I won't be able to test this until Wednesday.
I noticed I was setting the preset name twice, so I changed that:
presetq = leginon.leginondata.PresetData(name=preset)
but the code is pretty clean; I do not know why sinedon would be returning all preset values...
sessionq = leginon.leginondata.SessionData(name=session)
sessiondata = sessionq.query(results=1)[0]
presetq = leginon.leginondata.PresetData()
presetq['name'] = preset
presetq['session'] = sessiondata
presetdata = presetq.query(results=1)[0]
imgquery = leginon.leginondata.AcquisitionImageData()
imgquery['preset'] = presetdata
imgquery['session'] = sessiondata
imgtree = imgquery.query(readimages=False)
Updated by Anchi Cheng almost 8 years ago
Neil, if you search for the most recent PresetData with the name and then set the query to search only that PresetData, as you did, you miss all images attached to the other PresetData rows of that name. Each time the user adjusts the preset, a new PresetData is created.
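To illustrate the point, here is a minimal sketch in plain Python rather than sinedon. The rows, ids and function names are made up for illustration; only the idea (one PresetData row per adjustment, all sharing a name) comes from the comment above.

```python
# Hypothetical stand-ins for PresetData and AcquisitionImageData rows.
# Every adjustment of the "ed" preset creates a new PresetData row.
presets = [
    {"id": 1, "session": "s1", "name": "ed"},
    {"id": 2, "session": "s1", "name": "ed"},  # preset adjusted once
    {"id": 3, "session": "s1", "name": "ed"},  # preset adjusted again
]
images = [
    {"id": 101, "session": "s1", "preset_id": 1},
    {"id": 102, "session": "s1", "preset_id": 1},
    {"id": 103, "session": "s1", "preset_id": 2},
    {"id": 104, "session": "s1", "preset_id": 3},
]


def images_for_single_preset(session, name):
    """Pick one PresetData row (here the most recent) and match images to it."""
    matching = [p for p in presets if p["session"] == session and p["name"] == name]
    chosen = max(matching, key=lambda p: p["id"])
    return [i for i in images if i["preset_id"] == chosen["id"]]


def images_for_preset_name(session, name):
    """Collect every PresetData row with this name, then match images to any of them."""
    ids = {p["id"] for p in presets if p["session"] == session and p["name"] == name}
    return [i for i in images if i["session"] == session and i["preset_id"] in ids]


single = images_for_single_preset("s1", "ed")
by_name = images_for_preset_name("s1", "ed")
print(len(single), len(by_name))
```

In this toy data the single-row lookup sees only the images taken after the last adjustment, while the by-name lookup sees all of them.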
If this makes it inefficient again, we will have to go back to the solution that I thought we needed: create a PresetNameData table in leginon that each image refers to. It would hold just the session and the name, so no new row is created when the preset is readjusted, such as when presets are aligned against each other.
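A rough sketch of what such a table could look like, using sqlite3 purely for illustration. PresetNameData is the name proposed above; every column name, the sample image rows and the count_images helper are assumptions, not the Leginon schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per (session, preset name); it never changes when a preset is adjusted.
CREATE TABLE PresetNameData (
    id INTEGER PRIMARY KEY,
    session_id INTEGER NOT NULL,
    name TEXT NOT NULL,
    UNIQUE (session_id, name)
);
-- Images keep their exact PresetData reference and also point at the stable name row.
CREATE TABLE AcquisitionImageData (
    id INTEGER PRIMARY KEY,
    session_id INTEGER NOT NULL,
    preset_id INTEGER NOT NULL,
    presetname_id INTEGER NOT NULL REFERENCES PresetNameData (id)
);
""")

conn.execute(
    "INSERT INTO PresetNameData (id, session_id, name) VALUES (1, 3154, 'enn')"
)
# Three invented images taken with three different adjustments of the same preset.
conn.executemany(
    "INSERT INTO AcquisitionImageData (session_id, preset_id, presetname_id) "
    "VALUES (?, ?, ?)",
    [(3154, 70247, 1), (3154, 70283, 1), (3154, 70291, 1)],
)


def count_images(session_id, name):
    """Count images by preset name with a single join on the stable name row."""
    row = conn.execute(
        """
        SELECT COUNT(*)
        FROM AcquisitionImageData AS img
        JOIN PresetNameData AS pn ON pn.id = img.presetname_id
        WHERE pn.session_id = ? AND pn.name = ?
        """,
        (session_id, name),
    ).fetchone()
    return row[0]


image_count = count_images(3154, "enn")
print(image_count)
```

The trade-off in this sketch is one extra reference per image in exchange for a lookup that does not depend on how many times the preset was adjusted.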
Updated by Gabriel Lander almost 8 years ago
I just checked on another session, and the number of ed and ed-a/ed-a-DW presets that are returned are identical, so there is something odd about this session.
As Anchi points out, this only returns images up until the first update to the ed preset, but this doesn't explain the discrepancy between the number of ed & ed-a images that are returned for my first dataset.
Updated by Neil Voss almost 8 years ago
Okay, I did not realize that presetname and sessionname were not unique in the PresetData table. Makes sense now that I think about it. Playing with NYSBC session 3154:
+--------+------------+----------+
| DEF_id | session_id | name     |
+--------+------------+----------+
|  70241 |       3154 | gr       |
|  70242 |       3154 | sq       |
|  70243 |       3154 | hln      |
|  70244 |       3154 | fan      |
|  70245 |       3154 | fcn      |
|  70246 |       3154 | gain     |
|  70247 |       3154 | enn      |
|  70248 |       3154 | gr       |
|  70249 |       3154 | sq       |
|  70250 |       3154 | hln      |
|  70251 |       3154 | fan      |
|  70252 |       3154 | fcn      |
|  70253 |       3154 | gain     |
|  70254 |       3154 | enn      |
|  70271 |       3154 | enn-a    |
|  70272 |       3154 | enn-a-DW |
|  70273 |       3154 | hln      |
|  70283 |       3154 | enn      |
|  70284 |       3154 | enn      |
|  70285 |       3154 | enn-a    |
|  70286 |       3154 | enn-a-DW |
|  70287 |       3154 | enn      |
|  70288 |       3154 | enn-a    |
|  70289 |       3154 | enn-a-DW |
|  70290 |       3154 | enn      |
|  70291 |       3154 | enn      |
|  70304 |       3154 | fan      |
|  70305 |       3154 | fcn      |
|  70306 |       3154 | enn-a    |
|  70307 |       3154 | enn-a-DW |
+--------+------------+----------+
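To make the duplication in the listing above concrete, this small sketch tallies the rows per preset name. The (DEF_id, name) pairs are copied from the listing; the variable names are only illustrative.

```python
from collections import Counter

# (DEF_id, name) pairs for session 3154, copied from the listing above.
preset_rows = [
    (70241, "gr"), (70242, "sq"), (70243, "hln"), (70244, "fan"),
    (70245, "fcn"), (70246, "gain"), (70247, "enn"), (70248, "gr"),
    (70249, "sq"), (70250, "hln"), (70251, "fan"), (70252, "fcn"),
    (70253, "gain"), (70254, "enn"), (70271, "enn-a"), (70272, "enn-a-DW"),
    (70273, "hln"), (70283, "enn"), (70284, "enn"), (70285, "enn-a"),
    (70286, "enn-a-DW"), (70287, "enn"), (70288, "enn-a"), (70289, "enn-a-DW"),
    (70290, "enn"), (70291, "enn"), (70304, "fan"), (70305, "fcn"),
    (70306, "enn-a"), (70307, "enn-a-DW"),
]

# How many PresetData rows share each name within this one session.
rows_per_name = Counter(name for _, name in preset_rows)

# Every DEF_id an image query has to consider for the "enn" preset.
enn_ids = [def_id for def_id, name in preset_rows if name == "enn"]

for name, count in sorted(rows_per_name.items()):
    print(name, count)
```

No name in this session maps to a single row, so a query that keeps only one PresetData per name will generally miss images.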
#start test case
from appionlib import apDatabase
reload(apDatabase)
imgtree = apDatabase.getImagesFromDB("16dec23d", "enn-a-DW")
print set([i['preset']['name'] for i in imgtree])
imgtree = apDatabase.getImagesFromDB("16dec23d", "enn")
print set([i['preset']['name'] for i in imgtree])
#end test case
result
#before the fix
>>> reload(apDatabase)
<module 'appionlib.apDatabase' from '/home/nvoss/myami/appion/appionlib/apDatabase.py'>
>>> imgtree = apDatabase.getImagesFromDB("16dec23d", "enn-a-DW")
... Querying database for preset 'enn-a-DW' images from session '16dec23d' ...
... 665 images recevied in 320.05 msec
>>> print set([i['preset']['name'] for i in imgtree])
set(['enn-a-DW'])
>>> imgtree = apDatabase.getImagesFromDB("16dec23d", "enn")
... Querying database for preset 'enn' images from session '16dec23d' ...
... 665 images recevied in 353.03 msec
>>> print set([i['preset']['name'] for i in imgtree])
set(['enn'])
>>> #end test case

#after the fix
>>> reload(apDatabase)
<module 'appionlib.apDatabase' from '/home/nvoss/myami/appion/appionlib/apDatabase.pyc'>
>>> imgtree = apDatabase.getImagesFromDB("16dec23d", "enn-a-DW")
... Querying database for preset 'enn-a-DW' images from session '16dec23d' ...
... 757 images recevied in 0.66 sec
>>> print set([i['preset']['name'] for i in imgtree])
set(['enn-a-DW'])
>>> imgtree = apDatabase.getImagesFromDB("16dec23d", "enn")
... Querying database for preset 'enn' images from session '16dec23d' ...
... 757 images recevied in 0.83 sec
>>> print set([i['preset']['name'] for i in imgtree])
set(['enn'])
so for me it was not finding all the images, but I could not find a test case where it would find too many images...
Updated by Neil Voss almost 8 years ago
- Status changed from Assigned to In Test
- Assignee changed from Neil Voss to Gabriel Lander
Gabe, could you test this?
Updated by Neil Voss almost 8 years ago
I went back, reverted the Jan 11 change (commit 03ef58ad), and ran it again to make sure it obtained the same number of images. It does. Sorry about the bug; it was so broken that it needed a fix.
>>> #start test case
... from appionlib import apDatabase
>>> reload(apDatabase)
<module 'appionlib.apDatabase' from '/home/nvoss/myami/appion/appionlib/apDatabase.py'>
>>> imgtree = apDatabase.getImagesFromDB("16dec23d", "enn-a-DW")
... Querying database for preset 'enn-a-DW' images from session '16dec23d' ...
... 757 images recevied in 21.7 sec
>>> print set([i['preset']['name'] for i in imgtree])
set(['enn-a-DW'])
>>> imgtree = apDatabase.getImagesFromDB("16dec23d", "enn")
... Querying database for preset 'enn' images from session '16dec23d' ...
... 757 images recevied in 24.13 sec
>>> print set([i['preset']['name'] for i in imgtree])
set(['enn'])
>>> #end test case
As you can see, it returns the same number of images, but the time went from 21.7/24.13 sec to 0.66/0.83 sec, or roughly 30X faster.
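For reference, the speedup can be worked out directly from the timings logged in the two runs above (reverted code versus fixed code). This is only arithmetic on those four numbers; the dictionary names are illustrative.

```python
# Query times in seconds, taken from the logged runs above.
reverted = {"enn-a-DW": 21.7, "enn": 24.13}  # with commit 03ef58ad reverted
fixed = {"enn-a-DW": 0.66, "enn": 0.83}      # after the fix

# Ratio of old time to new time for each preset.
speedup = {name: reverted[name] / fixed[name] for name in reverted}

for name, ratio in speedup.items():
    print("%s: %.1fX faster" % (name, ratio))
```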
Updated by Neil Voss almost 8 years ago
- Related to Bug #4699: Querying database for preset 'xxxx' images from session 'xxx' ... is VERY SLOW added
Updated by Gabriel Lander almost 8 years ago
- Status changed from In Test to Closed
Works now! This is a great speedup: it finds ~7000 images in 7 seconds, which used to take several minutes.