Herbert J. Bernstein, yaya@dowling.edu
Robert M. Sweet, sweet@bnl.gov
The new imgCIF workshop series has been sponsored in part
by DOE under grant ER64212-1027708-0011962, NSF under grant
DBI-0610407, and NIH under grant 1R13RR023192-01A1.
This is a report on the fourth in the new series of imgCIF workshops that began with a workshop at the summer 2006 meeting of the American Crystallographic Association. One major objective of these workshops was to find and remove the obstacles to adoption of a common interoperable format for synchrotron diffraction images. We are pleased to report excellent progress in that direction. Three out of four of the major detector vendors had representatives at this workshop, and all three agreed to cooperate in an effort to define an agreed minimum set of common tags that would be provided in synchrotron diffraction images and to participate in a "bakeoff" to help resolve any open issues with respect to interoperability. Much work remains to be done to move from having imgCIF as an available option to having it used as a routine tool in the collection of images.
In addition there is now wide recognition that there is much to be gained from careful consideration of the interactions among multiple raw data image formats in structural biology, such as imgCIF, NeXus, HDF, XML and the microscopy formats. In addition to the effort on the imgCIF bakeoff, it was the consensus of the group that another workshop addressing these more general issues is needed within the next one to two years.
The pace of data collection and the volume of data collected at synchrotron beam lines are increasing. The ACA Data, Standards, and Computing Committee spearheaded an effort to improve the efficiency of the handling and storage of these data by encouraging the adoption of common data formats and standard software interfaces. The first goal was to make the data self-defining, and therefore equally accessible to data-reduction and data-visualization codes. The second goal, for the purposes of secure archiving, was to provide robust internal documentation of the source of the data.
The current effort began in 2005, building on work started in the mid-1990s on a Crystallographic Binary Format (CBF) proposed by Andy Hammersley. This effort was the basis for the image-supporting Crystallographic Information Format/Crystallographic Binary Format (imgCIF/CBF). The first imgCIF/CBF workshop took place at Brookhaven National Laboratory in 1997 and proposed a format combining support for an efficient binary representation of images with a fully CIF-compliant ASCII equivalent. An imgCIF/CBF dictionary and software to support the format were created, are available on the web, and are described in Volume G of the IUCr International Tables for Crystallography. The community should now adopt a consensus standard for the management of data at synchrotron beam lines, making it easier for users to process data taken at various beam lines. Also, as our science evolves, new concepts will be considered: possibilities include NeXus and XML.
The first workshop in the new series, "Management of Synchrotron Image Data: imgCIF File System and Beyond", was held on 22 July 2006 as part of the 2006 ACA meeting in Honolulu, Hawaii. That workshop concluded that it was "the right time for more widespread use of imgCIF ... [and that] SR sources should start writing imgCIF image files as soon as possible, employing the imgCIF dictionary already adopted by the IUCr Committee on the Maintenance of the CIF Standard (COMCIFS) and published on the web and in International Tables Volume G." [from the report of the workshop, see http://www.medsbio.org/meetings/ACA_2006_WK02_Report.html].
Subsequent to the Hawaii workshop, intensive work was started in response to these recommendations. Both the imgCIF dictionary and the supporting software library were reviewed and, after meetings at SLS and ESRF, extended. See SLS_report.html and ESRF_report.html for more information on those meetings. The work continued in collaboration with members of the community (see the imgCIF mailing list, http://www.iucr.org/iucr-top/cif/cbf/imgcif-l/).
In light of this activity, a workshop on data formats for synchrotron image data was held after the NSLS/CFN meeting on 24 May 2007 at BNL in the Biology Department. Topics discussed included proposed extensions to imgCIF, the use of NeXus, progress on software, and the status of imgCIF at Diamond and at SLS. That workshop concluded that work was needed on support for the handling of uncorrected images and true bitmaps, creation of a utility to "tidy" CIFs, creation of an agreed interface with XML, NeXus and HDF, clarification of the specification of the relationship between detector specification and the physical locations of pixels in the laboratory, tags for robotics and remote access, and the creation of more cookbooks. See http://www.medsbio.org/mettings/BNL_May07_imgCIF_Workshop_Report.html.
In the short time between the second and third workshops, work began on many of the items on this task list. In addition, building on discussions with J. Steinbrener at the second workshop, discussions were started with Matt Dougherty on the image needs of the microscopy community and, through Matt Dougherty, with Mike Folk of the HDF Group on techniques for integration between imgCIF and NeXus, HDF and XML. Work on CBFlib continued and version 0.7.8 was released.
The third imgCIF workshop was held in two sessions at BSR 2007 in Manchester and at Diamond. Herbert Bernstein and Alun Ashton organized this workshop. The purpose of this workshop was to provide a review of the status of imgCIF and CBFlib for the European user community and to discuss the integration of imgCIF with NeXus, HDF and XML. The Manchester session was used for an introduction and review of the status of imgCIF and for some discussion. The Diamond Light Source session was used to deal with more detailed technical issues and further discussion. That workshop raised several issues that needed further consideration: the Dectris Pilatus 6m miniCBF headers, integration with NeXus, HDF and XML, and dealing with common issues between microscopy and crystallography image handling. See BSR_2007_imgCIF_Workshop.
After discussion with the funding agencies, and signs of good progress on the adoption of imgCIF, a fourth workshop in the new series of imgCIF workshops was added. That workshop on "Raw Image Formats in Structural Biology" was held on 22 May 2008 in the Biology Department (Building 463) at Brookhaven National Laboratory. It was scheduled just after the NSLS/CFN user meeting.
The charge to the participants was:
We will have reviews of the current status of imgCIF, exploration of ways to move between imgCIF and NeXus using XML and HDF and ways to work with microscopy and tomography images.
The morning and part of the afternoon were used for presentations, and the rest of the afternoon for discussions and plans for the future.
8:30 am   Breakfast
8:30 am   Welcome and introduction to the workshop, H. J. Bernstein, Dowling; R. M. Sweet, BNL
9:00 am   Participants introduce themselves
9:10 am   Breakfast
9:30 am   Progress in adoption of imgCIF and integration with NeXus, H. J. Bernstein, Dowling (handout, imgCIF dictionary, CBFlib manual, cbf2nx.c)
10:00 am  Discussion
10:30 am  "A beamline perspective on data formats", A. Howard, IIT (handout)
10:20 am  "The Importance of Standard Image Formats for Scientific Progress", N. Sauter, LBNL (abstract, handout)
11:00 am  Break
11:10 am  "Through the Looking Glass: creating an HDF data prism", M. Dougherty, NCMI (abstract, handout, ImageCore V01 RFC (pdf))
11:40 am  "Bio-Formats and the Open Microscopy Environment", C. Rueden, LOCI (handout)
12:30 pm  Lunch
1:10 pm   "CBF: Issues for Vendors", C. Nielsen, ADSC (handout)
1:30 pm   Discussion of vendor issues with data formats
2:00 pm   Discussion of imgCIF/NeXus/HDF integration issues
2:30 pm   Discussion of future plans and funding for software support and meetings
3:00 pm   Break
3:15 pm   Preparation of recommendations and conclusions
4:30 pm   Adjourn the BNL workshop
7:00 pm   Post-workshop dinner
This workshop involved wide-ranging and detailed discussion on all of the material presented (see the handouts linked to in the agenda above). Three topics generated very strong interest: the impact of open-source licensing on vendor acceptance of software for support of image formats; the creation of an agreed list of mandatory tags; and the automatic use of FUSE and HDF (see the Dougherty talk) to manage image data in a directory structure, making it easier to combine images with appropriate headers. All of these discussions produced the seeds of consensus agreements.
The discussion on open-source licensing was primarily one of mutual education about the needs of vendors and the realities of modern open source licensing practices. The vendors need assurance that they will not have to make public software that they are not ready to make public. On the other hand it is important to protect and preserve access to the base of community software and only Richard Stallman's GNU General Public License and GNU Lesser General Public License have a clear track record of standing up under legal attack. A workable compromise between these competing needs is to use the same license that is already used for many of the libraries used by existing vendor packages, the LGPL. The objection was raised that those libraries distributed with operating systems, such as glibc and libm are subject to less restrictive terms under the LGPL than libraries that are not distributed with operating systems. As holder of the copyright on CBFlib, Herbert Bernstein assured the vendors that they will not be held to a stricter standard in using CBFlib than they are held to in using glibc and libm. We note for the record that recently CBFlib became one of the libraries distributed as part of the Debian Linux operating system, automatically placing CBFlib under the same liberal conditions as apply to glibc and libm.
In the discussion of the interaction with HDF it was noted that the necessary code to convert from imgCIF to HDF is already available, but that the general conversion from HDF to imgCIF is much more difficult. However, despite that difficulty, there was general agreement that it would be worthwhile to try to use the HDF-based FUSE approach suggested by Dougherty as a way to integrate one or more imgCIF headers with diffraction images as an alternative to the use of imgCIF templates.
An outcome of the workshop was the clear agreement by all three vendors present (ADSC, Rigaku, Rayonix) that they would help to support imgCIF (see the recommendations on agreeing to a minimal set of tags and cooperating in a "bakeoff"). The discussion was on how best to do it, not whether to do it. On this basis, it would be fair to say the new imgCIF workshop series has achieved its primary goal. However, it is important to note that acceptance in the sense of routine use by users in real experiments, rather than by detector vendors testing their software, has not yet been achieved.
It was noted that the use of imgCIF for the Dectris Pilatus 6m detector and the efforts at Diamond and SOLEIL have helped to generate interest in imgCIF, but that there is not yet significant use of imgCIF by users other than at SLS, and that the SLS miniCBF format is not a full CBF format.
It was noted by people from both communities that the crystallographers and the microscopists have common problems and can learn useful things from the solutions being adopted in each of the communities.
It was noted that plans are afoot for archiving of raw data. It is hoped that the role of formats such as imgCIF, NeXus, HDF and XML will be considered in the planning.
It was noted that a Java version of CBFlib needs to be created. The work done by J. Wright in creating a Python wrapper and by N. Sauter in creating a C++ wrapper for CBFlib should provide appropriate templates. The only reason that this has not been made a firm recommendation of the workshop is the uncertainty about obtaining funding for the work.
The major conclusions and recommendations of the workshop were:
It should be noted that a few days after the workshop, N. Sauter reported,
The CBFlib package available from
http://www.sourceforge.net/projects/cbflib
is an open source package covered by the GNU General Public License (GPL). The CBFlib Applications Programming Interface (API) is also covered by the GNU Lesser General Public License (LGPL), which is also known as the GNU Library Public License.
Effective immediately, all functions, methods, subroutines and procedures in the CBFlib package will be considered to be part of the API and to be covered by the LGPL as an alternative to the GPL that covers everything in the CBFlib package.
This change results from the discussions at the 22 May 2008 workshop at BNL to help make detector vendors and others with proprietary software more comfortable in using the CBFlib package.
Thanks to Teemu Ikonen, since February 2008 CBFlib is a debian package and you may link to the functions in the CBFlib package from a proprietary program just as you may link to glibc or to the trigonometry functions in the libm math library.
Use it in good health.
-- Herbert J. Bernstein
It should be noted that a proposal on handling sparse images using a background-offset-delta compression was posted to the imgCIF list a few days after the workshop.
"Justin Anderson" <justin at rayonix dot com>
Rayonix, LLC (Formerly Mar USA)
"Georgi Darakev" <darakevg at gmail dot com>
Dowling College
"Nikolay Darakev" <darakevn at gmail dot com>
Dowling College
"Matthew T. Dougherty" <matthewd at bcm.tmc dot edu>
National Center for Macromolecular Imaging
"Kevin Eliceiri" <eliceiri at wisc dot edu>
Laboratory for Optical and Computational Instrumentation
"Frances C. Bernstein" <fcb at bernstein-plus-sons dot com>
Bernstein + Sons
"Herbert J. Bernstein" <yaya at bernstein-plus-sons dot com>
Dowling College
"Andrew J. Howard" <howard at iit dot edu>
Illinois Institute of Technology
"Chris Nielsen" <cn at adsc-xray dot com>
ADSC
"Mark Pressprich" <Mark.Pressprich at Rigaku dot com>
Rigaku
"Curtis Rueden" <ctrueden at wisc dot edu>
Laboratory for Optical and Computational Instrumentation
"Nicholas K. Sauter" <nksauter at lbl dot gov>
LBNL
"John Skinner" <skinner at bnl dot gov>
Brookhaven National Laboratory
"Robert M. Sweet" <sweet at bnl dot gov>
Brookhaven National Laboratory
For those for whom a budget for travel reimbursement has been agreed to, please use this travel form