Research Journal
Spring 2008
*********************************************************************************************************
DOWNLOADS:
*********************************************************************************************************
*********************************************************************************************************
PAPERS:
ProtoMol How-To (PDF)
Current Thesis (PDF)
*********************************************************************************************************
*********************************************************************************************************
NOTES:
TAU - PERFORMANCE PROFILER THAT WORKS WITH PYTHON
Paul's Alanine Trajectories are here.
*NEW* TARGET CONFERENCE FOR FTSM here.
CHECK OUT BIOPERL: http://www.bioperl.org/wiki/Main_Page
PDYNAMO
SUMMARIZING MY WORK AS 3 PSEs
DR. IZAGUIRRE NUMBER 650-724-9826
*********************************************************************************************************
*********************************************************************************************************
USEFUL PERSONAL LINKS:
ND THESIS: STYLEGUIDELINES
REFEREEING PAPERS
SPARC AUTHOR RIGHTS
LCLS NORMAL MODE SITE
REMOVE .svn FILES RECURSIVELY: find . -name .svn -exec /bin/rm -r {} \;
GET DETAILED ARCHITECTURE INFO: cat /proc/cpuinfo | more
*********************************************************************************************************
*********************************************************************************************************
TODO:
* MDL Paper
* String Method
* Linear ProtoMol
*********************************************************************************************************
*********************************************************************************************************
Changes made to ProtoMol 3 outside of Vector3D and Vector3DBlock implementations:
* ProtoMolApp called an std::swap on two Vector3DBlocks, changed that to a member function
* .x, .y .z -> .c[0], .c[1], .c[2]
* To run Chris' GUI: java -jar -Djava.library.path=./ JMV.jar (from the GUIPROTOMOL directory)
*********************************************************************************************************
Thursday, June 5
*Finally* got all the data from the 60x60 box sampling, so I'm going to now write a script which
will read each DCD and output a file of (phi, psi).
At this point, switching over to my Wiki.
Wednesday, June 4
Ok, I ran some tests this morning with the contour script. Box (10,29) works fine if I don't go completely
sequentially, but hop over some interior boxes. Which means, there must be some kind of previous state
that is causing instabilities.
Tuesday, June 3
I tried to pull off one more miracle today with the string - previously I was running block alanine
in vacuum, I tried running the exact same simulation that Santanu was using for his contour plots,
which was the non-block structure. Nevertheless, I get *almost* the exact same string.
So now there's no way out, but to try to get this contour working. That's where basically all of
my energy will have to be for a while.
On the plus side though, I did write about 20 pages of the MDL user guide today. It looks very
nice, and I think this is clear enough for people to use it. Still have to write more however.
Monday, June 2
Here is the new superimposed string...

It still is not directly perpendicular in some areas (more the nonfavorable middle section).
It looks great near the ends however.
I'll see if I can get back on HPCC today. With the maintenance that went on over the
weekend, hopefully that wasn't lost time.
The cluster is definitely up again, and I'm running a new 'test job'.
OK, piping to /dscratch/ works. Just in case, I'm also grabbing a cluster box and running
a serial simulation there as well (nothing to lose). I think something is definitely wrong with the
first simulation, since I've already gotten through the first row of boxes in about 10 minutes,
and there's only 60 rows so certainly it should not be taking 6 days... should take no more than
like 10 hours.
Working more with testing MDL and beginning the user guide... I want to write a very nice
user guide for people before I leave.
Chris also wants me to test his new viewer... done - I see phi-psi!!!!
Ok, I found where the hangup on the contour plot constructor occurs -- row 10, box 29.
Killing the HPCC jobs, they'll never finish.
Here is what I have for AMBER:

Here is what I have for CHARMM:

I think this result almost confirms that the AMBER force field, even for something as simple
as alanine dipeptide in vacuum, differs significanlty enough such that we simply cannot use it.
Which means I need to work the bug out of this sampler script that causes instabilities.
Interestingly, in this energy output I saw no energies that were out of whack. But clearly, the simulation
becamse unstable somehow.
Sunday, June 1
Got caught up on some e-mails this morning and made the change in ProtoMol which keeps bornswitch at
quartic. I also checked on my string but can already see it's wrong:

PM 2 also produces this same output (in the early going), so something must have changed. The upper
part actually seems fine but the middle area is not sampling sufficiently. It's not even worth superimposing.
OH! I made a *very stupid* mistake ... I commented out all forces except harmonic dihedral while
I was testing it. Put them back and I can already see at the beginning the string is moving in the
correct direction. Rerunning...
My sampling is still running on HPCC; but I'm going to shoot off another one which pipes output
directly to dscratch; the one I submitted last Tuesday wrote to /tmp/ then copied to dscratch for
efficiency. Santanu says this simulation could feasibly take a week to ten days; so it could still
be running but I want to make sure there are no bugs.
Oo.. trouble submitting on HPCC... I get:
"Cannot set job credentials."
Checking if Santanu and Chris had any similar issues and then will e-mail JC... it's the same
script I submitted last week but for a few changes in output files, so it should not crash.
Friday, May 30
Fixing the energies bug this morning.... done. I had to call uncache() from IO instead of initialize() at every step.
Fixed the GUIServer to work on 64 bit with GCC using __attribute__(packed) on the core structures.
Making sure it works with all compilers...
Alright!!! I now have MDL simulations running with the GUI! This is *huge*! Chris was also happy.
Now I have to make sure that my little packing strategy also works with intel and cl. Trying intel first...
Works with ProtoMol but MDL does not compile .. the wrapper code generates a familiar incorrect
definition of LONG_BIT.
Oh right I remember ... yeah the intel compiler on here is a 32-bit version. It does produce 64-bit
code, but gets confused when you include Python headers which check the LONG_BIT definition because
a 64-bit Python installation defines it differently than the 32-bit intel compiler.
Which means... in order to verify MDL, I need to include headers and link libraries from 32-bit Python.
Ok, things work with intel and gcc.
I will for now commit the packing fix so Chris can give it a whirl on cl.
Thursday, May 29
Sampling is still not finished, I'll wait another day or so before I start assuming something is wrong...
I'm going to make one last pass through MDL this morning and clean things up, then my second task
will be the string method (and even to fire off another simulation using MDL 3). Then hopefully
this afternoon start wrapping the GUI code.
Ok, I did find another MDL bug...
I get a seg fault if I try to propagate twice with different propagators using the same force fields.
My guess is that the first ones get deallocated and then when I try to use them again, all h**l breaks loose.
I think it would be an inconvenience for users to have to specify different force fields that
are the same across propagations. This really should be fixed I think. Did it with the 'thisown' attribute
again. I think basically the Python interpreter should not be deallocating any C++ objects for propagators
and force fields.
Ok, while running the string, I get very different results for unsolvated alanine from MDL 2. I checked
and with MDL 2 it still works, but with MDL 3 the string just collapses. So there is some kind of bug.
Okay, fixed it - there was a dtor() function call that was causing issues. String is working and running.
Now taking my first looks at the GUI.
When I run MDL and try to bind to the default port 52753, I get lots of the following:
Could not bind to local socket: Address already in use.
And the GUI consistently waits for port 52753.
However, I need to try the same thing with PM. Rebuiding and testing...
Yep, I get the same waiting problem with ProtoMol.
There's got to be something small I'm missing...
Ok, there are two issues to work out.
(1) The GUI does not currently work on 64-bit Linux. Chris hypothesizes that it has to do
with the format that the data is sent through the sockets, and incompabile padding on 64-bit machines for
the size of his visualized structs. A compiler flag may fix this.
(2) Even still, on 32-bit over SSH there are problems with MDL. I get this 'Could not bind to local
socket' as above, although very interestingly -- the molecule does load in the viewer. It just does not
move afterwards. Eventually it throws a ProtoMol::Exception (with BPTI after about 224 steps, with
Argon 260).
In any event, I would like to debug on my office machine, since it's way too slow over SSH on a cluster.
But before I do that, there is an immediate bug fix required in the energies file output ... Santanu found it,
and it is a critical MDL fix.
Wednesday, May 28
At the end of the day yesterday I set off a script which runs Santanu's contour map generator
on HPCC. I am still waiting as of this morning. I wish HPCC had a better way of showing if
progress was being made.
So, I'm going to keep going with MDL in the meantime. There is plenty to do there.
One thing in particular I am not satisfied with right now is the error checking. Ideally,
I want all errors to be caught by my error handling routines and not the Python interpreter.
Right now too many come up as runtime exceptions.
Oh right.. and I have to move this build() function to ForceField. It's becoming a serious issue.
Also, somehow having a dynamically loaded set of libraries breaks the MDL testsuite -- all the simulations
just hang.
Actually that wasn't a problem with the dynamic loading -- it was a bug with LennardJonesCoulomb!
I had only fixed cutoff but not simplefull.
I now seem to have four tests fail, the normal modes and the SCPISMs. Fixed the SCPISMs, small bug
in the forcefield...
OK, all tests pass again.
I want to set up some examples with NormalModes..
Chris and I talked some this afternoon as well about his GUI, I'd very much like to get MDL and
his GUI working together.
Also, the ModifierShadow needs to be added to both ProtoMol and MDL..
And, I still do get a seg fault IF I try to use both CoulombBornRadii and CoulombSCPISM (which
most commonly will be the case). So that's still a problem.
And, the string.
Figured out the problem with SCPISM, it wasn't... I just had incorrect input files, which
means I need to add an error check for this. :)
Got a normal mode example in with alanine.
I'll work the rest of the day today on the Shadow stuff.
Got it into ProtoMol 3, compiling and producing shadow energies...
Now testing it with the log-log plots:
I'll test the 12th order shadow with dt=0.6, 0.8, 1, and 1.2.
Then, the total simulation steps will be: 80, 60, 48, and 40.
dt log(dt) E log(E)
--------------------------------------------------------------------------------------------------------------------------------------------------------
0.6 -0.736965594166 1.2163141605e-09 -29.61483694489457
0.8 -0.321928094887 3.15141406304e-08 -24.919425437162104
1.0 0 4.00504411857e-07 -21.25167852912023
1.2 0.263034405834 2.91280276343e-06 -18.38916055471692
Here is the resulting plot, which when linear fitted yields y = 11x - 21. For a 12th-order shadow,
that seems about right (ideally the slope would be 12, but it's a linear fit so it's approximated):

I'll commit things.
Ok, shadow is in!
Tomorrow and Friday's tasks:
* Clean up the Parallel In Time Tempering simulations, and go through the MDL framework adding
error checking wherever necessary, and also try to simplify some things whenever possible.
* Get FTSM at least working in MDL 3
* If my sampling finishes, complete the WHAM analysis
* Start working with the GUI
Tuesday, May 27
Meeting this morning and Dr. Izaguirre wants slides... need to put them together.
I'm putting the wham-2d command here, so I remember:
wham-2d Px=pi -3.14159 3.14159 500 Py=pi -3.14159 3.14159 500 1 300 0 metadata.wham freeenergy.wham
Alrighty.. I have a new plot, superimposing...

The big change that I made here was to not assume zero for infinite free energy (which was a mistake anyway),
but an upper free energy bound. I also used a cubic interpolation as opposed to linear, hoping it would
eliminate some of the 'squarish' curves that we saw in the old plot.
The contour plot is better, but I'm still not really happy with it. There are numerous points where the line
is not perpendicular to the contours, and also we still see those 'triangular' contours in the areas
of lower sampling. Santanu unfortunately did not have the results we had thought with the 60x60 -- his
simulation had crashed. We are estimating that will take at least 10-12 days to run, so I'd like to improve
this if possible.
I have tried very fine meshes and indeed those more 'unexplored' areas finally show up, but the problem
is that the contours in the more favoable areas are very difficult to see. I think the only solution is
indeed going to be to sample more. Got ahold of Santanu's script, and some good news is that it runs
with MDL 3!!!
Friday, May 23
Debugged MDL this morning.... fixed about 5-6 more...
I am satisfied with this plot right now:

Now I just have to superimpose the string on it, and judge for myself if it is publishable.
Here is what we have:

I'm not *totally* satisfied with this, because it is not clear that our string is perpendicular to the
free energy contours at every spot in collective space.
Nevertheless, I still have to get the string working with ProtoMol 3 before we submit (it will
be most ideal if people can use the new MDL rather than install the old one, a terrible pain.
It looks good in the beginning and end, but near the middle seems not quite perpendicular
(I feel like it should dip more).
However, soon enough I should be able to do an HPCC run with the new MDL.
But first, I've just found a bug that MDL cannot do LJ-Coulombic force pairs. That needs to be fixed..
Actually I need to clean up this interface period. Right now I have extended all force classes to have a makeNew()
method which constructs a new force object, but there's too much hardcoded right now -- I think I can
rewrite it to take an arbitrary number of parameters, pushing back just enough of the passed parameters.
Thursday, May 22
Okay, the big WHAM day today....
Read up on WHAM this morning and found a program which runs WHAM in 2D and assumes a biasing potential
like the one we use, here. I wrote a Python script which takes Santanu's output data from his contour.py and creates input
files for wham-2d, run like this:
python makeWham.py 900 5000 5.0 1.0 allsample.dihed phi0psi0 metadata.wham timeseries
Where the usage is:
python makeWham.py [boxes] [samples/box] [kappa] [dt] [phi-psi file] [phi0-psi0 file] [metadata file] [timeseries prefix]
So I provide the number of boxes and samples per box, kappa and dt, plus an input file containing phi-psi pairs
and phi0-psi0 target pairs. The meta data file and the time series files (one per box) will be input to wham-2d.
Running now....soaking up my CPU, let's see if it does something.
Had to run on another processor, couldn't get any other work done ... let's see...
I ran and got some data but intentionally chose a smaller number of points for free energy calculations, going to
increase that some.
In the meantime, debugging MDL...my Python implementation of Hybrid Monte Carlo works with MDL 3 which
makes me very happy... but I am also uncovering some bugs. I need to test all IO and force combinations, if possible.
When I use LennardJones-Coulomb together things fault with cutoff.
Testing plots.... okay I have found a bug in the ScalarStructure I think - plotting potential energy always returns zero.
So does bond energy, etc.
Fixed it .. the ScalarStructure used by the internal ProtoMolApp object was different from the Energies structure used
in MDL simulations. To fix it, I had to modify the functions in the API of Energies to invoke the ProtoMolApp isntead
of their own member functions.
Anyway, I got some WHAM results so I'm going to check this code in and then move there, see if I can't produce a free energy
plot.
Here is my first contour plot:

I'm not sure if we can use this for the paper or not, although the wells seems to be in the appropriate places.
I may need to decrease the granularity in the phi-psi plane. More tomorrow!
Wednesday, May 21
Fixed the STS/MTS bug yesterday and now Python propagators work properly in MDL.
Continuing work on the autocorrelator... I am getting there, my prototypes compile but I still need
to remove water.
Done.
Now I need to remove the SCPISM forces from the energy calculation.
Got it! Now throwing in some comments so people can actually use the code.
Oops - one thing wrong about what I'm doing, when I compute SCPISM I only should be doing it
for the non-water atoms....
Did it - I changed CoulombbornRadii and CoulombSCPISM to return zero if either
of the atoms belonged to a water molecule. That won't be the case usually, but in a
case like this or a general case of comparing forces, it is possible.
Ah...shoot. I forgot, you cannot run SCPISM in a simulation with periodic boundary conditions.
I have to think about the best way to fix this. Theoretically we have mixed boundary conditions in this
simulation (vacuum for the protein but periodic for pairwise interactions with solute-solvent
and solvent-solvent molecules). That functionality is not currently available in ProtoMol.
Of course the most immediate solution would be to register SCPISM/Born forces for PBC,
but I'm not convinced that makes the most sense. On the other hand, any other would require
quite an overhaul. :(
Anyway, did that but am currently seg faulting. And I do have the infamous memory problem:
*** glibc detected *** free(): invalid next size (fast): 0x0941ea48 ***
Program received signal SIGABRT, Aborted.
[Switching to Thread -1208736064 (LWP 21547)]
0x002ea7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) backtrace
#0 0x002ea7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x0032f815 in raise () from /lib/tls/libc.so.6
#2 0x00331279 in abort () from /lib/tls/libc.so.6
#3 0x00363cca in __libc_message () from /lib/tls/libc.so.6
#4 0x0036a55f in _int_free () from /lib/tls/libc.so.6
#5 0x0036a93a in free () from /lib/tls/libc.so.6
#6 0x007c3041 in operator delete () from /usr/lib/libstdc++.so.6
#7 0x080f281e in std::vector<ProtoMol::Vector3DImpl<double*>, std::allocator<ProtoMol::Vector3DImpl<double*> > >::_M_insert_aux ()
#8 0x08262dc3 in ProtoMol::Vector3DBlock::push_back ()
#9 0x0826208b in ProtoMol::AutoCorrelatorOuter::run ()
#10 0x08262e5d in ProtoMol::ProtoMolApp::step ()
#11 0x0804bc47 in main ()
Argh..
Ok, got them running but energy and temperature are increasing (wrong).
Meeting time.
After the meeting, changed the internal autocorrelator to inherit from Leapfrog,
and everything is fine now (conservation of energy observed). Sent my input files to Chris.
Now, onto WHAM...
Monday, May 19
Still debugging the Python normal mode tests... this one is TOUGH!
Okay, finally cleared it -- I found out that in Python, for any C++ object that is allocated using
a constructor, you can set a 'thisown' attribute to prevent deallocation by the Python interpreter!
Was tought to find, but fortunately I encountered this remote site.
I think then the next task will be the correlator. Chris unfortunately is gone for the moment,
so I think the best thing I can do is develop a design and discuss it with him to ensure accuracy.
My immediate thought is that the best thing will be to build an application. In this way I could
use the functionality of ProtoMol without making a nontrivial modification to the framework.
My only hesitancy is that Chris in general does not like extraneous materials in the repository,
so I'm going to wait to see him and discuss a bit in the morning.
Actually, I know something that I immediately have to do today anyway, I need to clean up
that simulations/ directory which has tons of old and outdated files, and upgrade them to PM 3
(and also throw in some new examples). Also cleaning up the FTSM implementations,
some are seg faulting right now...