Office of Information Technologies

 

Synchronizing Open Files in AFS:

The Problem

You have a long-running process on Machine A (e.g., an HPCC compute machine that you can't log into) that is continually appending data to a file in AFS. You'd like to look at the data on Machine B (e.g., an HPCC front-end machine, Linux node, or SGI/Sun workstation in the cluster) while the program is running, either to check that things are going well or just out of curiosity. Unfortunately, although you can open the file in AFS on Machine B, it doesn't appear to have any data in it.

This "problem" is caused by the AFS cache manager. AFS's caching system allows programs running on the same host to do most AFS file I/O to and from a cache located in the memory or on the disk of that host. This greatly speeds operation compared to non-cached network file systems like NFS, especially for files that are both read and written when the network is slow or congested. Unfortunately, the AFS cache manager daemons don't store files to the server until they are closed or fsync'd. That means data written to a file by a program can only be "seen" by other programs on the same machine until the file is closed. Programs on other machines only "see" the data that was in the file when it was last closed or created (hence new files appear empty).

The Solution -- fsync()

If you have access to your source code, the easiest solution is to periodically close() or fsync() the file so that the AFS cache manager will flush the file to the server. Unfortunately, many users don't have access to their source code or aren't familiar with the low-level Unix file routines.
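
As an illustration of the periodic fsync() approach, here is a minimal sketch in Python. The function name append_and_sync, the sync_every parameter, and the results.dat file name are hypothetical and chosen only for this example; the same pattern applies to the C fsync() call. The key point is that the program's own buffer must be flushed before fsync() is called on the file descriptor.

```python
import os
import tempfile


def append_and_sync(path, records, sync_every=10):
    """Append records to path, calling fsync() after every sync_every records.

    Returns the number of fsync() calls that were made.
    """
    syncs = 0
    with open(path, "a") as out:
        for count, record in enumerate(records, start=1):
            out.write(record + "\n")
            if count % sync_every == 0:
                out.flush()             # push Python's buffer down to the OS
                os.fsync(out.fileno())  # ask the OS to store the file data
                syncs += 1
    return syncs


if __name__ == "__main__":
    # Hypothetical output file; in practice this would be a path in AFS.
    demo_path = os.path.join(tempfile.mkdtemp(), "results.dat")
    append_and_sync(demo_path, [f"step {i}" for i in range(25)])
```

Syncing after every record would keep other machines more up to date, but it gives up much of the speed benefit of the cache, so a periodic sync is usually a reasonable compromise.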

The fsync Program

The fsync program has been compiled and installed as part of the SGE (batch queueing) module. It synchronizes the given file back to the AFS server every 60 seconds. To synchronize an output file from the SGE batch system in this way (so that you can see the output as explained above), start the command below in the background from your batch script:

fsync /path/to/filea &

You can also use the $SGE_STDOUT_PATH variable in your script if you wish. $SGE_STDOUT_PATH is the path name of the file to which the job's standard output stream is redirected. A sample script using this variable is given below.

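#!/bin/csh
# SGE directives: IRIX 6 host, 4 slots in the smp parallel environment,
# mail to rich@nd.edu when the job aborts or ends.
#$ -l arch=irix6
#$ -pe smp 4
#$ -M rich@nd.edu
#$ -m ae

# Flush the job's standard output file to the AFS server in the background.
fsync $SGE_STDOUT_PATH &

# Run the Gaussian 98 job.
g98 < testDFT.com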
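
The behavior described for the fsync helper can be sketched roughly as follows. This is not the installed program's source; it is an illustrative Python sketch that assumes, as the description above implies, that an fsync() issued by any process on the same host prompts the AFS cache manager to store the file. The names sync_once and sync_periodically and the rounds parameter are hypothetical.

```python
import os
import sys
import time


def sync_once(path):
    """Open an existing file and fsync() it so cached data is written back."""
    fd = os.open(path, os.O_WRONLY)  # no O_CREAT: the file must already exist
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def sync_periodically(path, interval=60.0, rounds=None):
    """Call sync_once(path) every interval seconds.

    Runs forever when rounds is None; otherwise stops after that many syncs.
    Returns the number of syncs performed.
    """
    done = 0
    while rounds is None or done < rounds:
        sync_once(path)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
    return done


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage (hypothetical script name): python periodic_sync.py /path/to/filea &
    sync_periodically(sys.argv[1])
```

A helper started this way keeps running until it is killed, so a batch script that launches it in the background may want to stop it once the main program has finished.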

The Source

 

 

 

 

 

   

Copyright © 2003, Office of Information Technologies (OIT),
P.O. Box 539, University of Notre Dame, Notre Dame, IN 46556

Page last modified on January 8, 2003