Minutes of CRAG Meeting #19 held on 6th February 2006
======================================================

Time and Place:
---------------
13:00 - 14:30, Monday 6th February 2006, Room E1, Physics and Astronomy

Present:
--------
John Brodholt (Earth Sciences - Chair)
Clovis Chapman (Computer Science - Condor Systems Administrator)
Simon Clifford (Chemistry - Prism Systems Administrator)
Alice Fage (IS Operating Systems - C^3 Support Group)
Clare Gryce (Secretary - Research Computing Manager)
William Hay (IS Operating Systems - C^3 Support Group)
Sally Price (Chemistry)
Ben Waugh (Physics and Astronomy)
Callum Wright (Physics and Astronomy/IS - Altix and Sun Cluster Systems Administrator)

1) Apologies, Minutes, Actions
==============================

Apologies
---------
Apologies had been received from Paul Kellam and Andrew Dawson.

Minutes
-------
The minutes of the last meeting (#18) had been circulated, approved and published on the RC website.

Actions from last meeting
-------------------------
ACTION 12-4: Clare to update all CRAG facilities pages, with systems administrators providing input.
Ongoing, awaiting info for Condor update. Altix and Sun Cluster info awaiting full install of Sun Cluster and announcement of UPaCS service. New pages to include info on installed and available software.

ACTION 16-1: John to talk with William about adapting C^3 jobs for Condor.
Ongoing.

ACTION 17-1: Clare to ask Research Administration about adding an item on possible usage of Research Computing facilities to the UCL research proposal checklist.
Ongoing. Currently awaiting response from Research Admin.

ACTION 18-1: William and C^3 team to consider reporting options and prepare more detailed usage statistics for next CRAG, to determine limiting factors re usage.
Done (see below).

2) Condor Status Report & stats
===============================
No statistics had been received prior to the meeting; last year's script for generating statistics is to be updated.
January had been a quiet month. All machines have been transparently upgraded to XP.

3) Altix Status Report & stats
==============================
No statistics had been received prior to the meeting. System now running fine; recent hardware issues have been fixed. The beta test of LSF has now expired, so a new full license is being sought. The switch should not interrupt usage.

4) CCC Status Report & stats
============================
Statistics are still generated using the previous accounting procedures; 68% for January. Investigations into new accounting methods have suggested that usage should be measured on a 'per node' basis to give a more realistic measure of system usage:

- no processes on node = 0% use
- 1 process on node = 50% use
- 2, 3, 4 or 5 processes on node = 100% use

All agreed to use of the new methods.

Suggested that in future users should be asked to specify whether they wish to run their job on an 'actual' processor or a 'virtual' processor.

Another useful statistic would be wait time as a proportion of runtime. To be investigated.

Discussion about the need for a checkpointing function; need to assess user needs. Possibility of running a C^3 'user forum' meeting.

ACTION 19-1: William/Alice to consider how wait times could be meaningfully measured and reported.

5) Prism Status Report
======================
No activity to report. RCSC action to investigate possible partnership with National Grid Service (visualisation).

6) Sun Cluster Status Report
===========================
Installation now complete; a few issues outstanding with Myrinet and some minor cabling issues, all being addressed. Moving data has taken some time. All current HiPerSpace account holders have successfully moved work to the new cluster. A couple of new accounts have been requested and set up. Queues still to be implemented; some servers with MPI, others for serial jobs. Will need to keep fluid for a while. Overall, the new machines have been fairly full, though stats for January are still to be provided.

7) User requests
================
Andrey Mysovsky (Physics CMMP) - Sun Cluster - granted
Antonio Sanchez Torralba (Physics CMMP) - Altix and Sun Cluster - granted
Alexandros Kalampokidis (Physics) - C^3 - granted
David Munoz Ramo (Physics) - Altix - granted
Claudio Cazorla Silva (Physics CMMP) - C^3 - pending
Tom Trevethan (Physics) - Altix - pending

Mars Express Project - new user has been added. To be asked to fill in form for requested Condor use as well.

8) Web updates
==============
Pending.

9) News (incl. relevant items from RC Forum & RC Sub-Committee)
================================================================
Benchmarking progressing well and generating some interesting results.

ACTION 19-2: Clare to distribute interim benchmarking reports to CRAG members.

11) AOB & next meeting date
===========================

AOB
---
Noted that the charging mechanisms permitted under fEC are still not clear. Reported that UCL is hosting a meeting of all interested parties on 17th Feb to attempt to clarify the position.

Next Meeting Date
-----------------
6th March, 1-2.
Ben to try to book E1, Physics.

LIST OF CURRENT AND ONGOING ACTIONS
===================================
ACTION 12-4: Clare to update all CRAG facilities pages, with systems administrators providing input.
Ongoing, awaiting info for Condor update. Altix and Sun Cluster info awaiting full install of Sun Cluster and announcement of UPaCS service. New pages to include info on installed and available software.

ACTION 16-1: John to talk with William about adapting C^3 jobs for Condor.
Ongoing.

ACTION 17-1: Clare to ask Research Administration about adding an item on possible usage of Research Computing facilities to the UCL research proposal checklist.
Ongoing. Currently awaiting response from Research Admin.

ACTION 19-1: William/Alice to consider how wait times could be meaningfully measured and reported.

ACTION 19-2: Clare to distribute interim benchmarking reports to CRAG members.
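
The 'per node' accounting rule agreed under item 4 can be sketched in code. This is a minimal illustration only, not the C^3 team's accounting script. The function names, the treatment of more than five processes as 100%, and the averaging of node values into a single cluster figure are all assumptions not stated in the minutes. The wait-time ratio is just one possible reading of 'wait time as a proportion of runtime', which the minutes leave to be investigated under ACTION 19-1.

```python
def node_usage(process_count: int) -> float:
    """Usage fraction for one node under the per-node rule in item 4."""
    if process_count < 0:
        raise ValueError("process_count must be non-negative")
    if process_count == 0:
        return 0.0  # no processes on node = 0% use
    if process_count == 1:
        return 0.5  # 1 process on node = 50% use
    # 2-5 processes = 100% use; counts above 5 are treated the same (assumption)
    return 1.0


def cluster_usage(process_counts: list[int]) -> float:
    """Mean per-node usage as a percentage (averaging is an assumption)."""
    if not process_counts:
        return 0.0
    total = sum(node_usage(n) for n in process_counts)
    return 100.0 * total / len(process_counts)


def wait_ratio(wait_seconds: float, run_seconds: float) -> float:
    """Wait time as a proportion of runtime for a single job (hypothetical)."""
    if run_seconds <= 0:
        raise ValueError("run_seconds must be positive")
    return wait_seconds / run_seconds


if __name__ == "__main__":
    # Four nodes running 0, 1, 2 and 4 processes respectively
    print(cluster_usage([0, 1, 2, 4]))
    # A job that waited 30 minutes and ran for one hour
    print(wait_ratio(1800, 3600))
```

Under this sketch a node with a single process counts as half used, so a cluster where every node runs exactly one process would report 50% rather than 100%.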