Survey 2017: Summary of findings
From June 16th to July 30th, 2017, CÉCI users were invited to respond to an anonymous satisfaction survey.
The main questions were:
- How did you learn about us?
- Was it easy to create an account, to connect?
- In how many publications have you already acknowledged the use of the CÉCI clusters?
- What was the main problem you faced when you renewed your account?
- How interesting to you was the reply to the previous survey posted on the CÉCI website?
- What do you need? (hardware/software/policy)
- What is your typical job?
- What would your dream job be?
The form ended with a free-text field where users could leave suggestions or remarks. We received 22 comments, questions or suggestions, and those who left an email address were contacted.
More than 80 users responded to the survey, out of approximately 490 users active in the past few months. They came from all CÉCI universities, with very diverse research interests in science and engineering. We thank those users for the time they took to fill in the survey.
The present document offers a summary of all comments and suggestions made in the responses. This other document offers a synthetic view of the responses.
Acknowledgement in publications
This is the second time this question was asked in the survey. More than half of the respondents have acknowledged the CÉCI in at least one publication, for a total of 198 acknowledgements. That is a 30% increase with respect to last year's total.
Acknowledgements are the most direct way to show the usefulness of the clusters. These testimonials are very valuable for securing the funding that ensures the continuity of the project and keeps computing power accessible to researchers.
Documentation
Some respondents complained about the explanations on the FAQ page being insufficient or too long
The documentation is seen as not rich enough by some, and as too long by others. The CÉCI documentation is usually written according to the so-called inverted pyramid principle used in journalism: the most important information is at the top of the article and the details are at the bottom, so the reader can decide when to stop reading.
We are continuously working on improving and extending the support documentation, and since the beginning of this year a new CÉCI Support website is available.
This new CÉCI documentation website contains all the technical information previously found in the FAQ, with more details and a better structure organizing the different topics covered.
The website also supports searching for keywords across all the pages; notice the 'Search docs' box at the top of the left frame. Try performing a search for 'ssh'.
Some Windows users complained about the difficulty of setting up the SSH client
The problems mentioned concerned converting the CÉCI SSH private key to the correct format, or having to juggle several different applications to set up the login environment.
We are aware of these difficulties when working in a Windows environment, and we have been working on an alternative to the previous Xming+PuTTY+WinSCP suite. The support page now provides a detailed guide for configuring the MobaXterm application on Windows.
We suggest that all Windows users give it a try, since this single tool provides a complete environment for connecting to the clusters and for copying files to and from them. In addition, no conversion is required to use the CÉCI private key that you receive by email.
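For reference, MobaXterm bundles an OpenSSH-compatible client in its local terminal, so the connection can also be tested with a single command. This is only a sketch: the login name and key path are illustrative, and the front-end hostname placeholder should be replaced by one of the hostnames listed on the CÉCI clusters page.
$ ssh -i ~/.ssh/id_rsa.ceci myCECIlogin@<cluster-frontend>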
One respondent mentioned that finding information about 'lm9' was not easy
There is no lm9 cluster in the CÉCI, so it is expected that no information about it can be found in the CÉCI documentation. The list of machines the CÉCI is responsible for can be found on the CÉCI clusters page.
Several respondents asked for training, support and tutorials for new users.
Each year, around October, the CÉCI organizes a training session at the UCL for all users, especially beginners. The FAQ, and the tutorials referred to therein, are designed to help users during their first steps on the clusters.
Common storage
This year we finally started supporting a common storage system that is visible to all compute nodes of all clusters. Several users expressed their satisfaction with this new option, which they are already making use of.
Some users were not aware of the common storage feature.
The CÉCI support website contains a detailed explanation of how to use the common storage solution.
In addition, this year's CÉCI training session 'Introduction to data storage and access' will explicitly cover how to make use of the common storage.
To summarize, the common space can be accessed from the clusters of all the CÉCI universities through the $CECIHOME environment variable; you can list its contents with the command
$ ls $CECIHOME
In the long term, this filesystem will be the default home on the clusters, but at the moment you need to copy files there explicitly with the cp command.
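For instance, a minimal sketch (the file name results.tar.gz is purely illustrative):
$ cp results.tar.gz $CECIHOME/
$ ls $CECIHOME
results.tar.gz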
There is a 100GB quota on the main storage; to get your current usage, use the quota command from one of the clusters. If this command does not list the central storage, try first listing its contents with ls; it should appear afterwards.
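A minimal sequence illustrating that workaround:
$ ls $CECIHOME > /dev/null    # a first access makes the central storage known to quota
$ quota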
We refer again to the documentation for all the details, and to learn about $CECITRSF, the extra common space for fast transfers between the scratch partitions of the clusters.
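As an illustration of the intended workflow, a sketch assuming the cluster's scratch partition is reachable through the $GLOBALSCRATCH variable and using a hypothetical file bigfile.dat:
$ cp $GLOBALSCRATCH/bigfile.dat $CECITRSF/    # on the source cluster
$ cp $CECITRSF/bigfile.dat $GLOBALSCRATCH/    # later, on the destination cluster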
One user mentioned that moving files between clusters with scp is more efficient than copying to the common space
While this might be valid for small files, it is certainly not the case for big files (~1 TB) when using the $CECITRSF space.
One respondent mentioned not being able to use the common storage due to a buggy 'quota exceeded' error
The CÉCI common storage is a rather complicated system with many parameters and degrees of freedom (read the answer below carefully for more details). Quotas in particular are complicated to set due to the asynchronous nature of the replication. We had to adapt the quotas a few times in the early weeks to cope with problematic situations; it should now be more stable in that respect. Thanks to everyone for their patience.
One user mentioned being confused by the different information obtained with 'du -sh' for the same file on different clusters
This is a very interesting point, as different results with du -sh can be obtained not only on a complex solution such as the CÉCI common storage, but also for the same files stored on standard partitions with different file systems.
The solution implemented for the CÉCI common filesystem is based on the GPFS cluster filesystem developed by IBM. In addition, to keep the data on the common partitions synchronized among the different geographical locations of the CÉCI clusters, it makes use of the Active File Management (AFM) feature. Within this setup, a global namespace is defined to share a single filesystem among the different gateway nodes, which are physically deployed at each of the CÉCI universities.
When copying, for example, a 6MB file from your nic4 home folder to $CECIHOME and performing a du -sh, you will obtain:
$ cp file_6MB.dat $CECIHOME/
$ du -sh $CECIHOME/file_6MB.dat
6.0M $CECIHOME/file_6MB.dat
In order to avoid the problems inherent to wide-area interconnect latencies, the other gateways in the AFM setup are served only the metadata of the files created on one of them. That is to say, your file_6MB.dat file will actually be copied, synced and stored on both main storages, at ULiège and UCL, but the other gateways will only have the information that the file exists on the common space; the actual contents will be transferred only on demand.
Then, if after the previous steps on nic4 you log in to dragon1 and perform du -sh, you will see:
$ du -sh $CECIHOME/file_6MB.dat
0 $CECIHOME/file_6MB.dat
If on dragon1 you access the file, i.e. open it with an editor, do a cat or a less, etc., then the actual data contained in the file will be transferred from one of the main storages to the gateway; afterwards you should get:
$ du -sh $CECIHOME/file_6MB.dat
6.0M $CECIHOME/file_6MB.dat
To summarize, you should never rely on du -sh to verify whether a file is properly stored or copied; this is valid in general for any kind of filesystem. A more appropriate action is to check file consistency with a hash tool and verify that you get the same output on the different gateways:
$ md5sum $CECIHOME/file_6MB.dat
da6a0d097e307ac52ed9b4ad551801fc $CECIHOME/file_6MB.dat
If, for some reason, you want to know the approximate space taken by a file or directory on $CECIHOME, then add the --apparent-size option to du:
$ du -sh --apparent-size $CECIHOME/file_6MB.dat
6.0M $CECIHOME/file_6MB.dat
The output should be nearly the same on all the clusters, regardless of whether the files were accessed or not. Notice that the man page for du defines the tool as 'du - estimate file space usage'.
If you want to know your current space usage on the $CECIHOME area, you should always do so with the quota command. The output must return the same information for your usage on /CECI/gateway/home on all the clusters. In case it does not, please submit a ticket on the CÉCI Support page.
Support
Some users complained about their running or queued jobs being killed
Jobs are killed by a sysadmin only when they represent a potential problem for keeping the cluster running. They can also be killed automatically when they enter some error state.
In any case, when an issue of this kind occurs, please contact the local system administrator of the cluster where your job was killed right away, to understand why that action was required.
It might not be your fault, but in case it was, it is important to understand what happened so as to avoid running into the same problem again.
One respondent mentioned that getting access to zenobe takes a very long time
Zenobe is a very large machine (Tier-1 level, as opposed to Tier-2 level for the CÉCI clusters) so getting access to it requires a few administrative steps like submitting a project. This is a decision that was made by the funding agency in coordination with the Vice-Rectors of our universities. But when a project already exists, adding a user to a project is done within 24 hours on average.
If you have no answer after that time frame, you can always contact Cenaero again, as well as the sysadmin of your local university and the CÉCI logisticians.
One respondent who moved between CÉCI universities mentioned there was no procedure to change emails when the old one had already expired
The procedure to change the email address indeed requires both the old and new addresses to be usable. As a measure to protect logins from identity theft, in the event your old email address has expired, you need to contact the system administrators of both your former and new universities and offer proof that you have indeed changed emails.
One user complained about a required software installation taking too long
Users must keep in mind that some system administrators have many duties other than taking care of the CÉCI clusters and are not backed by a team. Thus, they must prioritise their tasks according to the impact each task has on the cluster's usability. Feel free to contact them by phone for a more precise answer if you feel you have been waiting for too long.
One respondent complained about receiving an inappropriate response to simple questions
The CÉCI team gives high priority to maintaining useful and practical documentation about the clusters' usage and to explaining how the different components work so that they can be used efficiently. When a user has questions about these topics, it is mandatory to go through the documentation before contacting the system administrators, to verify whether the questions are not already answered there.
One respondent asked for Matlab on the clusters
A training session is dedicated to using Matlab on the clusters by means of the Matlab Compiler, to avoid the licensing issue. You can take a look at last year's slides or join the 2017 edition.
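For reference, a minimal sketch of that workflow (the source file mysim.m and the runtime path are purely illustrative): the Matlab Compiler produces a standalone executable together with a run_*.sh wrapper that takes the Matlab Runtime location as its first argument.
$ mcc -m mysim.m                             # on a machine holding a compiler licence
$ ./run_mysim.sh /path/to/MATLAB_Runtime     # on the cluster, no Matlab licence needed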
Resources
Some respondents complained about the different configurations of libraries and modules between clusters
Making the software modules uniform across clusters is indeed something we are working towards. Now that we have the common storage installed, we are closer to providing a solution of this kind. A CECI/Soft partition has been created that will be used to store all the modules and compiled software, in order to provide a homogeneous configuration on all the clusters.
Some users asked for software to be updated more frequently
Once we move towards uniform software modules, their maintenance will be centralized, which will make it easier to keep the software stack up to date.
One particular user asked specifically for gcc and gfortran to be kept up to date more often; we will try to update the forthcoming CÉCI soft space at least once per year. But in case you need a specific version of a code or compiler for some reason, you can request it from the local system administrator, or follow the instructions in the documentation to compile software from sources on your own.
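In the meantime, you can check which versions are already installed on a given cluster; a sketch (the grep pattern is illustrative, and module writes its listing to standard error, hence the redirection):
$ module avail 2>&1 | grep -i gcc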
One user requested the possibility to compile Fortran CUDA codes
CUDA for Fortran is only supported by the PGI Fortran compiler. This compiler is bundled in a commercial suite and thus is not available on all the clusters. Among the CÉCI clusters which have GPUs available, Dragon1 from UMons is the only one with a recent version of the PGI compilers. After loading the module pgi/17.7, the pgfortran compiler it provides should be able to compile current versions of CUDA Fortran codes.
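As a minimal sketch (the source file saxpy.cuf is purely illustrative; the .cuf extension tells pgfortran to enable CUDA Fortran):
$ module load pgi/17.7
$ pgfortran saxpy.cuf -o saxpy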
For running the job, remember to request the GPU generic resource by adding to the Slurm script:
#SBATCH --gres="gpu:1"
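Putting it together, a minimal job script sketch (the job name, time limit and executable name are illustrative):
#!/bin/bash
#SBATCH --job-name=saxpy_gpu
#SBATCH --ntasks=1
#SBATCH --time=01:00:00
#SBATCH --gres="gpu:1"

module load pgi/17.7
./saxpy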
One user asked for a uniform greeting message on all the clusters
We are working on it, but in the end the final configuration of a cluster is the responsibility of the local team. Do not hesitate to ask them directly if there is some specific information you would find useful to be shown.
One respondent asked for more resources for the frontends, as they are sometimes clogged by users running many things on them
The only resource-intensive runs that should take place on the frontends are compilations of code from source. It may rarely happen that several compilation procedures take place at the same time, making the frontend temporarily a bit unresponsive.
Other than that, scripts running on the frontends should be restricted to very basic tasks that finish in seconds; otherwise, one of the fast or post-processing queues should be used to run them. If you detect a long-running, heavily resource-consuming batch script on a frontend, please contact the local system administrator to see what action should be taken.
One respondent asked for scratch space on zenobe nodes for intensive I/O runs
We will transmit this request to the Tier-1 users' committee.
One user requested the possibility to deploy virtual machines on the clusters
This topic is being reviewed at the moment as it seems clear that HPC and Cloud are converging. At the moment, no infrastructure is available for that unfortunately. Lemaitre3, the next CÉCI cluster to come at the UCL, will have tools installed to run Linux containers in jobs, but not (yet) for virtual machines.
Some respondents asked for bigger and faster scratch storage up to 15TB
As part of the scheduled upgrades of the CÉCI cluster infrastructure, the next cluster, Lemaitre3 at the UCL in early 2018, will feature 600TB of very fast parallel scratch storage.
Scheduling
One respondent mentioned that a user monopolized half of lemaitre2 for two weeks and that balancing is better implemented on nic4
As lemaitre2 is often heavily loaded, it is nearly impossible for a single user to grab a large portion of the cluster. The rare situation of a single user occupying half of lemaitre2 can only arise in specific circumstances, for instance after a cluster restart following a maintenance period.
In that case, the situation can last at most 3 days, the maximum running time allowed on lemaitre2; after that, the fairshare of the user drops and the priority of their subsequent jobs decreases substantially. On lemaitre2, we favour having no limits over hard limits that would prevent jobs from starting when the cluster is not fully used.
Nic4 appears more balanced because, being the cluster with the lowest maximum running time of all, it has the highest turnover.
One user requested that we keep working towards a unique scheduler for all clusters
This is a work in progress, and something we will start to implement and test as newer generations of clusters are rolled out at the CÉCI.
Several respondents asked for longer time limits for jobs.
This question is often asked. Some of the clusters are configured to favour jobs that scale well, i.e. where you can trade job wall time for number of CPUs, because a lot of money has been spent on a very fast interconnect (InfiniBand).
Users must also take into account the fact that long jobs are incompatible with short waiting times: one user actually requested setting the wall time to less than 20h to increase turnover. The current limits are chosen to best accommodate the very different requirements of all CÉCI users, as explained in the section about max wall time.
Users are encouraged to try checkpointing software such as DMTCP (http://dmtcp.sourceforge.net). A training session is dedicated to it for CÉCI users; it will not be organized this year, but you can take a look at the slides available online.
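As a minimal sketch of how such a tool is typically used (./my_long_job is a hypothetical executable and the checkpoint interval is illustrative):
$ dmtcp_launch --interval 3600 ./my_long_job   # checkpoint every hour
$ dmtcp_restart ckpt_*.dmtcp                   # resume from the checkpoint files after the job is killed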