12.07.2015 Views

Ganga Tutorial - CSC

Ganga Tutorial - CSC

Ganga Tutorial - CSC

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

User interfaces*** Welcome to <strong>Ganga</strong> ***Version: <strong>Ganga</strong>-4-2-8Documentation and support: http://cern.ch/gangaType help() or help('index') for online help.In [1]: jobsOut[1]: Statistics: 1 jobs CLIP--------------# id status name subjobs application backendbackend.actualCE# 1 completed Executable LCG lcgcompute.hpc.unimelb.edu.au:2119/jobmanageGUI#!/usr/bin/env ganga#-*-python-*-import timej = Job()j.backend = LCG()j.submit()while not j.status in [‘completed’,’failed’]:print(‘job still running’)time.sleep(30)./myjob.execganga ./myjob.execIn [1]:execfile(“myjob.exec”)GPI & ScriptingEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 8


<strong>Ganga</strong>• <strong>Ganga</strong>: Job Management Tool– a utility which you download to your computer or it is already installed in your institute in a shared area• for example: /nfs/sw/ganga/install/5.1.0– <strong>Ganga</strong> is an add-on to installed software Grid UI must be installed separately Batch system should be set-up separately NorduGrid ARC middleware is shipped with <strong>Ganga</strong><strong>Ganga</strong>Application SoftwareLSF ClientLCG UIEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 9


Hello World Examples#1. Hello World Locallyj = Job()j.backend=Local()j.submit()#2. Hello World on Nordu Gridj = Job()j.backend=NG()j.submit()#3. Hello World on EGEEj = Job()j.backend=LCG()j.submit()You may use allfeatures of a first-classprogramminglanguage to writecomplex scriptsloops, ifs, variablesfiles, math, networkmodules,...EGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 10


Real Application: the ATLAS data analysis application$ ganga athena \--inDS myInputDataset.txt\--outputdata myOutput.root \--split 3 \--maxevt 100 \--lsf \jobOptions.pyScripting modequickj = Job()j.application=Athena()j.application.prepare()j.application.option_file='jobOptions.py‘j.inputdata=DQ2Dataset()j.inputdata.type='DQ2_LOCAL'j.inputdata.dataset=“myInputDataset.txt”j.outputdata=DQ2OutputDataset()j.outputdata.outputdata=[‘myOutput.root']j.splitter = AthenaSplitterJob(numsubjobs=3)j.merger = AthenaOutputMerger()j.backend = LSF()j.submit()CLIP modeapplicationinputdataoutputdataSplitter & Mergerflexiblej2 = j.copy()j2.backend=LCG( CE=’ce102.cern.ch:2119/jobmanager-lcglsf-grid_2nh_atlas’ )j2.submit()EGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 11


Behind the scenes ...User has access tofunctionality of GANGAcomponents throughGUI, CLI or batch scriptsapplicationconfigurationEGEE-II INFSO-RI-031688storage and retrieval of jobinformation in a local orremote DB, storage ofsandbox filesDataManagementinput data selection,output location specification<strong>CSC</strong> Finland – EGEE Training 12


GANGA User CommunitiesMore than 2300 individual usersHARPGarfieldFusionLQCDSixtrack/LHCEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 13


More than 1400 unique users>2300 different users, ~300-500 users monthly~60% Atlas~25% LHCb~15% othersEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 14


<strong>Ganga</strong> usageInstalled at ~200 sitesCLIP and scripts most popularEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 15


Why not portal?• Many (not all) scientific applications are:– developed/modified/extended on local machine– use local resources (files, environment)– are scripted (e.g. higher-level logic is built around them)• What user mobility (desktop vs laptop)?– <strong>Ganga</strong> may use remote repository (and workspace)– So user may see his jobs submitted from a different machine andinteract with them (resubmit, kill)...• <strong>Ganga</strong> approach:– lean and neat utility with interface in Python– + GUI if you like– + scripting if you like moreEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 16


More info.• <strong>Ganga</strong> Home: http://cern.ch/ganga• Official <strong>Ganga</strong> User’s Guide:http://cern.ch/ganga/user/html/<strong>Ganga</strong>Introduction/• Looking for help: project-ganga-developers@cern.chEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 18


The <strong>Ganga</strong> development team- <strong>Ganga</strong> is supported by HEP-Support for development work-Core team:F.Brochu (Cambridge), U.Egede (Imperial), J. Elmsheuser (Munich),K.Harrison (Cambridge), H.C.Lee (ASGC Taipei), D.Liko (CERN), A.Maier (CERN),J.T.Moscicki (CERN), A.Muraru (Bucharest), W.Reece (Imperial), A.Soroko (Oxford),CL.Tan (Birmingham), D.Vanderster (CERN)EGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 19


Enabling Grids for E-sciencEPart I: Practical <strong>Ganga</strong>www.eu-egee.orgEGEE-II INFSO-RI-031688EGEE and gLite are registered trademarks


Download, Install, First launchwget http://cern.ch/ganga/download/ganga-installDownload &Installdownload installerpython ganga-install \--prefix=/usr/local/ganga/prefix \--extern=<strong>Ganga</strong>GUI,<strong>Ganga</strong>Plotter \5.0.0<strong>Ganga</strong> versioninstallation prefixInstallation of external modulesFirst Launchexport PATH=$HOME/opt/ganga/install/5.0.0/bin:$PATHstart <strong>Ganga</strong> with inline configurations$ ganga*** Welcome to <strong>Ganga</strong> ***Version: <strong>Ganga</strong>-5-0-0Documentation and support: http://cern.ch/gangaType help() or help('index') for online help.In [1]:Do you really want to exit ([y]/n)?<strong>Ganga</strong> CLIP-D to exit <strong>Ganga</strong> CLIPEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 21


Submitting a Job with <strong>Ganga</strong>What is a <strong>Ganga</strong> Job?Run the default job locally:Job().submit()Default job on the EGEE grid:Job(backend=LCG()).submit()Listing of the existing jobs:jobsGet help (e.g. on a job):help(jobs)Display the nth job:jobs(n)Copy and resubmit the nth job:jobs(n).copy().submit()Copy and submit to another grid:j=jobs(n).copy()j.backend=NG()j.submit()Kill and remove the n th job:job(n).kill()job(n).remove()EGEE-II INFSO-RI-03168822


ConfigurationSyntax of .gangarc file[Configuration]TextShell = IPython... ...[LCG]GLITE_ENABLE = True... ...Python ConfigParser standardHard-coded default configurationsrelease configHow to set configurationssite configexport GANGA_CONFIG_PATH=/usr/local/ganga/prefix/install/etc/Gilda.ini:<strong>Ganga</strong><strong>Tutorial</strong>/<strong>Tutorial</strong>.iniganga --config-path=/usr/local/ganga/prefix/install/etc/Gilda.ini:<strong>Ganga</strong><strong>Tutorial</strong>/<strong>Tutorial</strong>.ini~/.gangarcganga -ouser configOverride sequenceuser config > site config > release configEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 23


`gangadir`• gangadir folder is created at the first launch within $HOME directory• To locate it in different directory:– [Configuration] gangadir = /alternative/gangadir–• Job Repository may also be stored remotely in a databaseMetadata of jobsData of jobsEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 24


Some handy functions• completion• for cmd history• system command integrationIn[1]: j = jobs[1]In[2]: cat $j.outputdir/stdoutHello World• Job template• In[1]: plugins()– plugins(‘backends’)• In[2]: help()• etc.In[1]: t = JobTemplate(name=’lcg_simple’)In[2]: t.backend = LCG(middleware=’EDG’)In[3]: templatesOut[3]: Statistics: 1 templates--------------# id status name subjobsapplicationbackendbackend.actualCE# 3 template lcg_simpleExecutableLCGIn[4]: j = Job(templates[3])In[5]: j.submit()EGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 25


Enabling Grids for E-sciencEPart II: <strong>Ganga</strong> hands-onwww.eu-egee.orgEGEE-II INFSO-RI-031688EGEE and gLite are registered trademarks


Step 0: launch <strong>Ganga</strong> CLIPhttps://twiki.cern.ch/twiki/bin/view/ArdaGrid/EGEE<strong>Tutorial</strong>Package• Skip the installation step• Start your <strong>Ganga</strong> CLIP session:shell> gangaEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 27


In [1]: !pico myscript.sh#!/bin/shecho “Hello ${1} !”echo $HOSTNAMEcat /proc/cpuinfo | grep 'model name’cat /proc/meminfo | grep 'MemTotal‘echo “Run on `date`”Step 1: Your first <strong>Ganga</strong> job - an arbitrary shellscriptIn [2]: !chmod +x myscript.shIn [2]: j = Job()In [3]: j.application = Executable()In [4]: j.application.exe = File(‘myscript.sh’)In [5]: j.application.args = [‘Helsinki’]In [6]: j.backend = Interactive()In [7]: j.submit()In [8]: jobsEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 28


In [9]: j = j.copy()In [10]: j.backend = Local()In [11]: j.submit()In [12]: jobsIn [13]: j.peek()In [14]:cat $j.outputdir/stdoutStep 2: Your first <strong>Ganga</strong> job - an arbitrary shellscript./myscript.sh BudapestEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 29


Step 3: your first <strong>Ganga</strong> job on the GridIn [15]:j = j.copy()In [16]:j.backend = LCG()In [17]:j.application.args = [‘Somewhere in the world...’]In [18]:j.submit()In [19]:jIn [20]:cat $j.backend.loginfo(verbosity=1)In [21]:jobsEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 30


Exercise: Prime factorization• Please follow the instructions from:https://twiki.cern.ch/twiki/bin/view/ArdaGrid/EGEE<strong>Tutorial</strong>Package#Exercise_Prime_number_factorizatEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 31


Enabling Grids for E-sciencEPart III: Advanced use-caseswww.eu-egee.orgEGEE-II INFSO-RI-031688EGEE and gLite are registered trademarks


Dashboard monitoring• analysis and simulations jobs run in ATLAS– one month (mid-March to mid-April 2007) 20K analysis jobs, 10K simulation jobs– only LCG jobs, others not shown– data collected by a ARDA Dashboard sensor integrated with<strong>Ganga</strong>EGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 34


Integration with frameworks200K short jobs in 12 hours (500CPUh)• Reduced Grid overhead• Efficient workloadbalancing• Fast error-recovery• Automatic execution• InteractivityEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 35


Web portal for biologists• Interface created by biologists (Model-View-Controller design pattern)– The Model makes use of <strong>Ganga</strong> as a submission tool and DIANE to better handle docking jobs on the Grid– The Controller organizes a set of actions to perform the virtual screening pipeline; The View represents biologicalaspectsEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 36


EGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 37


Extending <strong>Ganga</strong>SplitterMergergeneric logicApplication pluginDatasetsspecific backendbindingDetailed guide available at:https://twiki.cern.ch/twiki/bin/view/ArdaGrid/HowToPlugInNewApplicationTypeA good example: python/<strong>Ganga</strong><strong>Tutorial</strong>/Lib/*.pyEGEE-II INFSO-RI-031688<strong>CSC</strong> Finland – EGEE Training 38


Remote backend• Remote backendssh channel:<strong>Ganga</strong> JobInput SandboxNormal <strong>Ganga</strong> JobSubmissionLocal <strong>Ganga</strong> InstanceLocal Analysis CodeRemote <strong>Ganga</strong> Instance(e.g. lxplus)Exec HostBatch or Grid• Remote allows users to submit jobs from the anywhere, even ifyou don’t have the batch/grid client tools installed:– Local ganga packages the input sandbox, connects to a remote gangainstance via ssh– The remote instance actually submits and monitors via the realbackend (e.g. LCG/Dirac/Panda)– Useful for example to work around quota issues etc…EGEE-II INFSO-RI-03168839


GridWay and InterGrid backends• There are novel contributions coming from outside the core<strong>Ganga</strong> team:– KISTI (Korea Institute of Science and Technology Information) and theWISDOM project Continued effort to use grids for Avian flu, Malaria drug searches– Some of their resources couldalready be reached by the existingLCG/gLite backend.– But, other resources areGlobus sites, managed byGridWay. No GridWay backend, so theydeveloped one.– Further, they wanted to hide thebackend details from the users: InterGrid backend: selects betweenLCG and GridWay backends using load information.– See Soonwook Hwang et al. “An Approach to Grid Interoperability using<strong>Ganga</strong>”. EGEE User Forum, Catania, March 2-6 2009EGEE-II INFSO-RI-03168840


Testing Framework & Releases• One of the strengths of <strong>Ganga</strong> isthe stable release procedure andextensive testing• Release Procedure:– <strong>Ganga</strong> developers rotate through 6week terms as release manager– Release manager’s job is quiteeasy, most of the process isautomated.– When enough tags have beencollected, a pre-release is created• Testing Framework:– Each pre-release is validated withnearly 500 test cases– Tests of new features and toprevent regressions– Ideally, each bugfix gets a test case.Failures usually due to timeoutsEGEE-II INFSO-RI-03168841


<strong>Ganga</strong>Robot and HammerCloud• <strong>Ganga</strong> is not only useful forend users:– The Python API allowsdevelopment of Grid applications• Two examples come from theATLAS Distributed AnalysisTesting tools.• <strong>Ganga</strong>Robot is used to runmany short functional testsdaily to continuously validatethe DA workflows.– Results are fed into <strong>Ganga</strong> so thatbroken sites can be avoided.• HammerCloud is used to runlarge stress tests:– Measuring the behaviour of thestorage, networks, databases,etc… under load.• See talk “Functional and Large-Scale Testing of the ATLASDistributed Analysis Facilitieswith <strong>Ganga</strong>” 26-Mar-2009 @15:20EGEE-II INFSO-RI-03168842


EGEE-II INFSO-RI-031688http://cern.ch/ganga

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!