12.07.2015 Views

Migration Planning Guide: IBM Rational ClearCase to Perforce SCM


Perforce Software’s migration tool for ClearCase migrations is “front door” with respect to extraction from ClearCase, and “back door” with respect to importing into Perforce. Being front door on the ClearCase side enables the tool to work across many ClearCase versions, while being back door on the Perforce side allows it to generate the most accurate representations of ClearCase history in terms of Perforce journal records.

Custom Scripting

A detailed migration sometimes involves custom scripting due to differences in ClearCase usage and any oddities in a particular data set.

Hardware Capacity Planning

Hardware capacity planning may be impacted significantly with DHI migrations. An SCM system with, for example, 12 years of history would require more hardware (more disk space, more RAM, faster CPUs and I/O subsystems, etc.) than one with no history. If you import 12 years of detailed history, your new Perforce system will initially require as much hardware as if it had been in operation for 12 years.

Pros of a Detailed History Import

• Most complete approach. Detailed history imports can transfer the most historical detail, including branching history, from ClearCase to Perforce.
• After the migration, comprehensive historical research and “merge forensics” can be done in Perforce without the need for going back to ClearCase (we recommend keeping one user license in ClearCase as a backup).
• The ability to view file history with Perforce’s powerful visualization tools like Time-Lapse View can shed new light on the evolution of source code and help increase understanding of the changes over time.
• There is an increased benefit for systems integrated with version control. For example, the meaning of the linkage between a set of files originally modified in ClearCase and an issue from your issue tracking system can be maintained. Code review tools such as SmartBear Code Collaborator will have greater context.
• Unlike Perforce, ClearCase does not validate the integrity of versioned file contents using checksums. File corruption, for example, because of disk failures, goes undetected.[1] Once historical data is in Perforce, it will gain the benefit of checksum verification of contents of all revisions, which improves IP provenance.

1. ClearCase does have a “checkvob” utility that can detect and fix some forms of metadata corruption. However, this utility does not detect data container corruption, and thus the contents of versioned files cannot be audited.

Cons of a Detailed History Import

• Detailed history import tools have a variety of limitations and technical caveats because of the potential complexity of ClearCase environments, including unusual patterns (or even corruption) in the data. So-called evil-twin elements or branching scenarios created by misconfigured config specs can be difficult to follow.
• Complex migrations can mean potential schedule and budget risks for the migration project.

Baseline and Branch Import (BBI)

The BBI strategy provides a lightweight migration alternative that is more sophisticated than the Tips approach and avoids the technical complexity and schedule and budget risks of detailed history imports.

BBI is a generic from-anything-to-Perforce process, and has been used to migrate to Perforce from a variety of SCM systems, including IBM Rational ClearCase, Borland StarTeam, Merant PVCS, Subversion, Mercurial, CVS, Microsoft Visual SourceSafe, AccuRev, and even unsophisticated SCM “systems” like a set of network drives with directories named for releases.

With the BBI approach, the “interesting history” to be imported is specified using a branch diagram that shows the baselines (snapshots of a directory structure at a specified point in time) and major branching operations. For example, Figure 1 represents a software product. The baselines (blue dots) indicate the “interesting versions” that are to be imported. The arrows indicate branching operations that affect an entire branch. In this example, a 2.0-Rel branch has been created, and four patches were created from that branch. Two of those four patches have been merged back to MAIN. The BBI process imports all the baselines, records that the two patches were merged with resulting updates to MAIN, and tracks the two unmerged patches. After migrating, you can use Perforce to complete those merges.

Figure 1: Sample baseline and branch diagram (branches: MAIN, 2.0 Rel, 3.0 Rel; baselines: 1.0, 2.0, 2.1, 3.0, 4.0; patches: p1 to p4)

Importing the branching operations enables Perforce to select common ancestors when merging, so you can resume branching activities after migration. The BBI process imports branching operations at a high level, capturing the sum of merge operations. For example, in Figure 1, the merge of p2 back to MAIN most likely consisted of a series of merges by several developers. The individual file merges are not tracked, but the sum of the results of the merges (file adds, edits, and deletes) is tracked. The imported baseline represents the point in time when the merge of p2 is complete.

The intent of this approach is to bring over just enough branching history to answer key questions such as what did Release 2.0 look like, where was this file branched from, and what files do I need in my workspace to start maintenance work on Release 2.3? The BBI approach preserves file contents at key points and preserves enough branching history so that the switch to Perforce can happen at any point in the release cycle, rather than just at “convenient points” in the schedule (which tend to be hard to find).

After conversion, Perforce contains the history of your software product viewable using the Revision Graph tool, and the history is similar to what would have been recorded if development was done using Perforce to begin with. Detailed data is discarded: you know what your product looked like at Release 1.0 and Release 2.0, for example, but hundreds of check-ins between those baselines are discarded, as are the user ID, date, time, and check-in comments.

Accurate diagrams are essential for planning a BBI migration. Ideally, release engineers can quickly draw an accurate branch history for each software product to be imported. If that is not the case, such information can be extracted by exploring ClearCase manually. After the diagram is drawn and verified, it is translated into the Perforce commands that re-create the baselines in Perforce. The first


4. Terminology and Concepts

The following sections map ClearCase concepts to their Perforce equivalents.

VOBs and Depots

A Perforce depot is roughly analogous to a ClearCase VOB (which stands for “versioned object base”). VOBs and depots both appear as top-level directories to users, and both store a set of files. One VOB or depot must exist before any file can be versioned.

A VOB is a container for versioned file contents and metadata relating to those versioned files. A Perforce depot contains only the contents of versioned files. All metadata is stored in a database on the Perforce server.

When mapping VOBs to depots, consider the following:

• Unlike files in VOBs, files in Perforce can be branched to other depots. VOBs in ClearCase are effectively islands of code.
• Perforce manages binary files efficiently, enabling you to manage all your digital assets. ClearCase sites avoid storing large numbers of binaries because of performance considerations. For example, a software product might consist of source code, software products built from that source code, and various released configurations of software products. A separate depot might be assigned for each, for example, //gizmo, //gizmo-build, and //gizmo-release.
• Perforce depot names should be kept short. //Engineering is OK, but //Eng is better. //Eng-AdvancedTechnologyGroup is a bit long, so //Eng-ATG is better.

Regions and Protections

In ClearCase, network registry regions are used to segregate VOBs. A region sees only a subset of all VOBs in a ClearCase installation.

To achieve similar segregation in Perforce, you use the protections table. Users are assigned to different Perforce groups. Access is then managed at the group level by assigning permissions to groups.

VOB Servers and the Perforce Server

In ClearCase environments, you might have multiple VOB server processes, possibly distributed among multiple VOB server machines. With Perforce, a single Perforce Server can support an entire installation.[3] The Perforce Server process, P4D, runs on a single machine, is frugal with system resources, and is much less demanding than ClearCase. One P4D process can scale to support extremely large environments (for example, 10,000+ users) on a single server using enterprise-grade server machines.

One of the first steps in any migration is to set up Perforce hardware. It is common to allocate two or three identical server machines to Perforce to achieve high availability and disaster recovery goals. A typical configuration is two servers (a primary and a hot spare) in a primary data center and a third server (warm spare) in another data center located far from the primary data center.

3. Deploying multiple Perforce Server instances within an enterprise is possible and common. For the purposes of this document, we consider only the single-server-per-enterprise approach because that best suits most ClearCase migration scenarios.

Operating System Selection

The primary factor in selecting an operating platform for Perforce is the platform that the local IT group is most comfortable supporting. In mixed Windows/*nix environments, Unix platforms are almost invariably selected for ClearCase, because of case sensitivity reasons and because of its reliance on effectively monodirectional (Unix → Windows) file system mounts to make VOB data accessible to clients on both Unix and Windows.

You can configure the Perforce server on Windows or *nix in multiplatform environments. Only a TCP connection is needed between clients and servers. You can configure the case-sensitivity behavior of the Perforce Server independently of the platform.

Registry and License Servers

Perforce does not require license or registry server processes or hardware to support it. A simple 0.5KB license file on the Perforce Server machine composes the entire Perforce license mechanism. For large environments with lots of turnover, Perforce provides alternative licensing schemes that supply extra licenses for growth.

Release Servers and Installation

With ClearCase, to ensure users run client software that is compatible with the current version of the server, you must manage server and client versions carefully. To aid in ensuring consistency, some ClearCase installations deploy a release server, which is a defined network resource from which all users are expected to download correct versions of client software. This approach provides a more scalable alternative to making sure everyone has the installation CD available.

With Perforce, all client components (and the server) install in minutes over the Web. More importantly, Perforce clients and the server have a very flexible forward- and backward-compatible relationship, because of a version-aware client/server protocol. Users can generally run client versions that are older or newer than the server. Client programs simply hide or disable features that require newer versions of the server, and new server versions rarely require client upgrades.

For Windows sites with a large number of users, Perforce supports a centrally configured, automated deployment of Perforce. You can use such sites to ensure that users download consistent, trusted versions of software that are supported by IT and/or release engineering staff. However, because of extreme compatibility and extreme ease of installation, maintaining such areas is much less of a requirement than in ClearCase environments.

View Servers, Protecting Unversioned and Checked-Out Files

Perforce stores all metadata in a database on the central server, managed by the P4D process.[4] A ClearCase View Server process has no equivalent in Perforce, and administrators don’t need to allocate and configure View Server machines.

When dynamic views are used, View Server machines contain the contents of checked-out and unversioned view-private files. Some organizations back up view storage areas regularly to protect against the loss of such files. If protecting checked-out and unversioned files is a priority, you will need to back up client machines.

Some organizations devise a process to protect unversioned and checked-out workspace files in Perforce, such as locating workspaces on network drives that are backed up. During Perforce training, users are advised to avoid keeping files checked out for too long, using sandbox branches if needed.

4. Some GUI programs temporarily cache metadata in running processes, but such information is not persisted.

ClearCase MultiSite versus Perforce Proxies

If your ClearCase environment relies on MultiSite, you will find the Perforce Proxy mechanism simple and effective. Perforce proxies cache the contents of versioned files at remote sites, greatly reducing dependency on the WAN.


Proxies do not cache any metadata, thereby ensuring that there is a single source of state information and eliminating the need for branch mastership, scheduling batch replication, and so on.

Replicated VOB servers generally run on server machines that are similar to the primary server, thus requiring similar-tiered data centers. Because of the significant investment in licenses, hardware, and administrative overhead, MultiSite installations are used only where major development centers exist, and they are of little use to small, geographically diverse teams.

By contrast, Perforce Proxy servers are lightweight programs that can run on desktop-grade hardware, even in enterprise environments. Proxy servers do not require high-performance hardware or large amounts of disk space, so they can be deployed anywhere that a few developers gather. In some cases, individual users deploy Perforce Proxy instances in small offices without IT support. There are no additional costs or licenses required to deploy a Perforce Proxy server.

ClearCase Views with Perforce Workspaces

The term “workspace” is familiar to both ClearCase and Perforce users: it is where developers manage files under version control on their local machines.

With ClearCase, it is typical for a developer to maintain several workspaces, called “views.” Developers who are working on multiple branches typically use a different view for each activity, working in one view at a time. For example, a developer might maintain a joe_user_main_dev view with a config spec selecting /main/LATEST versions and a separate joe_user_rel_2.3 view selecting /main/REL2.3/LATEST versions.[5]

A Perforce “client specification” defines a workspace and determines the files from the server that are visible in the workspace. Branches in Perforce appear to the user as fully-populated directory trees. For example, the server might contain a directory named MAIN and another directory named REL2.3. A developer might have a joe_user_dev workspace that includes both MAIN and REL2.3 directories. The developer can work in both activities at the same time.

In Perforce, only user files are stored on the local disk. All metadata, including information about the name and contents of a user’s workspace, resides on the server.

5. This is an oversimplification since a typical config spec is several lines or more.

Label Strategies

Both ClearCase and Perforce provide labels, which identify the versions of files that constitute a baseline. For many ClearCase users, labels are mandatory. Applying labels is time-consuming, often accounting for 30 percent or more of the time associated with creating stable builds.

In Perforce, labels are just one way of reproducing baselines. Changelists accomplish the same goal in ways that are less taxing on the build process and faster and easier to reference than a label. Each Perforce check-in generates a unique changelist number that reflects the state of the repository at a point in time. Any changelist can be used to describe the state of every file in the repository, even though it affects only a small subset of the repository.

Branches in Perforce are represented as directories, making it easy to combine branches and changelist numbers to represent a baseline. Alternatively, labels can refer to changelist numbers limited to an identified scope in the server, where the scope is typically a particular branch.
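As a rough illustration of combining a branch directory with a changelist number, the sketch below builds Perforce revision specifiers of the form path/...@changelist. The //gizmo depot and the MAIN and REL2.3 directory names are borrowed from the examples above; the changelist number 12345 and the helper function names are hypothetical. The command is only assembled as an argument list and is not executed.

```python
from typing import List


def baseline_spec(branch_path: str, changelist: int) -> str:
    """Return a revision specifier naming every file under a branch
    directory as of the given changelist."""
    if changelist < 1:
        raise ValueError("changelist numbers are positive integers")
    # "..." is the Perforce wildcard for all files below a directory;
    # "@N" selects the state of those files as of changelist N.
    return f"{branch_path.rstrip('/')}/...@{changelist}"


def sync_command(branch_path: str, changelist: int) -> List[str]:
    """Assemble (but do not run) a 'p4 sync' invocation for the baseline."""
    return ["p4", "sync", baseline_spec(branch_path, changelist)]


if __name__ == "__main__":
    # 12345 is a made-up changelist number used purely for illustration.
    print(" ".join(sync_command("//gizmo/MAIN", 12345)))
```

Because the baseline is just a string built from a directory and a number, a build script can record it in a log or release note without creating a label first.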


Unified Change Management (UCM)

UCM adds a layer of process to ClearCase. Out of the box, Perforce provides atomic changelists and jobs, which correlate to UCM functionality that enables ClearCase developers to group files and connect them to an activity description. UCM also provides guidance on common software promotion models and branching strategies. Perforce’s open architecture accommodates various enterprise-wide development methodologies and SCM customizations.

Migration Technical Details

Perforce and ClearCase have very different internal representations and models of parallel development, branching, and merging. Both support parallel development, but you should be aware of the differences.

Evil Twins

The term “evil twins” describes the problem in which two elements with the same name appear in different branches.

For example, say you have MAIN, DEV_A, and DEV_B branches, with each of the DEV branches parented directly from MAIN. In the DEV_A branch, a developer does a ‘ct mkelem’ to create a new file element, foo.c. Independently, in the DEV_B branch, a developer does the same thing: a ‘ct mkelem’ to create a new file element, foo.c. Someone then does a ‘ct findmerge’ to make the file that originated on DEV_A appear on MAIN. Later, someone does another ‘ct findmerge’ intending to merge changes from MAIN to DEV_B, including the new foo.c merged to MAIN earlier from DEV_A.

Which is the real “foo.c” in DEV_B, the one that originated in DEV_A, or the one that originated in DEV_B? It’s not clear. One is identified as the correct file, and the other is the evil twin. One might expect them to be branch-relations of the same element, but they’re not related in the ClearCase database. As far as ClearCase is concerned, they’re completely independent elements, referenced in its database by different OIDs (object identifiers).

The findmerge completes successfully, and you would have two foo.c’s in the branch, but the operating system permits only one foo.c to appear in the directory. The one that shows in your view is deterministic, but not obvious to most users. ClearCase provides no warning about this problem. Situations are even worse when there are evil twin directories.

This is one of the more insidious complexities of ClearCase. ClearCase admins aware of this potentially confusing scenario sometimes put in “Evil Twin Detection” and “Evil Twin Prevention” triggers. When migrating data from ClearCase, evil twins are murky history that should not (and cannot) be migrated into Perforce. We detect instances of evil twins, manually select the correct element from the pair, and use ‘ct rmelem’ to eliminate the evil twin.

In Perforce, the same filename can be created independently in two branches. However, Perforce enables you to resolve the situation the first time you merge the branches together. The file histories can be combined, instead of having to kill off an evil twin and lose its history.

Symlinks on Windows

Through use of Multi-Version File System (MVFS), ClearCase supports symlinks on Windows in dynamic views. Perforce does not have a custom filesystem and does not support symlinks on platforms that do not natively support them. If you use symlinks on Windows, you must decide how to handle them when planning a migration.

Perforce allows symlinks to be versioned. When a file of type ‘symlink’ is brought into a Perforce workspace on a Windows machine, it appears as a text file containing the path of the target file. For example, using a Linux workspace, you issue the ‘ln -s hello.hpp hello.h’ command and the file hello.h is a symlink pointing to hello.hpp. In Perforce, if you sync the hello.h symlink to a Windows workspace,
