yy," 1c1~n62 - Central Institute of Brackishwater Aquaculture
1996
NATIONAL WORKSHOP CUM TRAINING ON
BIOINFORMATICS AND STATISTICS IN AQUACULTURE RESEARCH
February 2 - 5
S. AYYAPPAN
DIRECTOR
A.K. ROY
COORDINATOR
Sponsored by
DEPARTMENT OF BIOTECHNOLOGY
Ministry of Science & Technology, Govt. of India
and
CENTRAL INSTITUTE OF FRESHWATER AQUACULTURE
Indian Council of Agricultural Research
Kausalyaganga, Bhubaneswar-751002, Orissa, INDIA
NATIONAL WORKSHOP CUM TRAINING ON
BIOINFORMATICS AND STATISTICS IN
AQUACULTURE RESEARCH
BIOINFORMATICS DIVISION
DEPARTMENT OF BIOTECHNOLOGY
Ministry of Science & Technology
Government of India
New Delhi
BIOINFORMATICS CENTRE ON AQUACULTURE
CENTRAL INSTITUTE OF FRESHWATER AQUACULTURE
Indian Council of Agricultural Research
Coordinator: A. K. ROY
Director: S. AYYAPPAN
CENTRAL INSTITUTE OF FRESHWATER AQUACULTURE
(Indian Council of Agricultural Research)
Kausalyaganga, Bhubaneswar-751002, Orissa
FOREWORD
Since the advent of modern science, attempts have been made to improve the
speed and efficiency of scientific communication. Most of the scholarly information,
however, continued to be published in print, i.e., in journals, books, conference
proceedings etc. The emergence of the Internet is radically changing the generation,
flow and utilisation of information globally.
With the advent of the information age, major initiatives have been taken by the
Indian Council of Agricultural Research (ICAR) to modernize and bring information
management culture in all areas of Agricultural Research. Keeping in view the
objectives of ICAR, CIFA is also engaged in the task of modernizing the hardware and
software installations in order to cope with the latest developed information
technologies. With the implementation of LAN connectivity, it is possible for the
Scientists to share common resources like VSATs, Laser Printers, Statistical Packages
and Databases. Establishment of the Bioinformatics Centre on Aquaculture at CIFA by the
Biotechnology Information System (BTIS) of the Department of Biotechnology, Ministry
of Science and Technology, Govt. of India has helped us immensely to strengthen our
system, which will surely boost the R&D efforts in the field of fisheries science in
general and Aquaculture in particular. Apart from offline bibliographic literature
search through CD-ROM, the global online information highway known as the internet
can be used by all scientists with internet connections to search and explore thousands of
databases stored there.
The present workshop is designed to introduce the participants to the
interesting world of data communication, databases, internet, multimedia, homepage
development, statistical methodologies and packages and their application to
aquaculture research. The experience gained from this workshop-cum-training
programme will enable identification of specific applications in different
environments. I take this opportunity to thank the participants, organizations and all
others who have contributed to this workshop for its success.
S. AYYAPPAN
DIRECTOR
Bioinformatics Centre, CIFA expresses its sincere gratitude to Dr. 3. '1(, Jmra,
Adviser, Department of Biotechnology, Ministry of Science and Technology,
Government of India for his constant advice and encouragement for development of
this Bioinformatics Centre on Aquaculture. Heartfelt thanks are also due to Dr. T.
Madhanmohan, Principal Scientific Officer, DBT for his continuous touch and
support for improvement of this centre.
The Centre is indebted to all the resource persons of various organisations like
CIPW, WD/1, C W , STPI, MC, IGjIU, Calcutta University, ISI, Utkal
University, Berhampur University, State Fisheries (Orissa & WB), IG%o'L) and
CIFA for contribution and presentation of papers and exchange of their valuable ideas
with the participants to make this "Workshop-cum-Training programme" a grand
success.
Bioinformatics Division, Department of Biotechnology, Government of India
is grateful to Dr. S. Ayyappan, Director, CIFA for providing all the facilities to this
Bioinformatics Centre to execute all its objectives laid down by the Biotechnology
Information System (BTIS) of DBT, New Delhi.
Director:
Dr. S. Ayyappan
Coordinator:
Shri A. K. Roy
Associates:
Shri P. K. Satapathy
Shri D. P. Rath
Shri Ramesh Dash
Cover photo : VSAT installed at roof top of CIFA
CONTENTS
1. Status of Bioinformatics Centre on Aquaculture
- A. K. Roy & S. Ayyappan
2. Internet and the Intranet
- Manas Patnaik
3. The World Wide Web & Information Searching
- Bikash Panda
4. Internet and the Emerging Networked Society
- A. K. Roy
5. Establishment of Local Area Network and Internet under the
ARISNET: A Case Study
- G. R. Maruthi Sankar
6. Putting Education Online: A Case Study
- A. R. Thakur
7. Web Site Design & Hosting
- Bikash Panda
8. Multimedia - a magic mantra
- Jayaram Parida
9. Multimedia - on the Web
- Jayaram Parida
10. World Wide Web, the Information Store House
- B. K. Panda, A. K. Nayak, A. K. Roy & P. K. Satapathy
11. Designing and Planning your Database
- Surya Kumar Pattanayak
12. Database on Fish Disease
- B. B. Sahu, A. K. Roy, P. K. Satapathy, S. C. Mukherjee and S. Ayyappan
13. Quantitative and Qualitative Fish Production Database
- B. B. Sahu, J. K. Jena, A. K. Roy & S. Ayyappan
14. Database of Induced Breeding Experiments on an Indian Major Carp
Labeo rohita (Ham.)
- S. D. Gupta, A. K. Roy, S.C.hrl~, P. K. Satapathy
15. The Millennium Bug or the Y2K War
- A. K. Roy
STATISTICS
Scope of Application of Statistical Methodologies in Aquaculture
Research
- A. K. Roy
Many Faces of Statistics
- A!. Nour
Fundamentals of Sampling and its Application in Fishery Resource
Sampling Techniques Applied in Assessing Inland Fishery Resources
and Production
- H. A. Gupta
Correlations and Regressions
- A I! Surya Rao
On Some Statistical Procedures for Analysis of Data from Field
Experiments
- G. R. Maruthi Sankar
Fundamentals of Design and Analysis of Field Experiments with a
Note on Transformation of Data
- Ravi R. Saxena and A. K. Roy
Advanced Statistical Methods for Data Analysis
- R. N. S~~burliri
An Overview of Statistical Packages
- Ravi R. Saxena and A. K. Roy
EXCEL for Statistical Data Analysis
- P. K. Satapathy, A. K. Roy and R. Dash
Instructions for Operating Minitab Statistical Package
- Srabashi Das~r
STATUS OF BIOINFORMATICS CENTRE ON AQUACULTURE
A. K. Roy and S. Ayyappan
Bioinformatics Centre
Central Institute of Freshwater Aquaculture
Kausalyaganga, Bhubaneswar
BIOINFORMATICS, STATISTICS AND INFORMATION TECHNOLOGY
The term "Bioinformatics" refers to the area of interaction between
information technology (IT) and the life sciences, including biotechnology. IT in turn is a
convergence and integration of three main technologies: computers,
telecommunication and microelectronics. To trace the connection between
statistics and information technology, it is necessary to go back to the Royal Society
address delivered many years ago by the famous statistician Maurice Kendall, who said:
"Statistics is, indeed, not confusion but fusion, a sort of unified whole, the matrix of
quantitative knowledge of nearly every kind, the principal instrument yet devised by
men for bringing within his grasp the complexity of things". He elaborated that just as
statistics per se was the totality of information, the technology of statistics was nothing
but the totality of the technology of information, or information technology. He further rightly
professed that the era of computers would only be heralded by future generations of
statisticians. With the entire cosmos as one cybernetic entity, the unifying discipline of
statistics and information technology now appears to be a reality. Presently, the need is
emphasised to use the Markov Chain Monte Carlo simulation technique in
order to improve the quality and reliability of computer software.
Bioinformatics gained a new dimension when it was understood that all
biological processes depend on genetic information stored as linear codes along
gigantic chain molecules. It provided structural and functional information on
macromolecules and led to the development of mathematical models that illustrate the dynamic
interaction within and between cells. The advantages that will come from finding the
right solutions to the questions posed by the interaction of biotechnology and IT are
unlimited. The various activities of bioinformatics would be: creation of databases, either
bibliographic or containing properties and results; access and retrieval of information
from databases, either online or offline; analysis of information, which may be
empirical model building based on various results of experiments or literature surveys;
and training. The need for bioinformatics started getting attention due to the gradual
realisation that basic and applied research in the areas of Life Sciences
and Biotechnology is becoming increasingly dependent upon an understanding of
biological processes at the molecular level. Moreover, a need is felt for applying
computer based analytical tools to the huge biological data accumulated over the past,
for sharing the data among workers and for synthesizing information from isolated
literature references. It is well known that a database provides information for surveys,
prevents duplication of work, helps to cross-verify experiments and predict common
characteristics, and helps in the writing of research papers, project proposals, etc. Due to its
importance, the Department of Biotechnology started the Biotechnology Information
System (BTIS) to provide an informatics based national infrastructure in the form of a
distributed database and network organisation for harnessing the scientific knowledge
in various interdisciplinary areas of biotechnology and its dissemination to scientists
working in R&D organisations. BTIS has been established to serve as a distributed
database and network organisation. It comprises nine specialized distributed
information centres (DICs) in six identified areas of Biotechnology (Genetic engineering,
Animal cell culture and Virology, Plant tissue culture, Cell transformation, Nucleic acid
and protein sequences, Immunology, and Enzyme engineering) and nineteen Sub-DICs for
distribution of scientific information across the network. Another 15 Sub-DICs, located at
different national institutes and laboratories, are in the process of establishment. The
principal objectives of the DICs are to function as an information base in each speciality,
to provide a computer based information storage and retrieval system of databases, to
provide a retrieval service either online or offline, to provide a communication link, to
develop software packages specific to user needs, to conduct training courses in
the specialised areas for manpower development, and to promote awareness about the
computerised storage and retrieval facility among bio-scientists and information
scientists.
DEVELOPMENT AND MAJOR ACHIEVEMENTS OF BTIS ON AQUACULTURE
The Bioinformatics Centre at the Central Institute of Freshwater
Aquaculture, Kausalyaganga, Bhubaneswar was established during 1991-92 as a
Distributed Information Sub-centre (Sub-DIC) under the Biotechnology Information
System (BTIS) Network of the Department of Biotechnology, Government of India. The
centre specialises in the field of aquaculture and serves as an information source in the country.
Infrastructure and physical facilities developed
BTIS, being an informatics based infrastructure, required special attention to the
right selection of computers and communication systems. The following
essential hardware and software were procured and distributed to different Divisions of the
Institute for use by the Scientists and Research workers, using LAN connectivity with the
Server at the BTIS room.
Hardwares :
486 Computers (9 nos.), Pentium (26), Macintosh SE (1), Multimedias (2), 3
KVA UPS (1), Server (1), Dot Matrix Printers (10), HP Deskjet (15), HP Laserjet (4),
LCD Projection Panel (1), Colour Scanner (2), Fax machine (1), Modem (2) and VSAT
(4).
Softwares :
Windows 95, UNIX, MS-Office, Novell NetWare 4.1, SPAR1, SAS, FOXPRO and
QPRO.
Creation and Procurement of Databases, Databank, Aquaculture Directories, etc.
Databases: Created the following databases related to aquacultural activities covering
statistics, bioinformatics, resources, bibliography, nutrition, pathology,
meteorology, biodata of Scientists and other activities related to aquaculture.
a) Database on Freshwater Fishes (Textual)
b) Database on Freshwater Aquatic Plants (Textual)
c) Database on Fish Disease (Textual)
d) Database on Fish Pathology (Bibliographic)
e) Database on Fish Nutrition (-do-)
f) Database on Aquatic Microbiology (-do-)
g) Database on Institutions and Companies working in the field of fishing
technology and aquaculture (Textual database supplied by FAO)
h) Database on Suppliers and manufacturers of fishing technology and aquaculture
equipment (Textual database supplied by FAO)
i) Individual experts in the fields of fishing technology and aquaculture (Textual
database supplied by FAO)
j) Personnel Information System (PIS) obtained from ICAR
k) A databank has been created at the centre incorporating the factual figures on
fish production statistics of all varieties, species, water area available, etc. for
different states, along with 168 other items on agricultural products, i.e. rice,
wheat, potato, cotton, maize, etc., and Animal Husbandry products, i.e. egg,
meat, milk, etc. This has facilitated the supply of this information to users
besides the information on fisheries.
l) Aquaculture Directories : Aquaculture Directories have been prepared which
cover detailed information on addresses of Educational and training
Institutes in different countries along with courses, programmes, feed
manufacturers, exporters, addresses of services, consultants on
aquaculture, capture fisheries, fish processing and fisheries information
services for literature on films and videos available in different countries. A
directory covering all universities in India, ICAR, CSIR, Fisheries Directors
and National Research Centres and Project Directors is also available at this
Centre.
m) Acquired CD-ROM on ASFA and CD-ROM on Fish Base for facilitating offline
bibliographic search by the Scientists and Technicians of the Institutes around
Bhubaneswar and also other ICAR Institutes, Universities and Fisheries
Colleges engaged in Research, Training and Teaching activities.
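As an illustration of the kind of databank described in item (k), the following is a minimal sketch in Python using the standard sqlite3 module. It is not the centre's actual system (the databases above were built with tools such as FOXPRO); the table name, column names and all figures are hypothetical and are made up only to show how production figures could be stored per state, item and year and then retrieved.

```python
import sqlite3


def build_databank(rows):
    """Create an in-memory databank (hypothetical schema) and load the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE production (
               state TEXT NOT NULL,
               item TEXT NOT NULL,
               year INTEGER NOT NULL,
               quantity_tonnes REAL NOT NULL,
               PRIMARY KEY (state, item, year)
           )"""
    )
    conn.executemany("INSERT INTO production VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def total_by_state(conn, item, year):
    """Return (state, total quantity) pairs for one item and year, sorted by state."""
    cur = conn.execute(
        "SELECT state, SUM(quantity_tonnes) FROM production "
        "WHERE item = ? AND year = ? GROUP BY state ORDER BY state",
        (item, year),
    )
    return cur.fetchall()


# Made-up figures, for illustration only.
SAMPLE_ROWS = [
    ("Orissa", "rohu", 1997, 120.0),
    ("Orissa", "catla", 1997, 80.0),
    ("West Bengal", "rohu", 1997, 200.0),
    ("Orissa", "rice", 1997, 500.0),
]

conn = build_databank(SAMPLE_ROWS)
print(total_by_state(conn, "rohu", 1997))
```

Keeping fisheries and agricultural items in one table, as assumed here, lets the same query serve both kinds of request; a real databank would likely need further fields such as water area and units.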
Software development
More than 35 programs in Fortran 77 and FOXPRO have been developed for
statistical data analysis and information retrieval respectively. Some of these are
ANOVA, Probit Analysis, Multivariate analysis, fish growth, length-weight analysis,
Split Plot design, DMRT and Heterogeneity test, along with a number of statistical test programs.
Programs have also been developed for library management system, paybill, etc.
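To give a flavour of one of the analyses named above, the following is a minimal Python sketch of a length-weight analysis. It is not the centre's Fortran 77 program, whose details are not given here; it assumes the commonly used relationship W = a * L^b, fitted by ordinary least squares on the log-transformed data (log W = log a + b log L). The function name and the sample data are hypothetical.

```python
import math


def fit_length_weight(lengths, weights):
    """Fit W = a * L**b by least squares on log-transformed data.

    Returns the pair (a, b). All lengths and weights must be positive.
    """
    if len(lengths) != len(weights) or len(lengths) < 2:
        raise ValueError("need at least two paired observations")
    xs = [math.log(length) for length in lengths]
    ys = [math.log(weight) for weight in weights]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("lengths must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope = exponent b
    a = math.exp(mean_y - b * mean_x)   # intercept = log a
    return a, b


# Synthetic data generated from a = 0.01, b = 3 (illustration only).
lengths_cm = [10.0, 15.0, 20.0, 25.0]
weights_g = [0.01 * length ** 3 for length in lengths_cm]
a, b = fit_length_weight(lengths_cm, weights_g)
print(round(a, 4), round(b, 4))
```

With real measurements the fitted exponent b is then usually compared with 3 to judge whether growth is isometric; that test is outside this sketch.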
Network Linkage
The centre has acquired a VSAT for e-mail uploading and downloading. A
Micro Earth Station with C-200 controller has been in operation at CIFA since Nov., 1995
and e-mail facilities, both national and international, have been provided to the Institute.
It also has a MODEM connected through telecom to NICNET to access databases
developed by NIC, i.e. GIST, RENNIC. SLIP/PPP connectivity for internet browsing has
been acquired by the centre for online search of information.
Library and Office Automation
The library system is under computerisation. The CDS/ISIS package is used for
authorwise/titlewise/disciplinewise search of all books available in the library. This
has greatly facilitated the search for availability of library books at the
centre.
Manpower Development through Workshop/Training/Teaching and offering
Studentship
The following Workshops and Training programmes were conducted and
studentships offered by the BTIS centre on Aquaculture for extending information
related to aquaculture and the role of information technology (IT) in aquaculture
development using modern tools.
a) National Workshop on Perspectives in Bioinformatics and Its Application
to Aquaculture was conducted during February 22-26, 1994.
b) National Workshop on Networking and Biological Data Analysis was
arranged during February 4-6, 1997.
c) National Workshop on Information Technology in Aquaculture Research
was arranged during February 10-13, 1998.
d) Students of Orissa University of Agriculture and Technology are being
regularly trained in the use of Computer Applications in Aquaculture
Research in their Master of Fisheries Science and Ph.D. courses, apart
from periodical training of the Scientists of CIFA. Teachers and
researchers of Utkal University, ICMR, RRL, Regional College of
Education as well as workers of other Institutions also avail this facility.
e) Several training programmes were also conducted for staff
members of CIFA. The centre has also conducted many training
programmes for officials of State Fisheries, different colleges and
universities of Orissa.
f) Students are regularly trained in Bioinformatics by offering studentships
under the BTIS project.
International Collaboration
The centre is collaborating with the International Development Research
Information System (IDRIS) of IDRC, Canada for obtaining information on fisheries
activities located in or concerned with developing countries on diskettes, which are
updated by them every six months. This centre has been selected by the Fishery Advisory
Services (INT/86/D12) of FAO/UNDP, Rome for dissemination of information on
fisheries and its allied disciplines through the diskettes prepared by them. The centre
has also received CD-ROMs on Fish Base from ICLARM, Philippines, which provide
databases on fisheries, particularly for fishery research workers. Maxims, Ecopath
and Fish growth parameters packages have been collected and are being utilised.
Future Programmes
The LAN service will be upgraded, and a KU Band VSAT is intended to be procured for
best use of Information Technology in Aquaculture. This system will help in providing
electronic bulletin and e-mail to the scientific and technical personnel independently,
using the existing VSAT as well as the dial-up MODEM. A remote login system is
proposed to be developed to give access to all Bioinformatics Centres, ICAR
Institutes and other research organisations. This remote login system will help to
share the information generated here amongst research organisations. CD-ROMs of the
databases developed at the centre will be created and distributed to other research
organisations for off-line search facilities. Attempts will be made to prepare menu-driven
software packages for carp culture, prawn culture, catfish culture, pearl
culture, paddy-cum-fish culture, etc., which will guide entrepreneurs in taking up
aquaculture independently. Physical, chemical and biological parameters of fish
ponds will be monitored from the model to be developed during this period. A CDNET
facility will be developed in the LAN system for sharing of bibliographic search by
researchers and Scientists of the Institute. A training course for researchers in
the field of Bioinformatics is being conducted every year.
INTERNET AND THE INTRANET
Manas Patnaik
Director,
STPI-Bhubaneswar
So what is the difference between the Internet and Intranet?
Mainly the location of the information and who has access to it.
The Internet is public, global and wide open to anyone who has an Internet connection. The
Internet is a phenomenon, created by the physical connection between thousands of private
networks. Like the phone system, the Internet allows instant communication between any
two points on a network. Instead of connecting phones, however, it connects computers.
Instead of voice and fax, you are exchanging digital information, including:
Documents
Data
Multimedia (recorded video, audio)
Intranets are restricted to people who are connected to the private company network. Other
than that, they work essentially the same way. Intranets can help companies empower their employees
through more timely and less costly information flow. This empowerment bolsters a
company's competitive advantage, through improvement of employee morale and by assisting
in getting more timely information to customers and suppliers.
While 1995 was clearly the 'year of the Internet', 1996 is being termed the 'year of the
Intranet'.
Internet technologies implemented internally over client/server networks are called Intranets.
Intranets can operate behind firewalls in conjunction with Internet access, or be
implemented exclusively as internal distributed networks over LANs and WANs.
A key fact to understand is that the Intranet can work on any local area network (LAN), but
really provides its greatest power on wide area networks (WAN). All companies began their
network activities using LANs, but the plummeting cost of network connections now makes it
increasingly affordable to connect all the far-flung LANs into a single, integrated WAN. Most
network computer applications are geared to the LAN, whereas internet applications were
originally designed to be used over a WAN. Because of this WAN capability, the intranet
makes it possible to connect any user in the company's wide area network to any web site
located on that network. So, for instance, if your company has internal web sites in London,
Singapore, Seattle, and Information from any of those sites with equal ease.
Intranets present a less challenging development environment, so that many organisations
prefer to implement intranets first, perhaps with a modest, network isolated Internet site,
before a full blown, firewall protected Internet site is contemplated.
Some key differences between the Internet and Intranet are:
INTERNET
Client tools diverse
Browser compliance an issue
Client connection speeds variable
Users have diverse skill sets
Animation, video restricted
Minimal implications for work-flow
INTRANET
Can standardize client tools
Browser compliance generally not an issue
Network speed standardised
Users can be trained
Full multimedia often possible
Implications for work-flow and process re-engineering
How can intranets save time in a corporate environment?
With corporations under tremendous pressure to empower employees and to better
leverage internal information resources, intranets are being seen as the solution.
A basic intranet can be set up in hours or days and can ultimately serve as an 'information
hub' for the entire company, its remote offices, partners, suppliers and customers.
Key differentiators that distinguish Intranets as the future medium for corporate internal and
external communications:
Freedom of choice
Ease of use
Cost effectiveness
Richness
Powerful tool for sharing information across networks
Merges documents, data and multimedia
Universal access
Universal interfaces to all file systems
Totally in-house, protected from public security (i.e. Internet/WWW)
How does one authenticate users to make sure they are who they claim to be?
How can one perform authentication without sending user names and passwords across the
network in the clear?
How can single user log-in services be provided to avoid costly user name and account
maintenance for all the servers (web, proxy, directory, mail, news, and so on) across the
enterprise?
How can one protect the privacy of communications, both those in real time (such as the data
flowing between a web client and a web server) and those with store-and-forward
applications such as e-mail?
How can one ensure the messages have not been tampered with between the sender and
the recipient?
How can one safeguard confidential documents to ensure that only authorised individuals
have access to them?
Today, there is a single technology that provides the foundation for solving all these
challenges: Cryptography. These standards provide the foundation for a wide variety of
security services, including encryption, message integrity verification, authentication and
digital signatures.
Encryption transforms data into some unreadable form to ensure privacy. It is the digital
equivalent of a sealed envelope.
Decryption is the reverse of encryption; it transforms encrypted data back into the
original, intelligible form.
Authentication identifies an entity such as an individual, a machine on the network, or an
organization.
Digital signatures bind a document to the possessor of a particular key and are the
digital equivalent of paper signatures.
Signature verification is the inverse of a digital signature. It verifies that a particular
signature is valid.
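Two of the questions raised above, authenticating a user without sending a password in the clear and checking that a message has not been tampered with, can be illustrated with a keyed hash (HMAC). The following is a minimal Python sketch using only the standard hmac, hashlib and secrets modules. It assumes that both parties already share a secret key, and the function names are hypothetical. Note that an HMAC is a symmetric technique and is not a digital signature in the sense defined above, which requires public-key cryptography.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    """Server side: generate a fresh random challenge for one login attempt."""
    return secrets.token_bytes(16)


def mac(shared_key, data):
    """Compute an HMAC-SHA256 tag over data with the shared secret key."""
    return hmac.new(shared_key, data, hashlib.sha256).digest()


def verify_mac(shared_key, data, tag):
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(mac(shared_key, data), tag)


# Hypothetical shared secret; in practice it would be provisioned securely.
key = secrets.token_bytes(32)

# Authentication: the client proves knowledge of the key by tagging the
# server's challenge. The key itself never crosses the network.
challenge = make_challenge()
response = mac(key, challenge)
print(verify_mac(key, challenge, response))

# Message integrity: any change to the message invalidates the tag.
message = b"price list v2"
tag = mac(key, message)
print(verify_mac(key, b"price list v3", tag))
```

Because each challenge is random and used once, a recorded response cannot simply be replayed for a later login. Privacy of the message contents would additionally need encryption, which this sketch does not cover.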
INTRANET APPLICATIONS IN A CORPORATE ENVIRONMENT
Some common applications are :
Sales and marketing applications
1. Product specifications, price lists and new collateral
2. Sales leads
3. Competitive information
4. Lists of key customer wins, including win/loss analysis
5. Online training materials
6. Sales presentations
Product development applications
1. Product specifications, designs, schedule milestones, and changes
2. Team member listings and responsibilities
3. Customer issues
4. Features of key competitive products
Customer service and support applications
1. Share the latest reports on problems so that any team member can respond to
customer calls
2. Get the current information on the status of customers' orders
3. Be alerted immediately to any important changes such as special offers or issues
4. Train online to respond to customer queries and complaints
Human resources applications
1. Company mission and goals
2. The annual report
3. Searchable telephone directories
4. Job postings and internal job transfer forms
5. Employee development
6. Departmental and personal home pages
7. Classified bulletin boards of items for sale, housing etc.
8. Medical referrals
9. Online employee enrollment in specific benefit plans
10. Employee surveys
11. Employee lookup of vacation balances, options etc. and
12. Online submission of employee status change
FINANCE APPLICATIONS
With intranet applications, finance departments can disseminate information to key managers by securely posting corporate financial data or by providing simple form-based query capabilities. The purchasing side of financial operations can also benefit from intranet
OTHER APPLICATIONS
Numerous other corporate departments, such as legal or MIS groups currently using paper-based forms or policies, can reap the benefits of making transaction applications available through intranets.
ELECTRONIC MAIL AS A PART OF INTRANET
When a person takes an Internet connection from an ISP (Internet Service Provider), the e-mail address will be that of the ISP. It is like using a business centre for an office. Say CIFA has taken service from STPI Bhubaneswar; then its mail address will be ~j~$c&!i..$~D~~c_l.
This does not present a serious option to a corporate organisation.
For its employees to use e-mail, the corporate can give an address like IL!~IIW~~ITJI$S~~ indicating the name of the organisation/corporate, CIFA, as a research organisation in India.
SUMMARY
The Internet has not only brought about a technology revolution, but it is also launching a second revolution in corporate computing. The internal use of Internet software has become known as the 'INTRANET'. For India, the Internet is a great opportunity. Although currently we do not have more than 50,000 Internet connections, it has already caught the imagination of the people. The number of users is estimated to be more than 2 lakhs. Undoubtedly the Internet has emerged as the largest non-stop talent show. Internet business in India is likely to fetch revenue of more than Rs 70 billion by the year 2000. With its low entry barrier and high intellectual opportunity, the intranet is of significance for organisations in India. A standard part of any business Internet connection is the firewall, which keeps Internet users from connecting into the company's private internal network. If a company has its own internal web sites on the Internet, people on the Internet will not be able to see them without special access authority.
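The firewall rule described above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical internal address range (192.168.0.0/16) and a hypothetical function name; a real firewall applies many more rules than this single check.

```python
import ipaddress

# Hypothetical internal address range; a real company would use its own.
INTERNAL_NET = ipaddress.ip_network("192.168.0.0/16")

def allow_connection(source_ip: str, dest_ip: str) -> bool:
    """Permit traffic unless an outside host is trying to reach an internal one."""
    src = ipaddress.ip_address(source_ip)
    dst = ipaddress.ip_address(dest_ip)
    inbound_from_outside = dst in INTERNAL_NET and src not in INTERNAL_NET
    return not inbound_from_outside

if __name__ == "__main__":
    print(allow_connection("192.168.1.5", "192.168.1.9"))   # True: internal to internal
    print(allow_connection("192.168.1.5", "203.0.113.7"))   # True: internal user going out
    print(allow_connection("203.0.113.7", "192.168.1.9"))   # False: outsider blocked
```

Internal users can still reach the Internet, while connections that originate outside and target the private network are refused.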
THE WORLD WIDE WEB & INFORMATION SEARCHING
Bikash Panda
HIG-188, Kanan Vihar, Bhubaneswar-751031
World Wide Web
The World Wide Web (WWW) is one of the most popular client-server based Internet services. In the late 1980s, CERN (the European Lab for Particle Physics) began experimenting with a service that would allow anyone to easily access and display documents that were stored on a server anywhere on the Internet. To do this, they developed a standard format for the documents that enabled them to be easily displayed by any type of display device, and allowed links to other documents to be placed within documents.
Although the WWW was developed for the CERN researchers to use, after the service was made public it became tremendously popular. A number of different client applications (the ones that actually display the documents on-screen) were developed to read WWW documents. There are graphical clients (one of the most popular of these is Netscape), and terminal-based clients such as Lynx. Most WWW clients also allow you to use the same interface to access other Internet services such as FTP and Gopher.
Accessing WWW
To use WWW you just require Internet connectivity & preferably a graphical browser. The most popular browsers are Netscape Navigator and Microsoft Internet Explorer. If your computer is properly configured to access the Internet using the TCP/IP protocol, then you can start browsing the WWW using your browser application. You need to know the Web Site address which you desire to view. This Web Site address is known as a URL, which stands for Uniform Resource Locator, & it has the following syntax.
An example of URL is Error! Reference source not found. This means you want to view an HTML document called default.htm available at the Web Server Error! Reference source not found. using HTTP (Hyper Text Transfer Protocol). The name of the server is called the Domain Name, which is unique worldwide. The Top Level Domain ("in" in this case) decides what type of Server that is. IN means that particular Web Server is in an Indian Domain. Every country worldwide has this type of two-letter country domain. International domains are three-letter ones.
.COM is for Commercial Organisations
.NET is for Networks or ISPs (Internet Service Providers)
.ORG is for Non-commercial Organisations
.EDU is for Universities or Educational Institutions
.INT is for International Organisations
.MIL is for Military Organisations
.GOV is for Government Sites
Out of these top level domains, .edu, .mil & .gov are in practice used only by USA based organisations.
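The way a URL breaks down into protocol, server name and document can be sketched with Python's standard library. The address used below (www.example.in) is a hypothetical placeholder, not the address from the original text, and the function name is an illustrative assumption.

```python
from urllib.parse import urlparse

# Three-letter domains as listed in the text above.
GENERIC_TLDS = {
    "com": "Commercial organisation",
    "net": "Network or ISP",
    "org": "Non-commercial organisation",
    "edu": "University or educational institution",
    "int": "International organisation",
    "mil": "Military organisation",
    "gov": "Government site",
}

def describe_url(url: str) -> dict:
    """Split a URL into protocol, server and document, and classify its top level domain."""
    parts = urlparse(url)
    host = parts.hostname or ""
    tld = host.rsplit(".", 1)[-1]
    if tld in GENERIC_TLDS:
        kind = GENERIC_TLDS[tld]
    elif len(tld) == 2 and tld.isalpha():
        kind = "Country domain"
    else:
        kind = "Unknown"
    return {
        "protocol": parts.scheme,
        "server": host,
        "document": parts.path.lstrip("/"),
        "domain_type": kind,
    }

if __name__ == "__main__":
    # Hypothetical example address.
    print(describe_url("http://www.example.in/default.htm"))
```

For the hypothetical address, the protocol is http, the server is www.example.in, the document is default.htm, and the two-letter ending marks a country domain.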
World Wide Web Authorities
Nobody owns the Internet & hence it has very few controlling bodies. This is what makes the WWW so popular & massive. IANA (Internet Assigned Numbers Authority) is the USA based Organisation which assigns unique IP addresses for the Web. InterNIC (Internet Network Information Centre) manages the Domain Name registration of International domain names. More details can be found at Error! Reference source not found. An organisation, the World Wide Web Consortium, sets the standards for WWW & HTML tags. Their details can be found at www.w3c.org
Information Searching
Nobody expects you to remember every possible site name & browse accordingly. One has to search for the sites which might contain a reference to the keyword you are looking for. For this purpose special Websites called Search Engines are available. The most popular one is www.yahoo.com
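How a keyword is matched to sites can be illustrated with a toy index. This is only a sketch under simple assumptions (whole-word matching on a handful of hypothetical pages); it is not how any particular search engine named here works.

```python
from collections import defaultdict

def build_index(pages: dict) -> dict:
    """Map every word to the set of site addresses whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index: dict, keyword: str) -> list:
    """Return the sites that mention the keyword, in alphabetical order."""
    return sorted(index.get(keyword.lower(), set()))

if __name__ == "__main__":
    # Hypothetical sites and page texts, for illustration only.
    pages = {
        "www.example-fish.org": "freshwater aquaculture research and carp breeding",
        "www.example-news.com": "daily news on agriculture and aquaculture",
        "www.example-jobs.com": "jobs for software engineers",
    }
    index = build_index(pages)
    print(search(index, "Aquaculture"))
```

The user supplies only the keyword; the index built in advance returns the addresses worth visiting.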
The following is a list containing various URLs for a variety of purposes.
Search Engines
www.yahoo.com
www.altavista.com
www.hotbot.com
www.infoseek.com
www.khoj.com
www.lycos.com
www.excite.com
www.search.com
www.webcrawler.com
www.web-search.com
Free E-mail Service Providers
www.hotmail.com
www.rocketmail.com
mail.yahoo.com
www.mailcity.com
www.excite.com
www.usa.net
www.pobox.com
www.letterbox.com
www.juno.com
Online News Sites
www.timesofindia.com
www.expressindia.com
www.samachar.com
www.asianage.com
www.aajtak.com
www.hinduonline.com
www.cnn.com
www.hindustantimes.com
www.economictimes.com
People Finder Sites
www.four11.com
www.whowhere.com
www.alumni.net
www.batchmates.com
Free Web Hosting
www.geocities.com
www.angelfire.com
www.xoom.com
www.fortunecity.com
www.tripod.com
Job Providers in Internet
www.naukri.com
www.winjobs.com
www.dice.com
www.careerpath.com
www.bestjobsusa.com
www.ciol.com
INTERNET AND THE EMERGING NETWORKED SOCIETY
A. K. Roy
Bioinformatics Centre
Central Institute of Freshwater Aquaculture
Kausalyaganga, Bhubaneswar - 751002
INTRODUCTION
In its simplest form, the Internet is the network of networks. The Internet (known as the Net) is the world's largest computer network. A computer network is generally a bunch of computers hooked together somehow for exchanging information freely. It is a new communication technology that is affecting our lives on a scale as significant as the telephone and television. It is a worldwide computer network connecting nearly 5 million computers around the world. There is no censorship. Probably that is one of the reasons for its popularity and exponential growth.
COMPUTER NETWORK
Computer networking refers to a method in which computer systems are connected together in such a way that they can exchange information among themselves. They can be connected by wires, phone lines, satellite links or any combination of these. Each computer network has a host computer, known as the server, which controls the complete network. If networking is done in the same building or in a small area, it is known as a Local Area Network (LAN); if the computers are spread over a metropolitan area, it is known as a Metropolitan Area Network (MAN). When the computers are spread over a larger area, the network is called a Wide Area Network (WAN). Networking is done for sharing resources like printers, hard disc drives and software.
SOME INDIAN NETWORKS
NICNET, ERNET, INDONET, METNET, PRESS NETWORK, OILCOMNET, SIRNET, AIRLINE NETWORK, INFLIBNET.
WHO USES INTERNET ?
Once closely guarded by scientists and technocrats, today the Internet is open to researchers, students, parents, police, businessmen, world leaders, executives, sports fans, shoppers and terrorists. The Internet is the largest and most complete learning tool for groups of people with varied educational backgrounds and interests.
SUBJECTS COVERED BY INTERNET
The Internet covers almost all the subjects imaginable, some of which are Arts and Culture, Books and Literature, Business and Career, Computers and Software, Education and Teaching Tools, Environment and Nature, Food and Cooking, Games and Sports, Government and Politics, Health and Nutrition, History, Household and Consumer Finance, Humor, International Affairs, Language and Linguistics, Law, Movies and Video Tapes, Music, Religion and New Age, Science and Technology, Space and Astronomy, Shopping, Sports, Recreation and Hobbies, Television, Travel and Geography and many more.
LENGTH AND BREADTH OF INTERNET
The information available on the Internet has been indexed. If one reads only the index pages at the rate of 100 pages daily, it will take 4 years to read the complete index alone, which is equivalent to 1,46,000 pages. As per the latest report available, there are 2.2 million current users of the Internet and every month 1,50,000 new users are joining it. The Internet has 40,000 host computers, also known as web sites. It is estimated that by 2000, there will be 100 million users and 1 million hosts on the Internet.
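The index-reading estimate above is simple arithmetic and can be checked directly. The short sketch below assumes a 365-day year, which reproduces the quoted page count; the constant-intake growth function is an illustrative assumption and not a forecast taken from the text.

```python
# Figures quoted in the text above.
PAGES_PER_DAY = 100
YEARS_OF_READING = 4
CURRENT_USERS = 2_200_000
NEW_USERS_PER_MONTH = 150_000

# Assumption: a 365-day year, which reproduces the quoted page count.
index_pages = PAGES_PER_DAY * 365 * YEARS_OF_READING

def users_after(months: int) -> int:
    """Users after a number of months, assuming the monthly intake stays constant."""
    return CURRENT_USERS + NEW_USERS_PER_MONTH * months

if __name__ == "__main__":
    print(index_pages)        # 146000
    print(users_after(12))    # 4000000
```

At a constant intake the user count would reach 4 million after one year, so the projection of 100 million users by 2000 implies that the monthly intake itself is expected to grow sharply.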
NAVIGATIONAL TOOLS OF INTERNET
The following are the navigational tools of the Internet:
E-mail (electronic mail), File Transfer Protocol (FTP), Telnet, Gopher, World Wide Web (Mosaic), Finger, Usenet, Mailing Lists (Listservers), Viewers, Archives, Encoding, Lynx, Internet Relay Chat (IRC), WAIS, Veronica, Bulletin Board System (BBS) and Free Nets.
VARIOUS APPLICATIONS OF INTERNET
The Internet has given access to an enormous amount of information. This information can be accessed and used from any corner of the world, and knowledge of the access tools is necessary to make maximum use of the Internet. In India and all over the world the Internet is being used for a wide variety of purposes, only a few of which are mentioned below.
ELECTRONIC PAPERS/JOURNALS/NEWSLETTERS
Newspapers and magazines are available on the Internet. Recently many Indian newspapers have been introduced on the Internet. Many international scientific journals are also available on the Internet.
MATRIMONIAL ALLIANCES
Matrimonial alliances are being arranged through the Internet, for which some companies have started matrimonial service sites.
PATIENT CARE SUPPORT
The Internet is a continuously updated database for providing patient care support and serves as a distance learning facility for student physicians. On-line medical journals make the latest research and development in the field known.
INTERNET PHONE
One can nowadays place calls over the Internet to standard phones or to PCs running VocalTec Internet software. It gives Internet users a vocal two-way communication facility. Internet phoning is now as simple as e-mailing or traditional phoning. The rate is lower than that of STD/ISD calls.
NET VARSITY
Another interesting thing is that NIIT has recently established an on-line learning facility on the Internet by the name of 'Net Varsity', based on the conventional model of a university. According to NIIT, the Net Varsity has all the features of an institution of higher learning, including registration procedure, testing and certification. Other features include a library where the vast resources of the Internet have been summarised, a student query service to offer tutor support to students, a student advisory service to provide counselling on learning opportunities and a placement assistance service. The students will be eligible for certification for the education they get at the 'Net Varsity'.
POSITIVE USE IN INDIA
Government organizations like CSIR and ICAR have set up Websites on the Internet which give information about their objectives, activities and also about their various laboratories. The Department of Science and Technology Website informs about national resources available for science and technology. NIC has a wealth of information on its Website.
DARKER SIDE OF INTERNET
Owing to the scope for unhindered use on uncensored subjects, the Internet is also being misused in areas like pornography, and for nefarious and subversive activities by unscrupulous criminals breaking into the databases of banks, the confidential records of defence establishments and the secrets of commercial rivals. Recently, there appeared news about a credit card fraud on the Internet committed by schoolboy hackers. This computer scam fuels fears about shopping on the web. These are darker sides of the Internet which cannot be ignored.
A NETWORKED SOCIETY (NS)
Communication technology based on computers is known as computer mediated communication (CMC), which encompasses e-mail, virtual reality, computer games etc. The Internet is a new way of using space and time. CMC provides a space - the cyberspace, within which forms a new society known as the Networked Society (NS) or Cyber Society.
Impact of Networked Society (NS) on the culture of people all over the world:
1. With networks spanning all over the world, the concept of borderless nations is likely to be a reality.
2. In the NS, the houses are likely to be the activity centre, not the office.
3. A less-travel society, if not a travel-less society.
4. Physical location may become irrelevant for developing and receiving services.
5. Radical change in work culture due to flexi hours of working coupled with innovative management of resources and manpower, resulting in enhanced productivity.
6. Home centred activities would lead to better creativity, innovation and productivity.
7. Telecommunication culture with home centred activities would ultimately lead to a home centred economy.
8. Present society is characterised by community formation based on work centres. In a home centred environment, the communities will comprise groups from among people pursuing different works and professions in life. A true social community is likely to emerge.
9. The concept of association may vanish because in a networked society, small community dwellings which are self contained would emerge.
10. A networked society (NS) can be characterised by (anyone, anytime, anywhere, any information and any format).
11. A full-fledged NS implies that every human being on the earth has access to the network, which is considered essential like electricity and water.
12. The poorest person in the villages will have the same access to information resources as the richest in the cities.
13. With round the clock operation of network infrastructure, time and holiday patterns may become irrelevant in the lifestyle of people.
14. Three communication technologies will play complementary roles. These are optical fibres, satellites and short-wave radio, which will provide bandwidth, quick remote area connectivity and excellent last mile link respectively.
15. Network computers and multimedia personal computers will emerge.
16. Virtual reality is considered as the ultimate evolution of a networked society.
17. An NS would emerge as the central theme of living, with the society's trade, economy, occupation, development, education, culture and leisure all centred around networking.
CONCLUSION
Computer networking is perhaps one of the most important milestones in the innovative creations using Information Technology (IT), and an even bigger phenomenon is the Internet. The Internet has brought computer networking to an unprecedented frontier and can be described as the biggest IT event in computer and communication technology. In the scientific and research community, the Internet is an essential and indispensable tool. Through the Internet, scientists can gain instant access to the world's most advanced research facilities and discuss their research problems with others working in the same field. They may benefit most through proper use of Internet facilities after gaining basic ideas about the Internet, its navigational tools and the services available, as discussed above. Never before has such freedom of thought and expression been possible for ordinary and not so ordinary people alike. At this moment it is very difficult to comprehend the consequences of the newly formed Cyber or Networked Society.
ESTABLISHMENT OF LOCAL AREA NETWORK AND INTERNET UNDER THE ARISNET: A CASE STUDY
G. R. Maruthi Sankar
Central Research Institute for Dryland Agriculture (ICAR)
Santoshnagar, Hyderabad - 500 059
1. Establishment of NICNET at CRIDA
During 1994-95, ICAR made it compulsory for all institutes to establish the National Informatics Centre's Network (NICNET) for E-mail transmission through a MODEM and a dial-up telephone on the Public Switched Telephone Network (PSTN) connected to a computer. Accordingly, CRIDA established its NICNET services. The services included transmission and downloading of E-mail messages through a low speed Multi-Tech MODEM and the PSTN via the National Informatics Centre (NIC), Hyderabad, with further linkage to NIC, New Delhi through the Indian Satellite. Text was usually transmitted in the form of ASCII files through the PROCOMM software used for communication after getting connected to the VAX system at NIC, New Delhi. The protocol provided by NIC for all ICAR institutes was the Simple Mail Transfer Protocol (SMTP), using which simple electronic mails can be exchanged. CRIDA was provided with an E-mail address through the X-400 services of NIC, New Delhi, as CRIDA@X400.NICGW.NIC.IN for using SMTP for exchange of information. Transmission of either non-ASCII text, graphics / images or use of any advanced software (Windows based) including the data except binary attachment was not possible due to the limitations of the PROCOMM software and also of the protocol that was provided to the ICAR institutes. Further, the network was slow and problematic due to the low speed of the MODEM (1.2 Kilo bauds per second) set by NIC, New Delhi for all ICAR institutes and the transmission errors in the satellite communication through the unreliable PSTN, apart from the problems in functioning of the telephone linkage. In spite of these problems, messages were transmitted and received periodically.
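The ASCII limitation described above can be made concrete with a small check. The sketch below tests whether data is plain 7-bit ASCII and shows Base64 encoding as one widely used way of representing binary data with ASCII characters only. Base64 is mentioned here purely as a general illustration; the text does not say that it was available or used at CRIDA, and the function names are hypothetical.

```python
import base64

def is_plain_ascii(data: bytes) -> bool:
    """True if every byte is a 7-bit ASCII character."""
    return data.isascii()

def to_ascii_safe(data: bytes) -> str:
    """Represent arbitrary bytes using ASCII characters only (Base64)."""
    return base64.b64encode(data).decode("ascii")

if __name__ == "__main__":
    report = b"Rainfall summary for June"
    image_header = bytes([0x89, 0x50, 0x4E, 0x47])   # first bytes of a PNG image file

    print(is_plain_ascii(report))         # True: can travel over an ASCII-only link
    print(is_plain_ascii(image_header))   # False: contains a byte above 127
    print(to_ascii_safe(image_header))    # ASCII-only representation of the same bytes
```

Text files pass the check unchanged, whereas image or program files need an encoding step of this kind before an ASCII-only channel can carry them.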
2. Establishment of ARISNET at CRIDA
During 1995-96, ICAR made it mandatory to establish the Agricultural Research Information System Network (ARISNET) at all ICAR institutes and to augment the services of NICNET for exchange of agricultural research information, data and reports and various other kinds of information through the network. ICAR supplied different hardware and software to all institutes for ARISNET establishment. Under the ARISNET program, each institute was asked to establish a Local Area Network (LAN) through any of three types of cabling, viz., BNC, UTP or Fiber Optic cabling, whichever suits the institute depending on its location, size and other requirements.
Accordingly, CRIDA established its Local Area Network (LAN) under ARISNET during 1996-97. The network cabling for the different rooms (54 nodal points) was done by the Electronics Corporation of India Limited (ECIL), Hyderabad. The cabling follows a STAR topology, i.e., Unshielded Twisted Pair (UTP CAT-5) cables are connected from the ARISNET Server room to the different rooms through three 16-port hubs (2 Bee-Line and 1 D-Link hubs) which are located at three different places in the institute. CRIDA has been provided with a SUN-SPARC UNIX Server (ICIM-Fujitsu make) and a Meteor LAN Server (HCL-HP make). While the UNIX Server is an 8-node capacity server, the LAN Server is a 32-node capacity server. The UNIX Server was installed by ICIM-Fujitsu, Hyderabad, while the LAN Server and the three Workstations provided by ICAR were installed by HCL-HP, Secunderabad. The existing NICNET has been merged with ARISNET. NIC, Hyderabad has installed a high-speed Motorola MODEM (with a speed of 19.2 Kilo bauds per second) for transmission and downloading of E-mail and other types of files and has connected it to the ARIS Workstation-1 through a telephone cable under PSTN. Apart from the three ARIS Workstations provided by ICAR, 9 computers (nodes) from different rooms have been connected to the LAN Server. The equipment supplied by ICAR is thus being used for day-to-day work with different software like Microsoft Office (WORD, EXCEL, POWERPOINT and ACCESS), Microsoft Visual C++ and other licensed software of the institute.
3. Establishment of VSAT / Earth Station at CRIDA
In view of advancements in computer hardware and software, improvements in satellite communication and a revolution in Information Technology all over the world during the last two years, ICAR has procured the latest Ku-Band Very Small Aperture Terminals (VSATs) from NIC, New Delhi and provided them to a few selected institutes. The VSATs procured by ICAR are Frequency Time Division Multiple Access (FTDMA) VSATs, which have very high downloading and transmission speeds, viz., 32 Kilo bauds per second (for transmission) and 256 Kilo bauds per second (for downloading). They are very small, compact, less problematic, less costly and highly efficient, easy to handle and have high speeds in communication. They have many advantages over the existing C-Band and S-Band VSATs of NIC in all features for satellite communication. CRIDA has been provided with a Ku-Band FTDMA VSAT. The Earth Station of CRIDA was developed and the VSAT has been successfully installed. The VSAT Earth Station of the Institute in Hyderabad is linked to the Master Earth Station of NIC at New Delhi through the Indian Satellite and receives signals uninterruptedly without any error, which are utilised for further processing. The VSAT has two units, viz., an Out-Door Unit (ODU) and an In-Door Unit (IDU). NIC has connected the ARIS Workstation - 1 to the IDU through UTP CAT-5 cable. The IDU in turn is connected to the ODU of the VSAT Earth Station through error-free UTP cables. NIC, New Delhi has provided two dedicated IP-addresses (164.100.255.13 and 164.100.255.14) to the institute, viz., one to the VSAT (164.100.255.13) and the other to the ARIS Workstation - 1 (164.100.255.14). This is a statutory requirement for provision of INTERNET to a user by linkage to the Indian Satellite through a VSAT for direct communication with millions of users on the World Wide Web (WWW). The ARIS Workstation - 1 has been configured with the Transmission Control Protocol / Internet Protocol (TCP / IP) and the INTERNET has been provided to CRIDA by NIC, New Delhi. This Workstation has WINDOWS-95 as the resident Operating System (OS) and Netscape Navigator Gold (Version 3.1) for browsing different Web sites on the INTERNET. Thus CRIDA has been provided with the INTERNET facility for accessing and browsing the WWW and downloading all relevant information for further advancement in dryland research. Ever since the FTDMA Ku-Band VSAT was installed and the INTERNET provided to CRIDA, scientists at the institute have been making efficient use of the INTERNET facility for direct transmission and downloading of E-mails, text and data files, graphics and images, browsing the WWW and visiting different Web sites for obtaining relevant information. The information is obtained by visiting different Hyper Text Transfer Protocol (HTTP) addresses and making use of powerful search engines like YAHOO, ALTA VISTA, WEB CRAWLER, NET SEARCH and others that are available on the INTERNET. Most Web sites can also be reached and the relevant information that is required can be downloaded directly through the Hyper Text Markup Language (HTML) and JAVA software with the proper protocols that are available on the INTERNET. NIC has provided a dedicated INTERNET address, viz., CRIDA@AP.NIC.IN, to the institute for interaction with millions of users on the INTERNET. The institute has been provided with a facility for interacting with the Post Office Protocol (POP3) Server of NIC for exchange of mails through the INTERNET directly. It is observed that E-mails are transmitted and received without any technical problem and in quick time through the INTERNET, unlike over the erstwhile PSTN through a dial-up and a low speed MODEM. Apart from Netscape Navigator Gold, Eudora Light and Alexa software are also used for exchanging E-mail and other information through the POP3 facility provided by NIC, New Delhi.
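The practical effect of the link speeds quoted in sections 1 to 3 can be compared with a short calculation. The sketch below treats each quoted figure as kilobits per second (an assumption, since the text gives speeds in "Kilo bauds per second") and ignores protocol overhead and retransmissions, so the results are idealised lower bounds; the file size of 100 kilobytes is a hypothetical example.

```python
# Link speeds quoted in the text, taken here as kilobits per second (assumption).
LINK_SPEEDS_KBPS = {
    "NICNET dial-up modem": 1.2,
    "ARISNET Motorola modem": 19.2,
    "VSAT transmission": 32.0,
    "VSAT downloading": 256.0,
}

def transfer_seconds(size_kilobytes: float, speed_kbps: float) -> float:
    """Ideal transfer time, ignoring protocol overhead and retransmissions."""
    bits = size_kilobytes * 1000 * 8
    return bits / (speed_kbps * 1000)

if __name__ == "__main__":
    file_size_kb = 100   # hypothetical file size
    for link, speed in LINK_SPEEDS_KBPS.items():
        print(f"{link}: {transfer_seconds(file_size_kb, speed):.1f} s")
```

Under these assumptions a 100 kilobyte file needs roughly eleven minutes on the 1.2 modem, about 42 seconds on the 19.2 modem, and only a few seconds when downloaded through the VSAT.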
4. Establishment of INTRANET and INTERNET through LAN
ICAR has provided Novell Netware Version 4.10, which does not have INTRANET and INTERNET facilities. Hence it is not possible to get the INTERNET facility for all nodes in the LAN through the existing Novell Netware software (Version 4.10) without dedicated IP-addresses. The ultimate requirement of establishing an INTRANET and accessing the INTERNET on the different user nodes in CRIDA has been met by installing a Windows-NT server as an INTERNET server for different users through the LAN. The users are able to browse the INTERNET through PROXY server software by getting connected to the Windows-NT server. 25 Pentium systems located in different rooms have been linked to the UNIX and Windows-NT servers for E-mail and INTERNET respectively. A dedicated Switch and 3 Hubs are used for connecting the users to the servers. The UNIX and Windows-NT servers are connected to the FTDMA Ku-Band VSAT for satellite communication and INTERNET browsing.
NIC has recently improved the bandwidth of the VSAT and many users are able to access E-mail and the INTERNET without any difficulty. CRIDA has been making the best use of the INTERNET facility for research and development in different activities, thus making full use of the hardware and software.
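The proxy arrangement described above, in which LAN nodes without their own Internet addresses reach the Web through one server, can be sketched from the client side. The proxy address and port below are hypothetical placeholders, and the example uses Python's standard library rather than the software actually installed at CRIDA.

```python
import urllib.request

# Hypothetical address of the proxy service running on the LAN's Internet server.
PROXY_URL = "http://192.168.0.1:8080"

def make_proxied_opener(proxy_url: str = PROXY_URL) -> urllib.request.OpenerDirector:
    """Build an opener that sends every HTTP request via the given proxy."""
    proxy = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(proxy)

if __name__ == "__main__":
    opener = make_proxied_opener()
    # A call such as opener.open("http://www.yahoo.com") would now travel from the
    # node to the proxy, and only the proxy would contact the Internet directly.
    print(type(opener).__name__)
```

Only the server needs a dedicated IP-address; each node simply points its browser or client software at the proxy.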
5. Role of VSAT in satellite communication
Reliance on traditional ways of doing business, like personal meetings, signed papers, and communication through normal terrestrial (telephone) lines, is fast being replaced by wireless technologies like VSATs.
About 6000 VSATs have been installed in the country from 1995 onwards.
A VSAT is a dish antenna along with integrated units installed between 2 or more user locations. VSATs relay communication signals between 2 locations through a satellite. They are a suitable and ideal alternative to terrestrial communication lines. Like terrestrial lines, VSATs also rely on pipes, invisible ones in the sky, which allow information to flow back and forth.
VSATs allow establishment of dependable links to sites where conventional telecom infrastructure is poor or non-existent. This is useful for organizations which have operations in remote areas. They can easily be set up even in remote areas owing to their compact size, ruggedness and ease of installation.
VSATs offer a cheaper and more cost effective means to communicate as compared to land lines. The cost of a VSAT operation is distance independent.
VSATs transmit high volumes of voice, data and video anywhere in the country and also in the entire world. Corporates and different organizations are trying to march further by deploying VSATs for communication. In India there are at least 8 VSAT service providers competing in the market.
A VSAT terminal consists of 3 elements: a dish shaped antenna ranging in size from 1.2 m to 3.8 m; an outdoor unit mounted on the antenna for signal reception and transmission; and an indoor unit which connects to computer, telephone and customer equipment.
VSATs help companies in avoiding the long delays involved in deployment of conventional leased lines provided by DOT.
A VSAT terminal transmits a radio signal to the satellite. The radio signal carries data, voice or images. The satellite has a transponder which receives the signal, amplifies it and sends it back to the receiver.
A VSAT terminal operates in conjunction with a large aperture hub earth station. This hub is installed and operated by a VSAT service provider. The hub directs the signals to and fro between the satellite & the communicating VSATs, besides managing data transmission between them.
Advantages of VSATs
Independent of terrestrial infrastructure: Leased line networks from DOT do not normally service locations other than major cities, and line availability issues necessitate a lead time of 6 to 8 months. VSATs are deployed irrespective of these problems.<br />
Distance independent costs: The cost of a VSAT network and the cost of data transmission are independent of distance and country-specific tariffs. Operational costs are lower as compared to leased lines.<br />
High reliability: VSATs offer 99.5 % uptime, compared to at best 95 % offered by terrestrial lines, owing to very few or negligible points of failure. They offer cross-border connectivity as well. They are also useful for business houses that operate globally.<br />
Easy scalability: With an available network, new sites can be commissioned rapidly with relatively little effort. Increased requirements of voice, data or video transmission from existing sites can also be met comfortably, without delay, from a central management system.<br />
VSATs offer rooftop-to-rooftop connectivity. Terrestrial backhaul lines are not required. Thus there will be no problems like those in land lines.<br />
Organizations that matched their network needs to the right VSAT provider infer that VSAT services deliver connectivity that conventional network solutions just cannot match.<br />
VSATs can be used across industries: VSATs provide cost-effective solutions and meet all communication needs ranging from on-line banking, ATMs, manufacturing, movement of relocation of orders to factories, and online reservations on airlines, railways, hotels etc. These are also used in courier companies, RBD, financial institutions, publishing houses, television channels, stock broking, heavy engineering, consumer durables etc.<br />
Many organizations like Pepsi, Compaq, Citibank, Hong Kong Bank, Unilever, Mahindra Ford, Procter and Gamble, Kelloggs, Nicholas Piramal and others have reaped the benefits of installing VSATs in their respective industries. Benefits have ranged from shorter order processing times, fewer stock outs and more control to savings in their operational costs.<br />
Will VSATs save money?<br />
* Yes. Consider the following usage: voice communication is 75 minutes per site per day. Each site sends on average 30 A4-sized faxes per day. Data transfer is 2 MB per site per day. Working days per year = 300.<br />
Suppose a company goes for a 9.6 Kbps link using DAMA technology, at a cost of 11.5 lakhs per VSAT. The total capital investment of 46 lakhs is amortized over 5 years. The AMC is at least 10 % of capital cost and license fees to government are Rs. 55,100/- per VSAT.<br />
DOT charges Rs. 43/- per minute for voice, fax and data communication, whereas a VSAT service provider has offered a rate of Rs. 20/- per minute for dial-up connection (V-Dial DAMA service from Telstra V-Comm).<br />
Not taking depreciation into account, a company would save Rs. 28,90,000/- (42%) every year on its annual communication bill. After providing for depreciation, the payback period for capital investment would be 2 to 3 years.<br />
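The arithmetic behind this comparison can be sketched as follows. The traffic figures, tariffs, AMC rate and license fee are taken from the passage above; the number of sites (4) is inferred from 46 lakhs / 11.5 lakhs, while the fax time per page and the conversion of 2 MB of data into connect time on a 9.6 kbps link are assumptions made only for illustration, so the result approximates rather than reproduces the quoted saving.<br />

```python
# Rough annual cost comparison between DOT lines and a DAMA VSAT network.
# Values marked "text" come from the passage; values marked "assumed" do not.

LAKH = 100_000

CAPITAL_TOTAL = 46 * LAKH          # text: total capital investment (Rs.)
CAPITAL_PER_VSAT = 11.5 * LAKH     # text: cost per VSAT (Rs.)
SITES = round(CAPITAL_TOTAL / CAPITAL_PER_VSAT)  # inferred from the two figures

VOICE_MIN_PER_DAY = 75             # text
FAX_PAGES_PER_DAY = 30             # text
FAX_MIN_PER_PAGE = 1.0             # assumed: one minute per A4 page
DATA_MB_PER_DAY = 2                # text
LINK_KBPS = 9.6                    # text
WORKING_DAYS = 300                 # text

DOT_RATE = 43                      # text: Rs. per minute
VSAT_RATE = 20                     # text: Rs. per minute
AMC_FRACTION = 0.10                # text: at least 10 % of capital cost
LICENSE_PER_VSAT = 55_100          # text: Rs. per VSAT per year


def data_minutes(megabytes: float, kbps: float) -> float:
    """Connect time to move the data, assuming 1 MB = 8000 kilobits and no overhead."""
    return megabytes * 8000 / kbps / 60


def annual_minutes() -> float:
    """Total billable minutes per year across all sites."""
    per_site_per_day = (
        VOICE_MIN_PER_DAY
        + FAX_PAGES_PER_DAY * FAX_MIN_PER_PAGE
        + data_minutes(DATA_MB_PER_DAY, LINK_KBPS)
    )
    return per_site_per_day * WORKING_DAYS * SITES


def annual_costs() -> tuple[float, float]:
    """Return (DOT cost, VSAT cost) per year, excluding depreciation."""
    minutes = annual_minutes()
    dot_cost = minutes * DOT_RATE
    vsat_cost = (
        minutes * VSAT_RATE
        + AMC_FRACTION * CAPITAL_TOTAL
        + LICENSE_PER_VSAT * SITES
    )
    return dot_cost, vsat_cost


if __name__ == "__main__":
    dot, vsat = annual_costs()
    saving = dot - vsat
    print(f"DOT bill:  Rs. {dot:,.0f}")
    print(f"VSAT bill: Rs. {vsat:,.0f}")
    print(f"Saving:    Rs. {saving:,.0f} ({saving / dot:.0%})")
```

Under these assumptions the saving comes out in the same range as the 42% quoted above; changing the assumed fax time per page shifts the figure slightly.<br />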
Invisible savings include guaranteed uptimes (99.5 %) and greater connectivity. Better voice quality, more reliable faxes and data transfers, and options of teleconferencing and E-mail reduce the need for repeated communication.<br />
Better service and commercial terms offered by the service provider lower the unit cost for higher usage. Videoconferencing could easily reduce the cost of travel for review meetings, training programmes and annual planning processes.<br />
Faster flow of critical communication (stock outs, dispatches, production schedules) would ensure an increase in business.<br />
* For organizations which operate multiple locations or have higher communication needs, savings in operating expenses will be considerably higher. For locations which need only data communications, the TDMA VSAT technology would serve the purpose at only 40% of the cost of DAMA VSAT or less, thereby ensuring that the break-even point is reached even earlier.<br />
How to decide on a service provider?<br />
Look for a service and a solutions approach. Reject mere equipment vendors.<br />
Be sensitive to transparency in billing systems and to item-wise, location-wise billing. Some service providers typically operate by quoting low prices. They would make their money in annual service charges, all at the customer's expense.<br />
Look for a one-stop shop: a service provider who performs activities starting from consultancy to network design, equipment supply, network implementation and even network management. You are better off focusing on your core strengths, not running after an area in which you may not have expertise.<br />
Ask for performance guarantees and other customer-friendly features such as 24-hour help lines, trained manpower, previous records etc. Look for a service provider who is moving with technology, with worldwide trends, and who could be your long-term partner. Price is not directly related to efficiency. The lowest bidder in price may be the lowest in service too.<br />
PUTTING EDUCATION ONLINE : A CASE STUDY<br />
A. R. Thakur<br />
Bioinformatics Centre<br />
Department of Biophysics,<br />
Molecular Biology and Genetics and Computer Centre<br />
Calcutta University<br />
Information Technology is rapidly becoming the all-encompassing engine of development. This development is fuelled by an exponential growth in computing power, as first observed by Intel co-founder Gordon Moore: microchips double in power and halve in price roughly every 18 months. Along with this, the second and equally important component which is pushing the information revolution is the rapid conceptual and technical development in the field of communication. The combined effect of development in these two areas, which has in effect become the third component pushing the information revolution, is the concept of distributed computing. The idea of enabling computers to work with documents stored in other computers gradually culminated in what is called the WWW or World Wide Web of the Internet.<br />
The number of computers serving as hosts on the Internet has grown exponentially to about 50 million all over the world and the number is increasing every day. The Internet wave has reached our shore late and only during the last 3 years has it really caught on. Initially it has been pushed forward through the combined efforts of ERNET, VSNL, DOT and NIC. This distributed, networked availability of information has progressed much beyond transfer of electronic mail or browsing information on web sites (web surfing).<br />
Information Technology, at the threshold of the twenty-first century, is the most important tool that will form the principal component of all our economic/social activities including education. It may no longer be a fashionable proposition to debate whether harnessing this component for development is desirable; we may have reached a stage wherein it is imperative that we do so. Questions may be asked whether it is affordable, and the answer is a simple yes.<br />
A major impediment in this technological revolution has been a few deeply ingrained misconceptions. These are:<br />
One has to be a mathematical wizard to use computers<br />
Actually, 95% of computer users are people who hardly know anything about programming. Within the last decade and a half, the advent of user-friendly software for different types of work has made working with computers a simple task for any literate person. The technology to be handled is no more complicated than typing on a keyboard or moving the cursor with the help of a 'mouse', to which one can easily become accustomed. It is no accident that Bill Gates is the richest person on Earth, with an estimated income of $500 per second. The revolution that he initiated was to make software user-friendly to the extent that it made people shed their inhibitions and start accepting PCs as part of their daily life.<br />
This is needed only for those involved in Science and Technology<br />
Any information that may be needed in any area which is now part of this process is available. I shall briefly narrate an incident to illustrate this point. Recently we were in a training session on the Internet with teachers of Kidderpore College, mainly from the faculty of arts. There was one request for an anthology of Urdu poetry, and a site could be found with poems given in Urdu script. The second request was for a list of works available on Bhartendu Harischandra at the India Office Library. Yes, we had to struggle a bit to get these, but ultimately the information could be retrieved.<br />
This is not an affordable technology in developing economies like ours<br />
I would like to submit that once we start thinking of quality teaching, which might determine the rate of our economic growth, larger sections of the society can be reached properly only on adoption of these technologies. Distance Learning Education has now taken on a new dimension in that it no longer merely fulfils the necessity of reaching out to the underprivileged and underachievers; today Distance Learning Education is synonymous with extending educational opportunities to those who have already become professionals and would like to venture into new areas. Thus it should cover course curricula meant for reskilling people with a fair amount of competency who are limited in their ability to move into a fixed educational environment for a specific stipulated period.<br />
It has been suggested that the education system in West Bengal has not kept pace with the more developed regions. One would like to contest the data, since one does see students from West Bengal manning the various National Institutes in a larger proportion than the relative population strength of the State. However, that does not mean that one can afford to be complacent. In fact, it immediately suggests the opening of new disciplines, which is bound to attract the more adventurous students, who are not afraid to cross boundaries in order to gain knowledge.<br />
Java and other interactive technologies bring new possibilities for developing content on the web. What does this capability mean for information dissemination and communication? The capabilities of interactive technologies can now be used to effectively support communication amongst users. The significance and role of interactive learning lies in providing an environment for individual and collaborative work both within the University Department and externally. As an intranet can provide seamless access to a variety of information resources, it may be used to broadcast interactive, structured course work on a particular subject over the network, which we might call Tele teaching.<br />
This will involve:<br />
i) Multimedia cooperative content creation. Every teacher creates his/her own course based on a modular collection of semi-independent units (e.g. textual explanations; problems; pictures; applets; video clips of demos).<br />
ii) Database to store such resource units.<br />
iii) Teachers' interface for different assembly of a course by 'drag and drop', which will involve ~1000 html pages; ~800 pictures (stored as gif files); ~50 Java applets; ~300 homework problems; ~10 interactively corrected multiple choice practice tests with solutions; ~15 separate questions.<br />
How is this going to work?<br />
Students' computers to have a Browser with frame capabilities:<br />
Top frame for navigation: navigation buttons<br />
Selection of chapters and topics via pulldown menus<br />
Checking own current progress<br />
Send E-mail to teachers<br />
Enter dedicated Problem Queries section<br />
Learn about the System (System Tour guide)<br />
Homework engine to have:<br />
Individualized problems: same text, different data for each student<br />
Immediate feedback; in many problems, hints to be tailored to incorrect answers.<br />
The entire set of homework problems can be created/modified by the instructor only through the use of a browser.<br />
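The two homework-engine features, individualized data and hints tailored to incorrect answers, can be illustrated with a short sketch. This is not the system proposed here; the dilution problem, the identifiers and the hint text are hypothetical examples chosen only to show the idea of seeding a random generator from the student's identity so that each student always sees the same personal variant.<br />

```python
import hashlib
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    text: str
    answer: float


def student_rng(student_id: str, problem_id: str) -> random.Random:
    """Derive a repeatable random generator from the student and problem IDs."""
    digest = hashlib.sha256(f"{student_id}:{problem_id}".encode()).hexdigest()
    return random.Random(int(digest, 16))


def make_problem(student_id: str, problem_id: str = "dilution-1") -> Problem:
    """Same wording for everyone, different numbers for each student."""
    rng = student_rng(student_id, problem_id)
    stock_ml = rng.randint(5, 50)              # volume of stock solution
    total_ml = stock_ml * rng.randint(2, 10)   # final volume after dilution
    text = (
        f"{stock_ml} ml of stock solution is diluted to {total_ml} ml. "
        "What is the dilution factor?"
    )
    return Problem(text=text, answer=total_ml / stock_ml)


def check(problem: Problem, submitted: float, tolerance: float = 1e-6) -> str:
    """Immediate feedback, with a hint tailored to one common mistake."""
    if abs(submitted - problem.answer) <= tolerance:
        return "Correct."
    if abs(submitted - 1 / problem.answer) <= tolerance:
        return "Hint: you divided stock volume by final volume; invert the ratio."
    return "Incorrect, try again."


if __name__ == "__main__":
    problem = make_problem("student-001")
    print(problem.text)
    print(check(problem, problem.answer))
```

Because the variant is derived from the identifiers rather than stored, the engine can regenerate a student's problem at marking time without keeping a copy of it.<br />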
Instructor Tools:<br />
View Table <strong>of</strong> Contents (ToC) <strong>of</strong> the course<br />
Copy ToC from another class<br />
Edit ToC according to the teacher's choice/students' level<br />
Edit/introduce homework problems<br />
Course administration: Register new students/drop existing students/change due dates for homework/assign marks/assign system's e-mail recipients.<br />
Objective of the proposed project:<br />
Calcutta University has nearly 230 affiliated undergraduate colleges. Recently a course on Environment has been introduced as a compulsory paper at the Undergraduate level. There is a strong need for interactive course material to be made accessible to the teachers and students. A sophisticated computer network infrastructure, involving an optical fibre backbone connecting the different buildings and Departments in the Rashbehari Prangan, is in place. From Alipur we have established a dial-in PSTN connection between the two Routers. The connectivity at 9.6 kbps gives 200 ms connect time, whereas for the I-NET X.25 leased line it is 400-600 ms. The VSAT (64/128 kbps) is being used for Internet browsing by about 100 nodes spread all over the four campuses. We are in a position to offer the undergraduate Colleges connectivity. A kit consisting of a Router, a hub and a modem is now ready, with which the connectivity can be tested from any college/institution using a PSTN line.<br />
This project envisages development of on-line courses based on the Syllabus for i) Environmental Studies, ii) Computer Science (both at Pass and Honours level), iii) Molecular Biology, iv) Electronics Science etc., at the Undergraduate level.<br />
Recent development<br />
India was one of the first few countries to take a giant step, in 1986, and establish a nationwide Bioinformation System of India. A Distributed Information Network under the aegis of the Department of Biotechnology was established in various Universities and research institutes. As a consequence of this initiative, computer literacy and awareness amongst bio-scientists grew. A large number of public domain databases with regular updates, in the area of molecular biology and genetic engineering, became available to scientists in India. Computer hardware to carry out simple data analysis and modelling has also become available, and networking of computers is now provided on a regular basis through NICNET and ERNET established V-SAT linkage. Calcutta University has already established its own Network. The need of the present is development of human resources, and for this the network can be harnessed to teach more effectively in far-flung institutions by making course material available to them by teleteaching.<br />
International Status<br />
It is now globally felt that the fruits of information retrieval, processing and analysis should be put in a networked environment so that greater benefit can accrue to society. Even the market-driven economy of the western world understands the necessity of disseminating knowledge over the Net, so that browsing or surfing the Net is no longer for pleasure but necessary for gathering vital teaching material. Michigan State University has already set up a web site based Lecture Course on Physics.<br />
Methodology to be adopted<br />
The campus at Ballygunge Circular Road has an extensive LAN maintained with UTP cabling. Similarly the Central Library Complex at the College Street campus also has an operative LAN. These are connected via the I-NET cloud (using X.25 and X.28 PADs). The four-port Router at Rashbehari Prangan can provide Dial-in service via the I-NET cloud as well. Undergraduate Colleges could be tested as nodes where teleteaching material as well as Offline databases could be made available over the Intranet.<br />
Due to the ever-increasing importance of the Web as a distribution channel and communications vehicle, organizations are racing to meet the demand for media-rich content on the Internet and their intranets. Education and training on demand is just one example of the innovative use of media streaming. WebFORCE MediaBase streams audio and video to the desktop, bridging the gap between those learning and those teaching through a Web environment. Video lessons streamed to the desktop allow for education on demand, including both live and delayed access to a lesson. Real-time Webcasting allows students the flexibility to see lectures live from an off-site location, through the familiar interface of a Web browser. Courses on demand allow even greater flexibility by archiving and cataloguing lessons or speeches that can be searched for and delivered as needed with a keyword or topic search.<br />
Media delivery for educational institutions like colleges or schools<br />
Computer based Training applications. In the educational sphere the media server can be used for:<br />
Interactive computer based training<br />
Multimedia centres established by universities to train students in new technologies<br />
Distance learning, live recording and broadcast of classroom lectures over campus or external networks, and the storage and cataloguing of these lectures for future viewing.<br />
Archiving and cataloguing of the various media assets at different departments of the educational institutions.<br />
Repurposing media assets for on-line education.<br />
The application requirements must be identified, and for that certain key questions should be answered:<br />
What is the client platform?<br />
What is the underlying network (topology and protocol) for distribution of the media?<br />
What are the video quality, bandwidth and format requirements?<br />
How many clients are being served concurrently?<br />
Is there a need for multicasting?<br />
How much content will the customer be receiving?<br />
Are media server, storage content and network management important?<br />
What is the media format used (MPEG1, MPEG2, H.263)?<br />
Is live encoding and broadcast required or desired?<br />
Is 'server-up-time' a critical issue?<br />
Before a media server solution is decided it is important to understand the type of application that is being proposed. Some of the important information one must have:<br />
Display device for the media: Windows 95; Windows NT; Irix; Solaris; AIX; Network Computers<br />
Transport protocol: IP or ATM<br />
Network topology from the server to the client<br />
The video format that is being used currently. WebFORCE MediaBase supports MPEG1, MPEG2 and H.263 formats. Video can be streamed using native IP protocols (UDP) or AAL5 for a pure ATM network.<br />
Number of concurrent streams that is being planned, along with the hours of video that need to be stored and streamed.<br />
Hardware or software decode for the client side should be identified.<br />
WebFORCE configuration for O200 with CPU (4 R10K) with 256 MB RAM: 60 hrs of video content may be stored using 100 streams. This would need an additional disk of 56 GB. The price is ~ $20000.00.<br />
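The storage and stream figures quoted for this configuration can be related by a simple sizing calculation. The sketch below assumes that the 56 GB of disk holds exactly the 60 hours of video, that 1 GB is 10^9 bytes, and that all 100 streams run at the same average bitrate; none of these assumptions is stated by the vendor figures above.<br />

```python
def implied_bitrate_mbps(disk_gb: float, hours: float) -> float:
    """Average video bitrate if `hours` of content fill `disk_gb` (1 GB = 10**9 bytes)."""
    bits = disk_gb * 1e9 * 8
    seconds = hours * 3600
    return bits / seconds / 1e6


def storage_gb(hours: float, bitrate_mbps: float) -> float:
    """Disk space needed to hold `hours` of video encoded at `bitrate_mbps`."""
    return hours * 3600 * bitrate_mbps * 1e6 / 8 / 1e9


def aggregate_mbps(streams: int, bitrate_mbps: float) -> float:
    """Server output needed when every stream is active at once."""
    return streams * bitrate_mbps


if __name__ == "__main__":
    rate = implied_bitrate_mbps(disk_gb=56, hours=60)
    print(f"Implied bitrate:        {rate:.2f} Mbps per stream")
    print(f"Peak load, 100 streams: {aggregate_mbps(100, rate):.0f} Mbps")
    print(f"Disk for 120 hours:     {storage_gb(120, rate):.0f} GB")
```

The implied rate of roughly 2 Mbps per stream indicates why such a server has to sit on the campus LAN rather than behind the 9.6 to 128 kbps wide-area links discussed elsewhere in this paper.<br />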
Once this is standardized, interactive courseware developed to cover some of the specialized subjects like Environment, Microbiology, Molecular Biology, Electronics and Computer Science could be kept for access by both students and teachers of the Undergraduate Colleges. We believe the interactive programme will ensure that the teachers would not feel threatened, as they might in a top-down approach.<br />
How is the project to be integrated into the educational system?<br />
The proposed West Bengal net of universities<br />
The idea of this network is born of the immediate need to establish communication between the major educational institutes in West Bengal for data, email and remote education programmes. Right now a lot of these institutes, e.g. IIM, ISI, Calcutta University, S. N. Bose Institute etc., have their own LAN and access to the WWW on leased lines. All these centres of education are to be brought under one platform using a resilient, upgradable, scalable, high-bandwidth backbone network. At a later stage this network can also be used for private Voice traffic, which will help protect investment. The various affiliate colleges under the various universities should be able to dial into the backbone for sharing of resources. This intranet should also be used for internet access from one or multiple gateways in the network, in such a way that there is high availability of internet access to all users in the network. The current network at Calcutta University connecting the various campuses can be modelled and used as a building block in the design and construction of this complicated intranet.<br />
The current Calcutta University network<br />
The Calcutta University network can be the basis of our proposed network. In the light of the above we need to look into the design of this IBM switch and router based intranet. The campuses in this intranet are:<br />
Rajabazar Campus<br />
Ballygunge Campus<br />
Alipore Campus<br />
College Street Campus<br />
The various branches are connected via the INET X.25 network. Alipore has dial-up connectivity to Rajabazar. For internet access, right now the Rajabazar campus acts as the gateway to ERNET at 64 kbps, through VSAT.<br />
Concerns in the network:<br />
The network is not secured. It does not have a proper proxy/firewall, which might lead to data hacking and intentional intrusion into this network.<br />
Bottlenecks in bandwidth can be a cause of concern as the network grows with more colleges dialling into this network. The backbone is presently only at 9.6 kbps, at which only email transfer can happen smoothly. Mission-critical applications and multimedia applications, e.g. a remote teaching programme, will definitely need much higher bandwidth. Video conferencing too requires much higher bandwidth.<br />
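A rough calculation shows why a 9.6 kbps backbone suits only e-mail. The link speeds below are those mentioned in this paper; the file sizes and the 64 kbps figure for a single compressed video stream are assumed purely for illustration, and protocol overhead and contention are ignored.<br />

```python
LINK_SPEEDS_KBPS = [9.6, 64, 128, 512]   # speeds mentioned in the text


def transfer_seconds(size_kilobytes: float, link_kbps: float) -> float:
    """Ideal transfer time (1 kilobyte = 8 kilobits, no protocol overhead)."""
    return size_kilobytes * 8 / link_kbps


def supports_streams(link_kbps: float, stream_kbps: float, streams: int = 1) -> bool:
    """True if the link can carry the given number of constant-rate streams."""
    return stream_kbps * streams <= link_kbps


if __name__ == "__main__":
    email_kb = 10        # assumed size of a short e-mail
    lesson_kb = 1000     # assumed size of one illustrated lesson page set
    video_kbps = 64      # assumed rate of one low-quality video stream
    for speed in LINK_SPEEDS_KBPS:
        print(
            f"{speed:>5} kbps: e-mail {transfer_seconds(email_kb, speed):6.1f} s, "
            f"lesson {transfer_seconds(lesson_kb, speed):6.1f} s, "
            f"video stream fits: {supports_streams(speed, video_kbps)}"
        )
```

On these assumptions a short e-mail crosses the present backbone in a few seconds, whereas a single lesson takes many minutes and even one video stream does not fit.<br />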
For internet access the network is dependent on ERNET. To cater to the large number of Internet users in this huge network, a fat pipe to VSNL at 512 kbps or more would be ideal.<br />
How can we design this network?<br />
The final network as we envisage it is to encompass the following Educational Institutes:<br />
Calcutta University - 4 locations<br />
Jadavpur University - 2 locations<br />
Vidyasagar University - Midnapore<br />
Rabindra Bharati University - Calcutta<br />
Viswa Bharati University - Santiniketan<br />
North Bengal University - Siliguri<br />
Burdwan University - Burdwan<br />
Kalyani University - Kalyani<br />
BE College - Howrah (Calcutta)<br />
ISI Calcutta<br />
IIM Calcutta<br />
SN Bose Institute Calcutta<br />
Saha Institute Calcutta<br />
IACS Calcutta<br />
Bose Institute<br />
All the affiliate colleges under the universities (> 400)<br />
Fisheries University - Calcutta<br />
IIT - Kharagpur<br />
With all these institutions brought within the limits of a single network, they need adequate bandwidth for effective data communication. To protect investment as well, the network should be designed only after thorough brainstorming and a careful and meticulous study of requirements/applications and the various options available in terms of the WAN media. The design is also somewhat dependent on the extent of security and network monitoring required.<br />
To start with we can break the network into phases and look into the various media available at this stage. The network designed today should be able to accommodate the new technologies of tomorrow.<br />
The various media:<br />
To start with, one can continue with INET. However, keeping in mind that this would include running critical applications like the remote teaching programme, library on the net, file download and multimedia applications, as well as video conferencing and voice at a later stage, a higher leased bandwidth is definitely required, and this is what holds the key to the effectiveness and usability of the network.<br />
The various alternatives for having high bandwidth within the intranet are:<br />
64/128 K DOT terrestrial leased lines<br />
64/128 K VSAT Priority Assigned Multiple Access links<br />
Demand Assigned Multiple Access VSAT links<br />
ISDN Basic Rate Interface links from DOT (2B+D)<br />
For access to the internet a good option is to have one/multiple leased links to VSNL through multiple gateways within the network. These links can be established through DOT leased links/ISDN.<br />
A Tiered Network<br />
To ensure a properly planned network that can be administered with ease we need to tier the network as follows.<br />
The network can be constructed in line with the internet, which has a backbone on OSPF and an Access network, and is an IP network in totality.<br />
A router based IP backbone connecting the nodal universities<br />
A strong, resilient and redundant backbone holds the key to the functionality and scalability of the entire network. Within Calcutta and its suburbs we can have the backbone on ISDN dial up. That can give us up to 128 kbps of bandwidth. Between Calcutta and distant locations it is ideal to have 64/128 k bandwidth using VSAT/DOT leased links. VSAT will definitely be better in terms of reliability than DOT leased circuits.<br />
Consultation with DOT/VSNL necessary<br />
A remote access network<br />
To start with, the various affiliate colleges can dial up into the nodal points through a RAS (remote access server) and can get into the network. It is very important that colleges should get a committed high bandwidth on demand. Here the options are PSTN dial up/ISDN dial up/9.6 k leased/64 kbps leased. The networking equipment should however be able to support 64 kbps in future.<br />
Consultation with DOT essential<br />
Internet access<br />
This intranet should also be available to the WWW, and the users of this University network should have unhindered access to the internet.<br />
To ensure this there should preferably be two 64k leased links to VSNL from two nodal centres.<br />
The reason for having 2 links instead of 1 is to distribute the internet traffic through 2 points and hence reduce bandwidth clogging at a single gateway.<br />
Consultation with VSNL necessary.<br />
Network security and monitoring<br />
A network of this stretch and magnitude needs utmost security for seamless and smooth functioning. Hence it should have multiple Proxy servers of high processing power so that:<br />
1. The network is hidden from the internet and hence secured from being hacked, by firewalling mechanisms.<br />
2. The overheads on the backbone are reduced and hence the network becomes faster, due to proxy caching.<br />
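How proxy caching reduces the load on the backbone can be shown with a small simulation. The class below is only an illustrative least-recently-used cache, not the behaviour of any particular proxy product; the `fetch` callback stands in for a request that would otherwise travel over the backbone, and the page names are placeholders.<br />

```python
from collections import OrderedDict
from typing import Callable


class CachingProxy:
    """Serve repeated requests from a local store, evicting the least recently used page."""

    def __init__(self, capacity: int, fetch: Callable[[str], str]) -> None:
        self.capacity = capacity
        self.fetch = fetch                       # called only on a cache miss
        self.cache: OrderedDict[str, str] = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        if url in self.cache:
            self.cache.move_to_end(url)          # mark as most recently used
            self.hits += 1
            return self.cache[url]
        self.misses += 1
        body = self.fetch(url)                   # this request crosses the backbone
        self.cache[url] = body
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)       # drop the least recently used page
        return body


if __name__ == "__main__":
    backbone_requests = []

    def fetch(url: str) -> str:
        backbone_requests.append(url)
        return f"contents of {url}"

    proxy = CachingProxy(capacity=2, fetch=fetch)
    for page in ["a", "b", "a", "c", "a", "b"]:
        proxy.get(page)
    print(f"hits={proxy.hits}, misses={proxy.misses}, "
          f"backbone requests={len(backbone_requests)}")
```

Every hit is a request that never reaches the wide-area link, which is the sense in which caching makes the network faster for users behind the proxy.<br />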
Again, to prevent downtime through early identification of faults, the network needs to be managed using a central SNMP Management Station.<br />
Local LAN at each site<br />
Finally one very important part of the network is the LAN at each site. In order to<br />
effectively use the backbone the LAN at each site should be state of the art. All the<br />
sites should preferably have structured cabling with a switched environment and fibre<br />
at the backbone and UTP at desktops. The campus LANs can well be built around ATM<br />
switches.<br />
Servers<br />
In addition to the proxy servers there should be DNS servers, mail servers,<br />
terminal servers, a digital library server and web servers at one or more nodal sites, with<br />
replication if possible at other sites.<br />
This thus forms the basis of the proposed intranet. However, detailed studies on<br />
1. Load calculation<br />
2. Degree of redundancy<br />
3. The type of routing protocol to be used<br />
4. The extent of security<br />
5. The type of management<br />
6. IP planning etc.<br />
are required to finally arrive at the ultimate design.<br />
In this regard we are looking towards the design of a network that is<br />
Technically flawless<br />
Commercially viable<br />
Scalable & upgradable<br />
Able to grow
How do we go about building it?<br />
The Calcutta University Network needs to be augmented.<br />
We have been given one Class C series of IP addresses (256). The current scheme<br />
has exhausted the list. We are to put in a proxy server. We have downloaded<br />
Linux/Windows NT based demo versions. These are to be ported to the<br />
Compaq Windows NT servers for which we have already placed an order.<br />
We need to put a Firewall for security. Price Rs. 2. lakhs<br />
The RAM of the PC servers and the machines currently used for internet<br />
access has to be increased. For this an order has already been placed.<br />
The 9.6 kbps X.25 leased line INET needs to be upgraded to 64 kbps.<br />
A CD-Juke box has to be put in conjunction with the CD-NET at College Street<br />
so that we can start the Off-line database service. Price DM 28,000.00 (NSM,<br />
Germany)<br />
Web Server: Sun Ultra Sparc II; Rs. 10 lakhs<br />
Digital Library server: Origin 2000/RAID; Rs. 20 lakhs<br />
Dial-in Server: Capable of 5 telephone lines on a hunting mode; Rs. 50000.00<br />
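The remark above that a single block of 256 addresses has been exhausted can be checked with Python's standard ipaddress module. The block 192.168.1.0/24 used here is a hypothetical private range of the same size; the University's actual addresses are not given in the text.

```python
import ipaddress

# Hypothetical block of Class C size; the real allocation is not stated.
block = ipaddress.ip_network("192.168.1.0/24")

total_addresses = block.num_addresses        # every address in the block
usable_hosts = len(list(block.hosts()))      # excludes network and broadcast addresses

# One possible way of dividing the block, e.g. between four sites.
subnets = list(block.subnets(new_prefix=26))

print(total_addresses, usable_hosts)
for subnet in subnets:
    print(subnet, subnet.num_addresses)
```

Of the 256 addresses only 254 can be given to machines, and every further split into subnets loses two more addresses per subnet. This is one reason a proxy server, which lets many machines share a few public addresses, helps when the list is exhausted.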
Budget Proposal by March 1999<br />
CD Juke Box<br />
Software for CD-NET<br />
Upgrade 9.6 kbps to 64 kbps<br />
Upgrade RAM<br />
Dial-in-Server<br />
Proxy/Firewall Software<br />
5-telephone lines<br />
Laptop Computer<br />
Libsys<br />
Software for On-line Teach<br />
workstation for On-line<br />
ERNET 2 Mbps upgradation<br />
Total:<br />
What would we be getting<br />
1. Connectivity upgraded to 64 kbps<br />
2. Off-line database with 150 CDs.<br />
3. 5-6 colleges out of 100 colleges within the Calcutta Telephones area get connected.<br />
4. A kit comprising 1 router, 1 hub, 1 modem, 3-4 patch cords and 1 laptop<br />
computer is kept ready for checking the connectivity with colleges and<br />
universities.<br />
5. Preparation made for on-line teaching of courses from emerging areas.
EXISTING NETWORK AT CALCUTTA UNIVERSITY
WEB SITE DESIGN & HOSTING<br />
Bikash Panda<br />
HIG-188, Kanan Vihar, Bhubaneswar-751031<br />
The Internet's World Wide Web is like the Wild West: anarchic, disorganised,<br />
exciting and with minimal standards. By now there are approximately 48,00,000<br />
web servers maintaining about 45 crores of web pages. Every corporate house,<br />
educational & research institute, small business concern and even individual user expects<br />
to have a presence on the Web and, moreover, every web site expects to attract as<br />
many visitors as possible to browse its information content. People visit the sites which are well<br />
organised, informative, easy to navigate, interesting, good to look at and,<br />
above all, useful. This imposes a challenge on web designers to have an edge.<br />
There are no standards to define what a good site is. However, a consensus has<br />
emerged on the matter.<br />
The language of the World Wide Web is HTML, which stands for Hyper Text Markup<br />
Language. The word "Markup" indicates that HTML is a formatting language and not a<br />
programming language. This makes the language easy to learn & easy to<br />
implement. Web pages are basically HTML documents which are interpreted by Web<br />
browsers like Microsoft Internet Explorer or Netscape Navigator. An HTML document<br />
is an ASCII text file that contains HTML tags & these tags decide how the web page<br />
looks when browsed. Being ASCII files, they do not require any specialised<br />
compiler, interpreter or IDE to write or use them. One can use the common<br />
Notepad or Wordpad or even DOS's own edit.com or Unix's vi editor to write them.<br />
HTML files have an extension of .HTM or .HTML.<br />
The following section describes a few commonly used HTML tags and other web<br />
development concerns.<br />
The HTML tags are special keywords written between < and > signs. An example<br />
of an HTML tag is &lt;HTML&gt;. There is no hard & fast rule about writing the tags in uppercase,<br />
but it is advisable to use uppercase letters so as to differentiate them from the text of the<br />
page.<br />
A typical web page may have the following contents. Let us name the page myfile.htm<br />
&lt;HTML&gt;<br />
&lt;HEAD&gt;<br />
&lt;TITLE&gt;Welcome to CIFA, Bhubaneswar&lt;/TITLE&gt;<br />
&lt;/HEAD&gt;<br />
&lt;BODY&gt;<br />
Central Institute of Freshwater Aquaculture is situated in the outskirts of temple city of<br />
Bhubaneswar in Orissa.<br />
&lt;/BODY&gt;<br />
&lt;/HTML&gt;<br />
This page when viewed in your preferred browser would display a heading at the<br />
top of the screen as Welcome to CIFA, Bhubaneswar and the body of the browser<br />
would display the text 'Central Institute of Freshwater Aquaculture is situated in<br />
the outskirts of temple city of Bhubaneswar in Orissa.' Please note that when<br />
indicating the start & end of the tags, the end tag must have a / in it. You may find<br />
this used as &lt;HTML&gt; & &lt;/HTML&gt;, &lt;HEAD&gt; & &lt;/HEAD&gt; etc.<br />
In the browser window only the contents of &lt;BODY&gt; - &lt;/BODY&gt; are shown. The<br />
&lt;HEAD&gt; tag contains information which is not shown in the browser but has other<br />
uses, like the header information, author's name etc.<br />
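The split between header information and displayed body can be tried out without a browser. The sketch below, in Python, feeds the sample page (written out here with the standard HTML, HEAD, TITLE and BODY tags) to the standard library's html.parser and separates the title from the body text. The class name PageSplitter is made up for this illustration.

```python
from html.parser import HTMLParser

PAGE = """<HTML>
<HEAD>
<TITLE>Welcome to CIFA, Bhubaneswar</TITLE>
</HEAD>
<BODY>
Central Institute of Freshwater Aquaculture is situated in the outskirts of temple city of Bhubaneswar in Orissa.
</BODY>
</HTML>"""


class PageSplitter(HTMLParser):
    """Collects the TITLE text and the BODY text of a simple page."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # tags that have started but not yet ended
        self.title = ""
        self.body = ""

    def handle_starttag(self, tag, attrs):
        # The parser reports tag names in lowercase.
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # An end tag (the one with the /) closes the most recent start tag.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        if "title" in self.open_tags:
            self.title += data
        elif "body" in self.open_tags:
            self.body += data


splitter = PageSplitter()
splitter.feed(PAGE)
splitter.close()

title = splitter.title.strip()
body = splitter.body.strip()
print("Title bar :", title)
print("Page body :", body)
print("Unclosed tags:", splitter.open_tags)
```

If an end tag is left out of the page, the corresponding name stays in open_tags, which gives a quick check that every start tag has its matching end tag.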
The &lt;H1&gt; tag inside body displays the text as a heading.<br />
Example :<br />
&lt;H1&gt;Introduction to CIFA&lt;/H1&gt; would show "Introduction to CIFA" as a large<br />
heading.<br />
Smaller headings are possible with tags &lt;H2&gt; thru &lt;H6&gt;<br />
The following tags help us in formatting the text.<br />
&lt;P&gt; denotes the start of a new paragraph.<br />
&lt;BR&gt; tag puts a line break in the text<br />
&lt;B&gt; For making the text Bold<br />
&lt;I&gt; For making the text Italics<br />
&lt;U&gt; For making the text Underlined
Adding Pictures to Web Pages:<br />
Pictures speak a thousand words. Graphics make a web site attractive. All<br />
pictures must be converted to one of several digital formats, so you'll need a scanner<br />
and software (such as Adobe Photoshop) to manipulate the picture into the form you<br />
wish to display it in: the pictures don't appear there magically! To get your pictures to<br />
display on a Web page, you must use certain HTML tags to "point to" the picture files<br />
that, like your HTML files, have been uploaded to a server. Where and how you place<br />
the tags determines how the art will be viewed by a particular user.<br />
Pictures can be saved in a variety of formats; the GIF format is the most<br />
commonly recognized by various browsers, and is thus most commonly used. The<br />
JPEG format is also fairly common; it creates better quality photos, especially with<br />
scans. A program called GIF Converter is also helpful; it converts files saved in the
Macintosh PICT format to either a GIF or a JPEG, and allows you to edit the files.<br />
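Here is the most common tag used to find and place a picture on a Web page:<br />
&lt;IMG SRC="filename.gif"&gt;, where filename.gif stands for the name of your own picture file.<br />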
provided. Third, you will need a private account on a Web server (a computer<br />
permanently connected to the Internet) so you can upload your files to it, and other<br />
people can see them.<br />
You also must be able to transfer your files to the server. For IBM and<br />
compatibles, use any FTP (File Transfer Protocol) client (there's a basic one built in to<br />
Windows 95); one of the easiest to use is CuteFTP. From there, you have the choice<br />
of a few different options for getting your pages up on the Web.<br />
Depending on your circumstances at school or at work, you may have to pay a<br />
fee to keep your pages on the Web; the rates will vary from provider to provider.<br />
Universities will sometimes host student or faculty pages on their server for free<br />
or for a minor fee; if you work in a company that allows you to use their server, that's<br />
another option. If neither of these is possible, you'll need to find an independent ISP (Internet<br />
Service Provider), price the options, then upload the information to the provider so they<br />
can put it up for you. You will be paying a fee (most likely on a month-to-month basis) in<br />
this case. Fees could be flat, but many times they depend on how many people are<br />
accessing your site (called "hits"). The more hits, the more taxing it is on the server,<br />
and potentially, the more you'll pay.<br />
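The upload step described above (transferring your pages to the server by FTP) can also be scripted. The following is a minimal sketch using Python's standard ftplib module; the function names upload_page and publish are made up for this example, and the host name, user name and password are placeholders that your own provider would supply.

```python
import io
from ftplib import FTP


def upload_page(ftp, remote_name, html_text):
    """Send one HTML page over an FTP session that is already logged in."""
    data = io.BytesIO(html_text.encode("ascii"))
    # STOR is the standard FTP command for storing a file on the server.
    ftp.storbinary("STOR " + remote_name, data)


def publish(host, user, password, remote_name, html_text):
    """Connect, log in, upload a single page and disconnect."""
    # host, user and password are placeholders given by your provider.
    with FTP(host) as ftp:
        ftp.login(user, password)
        upload_page(ftp, remote_name, html_text)
```

A call such as publish("ftp.example.org", "myname", "secret", "myfile.htm", page_text) would then place the page on the server; all four values in that call are hypothetical.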
HTML Editors<br />
As one can imagine, writing HTML tags for longer documents can be very<br />
difficult & confusing. As of now there are hundreds of HTML editors which work in<br />
WYSIWYG (What You See Is What You Get) style and help you write good Web<br />
pages conveniently. The most popular ones are Microsoft FrontPage, HoTMetaL,<br />
HotDog, Dreamweaver etc.<br />
Web Design Considerations<br />
Here are a few web development guidelines for making a good web site<br />
Set Objectives for the Web Site:<br />
Define the target audience clearly (Whom do you want to influence)<br />
Estimate audience technology profile (e.g. bandwidth, type of browser etc.)<br />
Perform audience needs analysis<br />
Be clear about your purpose (sales, service, education, research, entertainment)<br />
Define the scope of Content:<br />
Do not use unnecessary words<br />
Provide useful information on each page<br />
Design for all browsers
Use Graphics Judiciously:<br />
Limit large images used for visual appeal only<br />
Keep the total size <strong>of</strong> graphics on a page less than 50K<br />
Limit the use <strong>of</strong> graphics bullets and lines<br />
Ensure good contrast between text & background colour or images<br />
Plan for easy Navigation:<br />
Give each page an appropriate title<br />
For long documents, provide return to Top or Homepage links<br />
For large sites, provide a search engine or index pages<br />
Indicate the date <strong>of</strong> last update <strong>of</strong> the site<br />
Avoid use <strong>of</strong> frames<br />
Provide guided tours in appropriate situations<br />
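The 50K guideline for graphics is easier to appreciate with a little arithmetic. The Python sketch below adds up the sizes of the pictures on a page and estimates how long they take to arrive over a dial-up modem. The file names, their sizes and the 28.8 kbps modem speed are assumptions chosen for illustration, and protocol overhead is ignored.

```python
def download_seconds(size_bytes, modem_bps):
    """Ideal transfer time: 8 bits per byte, no protocol overhead."""
    return size_bytes * 8 / modem_bps


# Hypothetical pictures on one page, sizes in bytes.
images = {"logo.gif": 12_000, "banner.gif": 20_000, "photo.jpg": 15_000}

BUDGET_BYTES = 50 * 1024      # the 50K limit from the guidelines
MODEM_BPS = 28_800            # assumed visitor modem speed

total_bytes = sum(images.values())
within_budget = total_bytes <= BUDGET_BYTES
seconds = download_seconds(total_bytes, MODEM_BPS)

print(total_bytes, "bytes of graphics; within budget:", within_budget)
print("about", round(seconds, 1), "seconds at", MODEM_BPS, "bits per second")
```

Even a page that stays inside the limit keeps a modem user waiting around 13 seconds for the pictures alone, which is why large images used only for visual appeal should be limited.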
Web design is more of an art than programming. A well designed site can be the best<br />
medium an organisation can think of to promote its objectives.<br />
About the author<br />
Bikash Panda is a BE (Electronics), MBA (Systems) and has Web development experience of<br />
more than 3 years in India & abroad.<br />
He can be contacted at HIG-188, Kanan Vihar, Bhubaneswar-751031, Tel. 91-674-440702,<br />
Email : bikash@mailcity.com
MULTIMEDIA - a magic mantra<br />
Jayaram Parida<br />
(MCS, Multimedia & Web Developer)<br />
NAVAGUNJAR<br />
Multimedia and Web Technology Lab<br />
9, Sweet Housing Complex<br />
Ganganagar, Bhubaneswar - 751 006<br />
Multimedia is a much used, over-used and abused term. Since the early 1990s<br />
multimedia has been hyped as a major revolution in computer technology and is hailed<br />
as part of "the next big thing". As with any bandwagon, there are many people looking<br />
at multimedia from different points of view. As we are considering multimedia from a<br />
Media Production viewpoint we need to define multimedia in terms that allow us to<br />
compare and contrast multimedia with other media products.<br />
As multimedia is so new there are not any clear conventions about what is and is not<br />
multimedia but as a starting point we will work with the following definition:<br />
Multimedia is really an adjective, not a noun! You can't really talk about multimedia full<br />
stop. You have to talk about a "multimedia something". We are talking about multimedia<br />
products. These are media products with the following characteristics:<br />
They are delivered digitally. This usually means that some kind of computer is<br />
required to use the product. This may not be a conventional looking desktop computer<br />
(although it can be). It could be a Sega or Nintendo games console. It could be a set-top<br />
decoder box or a CD player. It might be a hand-held personal organiser or a mobile<br />
phone. The key that distinguishes digital technologies from the rest (analogue) is that<br />
large amounts of information can be stored, searched, displayed and manipulated with<br />
ease. Digital technology also makes it easier to allow the consumer to enter their own<br />
information and make their own choices: interactivity.<br />
They use a range of audio-visual forms. Traditionally, information delivered via a<br />
computer has been text-based with perhaps some basic graphics. Multimedia products<br />
are based on the assumption that it is best to use the form most appropriate to the<br />
content. As computer technology has improved, it has become possible to display high<br />
quality still images, video and animation in addition to text and graphics. Whilst using<br />
these visual media it is also possible to play high quality sound: music, voice-overs,<br />
sound effects etc. This allows the product designer to provide a much richer<br />
environment for the consumer. It is argued that this enhances their experience.<br />
They are interactive. Many traditional media forms are passive. The consumer can't<br />
decide what stories appear in a newspaper. They can't directly influence the narrative of
a TV drama. They can't respond immediately to a radio advert. Interactivity allows the<br />
consumer to influence the material that is being presented to them - to interact with it.<br />
The nature and amount of interaction varies tremendously. For example, a Nintendo<br />
games console is highly interactive: the whole experience hinges on the user's<br />
manipulation of the controls. Home shopping may be less frantically interactive but still<br />
allows the consumer to respond directly to the content that is being displayed.<br />
Introduction- Still Images<br />
In this first main practical topic you will look at how the most basic elements of any<br />
multimedia product are constructed. The term "Still Images" covers a wide range of<br />
different parts of a multimedia production. It refers to any static graphics, photographs,<br />
design devices and sometimes even text. Sometimes you will start a screen from<br />
scratch on the computer but there is often a need to capture existing graphic material<br />
such as a photograph or a logo into the computer so that you can work on it before<br />
including it in the final product.<br />
Capturing Still Images<br />
Capturing a still image means taking an existing image and transferring it into the<br />
computer so that it can be stored and used in digital form. The method you use depends<br />
on the form the existing image takes before you start. Of course the image may not<br />
exist at all, so you will need to do some photography first. If this is the case then consider<br />
using a digital camera. This will cut out an intermediate stage. If you want high quality<br />
images from scratch then you can take conventional photographs and have them<br />
transferred onto a Kodak Photo CD which can then be read by the computer. It is often<br />
the case however that you already have the image as a photograph or on a printed<br />
page. In this situation you use a flat-bed scanner directly connected to a computer to<br />
capture the image.<br />
Scanning<br />
The flat-bed scanner is used to capture existing still images that are in a form that will fit<br />
flat against the glass plate. This usually means paper but it doesn't have to be: you can<br />
scan fabrics, leaves, silver foil etc. as a means of generating textures. There is a<br />
scanner in all the computer suites that you use. Using the scanner is fairly<br />
straightforward but like anything in multimedia it needs to be done carefully, following these<br />
instructions exactly.<br />
Place your original artwork under the scanner cover, face down. Align the corner of the<br />
picture with the corner of the glass indicated by an arrow. This usually means putting<br />
the picture in upside down. Launch the application Adobe Photoshop. This program is<br />
probably in a folder called Applications but it could be anywhere on the disk. If you can't<br />
find it use "Find File" from the Finder File menu. The application icon is shown here.
Photoshop is a popular, powerful program for creating and manipulating still images.<br />
You access the scanner by pulling down the File menu and holding the mouse down on<br />
Acquire. This displays a sub-menu that shows the name of the scanner software. This<br />
will vary depending on the make of scanner but is usually obvious.<br />
The scanner may have settings for adjusting parameters such as brightness and<br />
contrast. As a general principle, leave all these settings at their defaults. Scan the<br />
image first and then do all the correction afterwards in Photoshop. Photoshop gives far<br />
greater control over the image and if things go wrong you can always revert to the<br />
original scan and try again. Click the preview button. The scanner will quickly scan the<br />
original at low resolution, showing you a thumbnail view of the whole image. You will<br />
often want to scan only part of the image so use the mouse to click and drag a<br />
rectangle over the area of the image you want to scan. Click the scan button. The<br />
scanner will scan the part of the image you have selected and then open a Photoshop<br />
window containing the scanned image. You can then modify it and/or save it as you<br />
wish.<br />
Digital Camera<br />
Digital cameras are useful when the image you want doesn't exist. You can go out and<br />
shoot images and then transfer them directly to the computer without going through the<br />
traditional route of developing, printing and then scanning. The disadvantage of using a<br />
digital camera (or at least a cheap digital camera) is the image quality. The quality is<br />
much lower than conventional photography.<br />
Comparison of techniques<br />
All three of the ways of capturing images discussed above have their advantages and<br />
disadvantages. In deciding which to use you should be aware of these:<br />
Scanning gives reasonable quality and is fairly quick provided the image exists in a<br />
form that can be put under a flatbed scanner.<br />
Digital cameras are quick and easy to use when you need to originate the image but<br />
the quality is only average and they are expensive.<br />
PhotoCD gives excellent quality and you don't have to bother with scanning but you<br />
have to wait for it to be processed and it can be expensive.<br />
This shows that there is no right or wrong way to capture images: you have to choose<br />
the best tool for the job.
Capturing Sound<br />
In the same way that you often start screens with a scanned image you will often need<br />
to start a soundtrack by capturing and storing some existing music or sound effects on<br />
disk so that you can incorporate them in your production at the authoring stage.<br />
Existing sound recordings can exist in a number of forms. The way you capture these into<br />
the computer varies according to the form the track takes. The easiest audio to capture<br />
is from conventional audio CDs. However, if your track is on audio cassette tape then<br />
you can still capture it quite easily. This will usually be the case if you have recorded<br />
your own track with voiceovers/commentary etc. Occasionally it may be necessary to<br />
capture the audio track of a video tape. This uses the same technique as required for<br />
audio tape so it is not covered in detail here. Once the audio track has been captured it<br />
can be edited to meet the requirements of your multimedia package. The resulting track<br />
can then be superimposed onto the visual material at the authoring stage.<br />
Capturing from Audio CD<br />
If you need to capture a track from an audio CD then here's the procedure:<br />
1. Load the CD into the CD drive of the computer. An icon representing the CD will<br />
appear on the desktop. Don't bother double-clicking it: that isn't the way in!<br />
2. Locate and launch the application SoundEdit 16. This is a general purpose<br />
sound capture and editing program. It is to sound what Photoshop is to images.<br />
Capturing Video<br />
Capturing video is a somewhat tedious process on the desktop computer. A video<br />
capture card is somewhat costlier than a sound card. Capturing a long video<br />
file also takes more space. For example, if we want to store 10 minutes of video data, it<br />
requires 100-200 MB of disk space.<br />
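The storage figures above translate directly into a sustained data rate that the capture card and hard disk must keep up with. The short Python calculation below works this out; it assumes 1 MB = 1024 KB, and the function name rate_kb_per_second is made up for this example.

```python
def rate_kb_per_second(size_mb, minutes):
    """Average data rate needed to fill size_mb in the given running time."""
    return size_mb * 1024 / (minutes * 60)


MINUTES = 10
low = rate_kb_per_second(100, MINUTES)    # 100 MB for 10 minutes
high = rate_kb_per_second(200, MINUTES)   # 200 MB for 10 minutes

print("10 minutes in 100 MB needs about", round(low, 1), "KB per second")
print("10 minutes in 200 MB needs about", round(high, 1), "KB per second")
```

So the quoted 100-200 MB for ten minutes corresponds to roughly 170 to 340 KB of video data arriving every second for the whole length of the clip.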
Some good video capture cards are Miro DC-30, Bravado 2000 and Truevision Targa Pro,<br />
and some low-end capture cards are Video Blaster etc.<br />
To capture and edit full-frame, full-motion video on the computer we require more<br />
video RAM and also more main memory, at least 32-64 MB (SDRAM). Adobe Premiere 5 is<br />
good software for non-linear editing and special effects. Many other software packages<br />
and editing systems are also available for broadcast quality production, such as<br />
SGI and AVID systems.
Fine! That is a separate topic which needs more space than we can spare, so we should now<br />
move on to combine all the text, pictures, sound and video to produce a complete<br />
CD-ROM.<br />
PREPARING SCREENS FOR INTERACTIVITY<br />
Introduction to Creating a Screen<br />
Once you have acquired all the images that you need you can then build them into a<br />
screen which can be then combined with other screens in an authoring package to<br />
produce the finished product. You will always use Adobe Photoshop to do this job.<br />
Photoshop is an extensive package that can be used for many other tasks as well.<br />
Rather than give you a general introduction to Photoshop this section allows you to<br />
work through the construction of an example screen. This is the quickest way to get<br />
results but you should take time to explore Photoshop and find out what else it can do.<br />
Having prepared our Text, Image, Audio, video we are now ready to import them into<br />
Macromedia Director in order to make our piece <strong>of</strong> interactive multimedia.<br />
Macromedia Director<br />
Director is an application which uses the metaphor of a film studio: there is a STAGE<br />
on which all the action comes together, a CAST, a SCORE which allows you to<br />
orchestrate objects through time and a CONTROL PANEL which controls the action.<br />
There are also more computer-like tools for creating text, images, and other objects on<br />
the stage. Each feature is represented by a Window and each window can be open at<br />
the same time so you can work easily (provided you have a big enough screen)<br />
between the features.<br />
As the director of your own Movie (as the finished file format is called) you can<br />
orchestrate a number of already created objects (cast members) around the Stage.<br />
These objects could be Photoshop files, QuickTime movies, sound files or text files.<br />
You can layer these objects up in the Score so that they can play one in front of the<br />
other on the Stage.<br />
It is the interactivity in Director that makes it really powerful: you can programme the<br />
Score and individual Cast members and so control their behaviour by using Scripts<br />
written in Director's own programming language Lingo. Transparent interactive areas<br />
(buttons).
Multimedia Authoring Contents<br />
Importing the Cast<br />
The first stage is to import the prepared PICT screens. Select Import from the File menu.<br />
The dialogue box allows you to select more than one file at a time. You can choose to<br />
import the bitmap at its original colour depth or at the stage colour depth. You also have<br />
the choice of importing text, audio and video into Director.<br />
The files will all appear in the Cast window.<br />
Creating the Score<br />
The score is the most complicated part of Director. It consists of an ever expanding<br />
window that shows you channels horizontally and frames vertically.<br />
At the top left of the score there are control channels that let you adjust timing, create<br />
colour changes, insert transition effects and add sounds. You access these features<br />
by double-clicking in any frame in that channel.<br />
The best way of placing the cast members on to the score is to select them in the Cast<br />
window (by Shift-clicking or choosing Select All from the Edit menu).<br />
Adding Interactivity<br />
The next stage is to add buttons to the screens by putting an invisible box around each<br />
of the buttons we created on the Menu screen in Frame 1. For this we need to select<br />
the Tool palette from the Window menu. Choose the empty rectangle and ensure that<br />
the no line option is clicked.<br />
They also appear as new cast members in the Cast window (as do the scripts). Double<br />
click on the button in the frame and the Cast Member Properties window will appear.<br />
Click on Script and type: go to frame 10. Do the same for the other buttons.<br />
The next stage is to put invisible rectangles over the return to menu buttons in each of<br />
the other screens and write the script "go to frame 1".<br />
Now your Director movie should be interactive.
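The navigation that these button scripts produce can be modelled outside Director. The sketch below is written in Python, not Lingo, and only imitates the jumps made by "go to frame" scripts. Frame 1 (the menu) and frame 10 come from the text above; the button names and the second target frame are made up for illustration.

```python
MENU_FRAME = 1

# Frame 10 is taken from the text; the second entry is a hypothetical example.
BUTTON_TARGETS = {"first_topic": 10, "second_topic": 20}


def click(current_frame, button):
    """Return the frame the movie jumps to after a button is clicked."""
    if current_frame == MENU_FRAME and button in BUTTON_TARGETS:
        return BUTTON_TARGETS[button]        # like the script: go to frame 10
    if current_frame != MENU_FRAME and button == "return_to_menu":
        return MENU_FRAME                    # like the script: go to frame 1
    return current_frame                     # a click elsewhere changes nothing


frame = MENU_FRAME
visited = []
for button in ["first_topic", "return_to_menu", "second_topic"]:
    frame = click(frame, button)
    visited.append(frame)

print(visited)
```

Each invisible rectangle in the movie plays the role of one entry in this table: it carries a single script that names the frame to jump to.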
Making a Projector<br />
At the moment the movie can only be played using Director. It is possible however to<br />
turn it into a Projector - a self-contained program which can be played without Director<br />
even being on the computer.<br />
NAVAGUNJAR<br />
Multimedia and Web Technology Lab,<br />
9, Sweet Housing Complex,<br />
Ganganagar,<br />
Bhubaneswar - 751 006<br />
Tel : 91-674-425310,427514<br />
Email : jayaramp@yahoo.com
MULTIMEDIA - on the Web<br />
Jayaram Parida<br />
(MCS, Multimedia & Web Developer)<br />
NAVAGUNJAR<br />
Multimedia and Web Technology Lab<br />
9, Sweet Housing Complex<br />
Ganganagar, Bhubaneswar - 751 006<br />
Multimedia is a technology that is used everywhere to make<br />
things more attractive and more interactive. Web technology was dry without<br />
multimedia in the 90s. The technology then advanced by putting graphics on web pages,<br />
and later came animation. Finally we now have the revolution of RealAudio and<br />
RealVideo, which play a great role on the web but have yet to advance before they are<br />
realistic for standard systems and real applications. Here is a detailed overview of putting<br />
animation, streaming audio and streaming video on the web for your web page<br />
design.<br />
Getting into Motion - a Guide for Adding Animation to Your Web Pages<br />
As a frequent Web traveler, you've probably encountered a number of pages<br />
that contain various animated objects--from bouncing logos to ads for speeding cars<br />
and bubbling soft drinks. It used to be that a striking background image or a fancy rule<br />
line was all that differentiated the average Web page from one that was really cool.<br />
That, however, has all changed with the advent of animated GIFs, Java applets, and<br />
Web browsers that make it easy to host these new elements. If you're thinking that<br />
you'll have to learn a new programming language, you can breathe a sigh of relief.<br />
Although we'll explore animation techniques that rely on Java, there are several ways<br />
you can spice up your pages without having to perform any programming.<br />
GIF Construction Set<br />
On the PC, the most popular program for creating animated GIFs is GIF Construction<br />
Set, from Alchemy Mindworks. This easy-to-use, inexpensive<br />
shareware package supports image looping, interlaced GIF images, and transparency.<br />
It also features an Animation Wizard that will guide you through the process of selecting<br />
and preparing an animation sequence.<br />
Two other notable features in Construction Set are the "banner" and "transition"<br />
tools. The banner tool allows you to type in a text message, which is then turned into a<br />
scrolling GIF image. The transition tool lets you select an image and then apply one <strong>of</strong><br />
several special effects to create one that's animated. The release I tested supported<br />
four types <strong>of</strong> wipes, several splits, tiling, and an interlaced effect.
GifBuilder for Macintosh<br />
Macintosh users will find an equally powerful tool in Yves Piguet's freeware<br />
application GifBuilder. This program even surpasses some of the<br />
capabilities found in Construction Set by supporting a built-in scripting language that<br />
offers you total control over the creation and sequencing of images. If you want to see<br />
some examples of work done by other people and technical information on the GIF89a<br />
format, look up the GIF<br />
Animation Gallery on the Web.<br />
Java Gyrations<br />
Since Java is a programming language, you can have enormous control over<br />
the way animation sequences are performed--provided you do the programming.<br />
Applets, which are Java programs meant to be run from inside a Java-enabled browser<br />
(such as Netscape or Internet Explorer), allow you to do virtually anything with images.<br />
Java also includes built-in classes for manipulating GIF and JPEG images. But writing<br />
code to do really cool things is difficult--in any language. So why not use some pre-built,<br />
off-the-shelf Java classes for animation?<br />
Which Way Do We Go?<br />
The question of whether to use GIF images or Java applets for your animation<br />
depends on what you want to do. If you want to use both GIF and JPEG images, tie in<br />
sound, support navigational control, and can rely on your users to have a Java-enabled<br />
browser (which will be practically everyone very soon), then Java is a great way to go.<br />
Applets like Animator and ClickBoard offer ready-to-use solutions that don't require any<br />
programming. All you do is create the artwork, store some Java class files on your Web<br />
server, and add an APPLET tag in your HTML file.<br />
The downside to using Java applets, compared to GIF89a images, is the<br />
additional download time. The two Java applets we've described are each<br />
approximately 20 KB in size. Plus, they both use separate image files for each frame. If<br />
you had an animation sequence that required 10 images, that would mean 10 separate<br />
GETs your Java applet would be performing back to a Web server. Animated GIF<br />
images, on the other hand, are completely self-contained, with no extra code to<br />
download.<br />
What makes Enhanced CU-SeeMe great for Webmasters is that you can add a<br />
few lines of HTML to your page and point people to reflector software residing on your<br />
server, so that they only have to click on a link to start up their own CU-SeeMe software<br />
and join your conference automatically. The White Pine Reflector software, needed to<br />
run conferences with more than two people, is currently available on 11 Unix platforms,<br />
as well as for Windows 95 and Windows NT.
For simple animations intended for Netscape 2.0 or later and Internet Explorer<br />
3.0, consider going the GIF route. Both GIF Construction Set and GifBuilder are<br />
capable tools. For animation purposes, ActiveX components are, for now, a relative<br />
unknown. They have the potential to do almost anything a Java applet can do, but<br />
faster. Some of the early ActiveX animation controls, such as FutureWave's<br />
FutureSplash, are very impressive. Expect your choices in this arena to mushroom. The<br />
hardest part is preparing artwork that strikes a balance between appearance and<br />
compactness. On the Web, the name of the game, besides looking good, is loading<br />
fast.<br />
Produce Streaming Audio that Satisfies<br />
After a somewhat slow start, Web sites that are capable of delivering relatively<br />
low-bandwidth audio content are appearing with greater frequency, most likely in<br />
response to the increasing number of multimedia-capable PCs hooking into the<br />
Internet. The current offerings from some of the major suppliers of Internet audio<br />
software now include the ability to stream live audio across the Net, typically through<br />
14.4 Kbps and 28.8 Kbps modems, which in turn has fueled the growth of Web "radio"<br />
programming and other real-time content.<br />
There are a number of different approaches taken for Internet-based audio<br />
delivery. Server-based audio solutions are currently the only way to stream live audio<br />
on the Internet. Most people will find the installation of a server to be the least<br />
complicated component of delivering audio. The server install is somewhat similar to<br />
setting up an httpd server, using a stand-alone daemon and a configuration file that is<br />
read on initialization, which specifies the root location of the encoded audio files. In this<br />
column, we are going to focus on the process of encoding audio and delivering it from<br />
your Web site, using the RealAudio 2.0 server and audio tools as an example, which I<br />
recently tested for use on the W Q Web Connection.<br />
Preprocess Before Encoding<br />
When using pre-existing source it is not uncommon to find digital audio files<br />
that are hundreds of megabytes or more in size. Be sure that you have sufficient hard<br />
disk capacity for both the source and final encoded audio content. Given the relatively<br />
low cost of hard drives, it is wise to consider a minimum of a gigabyte capacity to<br />
process your content with, if you are entertaining thoughts of hour-long audio files. If<br />
you are planning to archive your source material, a tape backup is essential.<br />
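As a rough check on the disk space advice above, the size of an uncompressed source file can be estimated from its duration. The sketch below assumes CD-quality PCM (44.1 kHz, 16-bit, stereo); the text does not say what format the source material is in, so these defaults are only an illustration.

```python
def pcm_size_bytes(seconds, sample_rate_hz=44_100, bits_per_sample=16, channels=2):
    """Size in bytes of uncompressed PCM audio of the given duration.

    The default sample rate, sample size and channel count are assumed
    CD-quality values, not figures taken from the text.
    """
    return seconds * sample_rate_hz * (bits_per_sample // 8) * channels


one_hour = pcm_size_bytes(3600)
print(f"One hour of source audio: {one_hour / 1_000_000:.0f} MB")
```

Under these assumptions an hour of source audio comes to roughly 635 MB, which is consistent with setting aside about a gigabyte once the encoded output is stored alongside it.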
Encoding Audio<br />
Once you have finished preprocessing, the encoding process itself is easy.<br />
When using the RealAudio encoder, select the target bandwidth encoding that the
source should be processed with. RealAudio servers have the ability to negotiate<br />
content delivery based on the RealAudio Player's setting, and deliver either a 14.4 Kbps<br />
or 28.8 Kbps bandwidth selection. Accordingly, this also means that you have to<br />
encode each source twice if you plan to offer users the choice of negotiated content<br />
delivery. There are still quite a few users that surf the Web using 14.4 modems, but the<br />
audio quality of 28.8 is noticeably better and should be offered if at all possible.<br />
Producing usable audio can be a trying experience, particularly when you<br />
realize that the audio quality at best will be on par with a mono FM signal. That being<br />
said, properly prepared audio can add a high degree of quality to the experience<br />
someone has visiting your site. It takes time and patience to produce good audio<br />
content.<br />
Putting Video on Your Web Site:<br />
The Basics<br />
Video is a medium that is as direct as print and catches more attention. If your<br />
company has something to say with video, that video should be on your Web site. This<br />
year, exciting new plug-ins and helper apps for Netscape Navigator make it possible to<br />
integrate video into your Web page, making it more like a CD-ROM. Other helper apps<br />
make it possible to "stream" video. Streaming video is attractive to many, because even<br />
though it is much lower quality, there is hardly any wait for download.<br />
Although it's time-consuming, the process of digitizing, editing, and uploading<br />
your video files is not extremely complicated. The only thing that should<br />
scare you about the process is the bandwidth that you will be using (and the legal<br />
problems of posting clips that may not belong to you). Before you get serious about<br />
doing this, you should ask yourself: What is the value the video adds to the Web site?<br />
Does it justify the effort spent digitizing the video and making it ready for the Web? Will<br />
people who come to the Web site actually spend their time downloading it? At 28.8<br />
Kbps, a 1 MB file representing a few seconds of video will take at least 5 minutes to<br />
download (closer to 10 minutes at 14.4 Kbps). Spend a day or two surfing the Web looking for video files, and download as<br />
many as possible to get a good picture of how and why other people are using video on<br />
the Web.<br />
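The download-time question can be checked with a short calculation. This is a minimal sketch that assumes the modem runs at its full nominal line rate and that 1 MB means 1024 x 1024 bytes; real connections lose some throughput to protocol overhead and line noise, so actual times will be longer.

```python
ONE_MB = 1024 * 1024  # bytes; assumes the binary definition of a megabyte


def download_seconds(size_bytes, line_rate_bps):
    """Ideal transfer time: total bits divided by the nominal line rate."""
    return size_bytes * 8 / line_rate_bps


for rate_bps in (14_400, 28_800):
    minutes = download_seconds(ONE_MB, rate_bps) / 60
    print(f"{rate_bps / 1000:.1f} Kbps: about {minutes:.1f} minutes for 1 MB")
```

At the nominal rate a 1 MB clip needs a little under 5 minutes on a 28.8 Kbps modem and a little under 10 minutes on a 14.4 Kbps modem, which is why clip size matters so much to visitors.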
There are three main video file types that you will encounter on the Web:<br />
QuickTime, AVI, and MPEG. MPEG and QuickTime are most commonly found, with<br />
QuickTime probably being the most popular; many large entertainment sites<br />
use QuickTime exclusively.
AVI is a Windows-oriented video format that is not used as much as QuickTime<br />
or MPEG because of problems with syncing up audio and video. For this reason, AVI is<br />
the least popular of the three main file formats on the Web. Easy conversion from the<br />
other formats to AVI is available. Since QuickTime is readily available for Windows as<br />
well as the Macintosh, the need for AVI is rapidly vanishing from the Web.<br />
MPEG's main advantage over QuickTime is<br />
the extremely high output quality. MPEG was developed as an international standard<br />
for use in CD-ROMs, video games, and other media that require quality digital video.<br />
For the tradeoff of using slightly larger files, you get much higher-quality video, with up<br />
to 30 frames per second (the same as standard American TV).
Process Your Video<br />
The first step in the process is finding video to process. The higher the source<br />
quality, the better the results after you digitize it. So try to get source that is higher<br />
quality than VHS, possibly Hi8 or even Betacam. Hi8 is probably suitable for most Web<br />
projects. If you work in the entertainment industry, you no doubt have access to<br />
higher-quality equipment than Hi8.<br />
If you want to work in QuickTime, digitizing is not a problem. Many Macintosh<br />
systems come with built-in AV equipment that makes digitizing video as easy as<br />
plugging in a video source and having enough disk space. The extremely popular<br />
VideoVision board is a hardware solution<br />
for video capture.<br />
When capturing video for use only on the Web, consider the size of your movie.<br />
Unlike CD-ROM, you probably are not shooting for full-screen video with the best<br />
resolution possible from QuickTime. Instead you are trying to get a small, light image<br />
that looks good with compression. Using the plug-in to embed QuickTime in your Web<br />
page makes a great impact, but you have to plan ahead of time as to how large or<br />
small you want the movie to be. Choose standard sizes to capture video; for the Web<br />
the standard is a small 160x120 pixels.<br />
Sound Advice<br />
Sound is a very important element in video that has been sadly neglected by<br />
many people. Your best bet for achieving quality sound is to get an audio-editing<br />
software package, and treat the sound in your video as a separate element that needs<br />
special attention. Separate the audio from your video (in QuickTime the easiest way to<br />
do this is with MoviePlayer 2.1 and exporting the audio to AIFF). Listening to the audio<br />
separately with headphones (preferred) or decent speakers gives you a better idea of<br />
what people will hear. Whether or not people who download the video actually pay
special attention to the audio separately is not the issue; poor audio quality will affect<br />
their overall impression of the video quality.<br />
Tools like SoundEdit 16 allow you to<br />
remove the sound from QuickTime files and edit it like regular audio, adding filters and<br />
equalization that will be necessary to get powerful sound out of your video. Another<br />
important feature in the latest release of SoundEdit 16 is built-in IMA sound<br />
compression for QuickTime, which allows 4:1 compression of the audio track in movie<br />
files.<br />
The final process of getting your video digitized and ready for the Web is<br />
compression. For QuickTime there are several applications that just handle<br />
compression. The most popular compression is Cinepak, a<br />
cross-platform compression/decompression software package that has been used by<br />
many companies (including makers of PC audio and<br />
video equipment). Cinepak is the best compression method for most video needs,<br />
although using it can be time-consuming, and balancing image quality and compression<br />
can be tricky. On the audio side, the previously mentioned IMA supports 4:1 audio<br />
compression at 16 bits of resolution. This allows your audio to sound great while not<br />
becoming a burden in terms of bandwidth.<br />
Upload It!<br />
Once you have produced your video, getting it on the Web is an easy process. If<br />
you use an Internet Service Provider, find out how much disk space you are allowed to<br />
use. If you have several large video files to upload, you may be exceeding your disk<br />
quota. Most ISPs have a quota on bandwidth as well, and if your videos are popular,<br />
you may break this quota. A typical quota is transferring 200 to 300 MB a day. If you<br />
have a 2 MB movie file, it will take only 100 downloads a day to reach a 200 MB quota.<br />
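The quota arithmetic generalizes to any file size. The helper below is a hypothetical illustration (the function name is not from any ISP tool) that counts how many complete downloads of one file fit inside a daily transfer quota.

```python
def downloads_to_reach_quota(quota_mb, file_mb):
    """Number of complete downloads of one file that fit in the daily quota."""
    return quota_mb // file_mb


for quota_mb in (200, 300):
    count = downloads_to_reach_quota(quota_mb, 2)
    print(f"{quota_mb} MB/day quota: {count} downloads of a 2 MB file")
```

With the typical quotas mentioned above, a 2 MB movie uses up the allowance after 100 to 150 downloads in a day.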
After uploading the file, you'll have to create a link to it on your Web page.<br />
Pages with video commonly will have a JPEG screen shot <strong>of</strong> the video at the actual<br />
size (sometimes people will enlarge the image, but this fools people into thinking the<br />
video size is larger than it is). Next to the screen shot, tell the viewer what format the<br />
video is in, its length in minutes, and how much disk space it takes up. Leaving out this<br />
information will hurt your chances of people actually viewing the clips, as people don't<br />
want to download something they are not sure about. As a final check, download the<br />
file yourself, using several different viewing programs, to make sure it works with all of<br />
them from the Web.
Streaming Audio/Video<br />
"Streaming" audio and video over the Web has received lots of attention this<br />
past year. It started with RealAudio, which allowed streaming<br />
audio. The quality was AM or worse, but it allowed near-instant playback without waiting<br />
for a full download, and this caught a lot of people's ears. Shortly after RealAudio<br />
became popular, Xing Technology released a product that<br />
claims to deliver streaming video over even 14.4-Kbps modems. Over a faster<br />
connection, like a T1 line, I was able to get a large color image that was very out of<br />
sync with the audio, with audio quality that was about the same as that of RealAudio.<br />
This level of video quality would not be acceptable with content like sporting events and<br />
action films, but for a live event such as a press conference it is very suitable.<br />
The concept behind these streaming technologies is that complicated<br />
compression software is installed on the server side that encodes the video so that it is<br />
able to be sent to the client for real-time presentations in spite of severe bandwidth<br />
limitations. The client is expected to download helper apps that can read the<br />
compression type that the server software is sending. The helper apps are usually<br />
given away free to encourage a large user base. The server software is given out for<br />
trial periods and is usually pretty expensive for full versions.
WORLD WIDE WEB, THE INFORMATION STORE HOUSE<br />
Bijaya Kumar Panda*, Ashwini Kumar Nayak*,<br />
A. K. Roy** and P. K. Satapathy**<br />
*MCA Third Year Students of IGNOU (Utkal University Study Centre)<br />
**Computer Section<br />
Central Institute of Freshwater Aquaculture<br />
Kausalyaganga, Bhubaneswar 751002<br />
INTRODUCTION<br />
Traditionally, the Internet had four applications, as follows:<br />
E-mail: The ability to compose, send, and receive electronic mail has been around<br />
since the early days of ARPANET and is enormously popular.<br />
News: News groups are specialised forums in which users with the same interest can<br />
exchange messages. Thousands of news groups exist, on technical and<br />
nontechnical topics.<br />
Remote Login: Using telnet, rlogin or other programs, users anywhere on the Internet<br />
can log into any other machine on which they have an account.<br />
File transfer: Using FTP programs, it is possible to copy files from one machine on the<br />
Internet to another machine.<br />
Until the early 1990s the Internet was largely used by academic, government and<br />
industrial researchers. One new application, the World Wide Web (WWW), brought a<br />
revolution to the Internet and brought millions of new non-academic users to the net.<br />
WHAT IS THE WORLD WIDE WEB (WWW)<br />
The WWW is an architectural framework for accessing linked documents spread<br />
out over thousands of machines all over the Internet. It is a huge collection of<br />
interconnected hypertext documents. A hypertext document is a document that contains<br />
hot links to other documents. Hypertext links are usually visible as highlighted/underlined<br />
words in text, but they can also be graphics.<br />
BIRTH OF THE WORLD WIDE WEB<br />
The web began in 1989 at CERN, the European center for nuclear research.<br />
The initial proposal for a web of linked documents came from CERN physicist Tim<br />
Berners-Lee in March 1989. The first prototype was operational eighteen months later.<br />
In December 1991 a public demonstration was given at the Hypertext '91 conference in<br />
San Antonio, Texas. The first graphical interface, MOSAIC, was released in February<br />
1993.
WHAT IS A WEB PAGE<br />
As mentioned earlier, the web consists of a vast worldwide collection of<br />
documents. These documents are called Web pages or simply pages. Each page may<br />
contain links to other related pages anywhere in the world.<br />
In addition to ordinary text and hypertext, web pages also contain icons,<br />
line drawings, maps and photographs. Each of these can be linked to another page.<br />
Clicking on one of those elements causes the browser (the program that enables us to<br />
view pages) to fetch the linked page and display it. The steps that occur between the<br />
user's click and the page being displayed are as follows.<br />
1. The browser determines the URL (Uniform Resource Locator) by seeing what<br />
was selected.<br />
2. The browser asks the DNS for the IP address of the concerned server.<br />
3. DNS replies with the IP address.<br />
4. The browser makes a TCP connection to port 80 of the concerned server.<br />
5. It then sends a GET command for the file.<br />
6. The concerned server sends the required file.<br />
7. The TCP connection is released.<br />
8. The browser displays all the text in the file.<br />
9. The browser fetches and displays all images in the file.
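The sequence from DNS lookup to releasing the connection can be sketched with Python's standard socket module. This is a minimal illustration rather than how any particular browser is implemented: the function names are made up for this example, the request uses the HTTP/1.0 form with a Host header (an assumption, added because many servers expect it), and displaying text and images is left out.

```python
import socket


def build_get_request(host, path):
    """Text of the GET command for `path`, in HTTP/1.0 form."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")


def fetch(host, path="/", port=80, timeout=10.0):
    """Resolve the server, connect, send GET, read the reply, release the connection."""
    ip_address = socket.gethostbyname(host)            # ask DNS; it replies with the IP address
    with socket.create_connection((ip_address, port), timeout=timeout) as conn:  # TCP connection to port 80
        conn.sendall(build_get_request(host, path))    # send the GET command
        chunks = []
        while True:                                     # the server sends the required file
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)                             # leaving the `with` block releases the connection


def split_response(raw):
    """Separate the response headers from the page body that would be displayed."""
    head, _, body = raw.partition(b"\r\n\r\n")
    return head.decode("iso-8859-1"), body
```

Calling `fetch("www.example.org", "/index.html")` (a placeholder host) needs a live network connection; `split_response` can then be applied to the returned bytes to get at the page text.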
WHAT IS A HOME PAGE<br />
For a user, the home page is the starting point for exploring a single site on the<br />
whole WWW. It can be thought of as a kind of "Main Menu". A homepage outlines your<br />
options, at least for moving along the links from this site to other points of interest, as<br />
imagined by the publisher of the site. To whoever publishes it, the homepage is<br />
part advertisement, part directory and part "reference librarian".<br />
Just to clarify things a bit, a website may be a single page or a collection of<br />
pages. The main page among a number of pages is the homepage. A web server is<br />
the machine and software that house the web site. In reality, a home page is a<br />
hypertext document that has links to other points on the web.<br />
The web is based on two standards: the HTTP protocol and the HTML language.<br />
HTTP stands for Hypertext Transfer Protocol and it describes the way that hypertext<br />
documents are fetched over the Internet. The HTTP protocol consists of two fairly distinct<br />
items: the set of requests from browsers to servers and the set of responses going<br />
back the other way. All newer versions of HTTP support two kinds of requests: simple
requests and full requests. A simple request is just a single GET line naming the desired<br />
page, without the protocol version. The response is the raw page without any headers,<br />
no MIME and no encoding. HTTP was designed with an eye to future<br />
object-oriented applications. HTML is the abbreviation for Hyper Text Markup Language and<br />
it specifies the layout and linking commands present in the hypertext documents<br />
themselves.<br />
HOW TO WRITE A WEB PAGE IN HTML<br />
In HTML a user can produce web pages that include text, graphics and pointers<br />
to other web pages. Web pages require mechanisms for naming and locating<br />
pages. Each page is assigned a URL that effectively serves as the page's worldwide name.<br />
Ex:<br />
http://.../about/history.html<br />
From left to right, a URL consists of the protocol (http), the server address, the<br />
port no. and the directory and file name (/about/history.html).<br />
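The parts of a URL named above (protocol, server address, port no., directory and file name) can be pulled apart with Python's standard `urllib.parse` module. The address `www.example.org` and port 80 used in the usage note are placeholders chosen for illustration, since the server address in the original example is not legible.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Split a URL into the four parts described in the text."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "server address": parts.hostname,
        "port no": parts.port,              # None when the URL does not give a port
        "directory and file name": parts.path,
    }


for name, value in url_parts("http://www.example.org:80/about/history.html").items():
    print(f"{name}: {value}")
```

When the port is left out of an http URL, browsers fall back to port 80, which is why most addresses do not show it.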
A proper web page consists of a head and a body enclosed by<br />
<HTML> ... </HTML> tags. The commands inside the tags are called directives. HTML<br />
tags have the following format:<br />
<something> marks the beginning and </something> marks the end of it.<br />
Some popular tags are given below:<br />
TAG - DESCRIPTION<br />
<HTML> ... </HTML> - Declares the web page to be written in HTML<br />
<HEAD> ... </HEAD> - Delimits the page's head<br />
<TITLE> ... </TITLE> - Defines the title<br />
<BODY> ... </BODY> - Delimits the page's body<br />
<Hn> ... </Hn> - Delimits a level n header, n = 1..6<br />
<B> ... </B> - Sets ... in bold face<br />
<I> ... </I> - Sets ... in italics<br />
<UL> ... </UL> - Brackets an unordered list<br />
<OL> ... </OL> - Brackets a numbered list<br />
<MENU> ... </MENU> - Brackets a menu of <LI> items<br />
<LI> - Starts a list item<br />
<BR> - Forces a break<br />
<P> - Starts a paragraph<br />
<HR> - Horizontal rule<br />
<PRE> ... </PRE> - Preformatted text; do not reformat<br />
<IMG SRC="..."> - Loads an image<br />
<A HREF="..."> ... </A> - Defines a hyperlink
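To show how these tags fit together, the sketch below holds a small page in a Python string and uses the standard `html.parser` module to list the tags and links it contains. The page text and the file name `history.html` are invented for illustration.

```python
from html.parser import HTMLParser

# A hypothetical minimal page built from the tags listed above.
PAGE = """<HTML>
<HEAD><TITLE>Sample page</TITLE></HEAD>
<BODY>
<H1>Fish diseases</H1>
<P>This page shows <B>bold</B> and <I>italic</I> text.
<UL>
<LI>First item
<LI>Second item
</UL>
<HR>
<A HREF="history.html">A link to another page</A>
</BODY>
</HTML>
"""


class TagCollector(HTMLParser):
    """Records every opening tag and every hyperlink target in a page."""

    def __init__(self):
        super().__init__()
        self.opened = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        self.opened.append(tag)  # the parser reports tag names in lower case
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href")


collector = TagCollector()
collector.feed(PAGE)
print(collector.opened)
print(collector.links)
```

Saving the `PAGE` string to a file with an .html extension and opening it in a browser shows how each tag is rendered.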
Include thumbnails for large downloadable images<br />
Remember that people will access your page using different browsers and different<br />
platforms<br />
Keep file names short and consistent<br />
Tell people the size of downloadable files if you include them<br />
Find out if you need permission to use text or images created by someone else<br />
Establish who is going to be the webmaster and put a link to the webmaster on<br />
your page<br />
Build a prototype and test it thoroughly<br />
Announce and publicize your page where possible
Designing and Planning Your Database<br />
In designing a database you plan what tables you require and what data they will contain.<br />
You also determine how the tables are related.<br />
You must determine what things you want to store information about (each one is an entity)<br />
and how these things are related (by a relationship). A useful technique in designing your<br />
database is to draw a picture of your tables. This graphical display of a database is called<br />
an Entity-Relationship (E-R) diagram. Usually, each box in an E-R diagram corresponds to a<br />
table in a relational database, and each line from the diagram corresponds to a foreign key.<br />
Entity<br />
Each table in the database describes an entity; it is the database equivalent of a noun.<br />
Employees, order items, departments and products are all examples of entities represented<br />
by a table in a database. The entities that you build into your database arise from the<br />
activities for which you will be using the database, whether that be tracking sales calls,<br />
maintaining employee information, or some other activity.<br />
Relationship<br />
A relationship between entities is the database equivalent of a verb. An employee is<br />
associated with a department, or an office is located in a city. Relationships in a database<br />
may appear as foreign key relationships between tables, or may appear as separate tables<br />
themselves. The relationships in the database are an encoding of rules or practices<br />
governing the data in the table. If each department has one department head, then a single<br />
column can be built into the department table to hold the name of the department head.<br />
When these rules are built into the structure of the database, there is no provision for<br />
exceptions: there is nowhere to put a second department head, and duplicating the<br />
department entry would involve duplicating the department ID, which is the primary key.<br />
Relationships between tables<br />
There are three kinds of relationship between tables:<br />
One-to-many relationships<br />
One-to-one relationships<br />
Many-to-many relationships<br />
There are five major steps in the design process.<br />
Step 1: identify entities and relationships<br />
Step 2: identify the required data<br />
Step 3: normalize the data<br />
Step 4: resolve the relationships<br />
Step 5: verify the design
Identify entities and relationships<br />
To identify the entities in your design and their relationship to each other:<br />
1. Define high-level activities. Identify the general activities you will use this database for.<br />
For example, you may want to keep track of information about employees.<br />
2. Identify entities. For the list of activities, identify the subject areas you need to maintain<br />
information about. These will become tables. For example, hire employees, assign to a<br />
department, and determine a skill level.<br />
3. Identify relationships. Look at the activities and determine what the relationships will be<br />
between the tables. For example, there is a relationship between departments and<br />
employees. We give this relationship a name.<br />
4. Break down the activities. You started out with high-level activities. Now examine these<br />
activities more carefully to see if some of them can be broken down into lower-level<br />
activities. For example, a high-level activity such as maintain employee information can be<br />
broken down into:<br />
1. Add new employees<br />
2. Change existing employee information<br />
3. Delete terminated employees<br />
To identify the required data:<br />
1. Identify supporting data. List all the data you will need to keep track of. The data that<br />
describes the table (subject) answers the questions who, what, where, when, and why.<br />
2. Set up data for each table. List the available data for each table as it seems<br />
appropriate right now.<br />
3. Set up data for each relationship. List the data that applies to each relationship (if any).<br />
Normalize the data<br />
Normalization is a series of tests you use to eliminate redundancy in the data and make<br />
sure the data is associated with the correct table or relationship.<br />
To normalize the data:<br />
1. List the data:<br />
a. Identify at least one key for each table. Each table must have a primary key.<br />
b. Identify keys for relationships. The keys for a relationship are the keys from the two tables<br />
it joins.<br />
c. Check for calculated data in your supporting data list. Calculated data is not normally<br />
stored in the database.<br />
2. Put data in first normal form:<br />
a. Remove repeating data from tables and relationships.<br />
b. Create one or more tables and relationships with the data you remove.<br />
3. Put data in second normal form:<br />
a. Identify tables and relationships with more than one key.<br />
b. Remove data that depends on only one part of the key.<br />
c. Create one or more tables and relationships with the data you remove.<br />
4. Put data in third normal form:<br />
a. Remove data that depends on other data in the table or relationship and not on the key.<br />
b. Create one or more tables and relationships with the data you remove.<br />
Putting data in first normal form<br />
Remove repeating groups.<br />
To test for first normal form, remove repeating groups and put them into a table of their own.
Putting data in second normal form<br />
Remove data that does not depend on the whole key.<br />
Look only at tables and relationships that have more than one key. To test for second<br />
normal form, remove any data that does not depend on the whole key (all the columns that<br />
make up the key).<br />
Putting data in third normal form<br />
Remove data that doesn't depend directly on the key.<br />
To test for third normal form, remove any data that depends on other data rather than<br />
directly on the key.<br />
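The three tests can be walked through on a small example. The records below are hypothetical (the names, IDs and skills are invented) and plain Python data structures stand in for database tables; the point is only to show which data moves where at each normal form.

```python
# Hypothetical unnormalized records: each employee row carries a repeating
# group of skills, plus a department name that depends on dept_id.
raw = [
    {"emp_id": 1, "name": "Asha", "dept_id": 10, "dept_name": "Hatchery",
     "skills": [(100, "Water analysis"), (101, "Data entry")]},
    {"emp_id": 2, "name": "Ravi", "dept_id": 10, "dept_name": "Hatchery",
     "skills": [(101, "Data entry")]},
]

# First normal form: move the repeating group into rows of its own.
employee_skill_1nf = [
    (record["emp_id"], skill_id, skill_name)
    for record in raw
    for skill_id, skill_name in record["skills"]
]

# Second normal form: the key of that table is (emp_id, skill_id), but
# skill_name depends on skill_id alone, so it gets a table of its own.
skill = {skill_id: skill_name for _, skill_id, skill_name in employee_skill_1nf}
employee_skill = sorted({(emp_id, skill_id) for emp_id, skill_id, _ in employee_skill_1nf})

# Third normal form: dept_name depends on dept_id (another column) rather
# than directly on emp_id, so departments become a separate table.
department = {record["dept_id"]: record["dept_name"] for record in raw}
employee = {record["emp_id"]: (record["name"], record["dept_id"]) for record in raw}

print(employee)
print(department)
print(skill)
print(employee_skill)
```

After the three steps each fact is stored once: the department name appears only in `department`, and each skill name only in `skill`.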
Resolve the relationships<br />
When you finish the normalization process, your design is almost complete. All you need to<br />
do is resolve the relationships.<br />
Resolving relationships that carry data<br />
Some of your relationships may carry data. This situation often occurs in many-to-many<br />
relationships.<br />
When this is the case, change the relationship to a table. The key to the new table remains<br />
the same as it was for the relationship.<br />
Resolving relationships that do not carry data<br />
In order to implement relationships that do not carry data, you need to define foreign keys. A<br />
foreign key is a column or set of columns that contains primary key values from another<br />
table. The foreign key allows you to access data from more than one table at one time.<br />
There are some basic rules that help you decide where to put the keys:<br />
One to many: In a one-to-many relationship, the primary key in the one is carried in the<br />
many. In this example, the foreign key goes into the Employee table.
One to one: In a one-to-one relationship, the foreign key can go into either table. If it is<br />
mandatory on one side, but not on the other, it should go on the mandatory side. In this<br />
example, the foreign key (Head ID) is in the Department table because it is mandatory<br />
there.<br />
Many to many: In a many-to-many relationship, a new table is created with two foreign keys.<br />
The existing tables are now related to each other through this new table.<br />
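The three placement rules can be tried out with Python's built-in `sqlite3` module. This is only a sketch under assumptions: the table and column names (`employee`, `department`, `skill`, `employee_skill`, `level`) and the sample rows are invented here, and SQLite stands in for whatever database system is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is switched on
conn.executescript("""
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER REFERENCES department(dept_id)        -- one to many: key of the "one" carried in the "many"
);
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    head_id INTEGER NOT NULL REFERENCES employee(emp_id)  -- one to one: key sits on the mandatory side
);
CREATE TABLE skill (
    skill_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL
);
CREATE TABLE employee_skill (                             -- many to many: new table with two foreign keys
    emp_id   INTEGER NOT NULL REFERENCES employee(emp_id),
    skill_id INTEGER NOT NULL REFERENCES skill(skill_id),
    level    INTEGER,                                     -- data carried by the relationship
    PRIMARY KEY (emp_id, skill_id)
);
""")

conn.execute("INSERT INTO employee (emp_id, name) VALUES (1, 'Asha')")
conn.execute("INSERT INTO department VALUES (10, 'Hatchery', 1)")
conn.execute("UPDATE employee SET dept_id = 10 WHERE emp_id = 1")
conn.execute("INSERT INTO skill VALUES (100, 'Water analysis')")
conn.execute("INSERT INTO employee_skill VALUES (1, 100, 3)")

# The foreign keys let one query reach data from several tables at once.
rows = conn.execute("""
    SELECT e.name, d.name, s.name, es.level
    FROM employee e
    JOIN department d      ON d.dept_id = e.dept_id
    JOIN employee_skill es ON es.emp_id = e.emp_id
    JOIN skill s           ON s.skill_id = es.skill_id
""").fetchall()
print(rows)

# A row that points at a skill that does not exist is rejected.
try:
    conn.execute("INSERT INTO employee_skill VALUES (1, 999, 1)")
    violation_caught = False
except sqlite3.IntegrityError:
    violation_caught = True
print(violation_caught)
```

Because the employee and department tables refer to each other, the employee row is inserted first without a department and updated once the department exists.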
Choosing primary and foreign keys<br />
The primary key is the column or columns that uniquely identify the rows in the table. If your<br />
tables are properly normalized, a primary key should be defined as part of the database<br />
design.<br />
A foreign key is a column or set of columns that contains primary key values from another<br />
table. Foreign key relationships build one-to-one and one-to-many relationships into your<br />
database. If your design is properly normalized, foreign keys should be defined as part of<br />
your database design.<br />
Verify the design<br />
Before you implement your design, you need to make sure it supports your needs. Examine<br />
the activities you identified at the start of the design process and make sure you can access<br />
all the data the activities require:<br />
Can you find a path to get all the information you need?<br />
Does the design meet your needs?<br />
Is all the required data available?<br />
If you can answer yes to all the questions above, you are ready to implement your design.
DATABASE ON FISH DISEASES<br />
B. B. Sahu, A. K. Roy, P. K. Satapathy, S. C. Mukherjee and S. Ayyappan<br />
Central Institute of Freshwater Aquaculture<br />
Kausalyaganga, Bhubaneswar - 751002<br />
INTRODUCTION<br />
Fish health related information is of vital importance in modern aquaculture. A<br />
system for record keeping and health monitoring is essential for successful aquaculture<br />
production. The basic methodology for developing an animal health and disease information<br />
system for farm animals has been described by Hall (1978). The present system is<br />
designed to record diagnoses and diseases in a simple way by transferring data into<br />
separate files. Limitations in the detailed information available on fish diseases, definitions<br />
(nomenclature) etc. have been considered, and due care has been taken during<br />
development of the database information system. Database systems that exclusively<br />
record fish disease events have not been reported.<br />
OBJECTIVES<br />
The system can fulfil the following objectives:<br />
1. Effective surveillance and monitoring of health and disease status in fish<br />
maintained in a farm/aquaculture pockets.<br />
2. Precise recording and processing of regularly gathered morbidity and mortality<br />
data to produce comparable indices of diseases.<br />
3. Rapid retrieval of disease information and identification of variations in disease<br />
events of individuals and in fish stock.<br />
4. Standardized storage of epidemiological data for retrospective studies.<br />
5. Assessment of impact and economic measures adopted to prevent, control,<br />
eradicate and treat diseases and improve aquaculture productivity.<br />
6. Forecasting of fish diseases and tips for aquaculture farm operations.<br />
MINIMUM SYSTEM REQUIREMENT<br />
The fish disease data and information system for organized aquaculture sectors<br />
needs the following minimum computer equipment (hardware) and programmes.<br />
1. IBM PC with a minimum of 640 KB memory and 2 x 5.25" 360 KB DSDD floppy<br />
drives.<br />
2. Matrix/Line printer.<br />
The database format, post-mortem report forms and data dictionary for data entry<br />
have been developed by the Fish Pathology Division, CIFA, Kausalyaganga, Bhubaneswar.<br />
The system includes the following scientific aspects (software):<br />
a) Standardized definition of disease events and diagnosis.<br />
b) Systematic classification of diseases.<br />
c) Forms for recording data on clinical, post-mortem, fish stock (pond)<br />
environment and laboratory examination.<br />
d) Use of standard disease indices.<br />
e) Formats for reporting information regularly.<br />
f) Computer programs (software) for processing disease data.<br />
The disease data will be processed in MS-Excel, in which statistical data<br />
analysis can be done and the output finally obtained in graphical form.<br />
RDBMS packages like ORACLE/FOXPRO can be used for data entry and for<br />
sequential query processing to retrieve information. E-mail can be used extensively to<br />
collect disease information in a cheaper and faster way wherever the facility is<br />
available. A mailing list of farmers can be maintained to provide information on disease<br />
incidence and precautionary measures to be taken.<br />
CONTENT OF THE SYSTEM<br />
1. Standardized definitions of disease events and diagnosis.<br />
2. Systematic classification of disease.<br />
3. Forms for recording data at clinical, post-mortem and laboratory examinations.<br />
4. Use of standard disease indices.<br />
5. Formats for reporting information regularly.<br />
6. Software for processing disease data.<br />
USES OF FISH HEALTH AND POND ENVIRONMENT DATA<br />
A source of information for monitoring health status of cultured fish stock.<br />
A reminder for prophylactic measures to be undertaken in an aquaculture farm.<br />
To monitor optimal productivity of the fish farms.<br />
A source of information about previous illness and therapy.<br />
A source of information for epidemiological research.<br />
A source of clinical and laboratory information.<br />
A source of information for planning fish health.<br />
A source of information for calculating cost of disease and disease control.<br />
INFORMATION GENERATION<br />
Information is generated through the following records:<br />
1. Fish stock data register<br />
a) Aquaculture farm/sector report<br />
b) Monthly weight gain report<br />
c) Fish stock strength report<br />
d) Monthly morbidity/mortality report<br />
2. Listing of all disease events<br />
3. Comparative pattern of disease encountered clinically or at post-mortem.<br />
4. Specific morbidity and mortality rates of different species, class, sex, season,<br />
environment, locality etc., or combinations as desired.<br />
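Specific morbidity and mortality rates of the kind listed above can be sketched as follows. The text does not define the indices, so this assumes the common convention of cases (or deaths) per 100 fish at risk; the record keys `species`, `season`, `stock`, `sick` and `dead` are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def disease_rates(records):
    """Return morbidity and mortality percentages per (species, season).

    Each record is a dict with the hypothetical keys
    species, season, stock (fish at risk), sick and dead.
    """
    totals = defaultdict(lambda: {"stock": 0, "sick": 0, "dead": 0})
    for record in records:
        key = (record["species"], record["season"])
        for field in ("stock", "sick", "dead"):
            totals[key][field] += record[field]

    rates = {}
    for key, total in totals.items():
        if total["stock"] == 0:
            continue  # no fish at risk, so a rate is undefined
        rates[key] = {
            "morbidity_pct": 100.0 * total["sick"] / total["stock"],
            "mortality_pct": 100.0 * total["dead"] / total["stock"],
        }
    return rates
```

Grouping by other fields (class, sex, locality) would only require changing the tuple used as the key.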
Fish disease information gathering suffers from deficiencies at all levels in India.<br />
The information available at present is not effective for surveillance and monitoring of<br />
fish diseases. An aquaculture information system for the Indian situation has to be<br />
developed at three organizational tiers, i.e. 1. National, 2. State or Regional and 3.<br />
Farm level.<br />
Uniform data generation, recording and retrieval help in monitoring of fish<br />
health. However, the organizational necessities of providing routine health care,<br />
laboratory diagnosis, drug inventory, schedules of vaccination, deworming, dipping etc.<br />
cannot be ruled out. The fish disease information system at the national and regional<br />
levels will be similar, except possibly for the quantum of data processed.<br />
SYSTEM IMPLEMENTATION<br />
1. Fish disease information management :<br />
a) Organized farm level :<br />
The information system at organized farm levels has to be different as it will<br />
record and process primary data. The database maintained at the farm level will be<br />
used for the purpose of monitoring disease status and production efficiencies (Maru et al.,<br />
1990). Recording of disease events at the farm level will be for the cultured fish in farm<br />
ponds. This system has been designed to record disease related data at organized<br />
farms engaged in aquaculture research. These farms may also be the sentinel farms<br />
for a national disease information system.<br />
b) Farmer participatory rapid appraisal (PRA) :<br />
PRA approaches and methods have been tried to help aquaculture farmers<br />
do their own analysis of fish disease epidemiology, surveillance and monitoring and<br />
make their own needs and priorities known to scientists. It has been found that<br />
PRA satisfies the acute decision making needs of fish disease epidemiology,<br />
surveillance and monitoring. Participatory methods of 'visualisation', such as<br />
mapping, modeling, matrices, linkages and causal diagramming are powerful, valid<br />
and reliable when well facilitated and performed. PRA is a low cost diagnostic method<br />
which can be very well applied to fish health surveillance and monitoring. The PRA tool has<br />
already been evaluated under the 'Institution Village Linkage Programme' (IVLP), CIFA<br />
Centre, Kausalyaganga, and reported (Sahu et al., 1998) (please see Annexure).<br />
CONCLUSION<br />
It has been felt that disease has been and will continue to be a major constraint<br />
to the development of aquaculture. Further, high losses of revenue due to<br />
disease and health related problems have been witnessed. So the importance of<br />
epidemiology/epizootiology in providing solutions to aquaculture health problems cannot<br />
be overlooked. Fish health diagnosticians, researchers and extension scientists should<br />
be familiar with on-farm conditions, diagnostics and therapy, so that informed<br />
decisions on control and treatment can be made. Further research on the epidemiology<br />
and epizootiology of aquatic animal diseases will help to develop a comprehensive list<br />
and database of notifiable fish diseases.<br />
1. The database is expected to provide feedback to researchers and diagnosticians<br />
for making improvements in technology and disease surveillance.<br />
2. Thrust areas of need at regional/national level.<br />
3. Identification of appropriate research needs and refinement of methods to<br />
conduct fish health research programmes.<br />
4. Ranking of diseases and syndromes causing key production constraints in<br />
aquaculture.<br />
5. Medium range fish disease forecasts can be made from time series data on<br />
organized farms and fish production pockets, and fish farmers can be alerted<br />
before farm operations.<br />
REFERENCES<br />
Inglis, V., Roberts, R.J. and Bromage, N.R. (1993). Bacterial Diseases of Fish, Oxford, Blackwell<br />
Scientific Publications, London.<br />
Maru, A., Srivastava, R.S., P.S. Lonkar, S.C. Dubey and A.L. Choudhury (1990). Sheep<br />
Research Database, CSWRI Publication, CSWRI (ICAR), Avikanagar 304501, Rajasthan,<br />
India.<br />
Sahu, B. B., Radheyshyam, Kuldeep Kumar, Mukherjee, S. C. and S. Ayyappan (1998).<br />
Farmer participatory fish disease surveillance and monitoring using PRA tools, Tropical<br />
Agricultural Research and Extension, 1(2) : 1 - 14 pp.<br />
Visualisation of fish disease related information through PRA diagnosis<br />
[Figures not reproduced. Panel titles: Seasonality of fish disease; Fish disease calendar; EUS incidence in villages around CIFA farm; Spawn mortality; Fry mortality; Fingerling mortality; Juvenile mortality; Factors responsible for pond fish losses]<br />
QUANTITATIVE AND QUALITATIVE FISH PRODUCTION DATABASE<br />
B. B. Sahu, J. K. Jena, A. K. Roy and S. Ayyappan<br />
Central Institute of Freshwater Aquaculture,<br />
Kausalyaganga, Bhubaneswar-751002, Orissa<br />
INTRODUCTION<br />
Fish growth and production related information is of vital importance in modern<br />
aquaculture. A system of record keeping is essential for the success of production<br />
programmes. The Central Institute of Freshwater Aquaculture is working to develop a<br />
computer based system to record and process quantitative and qualitative fish growth<br />
and production related events in different production systems.<br />
IMPORTANCE OF AQUACULTURE PRODUCTION DATABASE<br />
As aquaculture is multidimensional, ordinary quantitative analysis is<br />
inadequate for arriving at any valid conclusion. Physical and chemical characteristics<br />
of the water body, seed quality, density, season, culture system, feeding and<br />
harvesting pattern are the important factors, and proper management of all these<br />
factors is essential for successful operation of pisciculture activities. Generally, a few<br />
major factors are considered at a time, while keeping other minor factors at a known<br />
level. Even then, suitable variance functions are presently not available to compare<br />
production parameters from different water bodies and to observe and compare the<br />
treatment effects (Royce, 1996).<br />
USE OF PRODUCTION RELATED DATABASE<br />
Among the many factors, and their interactions, influencing the growth of fish are:<br />
genetic make-up, species, behaviour, population dynamics, endocrinology, feed etc.<br />
No single factor should be considered in isolation, even though overall optimisation of<br />
the various factors is difficult. Definitive information on optimal growth is lacking for<br />
many culturable species. Growth rates, and qualitative and quantitative production<br />
parameters under different culture conditions, can be recorded in a database and<br />
optimum conditions for growth can be modelled, which would serve as a guide to<br />
researchers and producers (Wathne, 1995).<br />
CONTENTS OF DATABASE<br />
Knowledge of production efficiencies and determination of growth potentials<br />
which coincide with desired carcass attributes have provided impetus for improvement<br />
in genetic selection and management of aquatic animals. The role of quantitative and<br />
qualitative carcass data in aquaculture research programmes, especially genetics and<br />
breeding, production management, feeding and nutrition, for evolving suitable<br />
breeds/strains for quantity and quality fish production cannot be overemphasized. For<br />
this to be accomplished, accurate, standard and uniform methods for carcass<br />
evaluation are critically important. The present database is prepared keeping in mind<br />
the information related to: (a) physical and chemical characteristics of water bodies,<br />
(b) seed quality, (c) feeding, (d) quantitative production data (growth) and qualitative<br />
(carcass evaluation) production technology information. Due care has been given to<br />
meteorological parameters also.<br />
DATA FILES<br />
The data can be maintained in the following data files.<br />
1. Pond environmental records sub database<br />
2. Feed/Fertilizer sub database<br />
3. Monthly fish body weight sub database<br />
4. Meteorological record sub database<br />
5. Fish/carcass quality sub database<br />
6. Fish/flesh quality sub database<br />
1. Pond environmental record Sub database<br />
1. Sector Code :<br />
2. Pond accession No :<br />
3. Pond size (ha) :<br />
4. Water depth (m) :<br />
5. Stocking density (nos/ha) :<br />
6. Soil texture (sandy/clayey/loamy) :<br />
7. Soil available Nitrogen (mg/100g) :<br />
8. Soil available Phosphorus (mg/100g) :<br />
9. Soil organic Carbon (%) :<br />
10. Date of entry :<br />
11. Water transparency (cm) :<br />
12. Water temperature (°C) :<br />
13. pH :<br />
14. Dissolved oxygen (mg/l) :<br />
15. Free Carbon dioxide (mg/l) :<br />
16. Total Alkalinity (mg CaCO3/l) :<br />
17. Total Hardness (mg CaCO3/l) :<br />
18. Ammonia nitrogen (NH3-N) (mg/l) :<br />
19. Nitrite nitrogen (NO2-N) (mg/l) :<br />
20. Nitrate nitrogen (NO3-N) (mg/l) :<br />
21. Phosphate phosphorus (P2O5-P) (mg/l) :<br />
22. Plankton Count (No/l) :<br />
23. Any others :<br />
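A record of this sub database could be represented in code roughly as below. This is a minimal sketch covering only a handful of the listed fields; the attribute names and the plausibility checks (positive pond size and depth, pH between 0 and 14, non-negative dissolved oxygen) are assumptions added for illustration, not rules stated in the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class PondEnvironmentRecord:
    """A subset of the pond environmental record fields (names are hypothetical)."""

    sector_code: str
    pond_accession_no: str
    date_of_entry: date
    pond_size_ha: float
    water_depth_m: float
    water_temperature_c: Optional[float] = None
    ph: Optional[float] = None
    dissolved_oxygen_mg_l: Optional[float] = None

    def problems(self) -> List[str]:
        """Return a list of data entry problems; an empty list means none found."""
        issues = []
        if self.pond_size_ha <= 0:
            issues.append("pond size must be positive")
        if self.water_depth_m <= 0:
            issues.append("water depth must be positive")
        if self.ph is not None and not 0 <= self.ph <= 14:
            issues.append("pH must lie between 0 and 14")
        if self.dissolved_oxygen_mg_l is not None and self.dissolved_oxygen_mg_l < 0:
            issues.append("dissolved oxygen cannot be negative")
        return issues
```

Sector code, pond accession number and date of entry appear in several of the sub databases, so together they are a natural candidate for linking the files, although the text does not say which key is actually used.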
2. Feed/Fertilizer management Sub database<br />
1. Sector Code :<br />
2. Pond accession No :<br />
3. Pond size (ha) :<br />
4. Water depth (m) :<br />
5. Date of entry :<br />
6. Stocking density (nos/ha) :<br />
7. Lime (kg/ha) :<br />
8. Urea (kg/ha) :<br />
9. Single Super phosphate (kg/ha) :<br />
10. Micronutrient (kg/ha) :<br />
11. Manure (Cowdung/others) (kg/ha) :<br />
12. Feed (kg/day/area) :<br />
13. Any others :<br />
3. Monthly/Periodic fish body weight Sub database<br />
Sector Code :<br />
Pond accession No. :<br />
Pond size :<br />
Water depth :<br />
Stocking density :<br />
Date of Weighing :<br />
Age (days) :<br />
1. Species Code ............................ wt (gms)<br />
2. Species Code ............................ wt (gms)<br />
3. Species Code ............................ wt (gms)<br />
4. Species Code ............................ wt (gms)<br />
5. Species Code ............................ wt (gms)<br />
6. Species Code ............................ wt (gms)<br />
7. Others ........................................ wt (gms)<br />
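Periodic body weight records of this kind are what a weight gain report would be computed from. The following sketch assumes each weighing is reduced to a pair of age in days and mean body weight in grams for one species in one pond; that layout, and the function name, are illustrative assumptions rather than the format used by the authors.

```python
def daily_weight_gain(weighings):
    """Return mean daily weight gain (g/day) between successive weighings.

    weighings: list of (age_days, mean_weight_g) tuples for one species
    in one pond, in any order.
    """
    ordered = sorted(weighings)
    gains = []
    for (age0, weight0), (age1, weight1) in zip(ordered, ordered[1:]):
        if age1 == age0:
            continue  # two weighings at the same age give no interval
        gains.append((weight1 - weight0) / (age1 - age0))
    return gains
```

For example, weights of 10 g at 30 days, 40 g at 60 days and 100 g at 90 days give gains of 1.0 and 2.0 g/day for the two intervals.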
4. Meteorological record Sub database<br />
1. Air temperature (°C) :<br />
2. Relative humidity (%) :<br />
3. Rainfall (mm/day) :<br />
4. Sunshine hours (hrs/day) :<br />
5. Wind velocity (spm) :<br />
6. Any other :<br />
5. Fish/Carcass quality Sub database<br />
Annexure - I<br />
6. Fish/Flesh quality Sub database with indices<br />
Annexure - II<br />
REFERENCES<br />
Dunham, R.A. (1995). International Conference on sustainable contribution of fisheries to food<br />
security, Kyoto, Japan, 4 - 9 Dec. 1995, 15 - 16 pp.<br />
Royce, W. F. (1996). Introduction to the practice of fishery science, Academic Press, Inc.<br />
Wathne, E. (1995). Strategies for directing slaughter quality of farmed Atlantic salmon (Salmo<br />
salar) with emphasis on diet composition and fat deposition, Dr. Thesis, Agricultural<br />
University of Norway, N-1432 Ås, Norway.<br />
DATABASE OF INDUCED BREEDING EXPERIMENTS ON<br />
AN INDIAN MAJOR CARP Labeo rohita (Ham.)<br />
S. D. Gupta, A. K. Roy, S. C. Rath and P. K. Satapathy<br />
Central Institute of Freshwater Aquaculture,<br />
Kausalyaganga, Bhubaneswar - 751 002<br />
INTRODUCTION<br />
Over the past years a large volume of data has been accumulated on the breeding<br />
experiments of Labeo rohita (Ham.) conducted at CIFA. An attempt is being made to<br />
form a database of breeding experiments using standard techniques applicable to a<br />
computerized relational database management system, followed by multivariate<br />
analysis, which is likely to address a variety of research questions that have not yet<br />
been attempted in our country. A summary of the parameters studied and preliminary<br />
results are presented below.<br />
Labeo rohita (Ham.) is the most consumer-preferred culturable Indian major<br />
carp and belongs to the family Cyprinidae. Like other Indian major carps, Labeo rohita (Ham.)<br />
does not breed spontaneously in the confined water of culture ponds, but breeds in nature<br />
in flooded rivers during the monsoon. Its non-spontaneous breeding in captive water may<br />
be due to inadequate secretion of gonadotropin, a hormone of its own pituitary. Thus<br />
the exogenous induction of hormone for breeding in captive water is known as induced<br />
breeding. The principle of induced breeding is to manipulate the gonadotropin profile of<br />
the individual to the desired level by administration of pituitary extract of other species<br />
or isolated concerned hormones.<br />
The initial success of induced breeding of Indian major carps by Choudhuri and<br />
Alikunhi in 1957 began a new era in Indian carp culture, and Labeo rohita (Ham.)<br />
has been bred by induction ever since. Induced breeding by administration of pituitary extract is<br />
popularly known as induced breeding by hypophysation. To standardize the<br />
technology of induced breeding and to produce an adequate quantity of seed of Labeo<br />
rohita (Ham.), several breeding experiments have been conducted, but no database is<br />
available on the subject. The present communication is an attempt to create a<br />
database on induced breeding of Labeo rohita (Ham.). The study pertains to 462<br />
experiments, from July, 1970 to August, 1982, with carp pituitary extract (CPE),<br />
noncarp pituitary extract (NPE) and gonadal concerned hormone (GCH) as inducing<br />
agents. The inducing agents have been administered in different combinations<br />
and under different protocols. Experiments have been conducted within the temperature<br />
range of 27.5 to 35°C. Brood body wt. ranges from 0.3 - 2.7 kg (male) and 0.4 - 3.5<br />
kg (female). Spawning fecundity varies from 0.03 lakh eggs/kg to 4.18 lakh eggs/kg body<br />
wt. of the female. Fertilization rate ranges from 0 to 95 percent and spawn recovery<br />
ranges from 0 to 2.83 lakh/kg body wt. of the female.<br />
INDUCING AGENTS AND SPAWNING RESPONSES<br />
Twenty seven types of inducing agents have been used in the 462 breeding<br />
experiments. These inducing agents are broadly classified as carp pituitary extract,<br />
noncarp pituitary extract and marine fish pituitary extracts, along with pituitary extracts with<br />
isolated hormone, in combination with salmon pituitary powder etc. Carp pituitary<br />
extract has been used in aqueous medium for immediate use and in glycerine medium for instant use.<br />
Glycerine medium extracts have been tried after 0 year, 1 year, 2 years, 3 years,<br />
4 years and 5 years intervals.<br />
Table 1. Spawning response in Labeo rohita (Ham.) with different inducing<br />
hormones<br />
Inducing agent | Spawning (+)ve (%) | Spawning (-)ve (%) | Remarks on negative (-) spawning<br />
Acetone preserved carp pituitary in aqueous extract (ACPAE) | 79.7 | 20.3 | Inadequate diet and improper gonadal maturation<br />
Carp pituitary aqueous extract (CPAE) | 88.9 | 11.9 | High temp., unripe gonads, incorrect doses<br />
Carp pituitary glycerine extract (CPGE) | 57.2 | 42.8 | High temp., unripe gonads, and some other unknown factors<br />
Pituitary extracts and other hormone combination (PEOMC) | 52.4 | 47.6 | Loss of potency in more than two years, adverse weather condition, and improper gonadal maturation<br />
Noncarp pituitary extract (NPE) | 39.6 | 60.4 | Pituitary extract other than freshwater catfishes, carps and salmon, and single dose of salmon pituitary powder<br />
Agent group labels given in the table: CPE, GCH, NPE.<br />
TEMPERATURE AND BREEDING RESPONSE<br />
Water temperature plays a vital role in carp breeding. In the present study of<br />
Labeo rohita breeding, 73.5% of breeding failures are attributed to water temperature ≥ 32°C.<br />
Only 26.5% of the non-responding instances were found at temperatures ≤ 31.5°C.<br />
FERTILIZATION EFFICIENCY<br />
Spawn production depends upon the rate of fertilization of the ovulated eggs.<br />
Fertilization efficacy ≤ 50% is considered as poor fertilization instances (PFI).<br />
SPAWN RECOVERY<br />
In the present study the fertilized eggs are incubated in both the outdoor hapa<br />
system (OHS) and the indoor hapa system (IHS). If spawn recovery ≥ 70% out
THE MILLENNIUM BUG OR THE Y2K WAR<br />
A. K. Roy<br />
Bioinformatics Centre<br />
Central Institute of Freshwater Aquaculture<br />
Kausalyaganga, Bhubaneswar 751002<br />
INTRODUCTION<br />
Y2K is an abbreviation which stands for 'year two thousand' (K is representative<br />
of a kilo, which is equivalent to a thousand). The Y2K problem is also known as the<br />
MILLENNIUM BUG. The year 2000 (Y2K) problem may be defined as the inability of a<br />
computer program to correctly interpret the century from a date which represents a<br />
year as a two-digit value. The war, 'THE Y2K WAR', deals with a simple problem that<br />
involves just two digits. A wide variety of computer programs that display, manipulate<br />
or store dates have adopted the shorthand convention of using only the last two digits<br />
of the year. Many of these programs will fail when using dates beyond 1999,<br />
particularly if they compare those dates with earlier dates. It is estimated that the effort<br />
required to identify and fix the problem in all systems may take several years and<br />
thousands of programmer hours to complete. This paper describes the types of problems,<br />
misconceptions, apprehensions, remedies and opportunities associated with the Y2K<br />
problem.<br />
BACKGROUND OF Y2K PROBLEM<br />
The majority of computer applications in use today were developed years ago<br />
when the year 2000 seemed too far in the future to worry about. These programs<br />
historically represented the year portion of a date using only two digits. Dates are<br />
critical to computers. Most dates programmed in computers are based on a two-digit<br />
year field, for instance '99' rather than '1999'. There are two main reasons why a two-digit<br />
field has been the norm among programmers over the last 50 years: firstly, the high<br />
cost of storage in the early days of computing and secondly, as systems and<br />
applications were constantly being developed and replaced, it was never realised that<br />
they would last till the advent of the new millennium. Some believe that this problem is<br />
partly due to lack of foresight and partly due to lack of resources. The problem exists<br />
for mainframe, mid-range and PC computers alike. The two-digit year field can be<br />
found in microcode, operating systems, software compilers, application queries,<br />
production screens and databases. The problem was not thought of earlier, but it was<br />
realised when some software which deals with future dates, i.e. renewal date, licence<br />
expiry date etc., started giving problems.<br />
As believed, the year 2000 problem comes from, but is not limited to, the use of a<br />
2-digit year (yy) format instead of a 4-digit (yyyy) format for year representation within<br />
programs, databases, files and processes. For example, the year 1997 is<br />
represented as '97', the year 1998 as '98', and so on. Likewise February 29, 2000 is<br />
represented as 02/29/00 (using MMDDYY format), which might be interpreted as<br />
February 29, 1900. Consequently, programs that perform arithmetic operations,<br />
comparisons or sorting of date fields may fail to yield correct results when manipulating dates in<br />
the year 2000 and beyond.<br />
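The ambiguity of a 2-digit year can be sketched in a few lines. The function below is hypothetical: with no pivot it applies the legacy assumption that the century is always 19, and with a pivot it applies "windowing", a commonly used remediation that is an assumption of this sketch and is not described in the text above.

```python
from datetime import date


def parse_mmddyy(text, pivot=None):
    """Parse an MM/DD/YY string into a date.

    pivot=None reproduces the legacy rule (century is always 19).
    With a pivot, two-digit years below it are read as 20yy and the
    rest as 19yy (a windowing rule, assumed here for illustration).
    """
    month, day, yy = (int(part) for part in text.split("/"))
    if pivot is None:
        year = 1900 + yy
    else:
        year = (1900 if yy >= pivot else 2000) + yy
    return date(year, month, day)
```

Under the legacy rule `parse_mmddyy("03/01/00")` yields 1 March 1900, whereas with `pivot=50` it yields 1 March 2000. The string 02/29/00 is a sharper case: read as 1900 it is not a valid date at all, because 1900 had no 29 February, so Python's `date` raises `ValueError`.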
Some of the misconceptions about the year 2000 challenge, with clarifications, are<br />
as follows.<br />
i) That the problem occurs only when or after the century rolls over.<br />
ii) That it is a hardware clock problem which should be solved by computer<br />
vendors.<br />
iii) That this is a problem that occurs only in mainframe systems and/or core<br />
applications.<br />
Clarifications:<br />
i) Forecasting applications that deal with future dates will face problems in<br />
advance of the year 2000. Cases that deal with expiration dates that go beyond<br />
2000 are already at risk.<br />
ii) Contrary to the belief that it is a hardware problem, in reality the problem comes<br />
mostly from application programs.<br />
iii) Any program or system can be affected if it uses only two digits for<br />
representation of the year in any file, database or log with 2-digit year fields, and in any<br />
data entry, update and output processing that employs 2-digit year fields.<br />
iv) The Y2K problem will have an impact at all levels: hardware level, operating system<br />
level and application software level.<br />
THE NATURE AND STRATIFICATION OF THE PROBLEM<br />
The year 2000 problem (phenomenon) has broad impact and can be visible in<br />
various ways. This phenomenon has both an information processing system-wide and an<br />
institution-wide impact on the computing environment. Within a system, this phenomenon can<br />
originate from or affect many key components like hardware, software, people, data<br />
and procedures. Institutionally, it can pass contaminated data files to other<br />
computing systems inside or outside the organization. This is a complicated problem<br />
with far reaching consequences but it is not beyond solution. This problem may also<br />
affect microcoded hardware like VCRs and digital clocks. The year 2000 syndrome is<br />
compounded by the many variations used to express year and date notation in data, the<br />
mathematical calculations performed on those data notations and the many places<br />
where date data may occur. These variations are stratified as follows:<br />
Assumed century: Problems are likely to be encountered when the first two digits in a<br />
year are assumed to be 19 and ignored during data entry or manipulation, or hard<br />
coded on output.<br />
Special values: Sometimes special values of the last two digits in a<br />
year might be used for a special purpose; for example 99, 365/99 or 12.31.99<br />
might be used to indicate 'no expiration date', or 00 to indicate an 'unknown<br />
year'.<br />
Incorrect date format: Many programs determine the date format (MM<br />
DD YY or DD MM YY or YY MM DD) by testing an appropriate part of the date<br />
field. A value of zero might be considered as lack of any date at all.<br />
Arithmetic: Many arithmetic calculations that operate on dates with 2-digit<br />
year representation are potentially dangerous. A person with a birth year of<br />
1951 will be considered to be 51 years old rather than 49 years old in 2000 if<br />
the years 1951 and 2000 are represented by 51 and 00 respectively.<br />
Sorting: When two digits are used to represent a year, programs that collate year data<br />
will sort that data out of sequence if there are dates both before and after the<br />
year 2000 transition.<br />
Archival: Data archives, such as magnetic tapes of databases containing student<br />
records, research data or financial records, may have fixed 2-digit year data that<br />
should not be modified. Instead, a special program may be written to read and<br />
convert archival data, particularly if the data are to be used in union with data<br />
from beyond 1999.<br />
Data exchange: When data are to be exchanged between systems, a<br />
special case of year 2000 mitigation arises. There must be close co-ordination<br />
between system updates on both sides of the exchange, otherwise the receiving<br />
systems may fail.<br />
Key generation: Sometimes date information is used by systems as<br />
part of their algorithm to generate a unique key or serial number. If a 2-digit<br />
year is used, this may cause confusion in some cases. This type of problem is<br />
likely to be an issue only with datasets covering more than 100 years.<br />
Leap year: This is not a 2-digit problem but rather a problem in the years 2000,<br />
2400 etc. The year 1900 is not a leap year because it is not a multiple of 400,<br />
but 2000 is a leap year. Date conversion routines may not have been<br />
programmed to take this anomaly into account since it occurs only once in 400<br />
years.<br />
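The arithmetic and sorting cases in the list above are easy to reproduce. The helper names below are hypothetical and the sketch simply contrasts 2-digit and 4-digit year handling using the figures from the text (birth year 1951, current year 2000).

```python
def age_from_two_digit_years(birth_yy, current_yy):
    """Age computed the legacy way, from the last two digits only."""
    return current_yy - birth_yy


def age_from_four_digit_years(birth_year, current_year):
    """Age computed from full 4-digit years."""
    return current_year - birth_year


def sort_by_two_digit_year(years):
    """Sort full years as a legacy program would, by the last two digits."""
    return sorted(years, key=lambda year: year % 100)
```

With 51 and 00 the legacy calculation gives -51, which a program that discards the sign reports as 51 years, while the 4-digit calculation gives 49. Sorting 1998 to 2001 by the last two digits places 2000 and 2001 ahead of 1998 and 1999, which is the out-of-sequence result described under Sorting.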
Some of the problems caused by the identification of 2000 as a non-leap
year, which would manifest in dates after February 28, are as follows.
i) Day-of-year calculations (the year 2000 has 366 days, not 365).
ii) Day-of-the-week calculations (March 1, 2000 is a Wednesday, not a Tuesday,
which is February 29, 2000).
iii) Week calculations (the 11th week of the year 2000 is 5 through 11 March, not
6 through 12 March).
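These three figures can be checked against a calendar library. The sketch below uses Python's standard `datetime` module; the week count assumes that weeks start on Sunday and that the partial week containing 1 January counts as week 1, which appears to be the convention behind the figures quoted above.

```python
from datetime import date

WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]

# i) Day-of-year: 31 December is day 366 in a leap year.
days_in_2000 = date(2000, 12, 31).timetuple().tm_yday

# ii) Day-of-the-week for the two dates named in the text.
feb_29 = WEEKDAY_NAMES[date(2000, 2, 29).weekday()]
mar_1 = WEEKDAY_NAMES[date(2000, 3, 1).weekday()]

# iii) Week of the year. %U counts Sunday-based weeks starting from 0,
# so 1 is added to count the partial first week as week 1 (assumption).
week_of_mar_5 = int(date(2000, 3, 5).strftime("%U")) + 1
week_of_mar_11 = int(date(2000, 3, 11).strftime("%U")) + 1

print(days_in_2000, feb_29, mar_1, week_of_mar_5, week_of_mar_11)
```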
APPREHENSIONS AND REMEDIES OF Y2K CRISIS<br />
The impact will be tremendous, not only for the business community but for the
community at large. Areas such as banking, budgeting, accounting, stock markets,
licensing, reservations, inventory, credit card transactions and forward planning will all be
affected by the Y2K crisis.
The dimensions of this challenge are enormous. Given society's reliance on
computers, the failure of systems to operate properly can mean anything from minor
inconvenience to major problems: licences and permits not issued; payroll, medical
and academic records malfunctioning; errors in banking and finance. The bug affects
computations which calculate age, sort by date, compare dates or perform other
specialised tasks.
Some software vendors have developed modern tools as a remedy in the
process. These are not guaranteed to solve all problems, but they will likely identify where
problems exist and recommend solutions, speeding the process.
STRATEGIES FOR ELIMINATION OF Y2K PROBLEM
For running application software in the 21st century, a strategy should be decided
for making the systems Y2K compliant. An inventory of all such software has to be
made and classified keeping in view the following points.
i) Whether the software will run beyond the year 2000.
ii) Whether the software involves computations on some future dates.
iii) Whether the existing software can be replaced by other versions, apart
from being Y2K compliant.
iv) Whether all software yet to be developed is developed in such a way that
it is Y2K compliant.
It is clear that not everything has to be converted. Communication tools and
hardware also have to be Y2K compliant because these involve date and time. Some
software is very critical, namely real-time systems such as flight
monitoring and the computers of aircraft, spacecraft and radar systems.
Business Opportunity<br />
According to the experts, solutions to the Y2K crisis may yield a huge commercial
opportunity. Conservative estimates put the global opportunity in this area at $ 60 -
100 billion; India may capture business worth 2 - 5 billion. Therefore, it is a bright
challenge for the Indian IT professional.
SCOPE OF APPLICATION OF STATISTICAL METHODOLOGIES<br />
IN AQUACULTURE RESEARCH<br />
A. K. Roy<br />
Bioinformatics Centre
Central Institute of Freshwater Aquaculture,
Kausalyaganga, Bhubaneswar 751 002
INTRODUCTION
Like many other disciplines of science, statistics also plays an important role
in aquacultural research. Some of the areas where statistical methodologies can
be applied are described below. These are based on the experience of the author.
There may be some more areas which are not included in this article.
SYSTEMATIC STUDIES AND IDENTIFICATION<br />
In systematic studies it is always necessary to establish the relationship
between two or more morphometric quantitative measurements, such as the relationship
between head length and total length or breadth of a carp, or total length and carapace
length of a prawn.
Taxonomic hypotheses formulated in terms of quantitative characteristics may
be tested by means of the chi-square test, Student's t-test, analysis of variance, multiple
range tests and non-parametric tests. Multivariate analysis may be useful when it is
necessary to combine information on several characters (morphometric/meristic) to
obtain the best possible racial discrimination.
COLLECTION, ESTIMATION AND TRANSPORTATION OF FISH AND PRAWN
Till today, freshwater aquaculture in India is partially dependent on natural
production of carp seed. Therefore, a lot of work remains on standardisation of
collection, estimation and transportation of fish and prawn seed. Availability of seed at
different locations may depend on current velocity, turbidity, dissolved oxygen,
food availability and numerous other factors. To identify the factors responsible for
the availability of seed and to select a suitable place for collection, the stratified random
sampling technique, chi-square test, analysis of variance and multivariate analysis
can be applied. Factorial experiments, in which the simultaneous effect of various
factors can be studied precisely, can be planned for optimisation of space,
time, temperature etc. for mortality-free transportation of fish seed to different areas,
taking into account environmental conditions. Bioassay techniques can be applied to assess
the impact of effluents on fish larvae. Suitable sampling techniques for
estimation of fish seed may be employed.
NURSERY REARING AND CULTURE EXPERIMENTS
Aquaculture experiments are quite different from agricultural
experiments because in the former the experimental animals cannot be seen and
periodical mortality cannot be observed. Moreover, the requirement of a minimum number of
experimental units can never be met due to the shortage of ponds. However, under
varied levels of fertilisation, stocking density, species combination and ratio, and
supplementary feed during different stages of nursery and culture experiments,
simple designs like the completely randomised design, randomised block
design, Latin square design, factorial design, incomplete block design etc. can be laid out,
depending on the objective of the study. The systems approach and simulation
studies may also be adopted for studying the overall impact of stocking size and
density, feeding quality, quantity and periodicity, species composition in polyculture
and pond management to increase the carrying capacity of water bodies. Manipulation of
non-monetary inputs may enhance profitability.
OPTIMUM UTILISATION OF BROODER
Size of brooder, dose of pituitary gland and physico-chemical parameters of the pond
play a great role during breeding. Therefore this is one of the areas where, through
the use of suitable designs of experiments, optimum exploitation of brooders can be
achieved.
ESTIMATION OF FISH POPULATION<br />
For rational management of culture fishery, monitoring of the numerical changes
which occur in a population through the course of time is essential for a basic
understanding of population number and production. For precise estimation of
the fish population number in a pond at any point of time, the following methods may be
applied, on which a lot of research work has been carried out at this Institute.
i) Method of two successive haulings
ii) Mark-Recapture method
i) Method of two successive haulings
This method is very simple. All that has to be done is to drag a net
once in a pond, keep the captured fish in a container and let the catch be N1
in number, then drag the net again in the same waterbody and let the catch be N2
in number. Then the estimate of the total number of fish present in the pond is given by
This method, being convenient in operation and involving minimum cost, is
recommended for operation with caution (Roy et al., 1995).
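The estimator for the two-haul method does not survive in the text above. A minimal sketch follows, assuming the standard two-catch removal estimator is the one intended: if each haul takes the same fraction p of the fish then present, N1 = pN and N2 = p(N - N1), which gives p = (N1 - N2)/N1 and an estimate of N1 squared divided by (N1 - N2). The function name and the catch figures are hypothetical.

```python
def two_haul_estimate(n1: int, n2: int) -> float:
    """Two-catch removal estimate of population size (assumed estimator).

    n1, n2: numbers caught in the first and second haul; fish from the
    first haul are held aside and not returned before the second haul.
    """
    if n1 <= n2:
        # The estimator is undefined unless the second catch is smaller.
        raise ValueError("second catch must be smaller than the first")
    return n1 * n1 / (n1 - n2)


estimate = two_haul_estimate(120, 80)
print(estimate)
```

The estimate becomes unstable when the two catches are nearly equal, which is consistent with the advice to use the method with caution.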
ii) Mark-Recapture Method<br />
The rationale underlying mark-recapture experiments to estimate population
number is that the proportion of marked fish appearing in a random sample
provides an estimate of the proportion of marked fish in the population. If 'm' is
the known total number of marked fish in the population from which the sample was
drawn, then division of 'm' by the estimate of the proportion marked gives an estimate
of the total number of individuals in the population. The mathematical expression of the
estimation formula (known as the Petersen method) becomes

$$\hat{N} = \frac{m}{r/c} = \frac{mc}{r}$$

where N = total number of fish in the population (unknown)
m = total number of marked fish in the population (known)
c = number of fish in the sample
r = number of marked fish recaptured in the sample
$\hat{N}$ = estimate of N.

$$\mathrm{SE}(\hat{N}) = \hat{N}\sqrt{\frac{(\hat{N}-m)(\hat{N}-c)}{mc(\hat{N}-1)}}$$
If the assumption that marked fish are representative of the remainder of the
population is correct, then the only errors of estimation are the random errors
associated with sampling. An experiment conducted at the Wastewater Aquaculture Division
of CIFA demonstrated that the Petersen estimator modified by Bailey is efficient for
estimation of the carp population in a pond because it showed lower standard
error and the highest precision, coupled with the lowest deviation of the estimated population
from the true population (Roy et al., 1989). It is further observed that marking of
carps by fin clipping, which can be identified one year after clipping, is suitable for the
batch marking required for estimation of fish population in ponds (Roy et al., 1991).
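The mark-recapture calculation can be sketched as below, with m marked fish in the pond, c fish in the sample and r marked fish recaptured. The Petersen estimate is m divided by the marked proportion r/c, and the standard error follows the expression given above. Bailey's modification is assumed here to be the usual form m(c + 1)/(r + 1); the function names and the numbers are hypothetical.

```python
import math


def petersen_estimate(m: int, c: int, r: int) -> float:
    """Petersen estimate: marked fish divided by the marked proportion r/c."""
    if r <= 0:
        raise ValueError("at least one marked fish must be recaptured")
    return m * c / r


def petersen_standard_error(m: int, c: int, r: int) -> float:
    """Standard error of the Petersen estimate, using the estimate in place of N."""
    n_hat = petersen_estimate(m, c, r)
    return n_hat * math.sqrt((n_hat - m) * (n_hat - c) / (m * c * (n_hat - 1)))


def bailey_estimate(m: int, c: int, r: int) -> float:
    """Bailey's modification of the Petersen estimate (assumed form)."""
    return m * (c + 1) / (r + 1)


# Hypothetical pond: 100 fin-clipped fish released, 80 netted, 20 of them clipped.
n_hat = petersen_estimate(100, 80, 20)
se = petersen_standard_error(100, 80, 20)
n_bailey = bailey_estimate(100, 80, 20)
print(n_hat, round(se, 2), round(n_bailey, 1))
```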
ESTIMATION OF PRODUCTION<br />
Freshwater aquaculture, being subjected to a wide range of environmental
fluctuations, passes through various stress conditions leading to variation in survival,
growth and production at different points of time. Therefore estimation of fish
population and production is very important for understanding the process of
production. In fishery science we are acquainted with terms like biomass,
production and yield. Generally no distinction is made between yield and production.
In the case of agricultural crops this may be true, but it is not so, in general, in fishery
science. Biomass is the amount of substance in a population expressed in material
units, such as live, wet or dry weight. It is also termed standing stock or
crop. Here we may consider the wet weight of fish as biomass. Suppose at the
time of our observation the estimated number of fish is N with average weight W.
Then the estimated biomass at the time of our observation is: Biomass (B) = N W.
To express biomass at different periods it is required to introduce a time element
in the above expression as

$$B_t = N_t W_t$$

where
B_t = biomass at time 't'
N_t = number at time 't'
W_t = average weight at time 't'
Similarly, biomass at times t1 and t2 can be expressed as

$$B_{t_1} = N_{t_1} W_{t_1}, \qquad B_{t_2} = N_{t_2} W_{t_2}$$
Production in a given time interval is defined (Ivlev) as the total elaboration of
animal tissue during the time interval, including what is formed by individuals that do not
survive till the end of that time interval.
What is produced is production and what is harvested is yield. In fisheries
the quantity harvested, in other words the final biomass, may be termed gross
yield. Net yield is the difference between final biomass and initial biomass; what we
generally express as production is in reality yield. That means we never take
into consideration those fishes which died between the initial and final period of growth,
in spite of the fact that those produced flesh during the intermediate period. Yield and
production will be the same when there is no mortality during the growth period.
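The distinction between biomass, gross yield and net yield can be illustrated with a small calculation. The stocking and harvest figures below are hypothetical, and weights are held in grams so that the arithmetic stays exact.

```python
def biomass(number: int, average_weight_g: int) -> int:
    """Biomass = number of fish x average weight (grams)."""
    return number * average_weight_g


# Hypothetical pond: 1000 fish of 50 g stocked; 800 fish of 500 g harvested.
initial_biomass = biomass(1000, 50)
final_biomass = biomass(800, 500)

gross_yield = final_biomass                   # what is harvested
net_yield = final_biomass - initial_biomass   # final minus initial biomass

print(initial_biomass, gross_yield, net_yield)
```

The 200 fish that died before harvest also formed tissue while they were alive, so production in Ivlev's sense would exceed the net yield computed here.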
AGE AND GROWTH STUDIES<br />
The ability to determine the age of a fish is an important tool in fishery biology.
The simplest and most widely used method of age determination is the analysis of the size
frequency distribution. It can be used only for the youngest age groups of a fish
population. During their development fishes pass through several stages, each of
which may have its own length-weight relationship depending on sex, maturity, season, place
and even time of day. Hence fitting of a regression line by the least squares method in
each situation is required. For allometrically growing fishes, the condition factor can be
worked out to compare the individual condition of fish under varied conditions. Since
growth is a complex process a complete expression is not feasible, but formulation of
growth models which are basically realistic could be important. For purposes of
description, a number of straight line functions, logistic curves and exponential curves have
been fitted statistically and the different curves evaluated. The best growth
model which can be fitted in fisheries production studies is that of von
Bertalanffy. This particular growth curve can be used in growth studies of freshwater
fishes.
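Two of the tools mentioned above can be sketched briefly. The first function is the usual form of the von Bertalanffy growth curve, length at age t equal to L_inf (1 - exp(-K (t - t0))); the second fits the length-weight relationship W = a L^b by least squares on the logarithms. The parameter names and all data values are assumptions made for illustration.

```python
import math


def von_bertalanffy_length(t: float, l_inf: float, k: float, t0: float = 0.0) -> float:
    """Length at age t: L_inf * (1 - exp(-K * (t - t0)))."""
    return l_inf * (1.0 - math.exp(-k * (t - t0)))


def fit_length_weight(lengths, weights):
    """Least-squares fit of log W = log a + b log L; returns (a, b)."""
    xs = [math.log(length) for length in lengths]
    ys = [math.log(weight) for weight in weights]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Hypothetical carp data generated from W = 0.01 * L^3 (isometric growth).
lengths_cm = [10.0, 15.0, 20.0, 25.0]
weights_g = [0.01 * length ** 3 for length in lengths_cm]
a, b = fit_length_weight(lengths_cm, weights_g)

# Hypothetical growth parameters: L_inf = 90 cm, K = 0.3 per year.
length_at_2 = von_bertalanffy_length(2.0, 90.0, 0.3)
print(round(a, 4), round(b, 4), round(length_at_2, 1))
```

An exponent b close to 3 indicates isometric growth; departures from 3 indicate allometric growth.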
GENETICAL STUDIES<br />
The foundations of the modern theory of breeding are based on genetics and
statistics, which together constitute the scientific discipline of statistical genetics founded
by Fisher, Wright and Haldane. Therefore there is wide scope for application of
statistics in fish genetical studies like estimation of genetic correlation, correlated
response to selection, simultaneous selection for several characters and calculation
of coefficients of inbreeding and of the inter-relationship of various production traits.
MODELING OF GROWTH OF FISHES AND POND DYNAMICS<br />
In aquaculture research, the statistical methods used for establishing
empirical relationships are mostly univariate or bivariate in nature, e.g. the t-test,
correlation, linear regression etc. In many cases one has to deal with several variables,
for example environmental variables as predictors and fish growth as the response
variable; such a situation is known as a multivariate situation, which requires treatment
and analysis of data using multiple regression analysis, path analysis and
canonical correlation analysis. Although manual calculation is very tedious, the
availability of computers and software programs has brought these analyses within one's
reach.
In a pond environment a multitude of factors interact dynamically and influence
fish growth and production. Some environmental factors are uncontrollable, which
requires thorough study. The interaction of various factors and their resulting effect on fish
growth are seldom understood. In order to make the behaviour of these systems, which
themselves undergo internal changes over the culture cycle, more predictable,
mathematical models capable of describing the fish pond ecosystem practically are
necessary.
SAMPLE SURVEYS FOR ESTIMATION OF FISH PRODUCTION FROM INLAND<br />
SOURCES<br />
In view of the large coastline, the multitude of inland fisheries resources, the diversity of
fishing practices and the scattered distribution of exploiting units, it is very difficult to have
a reliable production estimate. In spite of this, various organisations like ISI, NSSO,
IASRI, CMFRI, CIFRI etc. have during the past decades conducted pilot surveys to
standardise the sampling methodologies for estimation of resources and production.
Presently CICFRI is running a Central Sector Project entitled Development of Inland
Fisheries Statistics in India, covering various states, to develop efficient methodologies
for accurate estimation of resources or production. This is a potential area of research
on the application of sample surveys. The impact of aquaculture technology on society as a
whole can be assessed through socio-economic and techno-economic surveys
using suitable sampling methodology.
REFERENCES<br />
Roy, A. K., Apurba Ghose and B. K. Saha (1989). Estimation of some species of fish populations
from pond by fin clipping and comparative efficacies of three estimators. Environment &
Ecology, 7(2): 398 - 403.
Roy, A. K., A. K. Datta, P. R. Sen and B. K. Saha (1991). Preliminary studies on the effect of
pectoral fin clipping in carps on growth, survival and regeneration rate. J. Aqua. Trop.,
6 (1991): 89 - 98.
Roy, A. K. and A. K. Datta (1995). Two methods of estimating carp population from closed water
bodies. J. Inland Fish. Soc. India, 27(1): 70 - 77.
MANY FACES OF STATISTICS
INTRODUCTION
Statistics is a science with a wide range of methods and
techniques. It plays a vital role in research, in industry, and in
formulating national policies and programmes. Statistics enables the scientist to have
full play for his or her creative potentialities, to discover new phenomena without
allowing them to run wild and go to waste in advancing new concepts. A government can
provide the best benefits to the people if it takes policy decisions on the basis of a sound
statistical study of problems.
How do these laws or theories get established? There is a scientific method.
First, a law is formulated as a provisional hypothesis to explain certain observed
events. Second, the consequences of the hypothesis are worked out by means of
deductive reasoning and verified by further observations collected through carefully
designed experiments.
If the data contradict the hypothesis, it is discarded, and a fresh one is
formulated. Otherwise, it is provisionally accepted and is given the status of law, with
specified limitations and scope of application.
The scientific method of investigation involving the logical cycle, Hypothesis -
Data - Hypothesis, can be schematically represented as follows :
STATISTICS IN SEARCH OF TRUTH<br />
A few examples are given to show the inadequacy of measures of location such
as the average, median and the mode in describing a given population and the pitfalls
in inferences based on them. This is because the individuals in a population usually
differ substantially from one another and this might make a difference. In such cases,
we may compute a measure of dispersion (differences between individuals) to
supplement the measure of location. Suppose $x_1, \ldots, x_n$ are measurements of n
individuals arranged in increasing order of magnitude. One measure of dispersion is the
range $R = x_n - x_1$ (the biggest minus the smallest). Another measure is the standard
deviation s, which depends on all the values, where $s^2 = \sum (x_i - \bar{x})^2 \div n$, which is the
average of the squared deviations of the individual values from the average $\bar{x} =
(x_1 + \ldots + x_n)/n$. Thus we have two quantities, $\bar{x}$ and s, to describe a population. The
former measures the general magnitude of values and the latter the spread of values.
A small value of s indicates more homogeneity of the individuals with respect to the
character under study.
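The two measures of dispersion can be computed directly from their definitions. The sketch below uses a small hypothetical set of measurements and divides the sum of squared deviations by n, as in the text.

```python
import math


def describe(values):
    """Return (mean, range, standard deviation with divisor n)."""
    ordered = sorted(values)
    n = len(ordered)
    mean = sum(ordered) / n
    value_range = ordered[-1] - ordered[0]
    variance = sum((x - mean) ** 2 for x in ordered) / n
    return mean, value_range, math.sqrt(variance)


# Hypothetical measurements on eight individuals.
mean, value_range, std_dev = describe([2, 4, 4, 4, 5, 5, 7, 9])
print(mean, value_range, std_dev)
```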
A single characteristic in a population can be studied easily. Often it is
necessary to consider two or more characteristics and examine their interrelationships.
As an example, the average IQ of sons increases with increase in the IQ of the father.
This establishes some kind of relationship, though not of a one-to-one type. When the
values of x (father) and y (son) are plotted in their standard deviation units, the
strength of the relationship may be measured by the slope of the regression line, i.e.
the tangent of the angle it makes with the x-axis; when the slope is zero,
there is obviously no relationship. This slope is called the correlation between x and y and is
denoted by r. It can be directly computed from the observed pairs $(x_1, y_1), \ldots, (x_n, y_n)$ by
the formula

$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$$

Relationships between variables are frequently used for predicting one variable
given the others, or controlling one variable by causing others to take suitably
determined values.
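A minimal sketch of the correlation calculation, assuming the usual product-moment formula (sum of cross-products of deviations divided by the square root of the product of the two sums of squares). The paired values are hypothetical.

```python
import math


def correlation(xs, ys):
    """Product-moment correlation coefficient r for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired scores.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = correlation(x, y)
r_perfect = correlation(x, [2, 4, 6, 8, 10])
print(round(r, 3), round(r_perfect, 3))
```

A value of r near zero indicates no linear relationship; a value near 1 or -1 indicates a close linear relationship.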
The correlation between two variables may be induced entirely by a third
variable, in which case the observed relationship is spurious and cannot be used for
prediction. The task of making the necessary computations and updating the
discriminant function by using fresh evidence provided by concurrent cases and by
adding newly discovered diagnostic tests is indeed very complex. For this purpose
modern high speed computers are pressed into service. Computer diagnosis using
hundreds of measurements is now commonly used in complicated heart diseases.
Design of experiments offers a firm basis for drawing conclusions from data.
Much of the experimental data generated by scientists goes to waste or leads to wrong
conclusions because of lack of adequate controls and bias in assignment of treatments.
Most of the quantities involved in fishery research cannot be observed or
measured throughout the whole population. A section or sample of the whole
population is therefore examined for the attributes concerned (average size or average
weight), which is known as sampling.
In multistage sampling, one does not draw a sample of the desired units directly;
one reaches such a sample in stages through samples of intermediate units. The
method can be illustrated in mathematical terms : the population can be split into K
primary units, each of N individuals, and k primary units are sampled, a subsample of n
individuals being taken from each.
If $m_i$ is the mean for the i-th primary unit, then the estimate of the mean of any
sampled primary unit is

$$m_i = \frac{1}{n}\sum_{j=1}^{n} x_{ij}$$

where $x_{ij}$ is the value of the j-th individual in the i-th unit, and the estimate of the
population mean is

$$\bar{x} = \frac{1}{k}\sum_{i=1}^{k} m_i$$
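Two-stage sampling of this kind can be sketched as follows: k primary units are drawn at random, n individuals are drawn from each, and the population mean is estimated by the mean of the unit means. This assumes primary units of equal size and simple random selection at both stages; the function names and the pond data are hypothetical.

```python
import random


def two_stage_sample(primary_units, k, n, seed=None):
    """Draw k primary units at random, then n individuals from each."""
    rng = random.Random(seed)
    chosen_units = rng.sample(primary_units, k)
    return [rng.sample(unit, n) for unit in chosen_units]


def two_stage_estimate(subsamples):
    """Return (unit means m_i, estimate of the population mean)."""
    unit_means = [sum(sub) / len(sub) for sub in subsamples]
    population_mean = sum(unit_means) / len(unit_means)
    return unit_means, population_mean


# Hypothetical population: K = 4 ponds, each with N = 5 fish weights (g).
ponds = [
    [210, 230, 250, 220, 240],
    [310, 290, 300, 320, 280],
    [150, 170, 160, 180, 140],
    [400, 420, 410, 390, 380],
]
subsamples = two_stage_sample(ponds, k=2, n=3, seed=1)
unit_means, population_mean = two_stage_estimate(subsamples)
print(unit_means, population_mean)
```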
STATISTICS IN AQUACULTURE RESEARCH<br />
Decision making in aquaculture research presupposes a deep knowledge of the
aquaculture system and planning the future programmes through a well established
data recording system. This is one of the essentials of farm management, and
decisions must be made as to which parameters, when and how much of them are to be
monitored.
Modelling and optimization of growth of fish in aquaculture is a very important
factor of study for the success of the operation of aquaculture projects.
Management of ponds is based largely on monitoring complex processes of pond
dynamics and sensitivity to environmental and operational factors. Physiological and
biological parameters over a long period of time make certain demands for a large
storage capacity of computer for developing a data acquisition system.
For providing strategic data for planning and the subsequent task of handling
through automation, computer-aided design, computer-aided instrumentation and data
transmission, and ecological instrumentation are some of the
areas which are operated through the help of micro
computers.
The use of computer models in the system plays an important role in
obtaining a better understanding of the pond ecosystem and developing management
to optimise it. It has great importance for identifying the critical
parameters that have higher effects on the pond's processes and hence the production.
Multivariate analysis for developing models is the branch of statistics concerned with
analysing multiple measurements that have been made on several samples of
individuals. Variables are dependent among themselves so that one cannot split off one
or more from the others. Computational analysis incorporating a large number of variables
is the only solution to arrive at the conclusion identifying the critical parameters.
In order to develop efficient and economical feed formulae for aquaculture,
basic information is required on the nutrient requirements of the species cultivated, the
chemical composition and organoleptic properties of feed ingredients in relation to their
acceptability, and the ability of fish to digest and utilise nutrients from various sources.
Linear programming is a mathematical technique based on matrix algebra and best
suited to a computer. This offers considerable potential in the development of 'least
cost formulation of fish diets'.
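The idea of least-cost formulation can be illustrated with the smallest possible case: a diet mixed from two ingredients subject to a minimum protein level. With one free variable (the fraction x of the first ingredient) the feasible region is an interval and the linear cost is minimised at one of its ends, so no solver is needed. Real formulations involve many ingredients and nutrients and would use a general linear programming routine. The ingredient prices and protein contents below are invented for illustration.

```python
def least_cost_two_ingredients(cost_a, protein_a, cost_b, protein_b, min_protein):
    """Minimise cost_a*x + cost_b*(1 - x) subject to
    protein_a*x + protein_b*(1 - x) >= min_protein and 0 <= x <= 1.
    Returns (x, cost per unit of feed)."""
    low, high = 0.0, 1.0
    diff = protein_a - protein_b
    if diff > 0:
        low = max(low, (min_protein - protein_b) / diff)
    elif diff < 0:
        high = min(high, (min_protein - protein_b) / diff)
    elif protein_b < min_protein:
        raise ValueError("protein requirement cannot be met")
    if low > high:
        raise ValueError("protein requirement cannot be met")

    def cost(x):
        return cost_a * x + cost_b * (1.0 - x)

    # A linear objective on an interval attains its minimum at an end point.
    best_x = min((low, high), key=cost)
    return best_x, cost(best_x)


# Hypothetical ingredients: A costs 40 per kg with 60% protein,
# B costs 10 per kg with 12% protein; the diet needs at least 30% protein.
fraction_a, cost_per_kg = least_cost_two_ingredients(40.0, 60.0, 10.0, 12.0, 30.0)
print(fraction_a, cost_per_kg)
```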
Record keeping related to brood stock management would be better performed
on a microcomputer. Geneticists would like to incorporate new record entries into a
continuous database developed in database management
packages like dBase IV, FoxBASE, R:BASE and FoxPro etc. The support offered by
modern computer software technology in the field of genetics, ranging from
chromosome analysis to gene mapping or DNA sequencing, helps to hasten the
progress in gene technology. Gene technology supported by computer technology
has a much greater role to play in aquaculture research.
The growing science of aquatic microbiology, with reference to aquatic
productivity, organic decomposition, detoxification, biofermentation and other biotic
approaches to improve productivity, has been benefiting immensely from
bioinformatics based on computer application.
Work on the development of water quality models, biological models, physical
models, economic models etc. gives some of the mathematical representations which can
be derived empirically or mechanistically. The modelling and identification of control
processes and factors provide a major opportunity and an exciting challenge for
aquaculturists.
Investment requirements for aquaculture, profits, rates of return, growth rate, feed
requirement, mortality, stocking density, incidence of disease in culture operations etc.
are analysed through computers by a large set of built-in mathematical and statistical
functions developed in programming languages.
CONCLUSION<br />
Statistics proves a necessity when researchers contemplate advanced study
with the object of doing research. Statistical principles are involved in the efficient and
economic design of experiments as well as in the interpretation of the results.
Application of statistics is the modern mode of interpretation of scientific data and of drawing
right conclusions, eliminating probabilities and possibilities.
FUNDAMENTALS OF SAMPLING AND ITS APPLICATION IN
FISHERIES RESOURCE ESTIMATION
S. Chakraborty
Deputy Director of Fisheries
Govt. of West Bengal
Some basic sampling concepts and basic sampling techniques<br />
Population : It is defined as the collection or aggregate of all possible values of a
particular characteristic for a specified group of individuals.
Example: i) population of weights of all fishes in a pond.
ii) population of incomes of fishermen families in a State.
iii) population of fish lengths in a sea.
A population can be finite or infinite. It is said to be finite if it contains a finite no.
of individuals or units. Examples (i) and (ii) given above refer to finite populations. A
population of an unlimited or very large no. of individuals is called an infinite
population. Example (iii) above refers to an infinite population. The no. of individuals or
observations is called the population size and is usually denoted by 'N'.
Sample : A group of individuals or units that is chosen from a population is called a
sample. The no. of individuals or observations in a sample is called the sample size and is
generally denoted by 'n'.
Sampling frame : It is a list, map or other specification of units which constitutes the
available information regarding the population. It forms the basis for drawing a sample.
Random sampling : Random sampling is a method of sampling in which each
individual in a population has a preassigned chance of being included in the sample.
Generally units are drawn one by one from the population. If the chance of
selecting any unit at any draw is the same then the sampling is called simple
random sampling. A simple random sample can be obtained either by using the 'Lottery'
method or by the use of 'Random Number Tables'.
Lottery method : In this method, first number the individuals of the population. Then
write these numbers on identical chits and fold them so that the nos. are not visible.
Then place these chits in a box. Shake the box thoroughly and draw chits one by one
till the no. of chits drawn equals the sample size. Note down the nos. of those chits.
The individuals with these nos. form a sample.
Use <strong>of</strong>nndom numlran : Prepared tables <strong>of</strong> random nos. are nvaiiabba lor drawirig a<br />
rbnple random sample. These tables consist <strong>of</strong> series <strong>of</strong> digits fmm 1 lo 9 which<br />
appear Indewndent <strong>of</strong> each other and appear approximately aqua no, <strong>of</strong> times.<br />
As a first step, units Of the population are numbered from say 1 to N. From<br />
random no. tables, select a no. between 1 to N and include the unit bearing this no. in<br />
the sample. Continue this process till the no. <strong>of</strong> units included in the sample equals to<br />
the sample size. In this procedure nos. larger than N are not considered. To avoid<br />
rejection <strong>of</strong> such nos. 'Reminder approach' methods a n adopted which is described<br />
below.<br />
If N is a 'd' digited number, first determine the highest 'd' digited multiple of N. Let this be N'. Then a random number 'r' is selected from 1 to N'. Divide the selected 'r' by N and find the remainder. The unit with serial number equal to this remainder is selected. If the remainder is zero, the last unit (N) is selected.
Example: If N = 20, the highest 2 digited multiple of 20 is 80. Select a random number from 1 to 80. Let this number be 72. Division of this number by 20 gives a remainder of 12. Hence, the unit with serial number 12 is included in the sample. Select another number from 1 to 80 and repeat the procedure till the number of units selected equals the sample size.
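The remainder approach can be sketched in Python as follows. This is only an illustration and not part of the original lecture: the function names are hypothetical, Python's `random` module stands in for a printed random number table, and repeated units are skipped on the assumption that the sample is drawn without replacement.

```python
import random


def highest_d_digit_multiple(N):
    """Largest multiple of N with the same number of digits as N (N' in the text)."""
    d = len(str(N))
    return ((10 ** d - 1) // N) * N


def unit_from_random_number(r, N):
    """Map a random number r to a serial number 1..N using its remainder."""
    remainder = r % N
    # A zero remainder selects the last unit, N.
    return N if remainder == 0 else remainder


def remainder_sample(N, n, seed=None):
    """Draw n distinct serial numbers from 1..N by the remainder approach."""
    if not 0 < n <= N:
        raise ValueError("sample size must be between 1 and N")
    rng = random.Random(seed)          # stand-in for a random number table
    N_prime = highest_d_digit_multiple(N)
    chosen = []
    while len(chosen) < n:
        r = rng.randint(1, N_prime)
        unit = unit_from_random_number(r, N)
        if unit not in chosen:         # assumption: sampling without replacement
            chosen.append(unit)
    return chosen


print(highest_d_digit_multiple(20))     # 80, as in the example above
print(unit_from_random_number(72, 20))  # 12, as in the example above
print(remainder_sample(20, 5, seed=1))
```

Because N' is an exact multiple of N, every serial number from 1 to N corresponds to the same count of random numbers, so no unit is favoured.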
A sample survey is a vehicle for inductive reasoning. It provides for the transformation of observations on a part into conclusions regarding the whole. Taking samples is a procedure used in nearly all fisheries investigations, and from the sample taken we intend to generalise about the population under investigation. For example, by taking a sample of the catch from a vessel operated in a water body, we want to say something about the total catch of fish from it.
The basic sampling techniques are<br />
i) Simple Random sampling<br />
ii) Stratified sampling<br />
iii) Cluster sampling<br />
iv) Systematic sampling<br />
v) Two stage sampling<br />
In simple random sampling all units have an equal probability of being selected in the sample and every possible sample of the required size has the same chance of selection. The sample is drawn either by the lottery method or by a random number table.
In stratified random sampling, the population is divided into non-overlapping sub-populations called strata. A sample is then drawn from each stratum. The prime reasons for stratification are: (i) It ensures adequate representation of the various sub-divisions of the population. (ii) It may be convenient to break up the population into strata for better organization and supervision of field work. (iii) Considerable precision may be gained by dividing a heterogeneous population into homogeneous strata.
In cluster sampling the population is divided into groups or clusters of units. Several of the clusters are chosen at random and all units in each selected cluster become part of the sample. Cluster sampling is of immense use in fish catch surveys.
In two stage sampling we first select clusters, called 1st stage units, and then choose units, called 2nd stage units, from the selected clusters. For example, in estimating the yield of fish in a district, villages may be considered as 1st stage units and the ponds within a village as 2nd stage units.
In systematic sampling, the first unit is selected at random, the rest being selected according to a predetermined interval. In estimating marine landings or riverine landings, the systematic sampling technique is normally used.
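Systematic selection can be illustrated with a short Python sketch. The function name, the choice of interval k = N // n and the figures (120 landed units, 12 to be sampled) are assumptions made for illustration; they are not taken from the lecture.

```python
import random


def systematic_sample(N, n, seed=None):
    """Select n serial numbers from 1..N: random start, then a fixed interval."""
    if not 0 < n <= N:
        raise ValueError("sample size must be between 1 and N")
    k = N // n                                   # predetermined interval (assumed rule)
    start = random.Random(seed).randint(1, k)    # first unit chosen at random
    return [start + i * k for i in range(n)]


# Hypothetical landing: 120 units landed, 12 to be observed.
print(systematic_sample(120, 12, seed=0))
```

Only the starting unit is random; once it is fixed, the whole sample is determined by the interval.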
A reliable and sound data base is a prerequisite for proper planning and management of inland fisheries. At present, the available data base on inland fishery resources and their exploitation is inadequate and also suffers from various drawbacks in coverage, classification, the methodology of collection of fishery data and the estimation procedure. The statistical methodology described below, applied to the various inland fishery resources, may provide reliable estimates of resources as well as production. Inland fisheries are broadly classified into capture and culture fisheries, the former being the exploitation of natural populations and the latter involving intensive human intervention through stock control and management practices. The culture fishery resources are ponds/tanks (impounded water bodies), ox-bow lakes/beels and baors, brackish water fisheries and reservoirs. Rivers, estuaries and lagoons are the capture fishery resources.
Presently, an area approach is followed for estimation of inland fish catch, using 'acreage' and 'yield rate' data available through sample surveys. The area under the different culture fishery resources may be developed in the following manner.
Type of resource: Source of area data
1. Ponds/tanks (impounded water bodies): Culturable water area may be developed on the basis of complete enumeration or through sample survey based on sampling methodology.
2. Ox-bow lakes/Beels and Baors: Through settlement records.
3. Reservoir fisheries: Through the I & W Department of the respective State.
4. Brackish water fisheries: Through complete enumeration and also through implementation of the Fish Producers' Licensing Order.
ESTIMATION OF PRODUCTIVITY/CATCH FROM IMPOUNDED WATER RESOURCES
Impounded water bodies, viz. ponds and tanks, contribute appreciably to the total inland fish production, and the assessment of their contribution is prepared on the basis of sound sampling technique. The sampling technique and the estimation procedure described below provide precise and reliable estimates of productivity and fish production.
For estimating the fish catch from these resources, a State may be divided into three agro-climatic zones. The classification adopted here is on the basis of high, moderate and low rainfall, temperature, soil type etc.
From the high rainfall region, a set of three districts is selected at random for catch estimation, whereas two districts are selected from the moderate rainfall area and one district from the low rainfall area, in order to provide a larger sample where the concentration of water units is high and a smaller sample where it is low. Here, it is assumed that the sample districts represent the districts from which they are selected.
The sampling design for estimating the productivity/production of these resources is stratified three stage cluster sampling. A district is divided into three strata of approximately equal size in respect of water area/number of villages. A sample of six clusters of five villages each is selected from each stratum. Clusters of villages constitute the first stage units and the ponds within a cluster the second stage units. Selected villages are surveyed completely and all the water units in the village are enumerated.
The samples are selected by the following procedure. List all the villages in a district. The district is then divided into 3 strata such that the number of villages in each stratum is approximately equal. From each stratum, six villages, called key villages, are selected at random from the list of villages. A list of all the villages surrounding each key village is then prepared. From this list, 4 villages are selected randomly for each key village. In this way a sample of six clusters of five villages each is selected in a stratum for resource estimation.
For estimating the total catch of fish, five ponds/tanks are selected at random from the total number of ponds in each cluster. In case the number of ponds in a cluster is less than 5, all are taken in the sample for observation of catch. Thus, from each district a total of 90 villages are selected for estimating the water area under ponds and tanks and 90 ponds for estimating the catch/productivity of fish. Further, sampling in time is adopted so that each water unit is visited at least once a month by an investigator, both to record the catch from each pond more accurately and to provide estimates of monthly catches.
Estimation Procedure
$N_h$ = Total number of clusters in h-th stratum
$n_h$ = Number of sample clusters in h-th stratum
$M_{hij}$ = No. of ponds in the j-th village of i-th cluster in h-th stratum
$m_{hi}$ = No. of ponds selected from i-th cluster in h-th stratum
$X_{hij}$ = Total area under water units in the j-th village of i-th cluster in the h-th stratum
$x_{hik}$ = Area of the k-th selected pond in the i-th cluster of h-th stratum
$y_{hik}$ = Yield of the k-th selected pond in the i-th cluster of h-th stratum
$\bar{Y}_h$ = Average yield per cluster in h-th stratum
$\bar{y}_h$ = Average yield per hectare per year in h-th stratum
Estimators of area and number of ponds
Average number of ponds per cluster in h-th stratum: $\bar{M}_h = \frac{1}{n_h}\sum_{i=1}^{n_h} M_{hi}$, where $M_{hi} = \sum_j M_{hij}$
Total no. of ponds in the district is given by $M = \sum_h N_h \bar{M}_h$
Average area per cluster in h-th stratum: $\bar{X}_h = \frac{1}{n_h}\sum_{i=1}^{n_h} X_{hi}$, where $X_{hi} = \sum_j X_{hij}$
Total area in the district is $X = \sum_h N_h \bar{X}_h$
Estimators of yield
Average yield per cluster in h-th stratum: $\bar{Y}_h = \frac{1}{n_h}\sum_{i=1}^{n_h} M_{hi}\,\bar{y}_{hi}$, where $\bar{y}_{hi} = \frac{1}{m_{hi}}\sum_{k=1}^{m_{hi}} y_{hik}$
Total yield in the district: $Y = \sum_h N_h \bar{Y}_h$
Average yield per hectare in the district
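The yield estimators can be traced with a small Python sketch: the estimated yield of a cluster is the number of ponds in the cluster times the mean yield of the sampled ponds, the stratum average is the mean over sample clusters, and the total is the number of clusters times that average. The function names are hypothetical; the three cluster figures and the 349 clusters are those quoted in the worked example of this lecture, and everything else is an assumption for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def cluster_yield(ponds_in_cluster, sampled_pond_yields):
    """Estimated yield of one cluster: number of ponds x mean yield per sampled pond."""
    return ponds_in_cluster * mean(sampled_pond_yields)


def stratum_average(cluster_yields):
    """Average yield per cluster over the sample clusters of a stratum."""
    return mean(cluster_yields)


def total_over_strata(strata):
    """Sum of N_h x (average per cluster) over strata, given (N_h, average) pairs."""
    return sum(N_h * average for N_h, average in strata)


# Cluster yields quoted in the worked example; 349 clusters in the stratum.
average = stratum_average([2617.3, 2836.6, 3313.9])
print(round(average, 1))                              # 2922.6
print(round(total_over_strata([(349, average)]), 1))  # 1019987.4
```

With several strata, `total_over_strata` is simply given one `(N_h, average)` pair per stratum.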
ESTIMATION OF FISH CATCH FROM CAPTURE FISHERY RESOURCES
Rivers, streams, estuaries etc. constitute one of the important inland fishery resources in the State, spreading over thousands of kilometers and passing through mountains, valleys, plains and other areas. An appreciable quantity of fish is landed from these resources. The estimates of their contribution are prepared on the basis of sound statistical technique, and the procedure described below provides a reliable estimate of fish production.
Sampling Design, methods of data collection and estimation procedure
Capture fishery resources under rivers, streams etc. sustain a multigear and multispecies fishery exploited by artisanal fishermen operating in the area of the system. Most of the rivers have well established landing centres where fishermen land their catch. From the landing centres, data on fish catch etc. are collected by the field investigators.
The sampling design adopted is two stage stratified sampling involving stratification in space and time, viz. landing centres and days respectively.
The entire stretch is divided into homogeneous zones of landing centres, each zone having more or less the same type of gear and craft, fishing practices and species landed. From each zone a few (20%) landing centres are randomly selected. A month is divided into three sets of ten consecutive days. From the first set, two consecutive days are randomly selected, on which observations are taken at the selected centre. From the second and third sets of ten days each, clusters of two days are taken with a sample interval of ten days.
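The choice of observation days can be sketched in Python. The sketch assumes a 30-day month and a starting day between 1 and 9, so that both days of the first cluster fall inside the first set of ten days; the function name is hypothetical.

```python
import random


def observation_days(seed=None):
    """Return three two-day clusters: a random pair in days 1-10, then every ten days."""
    rng = random.Random(seed)
    first = rng.randint(1, 9)   # assumption: pair must end on or before day 10
    return [(first + offset, first + offset + 1) for offset in (0, 10, 20)]


# Prints three (day, next day) pairs, one in each set of ten days.
print(observation_days(seed=3))
```

Only the first pair is random; the pairs in the second and third sets follow from the ten-day interval.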
On the selected first day of observation in a landing centre, data are collected during 12.00 to 18.00 hrs. and on the second day during 6.00 to 12.00 hrs. Data on night landings, if any, in between these consecutive days are collected by enquiry on the second day. Thus in a two day cluster a 24 hour observation is taken. This forms a landing centre day, the first stage sampling unit. On the selected day of observation, if the number of units landed is 10 or less, then all the units are observed for gear wise catches. When it exceeds ten, a sample of not less than ten units is selected in a systematic way depending on the total number of units landed during the period of observation. Units landed form the second stage sampling unit, from which data on specieswise catch and the type of craft and gear operated are collected.
Estimation Procedure
Let n sample centres be selected from a population of N centres and let d be the number of sampling days.
$D_i$ = number of fishing days at the i-th centre in a month
$y_{ij}$ = catch of the i-th landing centre on the j-th selected day
$\bar{y}_i$ = mean yield of the i-th centre $= \frac{1}{d}\sum_{j=1}^{d} y_{ij}$
Then $\hat{Y}$ = estimate of total yield from all the centres $= \frac{N}{n}\sum_{i=1}^{n} D_i\,\bar{y}_i$
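A minimal Python sketch of this estimator follows: the mean daily catch at each sampled centre is raised by the number of fishing days at that centre, the results are summed over the sampled centres, and the sum is raised by N/n. The function name and all figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def estimate_total_catch(N, centres):
    """Estimate the monthly catch over all N centres.

    centres: one (D_i, catches) pair per sampled centre, where D_i is the
    number of fishing days in the month and catches lists the catch observed
    on each sampled day.
    """
    n = len(centres)
    raised = sum(D_i * sum(catches) / len(catches) for D_i, catches in centres)
    return N / n * raised


# Hypothetical data: 10 centres in the zone, 2 sampled, 2 sampled days each.
sampled = [(20, [10.0, 20.0]), (25, [30.0, 50.0])]
print(estimate_total_catch(10, sampled))  # 6500.0
```

Here the centre means are 15 and 40, giving 20 x 15 + 25 x 40 = 1300, which is raised by 10/2 to 6500.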
The data given below relate to three clusters of stratum-1 in the district of Midnapore, West Bengal, for estimating the total area under ponds and tanks. The total no. of clusters in the stratum is 349. The sampling methodology is stratified cluster sampling.
Cluster    Sl. No. of Village    No. of Ponds    Total Area
Compilation Procedure:
Total catch for 20 ponds in cluster 1 = $\sum_k y_{1k}$
Average catch/pond in cluster 1 = $\frac{1}{m_1}\sum_{k=1}^{m_1} y_{1k}$ = 1869.5/20 = 93.47
Average catch/pond in cluster 2 = 1189.5/13 = 91.50
Average catch/pond in cluster 3 = 1729.0/12 = 144.08
Average catch per cluster = $\bar{Y}_1 = \frac{1}{3}(2617.3 + 2836.6 + 3313.9) = 2922.6$
The following data relate to the estimates of area and variance in four strata of Midnapore district in the state of West Bengal for estimating the total area under ponds and tanks.
Stratum    $N_h$    $n_h$    $\bar{A}_h$    Estimated variance of $\bar{A}_h$    $\bar{M}_h$ (no. of ponds in village)
IV         634      3        0.1261         0.00134388                           98.67
Compilation Procedure:
Total no. of ponds = $M = \sum_h N_h \bar{M}_h$
Average area per pond = $\bar{A}$
Total area = number of ponds x average area per pond
= 491581 x 0.1722
= 84650.25 (ha)
The data given below are from one stratum of the district of Midnapore in West Bengal for estimating the catch. The total no. of clusters in the stratum is 349. The sampling procedure is two stage stratified cluster sampling.
Compilation Procedure:
We prepare the following table
Stratum    Cluster    Sl. No. of Village    Av. area per pond
1          1          1                     0.1200
                      2                     0.2987
                      3                     0.6180
                      4                     0.3740
                      5                     0.2750
$\bar{A}_{1i}$ is the average area of pond in the i-th cluster
Estimated variance of $\bar{A}_1$ ...... (2)
Total catch in stratum 1 = $\hat{Y} = N_1 \bar{Y}_1$
First the yield/hectare for each pond is calculated
Average yield per hectare for cluster 1
Average yield for cluster 2
Average yield for cluster 3
Average yield/hectare in stratum 1
Sampling Techniques Applied in Assessing Inland Fishery Resources
and Production
R. A. Gupta
Central Inland Capture Fisheries Research Institute
Barrackpore 743 101, West Bengal
India is endowed with very rich and potential inland fishery resources. These resources need to be judiciously exploited and managed in order to get sustainable yields on a long term basis. Decision makers need reliable data not only to assess the levels of exploitation of these resources but also for planning and formulating future strategies for the balanced development of inland fisheries. It is in this respect that we need sampling methodologies which may help to assess these resources in terms of area of coverage and production of fish from them. The nature, number and type of inland water bodies yielding fish are so large and diverse that it seems uneconomical to adopt any method employing total enumeration, which justifies the adoption of sampling methodologies for their assessment. The present lecture deals with the sampling methods most appropriate for assessment of inland fishery resources and production.
Before I embark upon the discussion of the sampling techniques used for fisheries assessment, I feel it necessary to list the types of resources used for inland fisheries and adopt some acceptable criterion for their classification depending on the modes and nature of exploitation of the different classes.
Classification of inland fisheries resource
A major bottleneck encountered in data collection is ambiguity in the use of concepts and terminologies in the definition, nomenclature and classification of the diverse resources in different states and union territories. To overcome this deficiency, a complete framework of concepts has been formulated on the basis of pilot studies conducted in various agro-climatic regions of the country in order to bring in uniformity at the national level. Inland fishery resources can be described in the following manner.
A. Fresh water resources such as:
1. Aquaculture ponds and tanks
2. Large irrigation tanks
3. Playas
4. Waterlogged areas
5. Rivers and canals
6. Ox-bow lakes/cut-off meanders
7. Reservoirs
8. Swamps
9. Quarries
10. Ash ponds
11. Excavations
B. Saline water
1. Lagoons
2. Estuaries
3. Creeks
4. Mangroves
5. Salt pans
6. Marshes
7. Other impoundments (Bherries etc.)
Many of the water bodies mentioned above contribute very marginally to the total fish production and hence may not be of much importance in formulating strategies for the purpose of production assessment. Hence the classes of water bodies with potential, which need coverage under catch assessment programmes, are classified below for the execution of the methodology in order to provide a firm, reliable and statistically sound data base on inland fisheries.
Group I: (Water bodies up to 10 ha water spread area at full tank level)
1. Aquaculture ponds and tanks
2. Brackish water impoundments
3. Waterlogged areas
Group II:
1. Large irrigation tanks
2. Reservoirs and check dams
3. Lakes and ox-bow lakes
Group III:
1. Rivers
2. Canals
3. Estuaries
4. Lagoons
5. Back waters
Separate sampling methods have been devised for estimation of resource area, fish production and other parameters of importance.
Sampling Procedure for Group I water bodies:
Ponds and Tanks: A stratified three stage sampling design (Cochran, 1962, Sukhatme et al., 1984 and Gupta et al., 1997) is adopted for assessment of water spread area and fish production. The entire state is divided into three nearly homogeneous groups called strata, keeping in view certain characteristics such as rainfall or soil conditions. Strata should be formed in such a way that geographical contiguity of districts within the strata is maintained. Districts within each stratum form the first stage units of selection, clusters of five pond-bearing villages form the second stage units of selection and ponds within clusters the third stage units of selection. The ultimate unit is selected in the following manner.
A sample of 20% of the districts is selected from each stratum, subject to a minimum of two districts being included in the sample within each stratum. A list of villages bearing ponds and tanks is then prepared and clusters of five villages are formed for further selection. A sample of 10% of the clusters (2nd stage) is selected from each sample district for estimation of pond area statistics. At the third stage of sampling, five ponds within each selected cluster are taken by simple random sampling for estimation of catch. However, locations where units are widely scattered and the formation of clusters is not beneficial may adopt simple random sampling.
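The sampling fractions above can be turned into sample sizes with a short Python sketch. The lecture does not say how a fractional result is rounded, so rounding up is an assumption here, as are the function names and the stratum of 23 districts.

```python
import random


def sample_size(total, percent, minimum=0):
    """Units to select: percent of total, rounded up (assumed), at least `minimum`."""
    size = (total * percent + 99) // 100   # integer ceiling of total * percent / 100
    return min(total, max(minimum, size))


def select_units(units, percent, minimum=0, seed=None):
    """Simple random sample (without replacement) of the required size."""
    rng = random.Random(seed)
    return rng.sample(units, sample_size(len(units), percent, minimum))


# Hypothetical stratum with 23 districts: 20% rounded up gives 5 districts.
districts = list(range(1, 24))
print(sample_size(23, 20, minimum=2))   # 5
print(sample_size(5, 20, minimum=2))    # 2, the minimum applies
print(select_units(districts, 20, minimum=2, seed=0))
```

The same helper covers the second stage, for example `sample_size(number_of_clusters, 10)` for the 10% sample of clusters.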
Notations:
Let
$N_h$ = Number of districts in h-th stratum
$n_h$ = Number of districts selected in h-th stratum
$M_{hi}$ = Number of clusters in i-th district
$m_{hi}$ = Number of clusters selected in i-th district
$M_{h0} = \sum_{i=1}^{N_h} M_{hi}$ = Total clusters in h-th stratum
$B_{hij}$ = Total number of ponds in j-th cluster of i-th district
$B'_{hij}$ = Number of ponds harvested in j-th cluster of i-th district
$b_{hij}$ = Number of ponds selected in j-th cluster of i-th district
$a_{hijk}$ = Area of k-th pond in j-th cluster of i-th district
$A_{hij}$ = Area of all waterbodies in j-th cluster of i-th district
$A'_{hij}$ = Area of all waterbodies harvested in j-th cluster of i-th district
(a) Estimation of total area (Two stage sampling)
Estimate of average area per cluster
$\hat{\bar{A}}_h = \frac{1}{n_h \bar{M}_h}\sum_{i=1}^{n_h} M_{hi}\,\bar{A}_{hi}$; where $\bar{A}_{hi} = \frac{1}{m_{hi}}\sum_{j=1}^{m_{hi}} A_{hij}$ and $\bar{M}_h = M_{h0}/N_h$ ...... (1)
Estimate of average area harvested per cluster
$\hat{\bar{A}}'_h = \frac{1}{n_h \bar{M}_h}\sum_{i=1}^{n_h} M_{hi}\,\bar{A}'_{hi}$; where $\bar{A}'_{hi} = \frac{1}{m_{hi}}\sum_{j=1}^{m_{hi}} A'_{hij}$
Estimate of average ponds per cluster
$\hat{\bar{B}}_h = \frac{1}{n_h \bar{M}_h}\sum_{i=1}^{n_h} M_{hi}\,\bar{B}_{hi}$; where $\bar{B}_{hi} = \frac{1}{m_{hi}}\sum_{j=1}^{m_{hi}} B_{hij}$
Estimate of average ponds harvested per cluster
$\hat{\bar{B}}'_h = \frac{1}{n_h \bar{M}_h}\sum_{i=1}^{n_h} M_{hi}\,\bar{B}'_{hi}$; where $\bar{B}'_{hi} = \frac{1}{m_{hi}}\sum_{j=1}^{m_{hi}} B'_{hij}$
Estimate of total ponds in h-th stratum
$\hat{B}_h = \hat{\bar{B}}_h \times M_{h0}$; where $M_{h0} = \sum_{i=1}^{N_h} M_{hi}$
Estimate of total ponds harvested in h-th stratum
$\hat{B}'_h = \hat{\bar{B}}'_h \times M_{h0}$ ...... (8)
(b) Estimation of fish yield (Three stage sampling):
Let
$y_{hijk}$ = Yield of k-th pond in j-th cluster of i-th district in h-th stratum
$x_{hijk}$ = Area of k-th pond in j-th cluster of i-th district in h-th stratum
Estimate of yield per pond in j-th cluster is
$\bar{y}_{hij} = \frac{1}{b_{hij}}\sum_{k=1}^{b_{hij}} y_{hijk}$ ...... (9)
Estimate of yield per cluster for i-th district is
$\bar{Y}_{hi} = \frac{1}{m_{hi}}\sum_{j=1}^{m_{hi}} B'_{hij}\,\bar{y}_{hij}$ ...... (10)
Estimate of yield per cluster for h-th stratum is
$\hat{\bar{Y}}_h = \frac{1}{n_h \bar{M}_h}\sum_{i=1}^{n_h} M_{hi}\,\bar{Y}_{hi}$, where $\bar{M}_h$ is the average number of clusters per district in the stratum ...... (11)
Similarly, the estimate of area per cluster based on the selected ponds, $\hat{\bar{X}}_h$, is obtained by using the pond areas $x_{hijk}$ in place of the yields $y_{hijk}$.
The above estimates assume that the $M_{hi}$'s and $B_{hij}$'s for the population are known.
Yield per hectare (Ratio estimate)
$\hat{R} = \hat{\bar{Y}}_h / \hat{\bar{X}}_h$
Estimate of total yield from h-th stratum based on the ratio estimate is
$\hat{Y}_h = \hat{R}\,A'_h$
where $A'_h$ = total area harvested under ponds and tanks in the stratum. This may be replaced by its estimate $\hat{A}'_h$.
The above estimate is efficient but biased. The bias will be negligible.
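The ratio estimate can be illustrated with a few lines of Python: the yield per hectare is the estimated yield per cluster divided by the estimated area per cluster (both from the selected ponds), and the stratum total is that ratio times the harvested area. The function names and the figures (1200 kg and 2.5 ha per cluster, 1000 ha harvested) are hypothetical.

```python
def yield_per_hectare(yield_per_cluster, area_per_cluster):
    """Ratio estimate: estimated yield per cluster / estimated area per cluster."""
    return yield_per_cluster / area_per_cluster


def stratum_total_yield(yield_per_cluster, area_per_cluster, harvested_area):
    """Total yield of a stratum: ratio estimate x total area harvested."""
    return yield_per_hectare(yield_per_cluster, area_per_cluster) * harvested_area


# Hypothetical stratum: 1200 kg and 2.5 ha per cluster, 1000 ha harvested in all.
print(yield_per_hectare(1200.0, 2.5))            # 480.0
print(stratum_total_yield(1200.0, 2.5, 1000.0))  # 480000.0
```

Because numerator and denominator come from the same sampled ponds, their sampling errors tend to move together, which is why a ratio of this kind is usually more stable than the yield estimate alone.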
Sampling Procedure for Group II water bodies:
Reservoirs, lakes, beels and large irrigation tanks: There is great variability with respect to the size and productivity of the various reservoirs in India. Hence, there is a strong case for classifying them into subgroups on the basis of area in order to make a reliable and accurate assessment of fish production. The following subclassification seems appropriate:
Small reservoirs (10 to 500 ha of water area at FRL)
Medium reservoirs (500 to 1000 ha of water area at FRL)
Large reservoirs (1000 ha and above)
As far as area statistics are concerned, a total inventory of resources under each stratum is made and then the following selection procedure is adopted for estimation of fish production.
The water bodies under each stratum are classified into the above three groups and a random sample of 20% of the water bodies from each group may be taken for survey by physical observation. Further classification on the basis of available information on the type of exploitation may also be made. This type of classification would be advantageous in making strategies for the collection of catch data on harvesting days. It is therefore suggested that the water bodies be sub-grouped into the following two categories.
1. Water bodies which are harvested during a short interval extending from a fortnight to about a month. These are mostly small reservoirs and lakes which fall under the purview of state departments, and exploitation is effected either by auctioning them to private contractors under certain terms and conditions or departmentally by engaging contract labour. Hence, the bulk of the harvest is a one time operation which continues for a fortnight to about a month. The catch of 20% of such water bodies, selected by simple random sampling, should be observed physically by the survey staff to cross check the authenticity of the catch records maintained by the agency.
2. Water bodies which are exploited round the year by fishermen cooperatives or individual fishermen on the basis of licenses, free fishing, royalty or any other such mode. A selection of 20% of the water bodies in each stratum is made by simple random sampling. Assessment of catch is undertaken for the selected water bodies in each stratum by adopting sampling villages as the second stage unit of selection. Each sampled village is then observed as per the scheme suggested for Group III for recording the data on catch.
Notations
Let
$N_{hl}$ = Total number of water bodies of the l-th sub-group in h-th stratum
$N'_{hl}$ = Number of water bodies harvested in h-th stratum
$n_{hl}$ = Number of water bodies selected from $N_{hl}$
$n'_{hl}$ = Number of selected water bodies which have been harvested among $n_{hl}$
$a_{hlj}$ = Area of j-th water body of the l-th sub-group in h-th stratum
$y_{hlj}$ = Yield of j-th water body of the l-th sub-group in h-th stratum
(The value of $y_{hlj}$ is obtained by recording the total fish catch in cases where the water body is harvested during a short interval of the year. However, for water bodies which are harvested during the entire year, as discussed in the sampling procedure, $y_{hlj}$ is estimated by further sampling as under.)
(1) U total fish catch is recorded at a centre on each sanlpling day :<br />
Average caich ar k-ill cerrtre per dzy<br />
1 1<br />
Yh,,k=qC Mh#,k, .G,kl wht,re h#l = -C) hfl,#,<br />
ht,k ' "lhlill<br />
b/rmaie oj average carclr at k-th crrrrrr cltrrr~~g rhc nrot~rlil~~ar<br />
brtma/e <strong>of</strong><br />
'hyk<br />
roral caich J - lh water bodp<br />
(1 5)<br />
(16)<br />
where<br />
y,,,, = yield <strong>of</strong> I-th day <strong>of</strong> k-111 centre at j-111 water body <strong>of</strong> I-111 sub-group<br />
DM, = Total fishing days in the k-th centre <strong>of</strong>j-th water body during the montNyear<br />
d,,,, = sample days selected out <strong>of</strong> D, during the mont Wyear<br />
(MontMyear will depend on whelher estimates are prepared n~onthly or yearly)<br />
Mkuk, = Total nets operated on I-th day <strong>of</strong> k-th centre at j-th water body <strong>of</strong> I-th subgroup<br />
~s, = Total nets sampled on I-th day <strong>of</strong>k-th centre at j-th water body <strong>of</strong> I-th sub-group<br />
(2) If fish catch is recorded by observing a further sample of a few gears out of the total
gears used on the sampling day:
Average yield per selected water body of the l-th sub-group in the h-th stratum
Similarly, average area per water body is
Estimate of yield per hectare (ratio estimate)
Estimate of total yield (on the basis of total harvested area)
Estimate of total fish production for the state under Group-II is given by
Sampling Procedure for Group - III water bodies:
Stratified two-stage sampling is adopted for this group. A list of fishing villages
is prepared beforehand, and then a simple random sample of 20% of the villages from
each group is selected for observation of catch by the following procedure.
Each selected centre/fishing village is physically observed on two consecutive
days in each of the first and second fortnights of the month. On the first selected day of
sampling at a centre, data are collected from 1200 to 1800 hrs, and on the second day from
0600 to 1200 hrs. Data on night landings, if any, between the two consecutive days are
collected by enquiry on the second day. The information should be collected from the
fishermen by both enquiry and physical observation. On the second day of observation
the investigator should collect information on the total number of fishing units operated
on that day, the fishing units sampled out of the total, the total catch landed from the
observed units and the species composition. He should also ascertain the number of fishing
holidays for each type of fishing unit since the last sampling day. However, the number of sampling
days in a month may be increased depending on the available resources and the units'
potential in fish landings.
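The first-stage selection described above (a simple random sample of 20% of the listed villages within each group) can be sketched as follows. This is only an illustration: the function name `select_villages`, the village labels, the fixed seed and the choice to round the sample size upwards are assumptions, not part of the procedure as stated.

```python
import random

def select_villages(villages_by_stratum, percent=20, seed=None):
    """Draw a simple random sample (without replacement) of villages in each stratum."""
    rng = random.Random(seed)
    sample = {}
    for stratum, villages in villages_by_stratum.items():
        # Integer ceiling of percent% of the list; rounding up is an assumption.
        size = max(1, (len(villages) * percent + 99) // 100)
        sample[stratum] = rng.sample(villages, size)
    return sample

# Hypothetical sampling frame: three strata with 10, 15 and 25 listed villages.
frame = {
    1: [f"V1-{i}" for i in range(1, 11)],
    2: [f"V2-{i}" for i in range(1, 16)],
    3: [f"V3-{i}" for i in range(1, 26)],
}
chosen = select_villages(frame, seed=1)
```

Each selected village would then be visited on the paired observation days of every fortnight, as laid down in the procedure.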
$N_h$ = number of landing centres/fishing villages in the h-th stratum (h = 1, 2, 3)
$n_h$ = number of landing centres/fishing villages selected in the h-th stratum
$G_{hi}$ = types of nets/gears used in the i-th village
$D_{hij}$ = number of fishing days during the month of the j-th type net in the i-th village of the h-th
stratum ($j = 1, 2, \ldots, G_{hi}$; $i = 1, 2, \ldots, N_h$)
$d_{hij}$ = number of sample days during the month of the j-th type net in the i-th village of the h-th
stratum ($j = 1, 2, \ldots, G_{hi}$; $i = 1, 2, \ldots, n_h$)
$M_{hijk}$ = number of j-th type nets operated on the k-th day in the i-th village of the h-th stratum
$m_{hijk}$ = number of j-th type nets observed on the k-th day in the i-th village of the h-th stratum
$y_{hijk}$ = fishing yield of each unit of the j-th type net on the k-th day in the i-th village of the h-th stratum
Average catch per unit (centre-wise)
Estimate of average catch of the j-th type net per day
Estimate of average catch per centre
Total monthly catch in the h-th stratum is
References
1. Cochran, W.G., 1962. Sampling Techniques. Wiley Eastern Limited, New Delhi &
Bangalore.
2. Sukhatme, P.V., Sukhatme, B.V., Sukhatme, S. and Asok, C., 1984. Sampling Theory
of Surveys with Applications. Iowa State University Press, Ames, Iowa
(USA) and Indian Society of Agricultural Statistics, New Delhi.
3. Gupta, R.A., Mandal, S.K. and Mazumdar, S., 1997. Methods of Collection of Inland
Fisheries Statistics in India. Central Inland Capture Fisheries Research
Institute, Barrackpore. Bull. No. 77.
CORRELATIONS AND REGRESSIONS
A.V. Suriya Rao
Central Rice Research Institute
Cuttack
CORRELATION<br />
When information on two or more variables is processed, it is natural to ask
whether any functional relation exists among these variables. If such a
relationship exists, the next question is how closely
the variables are associated. In other words, we seek the degree of association among
the variables.
The techniques developed to measure the degree of association among
variables are known as correlation methods, and an analysis performed to
determine the amount of correlation together with its level of significance is known as
correlation analysis. The resulting measures of correlation are known as correlation
coefficients. The simple linear correlation coefficient between two variables is denoted as r.
When more than two variables occur, the correlation coefficient is denoted as R and is
known as the multiple correlation coefficient.
The simple linear correlation coefficient r between two
variables, say X and Y, is computed as:

$$r = \frac{\mathrm{Cov}(X, Y)}{\sqrt{(\mathrm{Var}\,X)(\mathrm{Var}\,Y)}}, \quad \text{i.e.} \quad r = \frac{\sum xy}{\sqrt{(\sum x^2)(\sum y^2)}}$$

where $x = X - \bar{X}$, $y = Y - \bar{Y}$, $\bar{X} = \sum X / n$ and $\bar{Y} = \sum Y / n$.
The value of the correlation coefficient lies between -1 and +1, and it has no unit.
When the value of the correlation coefficient is equal to 0, we say that there is no linear
association between the variables. If the correlation coefficient is negative,
the two variables are negatively associated, which means that
a positive change in one variable is associated with a negative change in the
other; a value of -1 indicates a perfect negative linear association. When the correlation
coefficient is positive, the variables are positively associated, that is, both variables change
in the same direction; a value of +1 indicates a perfect positive linear association.
A correlation coefficient of zero does not indicate the
absence of any relationship between two variables, because the two variables may
have a nonlinear relationship. This is the reason why the term simple linear
correlation coefficient is preferred to correlation coefficient.
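The two points made above (r is computed from deviations about the means, and r = 0 does not rule out a nonlinear relationship) can be illustrated with a minimal sketch. The data are hypothetical and the function name `pearson_r` is an illustrative choice.

```python
import math

def pearson_r(xs, ys):
    """Simple linear correlation coefficient from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: an exact straight line and an exact parabola.
x = [-2, -1, 0, 1, 2]
linear = [3 + 2 * v for v in x]       # Y = 3 + 2X
parabola = [v ** 2 for v in x]        # Y = X^2

r_linear = pearson_r(x, linear)       # perfect positive linear association
r_parabola = pearson_r(x, parabola)   # exact relationship, yet no linear association
```

Here `r_linear` equals 1 and `r_parabola` equals 0, although Y is completely determined by X in both cases.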
The significance of the simple linear correlation coefficient is tested by comparing the
computed r value with the tabular r value at n-2 degrees of freedom, where n stands for
the number of observations with which the computation is performed. The simple linear
correlation coefficient r is declared to be significant at the α level of significance if the
absolute value of the computed r value is greater than the tabular r value at the α level
of significance and n-2 degrees of freedom. Significance here means that
the linear correlation coefficient r is different from zero.
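Where a table of critical r values is not at hand, the same decision can be reached through the t distribution, using the standard identity $t = r\sqrt{n-2}/\sqrt{1-r^2}$ with n-2 degrees of freedom. The sketch below is an illustration under that assumption; the values of r and n are hypothetical, and 2.306 is the two-sided 5% t value for 8 degrees of freedom taken from standard t tables.

```python
import math

def t_from_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

def critical_r(t_crit, n):
    """Tabular r implied by a tabular t value at n - 2 degrees of freedom."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)

# Hypothetical example: r = 0.70 computed from n = 10 pairs of observations.
r, n = 0.70, 10
t_value = t_from_r(r, n)
r_tab = critical_r(2.306, n)      # two-sided 5% point for 8 degrees of freedom
significant = abs(r) > r_tab
```

Comparing |r| with `r_tab` and comparing |t| with the tabular t value always lead to the same conclusion, since one is a monotone transformation of the other.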
In the case of more than two variables, the linear correlation coefficient is known as the
multiple correlation coefficient and is designated as R. The significance of R is
assessed by an F-test with p and n-p-1 degrees of freedom, where p is the number of
independent variables under study.
Closely related to multiple correlation is partial correlation. By partial
correlation we mean the correlation between two variables in a multivariable
problem after any common association with the remaining variables has
been eliminated. For example, a first order partial correlation coefficient
measures the degree of linear association between two variables after taking into
account their common association with a third variable.
If there are three variables, say 1, 2 and 3, we can have three simple linear
correlation coefficients, i.e. $r_{12}$, $r_{13}$ and $r_{23}$. The partial correlation coefficient between two
variables, say 1 and 2, when the third variable 3 is held constant, i.e. taking into account
the common association with variable 3, is written symbolically as:

$$r_{12.3}$$

The partial correlation between two variables when the third is held constant is
also known as the first order partial correlation coefficient. Similarly, the second order partial
correlation coefficient can be written symbolically as

$$r_{12.34}$$

which measures the association between variables 1 and 2 independent of
variables 3 and 4.
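As a rough illustration of how a first order partial correlation is obtained from the three simple correlations, the sketch below uses the standard textbook expression $r_{12.3} = (r_{12} - r_{13} r_{23}) / \sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}$, which is not written out in the text above. The same recursion applied to first order partials gives the second order coefficient. The numerical values are hypothetical.

```python
import math

def partial_r(r12, r13, r23):
    """First order partial correlation r12.3 from three simple correlations."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

def partial_r_second(r12_3, r14_3, r24_3):
    """Second order partial correlation r12.34 from first order partials."""
    return partial_r(r12_3, r14_3, r24_3)

# Hypothetical simple correlations among variables 1, 2 and 3.
r12, r13, r23 = 0.80, 0.60, 0.50
r12_3 = partial_r(r12, r13, r23)   # about 0.72 for these values
```

In this example part of the association between variables 1 and 2 is shared with variable 3, so the partial coefficient is smaller than the simple coefficient of 0.80.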
REGRESSION<br />
When two or more variables are related to each other, we seek not only a
mathematical function which tells us how the variables are associated, but also
how precisely the value of one variable can be predicted if we know the
value(s) of the associated variable(s). The techniques used to accomplish these
objectives are known as regression methods. Regression methods are used to
determine the best functional relation among the variables.
Regression procedures can be classified according to the number of variables
involved and the form of the functional relationship between the dependent and
independent variables. The procedure is termed simple if only two variables (one
independent and one dependent variable) are involved, and multiple if more than two
variables are involved. If the relationship is linear, the procedure is termed
linear; otherwise it is nonlinear. Thus regression analysis can be classified into four
types as follows.
* Simple linear regression
* Multiple linear regression
* Simple nonlinear regression and
* Multiple nonlinear regression
LINEAR REGRESSION<br />
For simple linear regression analysis to be applicable, the following conditions
must hold:
* There is only one independent variable (denoted as X) affecting the dependent
variable (denoted as Y).
* The relationship between Y and X is known, or can be assumed, to be linear.
To compute the regression equation, the regression coefficient b and the intercept
(constant) a must be estimated, for which one variable is assumed to be dependent
and the other independent. As a general practice, the variable designated as X is
taken as independent and the variable designated as Y as dependent on X. The
regression coefficient b and the intercept a are then computed as follows:

$$b = \frac{\sum xy}{\sum x^2} \quad \text{and} \quad a = \bar{Y} - b\bar{X}$$

where the notations have their usual meanings.
For testing the significance of b, the t-test is employed. The test
examines whether or not the coefficient b is different from zero. Since the t-test is
based on the normal distribution, it is necessary that the observed samples come from
a normally distributed population. The generalized t-test is given by:

$$t = \frac{\text{difference}}{\text{standard error of the difference}}$$

The standard error of b (denoted as $s_b$) is given by:
For testing whether or not the intercept a is different from zero (i.e. whether the regression
line passes through the origin), the formula is given by:
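The estimates and their t-tests can be sketched as follows. The standard errors used here are the usual textbook expressions, $s_b = \sqrt{s^2_{y.x} / \sum x^2}$ and $s_a = \sqrt{s^2_{y.x}\,(1/n + \bar{X}^2 / \sum x^2)}$ with residual mean square $s^2_{y.x} = (\sum y^2 - b \sum xy)/(n-2)$; they are assumed here rather than taken from the text, and the data are hypothetical.

```python
import math

def simple_linear_regression(xs, ys):
    """Estimate a and b of Y = a + bX and their t statistics (n - 2 d.f.)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)                 # corrected SS of X
    syy = sum((y - y_bar) ** 2 for y in ys)                 # corrected SS of Y
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx                                           # regression coefficient
    a = y_bar - b * x_bar                                   # intercept
    resid_ms = (syy - b * sxy) / (n - 2)                    # residual mean square
    s_b = math.sqrt(resid_ms / sxx)
    s_a = math.sqrt(resid_ms * (1 / n + x_bar ** 2 / sxx))
    return {"a": a, "b": b, "s_b": s_b, "s_a": s_a,
            "t_b": b / s_b, "t_a": a / s_a, "df": n - 2}

# Hypothetical observations.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_linear_regression(x, y)
```

The values `fit["t_b"]` and `fit["t_a"]` are compared with the tabular t value at `fit["df"]` degrees of freedom; a non-significant `t_a` is consistent with a line passing through the origin.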
Although the assumption of a linear relationship between any two variables in
biological material seldom holds over a wide range, it is usually adequate within a relatively small range
of values of the independent variable. For example, the growth rate (as measured by
weight or size) of living individuals is rapid at a younger age and remains static or declines
considerably as the individuals become older.
The relationship between any two variables is linear if the rate of change is constant
throughout the entire range under study. Mathematically, the equation of a straight line
is given as:
Y = a + bX where,
Y is the dependent variable; a is the intercept (a constant);
b is the slope (regression coefficient) and
X is the independent variable
The graphical representation of a linear relationship is a straight line, that is, the
shortest distance between any two points:
(Graphical representation of a straight line)
The value of the dependent variable Y corresponding to a given value of the
independent variable X (within the range of X values) can be determined by using the above
mathematical representation.
When there is more than one independent variable, say p independent
variables ($X_1, X_2, \ldots, X_p$), the simple linear functional form of the
equation is as follows:

$$Y = a + b_1 X_1 + b_2 X_2 + \cdots + b_p X_p$$

where a is the intercept (a constant, and the value of Y when all X's are zero) and the
$b_i$'s are partial regression coefficients associated with the independent variables $X_i$.
Each $b_i$ represents the amount of change in Y for each unit change in $X_i$.
When the values of the $b_i$'s are not zero, Y depends on the $X_i$'s.
Hence tests of significance of the $b_i$'s, which determine whether or not $b_i = 0$, are
essential for regression analysis. Sometimes we may also wish to test the
significance of the intercept, to know whether or not $a = a_0$, where $a_0$ is any value
specified by us. For example, we may wish to determine whether Y = 0 when all the $X_i$'s in the
equation are zero, which is the same as checking whether the line passes through the origin.
For this, we must test whether or not a = 0.
Homogeneity of Regression Coefficients:
When several linear regressions are estimated (for example, under different
environments), it is usually important to determine whether the regression
coefficients (slopes) of the regression lines differ from each other. This is known as
testing the homogeneity of regression coefficients.
The concept of homogeneity of regression coefficients is closely related to
the interaction effects among different factors in Analysis of Variance. Regression lines
having equal slopes (non-significant differences among the b's) are parallel to each other, indicating that
there is no interaction effect among the factors. It is to be noted that homogeneity of
regression coefficients does not imply equivalence of regression lines. For two or more regression
lines to coincide, both the intercepts and the slopes must be homogeneous.
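One common way to carry out this test, sketched here under stated assumptions, compares the residual sum of squares about a single common slope with the sum of the residual sums of squares of the separate lines. The resulting ratio is referred to the F distribution with (k-1, Σn-2k) degrees of freedom for k lines. The function names and data are hypothetical.

```python
def corrected_sums(xs, ys):
    """Corrected sums of squares of X and Y and corrected sum of products."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxx, syy, sxy

def slope_homogeneity_f(groups):
    """groups: list of (xs, ys) pairs, one pair per regression line."""
    k = len(groups)
    total_n = sum(len(xs) for xs, _ in groups)
    within = pooled_xx = pooled_yy = pooled_xy = 0.0
    for xs, ys in groups:
        sxx, syy, sxy = corrected_sums(xs, ys)
        within += syy - sxy ** 2 / sxx      # residual SS of this line
        pooled_xx += sxx
        pooled_yy += syy
        pooled_xy += sxy
    common = pooled_yy - pooled_xy ** 2 / pooled_xx   # residual SS about a common slope
    df1, df2 = k - 1, total_n - 2 * k
    f_value = ((common - within) / df1) / (within / df2)
    return f_value, (df1, df2)

# Hypothetical data: lines a and b are parallel, line c is steeper.
x = [1, 2, 3, 4]
line_a = [2.1, 3.9, 6.1, 7.9]
line_b = [v + 5 for v in line_a]
line_c = [1.0, 4.1, 6.9, 10.0]
f_parallel, df = slope_homogeneity_f([(x, line_a), (x, line_b)])
f_different, _ = slope_homogeneity_f([(x, line_a), (x, line_c)])
```

For the parallel pair the F value is essentially zero even though the intercepts differ by 5 units, which illustrates that homogeneous slopes do not imply coincident lines.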
MULTIPLE LINEAR REGRESSION<br />
Regression analysis involving more than one independent variable is called
multiple regression analysis. When all independent variables are assumed to affect the
dependent variable in a linear fashion and independently of each other, the procedure is
called multiple linear regression analysis.
Multiple regression analysis involves estimation and test of significance of
p+1 parameters ($a, b_1, b_2, \ldots, b_p$) by means of the F-test employing Analysis of Variance
(ANOVA). The structure of ANOVA for regression analysis is as follows:

SOURCE                      D.F.     SS                               MSS                 F
Due to regression           p        RSS = $\sum b_i (\sum x_i y)$    A = RSS/p           A/B
Deviation from regression   n-p-1    ESS = $\sum y^2$ - RSS           B = ESS/(n-p-1)
Total                       n-1      $\sum y^2$

The coefficient of determination $R^2$ is computed as $R^2 = \mathrm{RSS} / \sum y^2$, which measures the
contribution of the linear function of the p independent variables to the variation in the dependent variable Y;
its square root, R, is the multiple correlation coefficient. The coefficient of
determination is generally expressed as a percentage, which indicates the share of the total variation in the
dependent variable contributed by the independent variables.
The computed F value is compared with the tabular F value in the variance ratio table
of Fisher & Yates at p and n-p-1 degrees of freedom. If the computed value of F is greater
than the tabular F value, we say that the estimated multiple linear regression is
significant at the specified level of significance. Generally, the 5% (P=0.05) and 1% (P=0.01)
levels of significance are used for agricultural experiments.
The significance of a linear regression indicates that some portion of the variability
in the dependent variable Y is explained by the linear function of the independent variables $X_i$.
The coefficient of determination, denoted as $R^2$ (the square of the multiple correlation coefficient), provides
information on the size of that portion. Hence, the higher the value of $R^2$, the better the
regression equation explains the variation in Y. On the other hand, if the value of $R^2$ is low,
the regression equation may be of little use to the
experimenter even if the F test is significant. It is also true that the value of $R^2$ increases with the number of
independent variables. Care should be taken to discard variables which are highly
correlated among themselves. The analysis becomes cumbersome when the number of independent
variables increases considerably.
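The ANOVA for a multiple linear regression and the resulting $R^2$ can be sketched in plain Python as below. This is a minimal illustration, not a substitute for a statistical package: the partial regression coefficients are obtained by solving the normal equations in deviation form with a small Gauss-Jordan routine, and the function names and data are hypothetical.

```python
def solve(A, rhs):
    """Solve the square system A b = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def regression_anova(x_cols, y):
    """x_cols: list of p columns, one list of observations per independent variable."""
    n, p = len(y), len(x_cols)
    y_bar = sum(y) / n
    means = [sum(col) / n for col in x_cols]
    yd = [v - y_bar for v in y]
    xd = [[v - m for v in col] for col, m in zip(x_cols, means)]   # deviations from means
    S = [[sum(a * b for a, b in zip(xd[i], xd[j])) for j in range(p)] for i in range(p)]
    sxy = [sum(a * b for a, b in zip(xd[i], yd)) for i in range(p)]
    b = solve(S, sxy)                                  # partial regression coefficients
    a = y_bar - sum(bi * m for bi, m in zip(b, means)) # intercept
    total_ss = sum(v * v for v in yd)
    rss = sum(bi * s for bi, s in zip(b, sxy))         # SS due to regression
    ess = total_ss - rss                               # SS of deviations from regression
    f_value = (rss / p) / (ess / (n - p - 1))
    return {"a": a, "b": b, "RSS": rss, "ESS": ess, "F": f_value,
            "R2": rss / total_ss, "df": (p, n - p - 1)}

# Hypothetical data generated close to Y = 1 + 2*X1 + 0.5*X2.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [4.1, 5.4, 9.0, 10.6, 13.9, 15.5]
fit = regression_anova([x1, x2], y)
```

The computed `fit["F"]` is compared with the tabular F value at `fit["df"]` degrees of freedom, and `100 * fit["R2"]` gives the percentage of variation in Y accounted for by the regression.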
Two important points are to be kept in mind while going for linear regression
analysis.
* The effect of each independent variable on the dependent variable
should be linear. That is, the amount of change in Y per unit change in $X_i$ is constant
throughout the range of $X_i$ values under study.
* The effect of each $X_i$ on Y should be independent of the other X's.
Violation of either or both of the above points leads to what is known as a
nonlinear relationship.
SIMPLE NONLINEAR REGRESSION<br />
The functional relationship between two variables is said to be nonlinear if the rate of
change in the dependent variable Y per unit change in the independent variable X is not
constant. Such nonlinear relationships are quite common in biological organisms.
When the relationship among variables is not linear, linear regression analysis is
inadequate and one must go for nonlinear regression analysis.
A few mathematical models which are frequently encountered in applied
research are given below:
i) $Y = a b^X$
ii) $Y = a + b/X$
iii) $Y = a + bX + cX^2$
These nonlinear relationships can be made linear by transforming
one or more variables, after which the linear regression technique can be
applied.
Equation (i) can be made linear by taking logarithms of both sides, which gives
$\log Y = \log a + X \log b$. Similarly, equation (ii) can be made linear by taking 1/X as X'.
In the case of (iii), the additional term $cX^2$ is added to the equation of a straight line;
treating $X^2$ as a second independent variable makes the model linear in its coefficients.
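As an illustration of the transformation approach for model (i), the sketch below regresses log Y on X and converts the fitted intercept and slope back to a and b. The data are hypothetical and generated exactly from $Y = 2 \times 1.5^X$; all Y values must be positive for the logarithm to exist.

```python
import math

def fit_exponential(xs, ys):
    """Fit Y = a * b**X by simple linear regression of log Y on X."""
    log_y = [math.log(v) for v in ys]
    n = len(xs)
    x_bar = sum(xs) / n
    ly_bar = sum(log_y) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (ly - ly_bar) for x, ly in zip(xs, log_y))
    slope = sxy / sxx                    # estimates log b
    intercept = ly_bar - slope * x_bar   # estimates log a
    return math.exp(intercept), math.exp(slope)

# Hypothetical data following Y = 2 * 1.5**X exactly.
x = [0, 1, 2, 3, 4]
y = [2.0 * 1.5 ** v for v in x]
a_hat, b_hat = fit_exponential(x, y)
```

With real data the fit minimises squared deviations on the logarithmic scale, so the back-transformed curve is not the least squares fit on the original scale of Y.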
MULTIPLE NONLINEAR REGRESSION<br />
When the relationship between the dependent variable Y and a set of
independent variables $X_i$ is not linear, it is said to be a multiple nonlinear relationship.
A nonlinear relationship exists for either of the following reasons:
i) At least one of the independent variables exhibits a nonlinear relationship with
the dependent variable.
ii) Any two independent variables interact with each other.
The analytical procedure for nonlinear relationships becomes cumbersome as
the number of independent variables increases.
HOW TO FIND THE BEST FUNCTIONAL RELATIONSHIP<br />
Several techniques are available to search for the best functional relationship.
The most commonly used techniques for identifying the best relationship among
variables are (i) the scatter diagram technique, (ii) the test of significance technique and (iii)
the step-wise regression technique.
(i) Scatter diagram technique:
It is the simplest and most commonly used technique for determining the relationship
between two variables. In this technique, all pairs of values of the X and Y variables are
plotted (as dots) in the X-Y plane to get a scatter diagram. This diagram can be
examined to ascertain the pattern of the functional relationship.
(ii) Test of significance technique:
This technique is used to eliminate unnecessary variables from the regression
equation. Regression coefficients which are non-significant can
be dropped while obtaining the functional relationship.
(iii) Step-wise regression technique:
This technique is similar to the test of significance technique in that all
significant variables are included in the regression. Here the objective is achieved by
adding variables one at a time. It should be
kept in mind that some variables may be dropped while determining the functional
relationship even if they are closely associated with the dependent variable.
SOME MISUSES AND MISINTERPRETATIONS OF CORRELATION AND<br />
REGRESSION ANALYSIS<br />
Correlation and regression analysis is one of the most powerful tools in agricultural research, but it
leads to incorrect interpretation of results if misused. One of the most
common misuses is to generalize the
functional relationship beyond the data range, that is, to extrapolate the result
outside the range of the data in the independent variable. Generalization of a regression
beyond the original data is risky and should be attempted only with proper knowledge of the
biological phenomenon.
Another area of misuse of the functional relationship is the application of
generalized results for substitution. As far as practicable, the method of substitution should be
avoided. Only in limited cases where a wide range of variation exists can
substitution be employed.
Sometimes data from individual replications are employed to find the functional
relationship. Care should be taken to employ the mean data over all replications for
determining the functional relationship. If a significant difference among replications is
detected in the ANOVA, then data from individual replications can be considered in
determining the functional relationship.
In simple correlation analysis, a significant r is often taken to imply the presence
of a causal relationship between two variables. Although correlation analysis
quantifies the degree of association, it cannot provide the reason for such association.
A non-significant r value cannot be taken to imply the absence of any functional
relationship between two variables. Two variables may have a nonlinear relationship
even if the r value is non-significant.
ON SOME STATISTICAL PROCEDURES FOR ANALYSIS OF<br />
DATA FROM FIELD EXPERIMENTS<br />
G. R. Maruthi Sankar<br />
Central Research Institute for Dryland Agriculture, ICAR,
Santoshnagar, Hyderabad - 500 059
Correlation and regression techniques are used for assessing relationships among
variables and for prediction. The different measures of correlation, namely simple, partial,
multiple, rank, intra-class and the correlation ratio, are useful in different situations for
assessing the relationships among variables. Simple and multiple regressions are
useful for predicting a dependent variable from a set of independent
variables in different situations. The estimates of correlation and regression coefficients
are tested using different statistical tests of significance for valid inferences. The criteria
for model selection, comparison of models and sensitivity of regression are
discussed. Some of the problems in data analysis, such as multicollinearity and extreme
observations, are also discussed.
MEASURES OF CORRELATION<br />
1. Simple Correlation : It measures the relationship between two variables 'X' and 'Y'.
It ranges from -1 to +1.

$$r = \frac{\sum XY - (\sum X)(\sum Y)/n}{\sqrt{\left[\sum X^2 - (\sum X)^2/n\right]\left[\sum Y^2 - (\sum Y)^2/n\right]}}$$

2. Partial Correlation (first order) : It measures the partial relationship between
two variables 'Y' and 'X1' keeping the effect of a third variable 'X2' constant.
It ranges between -1 and +1.
3. Partial Correlation (second order) : It measures the partial relationship between
two variables 'Y' and 'X1' keeping the effects of two other variables 'X2' and 'X3'
constant. It ranges between -1 and +1.
4. Multiple Correlation : It measures the correlation of a dependent variable 'Y'
with a set of 2 or more independent variables 'X' together. It ranges between 0
and 1.
5. Rank Correlation : It measures the correlation between the ranks of two variables
instead of the actual observations of the variables. It ranges between -1 and +1.

$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

where $d_i$ is the difference in ranks.
6. Correlation Ratio : The correlation ratio 'η' is the appropriate measure of curvilinear
relationship when the relationship between two variables is not linear. If the
relation is linear then η = |r|, otherwise η > |r|. It ranges between 0 and 1.
7. Intra-class Correlation : Intra-class correlation means within-class correlation.
Here both the variables measure the same characteristic. It is the correlation within
a variable with respect to some common characteristic. For example, we may
work out the intra-class correlation between yields of plots. Suppose there are $A_1$,
$A_2, \ldots, A_n$ families with $k_1, k_2, \ldots, k_n$ members, and let $x_{ij}$ ($i = 1, 2, \ldots, n$;
$j = 1, 2, \ldots, k_i$) denote the measurement on the j-th member of the i-th family; then
the intra-class correlation can be given as
If $k_i = k$ (i.e., if all families have an equal number of members), then

$$r = \frac{1}{k-1}\left[\frac{k\,\sigma_m^2}{\sigma^2} - 1\right]$$

where $\sigma^2$ denotes the variance of X and $\sigma_m^2$ denotes the variance of the means of
families. The intra-class correlation ranges between $-1/(k-1)$ and 1.
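As an illustration of the rank correlation in item 5, the sketch below uses Spearman's formula $r_s = 1 - 6\sum d_i^2 / [n(n^2-1)]$, the standard textbook form, where $d_i$ is the difference between the two ranks of the i-th observation. The helper assumes there are no tied values, and the data are hypothetical.

```python
def ranks(values):
    """Rank 1 = smallest value; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_rank_correlation(xs, ys):
    """Rank correlation from the squared differences in ranks."""
    n = len(xs)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical observations on five units.
x = [10, 20, 30, 40, 50]
same_order = [1.2, 2.5, 3.1, 4.8, 9.9]       # ranks agree completely
reverse_order = [9.9, 4.8, 3.1, 2.5, 1.2]    # ranks completely reversed
mixed = [2, 1, 4, 3, 5]

r_same = spearman_rank_correlation(x, same_order)
r_reverse = spearman_rank_correlation(x, reverse_order)
r_mixed = spearman_rank_correlation(x, mixed)
```

Complete agreement of ranks gives +1 and complete reversal gives -1; only the ordering of the observations matters, not their actual values.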
MEASURES OF REGRESSION<br />
1. Simple Regression : It measures the functional relationship between a dependent
variable 'Y' and an independent variable 'X' with estimates of an intercept (α)
and a slope (β). The estimates of α and β can be negative, zero or positive. The
linear regression is given as

$$Y = \alpha + \beta X$$

2. Multiple Regression : If the dependent variable 'Y' is a function of a set of
independent variables 'X', then the regression coefficients of the
different variables (β) along with the intercept (α) are estimated using matrix
algebra. The multiple regression of 'Y' on the different independent variables
can be given as

$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k, \qquad \hat{\beta} = (X'X)^{-1} X'y$$

The regression coefficients can be negative, zero or positive and
measure the rates of change in 'Y' for a unit change in the independent variables.
3. Polynomial Regression : If the dependent variable 'Y' is a function of the linear and
higher order effects of an independent variable 'X', then a polynomial regression
is fitted to quantify the effects of the variable and its significance at different orders for
prediction of 'Y'. The n-th order polynomial regression of 'Y' can be given as

$$Y = \alpha + \beta_1 X_1 + \beta_2 X_1^2 + \beta_3 X_1^3 + \beta_4 X_1^4 + \cdots + \beta_n X_1^n, \qquad \hat{\beta} = (X'X)^{-1} X'y$$

The polynomial regression coefficients can be negative, zero or positive and
measure the rates of change in 'Y' for a unit change in the independent
variable.
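To make the matrix form concrete, the following shows how the matrices entering $(X'X)^{-1}X'y$ might be laid out for a polynomial regression. It is a sketch under assumed notation: N denotes the number of observations and $X_{1i}$, $Y_i$ the i-th observed values, symbols that are introduced here only for illustration.

```latex
\[
X =
\begin{pmatrix}
1 & X_{11} & X_{11}^{2} & \cdots & X_{11}^{n} \\
1 & X_{12} & X_{12}^{2} & \cdots & X_{12}^{n} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & X_{1N} & X_{1N}^{2} & \cdots & X_{1N}^{n}
\end{pmatrix},
\qquad
y =
\begin{pmatrix}
Y_{1} \\ Y_{2} \\ \vdots \\ Y_{N}
\end{pmatrix},
\qquad
\hat{\beta} =
\begin{pmatrix}
\hat{\alpha} \\ \hat{\beta}_{1} \\ \vdots \\ \hat{\beta}_{n}
\end{pmatrix}
= (X'X)^{-1} X'y
\]
```

The leading column of ones carries the intercept, and each further column is a power of the same variable, so the polynomial model is fitted exactly like a multiple linear regression. For a multiple regression the power columns are simply replaced by the columns of observations on $X_1, X_2, \ldots, X_k$.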
TESTS OF SIGNIFICANCE<br />
1. Testing significance in data of large samples : If X is distributed as Normal with
mean μ and variance σ², then Z = (X - μ)/σ is distributed as Normal with mean 0
and variance 1. If |Z| > 1.96, then the sample mean is inferred to be significantly
different from the population mean at the 5 % level of significance. If |Z| > 2.58, then the
sample mean is significantly different from the population mean at the 1 % level of
significance.
2. Testing a single proportion : Let P be the proportion of success and Q = 1 - P the proportion of
failure. If mean = nP and variance = nPQ, then $Z = (X - nP)/\sqrt{nPQ}$ is
distributed as Normal with mean 0 and variance 1. Same conclusions as above.
3. Testing difference of proportions : Let $p_1 = X_1/n_1$ and $p_2 = X_2/n_2$, with Mean($p_1$)
= $P_1$ and Mean($p_2$) = $P_2$, Variance($p_1$) = $P_1 Q_1/n_1$ and Variance($p_2$) = $P_2 Q_2/n_2$. Then

$$Z = \frac{(p_1 - p_2) - (P_1 - P_2)}{\sqrt{P_1 Q_1/n_1 + P_2 Q_2/n_2}}$$

is distributed as Normal with mean 0 and variance 1. Same conclusions as above.
4. Testing a mean in small samples : If X is distributed as Normal with mean μ and
variance σ², then $Z = (\bar{x} - \mu)/(\sigma/\sqrt{n})$ is distributed as Normal with mean 0
and variance 1. Same conclusions as above.
5. Testing differences of two means : If $\bar{x}_1$ and $\bar{x}_2$ are means and $\sigma_1^2$ and $\sigma_2^2$
are variances based on two samples with $n_1$ and $n_2$ observations, then

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$

is distributed as Normal with mean 0 and variance 1. Same conclusions as above.
6. Testing differences of standard deviations : If $s_1$ and $s_2$ are standard deviations
of two samples with $n_1$ and $n_2$ observations from Normal distributions with
variances $\sigma_1^2$ and $\sigma_2^2$, then

$$z = \frac{s_1 - s_2}{\sqrt{\sigma_1^2/(2n_1) + \sigma_2^2/(2n_2)}}$$

is distributed as Normal with mean 0 and variance 1. Same conclusions as above.
7. Testing the difference between the sample correlation (r) and the population
correlation (ρ) by making the Z-transformation can be given as

$$z = \frac{Z - \zeta}{1/\sqrt{n-3}}, \qquad Z = \log_e \sqrt{\frac{1+r}{1-r}}, \qquad \zeta = \log_e \sqrt{\frac{1+\rho}{1-\rho}}$$
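8. Testing the difference between two correlation coefficients '$r_1$' and '$r_2$' by<br />
making the z-transformation can be given as<br />
$$Z = \frac{z_1 - z_2}{\sqrt{1/(n_1 - 3) + 1/(n_2 - 3)}}$$<br />
where $z_1 = \log_e \sqrt{(1 + r_1)/(1 - r_1)}$ and $z_2 = \log_e \sqrt{(1 + r_2)/(1 - r_2)}$<br />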
9. Testing the observed correlation 'r' between two variables against 'zero' can be<br />
given as<br />
$$t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}}$$<br />
10. Testing the partial correlation 'r' between two variables keeping the effects of<br />
a third variable as constant can be given as<br />
11. Testing the regression coefficient (slope) '$\beta$' of an independent variable 'X' can<br />
be given as<br />
$$t = \frac{\beta}{\mathrm{SE}(\beta)}$$<br />
12. Testing the regression coefficient (intercept) 'a' can be given as<br />
$$t = \frac{a - a'}{\mathrm{SE}(a)}$$<br />
13. Testing the homogeneity of regression coefficients (slopes) of 'k' sets of data<br />
(or over different seasons) can be given as<br />
$$F = \frac{(G - B)/(k - 1)}{B/(\sum n - 2k)}$$<br />
This is distributed as F with $(k - 1, \sum n - 2k)$ degrees of freedom, where<br />
$G = D - E^2/C$<br />
B = sum of 'residual sum of squares' of k sets<br />
C = sum of 'corrected sum of squares' of X of k sets<br />
D = sum of 'corrected sum of squares' of Y of k sets<br />
E = sum of 'corrected sum of squares' of products of X and Y of k sets<br />
14. Testing the predictability value ($R^2$) of a regression model with 'k'<br />
independent regressor variables can be given as<br />
$$R^2 = \frac{SSR}{\sum y^2} = \frac{\beta_1 \sum x_1 y + \beta_2 \sum x_2 y + \dots + \beta_k \sum x_k y}{\sum y^2}$$<br />
where SSR is the sum of squares due to regression<br />
15. Testing the $R^2$ adequacy of a model with 'p' regressors compared to a model<br />
with 'q' regressors where p < q: the model with 'p' regressors is adequate if $R_p^2 > R_a^2$,<br />
where $R_a^2 = 1 - (1 - R_q^2)(1 + d)$, $d = qF/(n - q - 1)$,<br />
and F is the tabular value with $(q, n - q - 1)$ degrees of freedom<br />
16. Testing the sufficiency of a model with Residual Mean Square Ratio (RMSR)<br />
criterion: This is used for testing the sufficiency of a regression model with 'p'<br />
regressors compared to a model with 'q' regressors based on the F-test and can be<br />
given as<br />
$$F = \frac{(RSS_p - RSS_q)/(q - p)}{RMSS_q}$$<br />
where RSS is residual sum of squares; RMSS is residual mean sum of squares.<br />
This is distributed as F with $(q - p, n - q - 1)$ degrees of freedom
17. Percentage Relative Efficiency (PRE) of a regression model 'A' compared to a<br />
regression model 'B' can be given as<br />
$$PRE(A) = \frac{\sigma_B^2 (n + q + 1)/n}{\sigma_A^2 (n + p + 1)/n} \times 100$$<br />
where $\sigma_A^2$ and $\sigma_B^2$ are the residual mean sums of squares of models 'A' and 'B'<br />
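A few of the large-sample tests listed above can be sketched in Python as follows. This is only an illustrative sketch: the function names are hypothetical, the formulas are the usual textbook forms of the single-proportion, single-mean, two-mean and two-correlation tests, and the input figures are made up for the example.<br />

```python
import math


def z_single_proportion(x, n, p):
    """Z = (X - nP) / sqrt(nPQ) for x successes in n trials."""
    q = 1.0 - p
    return (x - n * p) / math.sqrt(n * p * q)


def z_single_mean(xbar, mu, sigma, n):
    """Z = (xbar - mu) / (sigma / sqrt(n)) with sigma assumed known."""
    return (xbar - mu) / (sigma / math.sqrt(n))


def z_two_means(xbar1, xbar2, var1, var2, n1, n2):
    """Z = (xbar1 - xbar2) / sqrt(var1/n1 + var2/n2)."""
    return (xbar1 - xbar2) / math.sqrt(var1 / n1 + var2 / n2)


def fisher_z(r):
    """z-transformation of a correlation: log_e sqrt((1 + r) / (1 - r))."""
    return 0.5 * math.log((1 + r) / (1 - r))


def z_two_correlations(r1, n1, r2, n2):
    """Difference of two transformed correlations over its standard error
    (assumed form: sqrt(1/(n1 - 3) + 1/(n2 - 3)))."""
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / se


def verdict(z):
    """Apply the 1.96 and 2.58 cut-offs used in the text."""
    if abs(z) > 2.58:
        return "significant at 1%"
    if abs(z) > 1.96:
        return "significant at 5%"
    return "not significant"


# Hypothetical example: 60 successes in 100 trials against P = 0.5
z = z_single_proportion(60, 100, 0.5)
print(z, verdict(z))
```

The same `verdict` helper applies to every statistic in the list that is referred to the Normal distribution with mean 0 and variance 1.<br />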
PROBLEMS OF REGRESSION<br />
1. Multicollinearity: High and significant correlations between different variables<br />
compared to the overall multiple correlation and predictability of a model. This<br />
will result in linear dependence of a variable on another variable and insensitive<br />
regression coefficients for prediction. The multicollinearity can be assessed by<br />
computing an estimate of $\chi^2$ and can be given as<br />
$$\chi^2 = -\left[n - 1 - \tfrac{1}{6}(2k + 5)\right] \log_e(D)$$<br />
where D = value of standardised determinant<br />
n = number of observations<br />
k = number of independent variables<br />
2. Examine the residuals for identifying 'outliers, high leverage and influential'<br />
observations, testing and deletion, and improving the predictability of models.<br />
The residuals can be examined in different forms as normalised, standardised,<br />
internally studentised and externally studentised residuals.<br />
Normalised : $a_i = f(e_i, e'e) = e_i/\sqrt{e'e}$<br />
Standardised : $b_i = f(e_i, \hat\sigma) = e_i/\hat\sigma$<br />
where $\hat\sigma$ is the square root of the residual mean square<br />
Internally studentised : $r_i = f(e_i, \hat\sigma \sqrt{1 - p_{ii}}) = e_i/(\hat\sigma \sqrt{1 - p_{ii}})$<br />
Externally studentised : $r_i^* = f(e_i, \hat\sigma_{(i)} \sqrt{1 - p_{ii}}) = e_i/(\hat\sigma_{(i)} \sqrt{1 - p_{ii}})$<br />
where $\hat\sigma_{(i)}$ is the estimate of $\sigma$ obtained with the i-th observation deleted and $p_{ii}$ is the i-th diagonal element of the projection (hat) matrix
3. Effects of variables on a model: Examine the effects of independent variables on<br />
the dependent variable for their sensitivity and significance, linear dependence<br />
and multicollinearity, lack of homogeneity of variances, randomness of<br />
independent variables, autocorrelation of variables, extreme nature of a variable in<br />
its relationships with other variables, violation of the assumption of normality and<br />
other aspects.<br />
4. Normal Distribution : The normal distribution can be given as<br />
$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-(x - \mu)^2/(2\sigma^2)}, \quad -\infty < x < \infty$$<br />
(a) The curve is bell shaped and symmetrical about the line $x = \mu$<br />
(b) Mean, median and mode coincide<br />
(c) As x increases numerically, f(x) decreases rapidly, the maximum probability<br />
occurring at the point $x = \mu$<br />
(d) $\beta_1 = 0$ (skewness) and $\beta_2 = 3$ (kurtosis)<br />
(e) A linear combination of independent normal variates is also a normal variate<br />
(f) Area property : $P(\mu - \sigma < X < \mu + \sigma) = 0.6826$<br />
$P(\mu - 2\sigma < X < \mu + 2\sigma) = 0.9544$<br />
$P(\mu - 3\sigma < X < \mu + 3\sigma) = 0.9973$<br />
(g) The x-axis is an asymptote to the normal curve<br />
(h) The points of inflexion of the curve are given by $x = \mu \pm \sigma$, $f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-1/2}$<br />
(i) Mean deviation about the mean is $\sqrt{2/\pi}\,\sigma \approx (4/5)\sigma$<br />
(j) Quartile deviation $= (Q_3 - Q_1)/2 \approx (2/3)\sigma$<br />
(k) $\mu_{2r+1} = 0$ (r = 0, 1, 2, ...)<br />
$\mu_{2r} = 1 \cdot 3 \cdot 5 \cdots (2r - 1)\,\sigma^{2r}$ (r = 0, 1, 2, ...)<br />
This implies $\mu_1 = 0$, $\mu_3 = 0$, $\mu_5 = 0$, ... (odd moments)<br />
and, for the standard normal variate ($\sigma = 1$), $\mu_2 = 1$, $\mu_4 = 3$, ... (even moments)
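The area property and the two deviation measures can be checked numerically. The sketch below is illustrative only (the function name `area_within` is hypothetical); it relies on the identity $P(|X - \mu| < k\sigma) = \mathrm{erf}(k/\sqrt{2})$ and on Python's standard `statistics.NormalDist`.<br />

```python
import math
from statistics import NormalDist


def area_within(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normal variate."""
    return math.erf(k / math.sqrt(2))


# Areas within 1, 2 and 3 standard deviations of the mean
for k in (1, 2, 3):
    print(k, round(area_within(k), 4))

# Mean deviation about the mean, in units of sigma (close to 4/5)
mean_deviation = math.sqrt(2 / math.pi)

# Quartile deviation (Q3 - Q1)/2, in units of sigma (close to 2/3)
quartile_deviation = NormalDist().inv_cdf(0.75)

print(round(mean_deviation, 4), round(quartile_deviation, 4))
```

The computed areas agree with the tabulated figures to within one unit in the fourth decimal place.<br />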
FUNDAMENTALS OF DESIGN AND ANALYSIS OF FIELD EXPERIMENTS<br />
WITH A NOTE ON TRANSFORMATION OF DATA<br />
Ravi R. Saxena and A. K. Roy*<br />
Indira Gandhi Agricultural University<br />
Raipur-492012, M.P.<br />
Bioinformatics Centre<br />
CIFA, Kausalyaganga, Bhubaneswar-751002, Orissa<br />
INTRODUCTION<br />
Experience has shown that proper consideration of statistical analysis before<br />
the experiment is conducted forces the experimenter to plan the design of the<br />
experiment more carefully. The observations obtained from an experiment that is carefully<br />
planned and well designed in advance give entirely valid inferences. The subject matter of the design<br />
of experiments includes:<br />
- planning of the experiment<br />
- obtaining the relevant information from it regarding the statistical hypothesis<br />
under study<br />
- making statistical analysis of the data<br />
Some important definitions<br />
Experiment: An experiment is a device or a means of getting an answer to the problem<br />
under consideration, e.g. comparison of different doses of feed or different<br />
species of fish etc.<br />
Experimental unit: The smallest division of the experimental material to which we apply<br />
the treatments and on which we make the observation on the variable under<br />
study, e.g. in field experiments the plot of land is the experimental unit, or a pond<br />
may be the experimental unit.<br />
Treatment: The various objects of comparison in an experiment are called treatments, e.g.<br />
different species of fish or methods of cultivation are the treatments.<br />
Experimental error: Plot-to-plot variation under identical conditions, which is due to<br />
random or chance factors beyond human control, is known as experimental<br />
error.<br />
There are three important principles inherent in all experimental designs.<br />
Replication: Replication means that a treatment is repeated two or more times. Its<br />
function is to provide an estimate of experimental error.<br />
Randomization: Randomization is a process of assigning the treatments to the various<br />
experimental units in a purely chance manner. Its function is to assure unbiased<br />
estimates of treatment means and experimental error.<br />
Local control: The process of reducing the experimental error by dividing the relatively<br />
heterogeneous experimental area into homogeneous groups is known as local control.<br />
By reducing the experimental error, we can increase the efficiency of the design.<br />
Now, we shall discuss the layout and the analysis of the important designs of<br />
experiments.<br />
Completely Randomised Design (C.R.D.)<br />
- Simplest of all the designs, based on the principles of randomization and<br />
replication<br />
- Treatments are assigned completely at random to each experimental unit.<br />
Hence, the CRD is only appropriate for experiments with homogeneous<br />
experimental units, such as laboratory experiments, where environmental effects are<br />
relatively easy to control.<br />
Randomization and Layout<br />
For an experiment with four treatments T1, T2, T3 and T4, each repeated five<br />
times, the step-by-step procedure for randomization and layout of a CRD is as<br />
follows:<br />
Step 1. Determine the total number of experimental plots (n) by multiplying the<br />
number of treatments (t) by the number of repetitions (r), i.e. n = (r)(t). For our<br />
example n = 5 x 4 = 20.<br />
Step 2. Assign a plot number to each experimental plot from 1 to n. For our<br />
example, the plot numbers 1, ..., 20 are assigned to the 20 experimental plots<br />
as shown in the following figure.<br />
[Figure: layout of the 20 experimental plots, showing plot number and treatment]<br />
Step 3. Assign the treatments to the experimental plots by using any randomization<br />
scheme, such as a random number table, drawing cards or drawing lots,<br />
as given in the figure.
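The three steps can also be carried out with a pseudo-random number generator in place of a random number table. The following is a minimal sketch, assuming Python's standard `random` module; the function name `crd_layout` and the seed value are illustrative choices.<br />

```python
import random


def crd_layout(treatments, reps, seed=None):
    """Return {plot number: treatment} for a completely randomised design."""
    rng = random.Random(seed)
    # Step 1: n = r x t labels, one per experimental plot
    labels = [trt for trt in treatments for _ in range(reps)]
    # Step 3: assign the labels to plots purely by chance
    rng.shuffle(labels)
    # Step 2: plots are numbered 1..n
    return {plot: trt for plot, trt in enumerate(labels, start=1)}


layout = crd_layout(["T1", "T2", "T3", "T4"], reps=5, seed=1)
for plot, trt in layout.items():
    print(plot, trt)
```

Fixing the seed makes the layout reproducible; leaving it as `None` gives a fresh randomization on each run.<br />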
Analysis of variance<br />
- There are two sources of variation. One is the treatment variation, the other is<br />
experimental error.<br />
- A major advantage of the CRD is the simplicity in the computation of its analysis<br />
of variance, especially when the number of replications is not uniform for all<br />
treatments.<br />
CRD for equal replication<br />
The steps involved in the analysis of variance of data from a CRD experiment<br />
with an equal number of replications are given below. We use the data from a laboratory<br />
experiment laid out in a CRD with five varieties (treatments), each replicated in four pots.<br />
Step 1. Arrange the data by treatments and calculate the treatment totals (T) and the grand<br />
total (G)<br />
Step 2. Construct an outline of the analysis of variance as follows<br />
Source of variation     Degree of freedom     Sum of squares     Mean square     Computed F     Tabular F<br />
Treatment<br />
Experimental error<br />
Total<br />
Step 3. Determine the degrees of freedom (d.f.) for each source of variation, if t<br />
represents the number of treatments and r the number of replications<br />
Total d.f. = (r)(t) - 1 = (4)(5) - 1 = 19<br />
Treatment d.f. = t - 1 = 5 - 1 = 4<br />
Error d.f. = t(r - 1) = 5(4 - 1) = 15<br />
The error d.f. can also be obtained through subtraction as<br />
Error d.f. = Total d.f. - Treatment d.f.<br />
= 19 - 4 = 15<br />
Table 1. Experimental data obtained from an experiment<br />
Treatment     Yield (R1   R2   R3   R4)     Treatment total     Treatment mean<br />
T1            25   21   21   18            85                  21.2<br />
T2            25   28   24   25            102                 25.5<br />
T3            24   24   16   21            85                  21.2<br />
T4            20   17   16   19            72                  18.0<br />
T5            14   15   13   11            53                  13.2<br />
Grand total 397<br />
Grand mean 19.8
Step 4: Calculate the correction factor and various sums of squares (SS)<br />
Correction factor (C.F.) $= G^2/n = (397)^2/20 = 7880.45$<br />
Total SS $= \sum_{i=1}^{n} X_i^2 - C.F. = 8307 - 7880.45 = 426.55$<br />
Treatment SS $= \frac{\sum_{i=1}^{t} T_i^2}{r} - C.F. = 331.30$<br />
Error SS = Total SS - Treatment SS = 426.55 - 331.30 = 95.25<br />
Step 5: Calculate the mean square (MS) for each source of variation by dividing<br />
each SS by its corresponding d.f.<br />
Treatment MS $= \frac{\text{Treatment SS}}{t - 1} = \frac{331.30}{4} = 82.825$<br />
Error MS $= \frac{\text{Error SS}}{t(r - 1)} = \frac{95.25}{15} = 6.35$
Step 6: Calculate the F value for testing the significance of the treatment difference as<br />
$F = \frac{\text{Treatment MS}}{\text{Error MS}} = \frac{82.825}{6.35} = 13.043$<br />
Step 7: Obtain the tabular F value with<br />
f1 = treatment d.f. = (t - 1) = 4<br />
f2 = error d.f. = t(r - 1) = 15<br />
For our example, the tabular F value with f1 = 4 and f2 = 15 d.f. is 3.06 at the 5%<br />
level of significance.<br />
Step 8: Enter all the computed values in the ANOVA table<br />
Source of variation     Degree of freedom     Sum of squares     Mean square     Computed F     Tabular F (5%)<br />
Treatment               4                     331.30             82.825          13.043*        3.06<br />
Experimental error      15                    95.25              6.35<br />
Total                   19                    426.55<br />
* Significant at 5% level<br />
Step 9: Compare the computed F value with the tabular F value and decide on the<br />
significance of the difference among treatments. For our example it is<br />
significant at the 5% level of significance.<br />
Step 10: Compute the grand mean and the coefficient of variation (CV) as follows:<br />
Grand mean $= G/n$<br />
$CV = \frac{\sqrt{\text{Error MS}}}{\text{Grand mean}} \times 100$<br />
For our example<br />
Grand mean $= 397/20 = 19.8$<br />
The CV indicates the degree of precision with which the treatments are<br />
compared and is a good index of the reliability of the experiment. It is generally placed<br />
below the analysis of variance table.<br />
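The computations of Steps 4 to 10 for the data of Table 1 can be reproduced with a short script. This is a sketch only; the function name `crd_anova` and the dictionary layout are illustrative, and the treatment sum of squares is written as the sum of $T_i^2/r_i$ so that the same function also covers unequal replication.<br />

```python
import math

# Yields from Table 1: five treatments, four replications each
table1 = {
    "T1": [25, 21, 21, 18],
    "T2": [25, 28, 24, 25],
    "T3": [24, 24, 16, 21],
    "T4": [20, 17, 16, 19],
    "T5": [14, 15, 13, 11],
}


def crd_anova(data):
    """One-way analysis of variance for a completely randomised design."""
    n = sum(len(obs) for obs in data.values())        # total number of plots
    t = len(data)                                     # number of treatments
    grand_total = sum(sum(obs) for obs in data.values())
    cf = grand_total ** 2 / n                         # correction factor
    total_ss = sum(x * x for obs in data.values() for x in obs) - cf
    treatment_ss = sum(sum(obs) ** 2 / len(obs) for obs in data.values()) - cf
    error_ss = total_ss - treatment_ss
    treatment_ms = treatment_ss / (t - 1)
    error_ms = error_ss / (n - t)
    grand_mean = grand_total / n
    return {
        "total_ss": total_ss,
        "treatment_ss": treatment_ss,
        "error_ss": error_ss,
        "treatment_ms": treatment_ms,
        "error_ms": error_ms,
        "F": treatment_ms / error_ms,
        "cv_percent": math.sqrt(error_ms) / grand_mean * 100,
    }


anova = crd_anova(table1)
for name, value in anova.items():
    print(f"{name}: {value:.3f}")
```

The computed F is then compared with the tabular F for (t - 1) and (n - t) degrees of freedom, exactly as in Step 9.<br />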
CRD for unequal replication<br />
The CRD is commonly used for studies where the experimental material makes<br />
it difficult to use an equal number of replications for all treatments. Some examples of<br />
these cases are:<br />
- Feeding experiments where the number of fish for each breed is not the same<br />
- Experiments for comparing body weight and length of different species<br />
- Experiments that are originally set up with an equal number of replications but<br />
in which some experimental units are likely to be lost or destroyed during experimentation.<br />
The analysis of variance for data from a CRD experiment with an unequal<br />
number of replications is given below.<br />
C.F. $= G^2/n$<br />
Total SS $= \sum_{i=1}^{n} X_i^2 - C.F.$<br />
Treatment SS $= \sum_{i=1}^{t} \frac{T_i^2}{r_i} - C.F.$<br />
Error SS = Total SS - Treatment SS<br />
Follow the same procedure as given previously and complete the analysis of<br />
variance table.<br />
RANDOMISED COMPLETE BLOCK DESIGN (RCBD)<br />
Features :<br />
- One of the most widely used experimental designs in agricultural research.<br />
- Especially suited for experiments where the number of treatments is not large.<br />
- An important feature of the RCB design is the presence of blocks of equal size,<br />
each of which contains all the treatments.<br />
Randomization & layout :<br />
- The randomization process is applied separately and independently to each of the<br />
blocks.<br />
If there are six treatments T1, T2, T3, T4, T5 and T6 and three replications, we<br />
illustrate the procedure in the following steps.<br />
Step 1. Divide the experimental area into r equal blocks, where r is the number of<br />
replications. For our example, the experimental area is divided into three blocks.<br />
Block 1     Block 2     Block 3<br />
Step 2. Subdivide the first block into t experimental plots, where t is the number of<br />
treatments.<br />
Step 3. Assign the t treatments at random to the t plots, applying any of the randomization<br />
schemes. For our example, six treatments are assigned at random to the six<br />
plots using a random number table.<br />
Step 4. Repeat the above steps for each of the remaining blocks<br />
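The block-by-block randomization can be sketched as follows, again assuming Python's standard `random` module in place of a random number table; the function name `rcbd_layout` and the seed are illustrative.<br />

```python
import random


def rcbd_layout(treatments, n_blocks, seed=None):
    """Return {block number: treatments in plot order} for an RCB design."""
    rng = random.Random(seed)
    layout = {}
    for block in range(1, n_blocks + 1):
        order = list(treatments)   # every block contains all the treatments
        rng.shuffle(order)         # separate, independent randomization per block
        layout[block] = order
    return layout


blocks = rcbd_layout(["T1", "T2", "T3", "T4", "T5", "T6"], n_blocks=3, seed=7)
for block, order in blocks.items():
    print("Block", block, order)
```

Unlike the CRD, the shuffle is restricted to one block at a time, so each treatment appears exactly once in every block.<br />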
Analysis of Variance<br />
- There are three sources of variability in the RCB design: treatment, replication (or<br />
block) and experimental error.<br />
To illustrate the steps involved in the analysis of variance for data from an RCB<br />
design, we use the data from an experiment that compared five varieties of fish, given<br />
below in Table 2.
Step 1. Group the data by treatments and replications and calculate the treatment totals (T),<br />
replication totals (R) and grand total (G)<br />
Step 2. Outline the analysis of variance as follows<br />
Source of variation     Degree of freedom     Sum of squares     Mean square     Computed F     Tabular F (5%, 1%)<br />
Replication<br />
Treatment<br />
Error<br />
Total<br />
Table 2. Yield of different varieties<br />
Variety                 Replication I     II        III       IV        Total     Mean<br />
V1                      22.9              25.9      39.1      33.9      121.8     30.4<br />
V2                      29.5              30.4      35.3      29.6      124.8     31.2<br />
V3                      28.8              24.4      32.1      28.6      113.9     28.5<br />
V4                      47.0              40.9      42.8      32.1      162.8     40.7<br />
V5                      28.9              20.4      21.1      31.8      102.2     25.6<br />
Replication total (R)   157.1             142.0     170.4     156.0<br />
Grand total (G) 625.5<br />
Grand mean 31.3<br />
Step 3. Determine the degrees of freedom for each source of variation. If r represents<br />
the number of replications and t the number of treatments, then<br />
Total d.f. = rt - 1 = 20 - 1 = 19<br />
Replication d.f. = r - 1 = 4 - 1 = 3<br />
Treatment d.f. = t - 1 = 5 - 1 = 4<br />
Error d.f. = (r - 1)(t - 1) = (3)(4) = 12<br />
The error d.f. can also be computed by subtraction as follows<br />
Error d.f. = Total d.f. - Replication d.f. - Treatment d.f.<br />
= 19 - 3 - 4 = 12<br />
Step 4. Compute the correction factor and various sums of squares (SS) as follows<br />
C.F. $= G^2/(rt) = (625.5)^2/20 = 19562.51$<br />
Total SS $= \sum X_{ij}^2 - C.F. = 952.43$<br />
Replication SS $= \frac{\sum_{j=1}^{r} R_j^2}{t} - C.F. = 80.80$<br />
Treatment SS $= \frac{\sum_{i=1}^{t} T_i^2}{r} - C.F. = 520.53$<br />
Error SS = Total SS - Replication SS - Treatment SS<br />
= 351.10<br />
Step 5. Compute the mean square for each source of variation by dividing each sum of<br />
squares by its corresponding degrees of freedom.<br />
Replication MS $= \frac{\text{Replication SS}}{r - 1}$<br />
Treatment MS $= \frac{\text{Treatment SS}}{t - 1}$<br />
Error MS $= \frac{\text{Error SS}}{(r - 1)(t - 1)}$
Step 6. Compute the F values for testing the treatment and replication differences as<br />
$F_1 = \frac{\text{Treatment MS}}{\text{Error MS}}$<br />
$F_2 = \frac{\text{Replication MS}}{\text{Error MS}}$<br />
Step 7. Compare the computed $F_1$ value with the tabular F values with f1 = treatment<br />
d.f. and f2 = error d.f. and draw conclusions.<br />
For our example, the tabular F value with f1 = 4 and f2 = 12 degrees of freedom is<br />
3.26 at the 5% level of significance. Because the computed $F_1$ value, 4.448, is greater<br />
than the tabular F value at the 5% level of significance, it is significant and we reject the<br />
null hypothesis. $F_2$ is not significant at the 5% level of significance.<br />
Step 8. If the result is significant, compute the critical difference (CD) and compare the treatment<br />
means. For our example<br />
$CD = t_{(\text{at error d.f.})} \times \sqrt{\frac{2 \times \text{Error MS}}{r}}$<br />
From the bar chart it can be concluded that variety V4 produces significantly<br />
higher yield than all other varieties. The remaining varieties are all on par.<br />
Step 9. Compute the coefficient of variation as<br />
$CV = \frac{\sqrt{\text{Error MS}}}{\text{Grand mean}} \times 100$
Step 10. Enter all values computed in the above steps in the analysis of variance outline.<br />
The final result of our example is shown below.<br />
Source of variation     Degree of freedom     Sum of squares     Mean square     Computed F     Tabular F (5%)<br />
Replication             3                     80.80              26.93           < 1<br />
Variety                 4                     520.53             130.13          4.448*         3.26<br />
Error                   12                    351.10             29.25<br />
Total                   19                    952.43<br />
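The RCB analysis can be checked with the sketch below. Two assumptions are made and should be read as such: the yield of V3 in replication IV is taken as 28.6, the value implied by the printed row and column totals, and the tabular t value for 12 error d.f. at the 5% level is taken as 2.179. The function name `rcbd_anova` is illustrative.<br />

```python
import math

# Yields from Table 2 (rows: varieties, columns: replications I-IV).
# Assumption: V3 in replication IV is 28.6, as implied by the printed totals.
yields = {
    "V1": [22.9, 25.9, 39.1, 33.9],
    "V2": [29.5, 30.4, 35.3, 29.6],
    "V3": [28.8, 24.4, 32.1, 28.6],
    "V4": [47.0, 40.9, 42.8, 32.1],
    "V5": [28.9, 20.4, 21.1, 31.8],
}


def rcbd_anova(table, t_crit):
    """Analysis of variance for a randomised complete block design."""
    rows = list(table.values())
    t = len(rows)          # number of treatments
    r = len(rows[0])       # number of replications (blocks)
    grand_total = sum(sum(row) for row in rows)
    cf = grand_total ** 2 / (r * t)
    total_ss = sum(x * x for row in rows for x in row) - cf
    rep_totals = [sum(row[j] for row in rows) for j in range(r)]
    rep_ss = sum(rt * rt for rt in rep_totals) / t - cf
    trt_ss = sum(sum(row) ** 2 for row in rows) / r - cf
    err_ss = total_ss - rep_ss - trt_ss
    err_df = (r - 1) * (t - 1)
    err_ms = err_ss / err_df
    return {
        "replication_ss": rep_ss,
        "treatment_ss": trt_ss,
        "error_ss": err_ss,
        "total_ss": total_ss,
        "error_df": err_df,
        "F1": (trt_ss / (t - 1)) / err_ms,   # treatments
        "F2": (rep_ss / (r - 1)) / err_ms,   # replications
        "CD": t_crit * math.sqrt(2 * err_ms / r),
        "CV": math.sqrt(err_ms) / (grand_total / (r * t)) * 100,
    }


result = rcbd_anova(yields, t_crit=2.179)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the sums of squares and the treatment F agree with the printed analysis, and the critical difference comes to about 8.3, which V4 exceeds against every other variety.<br />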
A number of other experimental designs, viz. L.S.D., split plot design, strip plot<br />
design etc., are available in the field of agricultural statistics, and can be used and<br />
analysed with the various available statistical packages.<br />
What to do when data break the rules<br />
Research workers who are content to learn the 'recipes' for carrying out an<br />
analysis of variance without attempting to learn and understand the underlying<br />
principles may be headed for serious trouble. Whether they realize it or not, they are<br />
making certain assumptions about the data when they perform an analysis of variance.<br />
If the data do not conform to these assumptions, such an analysis may cause workers<br />
to reach conclusions that are not justified. They may also overlook important conclusions<br />
that would be reached if the data were properly analysed.<br />
The assumptions underlying the analysis of variance are reasonably satisfied<br />
for most of the experimental data in agricultural research, but there are certain types of<br />
experiments that are notorious for frequent violations of these assumptions.<br />
Assumptions of Analysis of Variance (ANOVA)<br />
1. The error terms are randomly, independently and normally distributed.<br />
2. The variances of different samples are homogeneous.<br />
3. Variances and means of different samples are not correlated.<br />
4. The main effects are additive.<br />
The most common symptom of experimental data that violate one or more of<br />
the assumptions of the analysis of variance is variance heterogeneity.
Procedure for detecting the presence and type of variance heterogeneity<br />
- Compute the variance and the mean across replications for each treatment (the<br />
range can be used in place of the variance)<br />
- Plot a scatter diagram between the mean value and the variance<br />
- Examine the scatter diagram visually to identify the pattern of relationship<br />
between mean and variance<br />
The following figure shows the three possible outcomes of such an examination<br />
[Figure: three scatter diagrams of variance (vertical axis) against mean (horizontal axis)]<br />
Fig. 1. Homogeneous variance<br />
Fig. 2. Heterogeneous variance when the variance is functionally related to the mean<br />
Fig. 3. Heterogeneous variance when there is no functional relationship between the<br />
variance and the mean<br />
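The first step of the procedure, computing the mean and variance of each treatment across replications, can be sketched as below using the CRD yields of Table 1. The function name `mean_variance_pairs` is illustrative; the resulting pairs are what would be plotted in the scatter diagram.<br />

```python
from statistics import mean, variance

# CRD yields from Table 1, used here only to illustrate the computation
table1 = {
    "T1": [25, 21, 21, 18],
    "T2": [25, 28, 24, 25],
    "T3": [24, 24, 16, 21],
    "T4": [20, 17, 16, 19],
    "T5": [14, 15, 13, 11],
}


def mean_variance_pairs(data):
    """Return {treatment: (mean, sample variance)} across replications."""
    return {name: (mean(obs), variance(obs)) for name, obs in data.items()}


pairs = mean_variance_pairs(table1)
# List the treatments in order of increasing mean, as they would appear on the plot
for name, (m, v) in sorted(pairs.items(), key=lambda item: item[1][0]):
    print(f"{name}: mean = {m:.2f}, variance = {v:.2f}")
```

A variance that rises or falls steadily with the mean points to the situation of Fig. 2; scattered variances with no trend point to Fig. 1 or Fig. 3.<br />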
Transformation of data<br />
Data transformation is the most appropriate remedial measure for variance<br />
heterogeneity. In this technique, the original data are converted into a new scale,<br />
resulting in a new data set that is expected to satisfy the condition of homogeneity of<br />
variance. Because a common transformation scale is applied to all observations, the<br />
comparative values between treatments are not altered and comparisons between<br />
them remain valid.<br />
The most commonly used transformations for data in agricultural research are:<br />
Logarithmic transformation<br />
- Most appropriate for data where the standard deviation is proportional to the<br />
mean.<br />
- Suited to data that are whole numbers and cover a wide range of values, e.g. number of<br />
insects per plot or the number of egg masses per unit area etc.<br />
- Take the logarithm of each and every component of the data set.
Illustration<br />
If the data set involves small values (e.g. less than 10), log (x + 1) should be<br />
used instead of log x, where x is the original data.<br />
An example of the log transformation is given in the table below.<br />
Table 3. Observed and their log transformed values<br />
Treatment     Original values (Replication I   II   III)     Log values (Replication I   II   III)<br />
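A minimal sketch of the logarithmic transformation follows. It assumes common (base 10) logarithms, which the text does not specify, and applies log (x + 1) to the whole data set as soon as any value is below 10; the function name and the counts are hypothetical.<br />

```python
import math


def log_transform(values):
    """Return log10(x), or log10(x + 1) for every value if small values occur."""
    use_plus_one = any(x < 10 for x in values)
    shift = 1 if use_plus_one else 0
    return [math.log10(x + shift) for x in values]


# Hypothetical insect counts per plot covering a wide range, including a zero
counts = [0, 9, 99, 999]
print(log_transform(counts))
```

The same shift is applied to every observation so that the transformed values stay on one common scale.<br />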
Square-root transformation<br />
- Appropriate for data consisting of small whole numbers.<br />
- Also appropriate for percentage data where the range is between 0 and 30% or between 70 and<br />
100%.<br />
- If most of the values in the data set are small (e.g. less than 10), especially with<br />
zeroes present, $(x + 0.5)^{1/2}$ should be used instead of $x^{1/2}$, where x is the original<br />
data, e.g. data obtained in counting rare events.<br />
Illustration<br />
For illustration we use the following set of data on the percentage of diseased tillers<br />
from a paddy variety trial of 6 varieties. The range of the data is from 0 to 21.99%.<br />
Because many of the values are less than 10, the data are transformed to $(x + 0.5)^{1/2}$ as<br />
shown below.<br />
Table 4. Original and their square-root transformed values<br />
Variety     Original values (Replication)     Transformed values (Replication)
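The square-root transformation can be sketched in the same way. The function name is hypothetical and the percentages are invented for illustration, chosen to lie within the 0 to 21.99% range mentioned above.<br />

```python
import math


def sqrt_transform(values):
    """Return sqrt(x), or sqrt(x + 0.5) for every value if small values occur."""
    shift = 0.5 if any(x < 10 for x in values) else 0.0
    return [math.sqrt(x + shift) for x in values]


# Hypothetical percentages of diseased tillers
percentages = [0.0, 3.5, 8.5, 21.99]
print([round(v, 3) for v in sqrt_transform(percentages)])
```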
Arc Sine Transformation<br />
- Appropriate for data on proportions, data obtained from a count, and data<br />
expressed as decimal fractions or percentages<br />
- It is not applicable to percentage data which are not derived from count data,<br />
such as percentage of protein in rice, percentage of carbohydrates, infection<br />
index etc.<br />
- The value of 0% should be substituted by (1/4n) and the value of 100% by (100 -<br />
1/4n), where n is the number of units upon which the percentage data was<br />
based.<br />
Certain rules for the proper transformation scale for percentage data derived from<br />
count data<br />
Rule 1. For percentage data lying within the range of 30 to 70%, no transformation is<br />
needed.<br />
Rule 2. For percentage data lying within the range of either 0 to 30% or 70 to 100% but<br />
not both, the square-root transformation should be used.<br />
Rule 3. For percentage data that do not follow the ranges specified in either rule 1 or<br />
rule 2, the arc sine transformation should be used.<br />
Illustration<br />
We illustrate the application of the arc sine transformation with data on percentage<br />
survival from a fish survival trial with five size classes. For each variety 75 fishes were caught and<br />
the number of surviving fishes determined.<br />
Table 5. Percentage survival and their arc sine transformed values<br />
Variety     Survival %: Original values (R1   R2   R3)     Arc sine scale (R1   R2   R3)<br />
Based on rule 3, the arc sine transformation should be used because the<br />
percentage data ranged from 0 to 100%. Before transformation all zero values are<br />
replaced by [1/4(75)] and all 100 values by [100 - 1/4(75)].
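The three rules and the substitution for 0% and 100% can be sketched as below. The transformed value is assumed to take the usual form, the angle in degrees whose sine is the square root of the proportion; the text does not print this formula, and the function names are hypothetical.<br />

```python
import math


def choose_scale(percentages):
    """Pick the transformation suggested by Rules 1 to 3."""
    lowest, highest = min(percentages), max(percentages)
    if lowest >= 30 and highest <= 70:
        return "none"            # Rule 1
    if highest <= 30 or lowest >= 70:
        return "square-root"     # Rule 2
    return "arc sine"            # Rule 3


def arcsine_transform(percent, n):
    """Arc sine (angular) value in degrees for a percentage based on n units."""
    if percent == 0:
        percent = 1 / (4 * n)            # substitute for 0%
    elif percent == 100:
        percent = 100 - 1 / (4 * n)      # substitute for 100%
    return math.degrees(math.asin(math.sqrt(percent / 100)))


# Hypothetical survival percentages, each based on n = 75 fish
survival = [0, 20, 50, 80, 100]
print(choose_scale(survival))
print([round(arcsine_transform(p, 75), 2) for p in survival])
```

With n = 75 the substituted values are 1/300 and 100 - 1/300, so the transformed scale stays strictly between 0 and 90 degrees.<br />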
ADVANCED STATISTICAL METHODS FOR DATA ANALYSIS<br />
R. N. Subudhi<br />
Berhampur University, Berhampur<br />
Orissa<br />
I. NON-PARAMETRIC TESTS<br />
While studying testing of hypotheses, we have used some tests (like large<br />
sample, t and F tests) which estimate parameters of populations. Those are called<br />
parametric tests. In some cases we need not worry about the population parameters.<br />
Our test and result are both about the sample observation/function (which is called a<br />
'statistic'). Such tests, as discussed below, are termed non-parametric tests.<br />
Non-parametric tests are of course used for hypothesis testing, but they also have<br />
other extensive uses. Since such a test does not depend on the distribution of the parameter<br />
(of the population), there is no reference to or comparison with a tabulated value (like the t-table<br />
or F-table). It checks the pattern of occurrence of items in the sample. It assumes<br />
randomness or uniformity of the items. It checks the distribution of items (for example,<br />
whether the fit of a distribution is good or not, using chi-square).<br />
There are several non-parametric tests. Most of those tests check whether the<br />
items (or data) of the same series are appearing randomly or not. That is, whether<br />
successive items are changing (higher or lower than the previous item) randomly or not.<br />
Here we discuss only two tests, viz. SIGN TEST and RANK TEST.<br />
Sign test : In this test, the average of the items of the given series is found first (say, M).<br />
Then this average is deducted from each item/value, and the sign of each such difference<br />
$(X_i - M)$ is noted. Suppose we get 'r' plus signs and 's' minus signs. If there are 'n'<br />
items in the series, then $r + s \le n$. In this case our null hypothesis is that the chance of<br />
any item or value exceeding M is 1/2. That is, $P\{X > M\} = P = 1/2$. Alternative hypothesis,<br />
$H_1 : P > 1/2$ (one-tailed test) OR $H_1 : P \ne 1/2$ (two-tailed test).<br />
In case there are zero differences (when X = M) we have to ignore those<br />
cases. So, the sample size will reduce to (r + s). Statistic for comparison is :<br />
Wilcoxon has suggested a further improvement to this simple sign test, as<br />
discussed below. The suggested test is popularly known as
Wilcoxon signed-rank test : Here, after finding the differences of individual items<br />
from the mean, we have to rank all the differences together, taking absolute<br />
values of negative differences. Let $T^+$ and $T^-$ be the sums of ranks of positive and<br />
negative differences respectively,<br />
so that $T^+ + T^- = 1 + 2 + \dots + n = n(n + 1)/2$<br />
For the test we can check any of the values $T^+$ or $T^-$ or $(T^+ - T^-)$.<br />
While checking $T^+$, for large samples N(0, 1) is assumed and the test statistic is<br />
given by the formula :<br />
$$Z = \frac{T^+ - n(n + 1)/4}{\sqrt{n(n + 1)(2n + 1)/24}}$$<br />
(Here too, zero difference cases are ignored.)<br />
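Both tests can be sketched for a small series as below. The series is hypothetical, the sign test is evaluated through the exact binomial distribution with P = 1/2 (an assumed form of the comparison), tied absolute differences are given average ranks (also an assumption), and the function names are illustrative.<br />

```python
import math


def sign_test(values, m):
    """Return (r, s, two-tailed p) for the signs of the differences x - m."""
    r = sum(1 for x in values if x > m)   # plus signs
    s = sum(1 for x in values if x < m)   # minus signs; zero differences ignored
    n = r + s
    k = min(r, s)
    # Exact binomial tail under H0: P = 1/2
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return r, s, min(1.0, 2 * tail)


def signed_rank(values, m):
    """Return (T_plus, T_minus, Z) with average ranks for tied differences."""
    diffs = sorted((x - m for x in values if x != m), key=abs)
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(diffs):
        j = i
        while j < len(diffs) and abs(diffs[j]) == abs(diffs[i]):
            j += 1
        # positions i..j-1 share the average of ranks i+1..j
        for pos in range(i, j):
            ranks[pos] = (i + 1 + j) / 2
        i = j
    t_plus = sum(rk for d, rk in zip(diffs, ranks) if d > 0)
    t_minus = sum(rk for d, rk in zip(diffs, ranks) if d < 0)
    n = len(diffs)
    z = (t_plus - n * (n + 1) / 4) / math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return t_plus, t_minus, z


series = [12, 15, 9, 20, 17, 11, 14]     # hypothetical observations
M = sum(series) / len(series)            # average of the series
print(sign_test(series, M))
print(signed_rank(series, M))
```

For so short a series the Normal approximation is shown only to illustrate the arithmetic; it is intended for large samples.<br />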
Run test (due to Wald-Wolfowitz) : A run is a sequence of identical letters (or signs)<br />
which is preceded and followed by a different letter (or sign) or by no letter at all. Ex. + - ++ --<br />
In the above case there are 4 runs in total, 2 of + and 2 of - signs.<br />
If $H_0$ is true then the number of runs (say r) will be large. If r is small, then $H_0$ is<br />
false.<br />
We can convert any given series to a series of + - + etc. by the following<br />
principle:<br />
If the succeeding item is higher than the previous term a + sign is written; if it is<br />
less, then a - sign is written. Tie cases are to be ignored. Let there be 'm' + signs and 'n' -<br />
signs, and further let r be even (= 2d, say). We should expect 'd' runs of + and another<br />
'd' runs of - signs in the series. For large sample cases,<br />
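Counting runs in a series of signs can be sketched as follows. The large-sample statistic is not shown in the text; the sketch assumes the usual Wald-Wolfowitz normal approximation, with mean 2mn/(m+n) + 1 and variance 2mn(2mn - m - n) / ((m+n)^2 (m+n-1)). The function name and the example series are hypothetical.<br />

```python
from math import sqrt

def runs_test(signs):
    """signs is a string made up of '+' and '-' characters."""
    m = signs.count("+")                     # number of plus signs
    n = signs.count("-")                     # number of minus signs
    # a new run starts wherever two neighbouring symbols differ
    r = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    mean_r = 2 * m * n / (m + n) + 1
    var_r = 2 * m * n * (2 * m * n - m - n) / ((m + n) ** 2 * (m + n - 1))
    z = (r - mean_r) / sqrt(var_r)           # assumed normal approximation
    return r, mean_r, z

runs, expected_runs, z_runs = runs_test("+-++--")
print(runs, expected_runs, z_runs)
```

For this series there are 4 runs, which equals the expected number under the approximation, so z is zero.<br />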
II.<br />
CLUSTER ANALYSIS<br />
Data collected by a researcher has to be classified according to the needs of the<br />
research design for analysis. Cluster analysis is a science of classification, known<br />
previously as typology or taxonomy. Even though the science of classification<br />
originated in ancient times, in modern times it was developed by a German<br />
anthropologist in 1914. But it was the book 'Cluster Analysis' (1939) by the<br />
psychologist R. Tryon which established the analysis as an important tool for classification of<br />
entities.<br />
Cluster analysis is a technique to group variables, individuals or entities.<br />
Geometrically, clusters are defined as 'continuous regions of space containing a relatively high<br />
density of points, separated from other such regions by regions containing a relatively low<br />
density of points' (B. Everitt, Cluster Analysis, 1980).<br />
The variables or entities can be grouped according to their similarity measures<br />
or according to their difference or distance measures. Karl Pearson's<br />
correlation coefficient has so far been widely taken as a similarity measure. In spite of its importance, it<br />
has drawbacks as a measure of similarity. Some of the important drawbacks are : (a) it is<br />
sensitive to shape, (b) it is insensitive to the magnitude of the variables, and (c) it is calculated on a<br />
linear basis, so that some relationships between entities remain unexplained.<br />
Similarity coefficients : Similarity coefficients can be calculated for qualitative data or<br />
for quantitative data. For qualitative data, similarity coefficients are calculated on a<br />
binary scale and presented in matrix form for clustering. There are many formulas,<br />
but the Jaccard similarity coefficient is very popular among cluster analysts : $J = a/(a + b + c)$,<br />
where a is the number of attributes present in both entities and b and c are the numbers present in only one of them.<br />
For quantitative data the formula is $S_{ijk} = 1 - |X_{ik} - X_{jk}|/R_k$, in which $R_k$ is the<br />
range of variable k and $X_{ik}$ and $X_{jk}$ are the values of entities i and j on that variable.<br />
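Both coefficients can be sketched in a few lines of Python. The function names and the data are hypothetical; in the Jaccard coefficient, a is taken as the number of attributes present in both entities and b and c as the numbers present in only one of them, and the quantitative coefficient uses the absolute difference scaled by the range.<br />

```python
def jaccard(x, y):
    """Jaccard similarity J = a / (a + b + c) for two binary (0/1) profiles."""
    a = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)   # present in both
    b = sum(1 for p, q in zip(x, y) if p == 1 and q == 0)   # present in x only
    c = sum(1 for p, q in zip(x, y) if p == 0 and q == 1)   # present in y only
    return a / (a + b + c)

def range_similarity(x_i, x_j, value_range):
    """Similarity of two entities on one quantitative variable, scaled by its range."""
    return 1 - abs(x_i - x_j) / value_range

j_value = jaccard([1, 1, 0, 1, 0], [1, 0, 0, 1, 1])
s_value = range_similarity(12.0, 18.0, value_range=24.0)
print(j_value, s_value)
```

Joint absences (0 in both profiles) do not enter the Jaccard coefficient at all.<br />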
Distance : In this method the entities are clustered on the basis of their<br />
distances or differences, which are called dissimilarity measures. One difference<br />
between similarity and dissimilarity measures is that the former's value remains<br />
between 0 and 1 while the latter can take any positive value. One difficulty with distance<br />
measures is that they are scale dependent. However, raw data can be standardised before<br />
distance measures are calculated. For the calculation of distance measures the Euclidean<br />
metric formula is used.<br />
A distance function can be transformed into a similarity function and vice versa.<br />
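The effect of standardisation on the Euclidean distance can be sketched as below. The helper names and the data are hypothetical, and the transformation s = 1/(1 + d) is only one common way of turning a distance into a similarity; the text does not say which transformation is intended.<br />

```python
from math import dist
from statistics import mean, stdev

def standardise(rows):
    """Convert each column to z-scores so that no variable dominates through its scale."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]
    return [[(v - mu) / sd for v, mu, sd in zip(row, means, sds)] for row in rows]

def distance_to_similarity(d):
    """One possible transformation (an assumption): similarity in (0, 1]."""
    return 1 / (1 + d)

rows = [[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]]   # two variables on very different scales
raw_d = dist(rows[0], rows[1])                       # dominated by the second variable
z_rows = standardise(rows)
std_d = dist(z_rows[0], z_rows[1])                   # both variables now contribute comparably
print(raw_d, std_d, distance_to_similarity(std_d))
```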
Techniques : There are different types of clustering techniques, of which the<br />
hierarchical technique is the most popular among analysts since it is the simplest one.<br />
There are again two methods : agglomerative and divisive. A dendrogram can<br />
be drawn to identify the clusters by either the single or the complete linkage method. In the divisive<br />
technique there are two methods : (1) monothetic and (2) polythetic. The first method is<br />
the easiest. For this method the entities are divided into two subsets in any of the $2^{n-1} - 1$<br />
possible ways. The two groups are termed the main group and the splinter group. Gradually,<br />
one column after another is separated from the main group until it satisfies a certain<br />
condition. Then these two groups are further separated by the same procedure until no<br />
further separation is possible.<br />
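As an illustration of the agglomerative method with single linkage, a minimal Python sketch is given below. It is not an efficient implementation; the function name, the stopping rule (a fixed number of clusters k) and the data points are hypothetical.<br />

```python
from math import dist

def single_linkage(points, k):
    """Agglomerative clustering: start with one cluster per point and repeatedly
    merge the two clusters whose closest members are nearest, until k remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: distance between the closest pair of members
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters[b]      # merge cluster b into cluster a
        del clusters[b]
    return clusters

points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
groups = single_linkage(points, k=3)
print(groups)
```

Recording the distance at which each merge takes place would give the heights needed to draw a dendrogram.<br />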
III.<br />
FORECASTING TECHNIQUES / AUTO CORRELATION<br />
By using past records or data, we can fit some mathematical models (or<br />
equations) through which we can estimate or forecast future values. There are several<br />
methods to do this, e.g.<br />
1) Fitting linear equations (or curves) by the method <strong>of</strong> least squares, (Normal<br />
equations computed from given data);<br />
2) Regression equations/models (including multiple regression models);<br />
3) Autocorrelation Analysis (In case <strong>of</strong> single time series data); and<br />
4) Time series models<br />
We here discuss the concept of autocorrelation briefly. It is very useful<br />
for analysing time-series data.<br />
Autocorrelation<br />
Autocorrelation is the correlation between time series components/items at<br />
different points of time. We can pair each item with the successive item (or the item at a<br />
fixed interval) to find the correlation. If the time lag is one (values at times t and t+1 are<br />
paired), it is called first-order autocorrelation.<br />
'Prices of a company's equity share traded on a daily basis' is an example of a<br />
time series. We can pair the price of each day with the price a week later (time<br />
lag = 7 days). We can also make the time lag one day, as per the need. (In that case<br />
prices are paired with successive items.)<br />
After the pairing (grouping of data), the autocorrelation coefficient can be obtained<br />
by a formula similar to that of the simple correlation coefficient (r).<br />
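The pairing and the correlation calculation can be sketched in Python as follows. The function names and the two short series are hypothetical; the coefficient is simply the ordinary correlation coefficient r computed on the lagged pairs, as described above.<br />

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Simple correlation coefficient r between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def autocorrelation(series, lag=1):
    """Pair the value at time t with the value at time t + lag and correlate the pairs."""
    return pearson(series[:-lag], series[lag:])

trend = [1, 2, 3, 4, 5, 6]              # steadily rising series
zigzag = [1, -1, 1, -1, 1, -1]          # series that alternates every period
r_trend = autocorrelation(trend, lag=1)
r_zigzag = autocorrelation(zigzag, lag=1)
print(r_trend, r_zigzag)
```

The rising series gives a first-order autocorrelation of +1 and the alternating series gives -1.<br />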
To know the significance of autocorrelation we can use the Durbin-Watson d-statistic.<br />
The Durbin-Watson d-statistic lies between 0 and 4. If d is very close (or equal) to 2,<br />
then it is the uncorrelated case. If d < 2, there is positive autocorrelation (strongly positive at<br />
d = 0) and if d > 2, there is negative autocorrelation (with strong negative autocorrelation at<br />
d = 4).<br />
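The formula for d is not shown in the text. The sketch below assumes the standard definition, in which d is the sum of squared successive differences of the residuals divided by the sum of squared residuals; the function name and the residual series are hypothetical.<br />

```python
def durbin_watson(residuals):
    """d = sum((e_t - e_(t-1))^2) / sum(e_t^2), assumed standard definition."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

d_negative = durbin_watson([1, -1, 1, -1])   # residuals change sign every period
d_positive = durbin_watson([1, 1, -1, -1])   # residuals keep their sign for a while
print(d_negative, d_positive)
```

The first series gives d = 3 (above 2, negative autocorrelation) and the second gives d = 1 (below 2, positive autocorrelation).<br />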
IV.<br />
MULTI-DIMENSIONAL SCALING (M.D.S.)<br />
4.1 Introduction<br />
In any problem of decision making (like buying a refrigerator or choosing a<br />
strategy) we find many alternatives. Several dimensions emerge when these<br />
alternatives are evaluated. Refrigerators can be described in terms of price, capacity,<br />
hours of trouble-free operation, reputation of the manufacturer etc. Similarly, in the<br />
case of employment decisions, choice involves salary, working conditions, opportunity<br />
for growth and advancement, satisfaction etc. The search for an analytical approach to<br />
tackle such 'attribute-choice' problems has led to the techniques of multi-dimensional<br />
scaling.<br />
The development of various models and techniques of multi-dimensional scaling<br />
is of recent origin. Initially it started with applications in Psychology.<br />
Subsequently these methods were used in Marketing, Economics, Operations<br />
Research, Applied Statistics, Mathematical Psychology and Psychometrics.<br />
In multi-dimensional scaling, it is assumed that any object or brand (usually<br />
known as a stimulus) can be described by levels on a set of attributes, characteristics or<br />
properties. The relevant attributes of the problem are determined by the decision<br />
maker. For example, in the purchase of a car, the attributes can be structural, like the<br />
strong body of a car, its colour and speed. There may be functional attributes like the<br />
usability for long trips or hauling. There may be psychological attributes like agreement of<br />
the characteristics of the car with the self-concept. There may be social attributes like<br />
people's perception of the type (of car) and of those who drive it. There may be<br />
economic attributes like initial cost, anticipated resale value and cost of maintenance.<br />
The stimulus may be presented to a respondent through :<br />
(i) physical objects themselves<br />
(ii) pictorial representations<br />
(iii) verbalised profile descriptions<br />
(iv) name of objects<br />
(v) any combination of the above<br />
Multi-dimensional scaling deals with psychological relations among stimuli and<br />
expresses them through geometric relations among points in a multi-dimensional<br />
space.<br />
The psychological relations are obtained through similarities and preferences.<br />
Thus multidimensional scaling is the problem <strong>of</strong> representing n objects<br />
geometrically by n points. The interpoint distances correspond in some sense to<br />
experimental dissimilarities between objects.<br />
4.2. Multidimensional Scaling Models<br />
Multi-dimensional scaling models are classified into metric and non-metric models.<br />
i) In metric models, the input data may be assumed to be ratio scaled or interval<br />
scaled. In both cases, the scaled distances found by the model are assumed to be<br />
metrically related. Given a set of interpoint distances, these models find the dimensionality<br />
and configuration of points whose distances most closely match the input values with<br />
the smallest possible number of dimensions.<br />
ii) In many practical situations metric input data may not be available. People<br />
cannot ordinarily provide accurate and reliable data about equality relationships among<br />
objects such as competing brands or about brand characteristics.<br />
In non-metric models, only the ordinal or rank order properties of the input<br />
data are considered. The objective of non-metric MDS methods is : 'Given rank order<br />
data, to find a configuration whose rank order of distances best reproduces, in a<br />
specified dimensionality, the original rank order of the input data.'<br />
4.3 Technique of Multi-dimensional Scaling<br />
Multi-dimensional scaling is a technique of statistical fitting. The dissimilarities<br />
between all $n(n-1)/2$ pairs of stimuli are given and we wish to find the configuration of n<br />
stimuli in a certain number of dimensions such that the distances between the stimuli fit<br />
the dissimilarities best. A criterion for the best fit is given in terms of a monotone<br />
relationship between the observed dissimilarities and the distances obtained from the<br />
fitted configuration. Symbolically, if $\delta_{ij}$ and $\delta_{kl}$ are two observed dissimilarities, and if $d_{ij}$<br />
and $d_{kl}$ denote the corresponding distances in the configuration, then $\delta_{ij} < \delta_{kl}$ implies that<br />
$d_{ij} < d_{kl}$.<br />
If we can find a configuration that is monotonically related to the observed<br />
dissimilarities, we say we have a perfect fit. However, this may not be achieved,<br />
especially in lower dimensions. We therefore need a criterion to evaluate the<br />
goodness (or badness) of fit. One standard criterion proposed by Kruskal is 'stress'.<br />
This 'stress' value can be computed for any configuration intended to represent the<br />
original set of dissimilarities. The lower the stress value, the better the fit.<br />
The method of MDS is to start with some configuration in a given number of<br />
dimensions and iterate by finding new configurations with lower and lower stress values<br />
until a desired stress value is obtained. The final configuration is taken to be the best<br />
fit. Thus the procedure of MDS can be summarised in the following steps.<br />
i) For a given dimensionality, select some initial configuration X (this can be a<br />
random configuration or one provided by the experimenter).<br />
ii) Compute the distances $d_{ij}$ between the stimuli pairs and evaluate the configuration by<br />
computing the stress value S.<br />
iii) If S exceeds the pre-specified cut-off, find a new configuration X whose ranks of $d_{ij}$ are close<br />
to the ranks of the observed dissimilarities.<br />
iv) Repeat steps (ii) and (iii) until successive configurations converge.<br />
v) Repeat (i) to (iv) in the next lower dimensionality and so on.<br />
vi) Choose the lowest dimensionality for which S is satisfactorily small.<br />
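The stress value S evaluated in step (ii) can be sketched in Python as below. The text does not give the formula, so the sketch assumes Kruskal's stress formula 1, the square root of the sum of squared differences between the distances and their monotone-fitted values, divided by the sum of squared distances. The fitted values come from a simple pool-adjacent-violators pass; the function names and the three-point example are hypothetical.<br />

```python
from math import dist, sqrt

def monotone_fit(values):
    """Pool-adjacent-violators: closest non-decreasing sequence in least squares."""
    blocks = []                                    # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        # merge backwards while the previous block mean exceeds the last block mean
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

def kruskal_stress(points, dissimilarities):
    """Stress of a configuration against a symmetric matrix of dissimilarities."""
    n = len(points)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    pairs.sort(key=lambda p: dissimilarities[p[0]][p[1]])   # rank order of the input data
    d = [dist(points[i], points[j]) for i, j in pairs]
    d_hat = monotone_fit(d)
    return sqrt(sum((a - b) ** 2 for a, b in zip(d, d_hat)) / sum(a * a for a in d))

points = [(0.0,), (1.0,), (3.0,)]                  # one-dimensional configuration
agree = [[0, 1, 9], [1, 0, 4], [9, 4, 0]]          # same rank order as the distances
reverse = [[0, 9, 1], [9, 0, 4], [1, 4, 0]]        # opposite rank order
s_agree = kruskal_stress(points, agree)
s_reverse = kruskal_stress(points, reverse)
print(s_agree, s_reverse)
```

A configuration whose distances follow the rank order of the dissimilarities exactly has zero stress, which corresponds to the perfect fit mentioned earlier.<br />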
4.4. Applications <strong>of</strong> MDS<br />
i) Market Segmentation<br />
A very promising area for application of non-metric scaling methods is market<br />
segmentation. A product class and its buyers could be represented as points in a<br />
space whose dimensions are perceived product characteristics. Each brand could be<br />
represented as a stimulus point and each buyer as an ideal point. A market segment<br />
might be viewed as a sub-space of this superspace in which all members<br />
(a) perceive the stimuli similarly<br />
(b) possess the same ideal point position.<br />
Identification of such sub-spaces in which consumers exhibit commonality of<br />
perception and preference may reveal empty regions with a high concentration of ideal<br />
points and no close brands. Such an analysis would reveal the perception of different<br />
market segments about the competitive position of the firm's brand and other brands.<br />
ii) Vendor Evaluation<br />
An industrial purchasing agent may have to choose among alternative vendors.<br />
One vendor may be low in price, fair on maintaining delivery promises, poor in technical<br />
service, and low in technical innovation. Another vendor may be high in price but<br />
excellent in delivery promises. Each vendor can be represented as a point in multi-dimensional<br />
space, the dimensions being the various criteria on which vendors are<br />
selected. We are interested in the relative importance of each of the criteria, and how<br />
these weights vary over time.<br />
iii) Advertising Evaluation<br />
MDS methods could be profitably used in ad pre-testing to answer the<br />
following questions.<br />
(a) Are good ads more similar to each other than good ads are to bad ads ?<br />
(b) Do advertising personnel exhibit inter-person reliability in making similarity<br />
judgments ?<br />
(c) What are the dimensions along which ads are judged ?<br />
MDS methods could also be extended to the problem of advertisement and<br />
vehicle matching. For example, what ads seem to go with what magazines ?<br />
iv) Brand Switching Research<br />
It might be of interest to couple studies of brand switching with those of<br />
similarities or preference analysis. Do brand switchers perceive products differently<br />
from brand loyal customers ? What are the characteristics of preference structures for<br />
both brand switching and brand loyal types ?<br />
AN OVERVIEW OF STATISTICAL PACKAGES<br />
Ravi R. Saxena and A. K. Roy<br />
Indira Gandhi Agricultural University, Raipur - 492012<br />
Bioinformatics Centre, CIFA, Bhubaneswar-751002<br />
Owing to the attention given to the computational and algorithmic sciences during<br />
the past decade, many innovations have taken place in this field. Computations which<br />
were not possible manually have come within the reach of researchers owing to the<br />
development of various statistical software packages available in the market. Almost all the<br />
popular software suites have one component dealing exclusively with basic statistical<br />
calculations, like Excel in MS Office. Basically, computations are done on spreadsheets<br />
by creating a data file. It should be kept in mind that even with a fundamental knowledge<br />
of statistics, one has to spend a lot of time exploring the packages for all practical<br />
purposes. This chapter deals with some software packages available in the market for<br />
performing statistical analysis of data. The list is not exhaustive; there may be<br />
many more packages besides those mentioned below.<br />
1. SPAR1 (Statistical Package for Agricultural Research)<br />
This package has been developed for the statistical analysis of experimental<br />
data in plant breeding and genetics. The present package includes the following<br />
program modules :<br />
- Input data file<br />
- Diallel analysis<br />
- Multivariate analysis<br />
- Multiple linear regression analysis<br />
- Cluster analysis<br />
- Line X Tester analysis<br />
- Path analysis<br />
- Discriminant analysis<br />
- Stability analysis<br />
- Partial Diallel analysis<br />
- Triple test cross analysis<br />
- Combining Ability<br />
- Generation Means analysis, Scaling test, Joint Scaling test<br />
- Print Result File<br />
- Run DOS commands<br />
System requirements:<br />
IBM compatible PC-XT, AT and SX-386 with 640 KB RAM and with Math Co-Processor.<br />
Software availability:<br />
Indian Agricultural Statistics Research Institute, New Delhi.<br />
2. SPSS (Statistical Package for the Social Sciences)<br />
The SPSS package includes the following program modules : Base, Professional<br />
Statistics, Advanced Statistics, Trends, Categories and LISREL. It is a comprehensive integrated<br />
system for statistical data analysis.<br />
- Scatterplot, histogram, box plot, error bar, autocorrelation plots, time series,<br />
interpolation and regression line<br />
- Frequencies, plots, descriptives, cross-tabulation, tables, correlations, case<br />
listings<br />
- t-test, ANOVA, MANOVA, non-parametric tests<br />
- Multiple regression, non-linear regression, log-linear regression, CHAID<br />
- Cluster analysis, factor analysis, conjoint analysis, discriminant analysis,<br />
logistic regression<br />
- Exponential smoothing, ARIMA, X11 ARIMA, autoregression, seasonality,<br />
spectral analysis<br />
- Cox regression, logistic, MANOVA, loglinear, survival, probit etc.<br />
System requirements:<br />
Microsoft Windows 3.1, Windows 95, 386 based personal computer (486 or<br />
higher recommended); 8 MB RAM minimum (8 MB recommended; with I% MB <strong>of</strong> Hard<br />
disk storage space.<br />
S<strong>of</strong>tware availability:<br />
Wipro - Software Products Division<br />
40/1 A, Lavelle Road<br />
3rd Floor, Basappa Complex<br />
Bangalore - 56000<br />
(or)<br />
Binary Semantics Limited<br />
A-6, C-Block Community Centre<br />
Naraina Vihar<br />
New Delhi-110028<br />
3. SYSTAT : SYSTAT provides the following statistical analyses<br />
- Basic statistics, t-tests, correlation, regression and cross-tabulation<br />
- ANOVA/MANOVA<br />
- Bootstrapping, canonical and set correlations<br />
- Classification and regression trees<br />
- Cluster analysis, conjoint analysis, correspondence analysis<br />
- Design of experiments (7 methods)<br />
- Factor analysis and principal components<br />
- Logistic regression and probit<br />
- Loglinear model<br />
- Multidimensional scaling and perceptual mapping<br />
- Non-parametric tests<br />
- Partially ordered scale analysis<br />
- Path analysis<br />
- Repeated measures<br />
- Signal detection<br />
- Survival analysis (7 distributions)<br />
- Time series (ARIMA)<br />
- t-tests<br />
- Two stage least squares<br />
- 13 probability densities and random number generators<br />
System requirements:<br />
Microsoft Windows 3.1, Windows 95, 386 based personal computer (486 or<br />
higher recommended); 8 MB RAM minimum (8 MB recommended with I% ME <strong>of</strong> Hard<br />
disk storage space, SVGA monitor.<br />
S<strong>of</strong>tware availability:<br />
BINARY SEMANTICS LIMITED<br />
A-6, C-Block Community Centre,<br />
Naraina Vihar,<br />
New Delhi - 110028<br />
4. SPBD (Statistical Package for Block Designs)<br />
There are three main modules of this package :<br />
- Catalogue of BIB designs<br />
- Generation of the design and randomized layout<br />
- Analysis of the data generated from a BIB design<br />
System requirements:<br />
IBM compatible PC-XT, AT and SX-386 with 640 KB RAM<br />
Software availability:<br />
Indian Agricultural Statistics Research Institute, Library Avenue, New Delhi -<br />
110012<br />
5. Design-Ease & Design-Expert<br />
The features of the software are :<br />
- Scatter plots to visualize raw data<br />
- One variable, multi-level design<br />
- Optimal resolution fractional factorial<br />
- Replicate, delete, re-block designs<br />
- Handles missing or botched data<br />
- Response transformations<br />
- View ANOVA for precise information<br />
- Drag-able 2-D contours<br />
- Slim contour plots<br />
- Edit colors, text & more to produce snappy reports<br />
- Desirability graphs - histograms or ramp<br />
- Augment any design<br />
System requirements:<br />
Minimum 486, 6 MB, Windows 3.1/95/NT 3.51<br />
Software availability:<br />
BINARY SEMANTICS LIMITED<br />
A-6, C-Block Community Centre,<br />
Naraina Vihar,<br />
New Delhi - 110028<br />
6. SigmaStat<br />
SigmaStat is advisory statistical software with a unique Advisor Wizard which<br />
analyses the data, recommends the test to run, and runs it. SigmaStat handles missing and<br />
unbalanced data and automatically checks that the data fit the underlying assumptions of the<br />
statistical model. If there is a violation, SigmaStat automatically warns the user and suggests a<br />
more appropriate test. It gives a report of all test results complete with its own analysis. Its features are :<br />
- 'Messy' data handling<br />
- Graph editor customization<br />
- Detailed reports with explanation of results<br />
- Statistics - descriptive statistics, t-test and analysis of variance (ANOVA)<br />
- Graphing - scatterplot<br />
System requirements (32-bit):<br />
Windows 95 or Windows NT 4.0, 486 or higher, 33 MHz, ME and 11 to 16<br />
MB hard disk space<br />
Software availability:<br />
BINARY SEMANTICS LIMITED<br />
A-6, C-Block Community Centre,<br />
Naraina Vihar,<br />
New Delhi - 110028<br />
7. STATISTICA<br />
The software provides the following features :<br />
- non-parametrics<br />
- distribution fitting<br />
- multiple regression<br />
- general non-linear estimation<br />
- general ANCOVA/MANCOVA<br />
- stepwise discriminant analysis<br />
- log-linear analysis<br />
- confirmatory/exploratory factor analysis<br />
- canonical correlation<br />
- survival analysis<br />
- a large selection of time series modeling / forecasting techniques<br />
- structural equation modeling with Monte Carlo simulations and much more<br />
System requirements:<br />
Windows 3.1 / Windows 95, 8 MB RAM, 10-12 MB hard disk space<br />
Software availability:<br />
StatSoft,<br />
70, Janpath<br />
New Delhi-110001<br />
8. LINDO & LINGO<br />
The software solves inventory, transportation, project management and forecasting<br />
problems using operations research on a PC.<br />
- Fast linear, integer & quadratic optimizer for a variety of problem capacities<br />
- Data input, editing, optimization, display, logical data enquiry, file handling and<br />
sensitivity analysis<br />
System requirements:<br />
Minimum 486, 2 MB, Windows 3.1/95/NT 3.5<br />
Software availability:<br />
BINARY SEMANTICS LIMITED<br />
A-6, C-Block Community Centre,<br />
Naraina Vihar,<br />
New Delhi - 110028<br />
9. Indostat<br />
Indostat provides the statistics you need in a program you can use most easily.<br />
You get comprehensive data editing, data management and extensive statistical<br />
capabilities. Indostat software is available for various disciplines, such as :<br />
- Applied statistics (curve fitting, stepdown/stepwise regression, experimental<br />
designs)<br />
- Clustering pack<br />
- Econometrics & Psychology pack<br />
- Advanced econometric models<br />
- Operations research pack<br />
- Multivariate pack<br />
- Time series pack<br />
- ARIMA modeling<br />
- Geology pack<br />
- Graphics pack<br />
- Acceptance sampling<br />
- Plant Breeding & Genetics pack<br />
- Entomology pack<br />
- Animal Science pack<br />
- Poultry pack, etc.<br />
System requirements:<br />
IBM compatible PC-AT/386/486/Pentium machine; the minimum memory<br />
requirement is 640 KB.<br />
Software availability:<br />
Indostat Services<br />
18, Rohini Apartments<br />
7-1 -3UA Ameerpet<br />
Hyderabad - 500 018<br />
Other software such as SamplePower, PeakFit, TableCurve, SigmaPlot, DeltaGraph,<br />
Mathcad etc. is also available.<br />
10. SAS (Statistical Analysis System, SAS Institute, USA)<br />
The SAS System is an integrated system of software products. The SAS<br />
System enables you to perform<br />
- data entry, retrieval, and management<br />
- report writing and graphics<br />
- statistical and mathematical analysis<br />
- business forecasting and decision support
- operations research and project management<br />
- applications development<br />
The core of the SAS System is base SAS software. It consists of<br />
- the SAS language, a programming language that you use to manage your data<br />
- procedures, which are software tools for data analysis and reporting<br />
- a macro facility<br />
- a windowing environment called the SAS Display Manager System.<br />
There are other software packages with more or less the same system<br />
requirements. These are :<br />
MINITAB, MICROSTAT, MSTAT-C, SHAZAM, TSP, LINDO<br />
SCARP is a statistical package dealing with techniques for the analysis of sample survey data,<br />
developed by IASRI (ICAR), New Delhi.<br />
Most popular graphic packages are the following:<br />
HARVARD GRAPHICS, SIGMA PLOT, DELTA GRAPH<br />
EXCEL FOR STATISTICAL DATA ANALYSIS<br />
P. K. Satapathy, A. K. Roy and R. Dash<br />
Computer Section<br />
Central Institute of Freshwater Aquaculture<br />
Kausalyaganga, Bhubaneswar 751002<br />
INTRODUCTION<br />
The evolution of electronic spreadsheets is the most significant factor in starting a<br />
trend towards business microcomputing and electronic statistical data analysis by users,<br />
even those who have little or no programming knowledge. Amongst the various spreadsheet<br />
packages which appeared in the information technology market (like LOTUS 1-2-3,<br />
VISICALC, SUPERCALC, QPRO, EXCEL), LOTUS 1-2-3 was the most popular until Microsoft<br />
Office, which includes MS-EXCEL, came into the picture.<br />
CAPABILITIES OF EXCEL<br />
EXCEL has several capabilities, which include opening a workbook; entering<br />
and editing data; building formulas to calculate values; managing lists of data; formatting<br />
data; creating a chart; saving a workbook; opening and saving files from other<br />
spreadsheets; linking documents from other spreadsheets, etc. Besides these, it has<br />
strong capabilities for data analysis.<br />
STATISTICAL ANALYSIS OF DATA<br />
Microsoft Excel provides a set of special analysis tools called the Analysis ToolPak.<br />
These tools include statistical analyses which one can apply to many types of data,<br />
such as Anova : Single Factor; Anova : Two-Factor with Replication;<br />
Anova : Two-Factor without Replication; Covariance; Correlation; Descriptive Statistics;<br />
Exponential Smoothing; F-Test : Two-Sample for Variances; Histogram; Moving Average;<br />
Random Number Generation; Rank and Percentile; Regression; t-Test : Paired Two<br />
Sample for Means; t-Test : Two-Sample Assuming Unequal Variances, etc. Before using an<br />
analysis tool, the data to be analysed must be entered and organised into<br />
columns or rows on a worksheet; this is called the input range. Text labels in the first cell<br />
of a row or column may be included to identify the variables later. When an analysis tool<br />
is used to analyse data in an input range, Microsoft Excel creates an output table of the<br />
results. To use an analysis tool, choose Data Analysis from the Tools menu. In the<br />
Analysis Tools box, select the name of the tool required. Then specify the input and<br />
output ranges and any other options required.<br />
DESCRIPTIVE STATISTICS<br />
The Descriptive Statistics tool generates a report of univariate statistics for data<br />
in the input range. The output values generated by the Descriptive Statistics tool include<br />
the sample standard deviation, sample variance, kurtosis and skewness. These outputs<br />
are derived using the same algorithms used by the built-in functions STDEV, VAR,<br />
KURT, and SKEW, respectively.<br />
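For readers who wish to check such output outside Excel, the same quantities can be sketched in Python. The `skew` and `kurt` helpers below are hypothetical names; they assume the bias-corrected sample formulas documented for Excel's SKEW and KURT functions, and the data values are invented.<br />

```python
from statistics import mean, median, stdev, variance

def skew(data):
    """Sample skewness, assuming the formula documented for Excel's SKEW."""
    n, m, s = len(data), mean(data), stdev(data)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in data)

def kurt(data):
    """Sample excess kurtosis, assuming the formula documented for Excel's KURT."""
    n, m, s = len(data), mean(data), stdev(data)
    fourth = sum(((x - m) / s) ** 4 for x in data)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

weights = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical input range
print("mean", mean(weights))
print("median", median(weights))
print("sample sd", stdev(weights))          # counterpart of STDEV
print("sample variance", variance(weights)) # counterpart of VAR
print("skewness", skew(weights))
print("kurtosis", kurt(weights))
```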
Arithmetic mean : It is also referred to as the average and is calculated by simply adding up all<br />
the numbers and dividing by how many numbers there are.<br />
Median : The median is the value that exactly separates the upper half of the<br />
distribution from the lower half.<br />
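$\mathrm{Med} = L + \dfrac{0.5N - \mathrm{cum}f}{f_{\mathrm{Med}}}\, i$<br />
where L is the lower limit of the class containing the median, N the number of scores, cum f the cumulative frequency below that class, $f_{\mathrm{Med}}$ the frequency of that class and i the class width.<br />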
Population mean : To avoid confusion, the Greek letter $\mu$, pronounced 'mew', is the symbol<br />
for the population mean.<br />
Standard deviation : It is the most widely used measure of variability. It uses the<br />
deviation of each score from the mean, but the calculation, instead of taking the<br />
absolute value of each deviation, squares each deviation to obtain values that are all<br />
positive in sign.<br />
Deviation formula :<br />
Mean formula :<br />
z score : The z score is simply a way of telling how far a score is from the mean in<br />
standard deviation units.<br />
z score - sample : $z = (X - \bar{X})/s$<br />
z score - population : $z = (X - \mu)/\sigma$<br />
The Confidence Interval Approach for Estimating $\mu$ : In this approach, instead of<br />
talking about possible values that $\mu$ may take, given the sample mean $\bar{X}$, it is better to set up a<br />
confidence interval in which the true mean probably lies.<br />
95% confidence interval for $\mu$ using population $\sigma$ : $\bar{X} \pm 1.96\,\sigma_{\bar{X}}$<br />
99% confidence interval for $\mu$ using population $\sigma$ : $\bar{X} \pm 2.58\,\sigma_{\bar{X}}$<br />
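A short Python sketch of the z score and of the confidence interval is given below. The function names and the numbers are hypothetical; the standard error of the mean is taken as the population standard deviation divided by the square root of the sample size, which is the usual definition.<br />

```python
from math import sqrt

def z_score(x, mean, sd):
    """Distance of a score from the mean in standard deviation units."""
    return (x - mean) / sd

def confidence_interval(sample_mean, sigma, n, z=1.96):
    """Interval for the population mean when the population sigma is known.
    Use z = 1.96 for 95% confidence and z = 2.58 for 99% confidence."""
    standard_error = sigma / sqrt(n)
    return sample_mean - z * standard_error, sample_mean + z * standard_error

z = z_score(130, mean=100, sd=15)
low95, high95 = confidence_interval(50, sigma=10, n=25)
low99, high99 = confidence_interval(50, sigma=10, n=25, z=2.58)
print(z, (low95, high95), (low99, high99))
```

The 99% interval is wider than the 95% interval for the same data.<br />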
COVARIANCE<br />
The Covariance tool returns the average of the product of deviations of data<br />
points from their respective means. Covariance is a measure of the relationship between<br />
two ranges of data.<br />
The Covariance tool is used to determine whether two ranges of data move<br />
together; that is, whether large values of one set are associated with large values of the<br />
other (positive covariance), whether small values of one set are associated with large<br />
values of the other (negative covariance), or whether the values in the two sets are<br />
unrelated (covariance near zero).<br />
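The "average of the product of deviations" can be written out directly. This is a small Python sketch under the assumption that the population form (division by the number of pairs) is intended; the function name and data are illustrative.

```python
def population_covariance(xs, ys):
    """Average of the products of paired deviations from the two means."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("both ranges must be non-empty and of equal length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: the two ranges rise together, so the covariance is positive.
together = population_covariance([1, 2, 3, 4], [2, 4, 6, 8])
# Reversing one range makes large values pair with small ones: negative covariance.
opposite = population_covariance([1, 2, 3, 4], [8, 6, 4, 2])
```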
ANOVA<br />
Analysis of variance, or anova, is a statistical procedure used to determine<br />
whether two or more samples are drawn from populations with the<br />
same mean. This technique expands on the tests for two means, such as the t-test.<br />
The Anova : Single Factor tool performs a simple analysis of variance, which tests the<br />
hypothesis that the means from several samples are equal. Anova : Two-Factor with<br />
Replication performs an extension of the single-factor anova that includes more than one<br />
sample for each group of data. Anova : Two-Factor without Replication performs a two-factor<br />
anova that does not include more than one sampling per group.<br />
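As an illustration of what a single-factor anova computes, the sketch below derives the F ratio from the between-group and within-group sums of squares. It is a plain Python sketch with hypothetical data and a hypothetical function name, and it does not reproduce the Excel tool's full output table.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Three hypothetical samples of three observations each.
f_ratio, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

A large F ratio relative to the tabulated critical value for (df_between, df_within) suggests that the group means are not all equal.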
CORRELATION<br />
The Correlation tool measures the relationship between two data sets that are<br />
scaled to be independent of the unit of measure. The population correlation calculation<br />
returns the covariance of two data sets divided by the product of their standard<br />
deviations.<br />
One can use the Correlation tool to determine whether two data sets move<br />
together; that is, whether large values of one set are associated with large values of the<br />
other (positive correlation), whether small values of one set are associated with large<br />
values of the other (negative correlation), or whether the values in the two sets are<br />
unrelated (zero correlation - the correlation tends toward zero). Unlike covariance,<br />
correlation is independent of the units of measurement.<br />
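The unit independence of correlation can be seen in a short Python sketch. The function name and the length and weight values are hypothetical; the population forms of covariance and standard deviation are assumed.

```python
import math


def population_correlation(xs, ys):
    """Covariance divided by the product of the two population standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two ranges of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (sd_x * sd_y)


# Hypothetical fish lengths in cm and weights in g.
lengths_cm = [10.0, 12.0, 15.0, 18.0]
weights_g = [20.0, 31.0, 52.0, 80.0]
r_cm = population_correlation(lengths_cm, weights_g)
# Re-expressing lengths in mm rescales the covariance but leaves r unchanged.
r_mm = population_correlation([10 * x for x in lengths_cm], weights_g)
```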
REGRESSION<br />
The Regression tool performs linear regression analysis. Regression fits a line<br />
through a set of observations using the method of least squares.<br />
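For a single predictor, the least squares line has a closed form. The following Python sketch (hypothetical function name and data) shows the calculation for the slope m and intercept b of y = mx + b; it covers only the fitted line, not the other statistics the Regression tool reports.

```python
def fit_line(xs, ys):
    """Least squares slope and intercept for y = m*x + b."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations lying exactly on y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```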
USING CHARTS TO ANALYZE DATA<br />
The ease of plotting graphs makes EXCEL a handy tool for observing<br />
trends, the impact of one or more variables on others, etc. Graphs help in illustrating<br />
the behaviour of the data: for example, the dependence of two variables on each other,<br />
i.e., how one changes with a change in the other, or how different variables behave over a<br />
period of time. With the help of graphs, these behaviours are brought out more<br />
clearly.<br />
Creating a Trendline : The first step in creating a trendline is to select the data series with<br />
which the trendline is associated. Then choose the Trendline command from the<br />
Insert menu. On the Type tab, select the type of trendline needed. On the Options tab,<br />
one can give the trendline a name and specify other options. The regression trendline types<br />
are linear, logarithmic, polynomial, power and exponential. Options such as displaying the<br />
R-squared value, setting the Y-intercept, moving average, formatting a trendline, etc. are<br />
available. The Linear option creates the trendline using the linear equation $y = mx + b$. The<br />
Logarithmic option creates the trendline using the logarithmic equation $y = c \ln x + b$. The<br />
Polynomial option uses the polynomial equation $y = b + c_1 x + c_2 x^2 + \dots + c_6 x^6$. The Power option creates the trendline<br />
using the power equation $y = c x^b$. The Exponential option creates the trendline using the<br />
exponential equation $y = c e^{bx}$.<br />
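One common way to fit the exponential form is to take logarithms, since ln y = ln c + bx is a straight line in x. The Python sketch below follows that approach; it is an assumption that this matches what the spreadsheet does internally, and the function name and data are hypothetical.

```python
import math


def fit_exponential_trend(xs, ys):
    """Fit y = c * exp(b * x) by least squares on (x, ln y); requires y > 0."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs")
    if any(y <= 0 for y in ys):
        raise ValueError("all y values must be positive to take logarithms")
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sum((x - mean_x) * (l - mean_log) for x, l in zip(xs, logs)) / sxx
    c = math.exp(mean_log - b * mean_x)
    return c, b


# Hypothetical series that doubles each period: y = 2 * exp(ln(2) * x).
c, b = fit_exponential_trend([0, 1, 2, 3], [2.0, 4.0, 8.0, 16.0])
```

The power form $y = c x^b$ can be handled the same way by regressing ln y on ln x.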
FORECASTING<br />
The Exponential Smoothing tool predicts a value based on the forecast for the prior period,<br />
adjusted for the error in that prior forecast. It uses a smoothing constant, a, the<br />
magnitude of which determines how strongly forecasts respond to errors in the prior<br />
forecast.<br />
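The description above corresponds to the update: next forecast = prior forecast + a × (prior actual − prior forecast). A minimal Python sketch follows; seeding the first forecast with the first actual value is an assumption made here for illustration, and the function name and data are hypothetical.

```python
def exponential_smoothing(actuals, a, initial_forecast=None):
    """Return forecasts for periods 0..len(actuals); the last entry is the next-period forecast."""
    if not 0 <= a <= 1:
        raise ValueError("the smoothing constant a must lie between 0 and 1")
    if not actuals:
        raise ValueError("need at least one actual value")
    # Assumption: seed the first forecast with the first actual value.
    forecast = actuals[0] if initial_forecast is None else initial_forecast
    forecasts = [forecast]
    for actual in actuals:
        # Adjust the prior forecast by a fraction a of its error.
        forecast = forecast + a * (actual - forecast)
        forecasts.append(forecast)
    return forecasts


# Hypothetical series; a = 0.5 moves each forecast halfway toward the last actual.
forecasts = exponential_smoothing([10.0, 20.0, 30.0], a=0.5)
```

A value of a near 1 makes the forecast track the most recent actual closely, while a value near 0 makes it respond slowly.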
The Moving Average tool projects values in the forecast period, based on the average value<br />
of the variable over a specific number of preceding periods. Each forecast value is<br />
based on the following formula :<br />
$F_{j+1} = \dfrac{1}{N} \sum_{i=1}^{N} A_{j-i+1}$<br />
where N is the number of prior periods to include in the moving average, $A_j$ is the actual<br />
value at time j, and $F_j$ is the forecasted value at time j. You can use this procedure to<br />
forecast sales, inventory, or other trends. A moving average provides trend information<br />
that a simple average of all historic data masks.<br />
Projections can be supplemented with several other calculations;<br />
for example, the standard error measures the relative accuracy of the projected values.<br />
Another method, the weighted moving average forecast, can include a large interval<br />
and allows various nonnegative weights to be assigned to observations over time:<br />
$F_{j+1} = \sum_{i=1}^{N} W_i A_{j-i+1}$<br />
In the above equation, $W_1, W_2, \dots, W_N$ are nonnegative weights that sum to 1. $W_i$<br />
is the weight at interval i; $A_j$ is the actual value at time j, and $F_j$ is the forecasted value at<br />
time j. Here, the SUMPRODUCT function can be used to calculate a weighted average.<br />
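Both forecasts can be sketched in a few lines of Python. The function names and data are hypothetical; the convention that the first weight applies to the most recent observation is an assumption chosen to match the index i = 1 in the description above.

```python
def moving_average_forecast(actuals, n):
    """Forecast the next period as the mean of the last n actual values."""
    if n <= 0 or n > len(actuals):
        raise ValueError("n must be between 1 and the number of observations")
    return sum(actuals[-n:]) / n


def weighted_moving_average_forecast(actuals, weights):
    """Forecast the next period as a weighted sum of the most recent values.

    weights[0] applies to the most recent observation (assumed convention).
    """
    n = len(weights)
    if n == 0 or n > len(actuals):
        raise ValueError("need between 1 and len(actuals) weights")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to 1")
    most_recent_first = actuals[-n:][::-1]
    return sum(w * a for w, a in zip(weights, most_recent_first))


# Hypothetical series of four periods.
series = [10.0, 12.0, 14.0, 16.0]
simple = moving_average_forecast(series, n=3)
weighted = weighted_moving_average_forecast(series, [0.5, 0.3, 0.2])
```

Putting more weight on recent periods makes the weighted forecast follow a rising series more closely than the simple average does.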
T-Test : Paired Two-Sample for Means : The paired two-sample for means t-Test tool<br />
performs a paired two-sample Student's t-test. This form of the t-test tests whether a<br />
sample's means are distinct. It does not assume that the variances of both populations<br />
from which the data sets are drawn are equal. A paired test is appropriate whenever<br />
there is a natural pairing of observations in the samples, such as when a sample group<br />
is tested twice, before and after an experiment. Because this is a paired test, the<br />
two input ranges of data must contain the same number of data points.<br />
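A paired test reduces to a one-sample test on the differences between paired observations. The sketch below, in Python with hypothetical names and data, computes the t statistic and its degrees of freedom; looking up the critical value or p-value is left out.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, df) for the mean of the paired differences after - before."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("both ranges need the same number of points (at least 2)")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the differences
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical weights of the same four fish before and after a feeding trial.
t_value, df = paired_t_statistic([10.0, 12.0, 9.0, 11.0], [12.0, 14.0, 10.0, 14.0])
```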
One output of this analysis tool is the Pearson Correlation, derived by using the formula
where df is the degrees of freedom. Another output value generated by this analysis tool<br />
is the Pooled Variance, which is derived using the formula<br />
where $S^2$ is the pooled variance.<br />
T-Test : Two-Sample Assuming Equal Variances : The two-sample assuming equal<br />
variances t-Test tool performs a two-sample Student's t-test. This form of the t-test<br />
assumes that the variances of both data sets are equal and is referred to as a<br />
homoscedastic t-test. The t-test is used to determine whether the two samples' means<br />
are equal.<br />
T-Test : Two-Sample Assuming Unequal Variances : The two-sample assuming<br />
unequal variances t-Test tool performs a two-sample Student's t-test. This form of the test<br />
assumes that the variances of both ranges of data are unequal and is referred to as a<br />
heteroscedastic t-test. This t-test is used to determine whether two sample means are<br />
equal. It is used when the groups under study are distinct; a paired test is<br />
used when there is one group measured before and after a treatment. The formula used to<br />
determine the test statistic value t is<br />
$t = \dfrac{\bar{x} - \bar{y}}{\sqrt{s_1^2/m + s_2^2/n}}$<br />
where $\bar{x}$ and $\bar{y}$ are the means, $s_1^2$ and $s_2^2$ the variances, and m and n the sizes of the two samples.<br />
The formula below is used to approximate the degrees of freedom. The result<br />
of the calculation is usually not an integer. The nearest integer is used to obtain a critical<br />
value from the t table.<br />
$df = \dfrac{\left(s_1^2/m + s_2^2/n\right)^2}{\dfrac{\left(s_1^2/m\right)^2}{m-1} + \dfrac{\left(s_2^2/n\right)^2}{n-1}}$<br />
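The unequal-variance statistic and its approximate degrees of freedom can be computed as in the following Python sketch. The function name and sample values are hypothetical, and a hypothesized mean difference of zero is assumed.

```python
import math
import statistics


def unequal_variance_t_test(xs, ys):
    """Return (t, df) for two independent samples without assuming equal variances."""
    m, n = len(xs), len(ys)
    if m < 2 or n < 2:
        raise ValueError("each sample needs at least two observations")
    # Per-sample variance of the mean: s^2 / sample size.
    vx = statistics.variance(xs) / m
    vy = statistics.variance(ys) / n
    t = (statistics.mean(xs) - statistics.mean(ys)) / math.sqrt(vx + vy)
    # Approximate degrees of freedom; usually not an integer.
    df = (vx + vy) ** 2 / (vx ** 2 / (m - 1) + vy ** 2 / (n - 1))
    return t, df


# Two hypothetical, distinct groups of five observations each.
t_value, df = unequal_variance_t_test([1.0, 2.0, 3.0, 4.0, 5.0],
                                      [2.0, 4.0, 6.0, 8.0, 10.0])
df_for_table = round(df)  # nearest integer, as used for the t table lookup
```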
F-Test : Two-Sample for Variances : The two-sample for variances F-Test tool<br />
performs a two-sample F-test. An F-test is a method for comparing two population<br />
variances.<br />
Z-Test : Two-Sample for Means : The two-sample for means z-Test tool performs a<br />
two-sample z-test for means with known variances. This procedure is commonly used to<br />
test hypotheses about the difference between two population means.
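A two-sample z-test with known population variances can be sketched as follows. This is an illustrative Python version with hypothetical names and data; the two-tailed p-value is obtained from the standard normal distribution via the complementary error function.

```python
import math


def two_sample_z_test(xs, ys, var_x, var_y, hypothesized_diff=0.0):
    """Return (z, two-tailed p) given known population variances var_x and var_y."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    standard_error = math.sqrt(var_x / len(xs) + var_y / len(ys))
    z = (mean_x - mean_y - hypothesized_diff) / standard_error
    # P(|Z| > |z|) for a standard normal Z equals erfc(|z| / sqrt(2)).
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_tailed


# Hypothetical samples with assumed known population variances of 4.0 each.
z_value, p_value = two_sample_z_test([4.0, 6.0, 5.0, 5.0], [2.0, 4.0, 3.0, 3.0],
                                     var_x=4.0, var_y=4.0)
```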
INSTRUCTIONS FOR OPERATING MINITAB STATISTICAL PACKAGE<br />
Snbashi Basu<br />
Indian Statistical Institute<br />
203 B. T. Road, Calcutta - 700035<br />
Introduction to Minitab<br />
Minitab is used interactively but may be used in batch mode also. We will<br />
concentrate on running Minitab in the interactive mode.<br />
Running Minitab<br />
Start Minitab by double-clicking on the Minitab icon in Windows. Minitab will<br />
respond by opening a window and showing the {MTB >} prompt.<br />
Input and Output of Data<br />
Small amounts of data may be read into variables C1, C2 etc. directly by the<br />
command<br />
MTB > READ C1 C2 C3<br />
Here each line of input consists of one set of values corresponding to the variables<br />
C1, C2 and C3, e.g. 1.0, 3.0, 0.0. The next line will consist of another set of values of C1,<br />
C2 and C3, and so on. The {END} command denotes the end of data input.<br />
Alternatively you may give all the values of C1 together by the command<br />
MTB > SET C1<br />
DATA> 1.0 1.5 2.0<br />
DATA> 2.5 3.0<br />
DATA> END<br />
When data are read from a file, use the command<br />
MTB > READ 'inputfile' C1, C2, C3<br />
where inputfile is where your data are stored. Data read from two files may be<br />
combined in Minitab either side-by-side or one on top of the other, e.g.<br />
MTB > READ 'input1' C1-C10<br />
MTB > READ 'input2' C11-C20<br />
MTB > WRITE 'widefile' C1-C20<br />
MTB > WRITE 'tallfile' C1-C10<br />
To select subsets of data<br />
MTB > COPY C1 INTO C2 USE 1,3:5<br />
This will copy the contents of C1 using rows 1, 3, 4 and 5 into C2.<br />
MTB > COPY C1 INTO C2 USE C5 = 64,30:50<br />
This will copy the contents of C1 into C2 only for rows where C5 is 64 or between 30 and 50.<br />
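The row notation 1,3:5 can be read as "row 1, then rows 3 through 5". The Python sketch below mimics that selection on an ordinary list; it is only an analogue for illustration, with hypothetical function names, and is not Minitab code.

```python
def parse_rows(spec):
    """Expand a row specification such as '1,3:5' into [1, 3, 4, 5]."""
    rows = []
    for part in spec.split(","):
        part = part.strip()
        if ":" in part:
            start, end = (int(p) for p in part.split(":"))
            rows.extend(range(start, end + 1))  # ranges include both ends
        else:
            rows.append(int(part))
    return rows


def copy_use(column, spec):
    """Return the entries of column at the 1-based rows named by spec."""
    return [column[r - 1] for r in parse_rows(spec)]


# Hypothetical column of six values.
c1 = [10, 20, 30, 40, 50, 60]
c2 = copy_use(c1, "1,3:5")
```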
MTB > SORT C1 C2 will store the sorted version of C1 into C2.<br />
In Minitab the constants are stored in K1, K2 etc. To save the entire session of<br />
Minitab, use the command MTB > SAVE 'filename'<br />
This will put the entire worksheet into a file, which will contain all columns, stored<br />
constants and column names. Information may be retrieved from it by using the<br />
command MTB > RETRIEVE 'filename'<br />
Many commands in Minitab have a number of subcommands. To indicate that a<br />
subcommand follows, a semicolon (;) is put at the end of the command line. After all<br />
subcommands are specified, a period (.) indicates the end, e.g.<br />
MTB > PLOT C1 C2;<br />
SUBC> TITLE 'CHERRY TREE DATA';<br />
SUBC> YLABEL 'DIAMETER';<br />
SUBC> XLABEL 'VOLUME'.<br />
For plotting, Minitab does not open a new window.<br />
Mathematical and Statistical Operations<br />
Some examples of arithmetic and algebraic expressions are<br />
MTB > LET C1 = (C2 + C30) * 10 - 60<br />
MTB > LET C1 = C1 - MEAN(C1)<br />
MTB > LET K1 = MEAN(C1)/STDEV(C1)<br />
MTB > LET C3 = C1+ C2-2<br />
Individual elements of a column may also be addressed, e.g.<br />
MTB > LET C3 = C2(3) C1<br />
The usual mathematical functions like {ABSOLUTE}, {SQRT}, {LOGE}, {LOGTEN},<br />
{SIN}, {ROUND} etc. are also available. Commands for basic statistics include<br />
{DESCRIBE}, {ZINTERVAL}, {ZTEST}, {TINTERVAL}, {TTEST}, {TWOSAMPLE} etc.<br />
Simple Linear Regression<br />
The basic command for regressing C1 on (say) 3 predictors C2, C3 and C4 is<br />
MTB > REGRESS C1 3 C2 C3 C4<br />
The command {BRIEF K} controls the amount of output. K can be any<br />
integer from 1 to 3, and the larger the value of K the more output. The default value of K<br />
is 2.<br />
The subcommands for regression include {NOCONSTANT}, {MSE},<br />
{COEFFICIENTS}, {HI}, {RESIDUALS}, {PREDICT}, {VIF} etc.<br />
Minitab can also perform stepwise regression, e.g.<br />
MTB > STEPWISE C1 C2-C7;<br />
SUBC> STEPS = 3.<br />
{STEPS} controls the number of steps shown per page. At the end of each<br />
group Minitab asks whether to show more of the steps or to end the output.<br />
Analysis of Variance<br />
The basic command for one-way analysis of variance is<br />
MTB > AOVONEWAY C1-C3<br />
Here each column contains the observations for one cell. There should be more<br />
than two cells; with only two, the analysis is equivalent to the {TWOSAMPLE} command with the<br />
{POOLED} (standard deviation) subcommand. {AOVONEWAY} does not require an<br />
equal number of observations in each cell.<br />
When all data are stored in one column and a second column gives the levels,<br />
then use the command<br />
MTB > ONEWAY C1 C2 C3 C4<br />
C3 and C4 are optional. If C3 is specified, the residuals are stored in it. If C4 is<br />
specified, the fitted values are stored in it. {TWOWAY} performs a two-way analysis of<br />
variance for balanced data.<br />
The command {ANOVA model} does analysis of variance for multiway balanced<br />
designs. Factors may be crossed or nested, fixed or random. {ANOVA} calculates all<br />
exact F-tests, prints expected mean squares and estimates variance components. You<br />
may specify your own tests, store residuals and fitted values, and print cell and marginal<br />
means. You can analyze up to 50 response variables in one {ANOVA} command. To enter<br />
data you need one column for each response variable and one column for each factor.<br />
This means there is one row of the worksheet for each observation. This row contains<br />
the value of each response variable and the level of each factor.<br />
Because models can be quite long and tedious to type, a vertical bar indicates<br />
crossed factors and a minus sign removes terms, e.g.,<br />
{ANOVA Y = A | B | C} is equivalent to the three-factor model with all three<br />
two-way interaction terms and the three-factor interaction term.<br />
ANOVA Y = A | B | C | D - A*B*C - A*B*C*D<br />
is equivalent to the model<br />
Y = A B C D A*B A*C A*D B*C B*D C*D A*B*D A*C*D<br />
B*C*D<br />
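The expansion of crossed factors is a matter of listing every combination of the factors and then dropping the removed terms. The Python sketch below illustrates the idea for fully crossed factors only (nesting is not handled); the function name is hypothetical and this is not how Minitab itself is implemented.

```python
from itertools import combinations


def expand_crossed(factors, removed=()):
    """List all main effects and interactions of crossed factors, minus removed terms."""
    removed_sets = {frozenset(term.split("*")) for term in removed}
    terms = []
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            if frozenset(combo) not in removed_sets:
                terms.append("*".join(combo))
    return terms


# The four-factor example from the text, with two interaction terms removed.
model_terms = expand_crossed(["A", "B", "C", "D"], removed=["A*B*C", "A*B*C*D"])
```

For the three-factor case, `expand_crossed(["A", "B", "C"])` lists the three main effects, the three two-way interactions and the single three-factor interaction.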
If a factor is nested you must indicate that when using the bar,<br />
e.g.<br />
ANOVA Y = A | B(A) | C<br />
is equivalent to<br />
Y = A B(A) C A*C B*C(A)<br />
Useful subcommands to be used with {ANOVA} are {RANDOM}, {FITS},<br />
{RESIDUALS}, {MEANS}, {TEST} etc. The command {GLM} is used to do analysis of<br />
variance with balanced and unbalanced designs, analysis of covariance and regression<br />
analysis.<br />
Multivariate Analysis<br />
The command {PCA} does principal components analysis. Components can be<br />
calculated from the correlation matrix (the default option). Output consists of the<br />
eigenvalues, the proportion and cumulative proportion of the total variance explained by<br />
each principal component, and the coefficients for each principal component. Useful<br />
subcommands are {COVARIANCE}, {COEF}, {SCORES} etc.<br />
The command {DISCRIMINANT} does linear and quadratic discriminant analysis<br />
for classifying observations into two or more groups based on the specified predictors.<br />
Output includes the classification matrix, the squared distance between group centers,<br />
the linear discriminant function, means, standard deviations and covariance matrices,<br />
and a summary of how each observation was classified. Useful subcommands are<br />
{QUADRATIC}, {FITS}, {XVAL}, {PREDICT} etc.<br />
Plots and Graphics<br />
The basic command for a scatterplot of C1 versus C2 is<br />
MTB > PLOT C1 C2<br />
To add titles, footnotes and axis labels to this plot you may use the<br />
subcommands {TITLE}, {FOOTNOTE}, {XLABEL}, {YLABEL}. To change the plotting<br />
symbol you may use the subcommand {SYMBOL}. The command {MPLOT} puts<br />
several plots on the same axes and {LPLOT} plots data using letters as plotting symbols.<br />
{TSPLOT} does a time-series plot. {HISTOGRAM} and {DOTPLOT} produce<br />
histograms. The commands {GPLOT}, {GMPLOT}, {GLPLOT} etc. are useful to produce<br />
high resolution graphics. {GPLOT} may also be used to plot a function. You may also<br />
control the line styles and colors for your graphs.<br />
References<br />
MINITAB Reference Manual : Release 7