06.06.2013 Views

Patentee Name Harmonisation - ecoom.be

Spelling variation  Occurrences  Legal form  Action
" GMBH & CO. KG "           191  GMBH        Replace with " & COMPANY "
" GMBH & CO.K.G. "            0  GMBH        Replace with " & COMPANY "
" GMBH & CO.KG "             15  GMBH        Replace with " & COMPANY "
" GMBH & CO KG "             18  GMBH        Replace with " & COMPANY "
" GMBH + CO. KG "            13  GMBH        Replace with " & COMPANY "
" GMBH & CO. "              612  GMBH        Replace with " & COMPANY "
" GMBH & CO "                65  GMBH        Replace with " & COMPANY "
" GMBH & CO.,"               77  GMBH        Replace with " & COMPANY "
" GMBH & CO,"                 8  GMBH        Replace with " & COMPANY "
" GMBH + CO. "               28  GMBH        Replace with " & COMPANY "
" GMBH + CO "                 6  GMBH        Replace with " & COMPANY "
" GMBH + CO.,"                1  GMBH        Replace with " & COMPANY "
" GMBH + CO,"                 1  GMBH        Replace with " & COMPANY "
" GMBH,"                    173  GMBH        Remove
" GMBH "                  1,648  GMBH        Remove

Implementation

The more complex the analysis (identification of legal forms, identification of spelling variations, validation), the simpler the implementation.

All identified and validated spelling variations of legal form indications are transferred to search and replace statements or rules, as in Table 15 and Table 18. This results in 1,060 rules or statements to remove legal form indications at the end of names, one rule or statement to remove legal form indications at the beginning of names, and 17 rules or statements to remove legal form indications anywhere in the name.

Every statement or rule contains the spelling variation to identify and the harmonized string to substitute. In most cases, legal form indications are simply removed and not replaced with anything; a replacement is used if the legal form indication is preceded or followed by general company words that can be harmonized (e.g. harmonize “ + CO.” to “ & COMPANY”).

Every statement or rule also includes the harmonized legal form, which is used to update a new field. The legal form indications are therefore not deleted completely: they are removed from the name field and moved, in a harmonized format, to a different field.

All identified and validated occurrences of legal form indications are removed by a program that reads the search and replace statements or rules and, for each one, executes an update query on the data. The query replaces the given keyword (a spelling variation of a legal form indication) with a given string (mostly an empty string, which simply removes the legal form indication) and, at the same time, updates a new field to contain the harmonized legal form of the organization.
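The update step described above can be sketched as follows. This is only an illustration under stated assumptions: the table name `patentee`, the column names `name` and `legal_form`, the function `apply_rule` and the sample names are hypothetical, and the search string is written without a trailing space so that it can be matched against the end of a name. The report does not specify the database or the exact query used.

```python
import sqlite3


def apply_rule(conn: sqlite3.Connection, search: str,
               replacement: str, legal_form: str) -> int:
    """Apply one end-of-name rule; return the number of rows updated.

    The name is rewritten and the harmonized legal form is stored in a
    separate column within the same UPDATE statement.
    """
    cur = conn.execute(
        """
        UPDATE patentee
           SET name = substr(name, 1, length(name) - length(:search))
                      || :replacement,
               legal_form = :legal_form
         WHERE substr(name, -length(:search)) = :search
        """,
        {"search": search, "replacement": replacement,
         "legal_form": legal_form},
    )
    return cur.rowcount


def demo() -> list:
    """Build a tiny in-memory table (hypothetical data) and apply one rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patentee (name TEXT, legal_form TEXT)")
    conn.executemany(
        "INSERT INTO patentee (name) VALUES (?)",
        [("MUSTER GMBH & CO. KG",), ("BEISPIEL AG",)],
    )
    apply_rule(conn, " GMBH & CO. KG", " & COMPANY", "GMBH")
    return conn.execute(
        "SELECT name, legal_form FROM patentee ORDER BY rowid"
    ).fetchall()
```

In this sketch the first name becomes `MUSTER & COMPANY` with `GMBH` in the new field, while the second name is left untouched because it does not end with the search string.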

The search and replace statements were executed in three groups: first, a group of 1,060 statements to remove legal forms at the end of a name (see Appendix 2); then a group of one statement to remove legal forms at the beginning of a name (removing 1,216 occurrences of “KABUSHIKI KAISHA” at the beginning of a name); and finally, a group of 17 statements to remove legal forms anywhere in a name (see Table 18).
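The three groups differ only in where the search string has to occur. A minimal sketch of the three kinds of match follows; the function names, the sample names, and the choice to replace only the first occurrence in the "anywhere" case are assumptions made for illustration, not details taken from the report.

```python
def apply_at_end(name: str, search: str, replacement: str) -> str:
    """Replace `search` only when it is the last part of the name."""
    if search and name.endswith(search):
        return name[: len(name) - len(search)] + replacement
    return name


def apply_at_start(name: str, search: str, replacement: str) -> str:
    """Replace `search` only when it is the first part of the name."""
    if search and name.startswith(search):
        return replacement + name[len(search):]
    return name


def apply_anywhere(name: str, search: str, replacement: str) -> str:
    """Replace the first occurrence of `search` wherever it appears
    (assumption: the report does not say how repeats are handled)."""
    return name.replace(search, replacement, 1)


# Hypothetical names, one per group.
end_result = apply_at_end("MUSTER GMBH", " GMBH", "")
start_result = apply_at_start("KABUSHIKI KAISHA MUSTER", "KABUSHIKI KAISHA ", "")
anywhere_result = apply_anywhere(
    "MUSTER GMBH & CO. KG BERLIN", " GMBH & CO. KG ", " & COMPANY "
)
```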

Within a group, the search and replace statements are executed in a singular, not a cumulative, approach. This means that once a name has been updated by one search and replace statement, it cannot be updated again by a subsequent statement. This prevents a cascade of replacements within one name leading to unexpected results (with consequent difficulties for checking and validating). The list of search and replace statements is constructed bearing this important implementation consideration in mind (e.g. first replace “ + CO. AG” with “ & COMPANY” before removing “ AG”).
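The singular approach and the importance of rule order can be illustrated with a short sketch. The rule tuples, the function `apply_group` and the sample name are hypothetical; the only behaviour taken from the text is that the first rule that changes a name is also the last one applied to it within the group.

```python
from typing import List, Optional, Tuple

# (search string, replacement string, harmonized legal form)
Rule = Tuple[str, str, str]


def apply_group(name: str, rules: List[Rule]) -> Tuple[str, Optional[str]]:
    """Apply at most one end-of-name rule from an ordered group.

    Returns the (possibly updated) name and the harmonized legal form,
    or None as legal form when no rule matched.
    """
    for search, replacement, legal_form in rules:
        if search and name.endswith(search):
            # Stop after the first match: no cascade of replacements.
            return name[: len(name) - len(search)] + replacement, legal_form
    return name, None


# The more specific rule is listed before the more general one.
ORDERED_RULES: List[Rule] = [
    (" + CO. AG", " & COMPANY", "AG"),
    (" AG", "", "AG"),
]

good = apply_group("MUSTER + CO. AG", ORDERED_RULES)
# With the order reversed, the general rule fires first and " + CO." is
# left unharmonized, because the name is not touched a second time.
bad = apply_group("MUSTER + CO. AG", list(reversed(ORDERED_RULES)))
```

Here `good` is `("MUSTER & COMPANY", "AG")`, while `bad` is `("MUSTER + CO.", "AG")`, which shows why the list has to be ordered from specific to general.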

If a name contains a legal form indication at both the beginning and the end, or elsewhere in the name, only the legal form indication occurring at the end of the name is harmonized and moved to a different field.

Not all search and replace statements are concerned with the removal of legal form indications. Some words commonly used in a company context, such as “CO” and “CORP”, were also harmonized when they appear as the last word of a name (see the analysis for more details).

As the replacements and removals in the search and replace statements can lead to names ending with irregular punctuation characters, all occurrences of “-”, “;”, “:”, “,” and “&” at the end of a name are removed by executing an update query on the data.
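The final clean-up can be sketched in a few lines. The function name is hypothetical, and stripping trailing whitespace together with the five listed characters is an assumption: the text only names the punctuation characters.

```python
# The five characters listed in the text.
TRAILING_CHARACTERS = "-;:,&"


def strip_trailing_punctuation(name: str) -> str:
    """Remove trailing '-', ';', ':', ',' and '&' (and, as an assumption,
    any whitespace mixed in with them) from the end of a name."""
    return name.rstrip(TRAILING_CHARACTERS + " ")


cleaned = [
    strip_trailing_punctuation(n)
    for n in ("MUSTER &", "MUSTER,", "MUSTER -;", "MUSTER & SOHN")
]
```

Punctuation inside a name, as in the last hypothetical example, is not affected because only the end of the string is inspected.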
