06.06.2013 Views

Patentee Name Harmonisation - ecoom.be



Finally, because the replacements and removals in the search and replace statements can themselves leave leading or trailing spaces, names have to be checked for and trimmed of such spaces after the legal form indications are removed.
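
The trimming step can be illustrated with a minimal Python sketch. The pattern and the legal forms in it (INC, LTD, GMBH) are hypothetical stand-ins, not the report's actual search and replace statements; the point is only that removing a match can leave a stray space that must be trimmed afterwards.

```python
import re

# Hypothetical pattern: these three legal forms are illustrative only and
# do not reproduce the report's list of search and replace statements.
LEGAL_FORM_AT_END = re.compile(r"\b(?:INC|LTD|GMBH)\.?$")

def strip_legal_form(name: str) -> str:
    """Remove a trailing legal form, then trim leading and trailing spaces."""
    without_form = LEGAL_FORM_AT_END.sub("", name)
    # The removal leaves the space that preceded the legal form behind,
    # so the name is trimmed as a separate, final step.
    return without_form.strip()

cleaned = strip_legal_form("ACME WIDGETS INC.")
```

Without the final `strip()`, the example name would keep the space that used to separate it from its legal form.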

Result

Legal form indications have been removed and harmonized at the end of the name in 221,498 names, at the beginning of the name in 1,216 names, and anywhere in the name in 2,865 names. Moreover, words commonly used in a company context that appear as the last word were harmonized in 9,150 cases.

Not all legal form indications have been identified, and not all spelling variations of the identified ones have been added to the list of search and replace statements (because they would cause too many false matches). A significant number of names will therefore still contain legal form indications. However, the vast majority of legal form indications are removed, with accuracy well above 99%.

Impact

The number of unique names falls from 437,336 to 392,226, an additional reduction of 45,110 names, for a total reduction of 51,496 names (11.6%).
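
The figures can be cross-checked with a few lines of Python. The starting count is not stated in this section; the sketch assumes it equals the current count plus the total reduction, and that the percentage is taken relative to that starting count.

```python
# Figures quoted in the text.
before_step = 437_336      # unique names before this step
after_step = 392_226       # unique names after this step
total_reduction = 51_496   # cumulative reduction over all steps so far

# Reduction achieved by this step alone.
step_reduction = before_step - after_step

# Assumption: the original number of unique names is the current count
# plus the cumulative reduction (it is not given explicitly here).
original_count = after_step + total_reduction

# Assumption: the percentage is relative to that original count.
share = total_reduction / original_count

print(step_reduction)
print(original_count)
print(f"{share:.1%}")
```

Under these assumptions the computed step reduction and percentage agree with the 45,110 names and 11.6% reported above.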

2.2 Common company word removal

Description

In addition to the legal form indication, which can be removed because it is not really part of the name, some other words commonly used in a company context are not really distinctive elements of a company name. Such words include “COMPANY”, “CORPORATION”, “GESELLSCHAFT” and “SOCIETE”.

The idea is that if two names are completely identical except for these words, the underlying organization name is the same and these words can be removed.

Examples include “3COM” and “3COM CORPORATION”, “AMIC” and “AMIC COMPANY”, “BAUR SPEZIALTIEFBAU” and “BAUR SPEZIALTIEFBAU GESELLSCHAFT”, and “SOCIETE NOVATEC” and “NOVATEC”.
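
The comparison described above might be sketched as follows. This is an illustration only: the word set contains just the four words named in the text (the report's lists include further words and spelling variations), and the function names are hypothetical.

```python
# Words named in the text; the report's full lists are longer.
COMMON_WORDS = {"COMPANY", "CORPORATION", "GESELLSCHAFT", "SOCIETE"}

def strip_common_words(name: str) -> str:
    """Drop common company words from the start and end of a name."""
    tokens = name.split()
    while tokens and tokens[-1] in COMMON_WORDS:
        tokens.pop()
    while tokens and tokens[0] in COMMON_WORDS:
        tokens.pop(0)
    return " ".join(tokens)

def same_except_common_words(a: str, b: str) -> bool:
    """True when two names are identical once common words are ignored."""
    key_a = strip_common_words(a)
    # A name made up only of common words yields an empty key;
    # such names are never treated as matching.
    return bool(key_a) and key_a == strip_common_words(b)

pairs = [
    ("3COM", "3COM CORPORATION"),
    ("AMIC", "AMIC COMPANY"),
    ("BAUR SPEZIALTIEFBAU", "BAUR SPEZIALTIEFBAU GESELLSCHAFT"),
    ("SOCIETE NOVATEC", "NOVATEC"),
]
matches = [same_except_common_words(a, b) for a, b in pairs]
```

All four example pairs from the text compare as equal under this sketch, while unrelated names such as “AMIC” and “3COM” do not.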

In addition, legal forms identified but not removed in the previous step (legal form indication treatment) are removed at this stage. Legal forms such as “KG” were not removed previously because they are, to all intents and purposes, part of the name, and removing them could make the underlying name less comprehensible.

Common company word removal can mutilate organization names and make them less understandable. However, the idea is not to use the names resulting from common company word removal as final harmonized names, but as a kind of technical search name that can be used to identify name variations of the same organization.
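
One way to picture the technical search name is as a grouping key kept alongside the original names rather than replacing them. The sketch below is an assumption about how such a key could be used, not the report's implementation; for brevity it drops the listed words wherever they occur, and `search_key` and `group_variants` are hypothetical names.

```python
from collections import defaultdict

# Illustrative word list only; "KG" is included because the text says
# such legal forms are removed at this stage.
COMMON_WORDS = {"COMPANY", "CORPORATION", "GESELLSCHAFT", "SOCIETE", "KG"}

def search_key(name: str) -> str:
    """Build a technical search name by dropping common company words."""
    return " ".join(w for w in name.split() if w not in COMMON_WORDS)

def group_variants(names):
    """Map each search key to the original names that share it."""
    groups = defaultdict(list)
    for name in names:
        groups[search_key(name)].append(name)
    # Only keys shared by several names point to name variations.
    return {key: found for key, found in groups.items() if len(found) > 1}

names = ["3COM", "3COM CORPORATION", "AMIC", "AMIC COMPANY", "NOVATEC"]
variants = group_variants(names)
```

The original names stay untouched in the groups, so a readable form can still be chosen as the final harmonized name.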

Analysis

Common company words that can be removed were identified using the last word index and first word index employed in the previous step (legal form indication treatment), together with a full text index of the organization names.
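
How the report's indexes were built is not described here. As a rough sketch, a first word index and a last word index can be approximated by frequency counts over the first and last tokens of each name; frequent entries are then candidates for manual review. The function name `word_indexes` is hypothetical.

```python
from collections import Counter

def word_indexes(names):
    """Count how often each word occurs as the first or last word of a name."""
    first_words = Counter()
    last_words = Counter()
    for name in names:
        tokens = name.split()
        if not tokens:
            continue  # skip empty names
        first_words[tokens[0]] += 1
        last_words[tokens[-1]] += 1
    return first_words, last_words

names = [
    "3COM CORPORATION",
    "AMIC COMPANY",
    "BAUR SPEZIALTIEFBAU GESELLSCHAFT",
    "SOCIETE NOVATEC",
]
first_words, last_words = word_indexes(names)

# On a real data set, the most frequent last words would be inspected first.
candidates = last_words.most_common(10)
```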

Firstly, the occurrences of “CORPORATION”, “COMPANY”, “KG” and “GESELLSCHAFT” at the end of names, identified but not always completely removed in the previous step (legal form indication treatment), were analyzed by manually scanning for all spelling variations.

Table 19 contains all spelling variations of all common company words that can be deleted if they appear at the end of a name.

Table 19: Common company words to be removed at the end of a name

KEYWORD          NBR
"CORPORATION"    23,134
"CORP"           102
