06.06.2013 Views

Patentee Name Harmonisation - ecoom.be


The following characters were identified for removal if they appear at the end of a name: “,”; “;”; “:”; “-”.

Implementation

Firstly, all occurrences of “ DITE:”, “ DITE,” and “DITE :” at the end of names are removed by executing an update query on the data.

Next, all occurrences of “,”; “;”; “:”; “-” at the end of a name are removed by executing an update query on the data.

Because removing irrelevant characters at the end of a name can leave trailing spaces, names have to be checked for and trimmed of trailing spaces after each removal.

Removing characters at the end of a name can also expose a new irregular name ending, so this step has to be executed repeatedly until no further irregularities are found.
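The repeat-until-stable procedure described above can be sketched in Python as follows. This is only an illustration, not the update queries actually run on the data: the function name strip_trailing_irregularities and the sample names are hypothetical, and the sketch assumes that the “DITE” suffixes and the single trailing characters are handled in the same loop.

```python
# Hypothetical sketch; the report itself uses update queries on a database.
DITE_SUFFIXES = (" DITE:", " DITE,", "DITE :")  # suffixes as listed in the text
TRAILING_CHARS = (",", ";", ":", "-")           # characters as listed in the text


def strip_trailing_irregularities(name: str) -> str:
    """Remove irregular endings and trailing spaces until the name is stable."""
    while True:
        cleaned = name.rstrip(" ")
        for suffix in DITE_SUFFIXES:
            if cleaned.endswith(suffix):
                cleaned = cleaned[: -len(suffix)]
                break
        else:
            # No "DITE" suffix found: drop one irrelevant trailing character.
            if cleaned.endswith(TRAILING_CHARS):
                cleaned = cleaned[:-1]
        cleaned = cleaned.rstrip(" ")
        if cleaned == name:
            # Nothing changed in this pass, so no irregularity is left.
            return cleaned
        name = cleaned


print(strip_trailing_irregularities("ACME CORP ,;"))
print(strip_trailing_irregularities("DUPONT DITE:"))
```

Each pass removes at most one irregular ending and then trims spaces, so a name such as “ACME CORP ,;” needs several passes before it stops changing, which mirrors the repeated execution described in the text.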

Result

Irrelevant non-alphanumerical characters at the end of a name have been removed in 1,498 names.

Impact

The number of unique names drops from 438,052 to 437,689, an additional reduction of 363 names, bringing the total reduction to 6,033 names (1.4%).

1.2.6 Replace comma irregularities

Description

A comma should be followed by a space and should not be preceded by a space. In most cases, a comma that is not followed by a space, or that is preceded by one, indicates an irregularity.

Analysis

Firstly, comma irregularities based on commas not followed by a space are identified by querying the data for names having the pattern “%,[! ]%”.

A total of 624 names were identified as having a comma not followed by a space.
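As an illustration of what this query selects, the pattern can be approximated with a regular expression in Python. The sketch assumes that “[! ]” stands for “any single character other than a space”, so that the pattern matches a comma immediately followed by a non-space character; the function name and the sample names are hypothetical and do not come from the data.

```python
import re

# Assumed reading of the pattern "%,[! ]%": a comma directly followed by
# one character that is not a space.
COMMA_NOT_FOLLOWED_BY_SPACE = re.compile(r",[^ ]")


def has_comma_without_space(name: str) -> bool:
    """Return True if the name contains a comma not followed by a space."""
    return COMMA_NOT_FOLLOWED_BY_SPACE.search(name) is not None


# Hypothetical sample names, for illustration only.
names = ["ACME CO.,LTD.", "ACME CO., LTD.", "SMITH, JOHN", "WIDGETS 1,5 GMBH"]
flagged = [n for n in names if has_comma_without_space(n)]
print(flagged)
```

Under this reading, a comma at the very end of a name is not selected, because the pattern requires a character after the comma.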

A pattern-based, case-by-case approach is used to clean these irregularities instead of blindly adding a space after every comma that lacks one. A fully automated approach is dangerous because a comma might be a decimal separator or a thousands separator, or might be used as an abbreviation marker in place of a dot.

Table 10 lists the most frequently occurring patterns containing a comma not followed by a space that were found in the names.

Table 10: Patterns with comma not followed by space

PATTERN            REPLACE WITH
“% CO.,LTD.%”      “ CO., LTD.”
“% CO.,LTD%”       “ CO., LTD”
“% CO,. LTD.%”     “ CO., LTD.”
“% CO.,INC.%”      “ CO., INC.”
“%,LTD.%”          “, LTD.”
“%,LTD”            “, LTD”
“%,INC.%”          “, INC.”
“%,INC”            “, INC”
“%,LLC.%”          “, LLC.”
“%,LLC”            “, LLC”
“%,L.L.C.%”        “, L.L.C.”
“%,S.A.R.L.%”      “, S.A.R.L.”
“%,S.A.%”          “, S.A.”
“% CO,LTD”         “ CO, LTD”
“% CO,KG.%”        “ CO, KG.”
“% CO.,KG”         “ CO., KG”
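One possible way to apply the patterns of Table 10 is sketched below in Python. It is an illustration under stated assumptions, not the procedure used on the data: patterns ending in “%” are taken to match anywhere in a name, patterns without a trailing “%” are taken to match only at the end of a name, and the rules are applied in table order. The names RULES and fix_comma_patterns and the sample names are hypothetical.

```python
# (search text, replacement, end_only) transcribed from Table 10, in table order.
# end_only=True is an assumed reading of patterns without a trailing "%".
RULES = [
    (" CO.,LTD.", " CO., LTD.", False),
    (" CO.,LTD", " CO., LTD", False),
    (" CO,. LTD.", " CO., LTD.", False),
    (" CO.,INC.", " CO., INC.", False),
    (",LTD.", ", LTD.", False),
    (",LTD", ", LTD", True),
    (",INC.", ", INC.", False),
    (",INC", ", INC", True),
    (",LLC.", ", LLC.", False),
    (",LLC", ", LLC", True),
    (",L.L.C.", ", L.L.C.", False),
    (",S.A.R.L.", ", S.A.R.L.", False),
    (",S.A.", ", S.A.", False),
    (" CO,LTD", " CO, LTD", True),
    (" CO,KG.", " CO, KG.", False),
    (" CO.,KG", " CO., KG", True),
]


def fix_comma_patterns(name: str) -> str:
    """Apply the Table 10 replacements to one name, in table order."""
    for search, replacement, end_only in RULES:
        if end_only:
            # Pattern without trailing "%": only replace at the end of the name.
            if name.endswith(search):
                name = name[: -len(search)] + replacement
        else:
            name = name.replace(search, replacement)
    return name


print(fix_comma_patterns("ACME CO.,LTD."))
print(fix_comma_patterns("WIDGETS 1,5 GMBH"))
```

Because only the listed patterns are touched, a name such as “WIDGETS 1,5 GMBH”, where the comma is presumably a decimal separator, is left unchanged. Once a rule has inserted the space, the later and more general rules no longer match the same comma.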
