06.06.2013

Patentee Name Harmonisation - ecoom.be

Table of contents

1 Introduction

2 Patentee name harmonization and legal entity harmonization

3 Existing name-harmonization approaches

3.1 USPTO CONAME assignee name harmonization

3.2 DERWENT WPI company name harmonization

4 A content-driven name harmonization approach focusing on accuracy

4.1 Data pre-processing

4.2 Name cleaning

5 Results and Impact

6 Directions for further development

6.1 Approximate string searching

6.2 Automatic acronym generation

6.3 Introducing address information (in conjunction with name similarity)

7 Conclusion

8 References

APPENDIX 1: STEP-BY-STEP METHODOLOGY AND APPLICATION USING EPO AND USPTO PATENTEE NAMES

1 Data pre-processing

1.1 Character cleaning

1.2 Punctuation cleaning (pre-parsing)

2 Name cleaning

2.1 Legal form indication treatment

2.2 Common company word removal

2.3 Spelling variation harmonization

2.4 Condensing

2.5 Umlaut harmonization

2.6 Cleaned name

3 Harmonization results

3.1 Original names matched to harmonized names

3.2 Additional patents assigned to harmonized names

3.3 Patent distribution amongst patentees

3.4 Patent ranking of patentees

APPENDIX 2: ALL SEARCH AND REPLACE STATEMENTS FOR ALL LEGAL FORMS TO BE REMOVED AT THE END OF A NAME

APPENDIX 3: TOP 200 OCCURRING LAST WORDS

APPENDIX 4: TOP 200 OCCURRING FIRST WORDS

APPENDIX 5: TOP 200 PATENTEES BEFORE NAME CLEANING AND HARMONIZATION

APPENDIX 6: TOP 200 PATENTEES AFTER NAME CLEANING AND HARMONIZATION

APPENDIX 7: VALIDATION EXERCISE ON 35 HARMONIZED NAMES
