Patentee Name Harmonisation - ecoom.be
Table of contents
1 Introduction
2 Patentee name harmonization and legal entity harmonization
3 Existing name-harmonization approaches
3.1 USPTO CONAME assignee name harmonization
3.2 DERWENT WPI company name harmonization
4 A content-driven name harmonization approach focusing on accuracy
4.1 Data pre-processing
4.2 Name cleaning
5 Results and Impact
6 Directions for further development
6.1 Approximate string searching
6.2 Automatic acronym generation
6.3 Introducing address information (in conjunction with name similarity)
7 CONCLUSION
8 References
APPENDIX 1: STEP-BY-STEP METHODOLOGY AND APPLICATION USING EPO AND USPTO PATENTEE NAMES
1 Data pre-processing
1.1 Character cleaning
1.2 Punctuation cleaning (pre-parsing)
2 Name cleaning
2.1 Legal form indication treatment
2.2 Common company word removal
2.3 Spelling variation harmonization
2.4 Condensing
2.5 Umlaut harmonization
2.6 Cleaned name
3 Harmonization results
3.1 Original names matched to harmonized names
3.2 Additional patents assigned to harmonized names
3.3 Patent distribution amongst patentees
3.4 Patent ranking of patentees
APPENDIX 2: ALL SEARCH AND REPLACE STATEMENTS FOR ALL LEGAL FORMS TO BE REMOVED AT THE END OF A NAME
APPENDIX 3: TOP 200 OCCURRING LAST WORDS
APPENDIX 4: TOP 200 OCCURRING FIRST WORDS
APPENDIX 5: TOP 200 PATENTEES BEFORE NAME CLEANING AND HARMONIZATION
APPENDIX 6: TOP 200 PATENTEES AFTER NAME CLEANING AND HARMONIZATION
APPENDIX 7: VALIDATION EXERCISE ON 35 HARMONIZED NAMES