27.12.2013 Views

slides - Microsoft Research

slides - Microsoft Research

slides - Microsoft Research

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

Lack of good broad overview<br />

Question: Which algorithms work well on what types of<br />

datasets?<br />

6<br />

An outsider's perspective<br />

Difficult to understand the problems:<br />

classification<br />

regression<br />

ranking<br />

structured prediction<br />

statistical relational learning<br />

...<br />

Difficult to find reliable solutions:<br />

logistic regression<br />

kernel methods<br />

topic models<br />

conditional random fields<br />

hidden Markov models<br />

...<br />

7<br />

Outline<br />

MLcomp: code, data, comparison<br />

MLcomp: code, data, and comparison<br />

CodaLab: complex workflows<br />

8<br />

9<br />

People with programs:<br />

A meeting place<br />

Programs: SVMlight<br />

Components<br />

C implementation of support vector machines for<br />

classification by Thorsten Joachims.<br />

How well does my method work compared to others?<br />

People with datasets:<br />

What is the best method for my problem?<br />

Datasets:<br />

Runs:<br />

thyroid<br />

Task is to predict whether a patient has thyroid disease<br />

given attributes (age, gender, I131 treatment, etc.)<br />

Program : SVMlight<br />

Dataset : thyroid<br />

Error : 2.6%<br />

Time : 1 second<br />

10<br />

11

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!