22.01.2013 Views

Automated Marketing Research Using Online Customer Reviews


[6] Constrained Logic Programming

To align words with their corresponding attribute dimensions, we frame the task as a mathematical assignment problem and solve it using a bounds consistency approach. We define the assignment using the maximal clique that corresponds to the schema for each product attribute table (see Figure WA1.1). In the bounds consistency approach, we invert the constraints (tok_exclusion) to express the complementary set of candidate assignments (tok_candidates) for each attribute dimension. If the phrase constraints, taken together, are internally consistent, then the candidate assignments (tok_assign) for a given token are simply the intersection of all candidate assignments as defined by all phrases in the cluster containing that token.
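
The inversion and intersection described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `candidate_assignments`, the integer encoding of schema dimensions, and the example tokens are assumptions made for illustration. The sketch takes the dimensions already occupied by assigned tokens in a phrase as excluded, treats the remainder as candidates, and intersects those candidates across all phrases containing the token.

```python
from typing import Dict, List, Set


def candidate_assignments(
    phrases: List[List[str]],
    assigned: Dict[str, int],
    n_dims: int,
) -> Dict[str, Set[int]]:
    """Intersect per-phrase candidate sets for every unassigned token.

    Hypothetical helper: dimensions are encoded as integers 0..n_dims-1.
    """
    all_dims = set(range(n_dims))
    candidates: Dict[str, Set[int]] = {}
    for phrase in phrases:
        # Dimensions already occupied by assigned tokens in this phrase.
        taken = {assigned[t] for t in phrase if t in assigned}
        # Invert the exclusion: whatever is not taken remains a candidate.
        allowed = all_dims - taken
        for tok in phrase:
            if tok in assigned:
                continue
            # Intersect with candidates from phrases seen earlier.
            candidates[tok] = candidates.get(tok, all_dims) & allowed
    return candidates


# Illustrative data (assumed): a three-dimension schema in which
# "gb" and "memory" are already assigned to dimensions 1 and 2.
phrases = [["4", "gb", "memory"], ["4", "gb"], ["internal", "memory"]]
assigned = {"gb": 1, "memory": 2}
result = candidate_assignments(phrases, assigned, n_dims=3)
print(result)
```

In this example the token "4" is left with a single candidate dimension, because the first phrase excludes both dimensions 1 and 2, while "internal" keeps two candidates because only one phrase constrains it.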

We transform the mutual exclusivity constraint represented by each phrase into a set of candidate assignments using the algorithm in Figure WA1.2. Note that we need only propagate the mutual exclusivity of words that are previously unassigned. Accordingly, for each unassigned token in a given phrase, the set of candidate assignments is the intersection of the possible assignments based upon the current phrase and all candidate assignments from earlier phrases containing the same token. We maintain a list of active tokens, boundary_list, to avoid rescanning the set of all tokens every time the possible assignments for a given token are updated.
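
One way to picture the role of boundary_list is as a worklist of tokens whose candidate sets have just collapsed to a single dimension. The sketch below is an assumption-laden illustration and not a reconstruction of Figure WA1.2: the function name `propagate`, the dictionary layout, and the rule that a token becomes "active" once it has exactly one candidate are all choices made here for the example. Every token that appears in a phrase is assumed to have an entry in `candidates`.

```python
from collections import deque
from typing import Dict, List, Set


def propagate(
    phrases: List[List[str]],
    candidates: Dict[str, Set[int]],
) -> Dict[str, Set[int]]:
    """Propagate mutual exclusivity until no candidate set changes."""
    # Work on a copy so the caller's sets are left untouched.
    candidates = {t: set(c) for t, c in candidates.items()}

    # Index each token to the phrases containing it, so that an update
    # touches only its neighbours instead of rescanning all tokens.
    by_token: Dict[str, List[List[str]]] = {}
    for phrase in phrases:
        for tok in phrase:
            by_token.setdefault(tok, []).append(phrase)

    # Active tokens: those already pinned to exactly one dimension.
    boundary_list = deque(t for t, c in candidates.items() if len(c) == 1)
    while boundary_list:
        tok = boundary_list.popleft()
        (dim,) = candidates[tok]
        for phrase in by_token.get(tok, []):
            for other in phrase:
                if other == tok or dim not in candidates[other]:
                    continue
                # Tokens in the same phrase cannot share a dimension.
                candidates[other].discard(dim)
                if not candidates[other]:
                    raise ValueError(f"inconsistent constraints for {other!r}")
                if len(candidates[other]) == 1:
                    boundary_list.append(other)
    return candidates


# Illustrative data (assumed): "gb" and "memory" start out pinned.
phrases = [["4", "gb", "memory"], ["internal", "memory"]]
start = {"4": {0, 1, 2}, "gb": {1}, "memory": {2}, "internal": {0, 1, 2}}
print(propagate(phrases, start))
```

Because candidate sets only shrink, each token enters the worklist at most once, so the loop terminates. An empty candidate set signals that the phrase constraints are not internally consistent, which is reported here as an exception.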

Finally, the K-means clustering used to separate review phrases into distinct product attributes is a noisy process. The clustering can easily result in the inclusion of spurious phrases. Both the initial

process_phrases(p_list)
    schema = find_maximal_clique(p_list)
    order phrases by length
    for each phrase p:
        # initialize data structures
        tok_exclusion – for each tok, mutually exclusive tokens
        tok_candidates – for each tok, valid candidate assignments
        tok_assign – for each tok, the dimension assignment
        # propagate the constraints for each successive phrase
        tok_candidates, tok_exclusion, tok_assign =
            propagate_bounds(phrase, tok_candidates,
                             tok_exclusion, tok_assign, schema)

Figure WA1.1 Logical Assignment<br />
