Posts Tagged ‘Web Mining’

Monitoring brand using discourse analysis

Learning public opinion (or sentiment) about your brand the traditional way is expensive, because surveys and focus groups take a lot of time and human effort. An alternative will probably gain recognition in the near future, as some technology vendors have launched tools for brand monitoring based on text analytics. An initial review of these attempts appeared yesterday on SmartData Collective.

Monitoring a brand using discourse analysis differs, to some extent, from the approach based on text analysis. I have a very fresh example: a tool for monitoring opinion about retail chains (supermarkets). Here are a few words on how it is built and how it works.

Building the Monitor. Analysis of the discourse in a corpus of Internet discussions about supermarkets yielded a collection of subjects that interest the interlocutors, and a collection of expressions of their attitudes. Using these results, complex semantic search queries were built for the learning research. This is the crucial stage: we have to learn the discourse in detail and, at the same time, work out its mathematics, as the basis for justifying and calibrating the Monitor. The final task is relatively easy: implement the results and build a “machine” using accessible technology.

How the Monitor works. The data for each retail brand are collected using semantic search. The Monitor performs all the calculations according to the calibration formulas and provides figures ready for presentation. Please see the pictures, which were made for presentation only (not the production version).

First is the “profile”: how the brand is perceived, i.e. how it stands out against the average Internet discourse. Results of this kind are often astonishing, because the picture differs dramatically from what the Customer (the user of the Monitor) wishes, from the official image, and from the marketing buzz. Moreover, the interlocutors’ categories (on the vertical axis of the charts) also differ.
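To make the idea of a profile more tangible, here is a minimal sketch in Python. It is not the Monitor's calibrated formula; it simply assumes that every collected mention has already been assigned to one category, and compares the share of each category in the brand's mentions with its share in the whole discourse. The function name, the category labels and the sample data are all hypothetical.

```python
from collections import Counter

def profile(brand_categories, all_categories):
    """Ratio of category share for the brand vs. the whole discourse.

    A value above 1.0 means the category is over-represented in talk
    about the brand; below 1.0 means it is under-represented.
    (Illustrative measure only, not the Monitor's calibration.)
    """
    brand = Counter(brand_categories)
    overall = Counter(all_categories)
    n_brand = sum(brand.values())
    n_all = sum(overall.values())
    result = {}
    for category, count in overall.items():
        baseline_share = count / n_all
        brand_share = brand.get(category, 0) / n_brand if n_brand else 0.0
        result[category] = brand_share / baseline_share
    return result

# Hypothetical data: one category label per mention.
all_mentions = ["prices"] * 3 + ["queues"] * 3 + ["staff"] * 3 + ["quality"] * 3
brand_mentions = ["prices", "prices", "queues", "staff"]

brand_profile = profile(brand_mentions, all_mentions)
for category, ratio in sorted(brand_profile.items()):
    print(f"{category}: {ratio:.2f}")
```

In this toy data the brand is talked about in terms of prices twice as often as the average discourse, and not at all in terms of quality.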

Then there is a comparison of the monitored brands. The charts show how people rate each brand with regard to the same categories. Two charts of negative opinions (general index only) are presented as an example.

The third important group of results concerns monitoring itself, i.e. the presentation of changes. It depends on the Customer’s needs. Some customers want to observe the effects of promotional campaigns, and for that purpose day-to-day monitoring is appropriate. Others want to know the general trends, and so on.
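The two kinds of presentation mentioned above can be sketched very simply, assuming the Monitor delivers one index value per day (the series below is invented for illustration). Day-to-day differences suit campaign tracking, while a moving average is one common way to show a general trend.

```python
def daily_changes(values):
    """Difference between each day's index and the previous day's."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

def moving_average(values, window):
    """Simple moving average; the result is shorter by window - 1 items."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

# Hypothetical daily values of a negative-opinion index for one brand.
index_by_day = [10, 12, 11, 15]

print(daily_changes(index_by_day))      # day-to-day view
print(moving_average(index_by_day, 2))  # smoothed trend view
```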

Streamlining web mining

Last Sunday I submitted my comment to the people vs machine debate in Research Magazine. Some readers of that comment asked me how I get 97% accuracy in measuring sentiment changes in Web Mining.

Web text analytics is a rather new field of research, and everybody uses their own approach. So I would only advise: don’t be too quick. If you collect millions of records and focus on thousands of specific sentiment-rich expressions, look at the data first. Compute some basic descriptive statistics (yes!), make some charts of the frequency distributions, etc. Try to find a proper way of stratifying the data, using your best proven approaches and tools. Don’t skip this basic examination; I write this because I see many newcomers to the analytics business who want to cut corners.
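As a rough illustration of this first look at the data, the sketch below counts how often each expression occurs in a set of records and prints a few descriptive statistics. It uses only the Python standard library; the records, the expression list and the function name are made up, and plain substring matching stands in for whatever matching your own tools provide.

```python
from collections import Counter
import statistics

def expression_frequencies(records, expressions):
    """Count occurrences of each expression across all records.

    Matching is a naive case-insensitive substring count, which is an
    assumption made for this sketch only.
    """
    counts = Counter()
    for text in records:
        lowered = text.lower()
        for expression in expressions:
            counts[expression] += lowered.count(expression)
    return counts

# Hypothetical sample of collected records and sentiment-rich expressions.
records = [
    "Great prices but long queues",
    "Long queues again",
    "Rude staff, long queues",
]
expressions = ["long queues", "rude staff", "great prices", "fresh bread"]

counts = expression_frequencies(records, expressions)
values = list(counts.values())

print(counts.most_common())                  # the frequency distribution
print("mean:", statistics.mean(values))
print("median:", statistics.median(values))
print("max:", max(values), "min:", min(values))
```

Even on a toy sample the distribution is skewed: one expression dominates and one never occurs, which is exactly the kind of thing worth seeing before going further.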

If you find a good way of stratifying the data, you will undoubtedly notice that some expressions occur most frequently in one, two or three specific contexts or subject domains. Follow this clue, and limit further research to these expressions. This is the first step towards discourse mining (not simply text mining).
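One possible way to follow this clue is sketched below. It assumes each occurrence of an expression has been labelled with its context (or subject domain), and keeps only the expressions whose occurrences are concentrated in their few most frequent contexts. The threshold, the `top_k` parameter and the sample data are my illustrative choices, not fixed rules.

```python
from collections import Counter, defaultdict

def concentrated_expressions(observations, top_k=3, min_share=0.8):
    """Select expressions concentrated in at most top_k contexts.

    observations: iterable of (expression, context) pairs, one per occurrence.
    Returns {expression: share of its occurrences in its top_k contexts}
    for expressions whose share is at least min_share.
    """
    contexts_by_expression = defaultdict(Counter)
    for expression, context in observations:
        contexts_by_expression[expression][context] += 1

    selected = {}
    for expression, context_counts in contexts_by_expression.items():
        total = sum(context_counts.values())
        in_top = sum(n for _, n in context_counts.most_common(top_k))
        share = in_top / total
        if share >= min_share:
            selected[expression] = share
    return selected

# Hypothetical labelled occurrences.
observations = (
    [("rip-off", "prices")] * 4
    + [("rip-off", "staff")]
    + [("fine", c) for c in ["prices", "staff", "queues", "quality", "parking"]]
)

print(concentrated_expressions(observations, top_k=1, min_share=0.8))
```

Here "rip-off" is kept because most of its occurrences fall in a single context, while "fine" is spread evenly over five contexts and is dropped.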

The next steps are obvious. Look for relations between the various characteristics of the contexts, the subject domains, and these “good” expressions. Use clustering to select the subject domains and texts you need. Then make the selection from your corpus of texts.
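Any clustering method you trust will do here. Just to show the shape of the step, the sketch below uses a deliberately simple stand-in: a greedy single-pass grouping of texts by the Jaccard similarity of the “good” expressions they contain. The text identifiers, expression sets and threshold are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def greedy_clusters(texts, threshold=0.5):
    """Group texts whose expression sets resemble a cluster's first member.

    texts: dict mapping text id -> set of expressions found in that text.
    Each text joins the first cluster whose representative (the first
    text placed in it) is similar enough; otherwise it starts a new one.
    """
    clusters = []  # list of (representative expression set, member ids)
    for text_id, expressions in texts.items():
        for representative, members in clusters:
            if jaccard(representative, expressions) >= threshold:
                members.append(text_id)
                break
        else:
            clusters.append((set(expressions), [text_id]))
    return [members for _, members in clusters]

# Hypothetical texts, each reduced to the "good" expressions it contains.
texts = {
    "t1": {"long queues", "slow checkout"},
    "t2": {"long queues", "slow checkout", "rude staff"},
    "t3": {"fresh bread", "good prices"},
}

print(greedy_clusters(texts))
```

Once the groups are visible, selecting the part of the corpus you need is a matter of keeping the texts in the clusters that match your subject.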

There are a lot of tools for extracting rich and accurate information from data selected in this way.

Limiting the scope of study is the first and most basic way to streamline any research process. It is also a basic step used in Industrial Engineering to streamline any manufacturing or business process.