Friday, April 25, 2008

user modeling idea in Jelinek, Statistical Methods for Speech Recognition

There was a short idea about user modeling in Jelinek 2001 that I wanted to make a note of.

"A vocabulary is particular to a speaker and his task in at least four ways..."
1) habits of expression (related to things like level of education)
2) domain of discourse
3) current interests
4) current document/subject

Monday, April 21, 2008

Fuzzy logic progress

Before meeting w/ Prof. Mendel about my fuzzy logic project, I wanted to collect my thoughts here so that our meeting would be more productive.

First, I'll go over the methodology that I used. A web survey was used to collect data from a fair number of people (32) across 2 different surveys. The stimuli for the first survey were reported in the class final project [ Prof. Mendel's class project ]. It used 7 emotion words as stimuli (angry, disgusted, fearful, happy, neutral, sad, surprised) and also 3 modifiers (very, sort of, not). The second experiment [ interspeech 2008 submission ] used 40 words from the mood labels of blogs (from the site livejournal.com, to be precise). In the Interspeech paper submission, I combined these two experiments to make a computing-with-words type application where the words from the first experiment (excluding the modifiers) were used as inputs and the words from the second experiment were the outputs. The results agreed pretty well with intuition and with a small evaluation. However, using the averaged midpoints of the type-2 fuzzy sets to find the Euclidean distance gave about the same performance.
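To make the comparison at the end of that paragraph concrete, here is a rough sketch of the two measures side by side. It is not the code from the project: it assumes each word is an interval type-2 fuzzy set described by an upper and a lower trapezoidal membership function, it uses one common form of the Jaccard similarity for such sets (which may differ from the exact formula in the paper), and the helper names and all the numbers are made up for illustration.

```python
import math

def trapezoid(x, a, b, c, d):
    """Membership grade of x in a trapezoid with feet a, d and shoulders b, c."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def make_it2(upper, lower):
    """Hypothetical helper: an interval type-2 set as x -> (upper grade, lower grade)."""
    return lambda x: (trapezoid(x, *upper), trapezoid(x, *lower))

def jaccard_it2(set_a, set_b, grid):
    """Jaccard similarity: summed min grades over summed max grades (upper and lower)."""
    num = den = 0.0
    for x in grid:
        ua, la = set_a(x)
        ub, lb = set_b(x)
        num += min(ua, ub) + min(la, lb)
        den += max(ua, ub) + max(la, lb)
    return num / den if den else 0.0

def midpoint_distance(intervals_a, intervals_b):
    """Euclidean distance between the midpoints of per-dimension intervals."""
    mids_a = [(lo + hi) / 2 for lo, hi in intervals_a]
    mids_b = [(lo + hi) / 2 for lo, hi in intervals_b]
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(mids_a, mids_b)))

# Made-up sets on a 0-10 scale for a single dimension (say, valence).
grid = [i / 10 for i in range(101)]
happy = make_it2(upper=(6, 7.5, 9, 10), lower=(7, 8, 8.5, 9.5))
glad = make_it2(upper=(5, 7, 8.5, 10), lower=(6.5, 7.5, 8, 9))
sad = make_it2(upper=(0, 1, 2.5, 4), lower=(0.5, 1.5, 2, 3))

print(jaccard_it2(happy, glad, grid))  # overlapping sets: strictly between 0 and 1
print(jaccard_it2(happy, sad, grid))   # disjoint sets
print(midpoint_distance([(7, 9)], [(1, 3)]))
```

The midpoint version throws away the width and shape of each set, so if both measures rank words about the same, the extra uncertainty information is not changing the ranking much on this data.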


Some points that I want to ask him about are:

- is the 3-D approach (valence, activation, and dominance) and the combination methods (sum, product) valid?

- are my conclusions correct: is the fact that Euclidean distance is comparable to the fuzzy/Jaccard distance metric evidence that we don't gain much by moving to type-2 fuzzy logic?
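For the first question, this is one reading of the sum and product combination methods, written out as a sketch so the difference is easy to see. It assumes each candidate output word already has one similarity score per dimension (valence, activation, dominance); the function names, words, and scores are invented for the example.

```python
def combine_sum(scores):
    """Add the per-dimension similarities."""
    return sum(scores)

def combine_product(scores):
    """Multiply the per-dimension similarities."""
    result = 1.0
    for s in scores:
        result *= s
    return result

def best_word(candidates, combine):
    """Return the candidate word whose combined score is highest."""
    return max(candidates, key=lambda word: combine(candidates[word]))

# Made-up (valence, activation, dominance) similarities to some input word.
candidates = {
    "cheerful": (0.9, 0.8, 0.1),
    "content": (0.6, 0.5, 0.5),
}

print(best_word(candidates, combine_sum))
print(best_word(candidates, combine_product))
```

With these numbers the two methods disagree: the sum lets two strong dimensions outweigh one weak one, while the product punishes any single poor match. That matters if one dimension is noisier than the others.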


Some other ideas:

- use less data, or at least be more person-specific. This would allow me to see more interpersonal differences, which goes well with my user modeling interests. Also, I think it's clear that the data was a bit noisy, especially in the dominance dimension.

user modeling for emotions in text idea

========================================
user modeling for emotions in text idea
========================================

:author: Abe Kazemzadeh
:date: 2008-04-21

This weekend I had an idea about how to implement a user model of emotions in text. Using the LiveJournal blog data [Mishne Dissertation], I can try to make a mixture language model, as described in [Stolcke Dissertation]. The user model will be the mixture priors (am I using the terminology correctly?). This will give me a good chance to move away from the tf-idf approach and into the language modeling framework, which I haven't explored enough.
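A toy sketch of the idea, to fix the terminology for myself. It assumes one add-one-smoothed unigram model per mood, held fixed, and treats the user model as the per-user mixture weights (what I called mixture priors above), estimated with a few EM iterations. This is not the formulation from the dissertation and not what SRILM does; the function names and the tiny texts are made up.

```python
from collections import Counter

def train_unigram(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram model over a shared vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def estimate_weights(user_tokens, components, iterations=50):
    """EM for the mixture weights of one user, with the component models fixed."""
    k = len(components)
    weights = [1.0 / k] * k
    for _ in range(iterations):
        expected = [0.0] * k
        for w in user_tokens:
            # E-step: posterior probability that each component generated w.
            joint = [weights[i] * components[i][w] for i in range(k)]
            norm = sum(joint)
            for i in range(k):
                expected[i] += joint[i] / norm
        # M-step: new weights are the average posteriors.
        weights = [e / len(user_tokens) for e in expected]
    return weights

def mixture_prob(word, components, weights):
    """Probability of a word under the user's mixture model."""
    return sum(lam * model[word] for lam, model in zip(weights, components))

# Made-up stand-ins for mood-labeled blog text and one user's posts.
happy_text = "great fun party great smile".split()
sad_text = "tired alone rain tired cry".split()
user_text = "great party fun rain".split()

vocab = set(happy_text) | set(sad_text) | set(user_text)
components = [train_unigram(happy_text, vocab), train_unigram(sad_text, vocab)]
weights = estimate_weights(user_text, components)

print(weights)  # [weight on happy model, weight on sad model]
print(mixture_prob("great", components, weights))
```

The component models are shared across users and only the weights are person-specific, so a user model is just a short vector that can be estimated from very little text.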

TODO: make a simple example of the mixture language model, play around with the SRILM toolkit, apply to data [LiveJournal, IEMOCAP]