I got several responses asking if I am planning to make my question list
public. I think I should answer this question to the whole list.
I am willing to make it public but I am not sure if I should do it right
now. Here are the reasons:
1. Some questions ask for information about specific people, not only
celebrities, but probably also the questioners themselves or people very
close to them. This may raise some privacy issues. I would prefer to
remove these questions before making the question list public.
2. Some questions, and not a small number of them, contain offensive
language. I think these questions are inappropriate for a corpus.
3. Many questions are not grammatically correct or contain spelling errors.
I personally think this is OK because the questions are from the real world.
I don't know what other researchers think about this.
4. Different researchers may have different expectations. For example, the
original poster of this thread asked for why- and how-questions, while other
people have asked about statistical information on specific phrase groups. I
would like to know whether most or many researchers share some common
requirements.
5. After I clean up the question archive and make it public, I am
thinking of updating the public question corpus from time to time. This
will take more effort, and I am not sure I have enough energy to do it. I
hope someone is willing to join me.
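
As a rough illustration of points 2 and 4, the sketch below drops
questions containing blocklisted words and tags the remainder as why-,
how-, or other questions. It is only a sketch under stated assumptions:
the blocklist entries, the function names, and the sample questions are
hypothetical placeholders and not part of AnswerBus. The privacy screening
in point 1 is left out because it would likely need manual review.

```python
import re
from collections import Counter

# Hypothetical placeholder blocklist; a real one would be far longer.
BLOCKLIST = {"darn", "heck"}


def tokens(question):
    """Lowercase a question and split it into simple word tokens."""
    return re.findall(r"[a-z']+", question.lower())


def is_clean(question, blocklist=BLOCKLIST):
    """Return True if no token of the question is in the blocklist."""
    return not any(tok in blocklist for tok in tokens(question))


def question_type(question):
    """Classify a question by its first word: 'why', 'how', or 'other'."""
    toks = tokens(question)
    if toks and toks[0] in {"why", "how"}:
        return toks[0]
    return "other"


def filter_corpus(questions, blocklist=BLOCKLIST):
    """Keep only questions that pass the blocklist check, in order."""
    return [q for q in questions if is_clean(q, blocklist)]


if __name__ == "__main__":
    # Made-up sample questions, for illustration only.
    sample = [
        "Why is the sky blue?",
        "How do planes fly?",
        "What the heck is this?",
        "Who wrote Hamlet?",
    ]
    kept = filter_corpus(sample)
    print(kept)
    print(Counter(question_type(q) for q in kept))
```

Misspelled or ungrammatical questions (point 3) pass through unchanged,
which matches the idea of keeping the data as real users typed it.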
I look forward to your input. In particular, if you are willing to help
build the corpus, I would be happy to work with you.
On Wed, 20 Nov 2002, ZHIPING ZHENG wrote:
> Dear Tian-Zuo and others,
> I have a big corpus which contains over 40K unique questions collected
> from real world users by my AnswerBus Question Answering System
> (http://www.answerbus.com/). I am willing to do some research based on the
> data together with other people who have the same interest.
> On Wed, 20 Nov 2002, tzshen wrote:
> > Dear all,
> > I am doing some work to find answer patterns
> > to help automatically answer complex questions, that is, questions which ask for a complex answer.
> > I am first focusing on why-questions and how-questions.
> > So I am eager to find corpora that contain a large number of these two types of questions and their corresponding answers.
> > Does anyone know where I can find this kind of corpus or related resources?
> > Resources about other complex questions and answers beyond why- and how-questions are also welcome.
> > THANK YOU ALL VERY MUCH.
> > Tian-Zuo, Shen
This archive was generated by hypermail 2b29 : Fri Nov 22 2002 - 00:08:31 MET