[Kim-discussion] [Interested-in-kim] problem of populating instances from my own corpus

Philip Alexiev philip.alexiev at ontotext.com
Fri Jun 4 07:02:03 EDT 2010


Hello Fangkai,

Could you send us some of your txt files that you are sure are not 
annotated? This could help us a lot in solving the problem.

Thanks,
Philip

On 06/03/2010 08:00 PM, Yang Fangkai wrote:
> Hi Anton,
>
>          I tried HTML files, and population works for them. It just
> doesn't work for txt files...
>
>         I checked the populator.xml and found the following configuration:
>
>         <INPUT_DOC_EXT>doc,htm,html,txt,page,xml</INPUT_DOC_EXT>
>
>         I suspect the populator is already configured to process
> txt files. So where is the problem? Thank you!
>
> Fangkai
>
> 2010/6/3 Yang Fangkai<wolfgang.yang at gmail.com>:
>    
>> Anton,
>>
>> On Thu, Jun 3, 2010 at 10:39 AM, Anton Andreev
>> <Anton.Andreev at ontotext.com>  wrote:
>>      
>>> Hello Fangkai,
>>>
>>> First I would like to point out that the kim-discussion list:
>>> http://ontotext.com/mailman/listinfo/kim-discussion is dedicated to
>>> technical questions like this one. Next time please use the kim-discussion
>>> mailing list, not this one. Thanks.
>>>
>>>        
>> Sorry for the mistake. I will use that list next time.
>>
>>      
>>> Now back to your problem:
>>> What version of KIM do you use? KIM 2.4?
>>>
>>>        
>> Yes. I am using KIM 2.4 under Windows XP.
>>
>>      
>>> Are you using the KIMGate hybrid (GATE Developer with KIM's default
>>> pipeline) or the tool called "populator", also from the bin folder?
>>>        
>> I started KIM by running startkim.bat, and the populator by running
>> toolPopulate.cmd in the tool folder. I didn't see the tool "populator" in
>> the bin folder.
>>
>>      
>>> The latter
>>> only needs a document source folder and uses an already running KIM
>>> instance. Do you see that the documents are being annotated? What results do
>>> you expect, what is missing?
>>>
>>>        
>> Here is what I expect. I have a corpus containing about 2000 docs, and
>> I want to query over these docs. So I plan to use toolPopulate to
>> extract entities from these docs (this is what I am trying to do), and
>> then query over them. I expect to see the entities populated from
>> these docs, but I don't see any meaningful entities when I query for
>> entities in the KIM GUI.
>>
>> I don't know if the above makes sense. Thank you!
>>
>> Fangkai
>>
>>
>>      
>>> The steps you are doing are correct in general.
>>>
>>> Best regards,
>>> Anton Andreev
>>>
>>> --
>>> Anton Andreev
>>> Account Manager
>>> Ontotext AD
>>> Tel: +359 2 875 81 17
>>> Fax:+359 2 975 32 26
>>> email: anton.andreev at ontotext.com
>>> www.ontotext.com
>>>
>>>
>>>
>>> On 3.6.2010, 18:17, KIM Platform info newsletter wrote:
>>>        
>>>> Dear List,
>>>>
>>>>           I am trying to use the Populate GUI to populate entities from my
>>>> own corpus. I downloaded the raw files of the Penn Treebank, i.e., the
>>>> Wall Street Journal articles in plain text form, and pointed the
>>>> Populate GUI at that folder. However, it seems that no entities are
>>>> populated. I tried adding an .xml file with the same name as the text
>>>> file, but it still doesn't work. (I checked this by first deleting all
>>>> files from /context/default/populated, then populating entities from a
>>>> file, and then querying the entities at http://localhost:8080/kim, but
>>>> no meaningful entities were found.) I am wondering whether I missed
>>>> some steps or important configuration. Thank you very much!
>>>>
>>>> Best,
>>>>
>>>> Fangkai
>>>> _______________________________________________
>>>> interested-in-KIM mailing list
>>>> interested-in-KIM at ontotext.com
>>>> http://ontotext.com/mailman/listinfo/interested-in-kim
>>>>
>>>>          
>>>
>>>
>>>        
>>
>>
>> --
>> Fangkai Yang, Ph.D student
>> Taylor Hall 3.150A
>> Department of Computer Sciences
>> The University of Texas at Austin
>> Austin, 78712-0233, Texas
>> USA
>> http://www.cs.utexas.edu/~fkyang
>> email: fkyang at cs.utexas.edu
>>
>>      
>
>
>    


-- 
Philip Alexiev<philip.alexiev at ontotext.com>
Software Engineer
Ontotext AD


