From david.c.harrill at lmco.com Tue Mar 2 09:49:20 2010
From: david.c.harrill at lmco.com (Harrill, David C)
Date: Tue, 02 Mar 2010 09:49:20 -0500
Subject: [Kim-discussion] KIM_Entity_Search
Message-ID: <252DC4CD40D8754E9F4E258E84BCD29753604DC1@HVXMSP2.us.lmco.com>

To whom it may concern,

In working with the KIM tool, I came across the Document Detail screen, which displays both the Features associated with the document and the document content. Within the Features section there are two Features (KeyEntities and KeyPhrases). Are these two features derived from the GATE application, and if so, using what GATE plug-in? Otherwise, how do these entities and phrases get populated on this screen?

I appreciate any information you can provide on this matter and look forward to hearing from you.

Thanks,
Dave

From borislav.popov at ontotext.com Tue Mar 2 10:27:56 2010
From: borislav.popov at ontotext.com (borislav popov)
Date: Tue, 2 Mar 2010 17:27:56 +0200
Subject: [Kim-discussion] KIM_Entity_Search
In-Reply-To: <252DC4CD40D8754E9F4E258E84BCD29753604DC1@HVXMSP2.us.lmco.com>
References: <252DC4CD40D8754E9F4E258E84BCD29753604DC1@HVXMSP2.us.lmco.com>
Message-ID: <4288968B-1D28-4DA5-9E3C-417778F934E3@ontotext.com>

Hi Dave,

both types of key artifacts are part of the default KIM pipeline, i.e. they run in a standard GATE pipeline.

The key phrase extraction was originally developed by Kalina Bontcheva (USFD) and probably others at USFD. We took it some years ago and worked together to extend it. It is now available in GATE - check the available CREOLE plugins and search for Keyphrase. It is in /plugins/Keyphrase_Extraction_Algorithm

The module is based on TF.IDF, where the document frequency in IDF is calculated on a pre-defined corpus during the training of the model. You can limit the size of the model, the number of tokens in a phrase (e.g.
taking only phrases 2 to 3 tokens in length). At runtime you can specify how many keyphrases you'd like to get per doc.

I'm pretty certain, although we've changed it, that you would be able to get similar results easily with what is available in GATE.

The key entity identification components are derived from this one, but they rely on a unique (for the entire corpus) identifier for each entity - in our case, URIs of instances in a knowledge base. Without it you cannot do the stats. I do not think that this functionality is available in GATE, mainly because you do not have this unique ID capability there. With all the ontology extensions that the community has introduced in recent years I might be wrong, so please check with the GATE list.

all the best
borislav

_______________________________________________
Kim-discussion mailing list
Kim-discussion at ontotext.com
http://ontotext.com/mailman/listinfo/kim-discussion
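Borislav's description of the keyphrase module (TF.IDF scoring, document frequencies taken from a pre-defined training corpus, a limit on phrase length, and a cap on the number of keyphrases returned per document) can be illustrated with a small sketch. This is not the code of the GATE plugin: the function names, the whitespace tokenizer and the smoothed IDF formula are assumptions made for illustration only.

```python
import math
from collections import Counter


def phrases(text, min_len=2, max_len=3):
    """Yield every run of min_len..max_len consecutive tokens (naive whitespace tokenizer)."""
    tokens = text.lower().split()
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def train_df(corpus, min_len=2, max_len=3):
    """'Training' step: count in how many corpus documents each phrase occurs."""
    df = Counter()
    for doc in corpus:
        df.update(set(phrases(doc, min_len, max_len)))
    return df, len(corpus)


def top_keyphrases(doc, df, n_docs, k=3, min_len=2, max_len=3):
    """Score the phrases of one document by TF * IDF and return the k best."""
    tf = Counter(phrases(doc, min_len, max_len))

    def score(phrase):
        # Smoothed IDF (an assumption): phrases unseen in training weigh most.
        return tf[phrase] * math.log((1 + n_docs) / (1 + df[phrase]))

    # Sort by descending score, breaking ties alphabetically.
    return sorted(tf, key=lambda p: (-score(p), p))[:k]


corpus = [
    "the semantic web is growing",
    "the semantic web uses ontologies",
    "knowledge bases store entities",
]
df, n_docs = train_df(corpus)
print(top_keyphrases("gate pipelines annotate text and gate pipelines scale", df, n_docs, k=1))
```

The phrase-length limits and the `k` parameter correspond to the two knobs mentioned in the message; limiting the size of the trained model is left out of the sketch.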
From david.c.harrill at lmco.com Tue Mar 2 10:48:58 2010
From: david.c.harrill at lmco.com (Harrill, David C)
Date: Tue, 02 Mar 2010 10:48:58 -0500
Subject: [Kim-discussion] KIM_Entity_Search
Message-ID: <252DC4CD40D8754E9F4E258E84BCD29753604F10@HVXMSP2.us.lmco.com>

Borislav,

Thanks for your quick reply. I wanted to ask an additional question pertaining to the overall KIM application. I have also been working with GATE and have been trying to differentiate the two applications, i.e. what overall capability KIM provides that GATE (with its associated plug-ins) does not. I have been reviewing the vast documentation for both applications and have been unable to come up with a clear-cut difference (excluding the wonderful search capabilities in KIM). Could you provide me with information on why an individual would primarily use GATE over KIM, or vice versa?

Again, thanks for your assistance in regard to this matter.

Dave

From Anton.Andreev at ontotext.com Tue Mar 2 11:56:39 2010
From: Anton.Andreev at ontotext.com (Anton Andreev)
Date: Tue, 02 Mar 2010 18:56:39 +0200
Subject: [Kim-discussion] KIM_Entity_Search
In-Reply-To: <252DC4CD40D8754E9F4E258E84BCD29753604F10@HVXMSP2.us.lmco.com>
References: <252DC4CD40D8754E9F4E258E84BCD29753604F10@HVXMSP2.us.lmco.com>
Message-ID: <4B8D4347.6030606@ontotext.com>

Hello Dave,

Borislav is head of our Semantic Annotation group. He has been with Ontotext for almost 10 years, so he has long experience and expertise with semantic technologies and NLP.

KIM uses a built-in version of GATE. We also add some of our own GATE processing resources, JAPE rules, etc. - we have our own pipeline. This pipeline was built based on experience in some EU research and some commercial projects. That is why annotating with KIM will give different results than annotating with GATE ANNIE.
You may or may not prefer our pipeline, depending on the requirements of your project.

The results of the extraction process are usually modeled in a knowledge base as individual entities and relationships, with respect to a formal conceptual model (ontology). This allows multi-paradigm retrieval on a single semantic index, covering structural, textual and co-occurrence based query and analysis. This is an advantage over GATE, but some projects may not need this functionality. We do that thanks to our OWLIM semantic database: http://ontotext.com/owlim. KIM has a built-in version of OWLIM. OWLIM also lives its own life, as we have a team devoted to it. OWLIM also does inference, which improves the whole picture when you later make queries against your data. I can give you some examples if you wish.

From a developer's point of view, KIM acts as the server version of GATE and has its own Java API. In general it provides its API through Java RMI and web services, and in the next version we will also include JMS. For example, we have a project for a pharmaceutical company that uses KIM from .NET through web services.

We also offer some additional services on top of KIM that you might be interested in.

Hope this helps,
---------------------------------
Anton Andreev
email: anton.andreev at ontotext.com
Account Manager at Ontotext
www.ontotext.com
---------------------------------

From borislav.popov at ontotext.com Tue Mar 2 12:05:52 2010
From: borislav.popov at ontotext.com (borislav popov)
Date: Tue, 2 Mar 2010 19:05:52 +0200
Subject: [Kim-discussion] KIM_Entity_Search
In-Reply-To: <4B8D4347.6030606@ontotext.com>
References: <252DC4CD40D8754E9F4E258E84BCD29753604F10@HVXMSP2.us.lmco.com> <4B8D4347.6030606@ontotext.com>
Message-ID:

> etc - we have our own pipeline. This pipeline was build based on ...

just a clarification: this is only about the default pipeline available with the evaluation version of KIM. In almost all cases our users or customers make or request changes. So it is, like ANNIE, a basis.

b

From borislav.popov at ontotext.com Tue Mar 2 12:41:50 2010
From: borislav.popov at ontotext.com (borislav popov)
Date: Tue, 2 Mar 2010 19:41:50 +0200
Subject: [Kim-discussion] Fwd: KIM_Entity_Search
References: <12756E58-B690-4E96-8986-7C9E0FEDB4A2@ontotext.com>
Message-ID: <1AB40FC8-3DEB-4753-8189-01B6C3EF95BB@ontotext.com>

sorry guys. sent this only to dave. fwding to the list as well
b

Begin forwarded message:

> From: borislav popov
> Date: March 2, 2010 6:19:10 PM GMT+02:00
> To: "Harrill, David C"
> Subject: Re: [Kim-discussion] KIM_Entity_Search
>
> Dave
> you came to an existential question. the simple answer is:
> - if you need pure text analysis and you know what to do with the
> results afterwards - you need only GATE.
> - if you need annotations with respect to some structured data sets
> (knowledge bases, conceptual models, ...) that allow search and
> navigation based on FTS, the structure of the data set, co-occurrence,
> or a combination of those; if you need ways to obtain content
> through RSS feeds or focused crawling; etc. - you need KIM.
>
> So briefly - the text analysis we have by default or produce for
> customers is GATE compliant, and in most cases can be executed
> within a pure GATE environment. GATE Embedded is an integral part of
> KIM for modeling documents, annotations, corpora, etc.
> The rest are content feeding services, semantic annotation, indexing
> & search. The Web UI you know is just an example one - customers often
> choose a completely different route.
> Besides this we provide customizations of everything in a KIM-based
> system - from crawlers and IE pipelines to background knowledge and
> search - and we support our customers as they go on.
>
> for GATE only: there too you might stumble upon stuff that is not
> present in the main package - like parallel processing, annotation
> pattern based search, manual curation infrastructure.
> for these and more we have joint offerings with the GATE group and/
> or other partners, like Matrixware.
>
> So the answer is not that simple, as the dependency is multi-layered.
> There are also several products that are still not public in which
> we heavily cooperate with GATE, but bits and pieces already go into
> custom projects for customers.
>
> i remember a project several years ago where we had to explain how
> in GATE there is a KIM Client calling a KIM Server which is based on
> GATE. phew.
>
> So when you know what you would like to do - we hope this info will
> help you take the right route and not waste time.
>
> all the best
> b
From david.c.harrill at lmco.com Tue Mar 2 13:22:55 2010
From: david.c.harrill at lmco.com (Harrill, David C)
Date: Tue, 02 Mar 2010 13:22:55 -0500
Subject: [Kim-discussion] KIM_Entity_Search
Message-ID: <252DC4CD40D8754E9F4E258E84BCD2975360523B@HVXMSP2.us.lmco.com>

Borislav,

Along the same lines, is there any official documentation (such as a Venn diagram), or can you provide me with a Venn-diagram type of description, that distinguishes which features are in GATE, which are in KIM, and where they overlap? The reason I ask is that KIM seems to include so many characteristics that I am having trouble identifying where GATE ends and where KIM begins.

As an example, you mentioned that the functionality in KIM deals with co-occurrence of specific entities. GATE also provides this capability through its orthographic coreference. However, it seems as if KIM takes advantage of additional ontologies not accessible to GATE.

Please let me know if my question does not make sense. Thanks again for responding quickly; I look forward to hearing from you.

Thanks,
Dave

From david.c.harrill at lmco.com Tue Mar 2 18:15:29 2010
From: david.c.harrill at lmco.com (Harrill, David C)
Date: Tue, 02 Mar 2010 18:15:29 -0500
Subject: [Kim-discussion] VENN_Diagram_Gate_vs._KIM
Message-ID: <252DC4CD40D8754E9F4E258E84BCD29753660B32@HVXMSP2.us.lmco.com>

As an addendum to my last message, would the following be accurate based on what I was inquiring about:

GATE
* corpora comprising sets of documents, grouping documents for the purpose of running uniform processes across them.
* processing resources that manipulate and create annotations on documents
  o Orthomatcher
  o Pronominal Coreference
  o Tokenizer
  o There are many additional resources.
* applications, comprising sequences of processing resources, that can be applied to a document or corpus.

KIM
* Popularity Timelines
* Ontologies (PROTON + KIMSO + KIMLO)
* KIM World KB
* Web UI
* Lucene - an open-source IR engine by Apache

OVERLAP
* scalable and customizable
* ontology-based information extraction (IE)
* annotation
* document management
* Sesame ontology
* OWLIM Ontology

Hopefully I clarified my own question. Are there additional features in any of the categories that I have not listed? As always, thanks for your quick response in regard to this matter.

Dave

From borislav.popov at ontotext.com Thu Mar 4 05:42:56 2010
From: borislav.popov at ontotext.com (borislav popov)
Date: Thu, 4 Mar 2010 12:42:56 +0200
Subject: [Kim-discussion] kim graph knowledge explorer
In-Reply-To:
References:
Message-ID:

Hi Greg,

it depends on what meaning you put behind "ontology-guided entity extraction". There are two major possibilities:
- you put in an ontology and KIM automatically starts extracting instances of the ontology classes: this is not supported, and there is no obvious way to do it with reasonable quality.
- you put in an ontology, usually extending PROTON as a basis, and you reuse or develop extraction pipelines to find entities from the classes of interest: this is exactly what KIM is meant for.

b

On Mar 2, 2010, at 11:33 PM, WILSON, Greg wrote:

> Thanks for the information.
>
> Also, do you have any information related to using KIM for
> ontology-guided entity extraction?
>
> Thanks,
> Greg
>
> From: borislav popov [mailto:borislav.popov at ontotext.com]
> Sent: Wednesday, February 24, 2010 2:52 AM
> To: WILSON, Greg
> Cc: anton.andreev at ontotext.com
> Subject: Re: kim graph knowledge explorer
>
> Hi Greg,
> this was a 3rd party hyperbolic tree applet, from when it was still
> considered cool. At some point we removed it from the
> distribution, as its behavior had some quirks - sometimes
> floating to the sides - and it was also quite unreadable with anything
> beyond a dozen nodes.
> So there is no such functionality by default. Given that such graph
> visualization tools have an API and KIM has an API for exploring the
> entities, such integrations are very natural and some customers
> choose graph visualization tools.
> For small descriptions this is fancy; for bigger ones - say you want
> to see a big part of the KB - it doesn't work at all. We have
> invested time in researching alternatives that utilize space better
> than a graph: 3D graph representations projected on a 2D plane, and
> Voronoi treemaps, but we have nothing to show yet.
> b
>
> On Feb 24, 2010, at 12:48 AM, WILSON, Greg wrote:
>
>> Please see the brief note below.
>> Can you point us to any instructions/documentation related to the
>> Knowledge Explorer?
>> Thanks.
>>
>> From: WANG, Rick (Export Controlled)
>> Sent: Tuesday, February 23, 2010 1:03 PM
>> To: WILSON, Greg; KUPIEC, John; MANZOLILLO, David
>> Subject: RE: kim graph knowledge explorer
>>
>> The Graph.jsp page is still there. However, I don't see any
>> references that point to the page.
>> Rick
>>
>> From: WILSON, Greg
>> Sent: Tuesday, February 23, 2010 10:42 AM
>> To: WANG, Rick (Export Controlled); KUPIEC, John; MANZOLILLO, David
>> Subject: kim graph knowledge explorer
>>
>> This (below) is what I was referring to in the meeting - but I'm
>> not sure if the platform still supports this.
>> If so - it would be very interesting to see it in action.
>> Rick -
>> Thanks again for the demo today.
>> That was very useful and I think there are significant
>> opportunities for SD in this space...
From Anton.Andreev at ontotext.com Thu Mar 4 10:50:48 2010
From: Anton.Andreev at ontotext.com (Anton Andreev)
Date: Thu, 04 Mar 2010 17:50:48 +0200
Subject: [Kim-discussion] VENN_Diagram_Gate_vs._KIM
In-Reply-To: <252DC4CD40D8754E9F4E258E84BCD29753660B32@HVXMSP2.us.lmco.com>
References: <252DC4CD40D8754E9F4E258E84BCD29753660B32@HVXMSP2.us.lmco.com>
Message-ID: <4B8FD6D8.3080802@ontotext.com>

Hello Dave,

Sorry about this late answer, Wednesday was our Independence Day here in Bulgaria.

I would like to add something to the features listed by Borislav:

- We do use our own web crawlers with KIM. They are not part of the standard KIM package, but we can provide them.

- With the RSS functionality that is now in development we would like to export results from queries. Once a user issues a query, he or she will be able to get updates to it in an RSS reader. There will be updates because new documents keep coming in.

- Parallel annotation gives KIM the ability to instantiate a selected number of GATE pipelines on the same machine. This is useful in today's multi-core world. In KIM 3.0 you can set this option to "auto" in the config file and KIM will create as many pipelines as there are CPU cores. In GATE you can do this programmatically.

- KIM provides search capabilities that combine full-text search data (for example from Lucene) and semantic data through OWLIM. KIM is well integrated with OWLIM, as we also produce OWLIM.

Hope this helps,
---------------------------------
Anton Andreev
email: anton.andreev at ontotext.com
Account Manager at Ontotext
www.ontotext.com
---------------------------------

From borislav.popov at ontotext.com Thu Mar 4 11:25:32 2010
From: borislav.popov at ontotext.com (borislav popov)
Date: Thu, 4 Mar 2010 18:25:32 +0200
Subject: [Kim-discussion] VENN_Diagram_Gate_vs._KIM
In-Reply-To: <4B8FD6D8.3080802@ontotext.com>
References: <252DC4CD40D8754E9F4E258E84BCD29753660B32@HVXMSP2.us.lmco.com> <4B8FD6D8.3080802@ontotext.com>
Message-ID: <7F3CC4D1-15AD-4AE0-B9BD-BA40F31C8B7A@ontotext.com>

Hi Dave,

what you summarized is correct. On the other hand, it is incomplete, and it is not easy to complete this list. The reason is that we acquired basically every feature that GATE has, as it is embedded in KIM, and the IDE is even provided along with the normal KIM installation. On the other hand, we contributed to GATE all the ontology related stuff. So it is getting messy.
One of the confusions comes from the fact that until now KIM has shipped as a single piece, although it is made of many parts. This prevents people from seeing the actual modules and how they interact.

But yes, besides crawling, search functionality and extensive semantic annotation capabilities, we can also mention an identity resolution platform for resolving ambiguities of entity identities, based on semantic descriptions and contexts derived from content.

For many solutions we need a customization, as KIM is a platform and cannot serve as an off-the-shelf product for all purposes (just for some).

If you need a more elaborate analysis, we can talk it over together - feel free to contact me on skype: borislav.popov

b
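Anton's note on parallel annotation (a configurable number of pipelines on one machine, with "auto" meaning one pipeline per CPU core) can be sketched roughly as follows. This is only an illustration of the sizing and hand-out logic, not KIM or GATE code: the `Pipeline` class, its `annotate` method and the setting values are hypothetical stand-ins.

```python
import os
import queue
from concurrent.futures import ThreadPoolExecutor


class Pipeline:
    """Hypothetical stand-in for one annotation pipeline instance."""

    def __init__(self, pipeline_id):
        self.pipeline_id = pipeline_id

    def annotate(self, text):
        # Placeholder "annotation": count whitespace-separated tokens.
        return {"pipeline": self.pipeline_id, "tokens": len(text.split())}


def resolve_pipeline_count(setting):
    """Map a config value such as "auto" or "4" to a number of pipelines."""
    if setting == "auto":
        return os.cpu_count() or 1  # cpu_count() may return None
    return max(1, int(setting))


def annotate_all(docs, setting="auto"):
    """Annotate docs with a fixed pool of pipelines; results keep input order."""
    n = resolve_pipeline_count(setting)
    # Each pipeline serves one worker at a time: borrow it from the
    # queue, annotate, then hand it back.
    pool = queue.Queue()
    for i in range(n):
        pool.put(Pipeline(i))

    def work(text):
        pipeline = pool.get()
        try:
            return pipeline.annotate(text)
        finally:
            pool.put(pipeline)

    with ThreadPoolExecutor(max_workers=n) as executor:
        return list(executor.map(work, docs))


print(annotate_all(["KIM embeds GATE", "OWLIM stores the knowledge base"], setting="2"))
```

Threads are used here only to keep the sketch short; in CPython they do not run CPU-bound work on several cores at once, so a real implementation would rely on processes or on a runtime with true multi-threading.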
From david.c.harrill at lmco.com Thu Mar 4 11:45:45 2010
From: david.c.harrill at lmco.com (Harrill, David C)
Date: Thu, 04 Mar 2010 11:45:45 -0500
Subject: [Kim-discussion] VENN_Diagram_Gate_vs._KIM
In-Reply-To: <7F3CC4D1-15AD-4AE0-B9BD-BA40F31C8B7A@ontotext.com>
References: <252DC4CD40D8754E9F4E258E84BCD29753660B32@HVXMSP2.us.lmco.com> <4B8FD6D8.3080802@ontotext.com> <7F3CC4D1-15AD-4AE0-B9BD-BA40F31C8B7A@ontotext.com>
Message-ID: <252DC4CD40D8754E9F4E258E84BCD297536BB892@HVXMSP2.us.lmco.com>

Boris,

That information is awesome. This will be extremely helpful. Thanks for the Skype info; I will definitely contact you if I have additional questions.

Thanks,
Dave

From kim-info at ontotext.com Thu Mar 11 08:27:53 2010
From: kim-info at ontotext.com (kim-info at ontotext.com)
Date: Thu, 11 Mar 2010 15:27:53 +0200
Subject: [Kim-discussion] KIM 3.0: Do you have suggestions, feature requests, or just ideas?
Message-ID: <63719D385DCF1AF90E3EFFE38F72E031970A3718@aandreev.sirma.int>

Dear friends,

At some point we froze the evaluation version of KIM at 2.4, but down in the basement we continue to hammer away at the current version, KIM 3.0 - new architecture, new scale, new possibilities. In this process we have been customizing it for various customers, but there should be a moment when this becomes available for evaluation as well. We have planned this moment to be in May 2010.

This time, we would like to hear from all of you who have evaluated, used or bought our solutions:
What suggestions, ideas and feature requests do you have?

Thanks for staying in touch!

Yours: Ontotext.
www.ontotext.com

p.s. Please notify us if you consider this email spam. Please excuse us.

From mnozchev at sirma.bg Fri Mar 12 05:01:44 2010
From: mnozchev at sirma.bg (Marin Nozhchev)
Date: Fri, 12 Mar 2010 12:01:44 +0200
Subject: [Kim-discussion] KIM 3.0: Do you have suggestions, feature requests, or just ideas?
In-Reply-To:
References:
Message-ID: <4B9A1108.6060306@sirma.bg>

Dear Mr Fontes, All,

Thank you for your input. We consider RDFa support a priority as well. Would you mind answering a few questions on the details of that support?

We intend to support RDFa and not Microdata or Microformats. Obviously, we are partial to solutions that use RDF. Do you prefer RDFa or Microdata?

We intend to support RDFa in three ways:
- the output of the semantic annotation can be received in RDFa format
- if you find a document through our web interface, you will be able to download it in RDFa*
- when you browse annotated documents in the web interface (like this one - http://bit.ly/onto_document_sample), their XHTML view will contain RDFa

Do you have any other uses that may apply to the current KIM?

We intend to use our regular PROTON ontology to describe the metadata in RDFa. Do you think that this will affect the utility of the metadata?

Thank you,
Marin Nozhchev,
Project Manager, KIM Platform

* It is currently possible to export the document in GATE XML format. Try clicking Export near the top right corner of that link - http://bit.ly/onto_document_sample

On 11.03.2010 19:33, Celso Fontes wrote:
>
> * RDFa and/or Microdata support !!!!
>

From philip.alexiev at ontotext.com Mon Mar 15 05:56:33 2010
From: philip.alexiev at ontotext.com (Philip Alexiev)
Date: Mon, 15 Mar 2010 11:56:33 +0200
Subject: [Kim-discussion] Urgent Request
In-Reply-To:
References: <20100309155025.0qfiue1ko080gsko@webmail.sirma.bg>
Message-ID: <4B9E0451.7000400@ontotext.com>

Hello Nader,

1 and 3: KIM does not provide functionality to get the output in RDF/XML format. I don't recall older versions being able to do this either. Maybe it is achievable through the API, but we haven't developed it in this direction.

2. There are a number of efforts to make public data available in RDF format. There is also a big project that aims to connect the different disjoint datasets into one large knowledge base. The project is called Linked Open Data (http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData). In the data sets it uses you may find useful references for your task (http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets). For example: Geonames - http://www.geonames.org/ontology/

If you have your own data in a different format, you may write your own custom tool to create RDF from it. You will have to tie it to the ontology KIM uses by default, PROTON (http://proton.semanticweb.org/). There is a section in the KIM documentation that will be helpful: Creating Knowledge Bases and Ontologies.

Some tools for designing and viewing ontologies are Protege, Swoop and TopBraid Composer.

Hope this helps
Philip

On 03/14/2010 10:42 PM, Nader Zaki wrote:
> Dear Philip,
>
> I want to ask a few questions about KIM:
>
> 1- How can I get the output of the annotation as OWL or RDF/XML files?
>
> 2- You told me before to go to kim/config/sesame.conf and edit the
> file by adding the namespaces of any new knowledge base, but how can I
> build this knowledge base? Can you explain in more detail?
>
> 3- How can I get the older versions of the KIM platform?
> I need precisely an API converting from HTML to OWL or RDF based
> files, and as far as I know the older versions of the KIM platform did that.
>
> Thanks a lot for your time. Waiting for your reply.
>
> Regards,
>
> Nader Nassef Zaki
>
> > Date: Tue, 9 Mar 2010 15:50:25 +0200
> > From: philip.alexiev at sirma.bg
> > To: nadora5 at hotmail.com
> > CC: kim-info at ontotext.com
> > Subject: Re: Urgent Request
> >
> > Hello Nader,
> >
> > If you are using KIM prior to 3.0, the knowledge base used is
> > described in kim/config/sesame.conf - there is an imports section.
> > You can add your custom files containing RDF data there. Keep in mind
> > that you will have to provide a corresponding default namespace below
> > as well.
> >
> > The files can be in N-Triples format or in RDF/XML. You can use any
> > ontology editor to achieve this. Protege is a good choice. Just take a
> > look at the resulting RDF/XML to make sure it is OK.
> >
> > Hope this helps
> > Philip

--
Philip Alexiev
Software Engineer
Ontotext AD
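
Philip's advice above (write a small custom tool that turns your own data into RDF, then list the resulting N-Triples or RDF/XML file in the imports section of kim/config/sesame.conf) could look roughly like the following. This is only a minimal sketch and not part of KIM: the class name, the example.org namespace, the Pump class, the sample records and the choice of rdfs:label are all assumptions made for illustration. A real knowledge base for KIM would be tied to PROTON classes and properties, as Philip notes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not a KIM class): turns (id, label) records into N-Triples lines. */
class NTriplesSketch {
    // Assumed namespace for the new instances; replace it with your own.
    static final String NS = "http://example.org/kb#";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label";

    /** Escapes backslashes, double quotes and line breaks inside a literal. */
    static String escape(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r");
    }

    /** One triple whose object is a resource. */
    static String uriTriple(String subject, String predicate, String object) {
        return "<" + subject + "> <" + predicate + "> <" + object + "> .";
    }

    /** One triple whose object is a plain string literal. */
    static String literalTriple(String subject, String predicate, String literal) {
        return "<" + subject + "> <" + predicate + "> \"" + escape(literal) + "\" .";
    }

    /** Emits a type triple and a label triple per record; ids must already be URI-safe. */
    static List<String> toNTriples(Map<String, String> records, String classUri) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> record : records.entrySet()) {
            String subject = NS + record.getKey();
            lines.add(uriTriple(subject, RDF_TYPE, classUri));
            lines.add(literalTriple(subject, RDFS_LABEL, record.getValue()));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up sample data from a "mechanical" domain.
        Map<String, String> pumps = new LinkedHashMap<>();
        pumps.put("Pump_1", "Centrifugal pump");
        pumps.put("Pump_2", "Gear \"G2\" pump");
        for (String line : toNTriples(pumps, NS + "Pump")) {
            System.out.println(line);
        }
    }
}
```

Redirecting the output to a file gives something an ontology editor such as Protege can open for a sanity check before it is added to the imports. The sketch leaves non-ASCII characters as they are, which assumes the consumer reads the file as UTF-8.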

From Anton.Andreev at ontotext.com Mon Mar 15 07:02:19 2010
From: Anton.Andreev at ontotext.com (Anton Andreev)
Date: Mon, 15 Mar 2010 13:02:19 +0200
Subject: [Kim-discussion] Urgent Request
In-Reply-To: <4B9E0451.7000400@ontotext.com>
References: <20100309155025.0qfiue1ko080gsko@webmail.sirma.bg> <4B9E0451.7000400@ontotext.com>
Message-ID: <4B9E13BB.2080506@ontotext.com>

Hello Nader,

You can process documents and HTML pages with KIM, and the resulting RDF is stored in the OWLIM database built into KIM. You may also try this tool: http://ontotext.com/kim/doc/sys-doc/RDFExport.html. It exports the RDF from OWLIM.

Cheers,

Anton Andreev
email: anton.andreev at ontotext.com
Account Manager at Ontotext
www.ontotext.com

From borislav.popov at ontotext.com Mon Mar 15 11:46:29 2010
From: borislav.popov at ontotext.com (borislav popov)
Date: Mon, 15 Mar 2010 17:46:29 +0200
Subject: [Kim-discussion] Urgent Request
In-Reply-To: <4B9E0451.7000400@ontotext.com>
References: <20100309155025.0qfiue1ko080gsko@webmail.sirma.bg> <4B9E0451.7000400@ontotext.com>
Message-ID: <6CA3F8AF-EA3F-4B0B-A344-1872FB848391@ontotext.com>

>> As I need precisely an API converting from HTML to OWL or RDF based
>> files and as I know the older versions of the Kim platform did that.

Nader, try to be more specific and we might be able to help. Converting HTML to RDF seems too general to me.

There is some functionality that allows extensible exporters based on the KIM document and the annotations it has. One option would be to have RDFa embedded in HTML, expressing the mentions of entities found by the text analysis pipeline. This would need some custom work related to what you need to export in RDF.

take care
b

From Anton.Andreev at ontotext.com Mon Mar 15 13:12:43 2010
From: Anton.Andreev at ontotext.com (Anton Andreev)
Date: Mon, 15 Mar 2010 19:12:43 +0200
Subject: [Kim-discussion] KIM Platform Registration
In-Reply-To: <20100315123945.pg70vuag2vxcgs84@mail.encs.concordia.ca>
References: <201003121638.o2CGcMIe009630@kim.virtual.vps-host.net> <4B9A7201.6040005@ontotext.com> <20100312121954.ag2bju1lxalc08og@mail.encs.concordia.ca> <4B9A7D1B.7080605@ontotext.com> <20100315123945.pg70vuag2vxcgs84@mail.encs.concordia.ca>
Message-ID: <4B9E6A8B.6080708@ontotext.com>

Hello Nona,

Please delete the entire "owlim-storage" folder; this will force OWLIM to reinitialize.

--
Best regards,
Anton Andreev
email: anton.andreev at ontotext.com
Account Manager at Ontotext
www.ontotext.com

On 15.3.2010 18:39, n_nad at encs.concordia.ca wrote:
> Hello Anton,
> Thank you so much for letting me access the application.
> However, I want to use the Large KB Gazetteer as a PR under GATE, and
> right now apparently one file, ignoreList.def, is missing (for loading
> the Large KB Gazetteer in GATE), and I cannot load your PR. Here I
> attach the error:
> Unhandled error!
> com.ontotext.kim.client.KIMRuntimeException: The loading failed
> ERROR [AliasCacheImpl] - Error while reading ignore list:./resources/gazetteer/ignoreList.def (No such file or directory)
> com.ontotext.kim.client.KIMRuntimeException: The loading failed.
>     at com.ontotext.kim.model.AliasCacheImpl.loadTrustedMaps(AliasCacheImpl.java:350)
>     at com.ontotext.kim.model.AliasCacheImpl.initCache(AliasCacheImpl.java:289)
>     at com.ontotext.kim.model.AliasCacheImpl.createInstance(AliasCacheImpl.java:145)
>     at com.ontotext.kim.model.AliasCacheImpl.getInstance(AliasCacheImpl.java:115)
>     at com.ontotext.kim.gate.KimGazetteer.init(KimGazetteer.java:89)
>     at gate.Factory.createResource(Factory.java:382)
>     at gate.gui.NewResourceDialog$4.run(NewResourceDialog.java:220)
>     at java.lang.Thread.run(Thread.java:619)
> Caused by: com.ontotext.kim.client.query.KIMQueryException: KIMQueryException was caused by java.io.FileNotFoundException: /usr/local/durmtools/GATE_SVN/gate/./owlim-storage/entities (No such file or directory)
>     at com.ontotext.kim.util.datastore.PrivateRepositoryFeed.feedTo(PrivateRepositoryFeed.java:75)
>     at com.ontotext.kim.model.AliasCacheImpl.loadTrustedMaps(AliasCacheImpl.java:348)
>     ... 7 more
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: /usr/local/durmtools/GATE_SVN/gate/./owlim-storage/entities (No such file or directory)
>     at com.ontotext.trree.owlim_ext.a.a.<init>(Unknown Source)
>     at com.ontotext.trree.owlim_ext.EntityPool.a(Unknown Source)
>     at com.ontotext.trree.owlim_ext.EntityPool.<init>(Unknown Source)
>     at com.ontotext.trree.owlim_ext.SailImpl.initialize(SailImpl.java:167)
>     at org.openrdf.repository.sail.SailRepository.initialize(SailRepository.java:84)
>     at com.ontotext.kim.util.datastore.PrivateRepositoryFeed.feedTo(PrivateRepositoryFeed.java:62)
>     ... 8 more
> Caused by: java.io.FileNotFoundException: /usr/local/durmtools/GATE_SVN/gate/./owlim-storage/entities (No such file or directory)
>     at java.io.RandomAccessFile.open(Native Method)
>     at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
>     ... 14 more
>
> If it is possible I can have the Gazetteer as a PR, or how I can fix it?
> Your help is greatly appreciated.
> Best regards,
> Nona
>
> Quoting Anton Andreev :
>
>> Hello Nona,
>>
>> We can provide you with evaluation copy of KIM.
>>
>> You can download the latest version of the KIM 2.4 for evaluation at:
>> http://ontotext.com/kim/kim-install.html
>>
>> You will find KIM documentation and installation details at:
>> http://ontotext.com/kim/doc/sys-doc/Installation.html
>>
>> If you have any further questions, do not hesitate to contact us at:
>> KIM-info at ontotext.com. We recommend you to subscribe for KIM-discussion
>> and interested-in-KIM mailing lists at:
>> http://www.ontotext.com/kim/mailing-lists-info.html.
>>
>> The KIM-discussion mailing list has public archive available at:
>> http://www.mail-archive.com/kim-discussion at ontotext.com/maillist.html
>>
>> We hope you will find KIM useful! We would be happy to receive your
>> feedback!
>>
>> --
>> Best regards,
>> Anton Andreev
>> email: anton.andreev at ontotext.com
>> Account Manager at Ontotext
>> www.ontotext.com
>>
>>
>> On 12.3.2010 19:19, n_nad at encs.concordia.ca wrote:
>>> Hello,
>>> Thanks for your response. I am using Gate, and your Large KB
>>> Gazetteer component is provided under gate. I am interested in
>>> using your gazetteer with my ontology to annotate some texts.
>>> Thank you so much for your attention.
>>> Best,
>>> Nona
>>>
>>> Quoting Anton Andreev :
>>>
>>>> Hello,
>>>>
>>>> We usually provide KIM to all interested academic parties, but please
>>>> tell us a little bit more about you.
>>>> For example how do you intend to use KIM?
>>>>
>>>> Thanks,
>>>>
>>>> --
>>>> Best regards,
>>>> Anton Andreev
>>>> email: anton.andreev at ontotext.com
>>>> Account Manager at Ontotext
>>>> www.ontotext.com

From Anton.Andreev at ontotext.com Mon Mar 22 07:10:06 2010
From: Anton.Andreev at ontotext.com (Anton Andreev)
Date: Mon, 22 Mar 2010 13:10:06 +0200
Subject: [Kim-discussion] Urgent Request
In-Reply-To:
References: <20100309155025.0qfiue1ko080gsko@webmail.sirma.bg> <4B9E0451.7000400@ontotext.com>, <4B9E13BB.2080506@ontotext.com>
Message-ID: <4BA7500E.70306@ontotext.com>

Hello Nader,

1. You need to supply a file that contains a SeRQL query. SeRQL is a query language similar to SPARQL and is used for semantic queries. I have attached such a file with a sample SeRQL query that extracts all the companies loaded in OWLIM/KIM by default. Your query must use the "construct" clause: http://www.openrdf.org/doc/sesame/users/ch06.html#d0e1371

Sample command line:
kim\bin\tools>toolRdfExport.cmd query.txt result.rdf RDF/XML

2. I have attached "2-Getting started.pdf", which is part of a KIM guide that has not been released yet. Consider it an almost-ready draft. You will find it useful for understanding what you can do with KIM in general. Check point 5: by setting up the Sesame UI you will be able to query the OWLIM instance built into KIM. The Sesame UI might also provide the functionality you need.

Hope this helps,
--
Anton Andreev
email: anton.andreev at ontotext.com
Account Manager at Ontotext
www.ontotext.com

On 20.3.2010 01:00, Nader Zaki wrote:
> Dear sir,
>
> First of all, thanks a lot for your efforts. Second,
> I have some questions:
>
> 1- I tried to use the RDF export tool but there was something missing:
> I couldn't get the SeRQL file, as I didn't know where to find it or
> what its extension is. Can you tell me?
> 2- I tried to use OWLIM but I couldn't even operate it, so can
> I have more guidance on using it?
>
> My overall goal is as follows:
>
> Taking any HTTP page as input and converting it from HTML to RDF or
> OWL format, so that I have the important information from the HTML page
> in an RDF file. Then I build a semantic application that
> uses these RDF files in a specific domain: mechanical, for example. So
> what I need for now is a program that converts from HTML to RDF. I also
> need to know how this is done, if that can be shared.
>
> Thanks a lot for your time. Waiting for your reply as soon as possible.
>
> Regards,
>
> Nader Nassef Zaki

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: query.txt
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2-Getting started.pdf
Type: application/pdf
Size: 298974 bytes
Desc: not available

From Anton.Andreev at ontotext.com Wed Mar 24 08:46:03 2010
From: Anton.Andreev at ontotext.com (Anton Andreev)
Date: Wed, 24 Mar 2010 14:46:03 +0200
Subject: [Kim-discussion] Urgent Request
In-Reply-To:
References: <20100309155025.0qfiue1ko080gsko@webmail.sirma.bg> <4B9E0451.7000400@ontotext.com>, <4B9E13BB.2080506@ontotext.com> , <4BA7500E.70306@ontotext.com>
Message-ID: <4BAA098B.3090806@ontotext.com>

Hello Nader,

a). First, you can use the standard GATE or the GATE Developer that comes with KIM. When you process a document, the result you get is an annotation set. You either save that annotation set as XML (after you run the pipeline) or use a datastore. When using a datastore, the result is automatically saved back into the datastore. With a datastore you can also process a higher volume of documents, because documents are loaded into memory one by one, which keeps memory usage low.

b). When we process a document we do information extraction, but besides that we add the document to a full-text search (FTS) index. In KIM you can use different FTS indexers; the default one is Lucene. Depending on the "running strategy" parameter, KIM behaves differently. With the default running strategy we can proceed this way:

1. You call the SemanticAnnotationAPI.execute method to add semantic annotations to your GATE document (let's call it kdoc).

kdoc = SemanticAnnotationAPI.execute(kdoc);

Semantic annotations are those that have a URI in the ontology you are using. To produce them you need a processing resource that is capable of doing that. In the KIM pipeline it is called "Instance Generator".

2. Next, you call the DocumentRepository.addDocument method. By default this method creates the FTS index. Besides that, it generates RDF from the semantic annotations from step 1. If you do not have the semantic annotations, it will only create an FTS index and store the document (the storage type is also configurable). The generated RDF is stored in OWLIM and merged with the data already available there.

You can use my answer here to achieve your specific goals.

Hope this helps and that I was able to explain it properly.
--
Anton Andreev
email: anton.andreev at ontotext.com
Account Manager at Ontotext
www.ontotext.com

On 23.3.2010 13:06, Nader Zaki wrote:
> Dear Anton,
>
> I want to know the importance of using GATE and Lucene in the KIM
> platform in detail.
> How can I use each of them separately to extract the semantic
> information from an HTML page or file?
> Also, what are the inputs and the outputs of each of them?
> Thanks for your time.
> > Regards,, > > > Nader Nassef Zaki > > > ------------------------------------------------------------------------ > Date: Mon, 22 Mar 2010 13:10:06 +0200 > From: Anton.Andreev at ontotext.com > To: nadora5 at hotmail.com > CC: KIM-discussion at ontotext.com > Subject: Re: [Kim-discussion] Urgent Request > > Hello Nader, > > 1. You need to supply a file that contains a SeRQL query. SeRQL is a > language similar to SPARQL and it is used for semantic queries. > I have attached such a file with a sample SeRQL query that extracts > all the companies that are loaded in OWLIM/KIM by default. > Your query must use the "construct" clause: > http://www.openrdf.org/doc/sesame/users/ch06.html#d0e1371 > > Sample command line: > kim\bin\tools>toolRdfExport.cmd query.txt result.rdf RDF/XML > > 2. I have attached a "2-Getting started.pdf" which is part of a > KIM-Guide which still has not been released. It should be considered > as a almost ready draft. You will find it useful in order to > comprehend what you can do with KIM in general. Check point 5. By > setting up the the Sesame UI you will be able to make queries to the > built-in OWLIM in KIM. Also the Sesame UI might provide the > functionality you need. > > Hope this helps, > -- > Anton Andreev > email:anton.andreev at ontotext.com > Account Manager at Ontotext > www.ontotext.com > > > On 20.3.2010 ?. 01:00 ?., Nader Zaki wrote: > > Dear sir, > > First of all, thanks alot for your efforts.Second, > I have some questions: > > 1-I tried to use the RDF export tool but there was something > missing, I couldn't get the SeRQL file as I didn't know where to > find it and what's its extension, so can you tell me ?? > 2-I tried to use the OWLIM but I couldn't even operate it so can > I have more guidance to use it ?? 
> > My overall goal is as follows: > > Taking any http page as an input and converting it from HTML to > RDF or OWL format so that I have the important information in the > HTML page but in the rdf format file.Then I build a > semantic application that uses this rdf files in a specific > domain: Mechanical for example .So what I need for now is a > program that converts from HTML to RDF .Also I need to know how is > this done if it's possible to be known. > > Thanks alot for your time.Waiting for your reply as soon as possible. > > Regards,, > > > Nader Nassef Zaki > ------------------------------------------------------------------------ > Date: Mon, 15 Mar 2010 13:02:19 +0200 > From: Anton.Andreev at ontotext.com > To: nadora5 at hotmail.com > CC: KIM-discussion at ontotext.com > Subject: Re: [Kim-discussion] Urgent Request > > Hello Nader, > > You can process documents and htmls with KIM and the resulting RDF > is stored in our built-in OWLIM database in KIM. You may also try > this tool: http://ontotext.com/kim/doc/sys-doc/RDFExport.html. > This tool will export the RDF from OWLIM. > > > Cheers, > > Anton Andreev > email:anton.andreev at ontotext.com > Account Manager at Ontotext > www.ontotext.com > > > > On 15.3.2010 ?. 11:56 ?., Philip Alexiev wrote: > > Hello Nader, > > 1 and 3: KIM does not provide functionality to get the output > in rdf/xml format. I don't recall older versions being able > to do this either. Maybe it is achievable through the API. We > haven't developed it in this direction. > > 2. There are a number of efforts to make public data available > in RDF format. There is also a big projects which aims to > connect the different disjoint datasets in one large Knoledge > Base. The project is called : Linked Open Data > (http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData) > . 
In the data sets it uses you may find useful references for > your task > (http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets). > For example: Geonames - http://www.geonames.org/ontology/ . > > If you have your own data which is in a different format, you > may write your own custom tool to create RDF from it. You will > have to tie it to the ontology KIM uses by default - PROTON > (http://proton.semanticweb.org/). There is a section in the > documentation of KIM which will be helpful : Creating > Knowledge Bases and Ontologies . > > Some tools for designing and viewing ontologies are : Protege > , Swoop, Top Braid Composer. > > Hope this helps > Philip > > On 03/14/2010 10:42 PM, Nader Zaki wrote: > > Dear Philip, > > I want to ask few questions about the Kim: > > 1-How can I get the output of the annotation as OWL or > RDF/XML files ? > > 2-You told me before to go to *kim/config/sesame.conf* and > edit the file by adding the namespaces of any new > knowledge base but how can I build this knowledge base , > can you explain in more detail ?? > > 3-How can I get the older versions of the Kim platform ? > As I need precisely an API converting from HTML to OWL or > RDF based files and as I know the older versions of the > Kim platform did that. > > Thanks alot for your time.Waiting for your reply. > > Regards,, > > Nader Nassef Zaki > > > Date: Tue, 9 Mar 2010 15:50:25 +0200 > > From: philip.alexiev at sirma.bg > > To: nadora5 at hotmail.com > > CC: kim-info at ontotext.com > > Subject: Re: Urgent Request > > > > Hello Nader, > > > > If you are using kim prior to 3.0, the knowledge base > used is > > described in kim/config/sesame.conf - there is an > imports section. > > You can add your custom files containing RDF data there. > Have in mind > > that you will have to provide a corresponding default > namespace below > > as well. > > > > The files can be in ntriples format or in rdf/xml . 
You can use any > > ontology editor to achieve this. Protege is a good choice. Just take a > > look at the resulting RDF/XML to make sure it is OK. > > > > Hope this helps > > Philip > > -- > Philip Alexiev > Software Engineer > Ontotext AD From philip.alexiev at ontotext.com Wed Mar 24 09:49:38 2010 From: philip.alexiev at ontotext.com (Philip Alexiev) Date: Wed, 24 Mar 2010 15:49:38 +0200 Subject: [Kim-discussion] Urgent Request In-Reply-To: <4BAA098B.3090806@ontotext.com> References: <20100309155025.0qfiue1ko080gsko@webmail.sirma.bg> <4B9E0451.7000400@ontotext.com>, <4B9E13BB.2080506@ontotext.com> , <4BA7500E.70306@ontotext.com> <4BAA098B.3090806@ontotext.com> Message-ID: <4BAA1872.2010903@ontotext.com> Hello Nader, Basically, the connection between KIM and GATE is that KIM uses GATE Developer as its annotation platform. We forked our pipeline from a very early version of ANNIE. On top of that we have added some more resources to improve the extraction in our own direction and to make it ontology aware. As a result you can see the KIM pipeline from GATE by executing the startKIMGate script. This concerns only the entity extraction process. With the results of the extraction, we create some indexes over the entities, the documents, and the occurrences of the former in the latter.
That allows us to perform some very sophisticated searches over them. This is what makes KIM so powerful. Greetings, Philip On 03/24/2010 02:46 PM, Anton Andreev wrote: > Dear Anton, > > I want to know in detail the importance of using GATE and Lucene in the KIM > platform. > How can I use each of them separately to extract the semantic > information from an HTML page or file? > Also, what are the inputs and the outputs of each of them? > Thanks for your time. > > Regards, > > > Nader Nassef Zaki -- Philip Alexiev Software Engineer Ontotext AD
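For the custom-tool route suggested earlier in this thread (creating RDF from your own data and supplying it as an N-Triples file), a minimal sketch in Python might look like the following. It is only an illustration under stated assumptions: the function names `escape_literal` and `to_ntriples`, the record fields `id` and `label`, and the `http://example.org/mech#` namespace are all hypothetical, and nothing here is part of KIM's API. To tie the data to PROTON, the placeholder class URI would have to be replaced with a real class from that ontology.

```python
# Sketch: turn simple records into N-Triples text (standard library only).
# All names and URIs below are illustrative assumptions, not KIM or PROTON identifiers.

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def escape_literal(value: str) -> str:
    """Escape the characters that N-Triples requires escaping inside a literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(records, base_uri, class_uri):
    """Emit one rdf:type triple and one rdfs:label triple per record.

    Assumes each record is a dict with an IRI-safe 'id' and a free-text 'label'.
    """
    lines = []
    for rec in records:
        subject = "<" + base_uri + rec["id"] + ">"
        label = escape_literal(rec["label"])
        lines.append(f"{subject} <{RDF_TYPE}> <{class_uri}> .")
        lines.append(f'{subject} <{RDFS_LABEL}> "{label}" .')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical data from a mechanical-engineering domain.
    parts = [{"id": "gear42", "label": 'Spur "gear"'}]
    print(
        to_ntriples(parts, "http://example.org/mech#", "http://example.org/mech#Part"),
        end="",
    )
```

The resulting text could be saved to a file and checked in an ontology editor such as Protege before being added to a knowledge base; how a given KIM version picks the file up depends on its configuration and is not shown here.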