From sreckojoksimovic at gmail.com Sat Jul 2 07:23:40 2011
From: sreckojoksimovic at gmail.com (Srecko Joksimovic)
Date: Sat, 2 Jul 2011 13:23:40 +0200
Subject: [Kim-discussion] Extend proton ontology
Message-ID: <001701cc38aa$7fd038d0$7f70aa70$@com>

Hi Philip!

I was out of work for a few days, but now I have a new question. I'm reading Customizing KIM3.pdf, step by step. As a result, I got this:

Screenshot.png

As you can see, everything is there. Even Topic. Topic is a class that I added. But it looks like there are no annotations for Topic. Could it be because of the query?

prefix rdfs:
prefix protont:
PREFIX protons:

SELECT ?entity ?cl
WHERE {
    ?entity a ?cl ;
        protons:generatedBy .
    ?cl rdfs:subClassOf protont:Topic .
    OPTIONAL {
        ?sc rdfs:subClassOf ?cl.
        ?entity a ?sc .
        filter(?cl != ?sc)
    }
    filter (!bound(?sc) && isURI (?cl))
}

I know that it could be anything. But I created a Large KB Gazetteer (TopicLKBGazetter), added it to the pipeline, and as you can see, it is there. If you have any idea, please let me know. I will try a few more things.

Best,
Srecko
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/png
Size: 124107 bytes
Desc: not available

From sreckojoksimovic at gmail.com Sat Jul 2 07:27:08 2011
From: sreckojoksimovic at gmail.com (Srecko Joksimovic)
Date: Sat, 2 Jul 2011 13:27:08 +0200
Subject: [Kim-discussion] Extend proton ontology
Message-ID: <001d01cc38aa$fbd86380$f3892a80$@com>

I'm sorry, I forgot to attach the screenshot in my last email. Just in case, I'm sending it again.

Hi Philip!

I was out of work for a few days, but now I have a new question. I'm reading Customizing KIM3.pdf, step by step. As a result, I got this:

Screenshot.png

As you can see, everything is there. Even Topic. Topic is a class that I added. But it looks like there are no annotations for Topic. Could it be because of the query?

prefix rdfs:
prefix protont:
PREFIX protons:

SELECT ?entity ?cl
WHERE {
    ?entity a ?cl ;
        protons:generatedBy .
    ?cl rdfs:subClassOf protont:Topic .
    OPTIONAL {
        ?sc rdfs:subClassOf ?cl.
        ?entity a ?sc .
        filter(?cl != ?sc)
    }
    filter (!bound(?sc) && isURI (?cl))
}

I know that it could be anything. But I created a Large KB Gazetteer (TopicLKBGazetter), added it to the pipeline, and as you can see, it is there. If you have any idea, please let me know. I will try a few more things.

Best,
Srecko
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/png
Size: 124107 bytes
Desc: not available
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screenshot.png
Type: image/png
Size: 124107 bytes
Desc: not available
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: query.txt

From philip.alexiev at ontotext.com Sat Jul 2 10:50:46 2011
From: philip.alexiev at ontotext.com (Philip Alexiev @ Ontotext)
Date: Sat, 2 Jul 2011 17:50:46 +0300
Subject: [Kim-discussion] Extend proton ontology
In-Reply-To: <001d01cc38aa$fbd86380$f3892a80$@com>
References: <001d01cc38aa$fbd86380$f3892a80$@com>
Message-ID: <74230E74-AF8D-4052-AB71-87B7D9D9B6F7@ontotext.com>

Hi Srecko,

The best way to check what will be filled in the gazetteer dictionary is to run the KIM server and use JVisualVM to execute the gazetteer query against it.

If you give more context, I can provide some more concrete guidelines. What is the ontology you are using? What are the concepts you want to recognize?

All the best
Philip

From sreckojoksimovic at gmail.com Sat Jul 2 11:11:10 2011
From: sreckojoksimovic at gmail.com (Srecko Joksimovic)
Date: Sat, 2 Jul 2011 17:11:10 +0200
Subject: [Kim-discussion] Extend proton ontology
In-Reply-To: <74230E74-AF8D-4052-AB71-87B7D9D9B6F7@ontotext.com>
References: <001d01cc38aa$fbd86380$f3892a80$@com> <74230E74-AF8D-4052-AB71-87B7D9D9B6F7@ontotext.com>
Message-ID: <003a01cc38ca$47d74020$d785c060$@com>

Hi Philip,

I forgot to post more context. This is part of the acm.ttl file:

@prefix protons: > .
@prefix protont: > .
a protons:Alias ; "Convex Programming at en" .
a protons:Alias ; "Document and Text Processing at en" .
a protont:Topic ; protons:generatedBy ; protons:hasMainAlias .
a protons:Alias ; "Store and Forward Networks at en" .
a protons:Alias ; "Integral Equations at en" .
a protons:Alias ; "Information Filtering at en" .
a protons:Alias ; "Surfac eFitting at en" .
a protons:Alias ; "Reliability,Availability and Serviceability at en" .
a protons:Alias ; "Aerospace at en" .
a protont:Topic ; protons:generatedBy ; protons:hasMainAlias .
a protont:Topic ; protons:generatedBy ; protons:hasMainAlias .
a protont:Topic ; protons:generatedBy ; protons:hasMainAlias .
a protont:Topic ; protons:generatedBy ; protons:hasMainAlias .
a protont:Topic ; protons:generatedBy ; protons:hasMainAlias .
a protont:Topic ; protons:generatedBy ; protons:hasMainAlias .
a protons:Alias ; "Pixel Classification at en" .
a protons:Alias ; "Reliability,Testing and Fault-Tolerance at en" .
a protont:Topic ; protons:generatedBy ; protons:hasMainAlias .
a protons:Alias ; "Scheduling at en" .
a protons:Alias ; "Automata at en" .

I want to recognize Topic, so that I can annotate my documents. When I paste the SPARQL query into JVisualVM, I get an empty result. Actually:

    Operation Return Value
    [ ]

I have already posted this question. So, I configured owlim, and followed all the other steps from the guide. But something is wrong.

From philip.alexiev at ontotext.com Mon Jul 4 02:07:43 2011
From: philip.alexiev at ontotext.com (Philip Alexiev @ Ontotext)
Date: Mon, 4 Jul 2011 09:07:43 +0300
Subject: [Kim-discussion] Fwd: Install Ontologies?
References: <937F8D99-81C5-496F-AA71-E07D3525A6F2@ontotext.com>
Message-ID: <1134C499-E01B-419F-9B6F-387C8F3E5948@ontotext.com>

Begin forwarded message:

> From: "Philip Alexiev @ Ontotext"
> Date: 4 July 2011 9:01:10 AM GMT+03:00
> To: Ben Fino-Radin
> Cc: kim-info at ontotext.com
> Subject: Re: Install Ontologies?
>
> Hi Ben,
>
> http://www.ontotext.com/sites/default/files/Customizing%20KIM3.pdf
>
> This is a guide on how to add a new ontology to KIM. It is more complicated than just importing it into the semantic repository. The resources in KIM should be made aware of it and start working with it. The steps are described in the guide.
>
> Hope this helps
>
> Philip Alexiev
> Software Engineer,
> KIM team
>
> On 4 Jul 2011, at 12:09 AM, Ben Fino-Radin wrote:
>
>> Hi There,
>>
>> Does KIM offer the option to install ontologies from RDF/XML?
>>
>> For example: http://vocab.org/bio/0.1/.html
>>
>> Best,
>> Ben

From borislav.popov at ontotext.com Tue Jul 5 10:30:48 2011
From: borislav.popov at ontotext.com (borislav popov)
Date: Tue, 5 Jul 2011 16:30:48 +0200
Subject: [Kim-discussion] KIM
In-Reply-To: <1309866879.12301.YahooMailClassic@web26901.mail.ukl.yahoo.com>
References: <1309866879.12301.YahooMailClassic@web26901.mail.ukl.yahoo.com>
Message-ID: <7645E1F2-A9DB-4593-96B3-CC5E50F98983@ontotext.com>

Hi Soufiene,
I forwarded your request to kim-discussion.
Please sign in to the list from http://www.ontotext.com/kim/support
all the best
borislav

On Jul 5, 2011, at 1:54 PM, soufiene katet wrote:

> Dear Mr ,
> I'm using your guide
> http://www.ontotext.com/sites/default/files/kim/KIM_Getting_Started_Guide.pdf to install KIM, but I have a problem: "We All Make Mistakes
> A problem occurs while processing the request:
> java.lang.NullPointerException
> Cannot add null element to the form.
> The issue has been logged. The cause may have been loss of connection to the KIM Server.
>
> The connection to KIM cannot be established. Please verify the server is still running and is not reporting errors. Try connecting again after that."
>
> Can you help me to resolve it?
>
> Cordially
> Soufiene KATET

From philip.alexiev at ontotext.com Tue Jul 5 09:41:03 2011
From: philip.alexiev at ontotext.com (Philip Alexiev @ Ontotext)
Date: Tue, 5 Jul 2011 16:41:03 +0300
Subject: [Kim-discussion] KIM
In-Reply-To: <7645E1F2-A9DB-4593-96B3-CC5E50F98983@ontotext.com>
References: <1309866879.12301.YahooMailClassic@web26901.mail.ukl.yahoo.com> <7645E1F2-A9DB-4593-96B3-CC5E50F98983@ontotext.com>
Message-ID: <96D788DF-F16E-442B-8998-28B9171932B3@ontotext.com>

Hello Soufiene,

Please check that you have followed all the steps in the guide. Make sure the KIM server is running. If the problem persists, please send me the KIM logs, which are in the KIM/log/ folder.

All the best,
Philip Alexiev
Software Engineer,
KIM team

From sreckojoksimovic at gmail.com Thu Jul 7 15:08:12 2011
From: sreckojoksimovic at gmail.com (Srecko Joksimovic)
Date: Thu, 7 Jul 2011 21:08:12 +0200
Subject: [Kim-discussion] How to declare trusted?
Message-ID: <001b01cc3cd9$388cc0a0$a9a641e0$@com>

How can I make sure that:

http://www.lornet.org/acm-ccs/proton#TrustedSrc

is declared as Trusted when I want to extend the Proton ontology? I did everything as described in Customizing KIM 3.0, but I'm not sure how to check whether this is declared as trusted.

Best,
Srecko

From philip.alexiev at ontotext.com Fri Jul 8 03:51:05 2011
From: philip.alexiev at ontotext.com (Philip Alexiev @ Ontotext)
Date: Fri, 8 Jul 2011 10:51:05 +0300
Subject: [Kim-discussion] How to declare trusted?
In-Reply-To: <001b01cc3cd9$388cc0a0$a9a641e0$@com>
References: <001b01cc3cd9$388cc0a0$a9a641e0$@com>
Message-ID: <1868E1BC-EDC5-4C00-8235-39E4D237379E@ontotext.com>

Hi Srecko,

You can see how some trusted sources are defined in KIM's KB in KIM/context/default/kb/wkb.nt :

 .
 .

In order to declare your custom gazetteer as a trusted source, just add to the RDF this statement:

 .

Best,
Philip

From sreckojoksimovic at gmail.com Fri Jul 8 04:09:17 2011
From: sreckojoksimovic at gmail.com (Srecko Joksimovic)
Date: Fri, 8 Jul 2011 10:09:17 +0200
Subject: [Kim-discussion] How to declare trusted?
In-Reply-To: <1868E1BC-EDC5-4C00-8235-39E4D237379E@ontotext.com>
References: <001b01cc3cd9$388cc0a0$a9a641e0$@com> <1868E1BC-EDC5-4C00-8235-39E4D237379E@ontotext.com>
Message-ID: <000701cc3d46$56629c90$0327d5b0$@com>

Hi Philip,

That is what I did, but I thought there was something else. Thank you!

Best,
Srecko

From J.A.N.Raes at student.tudelft.nl Fri Jul 8 11:08:29 2011
From: J.A.N.Raes at student.tudelft.nl (Jeremy Raes)
Date: Fri, 8 Jul 2011 17:08:29 +0200
Subject: [Kim-discussion] Checking if a document already exists in the document repository -- KIMQueryException: Lucene special character
Message-ID:

Hey,

I am building an application upon KIM whereby I need to check if a document already exists in the repository before deciding on adding it.

To do this, I wrote the following code:

    private boolean itemNotInRepository(Item item) {
        assert (item != null);
        DocumentQuery query = new DocumentQuery();
        DocumentQueryResult queryResult = null;
        try {
            String escaped = QueryParser.escape(item.getDescription());
            query.setKeywordRestriction(escaped);
            queryResult = this.apiDR.getDocumentIds(query);
        } catch (KIMQueryException e) {
            e.printStackTrace();
        }
        return queryResult.isEmpty();
    }

Because some of the Strings returned by item.getDescription() might contain special characters [mainly "(" and ")"], I added the String escaped = QueryParser.escape(item.getDescription()) to my code, but nonetheless I get a KIMQueryException:

com.ontotext.kim.client.query.KIMQueryException: Lucene special characters in field name in brackets: [Canalhopper.\(Duur\]
    at com.ontotext.kim.lucene.LuceneDocumentRepositoryImpl.getDocumentIds(LuceneDocumentRepositoryImpl.java:429)
    at com.ontotext.kim.coredb.CachingDocumentRepository.getDocumentIds(CachingDocumentRepository.java:91)
    at com.ontotext.kim.coredb.RdfCore.getDocumentIds(RdfCore.java:266)
    at com.ontotext.kim.coredb.CachingDocumentRepository.getDocumentIds(CachingDocumentRepository.java:91)
    at sun.reflect.GeneratedMethodAccessor86.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.openrdf.util.rmirouting.ChannelIfaceImpl.invoke(ChannelIfaceImpl.java:513)
    at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
    at sun.rmi.transport.Transport$1.run(Transport.java:159)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)
Exception in thread "main" java.lang.NullPointerException
    at knowledgeAcquisition.KIMKnowledgeAcquisition.itemNotInRepository(KIMKnowledgeAcquisition.java:147)
    at knowledgeAcquisition.KIMKnowledgeAcquisition.execute(KIMKnowledgeAcquisition.java:188)
    at run.Main.main(Main.java:21)

My guess is that KIM pre-checks the query (before processing it with Lucene) and throws an error when a special character is found -- even though there is a "\" before the special character. Any suggestions on (1) how I can avoid this error, or (2) other methods to check whether a document already exists in the document repository?

Any help is appreciated. Thanks in advance!

Best regards,
Jeremy

From philip.alexiev at ontotext.com Sat Jul 9 07:23:30 2011
From: philip.alexiev at ontotext.com (Philip Alexiev @ Ontotext)
Date: Sat, 9 Jul 2011 14:23:30 +0300
Subject: [Kim-discussion] Checking if a document already exists in the document repository -- KIMQueryException: Lucene special character
In-Reply-To:
References:
Message-ID: <41147EBC-A8C8-4535-B677-0A0B2AD1364E@ontotext.com>

Hi Jeremy,

It is best if you provide a simple standalone class or a test case that works with some test data and reproduces the problem. That way we can track exactly what is happening.

Thank you,
Philip
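
A side note on the second exception in the trace above: the NullPointerException looks like a consequence of the first failure rather than a separate problem. When getDocumentIds throws, the catch block only prints the trace, queryResult is still null, and the final queryResult.isEmpty() call then fails. A minimal sketch of one way to keep "the query failed" apart from "the document is not there" follows. Repository, QueryFailedException and Lookup are hypothetical stand-ins invented for this illustration; they are not part of the KIM API.

```java
import java.util.Collections;
import java.util.List;

public class NullSafeLookup {

    /** Hypothetical stand-in for a document repository; NOT the real KIM interface. */
    public interface Repository {
        List<String> findIds(String keywords) throws QueryFailedException;
    }

    /** Hypothetical stand-in for a query exception such as KIMQueryException. */
    public static class QueryFailedException extends Exception {
        public QueryFailedException(String message) {
            super(message);
        }
    }

    /** Three outcomes, so a failed query is never mistaken for "not in repository". */
    public enum Lookup { FOUND, NOT_FOUND, UNKNOWN }

    public static Lookup lookup(Repository repo, String keywords) {
        try {
            List<String> ids = repo.findIds(keywords);
            // Treat a null result the same way as an empty one.
            return (ids == null || ids.isEmpty()) ? Lookup.NOT_FOUND : Lookup.FOUND;
        } catch (QueryFailedException e) {
            // The query was rejected: nothing is known about the document.
            return Lookup.UNKNOWN;
        }
    }

    public static void main(String[] args) {
        // A fake repository that rejects "(" the way the server rejected the query.
        Repository rejecting = keywords -> {
            if (keywords.contains("(")) {
                throw new QueryFailedException("special character in query");
            }
            return Collections.emptyList();
        };
        System.out.println(lookup(rejecting, "Canalhopper. (Duur"));
        System.out.println(lookup(rejecting, "Canalhopper"));
    }
}
```

With a three-valued result the caller can decide what to do when the query is rejected (skip the item, retry with a different key, or log it) instead of crashing or silently adding a duplicate.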

From J.A.N.Raes at student.tudelft.nl Sat Jul 9 12:43:16 2011
From: J.A.N.Raes at student.tudelft.nl (Jeremy Raes)
Date: Sat, 9 Jul 2011 18:43:16 +0200
Subject: [Kim-discussion] Checking if a document already exists in the document repository -- KIMQueryException: Lucene special character
In-Reply-To: <41147EBC-A8C8-4535-B677-0A0B2AD1364E@ontotext.com>
References: <41147EBC-A8C8-4535-B677-0A0B2AD1364E@ontotext.com>
Message-ID:

Dear Philip,

Thanks for the fast reply.

Attached to this mail, you'll find a file with Java code that produces the following error:

com.ontotext.kim.client.query.KIMQueryException: Lucene special characters in field name in brackets: [Canalhopper.\(Duur\]
    at com.ontotext.kim.lucene.LuceneDocumentRepositoryImpl.getDocumentIds(LuceneDocumentRepositoryImpl.java:429)
    at com.ontotext.kim.coredb.CachingDocumentRepository.getDocumentIds(CachingDocumentRepository.java:91)
    at com.ontotext.kim.coredb.RdfCore.getDocumentIds(RdfCore.java:266)
    at com.ontotext.kim.coredb.CachingDocumentRepository.getDocumentIds(CachingDocumentRepository.java:91)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.openrdf.util.rmirouting.ChannelIfaceImpl.invoke(ChannelIfaceImpl.java:513)
    at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
    at sun.rmi.transport.Transport$1.run(Transport.java:159)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)
Exception in thread "main" java.lang.NullPointerException
    at tmp.QueryDoc.itemNotInRepository(QueryDoc.java:47)
    at tmp.QueryDoc.main(QueryDoc.java:96)

This error does not occur when I query a String that does not contain any special characters.

Thanks for your help!

Best,
Jeremy
-------------- next part --------------
A non-text attachment was scrubbed...
Name: QueryDoc.java
Type: application/octet-stream
Size: 2363 bytes
Desc: not available

From boyan.kukushev at ontotext.com Mon Jul 11 09:54:24 2011
From: boyan.kukushev at ontotext.com (Boyan Kukushev)
Date: Mon, 11 Jul 2011 16:54:24 +0300
Subject: [Kim-discussion] Checking if a document already exists in the document repository -- KIMQueryException: Lucene special character
In-Reply-To:
References: <41147EBC-A8C8-4535-B677-0A0B2AD1364E@ontotext.com>
Message-ID: <201107111654.24886.boyan.kukushev@ontotext.com>

Hi Jeremy,

Yes, your guess is correct - KIM performs several checks on the user query, and these checks throw an exception if there is a special character in the query.

A simple workaround that would be useful in this case is to calculate a hash code of the document content before adding the document to the KIM repository. The hash code is then added as a KIM document feature in the document's feature map.

*Important:* in order to store this value in Owlim and Lucene as a document field, you should add the name of the hash code feature to the *com.ontotext.kim.KIMConstants.DOCUMENT_FEAT_LIST* list, located in the */config/document.repository.properties* configuration file.

You can find attached sample code showing the required API usage. The code uses the Apache commons-codec library to produce hex MD5 hash strings.

If this solution does not work, please provide information about the version of KIM you are using, along with log files and any exceptions that occurred while executing the sample code.

Hope this helps!

Regards,
Boyan

On Saturday, July 09, 2011 19:43:16 Jeremy Raes wrote:
> Dear Philip,
>
> Thanks for the fast reply.
--
Boyan Kukushev
Senior Software Engineer / Java Developer
Ontotext AD @ Sirma Group Corp.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DocumentAddedCheck.java
Type: text/x-java
Size: 2443 bytes
Desc: not available
URL:

From boyan.kukushev at ontotext.com Mon Jul 11 11:42:45 2011
From: boyan.kukushev at ontotext.com (Boyan Kukushev)
Date: Mon, 11 Jul 2011 18:42:45 +0300
Subject: [Kim-discussion] Checking if a document already exists in the document repository -- KIMQueryException: Lucene special character
In-Reply-To: <201107111654.24886.boyan.kukushev@ontotext.com>
References: <201107111654.24886.boyan.kukushev@ontotext.com>
Message-ID: <201107111842.45251.boyan.kukushev@ontotext.com>

Hi again,

I forgot something in the code - one should call test.repository.synchronizeIndex(false) after each document is added.

HTH,
Boyan

On Monday, July 11, 2011 16:54:24 Boyan Kukushev wrote:
> Hi Jeremy,
>
> Yes, your guess is correct - KIM is doing several checks on the user query
> and these checks will throw an exception if there is a special character
> in the query.
>
> A simple workaround that would be useful in this case is calculating a hash
> code of the document content before adding it in the KIM repository. The
> hash code is then added as a KIM document feature in the document's
> feature map.
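
For readers following the thread, the hash-based duplicate check described above might look roughly like the sketch below. It is not the attached DocumentAddedCheck.java, which the archive does not reproduce. Two things are assumptions made here for illustration: the JDK's MessageDigest is used in place of commons-codec, and an in-memory Set stands in for the hash feature that would be stored with each document in the KIM repository. The names ContentHashCheck, md5Hex and addIfNew are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

public class ContentHashCheck {

    // Hypothetical stand-in for the hashes already stored as document features.
    private final Set<String> knownHashes = new HashSet<>();

    /** Returns the lowercase hex MD5 digest of the text, encoded as UTF-8. */
    static String md5Hex(String content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this should not happen.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Records the content's hash; returns true only if it was not seen before. */
    boolean addIfNew(String content) {
        return knownHashes.add(md5Hex(content));
    }

    public static void main(String[] args) {
        ContentHashCheck check = new ContentHashCheck();
        // Parentheses in the text are harmless: only the hex hash is compared.
        System.out.println(check.addIfNew("Canalhopper (Duur)")); // prints true
        System.out.println(check.addIfNew("Canalhopper (Duur)")); // prints false
    }
}
```

Because the hash consists only of hex digits, a lookup on it contains none of the characters that the query check rejects.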
--
Boyan Kukushev
Senior Software Engineer / Java Developer
Ontotext AD @ Sirma Group Corp.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DocumentAddedCheck.java
Type: text/x-java
Size: 2470 bytes
Desc: not available
URL:

From J.A.N.Raes at student.tudelft.nl Tue Jul 12 11:48:18 2011
From: J.A.N.Raes at student.tudelft.nl (Jeremy Raes)
Date: Tue, 12 Jul 2011 17:48:18 +0200
Subject: [Kim-discussion] Checking if a document already exists in the document repository -- KIMQueryException: Lucene special character
In-Reply-To: <201107111842.45251.boyan.kukushev@ontotext.com>
References: <201107111654.24886.boyan.kukushev@ontotext.com> <201107111842.45251.boyan.kukushev@ontotext.com>
Message-ID:

Hi Boyan,

Thank you for taking the time to look into this. The suggested method works smoothly!

Best regards,
Jeremy

On 11 July 2011 17:42, Boyan Kukushev wrote:
> Hi again,
>
> I forgot something in the code - one should call
> test.repository.synchronizeIndex(false) after each document is added.
>
> HTH,
> Boyan
>
> On Monday, July 11, 2011 16:54:24 Boyan Kukushev wrote:
> > Hi Jeremy,
> >
> > Yes, your guess is correct - KIM is doing several checks on the user query
> > and these checks will throw an exception if there is a special character
> > in the query.
> >
> > A simple workaround that would be useful in this case is calculating a hash
> > code of the document content before adding it in the KIM repository. The
> > hash code is then added as a KIM document feature in the document's
> > feature map.
> >
> > *Important:* in order to store this value in Owlim and Lucene as a document
> > field, you should add the name of the hash code feature in the
> > *com.ontotext.kim.KIMConstants.DOCUMENT_FEAT_LIST* list,
> > located in
> > */config/document.repository.properties* configuration file.
> >
> > You can find attached sample code showing the required API usage. The code
> > uses the apache commons-codec library to produce a hex MD5 hash strings.
> --
> Boyan Kukushev
> Senior Software Engineer / Java Developer
> Ontotext AD @ Sirma Group Corp.

From sreckojoksimovic at gmail.com Thu Jul 14 06:19:03 2011
From: sreckojoksimovic at gmail.com (srecko joksimovic)
Date: Thu, 14 Jul 2011 12:19:03 +0200
Subject: [Kim-discussion] Topic
Message-ID:

Hello Philip,

I included my instances in KIM. When I use the web UI, I see them all, and everything looks OK. But when I run code like this:

    KIMDocument kimDoc = apiCorpora.createDocument(_string_to_annotate, true);

    kimDoc = apiSemAnn.execute(kimDoc);

    KIMAnnotationSet kimASet = kimDoc.getAnnotations();
    Set typesSet = kimASet.getAllTypes();
    Iterator iterator = typesSet.iterator();

    // show annotations of every type separately
    while(iterator.hasNext())
    {
        Object key = iterator.next();
        KIMAnnotationSet kimFilteredASet = kimASet.get(String.valueOf(key));
        Iterator annIterator = kimFilteredASet.iterator();
        System.out.println(" = Annotations of type [" + String.valueOf(key) + "] :");

        while(annIterator.hasNext())
        {
            System.out.println(" -- " + annIterator.next());
        }
    }
    System.out.println("[ Document's Typed Annotations (end) ]");

I don't see any annotation of type Topic. As I said, I see all of them when I use the web UI, but when I try to annotate a string from a Java application, I don't get any Topic annotations.

Could you please help me with this one?

Best,
Srecko
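
A side note on the per-type listing in the message above: the same loop can be written with generics, which also makes it easy to test whether an expected type (here Topic) came back at all. The sketch below is self-contained and deliberately does not use the KIM classes. The Ann class, the AnnotationTypeReport name and the sample data are hypothetical stand-ins for KIMAnnotationSet and its contents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AnnotationTypeReport {

    /** Hypothetical stand-in for an annotation: a type name plus the covered text. */
    static final class Ann {
        final String type;
        final String text;

        Ann(String type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    /** Groups annotations by type, keeping the order in which types first appear. */
    static Map<String, List<Ann>> groupByType(List<Ann> annotations) {
        Map<String, List<Ann>> byType = new LinkedHashMap<>();
        for (Ann ann : annotations) {
            byType.computeIfAbsent(ann.type, t -> new ArrayList<>()).add(ann);
        }
        return byType;
    }

    /** Returns true if at least one annotation of the given type is present. */
    static boolean hasType(List<Ann> annotations, String type) {
        return groupByType(annotations).containsKey(type);
    }

    public static void main(String[] args) {
        // Made-up sample data; a real run would take these from the annotated document.
        List<Ann> anns = Arrays.asList(
                new Ann("Person", "Jeremy"),
                new Ann("Organization", "Ontotext"),
                new Ann("Person", "Philip"));

        // Show annotations of every type separately.
        for (Map.Entry<String, List<Ann>> entry : groupByType(anns).entrySet()) {
            System.out.println(" = Annotations of type [" + entry.getKey() + "] :");
            for (Ann ann : entry.getValue()) {
                System.out.println(" -- " + ann.text);
            }
        }
        System.out.println("Topic present: " + hasType(anns, "Topic"));
    }
}
```

An explicit check like hasType gives a clear yes/no answer for the missing type instead of relying on reading through the printed list.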

From philip.alexiev at ontotext.com Thu Jul 14 06:26:20 2011
From: philip.alexiev at ontotext.com (Philip Alexiev @ Ontotext)
Date: Thu, 14 Jul 2011 13:26:20 +0300
Subject: [Kim-discussion] Topic
In-Reply-To:
References:
Message-ID: <0B1D4FB4-E7F8-4BCD-A4FF-F138F2D3D990@ontotext.com>

Hi Srecko,

You can run the GATE interface to check exactly which annotations are created and what their types are. You can do this by running:

    bash KIM/bin/kim gate

You probably use a JAPE rule to match the Lookup annotations with class="http://proton.semanticweb.org/2006/05/protont#Topic" and create one of the entity annotations over them. (The entity annotations are a whitelist of annotations that remain after the annotation process finishes; all annotations not in this list are removed.)

So check what type of annotation you are creating.

If this is not the case, please provide more details on how you handle the topic lookups.

All the best,
Philip

On 14 Jul 2011, at 1:19 PM, srecko joksimovic wrote:
> Hello Philip,
>
> I included my instances in KIM. When I use web UI, I see them all, and everything looks ok. But when I run code like this:
>
> KIMDocument kimDoc = apiCorpora.createDocument(_string_to_annotate, true);
>
> kimDoc = apiSemAnn.execute(kimDoc);
>
> KIMAnnotationSet kimASet = kimDoc.getAnnotations();
> Set typesSet = kimASet.getAllTypes();
> Iterator iterator = typesSet.iterator();
>
> // show annotations of every type separately
> while(iterator.hasNext())
> {
>     Object key = iterator.next();
>     KIMAnnotationSet kimFilteredASet = kimASet.get(String.valueOf(key));
>     Iterator annIterator = kimFilteredASet.iterator();
>     System.out.println(" = Annotations of type [" + String.valueOf(key) + "] :");
>
>     while(annIterator.hasNext())
>     {
>         System.out.println(" -- " + annIterator.next());
>     }
> }
> System.out.println("[ Document's Typed Annotations (end) ]");
>
> I don't see any annotation of type Topic. I see all of them when I use web UI, like I said.
> But when I try to annotate string from Java application, I don't get any Topic annotations.
>
> Could you please help me on this one?
>
> Best,
> Srecko
> _______________________________________________
> Kim-discussion mailing list
> Kim-discussion at ontotext.com
> http://ontotext.com/mailman/listinfo/kim-discussion

From sreckojoksimovic at gmail.com Thu Jul 14 06:41:33 2011
From: sreckojoksimovic at gmail.com (srecko joksimovic)
Date: Thu, 14 Jul 2011 12:41:33 +0200
Subject: [Kim-discussion] Topic
In-Reply-To: <0B1D4FB4-E7F8-4BCD-A4FF-F138F2D3D990@ontotext.com>
References: <0B1D4FB4-E7F8-4BCD-A4FF-F138F2D3D990@ontotext.com>
Message-ID:

Hi Philip,

With GATE the result is the same as with the Java code: I get the same annotations. I tried editing nerc.properties and adding Topic to the com.ontotext.kim.KIMConstants.IE_ANN_TYPES list, but nothing changed.

Do I have to change something else?

Best,
Srecko

On Thu, Jul 14, 2011 at 12:26 PM, Philip Alexiev @ Ontotext <philip.alexiev at ontotext.com> wrote:
> Hi Srecko,
>
> You can run the gate interface to check exactly what annotations are create
> ant their type. You can do this by running:
> bash KIM/bin/kim gate
>
> You probably use a Jape rule to match the Lookup annotations with
> class="http://proton.semanticweb.org/2006/05/protont#Topic" and are creating one
> of the entity annotations over it (the entity annotations are a whitelist
> of annotations that remain after the annotation process finishes, all
> annotations not in this list are removed).
>
> So check what type of annotation you are creating.
>
> If this is not the case, please provide more details how you handle the
> topic lookups.
>
> All the best,
> Philip

From philip.alexiev at ontotext.com Thu Jul 14 06:48:53 2011
From: philip.alexiev at ontotext.com (Philip Alexiev @ Ontotext)
Date: Thu, 14 Jul 2011 13:48:53 +0300
Subject: [Kim-discussion] Topic
In-Reply-To:
References: <0B1D4FB4-E7F8-4BCD-A4FF-F138F2D3D990@ontotext.com>
Message-ID:

Can you describe the exact actions you took to add the topics to the IE logic, and the exact customizations you have made to KIM?

Thanks,
Philip

On 14 Jul 2011, at 1:41 PM, srecko joksimovic wrote:
> Hi Philip,
> with GATE is same as with Java code. I get the same annotations.
I tried to edit nerc.properties and add Topic to com.ontotext.kim.KIMConstants.IE_ANN_TYPES list, but nothing changed. > > Do I have to change something else? > > Best, > Srecko > > On Thu, Jul 14, 2011 at 12:26 PM, Philip Alexiev @ Ontotext wrote: > Hi Srecko, > > You can run the gate interface to check exactly what annotations are create ant their type. You can do this by running: > bash KIM/bin/kim gate > > You probably use a Jape rule to match the Lookup annotations with class="http://proton.semanticweb.org/2006/05/protont#Topic" and are creating one of the entity annotations over it (the entity annotations are a whitelist of annotations that remain after the annotation process finishes, all annotations not in this list are removed). > > So check what type of annotation you are creating. > > If this is not the case, please provide more details how you handle the topic lookups. > > All the best, > Philip > > > On 14 Jul 2011, at 1:19 PM, srecko joksimovic wrote: > > > Hello Philip, > > > > I included my instances in KIM. When I use web UI, I see them all, and everything looks ok. But when I run code like this: > > > > KIMDocument kimDoc = apiCorpora.createDocument(_string_to_annotate, true); > > > > kimDoc = apiSemAnn.execute(kimDoc); > > > > KIMAnnotationSet kimASet = kimDoc.getAnnotations(); > > Set typesSet = kimASet.getAllTypes(); > > Iterator iterator = typesSet.iterator(); > > > > // show annotations of every type separately > > while(iterator.hasNext()) > > { > > Object key = iterator.next(); > > KIMAnnotationSet kimFilteredASet = kimASet.get(String.valueOf(key)); > > Iterator annIterator = kimFilteredASet.iterator(); > > System.out.println(" = Annotations of type [" + String.valueOf(key) + "] :"); > > > > while(annIterator.hasNext()) > > { > > System.out.println(" -- " + annIterator.next()); > > } > > } > > System.out.println("[ Document's Typed Annotations (end) ]"); > > > > I don't see any annotation of type Topic. 
I see all of them when I use web UI, like I said. But when I try to annotate string from Java application, I don't get any Topic annotations. > > > > Could you please help me on this one? > > > > Best, > > Srecko > > _______________________________________________ > > Kim-discussion mailing list > > Kim-discussion at ontotext.com > > http://ontotext.com/mailman/listinfo/kim-discussion > > > _______________________________________________ > Kim-discussion mailing list > Kim-discussion at ontotext.com > http://ontotext.com/mailman/listinfo/kim-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From sreckojoksimovic at gmail.com Thu Jul 14 06:58:44 2011 From: sreckojoksimovic at gmail.com (srecko joksimovic) Date: Thu, 14 Jul 2011 12:58:44 +0200 Subject: [Kim-discussion] Topic In-Reply-To: References: <0B1D4FB4-E7F8-4BCD-A4FF-F138F2D3D990@ontotext.com> Message-ID: It is little hard to explain because I didn't do customisation. I took the file where one of my colleagues did it. File contains about 1200 instances and has content like this: @prefix protons: . @prefix protont: . < http://www.lornet.org/acm-ccs/proton#Alias_07c6001f-8f5c-49e1-ae3c-92c9121b4297 > a protons:Alias ; "Convex Programming at en" . < http://www.lornet.org/acm-ccs/proton#Alias_63786f1d-7b3c-4872-b4e0-8227de53c015 > a protons:Alias ; "Document and Text Processing at en" . a protont:Topic ; protons:generatedBy ; protons:hasMainAlias < http://www.lornet.org/acm-ccs/proton#Alias_dee6eec3-b503-4d3e-a98d-ecdde5d5ee1a> . < http://www.lornet.org/acm-ccs/proton#Alias_ff639082-2cc4-484e-92f7-5f0f09898d2d > a protons:Alias ; "Store and Forward Networks at en" . < http://www.lornet.org/acm-ccs/proton#Alias_7f00cebc-8828-4415-83b3-2eb53ae3b0b6 > a protons:Alias ; "Integral Equations at en" . < http://www.lornet.org/acm-ccs/proton#Alias_08a8af20-311b-4a43-b996-7d4a8c034636 > a protons:Alias ; "Information Filtering at en" . 
< http://www.lornet.org/acm-ccs/proton#Alias_bb28b422-05f4-47f8-a8f5-811610a2f77f > a protons:Alias ; "Surfac eFitting at en" . < http://www.lornet.org/acm-ccs/proton#Alias_8b20c6a0-1b02-4c34-bd30-740d8d150f30 > a protons:Alias ; "Reliability,Availability and Serviceability at en" . < http://www.lornet.org/acm-ccs/proton#Alias_b6e48ecc-f1a9-44e6-8ff9-c8817647765a > a protons:Alias ; "Aerospace at en" . a protont:Topic ; protons:generatedBy ; protons:hasMainAlias < http://www.lornet.org/acm-ccs/proton#Alias_a4fdb728-b855-4ee6-b220-b8a93b55f9c7> . a protont:Topic ; protons:generatedBy ; protons:hasMainAlias < http://www.lornet.org/acm-ccs/proton#Alias_4694954a-ba5e-4333-9ea9-8d5b94790c4e> . a protont:Topic ; protons:generatedBy ; protons:hasMainAlias < http://www.lornet.org/acm-ccs/proton#Alias_d59afecf-26fc-4a5b-af92-e8c994542b23> . a protont:Topic ; protons:generatedBy ; protons:hasMainAlias < http://www.lornet.org/acm-ccs/proton#Alias_33e428d6-3157-41ca-95ee-df79734c5a3d> . a protont:Topic ; protons:generatedBy ; protons:hasMainAlias < http://www.lornet.org/acm-ccs/proton#Alias_2d3fd573-5a41-403b-bdf3-22332ad9d839> . a protont:Topic ; protons:generatedBy ; protons:hasMainAlias < http://www.lornet.org/acm-ccs/proton#Alias_171d052f-553e-4990-bd15-8416d28f4cf1> . < http://www.lornet.org/acm-ccs/proton#Alias_d680fd76-0dcf-417b-9f4e-5cbda4616b72 > a protons:Alias ; "Pixel Classification at en" . I added this document to owlim.ttl and imported my instances. I tried to follow document Customizing KIM 3.pdf, but as mapping has already been done, I didn't know what else to do. Maybe I should create Jape rule, or something like that, but I think that I should see Topic with or without my instances. I'm not sure, that is only my opinion. Best, Srecko On Thu, Jul 14, 2011 at 12:48 PM, Philip Alexiev @ Ontotext < philip.alexiev at ontotext.com> wrote: > Can you describe the exact actions you take to add the topics to the IE > logic ? 
The exact customizations you have made to KIM. > > Thanks, > Philip > > On 14 Jul 2011, at 1:41 PM, srecko joksimovic wrote: > > Hi Philip, > with GATE is same as with Java code. I get the same annotations. I tried to > edit nerc.properties and add Topic to *com.ontotext.kim.KIMConstants.IE_ANN_TYPES > * list, but nothing changed*. > * > Do I have to change something else? > > Best, > Srecko > > On Thu, Jul 14, 2011 at 12:26 PM, Philip Alexiev @ Ontotext < > philip.alexiev at ontotext.com> wrote: > >> Hi Srecko, >> >> You can run the gate interface to check exactly what annotations are >> create ant their type. You can do this by running: >> bash KIM/bin/kim gate >> >> You probably use a Jape rule to match the Lookup annotations with class=" >> http://proton.semanticweb.org/2006/05/protont#Topic" and are creating >> one of the entity annotations over it (the entity annotations are a >> whitelist of annotations that remain after the annotation process finishes, >> all annotations not in this list are removed). >> >> So check what type of annotation you are creating. >> >> If this is not the case, please provide more details how you handle the >> topic lookups. >> >> All the best, >> Philip >> >> >> On 14 Jul 2011, at 1:19 PM, srecko joksimovic wrote: >> >> > Hello Philip, >> > >> > I included my instances in KIM. When I use web UI, I see them all, and >> everything looks ok. 
But when I run code like this: >> > >> > KIMDocument kimDoc = >> apiCorpora.createDocument(_string_to_annotate, true); >> > >> > kimDoc = apiSemAnn.execute(kimDoc); >> > >> > KIMAnnotationSet kimASet = kimDoc.getAnnotations(); >> > Set typesSet = kimASet.getAllTypes(); >> > Iterator iterator = typesSet.iterator(); >> > >> > // show annotations of every type separately >> > while(iterator.hasNext()) >> > { >> > Object key = iterator.next(); >> > KIMAnnotationSet kimFilteredASet = >> kimASet.get(String.valueOf(key)); >> > Iterator annIterator = kimFilteredASet.iterator(); >> > System.out.println(" = Annotations of type [" + >> String.valueOf(key) + "] :"); >> > >> > while(annIterator.hasNext()) >> > { >> > System.out.println(" -- " + annIterator.next()); >> > } >> > } >> > System.out.println("[ Document's Typed Annotations (end) ]"); >> > >> > I don't see any annotation of type Topic. I see all of them when I use >> web UI, like I said. But when I try to annotate string from Java >> application, I don't get any Topic annotations. >> > >> > Could you please help me on this one? >> > >> > Best, >> > Srecko >> > _______________________________________________ >> > Kim-discussion mailing list >> > Kim-discussion at ontotext.com >> > http://ontotext.com/mailman/listinfo/kim-discussion >> >> > _______________________________________________ > Kim-discussion mailing list > Kim-discussion at ontotext.com > http://ontotext.com/mailman/listinfo/kim-discussion > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From philip.alexiev at ontotext.com Thu Jul 14 07:15:59 2011 From: philip.alexiev at ontotext.com (Philip Alexiev @ Ontotext) Date: Thu, 14 Jul 2011 14:15:59 +0300 Subject: [Kim-discussion] Topic In-Reply-To: References: <0B1D4FB4-E7F8-4BCD-A4FF-F138F2D3D990@ontotext.com> Message-ID: The process is described in the customization guide you mentioned. You have added this RDF to the semantic repository. 
This means that the gazetteer will now be able to match the instances described there (searching for their labels in the texts) and will create Lookup annotations when it finds a match. The lookup process is generally only one side of the IE process. Lookup annotations are further examined by some logic to determine their validity, or they take part in the recognition of more complex phrases. That is why they are not left after the IE has finished over the document.

Tip: You can disable the last resource in the pipeline and run it again over a document to see all the annotations that are created in the process, including the temporary ones. This will show you the Lookup annotations as well. You can search for your Topic instances there.

Once you have the lookups, you should tell KIM that they are important for you and you want to keep them. Add the Topic annotation type to the com.ontotext.kim.KIMConstants.IE_ANN_TYPES list. This will tell KIM not to clear them at the end of the IE process. What is left for you to do is to create a Topic annotation over the Lookup for the topic that the Gazetteer has created. You can use a simple Jape rule to do that:

Phase: GazTopic
Input: Lookup
Options: control = appelt

Rule: Topic
(
  {Lookup.class == "http://proton.semanticweb.org/2006/05/protont#Topic"}
):topic
-->
:topic.Topic = {rule = Topic, class = :topic.Lookup.class, inst=:topic.Lookup.inst}

This is all that you need to include your topics in the IE process and to be able to see them in the graphical interface.

Hope this helps
philip

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sreckojoksimovic at gmail.com Thu Jul 14 07:50:51 2011
From: sreckojoksimovic at gmail.com (srecko joksimovic)
Date: Thu, 14 Jul 2011 13:50:51 +0200
Subject: [Kim-discussion] Topic
In-Reply-To:
References: <0B1D4FB4-E7F8-4BCD-A4FF-F138F2D3D990@ontotext.com>
Message-ID:

I configured nerc.properties, and now I have this:

com.ontotext.kim.KIMConstants.IE_ANN_TYPES=Abstract, Brand, ContactInfo, Date, Entity, Event, GeneralTerm, KeyLocation, KeyOrganization, KeyPerson, KeyPhrase, Location, Money, Object, Organization, Percent, Person, Position, Time, Acquirement, JobTitle, Number, Topic

Then I disabled the last resource in the pipeline, but I still can't see Topic. Maybe I didn't understand well: should I first create the Jape rule, or is this enough to see Topic?

Best,
Srecko

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From boyan.kukushev at ontotext.com Thu Jul 14 08:13:17 2011
From: boyan.kukushev at ontotext.com (Boyan Kukushev)
Date: Thu, 14 Jul 2011 15:13:17 +0300
Subject: [Kim-discussion] Topic
In-Reply-To:
References:
Message-ID: <201107141513.17099.boyan.kukushev@ontotext.com>

Hi Srecko,

In order to see your Topic annotations, you must create the JAPE rule that Philip suggested:

Phase: GazTopic
Input: Lookup
Options: control = appelt

Rule: Topic
(
  {Lookup.class == "http://proton.semanticweb.org/2006/05/protont#Topic"}
):topic
-->
:topic.Topic = {rule=GazTopic, class=:topic.Lookup.class, inst=:topic.Lookup.inst}

and put that rule just after the gazetteer phases within the GATE pipeline. The easiest way to do this is to use the KIM GATE interface: start KIM/bin/kim(.bat) gate and modify the pipeline.

You have already added the Topic annotation type to the list of allowed annotation types in KIM/config/nerc.properties. After you run the pipeline with this new resource included, Topic annotations should appear in the default annotation set for each document you process.

To be able to use the pipeline again, you should save it, again using the KIM GATE interface: right click on the pipeline and select 'Save application state'. Remember to remove (or empty) the document corpus used by the application. You can choose whether to overwrite the default KIM pipeline (IE.gapp) or to create a new one and point KIM to it (by setting the corresponding property in KIM/config/nerc.properties).

Hope this helps!

Regards,
Boyan

P.S.
What is happening exactly: - the gazetteer phases use pre-defined knowledge base to find specific 'things' in the text you process; they produce annotations of type Lookup - the JAPE rule would take all Lookup annotations that have the specific class (in your case that is "http://proton.semanticweb.org/2006/05/protont#Topic") and would create a new annotation of type Topic that is fully overlapping the current Lookup annotation - the last phase in the pipeline removes all temporary annotations - the Lookup annotation is also a temporary annotation, but Topic (as it is added to the allowed annotations list) will not be removed. On Thursday, July 14, 2011 14:50:51 srecko joksimovic wrote: > I configured nerc.properties, and now I have this: > > com.ontotext.kim.KIMConstants.IE_ANN_TYPES=Abstract, Brand, ContactInfo, > Date, Entity, Event, GeneralTerm, KeyLocation, KeyOrganization, KeyPerson, > KeyPhrase, Location, Money, Object, Organization, Percent, Person, > Position, Time, Acquirement, JobTitle, Number, Topic > > then I disabled last resource in pipeline, but I still can't see Topic. > Maybe I didn't understand well... should I first create Jape rule, or this > is enough to see Topic? > > Best, > Srecko > > > > On Thu, Jul 14, 2011 at 1:15 PM, Philip Alexiev @ Ontotext < > > philip.alexiev at ontotext.com> wrote: > > The process is described in the customization guide you mentioned. > > > > You have added this RDF to the semantic repository. This means that now > > the gazetteer will be able to match the instances described there > > (searching for their labels in the texts) and will create Lookup > > annotations, when it finds a match. The lookup process is generally > > only one side of the IE process. Lookup annotations are further > > examined by some logic to determine their validity, or they take part in > > the recognition of more complex phrases. That is why they are not left > > after the IE has finished over the document. 
> > > > Tip: You can disable the last resource in the pipeline and run it again > > over a document to see all the annotations that are created in the > > process - also the temporary ones. This will show you the Lookup > > annotations as well. You can search for your Topic instances there. > > > > Once you have the lookups, you should tell KIM that they are important > > for you and you want to keep them. Add the Topic annotation type to the > > * com.ontotext.kim.KIMConstants.IE_ANN_TYPES *list. This will tell KIM > > not to clear them in the end of the IE process. Now what is left for > > you to do, is to create a Topic annotation over the Lookup for the > > topic that the Gazetteer has created. You can use a simple Jape rule to > > do that: > > > > > > Phase: GazTopic > > Input: Lookup > > Options: control = appelt > > > > Rule: Topic > > ( > > > > {Lookup.class == "http://proton.semanticweb.org/2006/05/protont#Topic"} > > > > ):topic > > --> > > > > :topic.Topic = {rule = Topic, class = :topic.Lookup.class, > > > > inst=:topic.Lookup.inst} > > > > > > This is all that you need to include your topics in the IE process and to > > be able to see them in the graphical interface. > > > > Hope this helps > > philip > > > > > > On 14 Jul 2011, at 1:58 PM, srecko joksimovic wrote: > > > > It is little hard to explain because I didn't do customisation. I took > > the file where one of my colleagues did it. File contains about 1200 > > instances and has content like this: > > > > @prefix protons: . > > @prefix protont: . > > > > < > > http://www.lornet.org/acm-ccs/proton#Alias_07c6001f-8f5c-49e1-ae3c-92c912 > > 1b4297 > > > > a protons:Alias ; > > > > > > "Convex Programming at en" . > > > > < > > http://www.lornet.org/acm-ccs/proton#Alias_63786f1d-7b3c-4872-b4e0-8227de > > 53c015 > > > > a protons:Alias ; > > > > > > "Document and Text Processing at en" . 
> > > > > > > > a protont:Topic ; > > protons:generatedBy > > > > > > ; > > > > protons:hasMainAlias > > > > < > > > > http://www.lornet.org/acm-ccs/proton#Alias_dee6eec3-b503-4d3e-a98d-ecdde5 > > d5ee1a> . > > > > < > > http://www.lornet.org/acm-ccs/proton#Alias_ff639082-2cc4-484e-92f7-5f0f09 > > 898d2d > > > > a protons:Alias ; > > > > > > "Store and Forward Networks at en" . > > > > < > > http://www.lornet.org/acm-ccs/proton#Alias_7f00cebc-8828-4415-83b3-2eb53a > > e3b0b6 > > > > a protons:Alias ; > > > > > > "Integral Equations at en" . > > > > < > > http://www.lornet.org/acm-ccs/proton#Alias_08a8af20-311b-4a43-b996-7d4a8c > > 034636 > > > > a protons:Alias ; > > > > > > "Information Filtering at en" . > > > > < > > http://www.lornet.org/acm-ccs/proton#Alias_bb28b422-05f4-47f8-a8f5-811610 > > a2f77f > > > > a protons:Alias ; > > > > > > "Surfac eFitting at en" . > > > > < > > http://www.lornet.org/acm-ccs/proton#Alias_8b20c6a0-1b02-4c34-bd30-740d8d > > 150f30 > > > > a protons:Alias ; > > > > > > "Reliability,Availability and Serviceability at en" . > > > > < > > http://www.lornet.org/acm-ccs/proton#Alias_b6e48ecc-f1a9-44e6-8ff9-c88176 > > 47765a > > > > a protons:Alias ; > > > > > > "Aerospace at en" . > > > > > > > > a protont:Topic ; > > protons:generatedBy > > > > > > ; > > > > protons:hasMainAlias > > > > < > > > > http://www.lornet.org/acm-ccs/proton#Alias_a4fdb728-b855-4ee6-b220-b8a93b > > 55f9c7> . > > > > > > > > a protont:Topic ; > > protons:generatedBy > > > > > > ; > > > > protons:hasMainAlias > > > > < > > > > http://www.lornet.org/acm-ccs/proton#Alias_4694954a-ba5e-4333-9ea9-8d5b94 > > 790c4e> . > > > > > > > > a protont:Topic ; > > protons:generatedBy > > > > > > ; > > > > protons:hasMainAlias > > > > < > > > > http://www.lornet.org/acm-ccs/proton#Alias_d59afecf-26fc-4a5b-af92-e8c994 > > 542b23> . 
> > > > > > > > a protont:Topic ; > > protons:generatedBy > > > > > > ; > > > > protons:hasMainAlias > > > > < > > > > http://www.lornet.org/acm-ccs/proton#Alias_33e428d6-3157-41ca-95ee-df7973 > > 4c5a3d> . > > > > > > > > a protont:Topic ; > > protons:generatedBy > > > > > > ; > > > > protons:hasMainAlias > > > > < > > > > http://www.lornet.org/acm-ccs/proton#Alias_2d3fd573-5a41-403b-bdf3-22332a > > d9d839> . > > > > > > > > a protont:Topic ; > > protons:generatedBy > > > > > > ; > > > > protons:hasMainAlias > > > > < > > > > http://www.lornet.org/acm-ccs/proton#Alias_171d052f-553e-4990-bd15-8416d2 > > 8f4cf1> . > > > > < > > http://www.lornet.org/acm-ccs/proton#Alias_d680fd76-0dcf-417b-9f4e-5cbda4 > > 616b72 > > > > a protons:Alias ; > > > > > > "Pixel Classification at en" . > > > > I added this document to owlim.ttl and imported my instances. > > > > I tried to follow document Customizing KIM 3.pdf, but as mapping has > > already been done, I didn't know what else to do. Maybe I should create > > Jape rule, or something like that, but I think that I should see Topic > > with or without my instances. I'm not sure, that is only my opinion. > > > > Best, > > Srecko > > > > On Thu, Jul 14, 2011 at 12:48 PM, Philip Alexiev @ Ontotext < > > > > philip.alexiev at ontotext.com> wrote: > >> Can you describe the exact actions you take to add the topics to the IE > >> logic ? The exact customizations you have made to KIM. > >> > >> Thanks, > >> Philip > >> > >> On 14 Jul 2011, at 1:41 PM, srecko joksimovic wrote: > >> > >> Hi Philip, > >> with GATE is same as with Java code. I get the same annotations. I tried > >> to edit nerc.properties and add Topic to > >> *com.ontotext.kim.KIMConstants.IE_ANN_TYPES * list, but nothing > >> changed*. > >> * > >> Do I have to change something else? 
> >> > >> Best, > >> Srecko > >> > >> On Thu, Jul 14, 2011 at 12:26 PM, Philip Alexiev @ Ontotext < > >> > >> philip.alexiev at ontotext.com> wrote: > >>> Hi Srecko, > >>> > >>> You can run the gate interface to check exactly what annotations are > >>> create ant their type. You can do this by running: > >>> bash KIM/bin/kim gate > >>> > >>> You probably use a Jape rule to match the Lookup annotations with > >>> class=" http://proton.semanticweb.org/2006/05/protont#Topic" and are > >>> creating one of the entity annotations over it (the entity > >>> annotations are a whitelist of annotations that remain after the > >>> annotation process finishes, all annotations not in this list are > >>> removed). > >>> > >>> So check what type of annotation you are creating. > >>> > >>> If this is not the case, please provide more details how you handle the > >>> > >>> topic lookups. > >>> > >>> All the best, > >>> Philip > >>> > >>> On 14 Jul 2011, at 1:19 PM, srecko joksimovic wrote: > >>> > Hello Philip, > >>> > > >>> > I included my instances in KIM. When I use web UI, I see them all, > >>> > and > >>> > >>> everything looks ok. 
But when I run code like this:
> >>> >
> >>> >     KIMDocument kimDoc = apiCorpora.createDocument(_string_to_annotate, true);
> >>> >     kimDoc = apiSemAnn.execute(kimDoc);
> >>> >
> >>> >     KIMAnnotationSet kimASet = kimDoc.getAnnotations();
> >>> >     Set typesSet = kimASet.getAllTypes();
> >>> >     Iterator iterator = typesSet.iterator();
> >>> >
> >>> >     // show annotations of every type separately
> >>> >     while(iterator.hasNext())
> >>> >     {
> >>> >         Object key = iterator.next();
> >>> >         KIMAnnotationSet kimFilteredASet = kimASet.get(String.valueOf(key));
> >>> >         Iterator annIterator = kimFilteredASet.iterator();
> >>> >         System.out.println(" = Annotations of type [" + String.valueOf(key) + "] :");
> >>> >         while(annIterator.hasNext())
> >>> >         {
> >>> >             System.out.println(" -- " + annIterator.next());
> >>> >         }
> >>> >     }
> >>> >     System.out.println("[ Document's Typed Annotations (end) ]");
> >>> >
> >>> > I don't see any annotation of type Topic. I see all of them when I use
> >>> > web UI, like I said. But when I try to annotate string from Java
> >>> > application, I don't get any Topic annotations.
> >>> >
> >>> > Could you please help me on this one?
> >>> >
> >>> > Best,
> >>> > Srecko
> >>> > _______________________________________________
> >>> > Kim-discussion mailing list
> >>> > Kim-discussion at ontotext.com
> >>> > http://ontotext.com/mailman/listinfo/kim-discussion
> >>
> >> _______________________________________________
> >> Kim-discussion mailing list
> >> Kim-discussion at ontotext.com
> >> http://ontotext.com/mailman/listinfo/kim-discussion

--
Boyan Kukushev
Senior Software Engineer / Java Developer
Ontotext AD @ Sirma Group Corp.
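
The stages discussed in this thread (the gazetteer produces Lookup annotations, a JAPE rule adds a Topic annotation over each matching Lookup, and the final cleanup keeps only whitelisted types) can be sketched in plain Java. This is only an illustration under stated assumptions: the class TopicPipelineSketch, the Ann type, and the methods promoteTopics and keepAllowed are hypothetical names invented for this sketch and are not part of the KIM or GATE APIs. In a real installation the work is done by the GATE pipeline and the IE_ANN_TYPES setting in nerc.properties.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of the Lookup -> Topic -> cleanup flow; not KIM or GATE code. */
public class TopicPipelineSketch {

    public static final String TOPIC_CLASS =
            "http://proton.semanticweb.org/2006/05/protont#Topic";

    /** Simplified stand-in for an annotation: a type, an ontology class and a text span. */
    public static final class Ann {
        public final String type;
        public final String cls;
        public final int start;
        public final int end;

        public Ann(String type, String cls, int start, int end) {
            this.type = type;
            this.cls = cls;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return type + "[" + start + "," + end + "]";
        }
    }

    /** Mimics the JAPE rule: add a Topic annotation over every Lookup whose class is Topic. */
    public static List<Ann> promoteTopics(List<Ann> anns) {
        List<Ann> out = new ArrayList<>(anns);
        for (Ann a : anns) {
            if ("Lookup".equals(a.type) && TOPIC_CLASS.equals(a.cls)) {
                out.add(new Ann("Topic", a.cls, a.start, a.end));
            }
        }
        return out;
    }

    /** Mimics the final cleanup: drop every annotation whose type is not whitelisted. */
    public static List<Ann> keepAllowed(List<Ann> anns, Set<String> allowedTypes) {
        List<Ann> out = new ArrayList<>();
        for (Ann a : anns) {
            if (allowedTypes.contains(a.type)) {
                out.add(a);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed gazetteer output: one Topic lookup and one unrelated lookup.
        List<Ann> lookups = Arrays.asList(
                new Ann("Lookup", TOPIC_CLASS, 0, 18),
                new Ann("Lookup", "http://example.org/Other", 25, 30));

        // Shortened whitelist; the real one is the IE_ANN_TYPES list.
        Set<String> allowed = new HashSet<>(Arrays.asList("Person", "Location", "Topic"));

        // Whitelisting Topic alone is not enough: only Lookup annotations exist so far.
        System.out.println("without rule: " + keepAllowed(lookups, allowed));

        // With the promotion step, the Topic annotation survives the cleanup.
        System.out.println("with rule:    " + keepAllowed(promoteTopics(lookups), allowed));
    }
}
```

In this model, adding Topic to the whitelist without the promotion step leaves an empty result, because the only annotations present are of type Lookup. This matches the symptom reported in the thread, though the sketch says nothing about whether the gazetteer actually produced the Lookup annotations in the first place.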
From sreckojoksimovic at gmail.com Thu Jul 14 08:15:26 2011
From: sreckojoksimovic at gmail.com (srecko joksimovic)
Date: Thu, 14 Jul 2011 14:15:26 +0200
Subject: [Kim-discussion] Topic
In-Reply-To: <201107141513.17099.boyan.kukushev@ontotext.com>
References: <201107141513.17099.boyan.kukushev@ontotext.com>
Message-ID:

Hi Boyan,
I didn't understand that I must create a JAPE rule before I do everything else.
I'll try this now.

Thank you!

Srecko

On Thu, Jul 14, 2011 at 2:13 PM, Boyan Kukushev wrote:

> Hi Srecko,
>
> In order to see your Topic annotations, you must create the JAPE rule that
> Philip suggested:
>
> Phase: GazTopic
> Input: Lookup
> Options: control = appelt
> Rule: Topic
> (
>  {Lookup.class == "http://proton.semanticweb.org/2006/05/protont#Topic"}
> ):topic
> -->
> :topic.Topic = {rule=GazTopic, class=:topic.Lookup.class, inst=:topic.Lookup.inst}
>
> and put that rule just after the gazetteer phases within the GATE pipeline.
> The easiest way to do this is using the KIM GATE interface by starting
> KIM/bin/kim(.bat) gate
>
> and modifying the pipeline.
>
> You have already added the Topic annotation type to the list of allowed
> annotation types in KIM/config/nerc.properties. After you run the pipeline
> with this new resource included, Topic annotations should appear in the
> default annotation set for each document you process.
>
> To be able to use again the pipeline, you should save it, again using the
> KIM GATE interface - right click on the pipeline and select 'Save application
> state'. Remember to remove (or empty) the document corpus used by the
> application. You choose whether to overwrite the default KIM pipeline
> (IE.gapp) or create a new one and point KIM to use it (setting the
> corresponding property in KIM/config/nerc.properties).
>
> Hope this helps!
>
> Regards,
> Boyan
>
> P.S. What is happening exactly:
> - the gazetteer phases use pre-defined knowledge base to find specific
>   'things' in the text you process; they produce annotations of type Lookup
> - the JAPE rule would take all Lookup annotations that have the specific
>   class (in your case that is
>   "http://proton.semanticweb.org/2006/05/protont#Topic") and would create a
>   new annotation of type Topic that is fully overlapping the current Lookup
>   annotation
> - the last phase in the pipeline removes all temporary annotations - the
>   Lookup annotation is also a temporary annotation, but Topic (as it is added
>   to the allowed annotations list) will not be removed.
>
> On Thursday, July 14, 2011 14:50:51 srecko joksimovic wrote:
> > I configured nerc.properties, and now I have this:
> >
> > com.ontotext.kim.KIMConstants.IE_ANN_TYPES=Abstract, Brand, ContactInfo,
> > Date, Entity, Event, GeneralTerm, KeyLocation, KeyOrganization, KeyPerson,
> > KeyPhrase, Location, Money, Object, Organization, Percent, Person,
> > Position, Time, Acquirement, JobTitle, Number, Topic
> >
> > then I disabled last resource in pipeline, but I still can't see Topic.
> > Maybe I didn't understand well... should I first create Jape rule, or this
> > is enough to see Topic?
> >
> > Best,
> > Srecko
> >
> > On Thu, Jul 14, 2011 at 1:15 PM, Philip Alexiev @ Ontotext <
> > philip.alexiev at ontotext.com> wrote:
> > > The process is described in the customization guide you mentioned.
> > >
> > > You have added this RDF to the semantic repository. This means that now
> > > the gazetteer will be able to match the instances described there
> > > (searching for their labels in the texts) and will create Lookup
> > > annotations, when it finds a match. The lookup process is generally
> > > only one side of the IE process. Lookup annotations are further
> > > examined by some logic to determine their validity, or they take part in
> > > the recognition of more complex phrases. That is why they are not left
> > > after the IE has finished over the document.
> > >
> > > Tip: You can disable the last resource in the pipeline and run it again
> > > over a document to see all the annotations that are created in the
> > > process - also the temporary ones. This will show you the Lookup
> > > annotations as well. You can search for your Topic instances there.
> > >
> > > Once you have the lookups, you should tell KIM that they are important
> > > for you and you want to keep them. Add the Topic annotation type to the
> > > *com.ontotext.kim.KIMConstants.IE_ANN_TYPES* list. This will tell KIM
> > > not to clear them in the end of the IE process. Now what is left for
> > > you to do, is to create a Topic annotation over the Lookup for the
> > > topic that the Gazetteer has created. You can use a simple Jape rule to
> > > do that:
> > >
> > > Phase: GazTopic
> > > Input: Lookup
> > > Options: control = appelt
> > >
> > > Rule: Topic
> > > (
> > >  {Lookup.class == "http://proton.semanticweb.org/2006/05/protont#Topic"}
> > > ):topic
> > > -->
> > > :topic.Topic = {rule = Topic, class = :topic.Lookup.class, inst=:topic.Lookup.inst}
> > >
> > > This is all that you need to include your topics in the IE process and to
> > > be able to see them in the graphical interface.
> > >
> > > Hope this helps
> > > philip
> > >
> > > On 14 Jul 2011, at 1:58 PM, srecko joksimovic wrote:
> > >
> > > It is little hard to explain because I didn't do customisation. I took
> > > the file where one of my colleagues did it. File contains about 1200
> > > instances and has content like this:
> > >
> > > @prefix protons: .
> > > @prefix protont: .
> > >
> > > <http://www.lornet.org/acm-ccs/proton#Alias_07c6001f-8f5c-49e1-ae3c-92c9121b4297>
> > >       a protons:Alias ;
> > >
> > >       "Convex Programming at en" .
> > > > > > < > > > > http://www.lornet.org/acm-ccs/proton#Alias_63786f1d-7b3c-4872-b4e0-8227de > > > 53c015 > > > > > > a protons:Alias ; > > > > > > > > > "Document and Text Processing at en" . > > > > > > > > > > > > a protont:Topic ; > > > protons:generatedBy > > > > > > > > > ; > > > > > > protons:hasMainAlias > > > > > > < > > > > > > > http://www.lornet.org/acm-ccs/proton#Alias_dee6eec3-b503-4d3e-a98d-ecdde5 > > > d5ee1a> . > > > > > > < > > > > http://www.lornet.org/acm-ccs/proton#Alias_ff639082-2cc4-484e-92f7-5f0f09 > > > 898d2d > > > > > > a protons:Alias ; > > > > > > > > > "Store and Forward Networks at en" . > > > > > > < > > > > http://www.lornet.org/acm-ccs/proton#Alias_7f00cebc-8828-4415-83b3-2eb53a > > > e3b0b6 > > > > > > a protons:Alias ; > > > > > > > > > "Integral Equations at en" . > > > > > > < > > > > http://www.lornet.org/acm-ccs/proton#Alias_08a8af20-311b-4a43-b996-7d4a8c > > > 034636 > > > > > > a protons:Alias ; > > > > > > > > > "Information Filtering at en" . > > > > > > < > > > > http://www.lornet.org/acm-ccs/proton#Alias_bb28b422-05f4-47f8-a8f5-811610 > > > a2f77f > > > > > > a protons:Alias ; > > > > > > > > > "Surfac eFitting at en" . > > > > > > < > > > > http://www.lornet.org/acm-ccs/proton#Alias_8b20c6a0-1b02-4c34-bd30-740d8d > > > 150f30 > > > > > > a protons:Alias ; > > > > > > > > > "Reliability,Availability and Serviceability at en" . > > > > > > < > > > > http://www.lornet.org/acm-ccs/proton#Alias_b6e48ecc-f1a9-44e6-8ff9-c88176 > > > 47765a > > > > > > a protons:Alias ; > > > > > > > > > "Aerospace at en" . > > > > > > > > > > > > a protont:Topic ; > > > protons:generatedBy > > > > > > > > > ; > > > > > > protons:hasMainAlias > > > > > > < > > > > > > > http://www.lornet.org/acm-ccs/proton#Alias_a4fdb728-b855-4ee6-b220-b8a93b > > > 55f9c7> . 
> > > > > > > > > > > > a protont:Topic ; > > > protons:generatedBy > > > > > > > > > ; > > > > > > protons:hasMainAlias > > > > > > < > > > > > > > http://www.lornet.org/acm-ccs/proton#Alias_4694954a-ba5e-4333-9ea9-8d5b94 > > > 790c4e> . > > > > > > > > > > > > a protont:Topic ; > > > protons:generatedBy > > > > > > > > > ; > > > > > > protons:hasMainAlias > > > > > > < > > > > > > > http://www.lornet.org/acm-ccs/proton#Alias_d59afecf-26fc-4a5b-af92-e8c994 > > > 542b23> . > > > > > > > > > > > > a protont:Topic ; > > > protons:generatedBy > > > > > > > > > ; > > > > > > protons:hasMainAlias > > > > > > < > > > > > > > http://www.lornet.org/acm-ccs/proton#Alias_33e428d6-3157-41ca-95ee-df7973 > > > 4c5a3d> . > > > > > > > > > > > > a protont:Topic ; > > > protons:generatedBy > > > > > > > > > ; > > > > > > protons:hasMainAlias > > > > > > < > > > > > > > http://www.lornet.org/acm-ccs/proton#Alias_2d3fd573-5a41-403b-bdf3-22332a > > > d9d839> . > > > > > > > > > > > > a protont:Topic ; > > > protons:generatedBy > > > > > > > > > ; > > > > > > protons:hasMainAlias > > > > > > < > > > > > > > http://www.lornet.org/acm-ccs/proton#Alias_171d052f-553e-4990-bd15-8416d2 > > > 8f4cf1> . > > > > > > < > > > > http://www.lornet.org/acm-ccs/proton#Alias_d680fd76-0dcf-417b-9f4e-5cbda4 > > > 616b72 > > > > > > a protons:Alias ; > > > > > > > > > "Pixel Classification at en" . > > > > > > I added this document to owlim.ttl and imported my instances. > > > > > > I tried to follow document Customizing KIM 3.pdf, but as mapping has > > > already been done, I didn't know what else to do. Maybe I should create > > > Jape rule, or something like that, but I think that I should see Topic > > > with or without my instances. I'm not sure, that is only my opinion. 
> > > > > > Best, > > > Srecko > > > > > > On Thu, Jul 14, 2011 at 12:48 PM, Philip Alexiev @ Ontotext < > > > > > > philip.alexiev at ontotext.com> wrote: > > >> Can you describe the exact actions you take to add the topics to the > IE > > >> logic ? The exact customizations you have made to KIM. > > >> > > >> Thanks, > > >> Philip > > >> > > >> On 14 Jul 2011, at 1:41 PM, srecko joksimovic wrote: > > >> > > >> Hi Philip, > > >> with GATE is same as with Java code. I get the same annotations. I > tried > > >> to edit nerc.properties and add Topic to > > >> *com.ontotext.kim.KIMConstants.IE_ANN_TYPES * list, but nothing > > >> changed*. > > >> * > > >> Do I have to change something else? > > >> > > >> Best, > > >> Srecko > > >> > > >> On Thu, Jul 14, 2011 at 12:26 PM, Philip Alexiev @ Ontotext < > > >> > > >> philip.alexiev at ontotext.com> wrote: > > >>> Hi Srecko, > > >>> > > >>> You can run the gate interface to check exactly what annotations are > > >>> create ant their type. You can do this by running: > > >>> bash KIM/bin/kim gate > > >>> > > >>> You probably use a Jape rule to match the Lookup annotations with > > >>> class=" http://proton.semanticweb.org/2006/05/protont#Topic" and > are > > >>> creating one of the entity annotations over it (the entity > > >>> annotations are a whitelist of annotations that remain after the > > >>> annotation process finishes, all annotations not in this list are > > >>> removed). > > >>> > > >>> So check what type of annotation you are creating. > > >>> > > >>> If this is not the case, please provide more details how you handle > the > > >>> > > >>> topic lookups. > > >>> > > >>> All the best, > > >>> Philip > > >>> > > >>> On 14 Jul 2011, at 1:19 PM, srecko joksimovic wrote: > > >>> > Hello Philip, > > >>> > > > >>> > I included my instances in KIM. When I use web UI, I see them all, > > >>> > and > > >>> > > >>> everything looks ok. 
But when I run code like this: > > >>> > KIMDocument kimDoc = > > >>> > > >>> apiCorpora.createDocument(_string_to_annotate, true); > > >>> > > >>> > kimDoc = apiSemAnn.execute(kimDoc); > > >>> > > > >>> > KIMAnnotationSet kimASet = kimDoc.getAnnotations(); > > >>> > Set typesSet = kimASet.getAllTypes(); > > >>> > Iterator iterator = typesSet.iterator(); > > >>> > > > >>> > // show annotations of every type separately > > >>> > while(iterator.hasNext()) > > >>> > { > > >>> > > > >>> > Object key = iterator.next(); > > >>> > KIMAnnotationSet kimFilteredASet = > > >>> > > >>> kimASet.get(String.valueOf(key)); > > >>> > > >>> > Iterator annIterator = kimFilteredASet.iterator(); > > >>> > System.out.println(" = Annotations of type [" + > > >>> > > >>> String.valueOf(key) + "] :"); > > >>> > > >>> > while(annIterator.hasNext()) > > >>> > { > > >>> > > > >>> > System.out.println(" -- " + annIterator.next()); > > >>> > > > >>> > } > > >>> > > > >>> > } > > >>> > System.out.println("[ Document's Typed Annotations (end) > > >>> > > >>> ]"); > > >>> > > >>> > I don't see any annotation of type Topic. I see all of them when I > > >>> > use > > >>> > > >>> web UI, like I said. But when I try to annotate string from Java > > >>> application, I don't get any Topic annotations. > > >>> > > >>> > Could you please help me on this one? > > >>> > > > >>> > Best, > > >>> > Srecko > > >>> > _______________________________________________ > > >>> > Kim-discussion mailing list > > >>> > Kim-discussion at ontotext.com > > >>> > http://ontotext.com/mailman/listinfo/kim-discussion > > >> > > >> _______________________________________________ > > >> Kim-discussion mailing list > > >> Kim-discussion at ontotext.com > > >> http://ontotext.com/mailman/listinfo/kim-discussion > > -- > Boyan Kukushev > Senior Software Engineer / Java Developer > Ontotext AD @ Sirma Group Corp. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From sreckojoksimovic at gmail.com Thu Jul 14 08:42:00 2011
From: sreckojoksimovic at gmail.com (srecko joksimovic)
Date: Thu, 14 Jul 2011 14:42:00 +0200
Subject: [Kim-discussion] Topic
In-Reply-To:
References: <201107141513.17099.boyan.kukushev@ontotext.com>
Message-ID:

Hi Boyan,
I created the JAPE rule like you and Philip suggested, and stored it in the
context/default/resources/grammar/acm folder. Then I ran gate and created a
JAPE transducer, topic_jape. I didn't specify inputASName, but I did add it
to the pipeline. I saved the application state, populated the KIMServer
corpus, and ran the application. I don't know how, but there is everything
but Topic.

I'm still missing something, but I don't know what. Should I create a Large
KB Gazetteer?

Best,
Srecko

On Thu, Jul 14, 2011 at 2:15 PM, srecko joksimovic <
sreckojoksimovic at gmail.com> wrote:

> Hi Boyan,
> I didn't understand that I must create JAPE rule before I do everything
> else.
> I'll try this now.
>
> Thank you!
>
> Srecko
>
>
> On Thu, Jul 14, 2011 at 2:13 PM, Boyan Kukushev <
> boyan.kukushev at ontotext.com> wrote:
>
>> Hi Srecko,
>>
>> In order to see your Topic annotations, you must create the JAPE rule that
>> Philip suggested:
>>
>> Phase: GazTopic
>> Input: Lookup
>> Options: control = appelt
>> Rule: Topic
>> (
>>  {Lookup.class == "http://proton.semanticweb.org/2006/05/protont#Topic"}
>> ):topic
>> -->
>> :topic.Topic = {rule=GazTopic, class=:topic.Lookup.class, inst=:topic.Lookup.inst}
>>
>> and put that rule just after the gazetteer phases within the GATE
>> pipeline.
>> The easiest way to do this is using the KIM GATE interface by starting
>> KIM/bin/kim(.bat) gate
>>
>> and modifying the pipeline.
>>
>> You have already added the Topic annotation type to the list of allowed
>> annotation types in KIM/config/nerc.properties. After you run the pipeline
>> with this new resource included, Topic annotations should appear in the
>> default annotation set for each document you process.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From philip.alexiev at ontotext.com Thu Jul 14 08:49:28 2011
From: philip.alexiev at ontotext.com (Philip Alexiev @ Ontotext)
Date: Thu, 14 Jul 2011 15:49:28 +0300
Subject: [Kim-discussion] Topic
In-Reply-To:
References: <201107141513.17099.boyan.kukushev@ontotext.com>
Message-ID:

Hi Srecko,

Make sure you add the JAPE transducer after the gazetteer. Also, it will be
much simpler to examine the process directly in the GATE interface. Create a
file, add it to the corpus, set the corpus on the pipeline, and disable the
last resource so that the temporary annotations are not removed. Then run the
pipeline and explore the annotations. See if Lookup annotations are created
over the topics in the text.

Things to have in mind:
- after you change the RDF that is imported in KIM, you should delete the
  KIM/context/default/populated folder (where the cache is stored) and
  restart the server
- the JAPE transducer should be after the Gazetteer, as it uses the
  annotations created by the Gazetteer as its input.

Hope this helps
Philip

On 14 Jul 2011, at 3:42 PM, srecko joksimovic wrote:

> Hi Boyan,
> I created JAPE rule like you and Philip suggested, and stored to context/default/resources/grammar/acm folder. Then I run gate, and created JAPE transducer, topic_jape. I didn't specify inputASName, but I did add it to pipeline. Saved application state, populated KIMServer corpus, and run the application. I don't know how, but there is everything but Topic.
>
> I'm still missing something, but I don't know what. Should I create Large KB Gazetteer?
>
> Best,
> Srecko
>
> On Thu, Jul 14, 2011 at 2:15 PM, srecko joksimovic wrote:
> Hi Boyan,
> I didn't understand that I must create JAPE rule before I do everything else.
> I'll try this now.
>
> Thank you!
> > Srecko > > > On Thu, Jul 14, 2011 at 2:13 PM, Boyan Kukushev wrote: > Hi Srecko, > > In order to see your Topic annotations, you must create the JAPE rule that > Philip suggested: > > Phase: GazTopic > Input: Lookup > Options: control = appelt > Rule: Topic > ( > {Lookup.class == "http://proton.semanticweb.org/2006/05/protont#Topic"} > ):topic > --> > :topic.Topic = {rule=GazTopic, class=:topic.Lookup.class, > inst=:topic.Lookup.inst} > > and put that rule just after the gazetteer phases within the GATE pipeline. > The easiest way to do this is using the KIM GATE interface by starting > KIM/bin/kim(.bat) gate > > and modifying the pipeline. > > You have already added the Topic annotation type to the list of allowed > annotation types in KIM/config/nerc.properties. After you run the pipeline > with this new resource incuded, Topic annotations should appear in the default > annotation set for each document you process. > > To be able to use again the pipeline, you should save it, again using the KIM > GATE interface - right click on the pipeline and select 'Save application > state'. Remember to remove (or empty) the document corpus used by the > application. You choose whether to overwrite the default KIM pipeline > (IE.gapp) or create a new one and point KIM to use it (setting the > corresponding property in KIM/config/nerc.properties). > > Hope this helps! > > Regards, > Boyan > > P.S. 
What is happening exactly: > - the gazetteer phases use pre-defined knowledge base to find specific > 'things' in the text you process; they produce annotations of type Lookup > - the JAPE rule would take all Lookup annotations that have the specific > class (in your case that is > "http://proton.semanticweb.org/2006/05/protont#Topic") and would create a new > annotation of type Topic that is fully overlapping the current Lookup > annotation > - the last phase in the pipeline removes all temporary annotations - the > Lookup annotation is also a temporary annotation, but Topic (as it is added to > the allowed annotations list) will not be removed. > > On Thursday, July 14, 2011 14:50:51 srecko joksimovic wrote: > > I configured nerc.properties, and now I have this: > > > > com.ontotext.kim.KIMConstants.IE_ANN_TYPES=Abstract, Brand, ContactInfo, > > Date, Entity, Event, GeneralTerm, KeyLocation, KeyOrganization, KeyPerson, > > KeyPhrase, Location, Money, Object, Organization, Percent, Person, > > Position, Time, Acquirement, JobTitle, Number, Topic > > > > then I disabled last resource in pipeline, but I still can't see Topic. > > Maybe I didn't understand well... should I first create Jape rule, or this > > is enough to see Topic? > > > > Best, > > Srecko > > > > > > > > On Thu, Jul 14, 2011 at 1:15 PM, Philip Alexiev @ Ontotext < > > > > philip.alexiev at ontotext.com> wrote: > > > The process is described in the customization guide you mentioned. > > > > > > You have added this RDF to the semantic repository. This means that now > > > the gazetteer will be able to match the instances described there > > > (searching for their labels in the texts) and will create Lookup > > > annotations, when it finds a match. The lookup process is generally > > > only one side of the IE process. Lookup annotations are further > > > examined by some logic to determine their validity, or they take part in > > > the recognition of more complex phrases. 
That is why they are not left > > > after the IE has finished over the document. > > > > > > Tip: You can disable the last resource in the pipeline and run it again > > > over a document to see all the annotations that are created in the > > > process - also the temporary ones. This will show you the Lookup > > > annotations as well. You can search for your Topic instances there. > > > > > > Once you have the lookups, you should tell KIM that they are important > > > for you and you want to keep them. Add the Topic annotation type to the > > > * com.ontotext.kim.KIMConstants.IE_ANN_TYPES *list. This will tell KIM > > > not to clear them in the end of the IE process. Now what is left for > > > you to do, is to create a Topic annotation over the Lookup for the > > > topic that the Gazetteer has created. You can use a simple Jape rule to > > > do that: > > > > > > > > > Phase: GazTopic > > > Input: Lookup > > > Options: control = appelt > > > > > > Rule: Topic > > > ( > > > > > > {Lookup.class == "http://proton.semanticweb.org/2006/05/protont#Topic"} > > > > > > ):topic > > > --> > > > > > > :topic.Topic = {rule = Topic, class = :topic.Lookup.class, > > > > > > inst=:topic.Lookup.inst} > > > > > > > > > This is all that you need to include your topics in the IE process and to > > > be able to see them in the graphical interface. > > > > > > Hope this helps > > > philip > > > > > > > > > On 14 Jul 2011, at 1:58 PM, srecko joksimovic wrote: > > > > > > It is little hard to explain because I didn't do customisation. I took > > > the file where one of my colleagues did it. File contains about 1200 > > > instances and has content like this: > > > > > > @prefix protons: . > > > @prefix protont: . > > > > > > < > > > http://www.lornet.org/acm-ccs/proton#Alias_07c6001f-8f5c-49e1-ae3c-92c912 > > > 1b4297 > > > > > > a protons:Alias ; > > > > > > > > > "Convex Programming at en" . 
> > > > > > < > > > http://www.lornet.org/acm-ccs/proton#Alias_63786f1d-7b3c-4872-b4e0-8227de > > > 53c015 > > > > > > a protons:Alias ; > > > > > > > > > "Document and Text Processing at en" . > > > > > > > > > > > > a protont:Topic ; > > > protons:generatedBy > > > > > > > > > ; > > > > > > protons:hasMainAlias > > > > > > < > > > > > > http://www.lornet.org/acm-ccs/proton#Alias_dee6eec3-b503-4d3e-a98d-ecdde5 > > > d5ee1a> . > > > > > > < > > > http://www.lornet.org/acm-ccs/proton#Alias_ff639082-2cc4-484e-92f7-5f0f09 > > > 898d2d > > > > > > a protons:Alias ; > > > > > > > > > "Store and Forward Networks at en" . > > > > > > < > > > http://www.lornet.org/acm-ccs/proton#Alias_7f00cebc-8828-4415-83b3-2eb53a > > > e3b0b6 > > > > > > a protons:Alias ; > > > > > > > > > "Integral Equations at en" . > > > > > > < > > > http://www.lornet.org/acm-ccs/proton#Alias_08a8af20-311b-4a43-b996-7d4a8c > > > 034636 > > > > > > a protons:Alias ; > > > > > > > > > "Information Filtering at en" . > > > > > > < > > > http://www.lornet.org/acm-ccs/proton#Alias_bb28b422-05f4-47f8-a8f5-811610 > > > a2f77f > > > > > > a protons:Alias ; > > > > > > > > > "Surfac eFitting at en" . > > > > > > < > > > http://www.lornet.org/acm-ccs/proton#Alias_8b20c6a0-1b02-4c34-bd30-740d8d > > > 150f30 > > > > > > a protons:Alias ; > > > > > > > > > "Reliability,Availability and Serviceability at en" . > > > > > > < > > > http://www.lornet.org/acm-ccs/proton#Alias_b6e48ecc-f1a9-44e6-8ff9-c88176 > > > 47765a > > > > > > a protons:Alias ; > > > > > > > > > "Aerospace at en" . > > > > > > > > > > > > a protont:Topic ; > > > protons:generatedBy > > > > > > > > > ; > > > > > > protons:hasMainAlias > > > > > > < > > > > > > http://www.lornet.org/acm-ccs/proton#Alias_a4fdb728-b855-4ee6-b220-b8a93b > > > 55f9c7> . 
> > > > > > > > > > > > a protont:Topic ; > > > protons:generatedBy > > > > > > > > > ; > > > > > > protons:hasMainAlias > > > > > > < > > > > > > http://www.lornet.org/acm-ccs/proton#Alias_4694954a-ba5e-4333-9ea9-8d5b94 > > > 790c4e> . > > > > > > > > > > > > a protont:Topic ; > > > protons:generatedBy > > > > > > > > > ; > > > > > > protons:hasMainAlias > > > > > > < > > > > > > http://www.lornet.org/acm-ccs/proton#Alias_d59afecf-26fc-4a5b-af92-e8c994 > > > 542b23> . > > > > > > > > > > > > a protont:Topic ; > > > protons:generatedBy > > > > > > > > > ; > > > > > > protons:hasMainAlias > > > > > > < > > > > > > http://www.lornet.org/acm-ccs/proton#Alias_33e428d6-3157-41ca-95ee-df7973 > > > 4c5a3d> . > > > > > > > > > > > > a protont:Topic ; > > > protons:generatedBy > > > > > > > > > ; > > > > > > protons:hasMainAlias > > > > > > < > > > > > > http://www.lornet.org/acm-ccs/proton#Alias_2d3fd573-5a41-403b-bdf3-22332a > > > d9d839> . > > > > > > > > > > > > a protont:Topic ; > > > protons:generatedBy > > > > > > > > > ; > > > > > > protons:hasMainAlias > > > > > > < > > > > > > http://www.lornet.org/acm-ccs/proton#Alias_171d052f-553e-4990-bd15-8416d2 > > > 8f4cf1> . > > > > > > < > > > http://www.lornet.org/acm-ccs/proton#Alias_d680fd76-0dcf-417b-9f4e-5cbda4 > > > 616b72 > > > > > > a protons:Alias ; > > > > > > > > > "Pixel Classification at en" . > > > > > > I added this document to owlim.ttl and imported my instances. > > > > > > I tried to follow document Customizing KIM 3.pdf, but as mapping has > > > already been done, I didn't know what else to do. Maybe I should create > > > Jape rule, or something like that, but I think that I should see Topic > > > with or without my instances. I'm not sure, that is only my opinion. 
> > > > > > Best, > > > Srecko > > > > > > On Thu, Jul 14, 2011 at 12:48 PM, Philip Alexiev @ Ontotext < > > > > > > philip.alexiev at ontotext.com> wrote: > > >> Can you describe the exact actions you take to add the topics to the IE > > >> logic ? The exact customizations you have made to KIM. > > >> > > >> Thanks, > > >> Philip > > >> > > >> On 14 Jul 2011, at 1:41 PM, srecko joksimovic wrote: > > >> > > >> Hi Philip, > > >> with GATE is same as with Java code. I get the same annotations. I tried > > >> to edit nerc.properties and add Topic to > > >> *com.ontotext.kim.KIMConstants.IE_ANN_TYPES * list, but nothing > > >> changed*. > > >> * > > >> Do I have to change something else? > > >> > > >> Best, > > >> Srecko > > >> > > >> On Thu, Jul 14, 2011 at 12:26 PM, Philip Alexiev @ Ontotext < > > >> > > >> philip.alexiev at ontotext.com> wrote: > > >>> Hi Srecko, > > >>> > > >>> You can run the gate interface to check exactly what annotations are > > >>> create ant their type. You can do this by running: > > >>> bash KIM/bin/kim gate > > >>> > > >>> You probably use a Jape rule to match the Lookup annotations with > > >>> class=" http://proton.semanticweb.org/2006/05/protont#Topic" and are > > >>> creating one of the entity annotations over it (the entity > > >>> annotations are a whitelist of annotations that remain after the > > >>> annotation process finishes, all annotations not in this list are > > >>> removed). > > >>> > > >>> So check what type of annotation you are creating. > > >>> > > >>> If this is not the case, please provide more details how you handle the > > >>> > > >>> topic lookups. > > >>> > > >>> All the best, > > >>> Philip > > >>> > > >>> On 14 Jul 2011, at 1:19 PM, srecko joksimovic wrote: > > >>> > Hello Philip, > > >>> > > > >>> > I included my instances in KIM. When I use web UI, I see them all, > > >>> > and > > >>> > > >>> everything looks ok. 
But when I run code like this: > > >>> > KIMDocument kimDoc = > > >>> > > >>> apiCorpora.createDocument(_string_to_annotate, true); > > >>> > > >>> > kimDoc = apiSemAnn.execute(kimDoc); > > >>> > > > >>> > KIMAnnotationSet kimASet = kimDoc.getAnnotations(); > > >>> > Set typesSet = kimASet.getAllTypes(); > > >>> > Iterator iterator = typesSet.iterator(); > > >>> > > > >>> > // show annotations of every type separately > > >>> > while(iterator.hasNext()) > > >>> > { > > >>> > > > >>> > Object key = iterator.next(); > > >>> > KIMAnnotationSet kimFilteredASet = > > >>> > > >>> kimASet.get(String.valueOf(key)); > > >>> > > >>> > Iterator annIterator = kimFilteredASet.iterator(); > > >>> > System.out.println(" = Annotations of type [" + > > >>> > > >>> String.valueOf(key) + "] :"); > > >>> > > >>> > while(annIterator.hasNext()) > > >>> > { > > >>> > > > >>> > System.out.println(" -- " + annIterator.next()); > > >>> > > > >>> > } > > >>> > > > >>> > } > > >>> > System.out.println("[ Document's Typed Annotations (end) > > >>> > > >>> ]"); > > >>> > > >>> > I don't see any annotation of type Topic. I see all of them when I > > >>> > use > > >>> > > >>> web UI, like I said. But when I try to annotate string from Java > > >>> application, I don't get any Topic annotations. > > >>> > > >>> > Could you please help me on this one? > > >>> > > > >>> > Best, > > >>> > Srecko > > >>> > _______________________________________________ > > >>> > Kim-discussion mailing list > > >>> > Kim-discussion at ontotext.com > > >>> > http://ontotext.com/mailman/listinfo/kim-discussion > > >> > > >> _______________________________________________ > > >> Kim-discussion mailing list > > >> Kim-discussion at ontotext.com > > >> http://ontotext.com/mailman/listinfo/kim-discussion > > -- > Boyan Kukushev > Senior Software Engineer / Java Developer > Ontotext AD @ Sirma Group Corp. 
From sreckojoksimovic at gmail.com Fri Jul 15 04:38:00 2011
From: sreckojoksimovic at gmail.com (srecko joksimovic)
Date: Fri, 15 Jul 2011 10:38:00 +0200
Subject: [Kim-discussion] Topic
In-Reply-To:
References: <201107141513.17099.boyan.kukushev@ontotext.com>
Message-ID:

Hi Philip,
Hi Boyan,

I'm still having a problem with Topic, and annotations.
Here is what I did:
I added acm_proton.n3 file to owlim.ttl, and imported new instances. When I run web UI, I'm able to see all I imported.
After that I started GATE. Created query:

PREFIX rdf:
PREFIX protons:
PREFIX protont:
PREFIX rdfs:
SELECT ?topic ?alias ?label
WHERE {
?topic rdf:type protont:Topic.
?topic protons:hasMainAlias ?alias.
?alias a protons:Alias.
?alias rdfs:label ?label.
}

when I evaluate this query with JVisualVM, this is what I get:

[http://www.lornet.org/acm-ccs/proton#J.7.1, http://www.lornet.org/acm-ccs/proton#Alias_dee6eec3-b503-4d3e-a98d-ecdde5d5ee1a, "Consumer Products@en"]
[http://www.lornet.org/acm-ccs/proton#B.7.3.1, http://www.lornet.org/acm-ccs/proton#Alias_a4fdb728-b855-4ee6-b220-b8a93b55f9c7, "Error-Checking@en"]
[http://www.lornet.org/acm-ccs/proton#K.7.2, http://www.lornet.org/acm-ccs/proton#Alias_4694954a-ba5e-4333-9ea9-8d5b94790c4e, "Organizations@en"]
[http://www.lornet.org/acm-ccs/proton#D.3.2.10, http://www.lornet.org/acm-ccs/proton#Alias_d59afecf-26fc-4a5b-af92-e8c994542b23, "Nonprocedural Languages@en"]
[http://www.lornet.org/acm-ccs/proton#C.1.1.2, http://www.lornet.org/acm-ccs/proton#Alias_2d3fd573-5a41-403b-bdf3-22332ad9d839, "RISC/CISC, VLIW Architectures@en"]
...

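In the rows above, each label appears to carry its language tag inside the quotes (for example Consumer Products@en, which this archive may render as "Consumer Products at en"), whereas a Turtle literal puts the tag after the closing quote ("Consumer Products"@en). The following is only a minimal sketch in plain Java, independent of the KIM API, of how such labels could be flagged and rewritten. The class name LabelTagCheck and both method names are hypothetical, and treating a trailing "@" plus two or three letters as a language tag is a heuristic assumption, not a rule taken from KIM.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LabelTagCheck {

    // Heuristic (assumption): label text ending in "@" plus a 2-3 letter code,
    // e.g. Consumer Products@en, has its language tag embedded in the text.
    private static final Pattern EMBEDDED_TAG =
            Pattern.compile("^(.*\\S)@([A-Za-z]{2,3})$");

    /** Returns true if the label text itself ends with something that looks like a language tag. */
    public static boolean hasEmbeddedLanguageTag(String lexicalForm) {
        return EMBEDDED_TAG.matcher(lexicalForm).matches();
    }

    /** Renders the label as a Turtle literal, moving an embedded tag outside the quotes. */
    public static String toTurtleLiteral(String lexicalForm) {
        Matcher m = EMBEDDED_TAG.matcher(lexicalForm);
        if (m.matches()) {
            return "\"" + escape(m.group(1)) + "\"@" + m.group(2);
        }
        return "\"" + escape(lexicalForm) + "\"";
    }

    // Escapes backslashes and double quotes so the result stays a valid quoted string.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        String[] labels = {"Consumer Products@en", "Error-Checking@en", "Organizations"};
        for (String label : labels) {
            System.out.println(hasEmbeddedLanguageTag(label) + "  " + toTurtleLiteral(label));
        }
    }
}
```

A label that legitimately ends in "@" plus a short word would be flagged too, so the output is best treated as a list of candidates to review rather than an automatic fix.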
With GATE, I used the JAPE rule that you suggested:

Phase: GazTopic
Input: Lookup
Options: control = appelt
Rule: Topic
(
{Lookup.class == "http://proton.semanticweb.org/2006/05/protont#Topic"}
):topic
-->
:topic.Topic = {rule=GazTopic, class=:topic.Lookup.class, inst=:topic.Lookup.inst}

Then I tried to create an LKB Gazetteer. When I configure a new LKB Gazetteer using these parameters:

annotationLimit = -1
caseSensitive = false
dictFeederClass = com.ontotext.kim.model.KimDictionaryFeederImpl
dictFeederParams = FeedSetupPath=$relpath$../context/default/resources/gazetteer/acm-topic
dynamicDictEnabled = false
feedTransformerStages =
outputASName =
relpath = file_path
staticDictEnabled = true
staticDictSerializationPath = file_path

(added to the pipeline before the JAPE rule and after the default Gazetteer)

I don't see Topic. But when I set outputASName = Topic, I see Topic as a separate group, with Lookup as its subgroup. When I select Lookup, I can see that something is annotated, but it is not the same as the terms that I imported.

When I try to annotate using Java code, it's still the same.

Could you please tell me what I did wrong?

Best,
Srecko

On Thu, Jul 14, 2011 at 2:49 PM, Philip Alexiev @ Ontotext <philip.alexiev at ontotext.com> wrote:

> Hi Srecko,
>
> Make sure you add the JAPE transducer after the gazetteer.
>
> Also, it will be much simpler to examine the process directly in the GATE
> interface. Create a file, add it to the corpus, set the corpus on the
> pipeline and execute the pipeline.
>
> Disable the last resource so that the temporary annotations are not removed.
> Run the pipeline and explore the annotations. See if Lookup annotations are
> created over the topics in the text.
>
> Things to keep in mind:
> - after you change the RDF that is imported in KIM, you should delete the
> KIM/context/default/populated folder (where the cache is stored) and
> restart the server
> - the JAPE transducer should come after the Gazetteer, as it uses the
> annotations created by the Gazetteer as its input.
From boyan.kukushev at ontotext.com Fri Jul 15 06:54:43 2011
From: boyan.kukushev at ontotext.com (Boyan Kukushev)
Date: Fri, 15 Jul 2011 13:54:43 +0300
Subject: [Kim-discussion] Topic
In-Reply-To:
References:
Message-ID: <201107151354.44016.boyan.kukushev@ontotext.com>

Hi :)

We think we finally found where your problem is: the labels in this RDF are wrong.

*wrong* : "Consumer Products@en"
*correct*: "Consumer Products"@en

Steps to resolve:
- re-create your RDF information - either move the language descriptor for each label outside the quotes, or remove it from all labels; make sure you have at least these statements for any entity you want to look up:
  * protons:generatedBy
  * rdf-syntax:type protont:Topic
  * protons:hasMainAlias
  * rdf-syntax:type protons:Alias
  * rdf-schema:label "entity-label"@lang
  !!! Do not forget to declare trusted by adding the statement
  * rdf-syntax:type protons:Trusted
- remove your entire KIM/context/default/populated directory (to clear the gazetteer cache)
- make sure you have added the Topic annotation type to KIM/config/nerc.properties:com.ontotext.kim.KIMConstants.IE_ANN_TYPES
- edit KIM/config/owlim.ttl so that KIM will load your RDF information (the edit should add two lines - one to 'owlim:imports', one to 'owlim:defaultNS')
- start KIM server by KIM/bin/kim(.bat) gate and wait till server is up
- initialize the JAPE rule for creating Topic annotations from Lookups and add it to the pipeline after the gazetteer phases
- process any document in which you are sure the text contains Topic annotations
- if successful - store the GATE application pipeline as described in the previous mail;
- report what happened :)

I hope this brief step listing will be helpful.

Regards,
Boyan

I suggest you re-create

On Friday, July 15, 2011 11:38:00 srecko joksimovic wrote:
> Hi Philip,
> Hi Boyan,
>
> I'm still having a problem with Topic, and annotations.
> Here is what I did:
> I added acm_proton.n3 file to owlim.ttl, and imported new instances. When I
> run web UI, I'm able to see all I imported.
> After that I started GATE. Created query:
>
> PREFIX rdf:
> PREFIX protons:
> PREFIX protont:
> PREFIX rdfs:
> SELECT ?topic ?alias ?label
> WHERE {
> ?topic rdf:type protont:Topic.
> ?topic protons:hasMainAlias ?alias.
> ?alias a protons:Alias.
> ?alias rdfs:label ?label.
> }
>
> when I evaluate this query with JVisualVM, this is what I get:
>
> [http://www.lornet.org/acm-ccs/proton#J.7.1,
> http://www.lornet.org/acm-ccs/proton#Alias_dee6eec3-b503-4d3e-a98d-ecdde5d5ee1a,
> "Consumer Products@en"]
> [http://www.lornet.org/acm-ccs/proton#B.7.3.1,
> http://www.lornet.org/acm-ccs/proton#Alias_a4fdb728-b855-4ee6-b220-b8a93b55f9c7,
> "Error-Checking@en"]
> [http://www.lornet.org/acm-ccs/proton#K.7.2,
> http://www.lornet.org/acm-ccs/proton#Alias_4694954a-ba5e-4333-9ea9-8d5b94790c4e,
> "Organizations@en"]
> [http://www.lornet.org/acm-ccs/proton#D.3.2.10,
> http://www.lornet.org/acm-ccs/proton#Alias_d59afecf-26fc-4a5b-af92-e8c994542b23,
> "Nonprocedural Languages@en"]
> [http://www.lornet.org/acm-ccs/proton#C.1.1.2,
> http://www.lornet.org/acm-ccs/proton#Alias_2d3fd573-5a41-403b-bdf3-22332ad9d839,
> "RISC/CISC, VLIW Architectures@en"]
> ...
>
> With GATE, I used JAPE rule that you suggested:
> Phase: GazTopic
> Input: Lookup
> Options: control = appelt
> Rule: Topic
> (
> {Lookup.class == "http://proton.semanticweb.org/2006/05/protont#Topic"}
> ):topic
> -->
> :topic.Topic = {rule=GazTopic, class=:topic.Lookup.class, inst=:topic.Lookup.inst}
>
> Then I tried to create LKB Gazetteer. When I configure new LKB Gazetteer
> using these parameters:
>
> annotationLimit = -1
> caseSensitive = false
> dictFeederClass = com.ontotext.kim.model.KimDictionaryFeederImpl
> dictFeederParams = FeedSetupPath=$relpath$../context/default/resources/gazetteer/acm-topic
> dynamicDictEnabled = false
> feedTransformerStages =
> outputASName =
> relpath = file_path
> staticDictEnabled = true
> staticDictSerializationPath = file_path
>
> (add to pipeline, before JAPE rule, and after default Gazetteer)
>
> I don't see Topic. But, when I put outputASName = Topic, I see Topic as a
> separated group, with Lookup as its subgroup. When I select Lookup, I can
> see that something is annotated, but it is not same as terms that I
> imported.
> When I try to annotate using Java code, it's still same.
>
> Could you please tell me what I did wrong?
>
> Best,
> Srecko
>
>
> On Thu, Jul 14, 2011 at 2:49 PM, Philip Alexiev @ Ontotext <
> philip.alexiev at ontotext.com> wrote:
> > Hi Srecko,
> >
> > Make sure you add the JAPE transducer after the gazetteer.
> >
> > Also it will be much simpler to examine the process directly in the GATE
> > interface. Create a file, add it to the corpus, set the corpus to the
> > pipeline and execute the pipeline.
> >
> > Disable the last resource to not remove the temporary annotations. Run
> > the pipeline and explore the annotations. See if Lookup are created over
> > the topics in the text.
> >
> > Things to have in mind:
> > - after you change the RDF that is imported in KIM, you should delete
> >   the KIM/context/default/populated folder (where the cache is stored) and
> >   restart the server
> > - the JAPE transducer should be after the Gazetteer, as it uses the
> >   annotations created by the Gazetteer as its input.
> >
> > Hope this helps
> > Philip
> >
> >
> > On 14 Jul 2011, at 3:42 PM, srecko joksimovic wrote:
> >
> > Hi Boyan,
> > I created JAPE rule like you and Philip suggested, and stored to
> > context/default/resources/grammar/acm folder. Then I run gate, and
> > created JAPE transducer, topic_jape. I didn't specify inputASName, but I
> > did add it to pipeline. Saved application state, populated KIMServer
> > corpus, and run the application. I don't know how, but there is
> > everything but Topic.
> >
> > I'm still missing something, but I don't know what. Should I create Large
> > KB Gazetteer?
> >
> > Best,
> > Srecko
> >
> > On Thu, Jul 14, 2011 at 2:15 PM, srecko joksimovic <
> > sreckojoksimovic at gmail.com> wrote:
> >> Hi Boyan,
> >> I didn't understand that I must create JAPE rule before I do everything
> >> else.
> >> I'll try this now.
> >>
> >> Thank you!
> >>
> >> Srecko
> >>
> >>
> >> On Thu, Jul 14, 2011 at 2:13 PM, Boyan Kukushev <
> >> boyan.kukushev at ontotext.com> wrote:
> >>> Hi Srecko,
> >>>
> >>> In order to see your Topic annotations, you must create the JAPE rule
> >>> that Philip suggested:
> >>>
> >>> Phase: GazTopic
> >>> Input: Lookup
> >>> Options: control = appelt
> >>> Rule: Topic
> >>> (
> >>> {Lookup.class == "http://proton.semanticweb.org/2006/05/protont#Topic"}
> >>> ):topic
> >>> -->
> >>> :topic.Topic = {rule=GazTopic, class=:topic.Lookup.class, inst=:topic.Lookup.inst}
> >>>
> >>> and put that rule just after the gazetteer phases within the GATE
> >>> pipeline.
> >>> The easiest way to do this is using the KIM GATE interface by starting
> >>>
> >>> KIM/bin/kim(.bat) gate
> >>>
> >>> and modifying the pipeline.
> >>>
> >>> You have already added the Topic annotation type to the list of allowed
> >>> annotation types in KIM/config/nerc.properties. After you run the
> >>> pipeline with this new resource included, Topic annotations should appear
> >>> in the default annotation set for each document you process.
> >>>
> >>> To be able to use again the pipeline, you should save it, again using
> >>> the KIM GATE interface - right click on the pipeline and select 'Save
> >>> application state'. Remember to remove (or empty) the document corpus
> >>> used by the application. You choose whether to overwrite the default
> >>> KIM pipeline (IE.gapp) or create a new one and point KIM to use it
> >>> (setting the corresponding property in KIM/config/nerc.properties).
> >>>
> >>> Hope this helps!
> >>>
> >>> Regards,
> >>> Boyan
> >>>
> >>> P.S.
> >>> What is happening exactly:
> >>> - the gazetteer phases use pre-defined knowledge base to find specific
> >>>   'things' in the text you process; they produce annotations of type
> >>>   Lookup
> >>>
> >>> - the JAPE rule would take all Lookup annotations that have the specific
> >>>   class (in your case that is
> >>>   "http://proton.semanticweb.org/2006/05/protont#Topic") and would create
> >>>   a new annotation of type Topic that is fully overlapping the current
> >>>   Lookup annotation
> >>>
> >>> - the last phase in the pipeline removes all temporary annotations - the
> >>>   Lookup annotation is also a temporary annotation, but Topic (as it is
> >>>   added to the allowed annotations list) will not be removed.
> >>>
> >>> On Thursday, July 14, 2011 14:50:51 srecko joksimovic wrote:
> >>> > I configured nerc.properties, and now I have this:
> >>> >
> >>> > com.ontotext.kim.KIMConstants.IE_ANN_TYPES=Abstract, Brand, ContactInfo,
> >>> > Date, Entity, Event, GeneralTerm, KeyLocation, KeyOrganization, KeyPerson,
> >>> > KeyPhrase, Location, Money, Object, Organization, Percent, Person,
> >>> > Position, Time, Acquirement, JobTitle, Number, Topic
> >>> >
> >>> > then I disabled last resource in pipeline, but I still can't see
> >>> > Topic. Maybe I didn't understand well... should I first create Jape
> >>> > rule, or this is enough to see Topic?
> >>> >
> >>> > Best,
> >>> > Srecko
> >>> >
> >>> >
> >>> >
> >>> > On Thu, Jul 14, 2011 at 1:15 PM, Philip Alexiev @ Ontotext <
> >>> > philip.alexiev at ontotext.com> wrote:
> >>> > > The process is described in the customization guide you mentioned.
> >>> > >
> >>> > > You have added this RDF to the semantic repository.
> >>> > > This means that now
> >>> > > the gazetteer will be able to match the instances described there
> >>> > > (searching for their labels in the texts) and will create Lookup
> >>> > > annotations, when it finds a match. The lookup process is
> >>> > > generally only one side of the IE process. Lookup annotations are
> >>> > > further examined by some logic to determine their validity, or
> >>> > > they take part in the recognition of more complex phrases. That is
> >>> > > why they are not left after the IE has finished over the document.
> >>> > >
> >>> > > Tip: You can disable the last resource in the pipeline and run it again
> >>> > > over a document to see all the annotations that are created in the
> >>> > > process - also the temporary ones. This will show you the Lookup
> >>> > > annotations as well. You can search for your Topic instances
> >>> > > there.
> >>> > >
> >>> > > Once you have the lookups, you should tell KIM that they are important
> >>> > > for you and you want to keep them. Add the Topic annotation type
> >>> > > to the *com.ontotext.kim.KIMConstants.IE_ANN_TYPES* list. This will
> >>> > > tell KIM not to clear them in the end of the IE process. Now what is left
> >>> > > for you to do, is to create a Topic annotation over the Lookup
> >>> > > for the topic that the Gazetteer has created.
You can use a > >>> > > simple Jape rule > >>> > >>> to > >>> > >>> > > do that: > >>> > > > >>> > > > >>> > > Phase: GazTopic > >>> > > Input: Lookup > >>> > > Options: control = appelt > >>> > > > >>> > > Rule: Topic > >>> > > ( > >>> > > > >>> > > {Lookup.class == " > >>> > >>> http://proton.semanticweb.org/2006/05/protont#Topic"} > >>> > >>> > > ):topic > >>> > > --> > >>> > > > >>> > > :topic.Topic = {rule = Topic, class = :topic.Lookup.class, > >>> > > > >>> > > inst=:topic.Lookup.inst} > >>> > > > >>> > > > >>> > > This is all that you need to include your topics in the IE process > >>> > >>> and to > >>> > >>> > > be able to see them in the graphical interface. > >>> > > > >>> > > Hope this helps > >>> > > philip > >>> > > > >>> > > > >>> > > On 14 Jul 2011, at 1:58 PM, srecko joksimovic wrote: > >>> > > > >>> > > It is little hard to explain because I didn't do customisation. I > >>> > >>> took > >>> > >>> > > the file where one of my colleagues did it. File contains about > >>> > > 1200 instances and has content like this: > >>> > > > >>> > > @prefix protons: > >>> > > . @prefix protont: > >>> > > . > >>> > > > >>> > > < > >>> > >>> http://www.lornet.org/acm-ccs/proton#Alias_07c6001f-8f5c-49e1-ae3c-92c9 > >>> 12 > >>> > >>> > > 1b4297 > >>> > > > >>> > > a protons:Alias ; > >>> > > > >>> > > > >>> > > "Convex Programming at en" . > >>> > > > >>> > > < > >>> > >>> http://www.lornet.org/acm-ccs/proton#Alias_63786f1d-7b3c-4872-b4e0-8227 > >>> de > >>> > >>> > > 53c015 > >>> > > > >>> > > a protons:Alias ; > >>> > > > >>> > > > >>> > > "Document and Text Processing at en" . > >>> > > > >>> > > > >>> > > > >>> > > a protont:Topic ; > >>> > > protons:generatedBy > >>> > > > >>> > > > >>> > > ; > >>> > > > >>> > > protons:hasMainAlias > >>> > > > >>> > > < > >>> > >>> http://www.lornet.org/acm-ccs/proton#Alias_dee6eec3-b503-4d3e-a98d-ecdd > >>> e5 > >>> > >>> > > d5ee1a> . 
> >>> > > > >>> > > < > >>> > >>> http://www.lornet.org/acm-ccs/proton#Alias_ff639082-2cc4-484e-92f7-5f0f > >>> 09 > >>> > >>> > > 898d2d > >>> > > > >>> > > a protons:Alias ; > >>> > > > >>> > > > >>> > > "Store and Forward Networks at en" . > >>> > > > >>> > > < > >>> > >>> http://www.lornet.org/acm-ccs/proton#Alias_7f00cebc-8828-4415-83b3-2eb5 > >>> 3a > >>> > >>> > > e3b0b6 > >>> > > > >>> > > a protons:Alias ; > >>> > > > >>> > > > >>> > > "Integral Equations at en" . > >>> > > > >>> > > < > >>> > >>> http://www.lornet.org/acm-ccs/proton#Alias_08a8af20-311b-4a43-b996-7d4a > >>> 8c > >>> > >>> > > 034636 > >>> > > > >>> > > a protons:Alias ; > >>> > > > >>> > > > >>> > > "Information Filtering at en" . > >>> > > > >>> > > < > >>> > >>> http://www.lornet.org/acm-ccs/proton#Alias_bb28b422-05f4-47f8-a8f5-8116 > >>> 10 > >>> > >>> > > a2f77f > >>> > > > >>> > > a protons:Alias ; > >>> > > > >>> > > > >>> > > "Surfac eFitting at en" . > >>> > > > >>> > > < > >>> > >>> http://www.lornet.org/acm-ccs/proton#Alias_8b20c6a0-1b02-4c34-bd30-740d > >>> 8d > >>> > >>> > > 150f30 > >>> > > > >>> > > a protons:Alias ; > >>> > > > >>> > > > >>> > > "Reliability,Availability and Serviceability at en" . > >>> > > > >>> > > < > >>> > >>> http://www.lornet.org/acm-ccs/proton#Alias_b6e48ecc-f1a9-44e6-8ff9-c881 > >>> 76 > >>> > >>> > > 47765a > >>> > > > >>> > > a protons:Alias ; > >>> > > > >>> > > > >>> > > "Aerospace at en" . > >>> > > > >>> > > > >>> > > > >>> > > a protont:Topic ; > >>> > > protons:generatedBy > >>> > > > >>> > > > >>> > > ; > >>> > > > >>> > > protons:hasMainAlias > >>> > > > >>> > > < > >>> > >>> http://www.lornet.org/acm-ccs/proton#Alias_a4fdb728-b855-4ee6-b220-b8a9 > >>> 3b > >>> > >>> > > 55f9c7> . 
> >>> > > > >>> > > > >>> > > > >>> > > a protont:Topic ; > >>> > > protons:generatedBy > >>> > > > >>> > > > >>> > > ; > >>> > > > >>> > > protons:hasMainAlias > >>> > > > >>> > > < > >>> > >>> http://www.lornet.org/acm-ccs/proton#Alias_4694954a-ba5e-4333-9ea9-8d5b > >>> 94 > >>> > >>> > > 790c4e> . > >>> > > > >>> > > > >>> > > > >>> > > a protont:Topic ; > >>> > > protons:generatedBy > >>> > > > >>> > > > >>> > > ; > >>> > > > >>> > > protons:hasMainAlias > >>> > > > >>> > > < > >>> > >>> http://www.lornet.org/acm-ccs/proton#Alias_d59afecf-26fc-4a5b-af92-e8c9 > >>> 94 > >>> > >>> > > 542b23> . > >>> > > > >>> > > > >>> > > > >>> > > a protont:Topic ; > >>> > > protons:generatedBy > >>> > > > >>> > > > >>> > > ; > >>> > > > >>> > > protons:hasMainAlias > >>> > > > >>> > > < > >>> > >>> http://www.lornet.org/acm-ccs/proton#Alias_33e428d6-3157-41ca-95ee-df79 > >>> 73 > >>> > >>> > > 4c5a3d> . > >>> > > > >>> > > > >>> > > > >>> > > a protont:Topic ; > >>> > > protons:generatedBy > >>> > > > >>> > > > >>> > > ; > >>> > > > >>> > > protons:hasMainAlias > >>> > > > >>> > > < > >>> > >>> http://www.lornet.org/acm-ccs/proton#Alias_2d3fd573-5a41-403b-bdf3-2233 > >>> 2a > >>> > >>> > > d9d839> . > >>> > > > >>> > > > >>> > > > >>> > > a protont:Topic ; > >>> > > protons:generatedBy > >>> > > > >>> > > > >>> > > ; > >>> > > > >>> > > protons:hasMainAlias > >>> > > > >>> > > < > >>> > >>> http://www.lornet.org/acm-ccs/proton#Alias_171d052f-553e-4990-bd15-8416 > >>> d2 > >>> > >>> > > 8f4cf1> . > >>> > > > >>> > > < > >>> > >>> http://www.lornet.org/acm-ccs/proton#Alias_d680fd76-0dcf-417b-9f4e-5cbd > >>> a4 > >>> > >>> > > 616b72 > >>> > > > >>> > > a protons:Alias ; > >>> > > > >>> > > > >>> > > "Pixel Classification at en" . > >>> > > > >>> > > I added this document to owlim.ttl and imported my instances. > >>> > > > >>> > > I tried to follow document Customizing KIM 3.pdf, but as mapping > >>> > > has already been done, I didn't know what else to do. 
> >>> > > Maybe I should create
> >>> > > Jape rule, or something like that, but I think that I should see Topic
> >>> > > with or without my instances. I'm not sure, that is only my
> >>> > > opinion.
> >>> > >
> >>> > > Best,
> >>> > > Srecko
> >>> > >
> >>> > > On Thu, Jul 14, 2011 at 12:48 PM, Philip Alexiev @ Ontotext <
> >>> > > philip.alexiev at ontotext.com> wrote:
> >>> > >> Can you describe the exact actions you take to add the topics to
> >>> > >> the IE logic? The exact customizations you have made to KIM.
> >>> > >>
> >>> > >> Thanks,
> >>> > >> Philip
> >>> > >>
> >>> > >> On 14 Jul 2011, at 1:41 PM, srecko joksimovic wrote:
> >>> > >>
> >>> > >> Hi Philip,
> >>> > >> with GATE is same as with Java code. I get the same annotations. I tried
> >>> > >> to edit nerc.properties and add Topic to
> >>> > >> *com.ontotext.kim.KIMConstants.IE_ANN_TYPES* list, but nothing
> >>> > >> changed.
> >>> > >>
> >>> > >> Do I have to change something else?
> >>> > >>
> >>> > >> Best,
> >>> > >> Srecko
> >>> > >>
> >>> > >> On Thu, Jul 14, 2011 at 12:26 PM, Philip Alexiev @ Ontotext <
> >>> > >> philip.alexiev at ontotext.com> wrote:
> >>> > >>> Hi Srecko,
> >>> > >>>
> >>> > >>> You can run the gate interface to check exactly what annotations
> >>> > >>> are created and their type. You can do this by running:
> >>> > >>> bash KIM/bin/kim gate
> >>> > >>>
> >>> > >>> You probably use a Jape rule to match the Lookup annotations with
> >>> > >>> class="http://proton.semanticweb.org/2006/05/protont#Topic" and
> >>> > >>> are creating one of the entity annotations over it (the entity
> >>> > >>> annotations are a whitelist of annotations that remain after the
> >>> > >>> annotation process finishes, all annotations not in this list are
> >>> > >>> removed).
> >>> > >>>
> >>> > >>> So check what type of annotation you are creating.
> >>> > >>>
> >>> > >>> If this is not the case, please provide more details how you handle
> >>> > >>> the topic lookups.
> >>> > >>>
> >>> > >>> All the best,
> >>> > >>> Philip
> >>> > >>>
> >>> > >>> On 14 Jul 2011, at 1:19 PM, srecko joksimovic wrote:
> >>> > >>> > Hello Philip,
> >>> > >>> >
> >>> > >>> > I included my instances in KIM. When I use web UI, I see them all, and
> >>> > >>> > everything looks ok. But when I run code like this:
> >>> > >>> > KIMDocument kimDoc = apiCorpora.createDocument(_string_to_annotate, true);
> >>> > >>> > kimDoc = apiSemAnn.execute(kimDoc);
> >>> > >>> >
> >>> > >>> > KIMAnnotationSet kimASet = kimDoc.getAnnotations();
> >>> > >>> > Set typesSet = kimASet.getAllTypes();
> >>> > >>> > Iterator iterator = typesSet.iterator();
> >>> > >>> >
> >>> > >>> > // show annotations of every type separately
> >>> > >>> > while(iterator.hasNext())
> >>> > >>> > {
> >>> > >>> >     Object key = iterator.next();
> >>> > >>> >     KIMAnnotationSet kimFilteredASet = kimASet.get(String.valueOf(key));
> >>> > >>> >     Iterator annIterator = kimFilteredASet.iterator();
> >>> > >>> >     System.out.println(" = Annotations of type [" + String.valueOf(key) + "] :");
> >>> > >>> >     while(annIterator.hasNext())
> >>> > >>> >     {
> >>> > >>> >         System.out.println(" -- " + annIterator.next());
> >>> > >>> >     }
> >>> > >>> > }
> >>> > >>> > System.out.println("[ Document's Typed Annotations (end) ]");
> >>> > >>> >
> >>> > >>> > I don't see any annotation of type Topic. I see all of them when I use
> >>> > >>> > web UI, like I said. But when I try to annotate string from Java
> >>> > >>> > application, I don't get any Topic annotations.
> >>> > >>> >
> >>> > >>> > Could you please help me on this one?
> >>> > >>> >
> >>> > >>> > Best,
> >>> > >>> > Srecko
> >>> > >>> > _______________________________________________
> >>> > >>> > Kim-discussion mailing list
> >>> > >>> > Kim-discussion at ontotext.com
> >>> > >>> > http://ontotext.com/mailman/listinfo/kim-discussion
> >>>
> >>> --
> >>> Boyan Kukushev
> >>> Senior Software Engineer / Java Developer
> >>> Ontotext AD @ Sirma Group Corp.

--
Boyan Kukushev
Senior Software Engineer / Java Developer
Ontotext AD @ Sirma Group Corp.

From sreckojoksimovic at gmail.com Fri Jul 15 07:10:46 2011
From: sreckojoksimovic at gmail.com (srecko joksimovic)
Date: Fri, 15 Jul 2011 13:10:46 +0200
Subject: [Kim-discussion] Topic
In-Reply-To: <201107151354.44016.boyan.kukushev@ontotext.com>
References: <201107151354.44016.boyan.kukushev@ontotext.com>
Message-ID:

Excellent :)
I will try this. After that, we are going for a drink :)

Thank you!

On Fri, Jul 15, 2011 at 12:54 PM, Boyan Kukushev <
boyan.kukushev at ontotext.com> wrote:
> Hi :)
>
> We think we finally found where your problem is: the labels in this RDF are
> wrong.
> *wrong* : "Consumer Products at en" > *correct*: "Consumer Products"@en > > Steps to resolve: > > - re-create your RDF information - either move the language descriptor for > each label outside the quotes, or remove it from all labels; make sure you > have at least these statements for any entity you want to look up: > * protons:generatedBy > * rdf-syntax:type protont:Topic > * protons:hasMainAlias > * rdf-syntax:type protons:Alias > * rdf-schema:label "entity-label"@lang > > !!! Do not forget to declare trusted by adding the > statement > * rdf-syntax:type protons:Trusted > > - remove your entire KIM/context/default/populated directory (to clear > gazetteer cache) > > - make sure you have added the Topic annotation type to > KIM/config/nerc.properties:com.ontotext.kim.KIMConstants.IE_ANN_TYPES > > - edit KIM/config/owlim.ttl so that KIM will load your RDF information > (editing should be adding two lines - one to 'owlim:imports', one to > 'owlim:defaultNS' > > - start KIM server by KIM/bin/kim(.bat) gate and wait > till server > is up > > - initialize the JAPE rule for creating Topic annotations from Lookups and > add it in the pipeline after the gazetteer phases > > - process any document you are sure to have Topic annotations in text > > - if successful - store the GATE application pipeline as described in > previous mail; > > - report what happened :) > > I hope this brief step listing would be helpful. > > Regards, > Boyan > > I suggest you re-create > > On Friday, July 15, 2011 11:38:00 srecko joksimovic wrote: > > Hi Philip, > > Hi Boyan, > > > > I'm still having a problem with Topic, and annotations. > > Here is what I did: > > I added acm_proton.n3 file to owlim.ttl, and imported new instances. When > I > > run web UI, I'm able to see all I imported. > > After that I started GATE. Created query: > > > > PREFIX rdf: > > PREFIX protons: > > PREFIX protont: > > PREFIX rdfs: > > SELECT ?topic ?alias ?label > > WHERE { > > ?topic rdf:type protont:Topic. 
> > ?topic protons:hasMainAlias ?alias. > > ?alias a protons:Alias. > > ?alias rdfs:label ?label. > > } > > > > when I evaluate this query with JVisualVM, this is what I get: > > > > [http://www.lornet.org/acm-ccs/proton#J.7.1, > > > http://www.lornet.org/acm-ccs/proton#Alias_dee6eec3-b503-4d3e-a98d-ecdde5d5 > > ee1a, "Consumer Products at en"] > > [http://www.lornet.org/acm-ccs/proton#B.7.3.1, > > > http://www.lornet.org/acm-ccs/proton#Alias_a4fdb728-b855-4ee6-b220-b8a93b55 > > f9c7, "Error-Checking at en"] > > [http://www.lornet.org/acm-ccs/proton#K.7.2, > > > http://www.lornet.org/acm-ccs/proton#Alias_4694954a-ba5e-4333-9ea9-8d5b9479 > > 0c4e, "Organizations at en"] > > [http://www.lornet.org/acm-ccs/proton#D.3.2.10, > > > http://www.lornet.org/acm-ccs/proton#Alias_d59afecf-26fc-4a5b-af92-e8c99454 > > 2b23, "Nonprocedural Languages at en"] > > [http://www.lornet.org/acm-ccs/proton#C.1.1.2, > > > http://www.lornet.org/acm-ccs/proton#Alias_2d3fd573-5a41-403b-bdf3-22332ad9 > > d839, "RISC/CISC, VLIW Architectures at en"] > > ?. > > > > With GATE, I used JAPE rule that you suggested: > > Phase: GazTopic > > Input: Lookup > > Options: control = appelt > > Rule: Topic > > ( > > {Lookup.class == "http://proton.semanticweb.org/2006/05/protont#Topic"} > > ):topic > > --> > > > > :topic.Topic = {rule=GazTopic, class=:topic.Lookup.class, > > > > inst=:topic.Lookup.inst} > > > > Then I tried to create LKB Gazetteer. When I configure new LKB Gazetteer > > using this parameters: > > > > annotationLimit = -1 > > caseSensitive = false > > dictFeederClass = com.ontotext.kim.model.KimDictionaryFeederImpl > > dictFeederParams = > > FeedSetupPath=$relpath$../context/default/resources/gazetteer/acm-topic > > dynamicDictEnabled = false > > feedTransformerStages = > > outputASName = > > relpath = file_path > > staticDictEnabled = true > > staticDictSerializationPath = file_path > > > > (add to pipeline, before JAPE rule, and after default Gazetteer) > > > > I don't see Topic. 
But, when I put outputASName = Topic, I see Topic as a > > separated group, with Lookup as it's subgroup. When I select Lookup, I > can > > see that something is annotated, but it is not same as terms that I > > imported. > > When I try to annotate using Java code, it's still same. > > > > Could you please tell me what I did wrong? > > > > Best, > > Srecko > > > > > > On Thu, Jul 14, 2011 at 2:49 PM, Philip Alexiev @ Ontotext < > > > > philip.alexiev at ontotext.com> wrote: > > > Hi Srecko, > > > > > > Make sure you add the JAPE transducer after the gazetteer. > > > > > > Also it will be much simpler to examine the process directly in the > GATE > > > interface. Create a file, add it to the corpus, set the corpus to the > > > pipeline and execute the pipeline. > > > > > > Disable the last resource to not remove the temporary annotations. Run > > > the pipeline and explore the annotations. See if Lookup are created > over > > > the topics in the text. > > > > > > Things to have in mind: > > > - after you change the RDF that is imported in KIM, you should delete > > > the > > > > > > KIM/context/default/populated folder (where the cache is stored) and > > > > > > restart the server > > > - the JAPE transducer should be after the Gazetteer, as it uses the > > > annotations , created by the Gazetteer as its input. > > > > > > Hope this helps > > > Philip > > > > > > > > > On 14 Jul 2011, at 3:42 PM, srecko joksimovic wrote: > > > > > > Hi Boyan, > > > I created JAPE rule like you and Philip sugested, and stored to > > > context/default/resources/grammar/acm folder. Then I run gate, and > > > created JAPE transducer, topic_jape. I didn't specify inputASName, but > I > > > did add it to pipeline. Saved application state, populated KIMServer > > > corpus, and run the application. I don't know how, but there is > > > everything but Topic. > > > > > > I'm still missing something, but I don't know what. Should I create > Large > > > KB Gazetteer? 
> > > > > > Best, > > > Srecko > > > > > > On Thu, Jul 14, 2011 at 2:15 PM, srecko joksimovic < > > > > > > sreckojoksimovic at gmail.com> wrote: > > >> Hi Boyan, > > >> I didn't understand that I must create JAPE rule before I do > everything > > >> else. > > >> I'll try this now. > > >> > > >> Thank you! > > >> > > >> Srecko > > >> > > >> > > >> On Thu, Jul 14, 2011 at 2:13 PM, Boyan Kukushev < > > >> > > >> boyan.kukushev at ontotext.com> wrote: > > >>> Hi Srecko, > > >>> > > >>> In order to see your Topic annotations, you must create the JAPE rule > > >>> that > > >>> Philip suggested: > > >>> > > >>> Phase: GazTopic > > >>> Input: Lookup > > >>> Options: control = appelt > > >>> Rule: Topic > > >>> ( > > >>> > > >>> {Lookup.class == > > >>> "http://proton.semanticweb.org/2006/05/protont#Topic"} > > >>> > > >>> ):topic > > >>> --> > > >>> > > >>> :topic.Topic = {rule=GazTopic, class=:topic.Lookup.class, > > >>> > > >>> inst=:topic.Lookup.inst} > > >>> > > >>> and put that rule just after the gazetteer phases within the GATE > > >>> pipeline. > > >>> The easiest way to do this is using the KIM GATE interface by > starting > > >>> > > >>> KIM/bin/kim(.bat) gate > > >>> > > >>> and modifying the pipeline. > > >>> > > >>> You have already added the Topic annotation type to the list of > allowed > > >>> annotation types in KIM/config/nerc.properties. After you run the > > >>> pipeline > > >>> with this new resource incuded, Topic annotations should appear in > the > > >>> default > > >>> annotation set for each document you process. > > >>> > > >>> To be able to use again the pipeline, you should save it, again using > > >>> the KIM > > >>> GATE interface - right click on the pipeline and select 'Save > > >>> application state'. Remember to remove (or empty) the document corpus > > >>> used by the application. 
> > >>> You choose whether to overwrite the default
> > >>> KIM pipeline (IE.gapp) or create a new one and point KIM to use it
> > >>> (setting the corresponding property in KIM/config/nerc.properties).
> > >>>
> > >>> Hope this helps!
> > >>>
> > >>> Regards,
> > >>> Boyan
> > >>>
> > >>> P.S. What is happening exactly:
> > >>> - the gazetteer phases use a pre-defined knowledge base to find specific
> > >>>   'things' in the text you process; they produce annotations of type
> > >>>   Lookup
> > >>> - the JAPE rule would take all Lookup annotations that have the specific
> > >>>   class (in your case that is
> > >>>   "http://proton.semanticweb.org/2006/05/protont#Topic") and would create
> > >>>   a new annotation of type Topic that is fully overlapping the current
> > >>>   Lookup annotation
> > >>> - the last phase in the pipeline removes all temporary annotations - the
> > >>>   Lookup annotation is also a temporary annotation, but Topic (as it is
> > >>>   added to the allowed annotations list) will not be removed.
> > >>>
> > >>> On Thursday, July 14, 2011 14:50:51 srecko joksimovic wrote:
> > >>> > I configured nerc.properties, and now I have this:
> > >>> >
> > >>> > com.ontotext.kim.KIMConstants.IE_ANN_TYPES=Abstract, Brand, ContactInfo,
> > >>> > Date, Entity, Event, GeneralTerm, KeyLocation, KeyOrganization, KeyPerson,
> > >>> > KeyPhrase, Location, Money, Object, Organization, Percent, Person,
> > >>> > Position, Time, Acquirement, JobTitle, Number, Topic
> > >>> >
> > >>> > then I disabled last resource in pipeline, but I still can't see
> > >>> > Topic. Maybe I didn't understand well... should I first create Jape
> > >>> > rule, or this is enough to see Topic?
> > >>> >
> > >>> > Best,
> > >>> > Srecko
> > >>> >
> > >>> > On Thu, Jul 14, 2011 at 1:15 PM, Philip Alexiev @ Ontotext <
> > >>> > philip.alexiev at ontotext.com> wrote:
> > >>> > > The process is described in the customization guide you mentioned.
> > >>> > >
> > >>> > > You have added this RDF to the semantic repository. This means that now
> > >>> > > the gazetteer will be able to match the instances described there
> > >>> > > (searching for their labels in the texts) and will create Lookup
> > >>> > > annotations, when it finds a match. The lookup process is generally
> > >>> > > only one side of the IE process. Lookup annotations are further
> > >>> > > examined by some logic to determine their validity, or they take part
> > >>> > > in the recognition of more complex phrases. That is why they are not
> > >>> > > left after the IE has finished over the document.
> > >>> > >
> > >>> > > Tip: You can disable the last resource in the pipeline and run it again
> > >>> > > over a document to see all the annotations that are created in the
> > >>> > > process - also the temporary ones. This will show you the Lookup
> > >>> > > annotations as well. You can search for your Topic instances there.
> > >>> > >
> > >>> > > Once you have the lookups, you should tell KIM that they are important
> > >>> > > for you and you want to keep them. Add the Topic annotation type to the
> > >>> > > com.ontotext.kim.KIMConstants.IE_ANN_TYPES list. This will tell KIM
> > >>> > > not to clear them in the end of the IE process. Now what is left
> > >>> > > for you to do, is to create a Topic annotation over the Lookup
> > >>> > > for the topic that the Gazetteer has created. You can use a
> > >>> > > simple Jape rule to do that:
> > >>> > >
> > >>> > > Phase: GazTopic
> > >>> > > Input: Lookup
> > >>> > > Options: control = appelt
> > >>> > >
> > >>> > > Rule: Topic
> > >>> > > (
> > >>> > >   {Lookup.class == "http://proton.semanticweb.org/2006/05/protont#Topic"}
> > >>> > > ):topic
> > >>> > > -->
> > >>> > > :topic.Topic = {rule = Topic, class = :topic.Lookup.class,
> > >>> > >                 inst=:topic.Lookup.inst}
> > >>> > >
> > >>> > > This is all that you need to include your topics in the IE process and
> > >>> > > to be able to see them in the graphical interface.
> > >>> > >
> > >>> > > Hope this helps
> > >>> > > philip
> > >>> > >
> > >>> > > On 14 Jul 2011, at 1:58 PM, srecko joksimovic wrote:
> > >>> > >
> > >>> > > It is a little hard to explain because I didn't do customisation. I took
> > >>> > > the file where one of my colleagues did it. File contains about
> > >>> > > 1200 instances and has content like this:
> > >>> > >
> > >>> > > @prefix protons: <http://proton.semanticweb.org/2006/05/protons#> .
> > >>> > > @prefix protont: <http://proton.semanticweb.org/2006/05/protont#> .
> > >>> > >
> > >>> > > <http://www.lornet.org/acm-ccs/proton#Alias_07c6001f-8f5c-49e1-ae3c-92c9121b4297>
> > >>> > >       a protons:Alias ;
> > >>> > >       "Convex Programming@en" .
> > >>> > >
> > >>> > > <http://www.lornet.org/acm-ccs/proton#Alias_63786f1d-7b3c-4872-b4e0-8227de53c015>
> > >>> > >       a protons:Alias ;
> > >>> > >       "Document and Text Processing@en" .
> > >>> > >
> > >>> > >       a protont:Topic ;
> > >>> > >       protons:generatedBy ;
> > >>> > >       protons:hasMainAlias
> > >>> > >           <http://www.lornet.org/acm-ccs/proton#Alias_dee6eec3-b503-4d3e-a98d-ecdde5d5ee1a> .
> > >>> > >
> > >>> > > <http://www.lornet.org/acm-ccs/proton#Alias_ff639082-2cc4-484e-92f7-5f0f09898d2d>
> > >>> > >       a protons:Alias ;
> > >>> > >       "Store and Forward Networks@en" .
> > >>> > >
> > >>> > > <http://www.lornet.org/acm-ccs/proton#Alias_7f00cebc-8828-4415-83b3-2eb53ae3b0b6>
> > >>> > >       a protons:Alias ;
> > >>> > >       "Integral Equations@en" .
> > >>> > >
> > >>> > > <http://www.lornet.org/acm-ccs/proton#Alias_08a8af20-311b-4a43-b996-7d4a8c034636>
> > >>> > >       a protons:Alias ;
> > >>> > >       "Information Filtering@en" .
> > >>> > >
> > >>> > > <http://www.lornet.org/acm-ccs/proton#Alias_bb28b422-05f4-47f8-a8f5-811610a2f77f>
> > >>> > >       a protons:Alias ;
> > >>> > >       "Surfac eFitting@en" .
> > >>> > >
> > >>> > > <http://www.lornet.org/acm-ccs/proton#Alias_8b20c6a0-1b02-4c34-bd30-740d8d150f30>
> > >>> > >       a protons:Alias ;
> > >>> > >       "Reliability,Availability and Serviceability@en" .
> > >>> > >
> > >>> > > <http://www.lornet.org/acm-ccs/proton#Alias_b6e48ecc-f1a9-44e6-8ff9-c8817647765a>
> > >>> > >       a protons:Alias ;
> > >>> > >       "Aerospace@en" .
> > >>> > >
> > >>> > >       a protont:Topic ;
> > >>> > >       protons:generatedBy ;
> > >>> > >       protons:hasMainAlias
> > >>> > >           <http://www.lornet.org/acm-ccs/proton#Alias_a4fdb728-b855-4ee6-b220-b8a93b55f9c7> .
> > >>> > >
> > >>> > >       a protont:Topic ;
> > >>> > >       protons:generatedBy ;
> > >>> > >       protons:hasMainAlias
> > >>> > >           <http://www.lornet.org/acm-ccs/proton#Alias_4694954a-ba5e-4333-9ea9-8d5b94790c4e> .
> > >>> > >
> > >>> > >       a protont:Topic ;
> > >>> > >       protons:generatedBy ;
> > >>> > >       protons:hasMainAlias
> > >>> > >           <http://www.lornet.org/acm-ccs/proton#Alias_d59afecf-26fc-4a5b-af92-e8c994542b23> .
> > >>> > >
> > >>> > >       a protont:Topic ;
> > >>> > >       protons:generatedBy ;
> > >>> > >       protons:hasMainAlias
> > >>> > >           <http://www.lornet.org/acm-ccs/proton#Alias_33e428d6-3157-41ca-95ee-df79734c5a3d> .
> > >>> > >
> > >>> > >       a protont:Topic ;
> > >>> > >       protons:generatedBy ;
> > >>> > >       protons:hasMainAlias
> > >>> > >           <http://www.lornet.org/acm-ccs/proton#Alias_2d3fd573-5a41-403b-bdf3-22332ad9d839> .
> > >>> > >
> > >>> > >       a protont:Topic ;
> > >>> > >       protons:generatedBy ;
> > >>> > >       protons:hasMainAlias
> > >>> > >           <http://www.lornet.org/acm-ccs/proton#Alias_171d052f-553e-4990-bd15-8416d28f4cf1> .
> > >>> > >
> > >>> > > <http://www.lornet.org/acm-ccs/proton#Alias_d680fd76-0dcf-417b-9f4e-5cbda4616b72>
> > >>> > >       a protons:Alias ;
> > >>> > >       "Pixel Classification@en" .
> > >>> > >
> > >>> > > I added this document to owlim.ttl and imported my instances.
> > >>> > >
> > >>> > > I tried to follow document Customizing KIM 3.pdf, but as mapping
> > >>> > > has already been done, I didn't know what else to do. Maybe I should
> > >>> > > create Jape rule, or something like that, but I think that I should see
> > >>> > > Topic with or without my instances. I'm not sure, that is only my
> > >>> > > opinion.
> > >>> > >
> > >>> > > Best,
> > >>> > > Srecko
> > >>> > >
> > >>> > > On Thu, Jul 14, 2011 at 12:48 PM, Philip Alexiev @ Ontotext <
> > >>> > > philip.alexiev at ontotext.com> wrote:
> > >>> > >> Can you describe the exact actions you take to add the topics to the
> > >>> > >> IE logic? The exact customizations you have made to KIM.
> > >>> > >>
> > >>> > >> Thanks,
> > >>> > >> Philip
> > >>> > >>
> > >>> > >> On 14 Jul 2011, at 1:41 PM, srecko joksimovic wrote:
> > >>> > >>
> > >>> > >> Hi Philip,
> > >>> > >> with GATE is same as with Java code. I get the same annotations. I
> > >>> > >> tried to edit nerc.properties and add Topic to
> > >>> > >> com.ontotext.kim.KIMConstants.IE_ANN_TYPES list, but nothing
> > >>> > >> changed.
> > >>> > >> Do I have to change something else?
> > >>> > >>
> > >>> > >> Best,
> > >>> > >> Srecko
> > >>> > >>
> > >>> > >> On Thu, Jul 14, 2011 at 12:26 PM, Philip Alexiev @ Ontotext <
> > >>> > >> philip.alexiev at ontotext.com> wrote:
> > >>> > >>> Hi Srecko,
> > >>> > >>>
> > >>> > >>> You can run the gate interface to check exactly what annotations are
> > >>> > >>> created and their type. You can do this by running:
> > >>> > >>> bash KIM/bin/kim gate
> > >>> > >>>
> > >>> > >>> You probably use a Jape rule to match the Lookup annotations with
> > >>> > >>> class="http://proton.semanticweb.org/2006/05/protont#Topic" and are
> > >>> > >>> creating one of the entity annotations over it (the entity
> > >>> > >>> annotations are a whitelist of annotations that remain after the
> > >>> > >>> annotation process finishes, all annotations not in this list are
> > >>> > >>> removed).
> > >>> > >>>
> > >>> > >>> So check what type of annotation you are creating.
> > >>> > >>>
> > >>> > >>> If this is not the case, please provide more details how you handle
> > >>> > >>> the topic lookups.
> > >>> > >>>
> > >>> > >>> All the best,
> > >>> > >>> Philip
> > >>> > >>>
> > >>> > >>> On 14 Jul 2011, at 1:19 PM, srecko joksimovic wrote:
> > >>> > >>> > Hello Philip,
> > >>> > >>> >
> > >>> > >>> > I included my instances in KIM. When I use web UI, I see them all,
> > >>> > >>> > and everything looks ok. But when I run code like this:
> > >>> > >>> >
> > >>> > >>> >     KIMDocument kimDoc =
> > >>> > >>> >         apiCorpora.createDocument(_string_to_annotate, true);
> > >>> > >>> >     kimDoc = apiSemAnn.execute(kimDoc);
> > >>> > >>> >
> > >>> > >>> >     KIMAnnotationSet kimASet = kimDoc.getAnnotations();
> > >>> > >>> >     Set typesSet = kimASet.getAllTypes();
> > >>> > >>> >     Iterator iterator = typesSet.iterator();
> > >>> > >>> >
> > >>> > >>> >     // show annotations of every type separately
> > >>> > >>> >     while(iterator.hasNext())
> > >>> > >>> >     {
> > >>> > >>> >         Object key = iterator.next();
> > >>> > >>> >         KIMAnnotationSet kimFilteredASet =
> > >>> > >>> >             kimASet.get(String.valueOf(key));
> > >>> > >>> >         Iterator annIterator = kimFilteredASet.iterator();
> > >>> > >>> >         System.out.println(" = Annotations of type [" +
> > >>> > >>> >             String.valueOf(key) + "] :");
> > >>> > >>> >         while(annIterator.hasNext())
> > >>> > >>> >         {
> > >>> > >>> >             System.out.println(" -- " + annIterator.next());
> > >>> > >>> >         }
> > >>> > >>> >     }
> > >>> > >>> >     System.out.println("[ Document's Typed Annotations (end) ]");
> > >>> > >>> >
> > >>> > >>> > I don't see any annotation of type Topic. I see all of them when I
> > >>> > >>> > use web UI, like I said. But when I try to annotate string from Java
> > >>> > >>> > application, I don't get any Topic annotations.
> > >>> > >>> >
> > >>> > >>> > Could you please help me on this one?
> > >>> > >>> >
> > >>> > >>> > Best,
> > >>> > >>> > Srecko
> > >>>
> > >>> --
> > >>> Boyan Kukushev
> > >>> Senior Software Engineer / Java Developer
> > >>> Ontotext AD @ Sirma Group Corp.
>
> --
> Boyan Kukushev
> Senior Software Engineer / Java Developer
> Ontotext AD @ Sirma Group Corp.

From sreckojoksimovic at gmail.com Fri Jul 15 16:29:03 2011
From: sreckojoksimovic at gmail.com (Srecko Joksimovic)
Date: Fri, 15 Jul 2011 22:29:03 +0200
Subject: [Kim-discussion] Topic
In-Reply-To: <201107151354.44016.boyan.kukushev@ontotext.com>
References: <201107151354.44016.boyan.kukushev@ontotext.com>
Message-ID: <001f01cc432d$d7f7c3b0$87e74b10$@com>

Hi guys!

I have a problem... just kidding :) everything is ok! Thank you!

The problem was this:

*wrong* : "Consumer Products@en"
*correct*: "Consumer Products"@en

Everything else was ok, but I didn't see this. I don't know how, but...

Thank you for your time, and help.

I am also supposed to add SKOS:Concept. What would be the best way to do that?
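
The label fix above can be applied mechanically. Here is a minimal sketch in plain Java (no KIM or GATE classes), assuming the broken literals carry the language tag inside the closing quote, as in "Consumer Products@en" (a mail archive may display that @ as " at "). The class and method names are made up for illustration, and escaped quotes inside labels are not handled:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper (hypothetical, not part of KIM): repairs misplaced language tags. */
final class LabelLangTagFixer {

    // A double-quoted literal whose text ends in "@<lang>" right before the
    // closing quote, e.g. "Consumer Products@en".
    // Group 1 is the label text, group 2 is the language tag.
    private static final Pattern TAG_INSIDE_QUOTES =
            Pattern.compile("\"([^\"@]+)@([A-Za-z]{2,3}(?:-[A-Za-z0-9]+)*)\"");

    /** Returns true if the line contains a literal with the tag inside the quotes. */
    static boolean isBroken(String line) {
        return TAG_INSIDE_QUOTES.matcher(line).find();
    }

    /** Moves every misplaced language tag outside the quotes; other text is untouched. */
    static String fixLine(String line) {
        Matcher m = TAG_INSIDE_QUOTES.matcher(line);
        return m.replaceAll("\"$1\"@$2");
    }

    public static void main(String[] args) {
        String wrong = "rdfs:label \"Consumer Products@en\" .";
        System.out.println(fixLine(wrong)); // prints: rdfs:label "Consumer Products"@en .
    }
}
```

Running `fixLine` over each line of the RDF file before import turns the wrong form into the correct one; labels that are already correct pass through unchanged.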
Best,
Srecko

-----Original Message-----
From: Boyan Kukushev [mailto:boyan.kukushev at ontotext.com]
Sent: Friday, July 15, 2011 12:55
To: srecko joksimovic
Cc: Philip Alexiev @ Ontotext; kim-discussion at ontotext.com
Subject: Re: [Kim-discussion] Topic

Hi :)

We think we finally found where your problem is: the labels in this RDF are wrong.

*wrong* : "Consumer Products@en"
*correct*: "Consumer Products"@en

Steps to resolve:
- re-create your RDF information - either move the language descriptor for
  each label outside the quotes, or remove it from all labels; make sure you
  have at least these statements for any entity you want to look up:
    * protons:generatedBy
    * rdf-syntax:type protont:Topic
    * protons:hasMainAlias
    * rdf-syntax:type protons:Alias
    * rdf-schema:label "entity-label"@lang
  !!! Do not forget to declare trusted by adding the statement
    * rdf-syntax:type protons:Trusted
- remove your entire KIM/context/default/populated directory (to clear the
  gazetteer cache)
- make sure you have added the Topic annotation type to
  KIM/config/nerc.properties:com.ontotext.kim.KIMConstants.IE_ANN_TYPES
- edit KIM/config/owlim.ttl so that KIM will load your RDF information
  (editing should be adding two lines - one to 'owlim:imports', one to
  'owlim:defaultNS')
- start KIM server by KIM/bin/kim(.bat) gate and wait till server is up
- initialize the JAPE rule for creating Topic annotations from Lookups and
  add it in the pipeline after the gazetteer phases
- process any document you are sure to have Topic annotations in text
- if successful - store the GATE application pipeline as described in
  previous mail;
- report what happened :)

I hope this brief step listing would be helpful.

Regards,
Boyan

I suggest you re-create

On Friday, July 15, 2011 11:38:00 srecko joksimovic wrote:
> Hi Philip,
> Hi Boyan,
>
> I'm still having a problem with Topic, and annotations.
> Here is what I did:
> I added acm_proton.n3 file to owlim.ttl, and imported new instances.
> When I run web UI, I'm able to see all I imported.
> After that I started GATE. Created query:
>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX protons: <http://proton.semanticweb.org/2006/05/protons#>
> PREFIX protont: <http://proton.semanticweb.org/2006/05/protont#>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> SELECT ?topic ?alias ?label
> WHERE {
>   ?topic rdf:type protont:Topic.
>   ?topic protons:hasMainAlias ?alias.
>   ?alias a protons:Alias.
>   ?alias rdfs:label ?label.
> }
>
> when I evaluate this query with JVisualVM, this is what I get:
>
> [http://www.lornet.org/acm-ccs/proton#J.7.1,
>  http://www.lornet.org/acm-ccs/proton#Alias_dee6eec3-b503-4d3e-a98d-ecdde5d5ee1a,
>  "Consumer Products@en"]
> [http://www.lornet.org/acm-ccs/proton#B.7.3.1,
>  http://www.lornet.org/acm-ccs/proton#Alias_a4fdb728-b855-4ee6-b220-b8a93b55f9c7,
>  "Error-Checking@en"]
> [http://www.lornet.org/acm-ccs/proton#K.7.2,
>  http://www.lornet.org/acm-ccs/proton#Alias_4694954a-ba5e-4333-9ea9-8d5b94790c4e,
>  "Organizations@en"]
> [http://www.lornet.org/acm-ccs/proton#D.3.2.10,
>  http://www.lornet.org/acm-ccs/proton#Alias_d59afecf-26fc-4a5b-af92-e8c994542b23,
>  "Nonprocedural Languages@en"]
> [http://www.lornet.org/acm-ccs/proton#C.1.1.2,
>  http://www.lornet.org/acm-ccs/proton#Alias_2d3fd573-5a41-403b-bdf3-22332ad9d839,
>  "RISC/CISC, VLIW Architectures@en"]
> ...
>
> With GATE, I used JAPE rule that you suggested:
>
> Phase: GazTopic
> Input: Lookup
> Options: control = appelt
> Rule: Topic
> (
>   {Lookup.class == "http://proton.semanticweb.org/2006/05/protont#Topic"}
> ):topic
> -->
> :topic.Topic = {rule=GazTopic, class=:topic.Lookup.class,
>                 inst=:topic.Lookup.inst}
>
> Then I tried to create LKB Gazetteer.
> When I configure new LKB Gazetteer using these parameters:
>
> annotationLimit = -1
> caseSensitive = false
> dictFeederClass = com.ontotext.kim.model.KimDictionaryFeederImpl
> dictFeederParams =
>   FeedSetupPath=$relpath$../context/default/resources/gazetteer/acm-topic
> dynamicDictEnabled = false
> feedTransformerStages =
> outputASName =
> relpath = file_path
> staticDictEnabled = true
> staticDictSerializationPath = file_path
>
> (add to pipeline, before JAPE rule, and after default Gazetteer)
>
> I don't see Topic. But, when I put outputASName = Topic, I see Topic as a
> separated group, with Lookup as its subgroup. When I select Lookup, I can
> see that something is annotated, but it is not same as terms that I
> imported.
> When I try to annotate using Java code, it's still same.
>
> Could you please tell me what I did wrong?
>
> Best,
> Srecko
>
> On Thu, Jul 14, 2011 at 2:49 PM, Philip Alexiev @ Ontotext <
> philip.alexiev at ontotext.com> wrote:
> > Hi Srecko,
> >
> > Make sure you add the JAPE transducer after the gazetteer.
> >
> > Also it will be much simpler to examine the process directly in the GATE
> > interface. Create a file, add it to the corpus, set the corpus to the
> > pipeline and execute the pipeline.
> >
> > Disable the last resource to not remove the temporary annotations. Run
> > the pipeline and explore the annotations. See if Lookup are created over
> > the topics in the text.
> >
> > Things to have in mind:
> > - after you change the RDF that is imported in KIM, you should delete the
> >   KIM/context/default/populated folder (where the cache is stored) and
> >   restart the server
> > - the JAPE transducer should be after the Gazetteer, as it uses the
> >   annotations created by the Gazetteer as its input.
> >
> > Hope this helps
> > Philip
> >
> > On 14 Jul 2011, at 3:42 PM, srecko joksimovic wrote:
> >
> > Hi Boyan,
> > I created JAPE rule like you and Philip suggested, and stored to
> > context/default/resources/grammar/acm folder. Then I run gate, and
> > created JAPE transducer, topic_jape. I didn't specify inputASName, but I
> > did add it to pipeline. Saved application state, populated KIMServer
> > corpus, and run the application. I don't know how, but there is
> > everything but Topic.
> >
> > I'm still missing something, but I don't know what. Should I create Large
> > KB Gazetteer?
> >
> > Best,
> > Srecko
> >
> > On Thu, Jul 14, 2011 at 2:15 PM, srecko joksimovic <
> > sreckojoksimovic at gmail.com> wrote:
> >> Hi Boyan,
> >> I didn't understand that I must create JAPE rule before I do everything
> >> else.
> >> I'll try this now.
> >>
> >> Thank you!
> >>
> >> Srecko
> >>
> >> On Thu, Jul 14, 2011 at 2:13 PM, Boyan Kukushev <
> >> boyan.kukushev at ontotext.com> wrote:
> >>> Hi Srecko,
> >>>
> >>> In order to see your Topic annotations, you must create the JAPE rule
> >>> that Philip suggested:
> >>>
> >>> Phase: GazTopic
> >>> Input: Lookup
> >>> Options: control = appelt
> >>> Rule: Topic
> >>> (
> >>>   {Lookup.class == "http://proton.semanticweb.org/2006/05/protont#Topic"}
> >>> ):topic
> >>> -->
> >>> :topic.Topic = {rule=GazTopic, class=:topic.Lookup.class,
> >>>                 inst=:topic.Lookup.inst}
> >>>
> >>> and put that rule just after the gazetteer phases within the GATE
> >>> pipeline.
> >>> The easiest way to do this is using the KIM GATE interface by starting
> >>>
> >>> KIM/bin/kim(.bat) gate
> >>>
> >>> and modifying the pipeline.
> >>>
> >>> You have already added the Topic annotation type to the list of allowed
> >>> annotation types in KIM/config/nerc.properties.
> >>> After you run the pipeline with this new resource included, Topic
> >>> annotations should appear in the default annotation set for each
> >>> document you process.
> >>>
> >>> To be able to use the pipeline again, you should save it, again using
> >>> the KIM GATE interface - right click on the pipeline and select 'Save
> >>> application state'. Remember to remove (or empty) the document corpus
> >>> used by the application. You choose whether to overwrite the default
> >>> KIM pipeline (IE.gapp) or create a new one and point KIM to use it
> >>> (setting the corresponding property in KIM/config/nerc.properties).
> >>>
> >>> Hope this helps!
> >>>
> >>> Regards,
> >>> Boyan
> >>>
> >>> P.S. What is happening exactly:
> >>> - the gazetteer phases use a pre-defined knowledge base to find specific
> >>>   'things' in the text you process; they produce annotations of type
> >>>   Lookup
> >>> - the JAPE rule would take all Lookup annotations that have the specific
> >>>   class (in your case that is
> >>>   "http://proton.semanticweb.org/2006/05/protont#Topic") and would create
> >>>   a new annotation of type Topic that is fully overlapping the current
> >>>   Lookup annotation
> >>> - the last phase in the pipeline removes all temporary annotations - the
> >>>   Lookup annotation is also a temporary annotation, but Topic (as it is
> >>>   added to the allowed annotations list) will not be removed.
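
The whitelist step described in the P.S. can be sanity-checked outside KIM. The following is a minimal sketch using only java.util.Properties, assuming nerc.properties follows the standard Java properties syntax (the thread shows the key and its comma-separated value, but not the rest of the file). The class and method names are illustrative and not part of KIM:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Illustrative helper (hypothetical, not part of KIM): reads the annotation-type whitelist. */
final class AnnTypeWhitelist {

    // Property name as quoted in the thread.
    static final String KEY = "com.ontotext.kim.KIMConstants.IE_ANN_TYPES";

    /** Parses properties text and returns the trimmed, non-empty type names. */
    static List<String> allowedTypes(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> types = new ArrayList<>();
        // A missing key yields an empty string, which produces no type names.
        for (String part : props.getProperty(KEY, "").split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                types.add(name);
            }
        }
        return types;
    }

    public static void main(String[] args) {
        String sample = KEY + "=Abstract, Brand, Person, Topic\n";
        System.out.println("Topic whitelisted: " + allowedTypes(sample).contains("Topic"));
    }
}
```

To check a real installation, read the contents of KIM/config/nerc.properties into a string first and pass it to `allowedTypes`. If Topic is missing from the list, the final cleanup phase would be expected to remove Topic annotations along with the temporary Lookup ones.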
> >>> > >>> >
> >>> > >>> > Best,
> >>> > >>> > Srecko
> >>> > >>> > _______________________________________________
> >>> > >>> > Kim-discussion mailing list
> >>> > >>> > Kim-discussion at ontotext.com
> >>> > >>> > http://ontotext.com/mailman/listinfo/kim-discussion
> >>>
> >>> --
> >>> Boyan Kukushev
> >>> Senior Software Engineer / Java Developer
> >>> Ontotext AD @ Sirma Group Corp.

--
Boyan Kukushev
Senior Software Engineer / Java Developer
Ontotext AD @ Sirma Group Corp.

From reneta.popova at ontotext.com Thu Jul 21 10:12:13 2011
From: reneta.popova at ontotext.com (Reneta Popova)
Date: Thu, 21 Jul 2011 17:12:13 +0300
Subject: [Kim-discussion] Fwd: LKB gazetteer doubts
References:
Message-ID:

Begin forwarded message:

> From: Javier Ruiz Martin
> Date: 21 Jul 2011 17:01:14 +0300
> To: danko at ontotext.com, marin.nozhchev at ontotext.com, reneta.popova at ontotext.com
> Subject: LKB gazetteer doubts
>
> Hello everyone,
>
> I am Javier Ruiz Martin, from Barcelona. I would really appreciate it if you could give me an answer to a technical question and a legal question about your LKB Gazetteer.
>
> I'm trying to use your LKB Gazetteer with a Virtuoso repository, as described on the web:
>
> http://nmwiki.ontotext.com/lkb_gazetteer/general.html#rdf-database-compatibility
>
> I attached my config.ttl file and the response returned when registering the GATE plugin LKBGazetteer with this configuration.
>
> The questions are:
>
> 1) Have you got the connection between your Gazetteer and Virtuoso working with the version of your plugin that is included with GATE (gate-6.1-snapshot-build3920-ALL)?
> If yes, can you send me the config.ttl file you used? Maybe the version of kim-util-3.0-RC5.jar does not support this feature?
>
> 2) I can't find the license for using the plugin. In fact, the website shows:
> http://nmwiki.ontotext.com/lkb_gazetteer/license.html
>
> "No project license is defined for This Project.."
>
> Is this correct?
>
> Thank you very much
> Javier
>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: error.txt
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: config.ttl
Type: application/octet-stream
Size: 2419 bytes
Desc: not available
URL:

From boyan.kukushev at ontotext.com Thu Jul 21 12:06:27 2011
From: boyan.kukushev at ontotext.com (Boyan Kukushev)
Date: Thu, 21 Jul 2011 19:06:27 +0300
Subject: [Kim-discussion] Fwd: LKB gazetteer doubts
In-Reply-To:
References:
Message-ID: <201107211906.27327.boyan.kukushev@ontotext.com>

Hello Javier,

According to the output you provided, you are trying to run KIM with a custom configuration for the LKB gazetteer. IMO, the libraries that contain the Virtuoso-related classes, resources, etc. are missing from the KIM server classpath.

Please make sure you have added the Virtuoso-specific libraries (e.g. the Virtuoso .jar file containing the VirtuosoRepository class) to the KIM server classpath:
- add them to /lib
- or create a /lib/customizations folder and put these .jar files there

Hope this helps!

Regards,
Boyan Kukushev

On Thursday, July 21, 2011 17:12:13 Reneta Popova wrote:
> Begin forwarded message:
> > From: Javier Ruiz Martin
> > Date: 21 Jul
2011 17:01:14 +0300
> > To: danko at ontotext.com, marin.nozhchev at ontotext.com, reneta.popova at ontotext.com
> > Subject: LKB gazetteer doubts
> >
> > Hello everyone,
> >
> > I am Javier Ruiz Martin, from Barcelona. I would really appreciate it if you
> > could give me an answer to a technical question and a legal question about
> > your LKB Gazetteer.
> >
> > I'm trying to use your LKB Gazetteer with a Virtuoso repository, as described
> > on the web:
> >
> > http://nmwiki.ontotext.com/lkb_gazetteer/general.html#rdf-database-compatibility
> >
> > I attached my config.ttl file and the response returned when registering the
> > GATE plugin LKBGazetteer with this configuration.
> >
> > The questions are:
> >
> > 1) Have you got the connection between your Gazetteer and Virtuoso working
> > with the version of your plugin that is included with GATE
> > (gate-6.1-snapshot-build3920-ALL)? If yes, can you send me the config.ttl
> > file you used? Maybe the version of kim-util-3.0-RC5.jar does not
> > support this feature?
> >
> > 2) I can't find the license for using the plugin. In fact, the website shows:
> > http://nmwiki.ontotext.com/lkb_gazetteer/license.html
> >
> > "No project license is defined for This Project.."
> >
> > Is this correct?
> >
> > Thank you very much
> > Javier

--
Boyan Kukushev
Senior Software Engineer / Java Developer
Ontotext AD @ Sirma Group Corp.
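
Editor's note on the classpath advice in Boyan's reply: one quick way to see whether a jar has actually been picked up is to ask the JVM whether it can locate a class from it. The following is only a minimal sketch, not part of KIM. The class ClasspathCheck is hypothetical, and the default class name virtuoso.sesame2.driver.VirtuosoRepository is an assumption about the Virtuoso Sesame provider; the thread itself only mentions "the VirtuosoRepository class", so pass the fully qualified name from your own jar if it differs.

```java
public class ClasspathCheck {

    /** Returns true if the named class can be located on the current classpath. */
    public static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not found, or found but one of its own dependencies is missing
            return false;
        }
    }

    public static void main(String[] args) {
        // ASSUMPTION: default fully qualified name of the Virtuoso repository class.
        // Override it by passing one or more class names as arguments.
        String[] names = args.length > 0
                ? args
                : new String[] { "virtuoso.sesame2.driver.VirtuosoRepository" };
        for (String name : names) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "MISSING"));
        }
    }
}
```

To be meaningful, the check has to run with the same classpath the KIM server uses (the jars in its lib folder, plus the customizations folder if you created one). A "MISSING" result would point to the jar not being on that classpath; a "found" result only shows the class is visible, not that the gazetteer configuration itself is correct.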