Read about SEMANTiCS pre-conference day, which covered the topics of interoperability, ESG data, knowledge engineering, scholarly communication, and academia & industry collaboration.
Did you know that, if you add “take a deep breath” to a prompt, chances are you will get more accurate results from large language models (LLMs)? I didn’t either. I learned that fact from a comment in the audience on the second day of SEMANTiCS 2023 – the European conference series focused on semantic technologies since 2005.
The last day of SEMANTiCS started as excitingly as all the rest. This time with a SPARQL query.
Aidan Hogan, Associate Professor at the Department of Computer Science, University of Chile, and Associate Researcher at the Millennium Institute for Foundational Research on Data (IMFD), asked ChatGPT:
“Can you give me a SPARQL query for Wikidata to find Turing Award winners who were born in Latin America?”
Further, Hogan elaborated on the ways we can use LLMs to access knowledge from knowledge graphs, sparkling (pun intended) an interesting inner dialogue: What if we used ChatGPT to write SPARQL queries instead of natural language answers?
Showcasing in real time how ChatGPT can be used to write SPARQL queries, Aidan made the point that LLMs can be a potential entry point to the Semantic Web, and he stressed the need for more research at the intersection of LLMs and knowledge graphs.
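To give a flavour of what such a query looks like, here is a minimal sketch of the kind of query discussed, written for the Wikidata Query Service. It is my own reconstruction rather than the exact query ChatGPT produced on stage, and the identifiers used (Q185667 for the Turing Award, P166 for “award received”, P19 for “place of birth”, P17 for “country”, P361 for “part of”, and Q12585 for Latin America) are assumptions about how Wikidata models these facts.

```sparql
# Sketch: Turing Award winners born in Latin America (reconstruction, not the query shown on stage)
SELECT DISTINCT ?winner ?winnerLabel ?countryLabel WHERE {
  ?winner wdt:P166 wd:Q185667 .     # the person has received the Turing Award
  ?winner wdt:P19 ?birthplace .     # their place of birth ...
  ?birthplace wdt:P17 ?country .    # ... lies in some country ...
  ?country wdt:P361 wd:Q12585 .     # ... that is modelled as part of Latin America
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

Run against query.wikidata.org, such a query would be expected to surface laureates like Manuel Blum, born in Caracas, although the “part of Latin America” modelling is not applied uniformly across countries, so the property path may need adjusting.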
A slide from the publicly available keynote of Aidan Hogan at SEMANTiCS 2023
Hogan also spoke about the current capabilities and limitations of LLMs and compared them to those of knowledge graphs in terms of usability, explainability, provenance, resoluteness, mutability, timeliness, comprehensiveness, utility, expressiveness, cost, and truth.
“[LLMs] call into question a fundamental tenet of Data Management: that in order to address non-trivial information needs, the first step is to explicitly structure data in order to lift them from the ambiguous swamp of our human language.”
– Aidan Hogan
Throughout his presentation [PDF], he offered a plethora of academic references on the open questions arising from use cases where knowledge graphs and LLMs interplay.
Aidan Hogan at SEMANTiCS 2023. Source: SWC
Aidan concluded by saying that LLMs can make knowledge graphs more usable and widespread as a technology. He invited everyone to contribute to Transactions on Graph Data and Knowledge (TGDK) – an open-access journal publishing research about graph data and knowledge. He also reminded us all about his wonderful book, available online with open access.
Somewhat complementary to Hogan’s presentation was the talk Data Spaces and Lexical Resources. I say complementary because, besides using Linked Data to work better with LLMs and knowledge graphs, we can also use lexical resources to enrich LLMs’ “learning” path.
Ilan Kernerman, CEO of Lexicala, and Martin Kaltenböck, CFO of Semantic Web Company (SWC), presented an approach to making use of language models and knowledge graphs in data spaces and data markets to foster data exchange and semantic interoperability. Both speakers talked about common metadata standards and adequate language resources as key enablers of efficient, interoperable, multilingual projects.
Ilan also spoke about the language-specific flaws of LLMs, complementing the limitations Hogan had shown.
Ilan Kernerman presenting the flaws of LLM technology at SEMANTiCS 2023
From Ilan’s perspective, one rooted in years of work and research to provide multilingual lexical data solutions for the Language Technology industry and academia, these flaws were to be addressed by lexicology. Ilan argued that we could map the DNA of language by organizing lexical resources as data.
Among the language resources Ilan listed were systemic lexical mapping, definitions, sense indicators, typical language patterns, phraseology, domain classification, grammatical details, morphology, and cross-lingual correspondence.
The last talk I attended was Laws for a Genie: Governance and Evidence Frameworks for Large Language Model-Based Chatbots in Medicine, given by Stephen Gilbert, Professor of Medical Device Regulatory Science at the Else Kröner Fresenius Center for Digital Health, Technische Universität Dresden. It was an entertaining, highly informative, and thoughtful walk through the ethical and technological aspects of the use of LLMs in medicine.
Stephen Gilbert stressed how important regulatory and ethical frameworks are for using LLMs in the healthcare domain. For, despite the great medical potential of LLM-based generative chat tools (such as ChatGPT or Google’s Med-PaLM), LLMs with explainability, low bias, predictability, correctness, and verifiable outputs do not currently exist. This calls for developing frameworks and rules that will eventually tame the genie that is already out of the bottle.
At his talk, Stephen Gilbert from @EKFZdigital @tudresden_de discussed “Laws for a Genie: Governance and Evidence Frameworks for Large Language Model-based Chatbots in #Medicine.” #AI #Healthcare #DigitalHealth #SemanticsConf
— SEMANTiCS Conference (@SemanticsConf) September 22, 2023
The conference took place in conjunction with LANGUAGE INTELLIGENCE 2023 – an event showcasing the latest developments in Multilingual Artificial Intelligence. It offered 3 days full of academic and industry tracks, business talks, tutorials, and workshops. If I were to talk about some of the insights I got from the parallel event, I would probably end up with 3 more posts. Thankfully, lt-innovate.org already did a concise wrap-up.
But, if I were to distill the gist, it would be: language technology.
Just like the typewriter in the hall hosting the Poster’s park, LLMs are yet another tool poised to change the way we work with language. And yes, although this tool might look sophisticated and exciting, it comes with many ethical concerns and a lot of work for us to do before we are able to use it responsibly and sustainably. As with any technology, the thing that matters most is the knowledge we manage to transfer via it. And, as we all know, when it comes to machine-mediated knowledge, a little semantics goes a long way.
A typewriter in the hall hosting the Poster’s park at SEMANTiCS 2023
SEMANTiCS 2023 proved that semantic technologies are invaluable when it comes to encoding knowledge and working towards better spaces of collaboration, innovation, and, most importantly, shared understanding.
LLMs are only as knowledgeable as the data we feed them, so let’s see how knowledgeable our LLMs will be and how knowledge-centric our projects are a year from now.
Looking forward to seeing you in Amsterdam for the 20th edition of SEMANTiCS in 2024!