GraphDB Counteracts Stock Market Manipulation

In this post, we offer some perspectives on the frenzy initiated by r/WallStreetBets and its impact on the stock market. We look at some of the challenges financial intermediaries and regulators face when trying to apply the established rules and corrective actions. Finally, we share our thoughts on why the GraphDB capabilities we already use in trade surveillance to prevent manipulation by professional traders can provide a solution in the current situation.

February 3, 2021 8 mins. read Peio Popov

Last week, when the internet and the stock market went crazy over the skyrocketing shares of US video-game retailer GameStop, it all traced back to Reddit’s WallStreetBets community. Originally a place for sharing trading advice, it became a platform where day traders banded together to inflate the price of stocks such as GameStop, AMC, BlackBerry, and Bed Bath & Beyond.

There are many perspectives on the story, but whether it was gambling, market manipulation, or a mission to take on Wall Street’s big players, these recent events show the volatility of the market and its exposure to outside interference, including from social media.

Fraud or Civic Position – the Effect is the Same

Financial fraud is nothing new – where there is an opportunity to make money, there is a scheme to appropriate it. However, recent advances in the use of technology by market participants and others have presented new challenges for the monitoring and detection of market manipulation. With the proliferation of social media, it has become much easier to coordinate unethical and malicious activities.

Of course, Reddit’s WallStreetBets forum is not a pioneer in the field. Long before internet forums, people made use of social media’s predecessor – the dial-up bulletin board system – to post fraudulent trading recommendations. Attempts to manipulate share prices by using social media to spread false or misleading information about stocks led to the SEC’s Investor Alert of 2015, warning institutional investors about the possible impact of social media.

However, it is difficult to map the r/WallStreetBets events to the classic financial fraud models.

The Securities Exchange Act of 1934 made it illegal to use “any manipulative device or contrivance” when buying or selling regulated shares. One practice that comes close is the pump-and-dump, a form of stock market manipulation in which the price of initially cheaply purchased stock is artificially inflated so that it can be sold at a higher price. To help decide what falls under this, the Commodity Futures Trading Commission (CFTC) has formulated the following four-part test for market manipulation:

  • (1) that the accused had the ability to influence market prices;
  • (2) that the accused specifically intended to create or effect a price or price trend that does not reflect legitimate forces of supply and demand;
  • (3) that artificial prices existed; and
  • (4) that the accused caused the artificial prices.

Well-known pump-and-dump examples include the microcap scheme targeting elderly retail investors in the US, the three Israeli penny stock promoters, and the civil action against Meadow Vista Financial Corp. and Downshire Capital, Inc. in Canada, among many others.

A short squeeze is another interpretation of the events. The term refers to the pressure on short sellers to cover their positions as a result of sharp price increases or difficulty in borrowing the security they are short. This is a controversial tactic that might be considered either a legitimate investment strategy or market manipulation. However, in its Key Points, the SEC indicates the fraudulent potential:

Although some short squeezes may occur naturally in the market, a scheme to manipulate the price or availability of stock in order to cause a short squeeze is illegal.

As we already said, the WallStreetBets group’s activities do not easily fit into the classic financial fraud models. Some of the most notable differences from established practices are:

  1. Intent – most of the Reddit traders are setting ask prices, which clearly demonstrates their intent to hold the stock rather than profit from the resulting price fluctuations;
  2. Scale – most of the invested amounts are too small to influence prices on their own;
  3. Coordination – “small pieces loosely joined” is the title of the classic book describing how the web works, and in this case the loose cooperation and the mix of motivations make it hard to qualify the activity as a “scheme”;
  4. Causality – the regulator faces the challenge of clearly proving intent and the relationship between actions and result.

However, whichever way one chooses to look at it, last week’s events still present challenges to applying the established rules and corrective actions. To remediate the effect, the market intermediaries and regulators are left with a set of blunt instruments such as discriminating against classes of participants, excluding entire platforms, forcing sales of assets, or suppression, even censorship, of speech. We can do better than that.

The Shortcomings of the Legacy Solutions

The effect of r/WallStreetBets clearly demonstrates the shortcomings of the legacy solutions. The current methods of detecting market manipulation are based on hard-coded rules, combined with the classic time-series analysis of market data such as trade, order, quote, and execution data. While this type of analysis provides an irreplaceable foundation for detecting well-known fraudulent practices like frontrunning and spoofing, it leaves a lot to be desired when dealing with challenges posed by sophisticated actors and/or a lack of contextual information around the market data.

An ongoing challenge for trade surveillance, KYC, AML, sanctions, and all compliance processes is the high number of fraud alerts. Most of the traditional compliance systems are calibrated for high recall, which results in a lot of false positives that need to be manually reviewed.
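To make the recall versus false-positive trade-off concrete, here is a minimal sketch in Python. The function name and all the numbers are hypothetical and chosen only for illustration; they are not taken from any real compliance system.

```python
def alert_metrics(true_positives: int, false_positives: int, false_negatives: int) -> dict:
    """Return precision, recall and manual-review workload for a batch of alerts."""
    raised = true_positives + false_positives   # every alert needs a manual review
    actual = true_positives + false_negatives   # all genuinely problematic cases
    precision = true_positives / raised if raised else 0.0
    recall = true_positives / actual if actual else 0.0
    return {"precision": precision, "recall": recall, "alerts_to_review": raised}


# Hypothetical batch: 1,000 alerts raised, 20 of them genuine, 1 genuine case missed.
metrics = alert_metrics(true_positives=20, false_positives=980, false_negatives=1)
print(metrics)
```

With these made-up figures the system catches almost every genuine case (recall of about 95%), yet only 2% of the alerts that analysts review turn out to be real.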

Combined with the lack of contextual information about the actors and the (series of) transactions, this leads to the need to maintain the impossible balance between:

  • the ever-increasing time and cost of compliance and its disruptive effect on legitimate activities, on one side and
  • the risk of missing problematic activities and the consequent regulatory sanctions and reputation exposure, on the other.

This hard choice was well illustrated by the unusual actions of the brokerages serving small investors. Having no means to control the activities they intermediated, they made choices that resulted in dozens of lawsuits and united political foes from opposite ends of the spectrum.

It doesn’t have to be this way.

How Knowledge Graphs Help

We have recently published a case study about a global bank that chose Ontotext’s GraphDB to power their Trade Surveillance system. It is true that Trade Surveillance in an investment bank is different from detecting market manipulation by retail investors who possess no insider knowledge. However, even though the global bank’s problem was how best to track the financial investments made by people with inside knowledge and by their friends and family, we believe that our solution can benefit both cases.

One of the main benefits for the investment bank was that Ontotext’s knowledge-graph-based solution improved the efficiency and precision of their alert review process by adding the following capabilities to the existing compliance system:

  1. Identification – unique global identifiers for all parties and instruments;
  2. Meaning – interoperable definitions helping to align all different regulatory and business requirements;
  3. Expressivity – the ability to express domain and use-case specific knowledge and constraints;
  4. Constraints – the ability to ensure consistent quality of the data by enforcing business rules on it;
  5. Historic view – the clarity of who knew what and when in order to prove ongoing compliance;
  6. Rank – a measure of influence to rank and contextualize the most important members in a group, associated features, and signals;
  7. Meaningful data for ML – last, but not least, the contextually rich data from semantically integrated sources is a much better input for all machine learning models developed to detect problematic patterns.

The following diagram represents a high-level conceptual architecture for a solution:

By using formal semantics, Ontotext’s solution captured the “meaning” of the available market data with all its inherent relationships in a single enterprise knowledge graph. The graph’s flexible and dynamic structure greatly improved the quality of the data and enriched it with automatically inferred new facts.

Applying an unambiguous and permanent identity resolution technique facilitated the reconciliation of identifiers coming from different systems. It also helped measure connectedness and rank the group on the basis of centrality and made it easier to analyze clusters of interrelated entities and to classify and group them accordingly. The Financial Industry Business Ontology provided a foundation and acted as an accelerator for developing a suitable hierarchy. This enabled identity and meaning-based interoperability of the traded instruments, despite the multiple identifiers used by some of the traders.
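The two ideas in this paragraph, reconciling identifiers and ranking by centrality, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the identifiers, the `IdentityResolver` class and the `degree_centrality` function are hypothetical, union-find stands in for whatever identity resolution technique the bank's solution used, and degree centrality is simply the most basic centrality measure.

```python
from collections import defaultdict


class IdentityResolver:
    """Union-find over identifiers known to denote the same party."""

    def __init__(self):
        self.parent = {}

    def find(self, ident):
        """Return the canonical identifier for ident."""
        self.parent.setdefault(ident, ident)
        while self.parent[ident] != ident:
            self.parent[ident] = self.parent[self.parent[ident]]  # path halving
            ident = self.parent[ident]
        return ident

    def same_as(self, a, b):
        """Record that identifiers a and b refer to the same party."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            # Arbitrary but stable choice: the smaller string becomes canonical.
            self.parent[max(root_a, root_b)] = min(root_a, root_b)


def degree_centrality(trades, resolver):
    """Rank resolved parties by their number of distinct counterparties."""
    neighbours = defaultdict(set)
    for buyer, seller in trades:
        b, s = resolver.find(buyer), resolver.find(seller)
        if b != s:
            neighbours[b].add(s)
            neighbours[s].add(b)
    return sorted(
        ((party, len(linked)) for party, linked in neighbours.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical data: one trader known under three identifiers in three systems.
resolver = IdentityResolver()
resolver.same_as("crm:1001", "lei:ABC")
resolver.same_as("lei:ABC", "broker:77")

trades = [
    ("crm:1001", "crm:2002"),
    ("lei:ABC", "crm:3003"),
    ("broker:77", "crm:2002"),
    ("crm:4004", "lei:ABC"),
]
ranking = degree_centrality(trades, resolver)
print(ranking)
```

Without the reconciliation step, the same trader would appear as three unrelated parties with one or two counterparties each. Once the identifiers are merged, that trader stands out as the most connected node in this toy graph.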

By its nature, the knowledge graph provides context. Concepts use other concepts to describe each other. The interlinked descriptions of concepts and entities created the necessary context and made the work of the bank’s compliance team much easier. What’s more, Ontotext’s solution also allowed them to interpret each alert not only against the latest version of the organizational structure but against its state at the time when it happened.
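Interpreting an alert against the organizational structure as it stood at the time of the event amounts to an "as of" lookup. The sketch below shows one simple way to model this in Python; the `RoleHistory` class, the desk names and the dates are all hypothetical and do not describe how GraphDB stores history.

```python
from bisect import bisect_right
from datetime import date


class RoleHistory:
    """History of (effective_from, desk) records for one employee."""

    def __init__(self):
        self._dates = []
        self._desks = []

    def record(self, effective_from: date, desk: str) -> None:
        """Insert a record, keeping both lists sorted by effective date."""
        idx = bisect_right(self._dates, effective_from)
        self._dates.insert(idx, effective_from)
        self._desks.insert(idx, desk)

    def as_of(self, when: date):
        """Return the desk in effect on the given date, or None if none was."""
        idx = bisect_right(self._dates, when)
        return self._desks[idx - 1] if idx else None


# Hypothetical employee who moved desks in September 2020.
history = RoleHistory()
history.record(date(2020, 1, 1), "Equities")
history.record(date(2020, 9, 1), "M&A Advisory")

trade_date = date(2020, 6, 15)
print(history.as_of(trade_date))        # desk at the time of the trade
print(history.as_of(date(2021, 1, 28)))  # desk at review time
```

An alert on the June trade should be judged against the desk returned by the first lookup, not the second one, which is the point the paragraph above makes.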

Last but not least, the formal semantics of the knowledge graph enabled domain-centric views on compliance information and data, which provided richer context for smart interpretation. This enabled different types of pattern searches based on GraphDB semantic similarity, allowing discovery on the basis of graph and word vector similarity.
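Vector similarity searches of this kind usually come down to comparing embeddings, most often with cosine similarity. The following Python sketch is a generic illustration only: the three-dimensional vectors, the alert names and both functions are made up, and none of this is the GraphDB API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query, candidates, top_n=2):
    """Return the top_n candidate names ranked by similarity to the query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]


# Hypothetical embeddings of a known manipulation pattern and three new alerts.
known_pattern = [0.9, 0.1, 0.4]
alerts = {
    "alert-1": [0.8, 0.2, 0.5],
    "alert-2": [0.1, 0.9, 0.0],
    "alert-3": [0.0, 0.2, 0.9],
}
matches = most_similar(known_pattern, alerts)
print(matches)
```

In this toy example, `alert-1` ranks first because its vector points in nearly the same direction as the known pattern, so it is the one a reviewer would look at first.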


The combination of all the capabilities provided by a knowledge graph solution offers a much more sophisticated toolset for monitoring and analysis of the actions of the market participants. Instead of using indiscriminate mass action, the intermediaries and the regulator can apply fine-grained measures, corresponding to the actions and intent of the participants.

In this way, the goal of democratizing access to the stock market can be achieved while still keeping customers’ confidence, lowering frustration, and avoiding litigation and reputational risks.


Sales Executive at Ontotext

Peio Popov is a legal counsel with a strong IT background, specializing in business development, intellectual property, and technology-related contract negotiations. Besides being highly skilled in business operations, Peio’s professional focus spans software licensing and development, free software and Open Data licensing issues, text analysis and information extraction, PKI and identity services, and applied cryptography, among others.