Plexgraf is a startup in semi-stealth mode. Welcome!
Bots With the Facts Straight


Assembli is a reference platform for researching guardrail strategies that make AI conversational apps and Large Language Models (LLMs) safe for business use cases, including customer contact, analytics, and numerous enterprise service and support touchpoints.

There is growing sentiment that AI chatbots based on LLMs (e.g., ChatGPT, Claude, Gemini) are not ready for prime time in terms of reliability and information safety for business-critical applications.

The concern is valid, but it is in fact possible, with the approach discussed below, to use Machine Learning (ML), Natural Language Processing (NLP) and Information Retrieval (IR) methodologies to mitigate AI business hazards.

"The future is already here, it's just not evenly distributed." - William Gibson

Bad Bot Headlines

An example of a leading LLM Chatbot getting it wrong:

Stanford ML/AI courses and books spell out danger:

Explicit cautions from Google about AI threats:

Business Realities for AI

If AI applications and chatbots are to succeed, there are four huge challenges to overcome:

DEFINITION: Retrieval Augmented Generation (RAG). A large percentage of currently available business apps for AI are based on Retrieval Augmented Generation, an approach that injects custom business content into the prompt when chatbot users send questions to AI LLMs. RAG is dangerous for business apps unless it is guardrailed with mature ML/NLP/IR techniques, as per the Assembli approach.
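The RAG mechanic described above can be sketched in a few lines. This is a minimal, illustrative example (the knowledge base, scorer, and prompt template are all hypothetical stand-ins for a production retrieval stack):

```python
# Minimal RAG sketch: retrieve the best-matching business chunk and
# inject it into the prompt before calling an LLM. All data is illustrative.

KNOWLEDGE_BASE = [
    "The fitness center is open daily from 6am to 10pm.",
    "Yoga classes run Tuesdays and Thursdays at 7am.",
    "Late checkout is available until 2pm on request.",
]

def retrieve(query: str, chunks: list[str]) -> str:
    """Pick the chunk sharing the most words with the query (toy scorer)."""
    q = set(query.lower().split())
    return max(chunks, key=lambda c: len(q & set(c.lower().split())))

def build_prompt(query: str) -> str:
    """Inject retrieved business content into the prompt sent to the LLM."""
    context = retrieve(query, KNOWLEDGE_BASE)
    return f"Answer using ONLY this context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("When is the fitness center open?")
```

A real deployment replaces the toy word-overlap scorer with the indexing and retrieval techniques covered later in this document.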

DEFINITION: Fine Tuning (FT). Another approach to enhancing LLMs is fine tuning, which uses supervised or semi-supervised learning to train a 'vanilla' LLM on a vertical knowledge domain or specific tasks that are needed in a business workflow. Fine tuning is expensive and time-consuming. Assembli avoids repeated fine tuning efforts by putting the real-time dynamic aspect of the knowledge base in mature ML/NLP/IR and DBMS resources.


LLMs hallucinate and give wrong or inappropriate answers: Hallucinations are a natural part of the large language model landscape because of the non-deterministic way that transformer networks process information... it's organic, it's emergent, it's creative, it's adaptive... it's also prone to errors that business applications can't tolerate. It is not practically possible to eliminate factual errors to the level needed by business applications with RAG and FT alone.


To fix the hallucinations: Readily available ML/NLP/IR techniques can create a controlled set of concepts and vocabularies that serve as guardrails for LLM agents and chatbots.
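One form such a guardrail can take is a controlled vocabulary per knowledge domain, used to flag generated answers that drift into the wrong domain. A minimal sketch, with hypothetical domain names and vocabularies:

```python
# Guardrail sketch: a controlled vocabulary per knowledge domain, used to
# flag LLM output that drifts outside the expected concept set. Illustrative data.

DOMAIN_VOCAB = {
    "fitness_center": {"gym", "fitness", "treadmill", "hours", "open"},
    "spa": {"yoga", "massage", "sauna", "wellness"},
}

def off_domain_terms(answer: str, domain: str) -> set[str]:
    """Return answer terms that belong to OTHER domains (possible drift)."""
    tokens = set(answer.lower().split())
    foreign = set()
    for name, vocab in DOMAIN_VOCAB.items():
        if name != domain:
            foreign |= tokens & vocab
    return foreign

# A gym-hours answer that mentions yoga gets flagged for review.
flags = off_domain_terms("the gym offers yoga at 7am", "fitness_center")
```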


AI apps need a human loop: LLMs don't naturally lend themselves to being chopped up into discrete process steps, i.e., what you expect from transactional or analytical enterprise applications and database-driven flows.


To fix the human loop: Assembli orchestrates humans into AI app flow at just the right points in the end-to-end process so that problem escalation, human oversight, sentiment response, and shared problem solving happen when they should. Assembli does the human loop via a Knowledge Response Center platform that is very tightly coupled to the LLM workflow and chatbot activity.


LLMs don't learn in real time: They are largely static between major re-training efforts, i.e., you have to go through re-tuning, re-training, re-pre-training, etc. to update their knowledge, and even then the LLM is not an ideal stand-alone knowledge repository because there are limits to its retrieval modalities - it's not a SQL database that can handle a vast range of storage and retrieval modes.


To fix the learning problem: Assembli provides a Knowledge Response Center that builds a canonical, living, real-time store for human and bot knowledge. The Response Center helps in many ways. For instance, when a QA chatbot needs to be seeded with canonical answers, humans work together in the Knowledge Response Center to create an answer set. When a problem the bots can't handle arises, humans convene in the Response Center, solve the problem, and the solution becomes part of the Assembli knowledge corpus, accessible to bots and humans through a wide range of AI and conventional data management modalities.


AI: expensive to build and maintain: The cost of fine tuning can be huge, especially if it is repeated often due to changing business content, product features, business models, etc. For business-critical AI applications, the costs spiral when it comes to "the last mile" that gets you to a whole solution.


To fix the scaling and expense problem: Assembli overcomes the high costs and low quality of RAG/FT AI with a knowledge-graph-based Knowledge Response Center, which complements LLMs in many important ways. The knowledge graph DBMS repository and IR guardrail system can scale and adapt to rapidly changing business environments and customer needs very cost-effectively.

LLMs are language models... networks of language. To expect them to operate like mission-critical business apps that have mature structured data, databases, and business rules with explicit conditional logic is a mistake. It may be best to think of LLM capabilities like the skills of a brilliant personality who is not entirely trustworthy or predictable, but can be hugely productive with the right guardrails and safety nets. Assembli provides the guardrails and safety nets with well-architected ML/NLP/IR schemas, ontologies and flow logic.

The Plexgraf Solution

If AI-based business apps are to support customers reliably, conventional context augmentation (i.e., RAG) must be wrapped in an ML/NLP/IR architecture that guides the LLM into the correct knowledge domain. In the case of the hotel guest hearing about yoga when they wanted the gym hours, the IR layer acquires a set of domain vocabulary via very precise synonym matching and feeds it to the LLM along with other context and the user query so the answer is correct.
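The synonym-matching step in the hotel example can be sketched as follows. The synonym table, domain names, and vocabulary entries are all hypothetical illustrations:

```python
# Sketch of precise synonym matching: map user phrasing to a canonical
# domain so the right vocabulary is fed to the LLM. Data is illustrative.

SYNONYMS = {
    "gym": "fitness_center",
    "workout room": "fitness_center",
    "fitness center": "fitness_center",
    "yoga": "spa_programs",
}

DOMAIN_VOCAB = {
    "fitness_center": ["treadmills", "weights", "hours 6am-10pm"],
    "spa_programs": ["yoga Tue/Thu 7am", "massage by appointment"],
}

def resolve_domain(query: str):
    """Longest-synonym-first matching so 'workout room' beats 'room'."""
    q = query.lower()
    for phrase in sorted(SYNONYMS, key=len, reverse=True):
        if phrase in q:
            return SYNONYMS[phrase]
    return None

domain = resolve_domain("What are the gym hours?")
vocab = DOMAIN_VOCAB[domain]  # fed to the LLM as guardrail context
```

With the domain resolved to the fitness center, the yoga schedule never enters the prompt, so the gym-hours question gets a gym-hours answer.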

In addition to ML/NLP/IR enhancements for AI, there is a need to architect a human loop that brings humans into the agent flow at precisely the correct point in time. Hence, Plexgraf creates a reusable, repurposable, adaptive repository of augmented knowledge objects so that AIs and humans can reason together across this fabric in a flexible, open-ended way.

Solution Summary

Assembli prototype screen

Build a 'Brains and Bots' knowledge fabric

In any small or large organization, there exists a primary set of linguistically based domains of conceptual knowledge. These “knowledge networks” are specific to each industry, e.g., healthcare, manufacturing, finance, shipping, retail, energy, etc.

Within any specific knowledge domain there are subnetworks for knowledge niches that support granular operational workflows. For instance, in a healthcare claims-processing organization, the overall dictionary contains a wide range of medical and healthcare-finance terminology... but at the operational level, in a specific claims workflow such as quality improvement, a subnetwork of knowledge and language is needed to support the niche, i.e., Quality of Care initiatives have their own language nuances.

Unfortunately, LLM models like Gemini and ChatGPT are not great at getting the right level of abstraction. After all, they are just language networks, not bound by rule-based logic or even ML principles, i.e., statistics more rigorous than 'sampling the language model'.

Assembli captures corporate organizational knowledge and structures it so that the LLM can be guided into the correct knowledge domain and sub-domain.

Assembli Building Blocks

AI applications typically manifest as AI chatbots or intelligent proactive AI agents that answer questions and help with operational workflows such as sales, product support, medical issues, travel ticketing, content creation, etc.

QA Bots. Conversational QA app design, model selection, and prompt engineering. Sometimes API calls from your application to a public LLM are enough to support end-user conversation sessions with rich-text Q&A flows. The starting point for this exercise is to document the capabilities of the LLM in relation to the target corporate application. This level of work often involves enhanced prompt engineering using the corporate corpus.

AI Task Agents. Creation of semi-autonomous AI agents that can do more than answer questions, e.g., solve problems and then send email, run database queries, create reports, and make API calls.

Assembli is unique in that its answer bots and agents operate in a heterogeneous, data-rich environment that knits together:

Knowledge Response Center. AI-based customer and employee support needs pin-point 'fall-back' (also called 'no-match') functions that loop in humans for escalation, sentiment, and experience that lies outside the LLM realm. Assembli provides this with a "Knowledge Response Center" where experts and bots build a canonical knowledge repository that works hand-in-hand with the fully automated generative support bot activity.
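The fall-back routing can be sketched as a confidence gate: answers below a threshold are escalated to the human queue instead of being served. The threshold, queue, and message below are illustrative assumptions, not the Assembli implementation:

```python
# Fall-back ('no-match') sketch: when retrieval confidence is low, route the
# question to a human queue in the response center. Values are illustrative.

ESCALATION_QUEUE: list[str] = []
CONFIDENCE_THRESHOLD = 0.5

def answer_or_escalate(question: str, best_score: float, draft: str) -> str:
    """Serve the bot's draft only when retrieval confidence clears the bar."""
    if best_score >= CONFIDENCE_THRESHOLD:
        return draft
    ESCALATION_QUEUE.append(question)  # humans convene on this question
    return "Let me connect you with a specialist."

reply = answer_or_escalate("Is the rooftop pool heated?", 0.2, "draft answer")
```

The human-produced answer then flows back into the knowledge corpus so the next occurrence of the question clears the confidence bar automatically.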

Machine Learning & IR. Things don't bode well for organizations that try to solve all their customer and employee knowledge-support needs with LLM AI alone. What's really needed is a blend of LLM AI + machine learning + information retrieval technologies knitted together with bots and humans in an elegant brains+bots architecture, i.e., Assembli!

Target Applications for AI Augmented with ML/IR

Once LLMs have been augmented by ML/NLP/IR and given a 'living' knowledge repository that collects brain/bot learnings in real time, there's an ideal opportunity to build automated agents and task-automation apps that exploit the LLM + custom data. Here are some examples:

Each of these applications leverages the combination of the LLM's natural language understanding capabilities with the specific, contextual data of the company, providing tailored and efficient solutions to various business needs.

Quality and Safety layer

The Plexgraf multi-agent architecture addresses quality and safety issues with "Safety Sentry Agents" that conduct continuous monitoring and auditing of generative AI processes. The Safety Sentries work off their body of augmented safety and quality knowledge, which addresses these issues and more:

Bias and Hallucinations: AI systems are prone to factual errors and can inherit biases present in their training data, leading to unfair or sub-optimal outcomes. This is particularly concerning in sensitive areas like corporate confidentiality, research and development content, financial content and policy enforcement.

Data Privacy and Security: AI systems often rely on large amounts of personal data, raising concerns about privacy and data protection. There's also a risk of data being manipulated to skew AI decisions.

Lack of Explainability: Many AI models, especially deep learning systems, are often seen as "black boxes" due to their complexity, making it difficult to understand how they reach certain conclusions or decisions.

Robustness and Reliability: AI systems can be prone to errors or manipulation, such as adversarial attacks, where small, intentional changes to input data can lead to incorrect outputs.

Ethical and Legal Impacts: AI applications can have far-reaching impacts on society, including job displacement, surveillance concerns, and the potential for misuse in areas like autonomous weaponry.

In all the above cases, Plexgraf Safety Sentry agents comb through generated content, cross-checking in real time for anomalies, mistakes and harmful text.

Appendix 1

Taxonomy of Augmentations for AI Business Applications:

Normally, QA conversational application interfaces and agent task flows work on top of vanilla public LLMs (OpenAI GPT, Gemini, Anthropic, etc.), but with Assembli, QA chatbots and agents can reason across augmented LLMs that have content customized and extended by a wide range of ML/NLP/IR techniques. Business logic and structured content augmentation allows LLMs to take advantage of a company’s internal data and internal human knowledge experts, without compromising the privacy and security of that data.

How to Tame a Large Language Model (from simple to complex).

1) Conventional search. In this approach, the LLM chatbot or operational agent can do conventional search queries against public or private text bases (e.g., Wikipedia look-ups) during the course of a chatbot session. Info found this way is not a persistent part of the LLM’s base model.

2) Best Match Text Library. In addition to queries of conventional search services, vertical private data can be stored and indexed with BM25 or similar TF-IDF methods and made available for custom queries from bots and agents.
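BM25 itself is compact enough to sketch directly. The toy corpus below is illustrative; the constants k1=1.5 and b=0.75 are the commonly cited defaults:

```python
import math

# Minimal BM25 scorer over a tiny private corpus, as a sketch of the
# best-match text library idea. Corpus content is illustrative.

docs = [
    "refund policy for damaged goods".split(),
    "shipping times and tracking".split(),
    "refund and return shipping labels".split(),
]

def bm25_scores(query, docs, k1=1.5, b=0.75):
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    scores = [0.0] * N
    for term in query:
        n = sum(term in d for d in docs)               # document frequency
        idf = math.log((N - n + 0.5) / (n + 0.5) + 1)  # BM25+ style IDF
        for i, d in enumerate(docs):
            f = d.count(term)                          # term frequency
            denom = f + k1 * (1 - b + b * len(d) / avgdl)
            scores[i] += idf * f * (k1 + 1) / denom if f else 0.0
    return scores

scores = bm25_scores(["refund", "shipping"], docs)
best = max(range(len(docs)), key=lambda i: scores[i])  # doc matching both terms
```

Production systems would index the corpus once (e.g., via Lucene-family engines) rather than rescoring every document per query, but the ranking math is the same.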

3) Context Injection. Typical GPT and other chatbots continually look across all the prompt text a user enters to gain context during a session. Context injection allows the chatbot to incorporate new and relevant information into the conversation beyond what the user enters. This can include current events, private corporate data, in-depth research or any other data that can make the conversation more relevant and personalized. Context injection can be accomplished by searching an auxiliary content base in real time during a chatbot session and then feeding chunks of relevant info into the prompt window as background for the actual user query.
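A practical detail of context injection is that the retrieved chunks must fit in the prompt window. A minimal sketch, with an illustrative word-count budget standing in for a real token budget:

```python
# Context-injection sketch: pack retrieved chunks into the prompt up to a
# budget, then append the user query. Budget and format are illustrative.

def inject_context(query: str, chunks: list[str], budget_words: int = 40) -> str:
    """Pack chunks in priority order, stopping at the word budget."""
    packed, used = [], 0
    for chunk in chunks:  # assume chunks are pre-ranked by relevance
        n = len(chunk.split())
        if used + n > budget_words:
            break
        packed.append(chunk)
        used += n
    return "Background:\n" + "\n".join(packed) + f"\n\nUser: {query}"

prompt = inject_context(
    "What is our refund window?",
    ["Refunds are accepted within 30 days.", "Gift cards are non-refundable."],
)
```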

4) Vector Embedding. Text embedding stores enhance LLMs with entire documents that are mapped as vectors in a high-dimensional semantic space to capture semantic and syntactic relationships between the words or phrases. LLMs can access supplemental embedding content in the process of generative text exchanges with humans, based on vertical knowledge bases and specialized corporate know-how.
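The retrieval math behind embedding stores is nearest-neighbor search under cosine similarity. The sketch below substitutes toy bag-of-words vectors for real neural embeddings so it stays self-contained; the similarity computation is the part that carries over:

```python
import math
from collections import Counter

# Embedding-retrieval sketch: cosine similarity over vectors. Bag-of-words
# counts stand in for dense neural embeddings; corpus is illustrative.

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

corpus = ["warranty covers parts and labor", "holiday schedule for support desk"]
vectors = [embed(d) for d in corpus]

def nearest(query: str) -> str:
    qv = embed(query)
    return corpus[max(range(len(corpus)), key=lambda i: cosine(qv, vectors[i]))]
```

Swapping in a real embedding model changes `embed` but leaves the nearest-neighbor retrieval shape intact.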

5) Knowledge Graphs. A knowledge graph, when combined with a Language Model, serves as a complementary content store that enhances the model's capabilities, enabling “Rhizomatic” knowledge sharing platforms.

6) Structured Information Store. Knowledge graphs store information in a structured format, typically in the form of entities (like people, places, events) and relationships between these entities. This structured format is different from the way information is stored in an LLM, which is primarily through weights in a neural network based on textual data.

7) Enhanced Data Retrieval. When an LLM is integrated with a knowledge graph, it can access this structured information directly. This allows for more precise and accurate retrieval of factual data, as knowledge graphs are often curated and can be more up-to-date compared to the LLM's training data.
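The contrast between structured lookup and neural-network weights can be made concrete with a toy triple store. The entities and relations below are hypothetical examples:

```python
# Sketch of structured retrieval from a knowledge graph: facts stored as
# (subject, relation, object) triples, queried exactly. Data is illustrative.

TRIPLES = [
    ("fitness_center", "opens_at", "6am"),
    ("fitness_center", "closes_at", "10pm"),
    ("yoga_class", "held_on", "Tuesday"),
]

def query(subject: str, relation: str) -> list[str]:
    """Exact structured lookup: no sampling, so no hallucinated answers."""
    return [o for s, r, o in TRIPLES if s == subject and r == relation]

opening = query("fitness_center", "opens_at")
```

Because the answer is looked up rather than generated, updating a fact is a single edit to the graph, with no re-training involved.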

8) LLM Fine Tuning. LLMs can be fine tuned via reinforcement learning with supplemental content from private corporate knowledge stores and domain expert content. Fine tuning often involves labeling of data to provide categorization and feedback loops for the LLM. This is typically more intensive/expensive than RAG / embedding but less effort than training an LLM neural network from scratch.

The Plexgraf platform for federated brain/bot applications is called "Assembli". Think of it as StackOverflow (QA) meets Salesforce (business apps) meets vertical 'wikipedia-esque' knowledge stores.

LLM Context Augmentation Primer

In addition to ML/NLP/IR content-flow structure and a 'living' knowledge base for human-loop support, RAG is an important aspect of LLM applications. The primary goal of RAG is to provide LLMs with contextually relevant and factually accurate information, ensuring that the generated content meets the highest standards of quality and relevance.

To achieve this, the RAG system is divided into subsystems, each playing a crucial role in the overall process. The tools integral to the RAG system are not standalone entities; they interweave to form the subsystems that drive the RAG process.

Each tool fits within one of the following subsystems:

1) Index

2) Retrieval

3) Augment

These work together as an orchestrated flow that transforms a user’s query into a contextually rich and accurate response.
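The three subsystems above can be sketched as one orchestrated flow. Each function is a deliberately tiny stand-in (token-set index, overlap retrieval, template augmentation) for its production counterpart:

```python
# Index -> Retrieval -> Augment as one orchestrated flow. Toy stand-ins for
# each subsystem; documents and templates are illustrative.

def index(docs: list[str]) -> dict[str, set[str]]:
    """Index subsystem: map each document to its token set."""
    return {d: set(d.lower().split()) for d in docs}

def retrieve(query: str, idx: dict[str, set[str]]) -> str:
    """Retrieval subsystem: pick the document with most query overlap."""
    q = set(query.lower().split())
    return max(idx, key=lambda d: len(q & idx[d]))

def augment(query: str, context: str) -> str:
    """Augment subsystem: wrap retrieved context around the user query."""
    return f"Context: {context}\nQuestion: {query}"

idx = index(["our sla promises 99.9% uptime", "invoices are emailed monthly"])
question = "what uptime does the sla promise?"
prompt = augment(question, retrieve(question, idx))
```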

Doing RAG right:

Before building out a RAG system, it’s essential to familiarize yourself with the tools that make this process possible.

Each tool plays a specific role, ensuring that the RAG system operates efficiently and effectively:

Example Embedding Models:

  1. Word2Vec
  2. GloVe (Global Vectors for Word Representation)
  3. FastText
  4. ELMo (Embeddings from Language Models)
  5. GTE-Base (Graft Default)
  6. GTE-Large
  7. GTE-Small
  8. E5-Small
  9. MultiLingual
  10. RoBERTa (2022)
  11. MPNet V2
  12. Scibert Science-Vocabulary Uncased
  13. Longformer Base 4096
  14. Distilbert Base Uncased
  15. Bert Base Uncased
  16. MultiLingual BERT
  17. E5-Base
  18. LED 16K

Appendix 2

Plexgraf Capabilities:


N-tier SAAS architectures

ELK Stack (OpenSearch, Lucene, Solr, Nutch)

DBMS design

Transaction processing algorithms

Knowledge Community platforms

Data Architecture

The value of transformative analytics infrastructure


Enterprise Integration Patterns

Why Enterprise Integration Patterns?

Data Warehouse design (Kimball, Inmon)

Data warehouse design best practices

App Based OLAP








Low Code development

Google's AppSheet, etc.

Static code review

Dynamic code review


Agile Project Management


ML algorithms

LLM planning, testing, management

LLM Model pre-training, fine-tuning, embedding, prompt context, etc.


AI proof-of-concept sandboxes

Classification, summarization, Q/A, helpdesk, knowledge collaboration, learning systems and other key use cases

Semantic embedding content for LLMs

Transcend the limits of the context window!

Statistical inference models for ML/AI apps

AI Topic Modeling (LDA, BERTopic, LLMs)


Natural Language Processing / Information Retrieval

Natural Language Processing applications

Search and retrieval technology


The role of ontology and info architecture in AI



Knowledge Graphs

Conceptual Graphs

Information mapping and the conceptual graph

Business Process Modeling



Qualitative user research - Qual

Quantitative usability testing - Quant

Rapid UI prototyping

Stateful hi-res prototypes for complex, interactive workflows

Data Visualization design

Authoritative charts, tables and visuals for analytics and data-driven apps.

Cooper Scenario-based task analysis


Journey maps, personas, mental models

Design systems

Minimum Viable Product Planning

Agile / UX interface


Risk Analysis

Risk Types: Security, Financial, Operational, Compliance, Reputational

Source code security analysis

Application security analysis

AI based security analysis

AI red/blue team design



WCAG / 508 / ADA

Zero Trust networks

Data Governance (Data Governance Audits)

Data governance audit checklist


Technical Evangelists.

Technical Marketing and Sales

White paper and presentation content creation

R&D projects


Please schedule a brainstorm session about your AI, NLP, ML, IR, chatbot/agent product and infrastructure challenges.

Free consultation:   Schedule a call with Plexgraf

Let's brainstorm exponential opportunities together


Back to top