Bots With the Facts Straight


Assembli is a reference platform for researching guardrail strategies that target conversational apps and Large Language Model AIs.

There is growing sentiment that AI chatbots based on large language models are not ready for business prime time in terms of reliability and information safety. In reality, the ML/NLP technology that could mitigate these business hazards already exists; it is just not widely used or understood.

"The future is already here, it's just not evenly distributed" - William Gibson

Assembli prototype screen

The Plexgraf Assembli testbed supports the study of AI-driven customer and employee support apps, with the goal of delivering reliable workflows that orchestrate AI/human collaboration: "brain/bot" learning loops.

Please schedule a brainstorm session about your AI product or infrastructure challenges.

Let's think about exponential opportunities together: Schedule a free call with Plexgraf

Assembli prototype screen

Assembli can be thought of as a multi-agent ‘rhizomatic’ platform for brain / bot collaboration. It creates a persistent repurposable knowledge store that lives alongside the AI LLM, enabling precision hand-offs between humans and bots.

Assembli uniquely addresses cross-discipline problem-solving in the AI age with a platform approach grounded in over 50 years of combined experience in information retrieval (IR) engineering, natural language processing (NLP) and conversational interfaces.

Business Context

In any small or large organization, there exists a primary set of linguistically based domains of conceptual knowledge. These “knowledge networks” are specific to each industry, e.g., healthcare, manufacturing, finance, shipping, retail, energy, etc.

Within any specific knowledge domain there are subnetworks for knowledge niches that support granular operational workflows. For instance, in healthcare claims processing the overall organization dictionary contains a wide range of medical and healthcare finance terminology, but a specific operational workflow such as quality improvement needs its own subnetwork of knowledge and language to support a niche like Quality of Care initiatives, which have their own linguistic nuances.

The challenge for knowledge rhizome engineers is to create a reusable, repurposable, adaptive repository of augmented knowledge objects so that AIs and humans can reason together across this fabric in a flexible, open-ended way without vendor or technology lock-in.

The Plexgraf platform for supporting cross-discipline collaboration in the context of AI reasoning is called Assembli.

Assembli Building Blocks

AI applications typically manifest as chatbots or proactive AI agents that answer questions and help with operational workflows such as sales, product support, medical issues, travel ticketing and content creation.

QA Bots. Conversational QA app design, model selection and prompt engineering. Sometimes API calls from your application to a public LLM are enough to support end-user conversation sessions with rich-text Q&A flows. The starting point for this exercise is to document the capabilities of the LLM in relation to the target corporate application. This level of work often involves enhanced prompt engineering using the corporate corpus.

AI Task Agents. Creation of semi-autonomous AI agents that can do more than answer questions, e.g., solve problems and then send email, run database queries, create reports or make API calls.

Assembli is unique in that its answer bots and agents operate in a heterogeneous, data-rich environment that knits together public LLM knowledge with custom content stores.

The following section describes how public LLM knowledge bases are augmented by custom content stores.

How to Tame a Large Language Model

(From simple to complex)

Normally, QA conversational application interfaces and agent task flows work on top of vanilla public LLMs (OpenAI GPT, Bard, Anthropic, etc.), but with Assembli, QA chatbots and agents can reason across augmented LLMs whose content has been customized and extended by a wide range of techniques. Content augmentation allows LLMs to take advantage of a company’s internal data and internal human knowledge experts without compromising the privacy and security of that data.

1) Conventional search. In this approach, the LLM chatbot or operational agent can run conventional search queries against public or private text bases (e.g., Wikipedia look-ups) during the course of a chatbot session. Information found this way is not a persistent part of the LLM’s base model.

2) Best-Match Text Library. In addition to queries of conventional search services, vertical private data can be stored and indexed with BM25 or similar TF-IDF methods and made available for custom queries from bots and agents.
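To make the BM25 idea concrete, here is a minimal self-contained sketch of the classic Okapi BM25 scoring formula over a toy corpus. The documents, query and parameter values (k1=1.5, b=0.75) are illustrative assumptions, not Assembli internals; a production text library would use an engine such as Lucene or OpenSearch.

```python
import math
from collections import Counter

# Toy corpus standing in for a private vertical text base (invented examples).
docs = [
    "claim denied due to missing prior authorization",
    "quality of care audit for inpatient claims",
    "prior authorization workflow for imaging claims",
]
tokenized = [d.split() for d in docs]
N = len(tokenized)
avgdl = sum(len(d) for d in tokenized) / N
df = Counter(t for d in tokenized for t in set(d))  # document frequency

def bm25_score(query, doc, k1=1.5, b=0.75):
    """Okapi BM25: sum per-term IDF weighted by saturated, length-normalized TF."""
    tf = Counter(doc)
    score = 0.0
    for term in query.split():
        if term not in tf:
            continue
        idf = math.log(1 + (N - df[term] + 0.5) / (df[term] + 0.5))
        num = tf[term] * (k1 + 1)
        den = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
        score += idf * num / den
    return score

query = "prior authorization claims"
ranked = sorted(range(N), key=lambda i: bm25_score(query, tokenized[i]),
                reverse=True)  # doc indices, best match first
```

The third document matches all three query terms and is also the shortest, so it ranks first.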

3) Context Injection. Typical GPT and other chatbots continually look across all the prompt text a user enters to gain context during a session. Context injection allows the chatbot to incorporate new and relevant information into the conversation beyond what the user enters. This can include current events, private corporate data, in-depth research or any other data that can make the conversation more relevant and personalized. Context injection can be accomplished by searching an auxiliary content base in real time during a chatbot session and then feeding chunks of relevant info into the prompt window as background for the actual user query.
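The context-injection flow described above can be sketched in a few lines: search an auxiliary content base at query time, then feed the best-matching chunks into the prompt window as background for the user's actual question. The chunks, the keyword-overlap retriever and the prompt template below are all simplified stand-ins for illustration.

```python
# Hypothetical auxiliary content base (e.g., private corporate data).
knowledge_chunks = [
    "Policy 12-B: refunds are issued within 10 business days.",
    "Shipping to EU countries takes 3-5 business days.",
]

def retrieve(query, chunks, k=1):
    """Naive keyword-overlap retrieval standing in for a real search service."""
    def overlap(chunk):
        return len(set(query.lower().split()) & set(chunk.lower().split()))
    return sorted(chunks, key=overlap, reverse=True)[:k]

def build_prompt(user_query):
    """Inject retrieved chunks ahead of the user's question."""
    context = "\n".join(retrieve(user_query, knowledge_chunks))
    return (
        "Answer using only the background below.\n"
        f"Background:\n{context}\n"
        f"Question: {user_query}"
    )

prompt = build_prompt("How long do refunds take?")
```

The resulting `prompt` string is what gets sent to the LLM in place of the raw user query, so the model answers from the injected background rather than from its training data alone.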

4) Vector Embedding. Text embedding stores enhance LLMs with entire documents mapped as vectors in a high-dimensional semantic space that captures semantic and syntactic relationships between words and phrases. LLMs can draw on this supplemental embedding content during generative text exchanges with humans, grounding answers in vertical knowledge bases and specialized corporate know-how.
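At the core of any embedding store is nearest-neighbor search by cosine similarity. The sketch below uses made-up 3-dimensional vectors (real embedding models emit hundreds or thousands of dimensions, and a vector database would replace the linear scan), but the similarity math is the same.

```python
import math

# Toy 3-D "embeddings"; texts and vectors are invented for illustration.
store = {
    "refund policy":   [0.9, 0.1, 0.0],
    "shipping times":  [0.1, 0.8, 0.2],
    "warranty claims": [0.7, 0.2, 0.3],
}

def cosine(a, b):
    """Cosine similarity: dot product over the product of magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def nearest(query_vec):
    """Return the stored text whose vector is most similar to the query."""
    return max(store, key=lambda k: cosine(query_vec, store[k]))

best = nearest([0.85, 0.15, 0.05])  # a query vector close to "refund policy"
```

In a real deployment both documents and queries are run through the same embedding model, and the top-k neighbors (not just one) are injected into the prompt.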

5) Knowledge Graphs. A knowledge graph, when combined with a Language Model, serves as a complementary content store that enhances the model's capabilities, enabling “Rhizomatic” knowledge sharing platforms.

6) Structured Information Store. Knowledge graphs store information in a structured format, typically in the form of entities (like people, places, events) and relationships between these entities. This structured format is different from the way information is stored in an LLM, which is primarily through weights in a neural network based on textual data.

7) Enhanced Data Retrieval. When an LLM is integrated with a knowledge graph, it can access this structured information directly. This allows for more precise and accurate retrieval of factual data, as knowledge graphs are often curated and can be more up-to-date compared to the LLM's training data.
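The contrast between points 6 and 7 and free-text retrieval can be shown with a tiny triple store: facts live as explicit (subject, relation, object) entries, so lookups are exact rather than generated. The entities and facts below are invented for illustration; a real knowledge graph would sit behind a graph database and query language such as SPARQL or Cypher.

```python
# Tiny entity-relationship store standing in for a curated knowledge graph.
triples = [
    ("Acme Corp", "headquartered_in", "Denver"),
    ("Acme Corp", "founded", "1998"),
    ("Denver", "located_in", "Colorado"),
]

def lookup(entity, relation):
    """Direct structured retrieval: exact curated facts, no generation involved."""
    return [obj for subj, rel, obj in triples
            if subj == entity and rel == relation]

# The retrieved fact can then be injected into the LLM prompt to ground its answer.
fact = lookup("Acme Corp", "founded")
```

Because the store is curated and updatable, this retrieval path stays accurate even when the LLM's training data is stale.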

8) LLM Fine-Tuning. LLMs can be fine-tuned via reinforcement learning with supplemental content from private corporate knowledge stores and domain-expert content. Fine-tuning often involves labeling data to provide categorization and feedback loops for the LLM. This is typically more intensive and expensive than RAG/embedding approaches, but far less effort than training an LLM neural network from scratch.
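The data-labeling step usually reduces to assembling prompt/completion pairs in a line-delimited JSON (JSONL) file, which is the input format many fine-tuning pipelines accept. The examples below are hypothetical; field names vary by vendor, so check the target API's expected schema.

```python
import json

# Hypothetical labeled examples drawn from a private corporate knowledge store.
examples = [
    {"prompt": "What is covered under plan QC-1?",
     "completion": "Plan QC-1 covers preventive care and annual screenings."},
    {"prompt": "Who approves quality-of-care exceptions?",
     "completion": "The clinical review board approves QoC exceptions."},
]

# JSONL: one JSON object per line, ready to upload to a fine-tuning job.
jsonl = "\n".join(json.dumps(ex) for ex in examples)
```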

The Plexgraf platform for federated brain/bot applications is called "Assembli". Think of it as StackOverflow (Q&A) meets Salesforce (business apps) meets vertical, wikipedia-esque knowledge stores.

Intelligent Agents

Once LLMs have been augmented by a company’s internal data, there’s an ideal opportunity to build agents and task automation apps that exploit the LLM + custom data.

Integrating a large language model (LLM) chatbot with a company's internal data via vector embeddings can lead to the development of numerous innovative AI agents and task automation applications.

These applications leverage the combination of the LLM's natural language understanding with the specific, contextual data of the company, providing tailored and efficient solutions to various business needs.

Quality and Safety layer

The Plexgraf multi-agent architecture addresses quality and safety issues with "Safety Sentry Agents" that conduct continuous monitoring and auditing of generative AI processes. The Safety Sentries work from their body of augmented safety and quality knowledge, which addresses these issues and more:

Bias and Hallucinations: AI systems are prone to factual errors and can inherit biases present in their training data, leading to unfair or sub-optimal outcomes. This is particularly concerning in sensitive areas like corporate confidentiality, research and development content, financial content and policy enforcement.

Data Privacy and Security: AI systems often rely on large amounts of personal data, raising concerns about privacy and data protection. There's also a risk of data being manipulated to skew AI decisions.

Lack of Explainability: Many AI models, especially deep learning systems, are often seen as "black boxes" due to their complexity, making it difficult to understand how they reach certain conclusions or decisions.

Robustness and Reliability: AI systems can be prone to errors or manipulation, such as adversarial attacks, where small, intentional changes to input data can lead to incorrect outputs.

Ethical and Legal Impacts: AI applications can have far-reaching impacts on society, including job displacement, surveillance concerns, and the potential for misuse in areas like autonomous weaponry.

In all the above cases, Plexgraf Safety Sentry agents comb through generated content, cross-checking in real time for anomalies, mistakes and harmful text.

RAG Architecture and Subsystems

The primary goal of RAG is to provide LLMs with contextually relevant and factually accurate information, ensuring that the generated content meets the highest standards of quality and relevance.

To achieve this, the RAG system is divided into subsystems, each playing a crucial role in the overall process. The tools integral to the RAG system are not standalone entities; they interweave to form the subsystems that drive the RAG process.

Each tool fits within one of the following subsystems:

1) Index

2) Retrieval

3) Augment

These work together as an orchestrated flow that transforms a user’s query into a contextually rich and accurate response.
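The three subsystems named above can be sketched end-to-end in a few lines. Each stage below is a deliberately naive stand-in (a bag-of-words index, overlap-based retrieval, string-template augmentation) with invented data; real deployments would use an inverted index or vector store, a ranking model, and a structured prompt format.

```python
documents = [
    "Invoices are processed within 5 business days.",
    "Support tickets are triaged by severity, then age.",
]

# 1) Index: map each document to a bag of lowercase terms.
index = {i: set(doc.lower().split()) for i, doc in enumerate(documents)}

# 2) Retrieval: pick the document with the greatest term overlap with the query.
def retrieve(query):
    terms = set(query.lower().split())
    best = max(index, key=lambda i: len(index[i] & terms))
    return documents[best]

# 3) Augment: wrap the retrieved context around the user query for the LLM.
def augment(query):
    return f"Context: {retrieve(query)}\nQuestion: {query}"

prompt = augment("How fast are invoices processed?")
```

The output of `augment` is the contextually enriched prompt the orchestrated flow hands to the LLM.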

What You Need for RAG Implementation:

Before building out a RAG system, it’s essential to familiarize yourself with the tools that make this process possible.

Each tool plays a specific role, ensuring that the RAG system operates efficiently and effectively.

Example Embedding Models:

  1. Word2Vec
  2. GloVe (Global Vectors for Word Representation)
  3. FastText
  4. ELMo (Embeddings from Language Models)
  5. GTE-Base (Graft default)
  6. GTE-Large
  7. GTE-Small
  8. E5-Small
  9. MultiLingual
  10. RoBERTa (2022)
  11. MPNet V2
  12. SciBERT SciVocab Uncased
  13. Longformer Base 4096
  14. DistilBERT Base Uncased
  15. BERT Base Uncased
  16. Multilingual BERT
  17. E5-Base
  18. LED 16K


Plexgraf Capabilities:


N-tier SAAS architectures

ELK stack and related search technology (OpenSearch, Lucene, Solr, Nutch)

DBMS design

Transaction processing algorithms

Knowledge Community platforms

Data Architecture

The value of transformative analytics infrastructure


Enterprise Integration Patterns

Why Enterprise Integration Patterns?

Data Warehouse design (Kimball, Inmon)

Data warehouse design best practices

App Based OLAP

Low Code development

Google's AppSheet, etc.

Static code review

Dynamic code review


Agile Project Management


ML algorithms

LLM planning, testing, management

LLM Model pre-training, fine-tuning, embedding, prompt context, etc.


AI proof-of-concept sandboxes

Classification, summarization, Q/A, helpdesk, knowledge collaboration, learning systems and other key use cases

Semantic embedding content for LLMs

Transcend the limits of the context window!

Statistical inference models for ML/AI apps

AI Topic Modeling (LDA, BERTopic, LLMs)


Natural Language Processing / Information Retrieval

Natural Language Processing applications

Search and retrieval technology


The role of ontology and info architecture in AI



Knowledge Graphs

Conceptual Graphs

Information mapping and the conceptual graph

Business Process Modeling



Qualitative user research - Qual

Quantitative usability testing - Quant

Rapid UI prototyping

Stateful hi-res prototypes for complex, interactive workflows

Data Visualization design

Authoritative charts, tables and visuals for analytics and data-driven apps.

Cooper Scenario-based task analysis


Journey maps, personas, mental models

Design systems

Minimum Viable Product Planning

Agile / UX interface


Risk Analysis

Risk Types: Security, Financial, Operational, Compliance, Reputational

Source code security analysis

Application security analysis

AI based security analysis

AI red/blue team design



WCAG / 508 / ADA

Zero Trust networks

Data Governance (Data Governance Audits)

Data governance audit checklist


Technical Evangelists

Technical Marketing and Sales

White paper and presentation content creation

R&D projects


Please schedule a brainstorm session about your AI, NLP, ML, IR, chatbot/agent product and infrastructure challenges.

Free consultation:   Schedule a call with Plexgraf

Let's brainstorm exponential opportunities together

