Topic Community platforms and Group Intelligence software can be deployed with popular open-source software and cloud services: Apache, ElasticSearch, Hadoop, Spark, AWS LAMP stacks, Nginx, Lime, React, etc.
Take ElasticSearch, for instance. On one level, ElasticSearch is a widely used engine for information retrieval applications, faceted search, log analysis, etc. But ES is also used as an engine for advanced analytics in healthcare, e-commerce, telecommunications, manufacturing and other sectors, analyzing structured and semi-structured content with machine learning (ML) and computational text processing.
ElasticSearch, Spark and related analytical and machine learning platforms deliver Information Retrieval and Natural Language Processing (NLP) capabilities that can organize, query and socialize content that is formally published, web crawled, user-generated or operationally created in structured, unstructured or semi-structured formats.
Key capabilities
Underlying algorithms and services:
When big data technologies are combined with group intelligence software and topic-based knowledge exchanges, information retrieval (IR), text analytics (TA) and machine learning (ML) can be used to pre-digest vast amounts of structured and unstructured data, which can then be continually fed into collaborative knowledge workflows in a semantically accessible and familiar form.
ES/Hadoop/Spark/Python can integrate IR / TA / ML capabilities with existing business platforms, including collaboration suites, Business Intelligence software, CRM, SFA, legal systems and content management (CMS) applications.
Making mountains of content understandable
Unstructured Data Analysis
Capabilities include semantic analysis of unstructured or "semi-structured" text content such as web pages, documents, social media, research papers, reports, medical records, work logs and forms, RDF triplestores, and any free-form text.
Structured Data Analysis
Semantic analysis of structured data, such as that found in relational, BI or flat-file databases, is also supported.
In a traditional information retrieval application, documents are indexed and weighted with TF/IDF, and queries are then run against the index to get a ranked search results listing.
Wikipedia on TF/IDF search technology
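To make the traditional flow concrete, here is a minimal sketch of TF/IDF ranking in plain Python. It is an illustration only, not how ElasticSearch scores documents internally: the function names, the whitespace tokenizer, the smoothed IDF formula and the sample documents are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


def build_index(docs):
    """Return per-document term counts and per-term document frequencies."""
    term_counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())  # each document counts once per term
    return term_counts, doc_freq


def search(query, term_counts, doc_freq):
    """Rank documents by the sum of tf * idf over the query terms."""
    n_docs = len(term_counts)
    scores = {}
    for doc_id, counts in term_counts.items():
        score = 0.0
        for term in tokenize(query):
            tf = counts[term]
            if tf:
                # Smoothed IDF (one of several common variants).
                idf = math.log(1 + n_docs / doc_freq[term])
                score += tf * idf
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical sample documents.
docs = {
    "a": "patient reports chest pain",
    "b": "invoice for chest of drawers",
    "c": "patient discharge summary",
}
term_counts, doc_freq = build_index(docs)
results = search("patient chest pain", term_counts, doc_freq)
print(results[0])  # "a" matches all three query terms and ranks first
```

Rare terms such as "pain" contribute more to the score than terms shared by several documents, which is the intuition behind TF/IDF weighting.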
ElasticSearch can support a very wide range of TF/IDF applications and can also do the opposite: index a large set of queries (e.g., rules, business logic, metadata structures, exploratory categorization routines) and then run incoming documents and structured content against the indexed queries.
This so-called "reverse indexing" approach makes it possible to quickly parse and process a large number of heterogeneous documents, papers, research notes, transaction records, annotations, social media content, and unstructured or semi-structured text records, looking for categories, topics, tags, and fuzzy emergent patterns.
More importantly, the engine helps knowledge workers capture and share the intelligence embedded in these reusable queries.
In IR/TA terms, the underlying mechanism is called percolation.
ElasticSearch reverse indexing document percolation
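The idea of storing queries and matching incoming documents against them can be sketched in a few lines of plain Python. This is a toy model of percolation under simplifying assumptions, not the ElasticSearch implementation: the class name `ToyPercolator`, the "all required terms must appear" rule and the sample queries are hypothetical.

```python
class ToyPercolator:
    """Stores named queries, then reports which ones match a document."""

    def __init__(self):
        # Maps a query name to the set of terms a document must contain.
        self.queries = {}

    def register(self, name, required_terms):
        # "Index" a query: here a query is just a set of required terms.
        self.queries[name] = {term.lower() for term in required_terms}

    def percolate(self, document):
        # Naive tokenizer (assumption): lowercase and split on whitespace.
        tokens = set(document.lower().split())
        # A query matches when all of its terms occur in the document.
        return sorted(
            name for name, terms in self.queries.items() if terms <= tokens
        )


percolator = ToyPercolator()
percolator.register("cardiology", ["chest", "pain"])
percolator.register("billing", ["invoice"])

matches = percolator.percolate("Patient reports chest pain after exercise")
print(matches)  # ['cardiology']
```

Because the queries live in the index as named, reusable objects, a rule written by one analyst can tag or route documents for everyone else.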
With the Python API to ElasticSearch, we can write software that extends these IR/TA capabilities into probabilistic machine learning, creating new forms of social knowledge sharing applications in finance, insurance, intelligence, marketing, publishing, e-commerce and healthcare.
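As a rough sketch of what such Python code works with, the helpers below build the JSON bodies used by the ElasticSearch percolator: a mapping with a field of type `percolator`, a stored query document, and a `percolate` search request. The helper names, the `body` and `tag` field names and the sample text are assumptions for illustration. How the bodies are sent to a cluster depends on the client version, so no client calls are shown.

```python
import json


def percolator_mapping(text_field="body"):
    # Index mapping: "query" holds stored queries, text_field holds the
    # text that those queries will be evaluated against.
    return {
        "mappings": {
            "properties": {
                "query": {"type": "percolator"},
                text_field: {"type": "text"},
                "tag": {"type": "keyword"},  # hypothetical metadata field
            }
        }
    }


def stored_query(phrase, tag, text_field="body"):
    # A document whose "query" field is itself a search query.
    return {"query": {"match": {text_field: phrase}}, "tag": tag}


def percolate_request(text, text_field="body"):
    # Search body asking which stored queries match this document.
    return {
        "query": {
            "percolate": {
                "field": "query",
                "document": {text_field: text},
            }
        }
    }


mapping = percolator_mapping()
rule = stored_query("claim denied", tag="insurance-escalation")
request = percolate_request("The claim was denied after review")
print(json.dumps(request, indent=2))
```

The hits returned for a `percolate` request are the stored query documents that match, so metadata such as the hypothetical `tag` field can carry the category or routing decision attached to each rule.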
Please see companion sites for more portfolio and project content:
Let's brainstorm exponential opportunities and applications: