Product · 2023

Conversational AI & Topic-Modeling R&D Service

A social storytelling / lead-gen platform

Overview

An AI research-and-development workbench exploring conversational agents, entity-aware search, and topic modeling over storytelling data for a social platform. It evaluated both hosted (OpenAI) and local (GPT4All / LLaMA) models behind a lightweight service.

Why It Exists

The platform wanted to understand how LLMs could power conversational discovery and automatic organization of user-generated stories. This codebase served as the sandbox to test approaches before productizing any of them.

What We Built

A Python service (Dockerized, with a Flask layer) containing multiple build tracks: LangChain- and LlamaIndex-based retrieval agents; a custom agents/ package with named-entity, fuzzy, and entity-search modules; BERTopic notebooks for topic clustering; Autolabel-driven dataset labeling; and side-by-side trials of OpenAI APIs versus locally hosted GPT4All/LLaMA models. Working CSV/JSONL datasets and notebooks document the exploration.
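As an illustration of the fuzzy entity-search idea, here is a minimal sketch using only the Python standard library. It is not the project's agents/ code: the function names, the 0.75 threshold, and the sample entities are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so comparisons ignore casing and spacing."""
    return " ".join(text.lower().split())


def fuzzy_entity_search(query, entities, threshold=0.75):
    """Return (entity, score) pairs whose similarity to the query meets the threshold.

    Results are sorted best match first. The threshold is an illustrative
    default, not a tuned value.
    """
    q = normalize(query)
    scored = []
    for entity in entities:
        score = SequenceMatcher(None, q, normalize(entity)).ratio()
        if score >= threshold:
            scored.append((entity, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical entity list; a real run would load names extracted from stories.
ENTITIES = ["Golden Gate Bridge", "Brooklyn Bridge", "Grand Canyon"]

# A misspelled query still resolves to the intended entity.
matches = fuzzy_entity_search("golden gate brige", ENTITIES)
no_matches = fuzzy_entity_search("zzz", ENTITIES)
```

Character-level similarity tolerates typos but has no notion of meaning, which is one reason to pair it with named-entity extraction or embedding-based retrieval.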

Technologies & Approach

LangChain 0.0.2x and LlamaIndex for orchestration and retrieval; OpenAI plus GPT4All/LLaMA for hosted-vs-local comparison; BERTopic for unsupervised topic discovery; Autolabel for LLM-assisted data labeling; Flask and Docker Compose for packaging the build service.
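A hosted-vs-local comparison can be framed as running the same prompts through interchangeable backends and recording each reply with its latency. The sketch below assumes each backend is wrapped as a plain function from prompt to text; the two stub functions are placeholders that stand in for real OpenAI and GPT4All/LLaMA calls and are not part of either library.

```python
import time


def compare_backends(prompts, backends):
    """Run every prompt through every backend and collect reply and wall-clock time.

    `backends` maps a label to a callable that takes a prompt string and
    returns a reply string.
    """
    rows = []
    for prompt in prompts:
        for name, generate in backends.items():
            start = time.perf_counter()
            reply = generate(prompt)
            elapsed = time.perf_counter() - start
            rows.append(
                {"backend": name, "prompt": prompt, "reply": reply, "seconds": elapsed}
            )
    return rows


# Placeholder backends (hypothetical): swap in real client calls here.
def fake_hosted(prompt):
    return f"[hosted] {prompt}"


def fake_local(prompt):
    return f"[local] {prompt}"


rows = compare_backends(
    ["Find stories about hiking"],
    {"hosted": fake_hosted, "local": fake_local},
)
for row in rows:
    print(f"{row['backend']:>6}  {row['seconds']:.6f}s  {row['reply']}")
```

Keeping the backends behind one function signature means the comparison loop does not change when a model is added or swapped.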

Outcome / Impact

Validated which LLM patterns and model-hosting trade-offs were viable for conversational discovery and content organization, and produced reusable building blocks (entity search, topic modeling, auto-labeling) for downstream product work. The project was scoped as applied AI/ML R&D rather than a production feature.

Capabilities Demonstrated

  • Rapid LLM application building with LangChain and LlamaIndex
  • Hosted vs. local model evaluation (OpenAI, GPT4All, LLaMA)
  • Topic modeling and LLM-assisted data labeling at dataset scale
  • Entity-aware and fuzzy retrieval over unstructured content