MongoDB - Arcee Quickstart
The MongoDB - Arcee Quickstart is a project aimed at facilitating the rapid and straightforward deployment of AI-driven applications utilizing MongoDB Atlas and the Arcee models. It offers scripts and configurations to streamline and automate the setup process, integrating MongoDB Atlas for data storage and Arcee models for AI functionalities.
Table of Contents
- Overview
- System Architecture
- Components
- Installation & Deployment
- Configuration
- Usage
- API Reference
- Security Considerations
- Monitoring & Logging
- Troubleshooting
- Development Guide
- Maintenance & Operations
1. Overview
The MongoDB - Arcee Quickstart is an end-to-end technology stack for rapidly developing and deploying AI-powered applications. It combines MongoDB Atlas, for scalable data storage and vector search, with AWS SageMaker and Arcee.ai's language models for inference. The whole system is exposed through a web-based user interface for interaction and management.
Key features include:
- Advanced vector search capabilities for nuanced, contextual information retrieval
- Seamless integration with AWS SageMaker for efficient AI model inference
- Highly intuitive chat interface for natural language querying and streamlined file uploads
- Scalable microservices architecture designed for optimal performance and resource utilization
- Robust data ingestion and processing pipeline capable of handling diverse data types
- Multi-modal file processing for comprehensive data analysis
- Real-time AI interactions for immediate insights and responses
- Stringent security measures to ensure the integrity and confidentiality of all data handled within the system
This system empowers developers to create sophisticated AI applications that can understand context, process natural language queries, and provide intelligent responses based on ingested data. This makes it an ideal solution for a wide range of applications, from advanced customer support systems and intelligent document analysis tools to complex research assistants and innovative educational platforms.
2. System Architecture
This system is built on a robust and flexible microservices architecture, comprising three primary services that work in concert to deliver a seamless AI-powered application experience:
- UI Service (Port 7860): This service forms the front-end of the MAAP application, providing an intuitive and responsive user interface for interaction. It serves as the primary point of contact for end-users, allowing them to input queries, upload files, and view AI-generated responses.
- Main Service (Port 8000): Acting as the brain of the system, the Main Service handles the core application logic, manages database queries, and orchestrates AI model interactions. It processes user inputs received from the UI Service, retrieves relevant information from the database, and coordinates with AWS SageMaker for AI model inferences.
- Loader Service (Port 8001): This service is responsible for managing file uploads and orchestrating the data ingestion process. It handles the complex task of processing various file formats, extracting relevant information, and preparing the data for storage in MongoDB Atlas.
These microservices interact seamlessly with MongoDB Atlas, which serves as the primary data store and provides powerful vector search capabilities. The architecture also integrates with AWS SageMaker, leveraging its scalable infrastructure for AI model hosting and inference.
Data flow within the system:
- User inputs (text queries or file uploads) are initially received by the UI Service.
- For document processing, the Loader Service takes over, utilizing AWS Bedrock and the Titan Embeddings model to generate vector representations of the content.
- These vector embeddings are then stored in MongoDB Atlas, enabling fast and accurate similarity searches.
- When a query is processed, the Main Service coordinates the retrieval of relevant information from MongoDB Atlas and the generation of appropriate responses.
- For more complex language understanding and generation tasks, the Main Service interacts with the AI models hosted on AWS SageMaker, specifically leveraging Arcee.ai's advanced language models.
This architecture ensures high scalability, allowing the system to handle increasing loads by scaling individual components as needed. It also provides flexibility, enabling easy updates or replacements of specific services without affecting the entire system.
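The embedding step in the data flow above can be sketched as follows. This is a minimal illustration, not the Loader Service's actual code: the function name `embed_text` and the model ID `amazon.titan-embed-text-v1` are assumptions, and the quickstart may use a different Titan version or call Bedrock through a wrapper library.

```python
import json
from typing import Any, List

# Assumption: a Titan text embeddings model ID; the quickstart may pin another version.
TITAN_MODEL_ID = "amazon.titan-embed-text-v1"


def embed_text(bedrock_client: Any, text: str, model_id: str = TITAN_MODEL_ID) -> List[float]:
    """Return the embedding vector for `text` using a Bedrock runtime client."""
    response = bedrock_client.invoke_model(
        modelId=model_id,
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"inputText": text}),
    )
    # The response body is a stream; read it and decode the JSON payload.
    payload = json.loads(response["body"].read())
    return payload["embedding"]


# Usage (requires AWS credentials and Bedrock model access):
#   import boto3
#   client = boto3.client("bedrock-runtime", region_name="us-east-1")
#   vector = embed_text(client, "MongoDB Atlas supports vector search.")
```

Passing the client in as an argument keeps the function easy to test with a stub and leaves region and credential handling to the caller.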
Document Storage and Segmentation in MongoDB
Key Fields in MongoDB
Each document uploaded by a user is stored in MongoDB with the following fields:
- _id: MongoDB-generated unique identifier for the document.
- userId: Unique identifier for the user (e.g., email address or UUID). This field ensures all documents are segmented and associated with their respective users.
- document_text: The full text extracted from the uploaded document.
- document_embedding: Vector embeddings generated from the document_text for similarity-based searches.
- link_texts: Anchor texts of hyperlinks within the document.
- link_urls: Corresponding URLs for the hyperlinks.
- languages: Detected language(s) of the document content.
- filetype: The type of file uploaded (e.g., text/html).
- url: The source URL of the document (if available).
- category: The classification or type of the document (e.g., CompositeElement).
- element_id: A unique identifier for the specific content element.
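To make the field list concrete, here is a hedged sketch of how one stored record might be assembled in Python. The helper `build_document` and its defaults are hypothetical; only the field names come from the list above. The `_id` field is left out on purpose so that MongoDB generates it on insert.

```python
from typing import Any, Dict, List, Optional


def build_document(
    user_id: str,
    document_text: str,
    document_embedding: List[float],
    filetype: str,
    *,
    link_texts: Optional[List[str]] = None,
    link_urls: Optional[List[str]] = None,
    languages: Optional[List[str]] = None,
    url: Optional[str] = None,
    category: str = "CompositeElement",
    element_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble one record using the field names described above."""
    return {
        "userId": user_id,
        "document_text": document_text,
        "document_embedding": document_embedding,
        "link_texts": link_texts or [],
        "link_urls": link_urls or [],
        "languages": languages or [],
        "filetype": filetype,
        "url": url,
        "category": category,
        "element_id": element_id,
    }


# Example record with a toy three-dimensional embedding.
record = build_document(
    "user@example.com",
    "Atlas Vector Search overview",
    [0.12, -0.03, 0.4],
    "text/html",
    languages=["eng"],
)
```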
Data Segmentation by User ID
- The userId field is critical for isolating and segmenting data.
- During queries, only the documents associated with a specific userId are retrieved, ensuring data privacy and security.
- This segmentation allows for multi-tenant architecture while maintaining strict user data isolation.
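One way to enforce this per-user isolation at query time is a pre-filter inside the Atlas `$vectorSearch` aggregation stage. The sketch below only builds the pipeline. The helper name, the index name `vector_index`, and the projected fields are assumptions for illustration, and the filter works only if `userId` is declared as a filter field in the vector search index.

```python
from typing import Any, Dict, List


def build_user_search_pipeline(
    user_id: str,
    query_vector: List[float],
    *,
    index_name: str = "vector_index",  # assumption: the actual index name may differ
    limit: int = 5,
    num_candidates: int = 100,
) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline that searches only one user's documents."""
    if not user_id:
        raise ValueError("user_id is required to keep results isolated per user")
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": "document_embedding",
                "queryVector": query_vector,
                "numCandidates": num_candidates,
                "limit": limit,
                # Pre-filter: only documents owned by this user are considered.
                "filter": {"userId": {"$eq": user_id}},
            }
        },
        {
            "$project": {
                "_id": 0,
                "document_text": 1,
                "url": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


# Usage with PyMongo (collection is a pymongo Collection):
#   results = list(collection.aggregate(build_user_search_pipeline(uid, vector)))
```

Rejecting an empty `user_id` up front avoids accidentally running an unfiltered search across all tenants.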
Data Upload Process
User Actions
- Document Upload: The user uploads a document through the UI Service (running on port 7860).
- User Identification: The system captures the user's unique identifier (userId) during the upload process, as provided in the User Id text field on the UI.
System Workflow
- Content Extraction:
  - The Loader Service (running on port 8001) processes the document, extracting:
    - document_text: The full textual content.
    - link_texts and link_urls: Hyperlinked phrases and their corresponding URLs.
  - Additional metadata is captured, including file type (filetype) and detected languages (languages).
- Embedding Generation:
  - The Loader Service (running on port 8001) also generates vector embeddings from the document_text using an embeddings model.
  - These embeddings are used for similarity-based searches.
- Data Storage in MongoDB:
  - All processed data, including document_text, document_embedding, and metadata, is stored in MongoDB.
  - The data is associated with the userId to ensure proper segmentation.
- Search Indexing:
  - The vector embeddings are indexed by MongoDB Atlas Vector Search Index.
  - This indexing allows for efficient similarity searches when querying documents.
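For the search indexing step, an Atlas Vector Search index definition over the `document_embedding` field might look like the sketch below. The helper name is hypothetical, and the dimension count must match whatever embeddings model is deployed: 1536 in the usage note is an assumption, not a value confirmed by this project.

```python
from typing import Any, Dict

ALLOWED_SIMILARITIES = {"cosine", "euclidean", "dotProduct"}


def vector_index_definition(num_dimensions: int, similarity: str = "cosine") -> Dict[str, Any]:
    """Return an Atlas Vector Search index definition for the stored embeddings."""
    if num_dimensions <= 0:
        raise ValueError("num_dimensions must be positive")
    if similarity not in ALLOWED_SIMILARITIES:
        raise ValueError(f"unsupported similarity: {similarity}")
    return {
        "fields": [
            {
                "type": "vector",
                "path": "document_embedding",
                "numDimensions": num_dimensions,
                "similarity": similarity,
            },
            # Indexing userId as a filter field allows per-user pre-filtering.
            {"type": "filter", "path": "userId"},
        ]
    }


# Usage with a recent PyMongo release (index name is an assumption):
#   from pymongo.operations import SearchIndexModel
#   model = SearchIndexModel(
#       definition=vector_index_definition(1536),
#       name="vector_index",
#       type="vectorSearch",
#   )
#   collection.create_search_index(model)
```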
3. Components
This system is composed of several key components:
UI Service
- Purpose: Provides a web-based interface for user interactions
- Technologies: Gradio, Python
- Interactions: Communicates with Main Service for query processing and Loader Service for file uploads
Main Service
- Purpose: Handles core application logic, database queries, and AI model interactions
- Technologies: FastAPI, LangChain, Python
- Interactions: Communicates with MongoDB Atlas for data retrieval and AWS SageMaker for model inference
Loader Service
- Purpose: Manages file uploads and data ingestion into MongoDB Atlas
- Technologies: FastAPI, Unstructured, Python
- Interactions: Communicates with MongoDB Atlas for data storage
MongoDB Atlas
- Purpose: Provides scalable data storage and vector search capabilities
- Features: Vector indexing, multi-collection search
AWS SageMaker
- Purpose: Hosts and serves AI models for inference
- Features: Scalable model deployment, API endpoints for prediction
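As a rough illustration of how a service could call a model hosted on SageMaker, the sketch below wraps the `invoke_endpoint` call of the SageMaker runtime client. The function name, the endpoint name, and the request payload shape are assumptions; the payload format in particular depends on the serving container behind the endpoint.

```python
import json
from typing import Any, Dict


def invoke_model_endpoint(
    runtime_client: Any, endpoint_name: str, payload: Dict[str, Any]
) -> Any:
    """Send a JSON payload to a SageMaker endpoint and return the decoded JSON reply."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=json.dumps(payload).encode("utf-8"),
    )
    # The Body field is a stream; read it fully before decoding.
    return json.loads(response["Body"].read())


# Usage (requires AWS credentials and a deployed endpoint; names are hypothetical):
#   import boto3
#   runtime = boto3.client("sagemaker-runtime", region_name="us-east-1")
#   reply = invoke_model_endpoint(
#       runtime,
#       "arcee-endpoint",
#       {"inputs": "What is vector search?", "parameters": {"max_new_tokens": 128}},
#   )
```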
Core Components
- CloudFormation Templates:
  - deploy-infra.yaml: Infrastructure setup
  - deploy-sagemaker.yaml: SageMaker deployment
  - deploy-ec2.yaml: EC2 instance configuration
- Python Services:
  - MongoDB Atlas integration
  - Vector search implementation
  - Document processing
  - API endpoints
- Docker Containers:
  - Isolated service environments
  - Dependency management
  - Resource allocation
Technology Stack
- Backend: Python 3.10+
- Database: MongoDB Atlas
- ML Platform: AWS SageMaker
- API Framework: FastAPI
- Frontend: Gradio
- Containerization: Docker
- Infrastructure: AWS CloudFormation
4. Installation & Deployment
Prerequisites
- AWS account with appropriate permissions
- MongoDB Atlas account with appropriate permissions
- Python 3.10+
- AWS CLI installed and configured
- SageMaker quota for ml.g5.12xlarge
- EC2 quota for t3.xlarge
- Programmatic access to your MongoDB Atlas project
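A couple of these prerequisites can be checked locally before starting a deployment. The following is a small, hypothetical helper and not part of the quickstart scripts. It verifies only the Python version and that an `aws` executable is on the PATH; it does not check service quotas, credentials, or Atlas access.

```python
import shutil
import sys
from typing import Callable, List, Optional, Sequence


def check_prerequisites(
    version_info: Optional[Sequence[int]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means both checks passed."""
    problems: List[str] = []
    major_minor = tuple(version_info or sys.version_info)[:2]
    if major_minor < (3, 10):
        problems.append(
            f"Python 3.10+ is required, found {major_minor[0]}.{major_minor[1]}"
        )
    if which("aws") is None:
        problems.append("AWS CLI ('aws') was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(f"Missing prerequisite: {problem}")
```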
MongoDB Atlas Programmatic Access
To enable programmatic access to your MongoDB Atlas project, follow these steps to create and manage API keys securely:
1. Create an API Key
- Navigate to Project Access Manager:
  - In the Atlas UI, select your organization and project.
  - Go to Project Access under the Access Manager menu.
- Create API Key:
  - Click on the Applications tab.
  - Select API Keys.
  - Click Create API Key.
  - Provide a description for the key.
  - Assign appropriate project permissions by selecting roles that align with the principle of least privilege.
  - Click Next.
- Save API Key Credentials:
  - Copy and securely store the Public Key (username) and Private Key (password).
  - Important: The private key is displayed only once; ensure it's stored securely.
2. Configure API Access List
- Add Access List Entry:
  - After creating the API key, add an IP address or CIDR block to the API access list to specify allowed sources for API requests.
  - Click Add Access List Entry.
  - Enter the IP address or click Use Current IP Address if accessing from the current host.
  - Click Save.
- Manage Access List:
  - To modify the access list, navigate to the API Keys section.
  - Click the ellipsis (...) next to the API key and select Edit Permissions.
  - Update the access list as needed.
3. Secure API Key Usage
- Environment Variables: Store API keys in environment variables to prevent hardcoding them in your application's source code.
- Access Controls: Limit API key permissions to the minimum required for your application's functionality.
- Regular Rotation: Periodically rotate API keys and update your applications to use the new keys to enhance security.
- Audit Logging: Monitor API key usage through Atlas's auditing features to detect any unauthorized access.
By following these steps, you can securely grant programmatic access to your MongoDB Atlas project, ensuring that your API keys are managed and utilized in accordance with best practices.
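The environment-variable recommendation can be sketched in Python as follows. The variable names `MONGODB_ATLAS_PUBLIC_API_KEY` and `MONGODB_ATLAS_PRIVATE_API_KEY` are assumptions borrowed from common Atlas tooling conventions; the quickstart's own scripts may expect different names.

```python
import os
from typing import Mapping, NamedTuple

# Assumed variable names; adjust to whatever your deployment scripts expect.
PUBLIC_KEY_VAR = "MONGODB_ATLAS_PUBLIC_API_KEY"
PRIVATE_KEY_VAR = "MONGODB_ATLAS_PRIVATE_API_KEY"


class AtlasApiKey(NamedTuple):
    public_key: str
    private_key: str


def load_atlas_api_key(env: Mapping[str, str] = os.environ) -> AtlasApiKey:
    """Read the Atlas API key pair from the environment, failing loudly if unset."""
    missing = [name for name in (PUBLIC_KEY_VAR, PRIVATE_KEY_VAR) if not env.get(name)]
    if missing:
        # Report only the variable names, never the key values themselves.
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return AtlasApiKey(env[PUBLIC_KEY_VAR], env[PRIVATE_KEY_VAR])


# Usage: the Atlas Administration API accepts these keys via HTTP digest auth,
# for example with the third-party `requests` library:
#   from requests.auth import HTTPDigestAuth
#   key = load_atlas_api_key()
#   auth = HTTPDigestAuth(key.public_key, key.private_key)
```

Failing at startup when a variable is missing is usually preferable to discovering an empty credential midway through a deployment.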
For more detailed information, refer to the MongoDB Atlas guide on granting programmatic access.
Minimum System Requirements
- For SageMaker: At least one ml.g5.12xlarge instance (or equivalent GPU instance)
- For EC2: At least a t3.medium instance (or higher, depending on workload)
- Sufficient EBS storage for EC2 instance (at least 100 GB recommended)
- MongoDB Atlas M10 Cluster (auto-deployed by the one-click script)
4.1 One-Click Deployment
The one-click.ksh Korn shell script automates the deployment of the MongoDB - Arcee Quickstart application on AWS infrastructure. It sets up the necessary AWS resources, deploys an EC2 instance, and configures the application environment.