Agent Skills#
The MS-Agent Skill Module is a powerful, extensible skill execution framework that enables LLM agents to automatically discover, analyze, and execute domain-specific skills for complex task completion. It is an implementation of the Anthropic Agent Skills protocol.
With the Skill Module, agents can handle complex tasks such as:
βGenerate a PDF report for Q4 sales dataβ
βCreate a presentation about AI trends with chartsβ
βConvert this document to PPTX format with custom themesβ
Key Features#
Intelligent Skill Retrieval: Hybrid search combining FAISS dense retrieval with BM25 sparse retrieval, plus LLM-based relevance filtering
DAG Execution Engine: Builds execution DAGs based on skill dependencies, supports parallel execution of independent skills, and automatically passes inputs/outputs between skills
Progressive Skill Analysis: Two-phase analysis (plan first, then load resources), incrementally loads scripts/references/resources on demand, optimizing context window usage
Secure Execution Environment: Supports isolated execution via ms-enclave Docker sandboxes, or controlled local execution
Self-Reflection & Retry: LLM-based error analysis, automatic code fixes, and configurable retry attempts
Standard Protocol Compatibility: Fully compatible with the Anthropic Skills protocol
Architecture#
High-Level Architecture#
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β LLMAgent β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β AutoSkills β β
β β ββββββββββββ ββββββββββββ ββββββββββββββββββ β β
β β β Loader β β Retrieverβ β SkillAnalyzer β β β
β β β β β (Hybrid) β β (Progressive) β β β
β β ββββββ¬ββββββ ββββββ¬ββββββ βββββββββ¬βββββββββ β β
β β β β β β β
β β βΌ βΌ βΌ β β
β β ββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β β DAGExecutor ββ β
β β β ββββββββββ ββββββββββ ββββββββββ ββ β
β β β βSkill 1 ββ βSkill 2 ββ βSkill N β ββ β
β β β βββββ¬βββββ βββββ¬βββββ βββββ¬βββββ ββ β
β β β ββββββββββββββ΄βββββββββββ ββ β
β β β β ββ β
β β β SkillContainer (Execution) ββ β
β β ββββββββββββββββββββββββββββββββββββββββββββββββββ β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββ β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Execution Flow#
User Query
β
βΌ
βββββββββββββββββββ
β Query Analysis β βββ Is this a skill-related query?
ββββββββββ¬βββββββββ
β Yes
βΌ
βββββββββββββββββββ
β Skill Retrieval β βββ Hybrid search (FAISS + BM25)
ββββββββββ¬βββββββββ
β
βΌ
βββββββββββββββββββ
β Skill Filtering β βββ LLM-based relevance filtering
ββββββββββ¬βββββββββ
β
βΌ
βββββββββββββββββββ
β DAG Building β βββ Build dependency graph
ββββββββββ¬βββββββββ
β
βΌ
βββββββββββββββββββ
β Progressive β βββ Plan β Load β Execute
β Execution β
ββββββββββ¬βββββββββ
β
βΌ
βββββββββββββββββββ
β Result β βββ Merge outputs, format response
β Aggregation β
βββββββββββββββββββ
The skill module implements a multi-level progressive context loading mechanism:
Level 1 (Metadata): Loads only skill metadata (name, description) for semantic search
Level 2 (Retrieval): Retrieves relevant skills and loads the full SKILL.md
Level 3 (Resources): Further loads required reference materials and resource files
Level 4 (Analysis|Planning|Execution): Analyzes skill context, autonomously creates plans and task lists, loads required resources and runs related scripts
Skill Directory Structure#
Each skill is a self-contained directory:
skill-name/
βββ SKILL.md # Required: Main documentation and instructions
βββ META.yaml # Optional: Metadata (name, description, version, tags)
βββ scripts/ # Optional: Executable scripts
β βββ main.py
β βββ utils.py
β βββ run.sh
βββ references/ # Optional: Reference documents
β βββ api_docs.md
β βββ examples.json
βββ resources/ # Optional: Assets and resources
β βββ template.html
β βββ fonts/
β βββ images/
βββ requirements.txt # Optional: Python dependencies
SKILL.md Format#
# Skill Name
Brief description of what this skill does.
## Capabilities
- Capability 1
- Capability 2
## Usage
Instructions for using this skill...
## Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| input | str | Input data |
| format | str | Output format |
## Examples
Example usage scenarios...
META.yaml Format#
name: "PDF Generator"
description: "Generates professional PDF documents from markdown or data"
version: "1.0.0"
author: "Your Name"
tags:
- document
- pdf
- report
Quick Start#
Prerequisites#
Python 3.9+
Docker (for sandbox execution, optional)
FAISS (for skill retrieval)
Installation#
pip install 'ms-agent>=1.4.0'
# Or install from source
git clone https://github.com/modelscope/ms-agent.git
cd ms-agent
pip install -e .
Method 1: Using LLMAgent Configuration#
import asyncio
from omegaconf import DictConfig
from ms_agent.agent import LLMAgent
config = DictConfig({
'llm': {
'service': 'openai',
'model': 'gpt-4',
'openai_api_key': 'your-api-key',
'openai_base_url': 'https://api.openai.com/v1'
},
'skills': {
'path': '/path/to/skills',
'auto_execute': True,
'work_dir': '/path/to/workspace',
'use_sandbox': False,
}
})
agent = LLMAgent(config, tag='skill-agent')
async def main():
result = await agent.run('Generate a mock PDF report about AI trends')
print(result)
asyncio.run(main())
Method 2: Using AutoSkills Directly#
import asyncio
from ms_agent.skill import AutoSkills
from ms_agent.llm import LLM
from omegaconf import DictConfig
llm_config = DictConfig({
'llm': {
'service': 'openai',
'model': 'gpt-4',
'openai_api_key': 'your-api-key',
'openai_base_url': 'https://api.openai.com/v1'
}
})
llm = LLM.from_config(llm_config)
auto_skills = AutoSkills(
skills='/path/to/skills',
llm=llm,
work_dir='/path/to/workspace',
use_sandbox=False,
)
async def main():
result = await auto_skills.run(
query='Generate a mock PDF report about AI trends'
)
print(f"Result: {result.execution_result}")
asyncio.run(main())
Parameter Reference:
skills: Skill source, supporting the following formats:Single path or list of paths to local skill directories
Single or multiple ModelScope skill repository IDs, e.g.
ms-agent/claude_skills(see ModelScope Hub)Format:
owner/skill_nameorowner/skill_name/subfolderSingle or multiple
SkillSchemaobjects
work_dir: Working directory for skill execution outputsuse_sandbox: Whether to use Docker sandbox for execution (defaultTrue); set toFalsefor local execution with security checksauto_execute: Whether to automatically execute skills (defaultTrue)
Configuration#
Configure the skill module in agent.yaml:
skills:
# Required: Path to skills directory or ModelScope repo ID
path: /path/to/skills
# Optional: Whether to enable retriever (auto-detect based on skill count if omitted)
enable_retrieve:
# Optional: Retriever arguments
retrieve_args:
top_k: 3
min_score: 0.8
# Optional: Maximum candidate skills to consider (default: 10)
max_candidate_skills: 10
# Optional: Maximum retry attempts (default: 3)
max_retries: 3
# Optional: Working directory for outputs
work_dir: /path/to/workspace
# Optional: Use Docker sandbox for execution (default: True)
use_sandbox: false
# Optional: Auto-execute skills (default: True)
auto_execute: true
For general YAML configuration details, see Config & Parameters.
Core Components#
Component |
Description |
|---|---|
|
Main entry point for skill execution, coordinating retrieval, analysis and execution |
|
Secure skill execution environment (sandbox or local) |
|
Progressive skill analyzer with incremental resource loading |
|
DAG executor with dependency management |
|
Skill loading and management |
|
Finds relevant skills using semantic search |
|
Skill schema definition |
AutoSkills#
class AutoSkills:
def __init__(
self,
skills: Union[str, List[str], List[SkillSchema]],
llm: LLM,
enable_retrieve: Optional[bool] = None,
retrieve_args: Dict[str, Any] = None,
max_candidate_skills: int = 10,
max_retries: int = 3,
work_dir: Optional[str] = None,
use_sandbox: bool = True,
): ...
async def run(self, query: str, ...) -> SkillDAGResult:
"""Execute skills for a query."""
async def get_skill_dag(self, query: str) -> SkillDAGResult:
"""Get skill DAG without executing."""
SkillContainer#
class SkillContainer:
def __init__(
self,
workspace_dir: Optional[Path] = None,
use_sandbox: bool = True,
timeout: int = 300,
memory_limit: str = "2g",
enable_security_check: bool = True,
): ...
async def execute_python_code(self, code: str, ...) -> ExecutionOutput:
"""Execute Python code."""
async def execute_shell(self, command: str, ...) -> ExecutionOutput:
"""Execute shell command."""
SkillAnalyzer#
class SkillAnalyzer:
def __init__(self, llm: LLM): ...
def analyze_skill_plan(self, skill: SkillSchema, query: str) -> SkillContext:
"""Phase 1: Analyze skill and create execution plan."""
def load_skill_resources(self, context: SkillContext) -> SkillContext:
"""Phase 2: Load resources based on plan."""
def generate_execution_commands(self, context: SkillContext) -> List[Dict]:
"""Phase 3: Generate execution commands."""
DAGExecutor#
class DAGExecutor:
def __init__(
self,
container: SkillContainer,
skills: Dict[str, SkillSchema],
llm: LLM = None,
enable_progressive_analysis: bool = True,
enable_self_reflection: bool = True,
max_retries: int = 3,
): ...
async def execute(
self,
dag: Dict[str, List[str]],
execution_order: List[Union[str, List[str]]],
stop_on_failure: bool = True,
query: str = '',
) -> DAGExecutionResult:
"""Execute the skill DAG."""
Usage Examples#
Example 1: PDF Report Generation#
import asyncio
from ms_agent.skill import AutoSkills
from ms_agent.llm import LLM
async def generate_pdf_report():
llm = LLM.from_config(config)
auto_skills = AutoSkills(
skills='/path/to/skills',
llm=llm,
work_dir='/tmp/reports'
)
result = await auto_skills.run(
query='Generate a PDF report analyzing Q4 2024 sales data with charts'
)
if result.execution_result and result.execution_result.success:
for skill_id, skill_result in result.execution_result.results.items():
if skill_result.output.output_files:
print(f"Generated files: {skill_result.output.output_files}")
asyncio.run(generate_pdf_report())
Example 2: Multi-Skill Pipeline#
async def create_presentation():
auto_skills = AutoSkills(
skills='/path/to/skills',
llm=llm,
work_dir='/tmp/presentation'
)
# This query might trigger multiple skills working together:
# 1. data-analysis skill to process data
# 2. chart-generator skill to create visualizations
# 3. pptx skill to create the presentation
result = await auto_skills.run(
query='Create a presentation about AI market trends with data visualizations'
)
print(f"Execution order: {result.execution_order}")
for skill_id in result.execution_order:
if isinstance(skill_id, str):
context = auto_skills.get_skill_context(skill_id)
if context and context.plan:
print(f"{skill_id}: {context.plan.plan_summary}")
asyncio.run(create_presentation())
Example 3: Custom Input Execution#
from ms_agent.skill.container import ExecutionInput
async def execute_with_custom_input():
auto_skills = AutoSkills(
skills='/path/to/skills',
llm=llm,
work_dir='/tmp/custom'
)
dag_result = await auto_skills.get_skill_dag(
query='Convert my document to PDF'
)
custom_input = ExecutionInput(
input_files={'document.md': '/path/to/my/document.md'},
env_vars={'OUTPUT_FORMAT': 'A4', 'MARGINS': '1in'}
)
exec_result = await auto_skills.execute_dag(
dag_result=dag_result,
execution_input=custom_input,
query='Convert my document to PDF'
)
print(f"Success: {exec_result.success}")
asyncio.run(execute_with_custom_input())
Security#
Sandbox Execution (Recommended)#
When use_sandbox=True, skills run in isolated Docker containers with:
Network isolation (configurable)
Filesystem isolation (only workspace directory mounted)
Resource limits (memory, CPU)
No access to host system
Automatic installation of skill-declared dependencies
Local Execution#
When use_sandbox=False, security is enforced through:
Pattern-based scanning for dangerous code
Restricted file system access
Environment variable sanitization
Make sure you trust the skill scripts before executing them to avoid potential security risks. For local execution, ensure all required dependencies are installed in your Python environment.
Creating Custom Skills#
Create a new subdirectory under your skills path
Add a
SKILL.mdfile with documentation and instructionsAdd a
META.yamlfile with metadataAdd scripts, references, and resources as needed
Test with
AutoSkills.get_skill_dag()to verify the skill can be retrieved correctly
Best Practices#
Write clear, comprehensive
SKILL.mdthat fully describes the skillβs capabilities, usage, and parametersExplicitly declare all dependencies in
requirements.txtKeep skills self-contained by packaging all necessary resources within the directory
Handle errors gracefully in scripts
Use the
SKILL_OUTPUT_DIRenvironment variable to specify output directories