A.D.A.M.O. — Agent for language-Driven Actions with Multimodal Observations

Figure 1 — System overview of A.D.A.M.O.
A.D.A.M.O. integrates symbolic and visual perception into a unified Observation module, combining egocentric RGB frames with labeled bounding boxes, 3D reference points, and a synchronized symbolic state. These observations populate a Multi-Modal Short-Term Memory (MSTM) composed of fixed behavioral instructions, a dynamic symbolic state, and a rolling buffer of recent interactions. A pretrained VLM processes this multimodal context to produce explicit reasoning notes and tool-based actions (Walk, Look, Pick, Drop), closing the perceive–reason–act loop that drives the agent inside the 3D environment.

Abstract

Creating believable Virtual Humans requires a unified architecture that connects perception and action through language-mediated reasoning. Existing systems remain modular, linking dialogue, perception, and control via handcrafted pipelines that limit adaptability and grounded behavior in 3D environments. We introduce A.D.A.M.O. (Agent for language-Driven Actions with Multimodal Observations), a language-driven Virtual Human framework that leverages a pretrained VLM with tool-calling to integrate perception, reasoning, and control. A.D.A.M.O. maintains a dual visual–symbolic world model, enabling spatial reasoning and goal-directed actions directly from natural language prompts. To evaluate this capability, we introduce a novel benchmark structured around a Capability-Difficulty taxonomy that decomposes spatial tasks by procedural and linguistic complexity, providing an interpretable measure of embodied reasoning difficulty. Experiments across multiple task families and scenes show consistent task-level generalization while largely reducing manual authoring, establishing a foundation for scalable, language-grounded agentic behavior.

Figure 2 — Infrastructure architecture.

A.D.A.M.O. implements the Thought–Act–Observation loop through four modular components:

Runtime Engine (Unity).
Manages the 3D environment, rendering, physics, and agent embodiment.
Collects observations and executes high-level actions via the Action Server.

Cognitive Server (Python).
Runs the Thought–Act–Observation loop, maintains the multimodal short-term memory (MSTM), assembles prompts, and communicates with the VLM through LangGraph/LangChain.

VLM Inference Server (Model Providers).
Provides access to cloud or local VLMs (GPT-4o-vision, Claude Sonnet-3.5, Ollama).
Performs multimodal reasoning and selects the next tool.

Action Server (Unity).
Translates tool calls (Walk, Look, Pick, Drop) into Unity actions and returns tool feedback with updated observations.

Execution Flow.

1. Unity sends the initial observations and the task prompt to the Cognitive Server.
2. The Cognitive Server queries the VLM for the next tool or for termination.
3. If a tool is selected, it is executed by the Action Server.
4. Unity returns updated observations.
5. Steps 2–4 repeat until completion; the final answer is then returned to Unity.
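The execution flow above can be sketched as a small Python loop. This is only an illustration with stubbed components, not the repository's implementation: `Decision`, `run_episode`, `scripted_vlm`, `fake_action_server`, and the `max_steps` cap are hypothetical names introduced here, and the real Cognitive Server talks to Unity and the VLM over the network.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Decision:
    """One (stubbed) VLM reply: a reasoning note plus a tool call or a final answer."""
    thought: str
    tool: Optional[str] = None          # Walk, Look, Pick, Drop, or None to terminate
    args: dict = field(default_factory=dict)
    final_answer: Optional[str] = None


def run_episode(task_prompt, initial_obs, query_vlm, execute_tool, max_steps=20):
    """Query the model, run the chosen tool, record the new observation, repeat."""
    history = [("observation", initial_obs)]              # step 1
    for _ in range(max_steps):                            # max_steps is an assumed safety cap
        decision = query_vlm(task_prompt, history)        # step 2
        history.append(("thought", decision.thought))
        if decision.tool is None:                         # termination branch
            return decision.final_answer, history
        obs = execute_tool(decision.tool, decision.args)  # steps 3 and 4
        history.append(("observation", obs))
    return None, history                                  # budget exhausted, no answer


def scripted_vlm(task_prompt, history):
    """Stand-in for the VLM: replays a fixed plan based on observations seen so far."""
    plan = [
        Decision("I should approach the cup.", "Walk", {"target": "Cup_1"}),
        Decision("Now I can pick it up.", "Pick", {"target": "Cup_1"}),
        Decision("The task is complete.", final_answer="picked Cup_1"),
    ]
    seen = sum(1 for kind, _ in history if kind == "observation")
    return plan[seen - 1]


def fake_action_server(tool, args):
    """Stand-in for the Unity Action Server: always reports success."""
    return f"{tool} ok: {args}"


answer, history = run_episode(
    "Pick up the cup", "Cup_1 is on the table", scripted_vlm, fake_action_server
)
```

Passing the model and the tool executor in as plain callables keeps the loop testable without Unity or a model provider running.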

Repository Overview

The repository follows the system architecture shown in Figure 2 and is organized into three main folders:

adam_unity/ — the Unity project containing the Runtime Engine, Action Server, embodiment logic, experiment scenes, and the Unity build used for large-scale evaluation.
adam_python/ — the Cognitive Server, including the reasoning loop, MSTM, tool interface, LLM manager, and all Python-side logic.
adam_experiments/ — scripts and utilities for batch experiments, automatic result aggregation, and plotting.

In the root folder you will also find the supplementary_material.pdf containing detailed task definitions, solution-checker mappings, and additional documentation used in the paper.

Requirements

A.D.A.M.O. relies on a hybrid Unity–Python architecture.
Below we list all required components for running the system.

Unity Runtime Engine

Unity Version: 6000.2.2f1

Three third-party Unity packages are required, covering locomotion (free), inverse kinematics (paid), and an in-game debug console (free):

Motion Matching for Unity v2.3.2 — locomotion system for smooth, data-driven agent movement
Final IK v2.4 — inverse kinematics system for articulated agent posing and reaching
In Game Debug Console v1.8.2 — in-game debug console for logging, inspection, and developer commands during runtime

To correctly integrate these packages in the project:

Import the packages into the project from the Package Manager.
Inside the "Assets/Plugins/MotionMatching" folder, reorganize scripts and sub-folders to follow the structure illustrated in "Assets/Plugins/MotionMatching/MotionMatching_folder_structure.txt".

MotionMatching_folder_structure.txt

MotionMatching
|   
+---Code
|   |   README.md
|   |   
|   +---Integrations
|   |       Integration_README.txt
|   |       
|   \---MxM
|       |   MxM.asmdef     
|       |   
|       +---Data
|       |   |   
|       |   +---Animation
|       |   |   |   BlendClipData.cs
|       |   |   |   BlendSpace1DData.cs
|       |   |   |   BlendSpaceData.cs
|       |   |   |   ClipData.cs
|       |   |   |   CompositeData.cs
|       |   |   |   IComplexAnimData.cs
|       |   |   |   IdleSetData.cs
|       |   |   |   MotionCurveData.cs
|       |   |   |   SequenceData.cs
|       |   |   |   
|       |   |   \---Assets
|       |   |           IMxMAnim.cs
|       |   |           MxMAnimationClipComposite.cs
|       |   |           MxMAnimationIdleSet.cs
|       |   |           MxMBlendClip.cs
|       |   |           MxMBlendSpace.cs
|       |   |           
|       |   +---AnimSpeedModifier
|       |   |       MotionPreset.cs
|       |   |       MotionSection.cs
|       |   |       MotionTimingPresets.cs
|       |   |       SpeedModData.cs
|       |   |       
|       |   +---Core
|       |   |       CalibrationModule.cs
|       |   |       CompositeCategory.cs
|       |   |       Goal.cs
|       |   |       JointData.cs
|       |   |       MxMAnimData.cs
|       |   |       MxMCalibrationData.cs
|       |   |       MxMPreProcessData.cs
|       |   |       NativeAnimData.cs
|       |   |       PoseCluster.cs
|       |   |       PoseData.cs
|       |   |       PoseJoint.cs
|       |   |       Trajectory.cs
|       |   |       TrajectoryPoint.cs
|       |   |       WarpModule.cs
|       |   |       
|       |   +---Curves
|       |   |       MxMCurveTrack.cs
|       |   |       
|       |   +---Debug
|       |   |       MxMDebugData.cs
|       |   |       MxMDebugFrame.cs
|       |   |       PoseMask.cs
|       |   |       
|       |   +---Events
|       |   |       EventContact.cs
|       |   |       EventData.cs
|       |   |       EventFrameData.cs
|       |   |       MxMEventDefinition.cs
|       |   |       
|       |   +---Modules
|       |   |       AnimationModule.cs
|       |   |       AnimModuleDefaults.cs
|       |   |       EventNamingModule.cs
|       |   |       MotionMatchConfigModule.cs
|       |   |       TagNamingModule.cs
|       |   |       TrajectoryGeneratorModule.cs
|       |   |       
|       |   +---Tags
|       |   |       BoolTagTrack.cs
|       |   |       EventMarker.cs
|       |   |       FloatTagTrack.cs
|       |   |       FootStepTagTrack.cs
|       |   |       TagTrack.cs
|       |   |       TagTrackBase.cs
|       |   |       
|       |   \---Utility
|       |           FootstepTagTrackData.cs
|       |           GenericTagTrackData.cs
|       |           MxMInputProfile.cs
|       |           
|       +---Editor
|       |   |   DocumentationLinks.cs
|       |   |   MxM.Editor.asmdef
|       |   |   MxMAssetHandler.cs
|       |   |   
|       |   +---Data
|       |   |       MxMSettings.cs
|       |   |       MxMSettingsProvider.cs
|       |   |       
|       |   +---EditorWindows
|       |   |   |   MxMAnimationClipCompositeWindow.cs
|       |   |   |   MxMAnimationIdleSetWindow.cs
|       |   |   |   MxMAnimConfigWindow.cs
|       |   |   |   MxMBlendSpaceWindow.cs
|       |   |   |   MxMDebuggerWindow.cs
|       |   |   |   MxMTaggingWindow.cs
|       |   |   |   
|       |   |   \---Utility
|       |   |           AnimModuleSettingsWindow.cs
|       |   |           CompositeCategorySettingsWindow.cs
|       |   |           
|       |   +---Enumerations
|       |   |       EUtilityTagTrack.cs
|       |   |       
|       |   +---Inspectors
|       |   |       AnimationModuleInspector.cs
|       |   |       CalibrationModuleInspector.cs
|       |   |       ConfigurationModuleInspector.cs
|       |   |       EventNamingModuleInspector.cs
|       |   |       MotionTimingPresetInspector.cs
|       |   |       MxMAnimatorInspector.cs
|       |   |       MxMAnimDataInspector.cs
|       |   |       MxMBlendSpaceInspector.cs
|       |   |       MxMCompositeInspector.cs
|       |   |       MxMEventDefinitionInspector.cs
|       |   |       MxMIdleSetInspector.cs
|       |   |       MxMInputProfileInspector.cs
|       |   |       MxMPreProcessDataInspector.cs
|       |   |       MxMRootMotionApplicatorInspector.cs
|       |   |       TagNamingModuleInspector.cs
|       |   |       WarpModuleInspector.cs
|       |   |       
|       |   +---Preview
|       |   |       IPreviewable.cs
|       |   |       MxMPreviewScene.cs
|       |   |       
|       |   +---Resources
|       |   |       BlackProtoGrid.png
|       |   |       DebugArrow.mesh
|       |   |       DebugArrowMat.mat
|       |   |       GroundGrid.prefab
|       |   |       ProtoGridBlack.mat
|       |   |       ProtoGridBlack2.mat
|       |   |       
|       |   \---UTIL
|       |           EditorUtility.cs
|       |           
|       +---Enumerations
|       |       EAnimatorStates.cs
|       |       EBlendSpaceSmoothing.cs
|       |       EBlendSpaceType.cs
|       |       EBlendStatus.cs
|       |       EComplexAnimType.cs
|       |       EEventState.cs
|       |       EEventWarpType.cs
|       |       EFavourTagMode.cs
|       |       EFootStepTags.cs
|       |       EGenericTags.cs
|       |       EIdleState.cs
|       |       EJointVelocityCalculationMethod.cs
|       |       ELongitudinalErrorWarp.cs
|       |       EMotionModSmooth.cs
|       |       EMotionModType.cs
|       |       EMotionWarping.cs
|       |       EMxMAnimtype.cs
|       |       EMxMEventType.cs
|       |       EMxMRootMotion.cs
|       |       EPastTrajectoryMode.cs
|       |       EPoseMatchMethod.cs
|       |       EPostEventTrajectoryMode.cs
|       |       ETags.cs
|       |       ETrajectoryMoveMode.cs
|       |       ETransitionMethod.cs
|       |       EUserTags.cs
|       |       TagSelection.cs
|       |       
|       +---Processors
|       |       GlobalSpacePose.cs
|       |       MxMFootstepDetector.cs
|       |       MxMPreProcessor.cs
|       |       
|       \---Runtime
|           |   
|           +---Components
|           |   |   
|           |   +---Controller
|           |   |       GenericControllerWrapper.cs
|           |   |       UnityControllerWrapper.cs
|           |   |       
|           |   +---Decoupling
|           |   |       MxMAnimationDecoupler.cs
|           |   |       
|           |   +---Extensions
|           |   |       MxMAnimatorPlaybackSync.cs
|           |   |       MxMBlendSpaceLayers.cs
|           |   |       MxMEventLayers.cs
|           |   |       MxMTIPExtension.cs
|           |   |       
|           |   +---RootMotion
|           |   |       MxMDecoupleMotionApplicator.cs
|           |   |       MxMRootMotionApplicator.cs
|           |   |       
|           |   +---Spawner
|           |   |       RuntimeMxMConstructor.cs
|           |   |       
|           |   \---Trajectory
|           |           MxMTrajectoryGenerator.cs
|           |           MxMTrajectoryGeneratorBase.cs
|           |           MxMTrajectoryGenerator_AI.cs
|           |           MxMTrajectoryGenerator_BasicAI.cs
|           |           TrajectoryGeneratorJob.cs
|           |           
|           +---Debug
|           |       DrawArrow.cs
|           |       PlayableUtils.cs
|           |       
|           +---Experimental
|           |   |   
|           |   +---InertialBlending
|           |   |       InertialBlendModule.cs
|           |   |       InertializerJob.cs
|           |   |       TransformData.cs
|           |   |       
|           |   \---SearchManager
|           |           MxMSearchManager.cs
|           |           
|           +---FSM
|           |       FSM.cs
|           |       FSMState.cs
|           |       
|           +---Interfaces
|           |       ILongitudinalWarper.cs
|           |       IMxMExtension.cs
|           |       IMxMRootMotion.cs
|           |       IMxMTrajectory.cs
|           |       IMxMUnityRiggingIntegration.cs
|           |       
|           +---Jobs
|           |       BlendSpaceWeightJob.cs
|           |       JobData.cs
|           |       MinimaJobs.cs
|           |       PoseJobs.cs
|           |       PoseJobs_VelCost.cs
|           |       TrajectoryJobs.cs
|           |       
|           +---MxMAnimator
|           |       MxMAnimator.cs
|           |       MxMAnimator_AnimManagement.cs
|           |       MxMAnimator_BlendSpace.cs
|           |       MxMAnimator_Curves.cs
|           |       MxMAnimator_Debug.cs
|           |       MxMAnimator_Events.cs
|           |       MxMAnimator_Idle.cs
|           |       MxMAnimator_Jobs.cs
|           |       MxMAnimator_Layers.cs
|           |       MxMAnimator_States.cs
|           |       MxMAnimator_Tags.cs
|           |       MxMLayer.cs
|           |       MxMUtility.cs
|           |       
|           +---Playables
|           |       MotionMatchingPlayable.cs
|           |       MxMBlendSpaceState.cs
|           |       MxMPlayableState.cs
|           |       
|           \---Templates
|                   RootMotionApplicatorTemplate.cs
|                   TrajectoryGeneratorTemplate.cs
|                   
+---Demo
|   |   
|   +---Animations
|   |   |   LowerBody.mask
|   |   |   UpperBody.mask
|   |   |   
|   |   +---Events
|   |   |   |   JogJump_ToLeft_1.anim
|   |   |   |   JogJump_ToLeft_1_Mirror.anim
|   |   |   |   JogJump_ToLeft_2.anim
|   |   |   |   JogJump_ToLeft_2_Mirror.anim
|   |   |   |   RunJump_ToLeft_1.anim
|   |   |   |   RunJump_ToLeft_1_Mirror.anim
|   |   |   |   RunJump_ToLeft_3.anim
|   |   |   |   RunJump_ToLeft_3_Mirror.anim
|   |   |   |   RunJump_ToLeft_4.anim
|   |   |   |   RunJump_ToLeft_4_Mirror.anim
|   |   |   |   RunSlide_ToRight_1.anim
|   |   |   |   RunSlide_ToRight_2.anim
|   |   |   |   
|   |   |   +---VaultOff
|   |   |   |       Run_JumpDownHigh_Roll_Run.anim
|   |   |   |       Run_JumpDownHigh_Roll_Run_Mirror.anim
|   |   |   |       Run_JumpDownLow_Roll_Run.anim
|   |   |   |       Run_JumpDownLow_Roll_Run_Mirror.anim
|   |   |   |       _209_Run_JumpDownLow_Run.anim
|   |   |   |       _209_Run_JumpDownLow_Run_Mirror.anim
|   |   |   |       
|   |   |   \---VaultUp
|   |   |           RunJump_ToLeft_2.anim
|   |   |           RunJump_ToRight_2.anim
|   |   |           Run_JumpUpHigh_Run.anim
|   |   |           Run_JumpUpHigh_Run_Mirror.anim
|   |   |           Run_JumpUpMedium_2Hands_Run.anim
|   |   |           Run_JumpUpMedium_2Hands_Run_Mirror.anim
|   |   |           
|   |   +---Idle
|   |   |       Idle_Neutral_1.anim
|   |   |       
|   |   +---LayerTests
|   |   |       IdleGrab_FrontHigh.anim
|   |   |       IdleGrab_FrontHigh_Looped.anim
|   |   |       Idle_JumpDownMed_Idle.anim
|   |   |       
|   |   +---Locomotion
|   |   |   |   HalfSteps2Idle_PasingLongStepTOIdle.anim
|   |   |   |   HalfSteps2Idle_PasingLongStepTOIdle_Right.anim
|   |   |   |   Idle2Run135L.anim
|   |   |   |   Idle2Run135R.anim
|   |   |   |   Idle2Run180L.anim
|   |   |   |   Idle2Run180R.anim
|   |   |   |   Idle2Run45L.anim
|   |   |   |   Idle2Run45R.anim
|   |   |   |   Idle2Run90L.anim
|   |   |   |   Idle2Run90R.anim
|   |   |   |   Idle2walk_AllAngles.anim
|   |   |   |   Idle2walk_AllAngles_Right.anim
|   |   |   |   JogForwardTurnLeft_NtrlMedium.anim
|   |   |   |   JogForwardTurnRight_NtrlMedium.anim
|   |   |   |   PlantNTurn135_Run_L.anim
|   |   |   |   PlantNTurn135_Run_R.anim
|   |   |   |   PlantNTurn180_Run_L_2.anim
|   |   |   |   PlantNTurn180_Run_R_2.anim
|   |   |   |   PlantNTurn90_Run_L.anim
|   |   |   |   PlantNTurn90_Run_R.anim
|   |   |   |   RunForwardStart.anim
|   |   |   |   RunFwdStop.anim
|   |   |   |   Run_LedgeStop2_Idle.anim
|   |   |   |   Run_LedgeStop_Idle.anim
|   |   |   |   SmallStep.anim
|   |   |   |   WalkForwardStart.anim
|   |   |   |   WalkForward_NtrlFaceFwd.anim
|   |   |   |   
|   |   |   \---BlendSpace_Anims
|   |   |           RunArcLeft_Narrow.anim
|   |   |           RunArcLeft_Wide.anim
|   |   |           RunArcRight_Narrow.anim
|   |   |           RunArcRight_Wide.anim
|   |   |           RunForward_NtrlFaceFwd.anim
|   |   |           WalkArkLeft.anim
|   |   |           WalkArkLeft_Narrow.anim
|   |   |           WalkArkRight.anim
|   |   |           WalkArkRight_Narrow.anim
|   |   |           WalkFWD.anim
|   |   |           
|   |   \---Mocap
|   |           RunningMocapSet.fbx
|   |           SprintMocapSet.fbx
|   |           StrafeMocapSet.fbx
|   |           WalkingMocapSet.fbx
|   |           
|   +---Code
|   |   |   AIDestinationSetter.cs
|   |   |   ExampleDecoupleMovementControl.cs
|   |   |   ExampleDemoInput.cs
|   |   |   LocomotionSpeedRamp.cs
|   |   |   StressTestSpawner.cs
|   |   |   
|   |   +---UI
|   |   |       HelpUIControl.cs
|   |   |       
|   |   \---Vault System
|   |           EVaultContactOffsetMethod.cs
|   |           EVaultType.cs
|   |           VaultableProfile.cs
|   |           VaultDefinition.cs
|   |           VaultDetectionConfig.cs
|   |           VaultDetector.cs
|   |           
|   +---Data
|   |   |   
|   |   +---EventDefinitions
|   |   |       EventDef_Dance.asset
|   |   |       EventDef_VaultOff.asset
|   |   |       EventDef_VaultOff_High.asset
|   |   |       EventDef_VaultOff_Med.asset
|   |   |       EventDef_VaultOverLong.asset
|   |   |       EventDef_VaultOverShort.asset
|   |   |       EventDef_VaultUp.asset
|   |   |       EventDef_VaultUp_High.asset
|   |   |       EventDef_VaultUp_Med.asset
|   |   |       JumpEventDef.asset
|   |   |       SlideEventDef.asset
|   |   |       
|   |   +---InputProfiles
|   |   |   |   MxMInputProfile.asset
|   |   |   |   
|   |   |   +---Balanced
|   |   |   |       MocapInputProfile_Balanced.asset
|   |   |   |       MocapSprintInputProfile_Balanced.asset
|   |   |   |       MocapStrafeInputProfile_Balanced.asset
|   |   |   |       
|   |   |   +---HighQuality
|   |   |   |       MocapInputProfile_HQ.asset
|   |   |   |       MocapSprintInputProfile_HQ.asset
|   |   |   |       
|   |   |   \---HighResponsiveness
|   |   |           MocapInputProfile_Responsive.asset
|   |   |           MocapSprintInputProfile_Responsive.asset
|   |   |           
|   |   +---Legacy
|   |   |       MotionMatchConfigModule.asset
|   |   |       MxMAnimData.asset
|   |   |       MxMAnimDataAI.asset
|   |   |       MxMDemo_RunAnims.asset
|   |   |       MxMDemo_RunAnims_AI.asset
|   |   |       MxMDemo_WalkAnims_AI.asset
|   |   |       MxMPreProcessData.asset
|   |   |       MxMPreProcessDataAI.asset
|   |   |       Test.asset
|   |   |       
|   |   +---MotionMatching
|   |   |   |   MocapAnimData.asset
|   |   |   |   MocapAnimDataAI.asset
|   |   |   |   MocapPreProcessData.asset
|   |   |   |   MocapPreProcessData_AI.asset
|   |   |   |   
|   |   |   \---Modules
|   |   |       |   EventNamingModule.asset
|   |   |       |   GeneralWarpModule.asset
|   |   |       |   MocapMatchConfigModule.asset
|   |   |       |   TagNamingModule.asset
|   |   |       |   
|   |   |       +---AnimModules
|   |   |       |       MocapDemo_OtherAnims.asset
|   |   |       |       MocapDemo_RunAnims.asset
|   |   |       |       MocapDemo_RunAnims_AI.asset
|   |   |       |       MocapDemo_SprintAnims.asset
|   |   |       |       MocapDemo_StrafeAnims.asset
|   |   |       |       MocapDemo_WalkAnims.asset
|   |   |       |       MxMDemo_ParkourAnims.asset
|   |   |       |       
|   |   |       \---Calibrations
|   |   |               MxMCalibrationModule_Balanced.asset
|   |   |               MxMCalibrationModule_HighQuality.asset
|   |   |               MxMCalibrationModule_HighResponsiveness.asset
|   |   |               
|   |   \---VaultData
|   |           VaultDef_VaultOff.asset
|   |           VaultDef_VaultOff_High.asset
|   |           VaultDef_VaultOff_Med.asset
|   |           VaultDef_VaultOverLong.asset
|   |           VaultDef_VaultOverShort_FromStanding.asset
|   |           VaultDef_VaultUp.asset
|   |           VaultDef_VaultUp_High.asset
|   |           VaultDef_VaultUp_Med.asset
|   |           VaultDetectionConfig.asset
|   |           
|   +---Materials
|   |       AITarget.mat
|   |       AITarget2.mat
|   |       Ground.mat
|   |       Obstacle.mat
|   |       ProtoGridOrange.mat
|   |       Robot_Color.mat
|   |       Wall.mat
|   |       
|   +---Model
|   |       Robot Kyle.fbx
|   |       
|   +---Prefabs
|   |       CM_ThirdPerson.prefab
|   |       Main Camera.prefab
|   |       Robot Kyle.prefab
|   |       Robot Kyle_AI (WithRootMotion).prefab
|   |       Robot Kyle_AI.prefab
|   |       Robot Kyle_MOCAP_Balanced.prefab
|   |       Robot Kyle_MOCAP_HighResponsiveness.prefab
|   |       Robot Kyle_MOCAP_Quality.prefab
|   |       Robot Kyle_StressTest.prefab
|   |       RobotKyle (AlternativeHierarchy).prefab
|   |       RobotKyle_Decouple.prefab
|   |       
|   +---Scenes
|   |   |   MxMDemo.unity
|   |   |   MxMDemoSettings.lighting
|   |   |   MxMDemo_StressTest.unity
|   |   |   
|   |   \---MxMDemo
|   |           LightingData.asset
|   |           NavMesh.asset
|   |           ReflectionProbe-0.exr
|   |           
|   \---Textures
|           OrangeProtoGrid.png
|           Robot_Color.tga
|           Robot_Normal.tga
|           
\---Prefabs
        MxMSearchManager.prefab

Python

Python Version: 3.12.10

We recommend using a dedicated virtual environment so that Unity can call a Python executable with the correct dependencies installed.

Create and activate a .venv, then install all requirements:

python -m venv .venv
# On Windows
.\.venv\Scripts\activate
# On macOS / Linux
source .venv/bin/activate

pip install -r requirements.txt

Vision-Language Models (VLM)

A.D.A.M.O. requires a VLM with native tool/function calling and multimodal input (image + text). The system is designed to support multiple models and providers and can be easily extended to new backends through the LLM manager.

Model and provider management is handled by:

adam_python/agentic_forge/llm_manager.py  (class: LlmManager)

LlmManager loads its configuration from:

adam_python/agentic_forge/configs/llm_manager_config.yaml

This configuration file defines, for each provider and model:

Which provider to use (e.g., OpenAI, OpenRouter, Ollama)
The model identifiers and default parameters
Provider-specific options such as base URLs and timeouts

Currently supported providers:
- OpenAI
- OpenRouter
- Ollama (local server)

Provider API keys are configured via:

adam_python/agentic_forge/configs/provider_apy_key.example.yaml

Remove the .example suffix from the file name, then edit the file and set the corresponding fields for OpenAI and OpenRouter.

For local models served through Ollama, update the corresponding provider section in llm_manager_config.yaml, setting the appropriate base_url for the Ollama server and specifying the desired model names. LlmManager uses these configuration files to instantiate and route calls to different models in a unified way, so adding or tweaking models and providers only requires editing the YAML configuration rather than changing the core code.
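To make the idea of config-driven routing concrete, here is a minimal sketch of how a model code could be resolved against provider settings. It does not reproduce LlmManager or the actual schema of llm_manager_config.yaml: the `CONFIG` dictionary stands in for a parsed YAML file, and its keys, the `LOCAL` code, the timeouts, and the `resolve_model` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the parsed YAML; the real file's layout may differ.
CONFIG = {
    "providers": {
        "openai": {"base_url": "https://api.openai.com/v1", "timeout": 60},
        "ollama": {"base_url": "http://localhost:11434", "timeout": 120},
    },
    "models": {
        "G4O": {"provider": "openai", "model_id": "gpt-4o"},
        "LOCAL": {"provider": "ollama", "model_id": "llava"},  # assumed example entry
    },
}


@dataclass(frozen=True)
class ModelSpec:
    """Everything a client needs in order to call one model."""
    code: str
    provider: str
    model_id: str
    base_url: str
    timeout: int


def resolve_model(code, config=CONFIG):
    """Merge a model entry with the options of the provider it points to."""
    try:
        model = config["models"][code]
    except KeyError:
        raise ValueError(f"unknown model code: {code!r}") from None
    provider = config["providers"][model["provider"]]
    return ModelSpec(
        code=code,
        provider=model["provider"],
        model_id=model["model_id"],
        base_url=provider["base_url"],
        timeout=provider["timeout"],
    )


spec = resolve_model("LOCAL")
```

With this kind of lookup, pointing a model code at a different server is a one-line change in the configuration data, and the calling code is untouched.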

How to Use A.D.A.M.O.

A.D.A.M.O. can be used in two different modes depending on whether you have access to the Unity commercial packages required by the Runtime Engine.

1. Unity Editor Mode (requires commercial assets)

This mode uses the full Unity project (adam_unity/) and allows:

interactive debugging
visualization inside the 3D scene
custom task configuration from the Editor

However, this mode requires the following Unity packages, which are either paid or free under a license agreement:

Motion Matching for Unity (free)
Final IK (paid)
In-Game Debug Console (free)

If you do not own these assets, the Editor mode cannot be used.

2. Build Mode (recommended for most users)

This mode uses the preconfigured Unity build and does not require any commercial packages.
It is the recommended setup for:

running the entire benchmark
reproducing all results from the paper
parallel execution on multi-core systems
headless evaluation without Unity Editor

All batch experiments in the paper were executed through the Build Mode using:

adam_experiments/run_experiments.py

If you do not have the commercial Unity packages, skip the Editor instructions section and go directly to the Build Mode section.

Running Tasks from the Unity Editor

A.D.A.M.O. includes an integrated experimentation scene for launching custom tasks directly from the Unity Editor.
This workflow is useful for debugging, rapid prototyping, visualization, and small-scale experimentation.

1. Open the Experiment Scene

Inside the Unity project, open:

adam_unity/Assets/Scenes/_Experiments.unity

This scene is intentionally minimal and contains a single manager object, Experiments_Prefab, which holds the BenchmarkManager component. This component orchestrates experiment setup, execution, and communication with the Python Cognitive Server.

Figure 3 — The Benchmark Manager component inside the _Experiments scene.

2. Benchmark Manager Overview

The Benchmark Manager automatically:

starts the Python Cognitive Server
sends configuration parameters (scene, task, model, repetitions, etc.)
runs episodes in Unity
logs and stores results under BenchmarkData/{ExperimentName}

Below we detail all configuration fields.

2.1 Configs Section

These settings control how runs are executed:

Debug Mode: Enables step-by-step execution. After each repetition, Unity pauses and displays a dialog asking whether to continue. Useful for inspecting logs or intermediate agent states.
Use Custom Run: When enabled, Unity executes a single run using the parameters in Current Run. When disabled, runs are loaded from a CSV file (see CSV Relative Path).
Run Python Server: Automatically launches the Cognitive Server on localhost:50000. The server is implemented in FastAPI (adam_python/adam_agent_server.py).
CSV Relative Path: Path to a CSV file describing a batch of runs. Example file:

adam_unity/Assets/BenchmarkData/run_static.csv

Disable Use Custom Run to let the CSV drive all experiment parameters.

Experiment Name: Name of the experiment. Results are saved under:

adam_unity/Assets/BenchmarkData/{Experiment Name}/

Time Multiplier: Speeds up Unity execution (animations, physics). Useful for accelerating experiments while keeping logic consistent.
Timeout Seconds: Maximum time Unity waits for a reply from the Cognitive Server before terminating the run.

2.2 Runs Data — Current Run

These fields define a single episode configuration and are used only if Use Custom Run is enabled.

Scene: Selects the environment, either S1 (tabletop) or S2 (living room).
Task Id: Selects the task to execute. Must correspond to a valid entry in the benchmark (see supplementary material).
Task Prompt: Natural-language instruction given to the agent.
Solution Checker: Chooses the deterministic checker for evaluating task completion. Each task has a dedicated checker (see supplementary material for the mapping).
Graphical Resolution & Graphical Lighting: Deprecated (do not use).
Model: Selects the VLM backend, either G4O (GPT-4o-vision), S35 (Claude Sonnet-3.5), or internal codes mapped in llm_manager_config.yaml.
Object Identifier: Labeling scheme for object tags, one of:
- SEM → semantic (ObjectName_InstanceID)
- OPAQ → numeric opaque (InstanceID only)
Coordinates Type: Deprecated (do not use).
Repetitions: Number of independent repetitions for the selected run.

All remaining fields are runtime or debug fields used internally — do not edit them.
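The two Object Identifier schemes differ only in how much an object tag reveals. As a small illustration, the function below builds a tag under each scheme; `object_label` and the sample object are hypothetical and do not correspond to the Unity-side labeling code.

```python
def object_label(object_name: str, instance_id: int, scheme: str) -> str:
    """Build an object tag under the SEM or OPAQ labeling scheme."""
    if scheme == "SEM":
        # Semantic: the tag exposes the object's name and its instance ID.
        return f"{object_name}_{instance_id}"
    if scheme == "OPAQ":
        # Opaque: only the numeric instance ID, so the name gives nothing away.
        return str(instance_id)
    raise ValueError(f"unknown labeling scheme: {scheme!r}")


sem_tag = object_label("Mug", 12, "SEM")
opaque_tag = object_label("Mug", 12, "OPAQ")
```

Under OPAQ the agent has to identify objects from the image rather than from the tag text, which is presumably why both schemes are offered.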

2.3 Launching the Run

Open the _Experiments scene
Configure Configs and Current Run
Press Play in Unity
Unity will:
1. spawn the agent
2. start the Cognitive Server
3. run the task
4. save logs and results under BenchmarkData/{ExperimentName}
5. loop for the specified number of repetitions

For batch execution, disable Use Custom Run and set the CSV path to a valid experiment file.

Running Batch Experiments from the Build

In addition to running single tasks from the Unity Editor, A.D.A.M.O. supports fully automated batch experiments using a Unity build and the run_experiments.py script.
This mode is intended for large-scale evaluation, parallel execution, and automatic result aggregation.

0. Setup

Extract the contents of the .zip from the GitHub release into the Build directory (create it if missing):

adam_unity/Build/

1. Benchmark Data Folder and CSV Specification

Batch runs use a shared benchmark data directory:

adam_unity/Build/ADAMO_Build_data/BenchmarkData

In this folder you must provide a master CSV:

runs.csv (runs_static.csv provides an example of the table)

Each row in runs.csv specifies a single episode configuration (scene, task, model, labeling scheme, repetitions, etc.), using the same fields as the Current Run section of the Unity BenchmarkManager. When run_experiments.py is executed, Unity will:

read the episode definitions from runs.csv
execute them through the build
write logs back into the same BenchmarkData directory, organized according to the parameters provided on the command line

Final aggregated results (metrics and plots) are computed automatically and stored under:

adam_experiments/experiment/{exp-name}/

where {exp-name} is the experiment name you choose when invoking run_experiments.py.

2. AdamConfig (internal runtime configuration)

The Python side includes an AdamConfig model that controls how the Cognitive Server and Unity tool server communicate:

agent_host / agent_port: HTTP endpoint of the Cognitive Server (FastAPI). This is the address Unity uses to send observations and receive actions.
tool_host / tool_port: HTTP endpoint of the Unity Action Server, used by the Cognitive Server to call tools (Walk, Look, Pick, Drop).
msg_window_size: Size of the rolling buffer in the MSTM (number of messages kept in context). Note: a tool call and its corresponding tool response are treated as a single atomic message in this window.

These parameters are not exposed directly via the run_experiments.py CLI; they are set when Unity spawns the Cognitive Server. If you need to change them, you must do so in the Python code by editing the defaults in AdamConfig (and ensuring Unity is configured accordingly), rather than through command-line arguments.
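The fields above, and in particular the atomic treatment of tool calls in the rolling buffer, can be sketched as follows. This is not the repository's AdamConfig: it is written as a plain dataclass for self-containment, and the class names, the tool port, and the default window size are assumptions. Only the agent port 50000 comes from this README.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class AdamConfigSketch:
    """Illustrative stand-in for AdamConfig; defaults are assumptions unless noted."""
    agent_host: str = "localhost"
    agent_port: int = 50000   # Cognitive Server port mentioned in this README
    tool_host: str = "localhost"
    tool_port: int = 50001    # hypothetical value, not stated in this README
    msg_window_size: int = 4  # hypothetical default


class RollingBuffer:
    """Keeps the last N context items; a tool call and its response share one slot."""

    def __init__(self, window_size: int):
        self._slots = deque(maxlen=window_size)  # oldest slot is dropped automatically

    def add_message(self, role: str, content: str) -> None:
        self._slots.append([(role, content)])

    def add_tool_exchange(self, call: str, response: str) -> None:
        # Stored together so eviction never separates a call from its response.
        self._slots.append([("tool_call", call), ("tool_response", response)])

    def flatten(self) -> list:
        """Messages in order, as they would be placed into the prompt."""
        return [msg for slot in self._slots for msg in slot]


config = AdamConfigSketch(msg_window_size=2)
buffer = RollingBuffer(config.msg_window_size)
buffer.add_message("user", "Pick up the cup")
buffer.add_tool_exchange("Walk(Cup_1)", "ok")
buffer.add_tool_exchange("Pick(Cup_1)", "ok")
messages = buffer.flatten()
```

With a window of 2, the third item evicts the initial user message, and both tool exchanges survive intact as four messages in two slots.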

3. Running Batch Experiments with `run_experiments.py`

The main entry point for batch execution is:

./run_experiments.py

Internally, this script:

Reads the master runs.csv from the benchmark root directory.
Splits it into smaller batch CSVs (if needed) under a batch root directory.
Launches multiple Unity build processes in parallel, each consuming one batch CSV.
Collects logs and metrics from BenchmarkData.
Aggregates episode results, computes summary tables and plots, and writes them under adam_experiments/experiment/{exp-name}.
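The splitting step can be illustrated with the standard csv module. This is a sketch under stated assumptions rather than the logic inside run_experiments.py: `split_runs`, the round-robin assignment, the `batch_{i}.csv` naming, and the column names in the demo are all hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def split_runs(master_csv, batch_dir, n_batches):
    """Distribute the rows of a master CSV round-robin over up to n_batches files."""
    with open(master_csv, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)

    batch_dir = Path(batch_dir)
    batch_dir.mkdir(parents=True, exist_ok=True)

    paths = []
    for i in range(min(n_batches, len(rows))):  # never create empty batch files
        path = batch_dir / f"batch_{i}.csv"
        with open(path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()                # every batch repeats the header
            writer.writerows(rows[i::n_batches])
        paths.append(path)
    return paths


def count_rows(path):
    """Number of data rows (header excluded) in a CSV file."""
    with open(path, newline="") as f:
        return sum(1 for _ in csv.DictReader(f))


# Demo on a throwaway directory with assumed column names.
with tempfile.TemporaryDirectory() as tmp:
    master = Path(tmp) / "runs.csv"
    with open(master, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["scene", "task_id", "model"])
        for i in range(5):
            writer.writerow(["S1", f"T{i}", "G4O"])
    batches = split_runs(master, Path(tmp) / "batches", n_batches=2)
    batch_sizes = [count_rows(p) for p in batches]
```

Each batch file is a complete CSV with its own header, so a Unity process can consume it exactly as it would the master file.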

A typical invocation:

python run_experiments.py \
  --parallelism 4 \
  --exp-name cd_benchmark_g4o_sem \
  --timescale 4

4. Command-line Arguments

The script exposes the following CLI parameters (defaults are defined inside run_experiments.py):

Path-related arguments (--exe, --python-exe, --csv, --benchmark-root, --batch-root, --exp-dir) are usually left at their defaults, which are aligned with the project layout.

The most useful parameters to tune from the command line are:

--parallelism (-p): Number of Unity processes to run in parallel. Parallelism is applied over episodes, not over repetitions of the same episode.
--exp-name: Name of the experiment; determines the folder under adam_experiments/experiment/{exp-name} where final metrics and plots are stored.
--timescale: Time multiplier for Unity. Values > 1 speed up simulation (shorter wall-clock time), while 1 corresponds to real time.
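For readers who want to wrap or script these flags, the snippet below shows how the three options could be declared with argparse. It is a hedged sketch, not the parser in run_experiments.py: the flag names come from this README, while `build_parser`, the types, and the defaults are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser covering only the three commonly tuned flags described above."""
    parser = argparse.ArgumentParser(description="Sketch of the batch-run CLI")
    parser.add_argument(
        "-p", "--parallelism", type=int, default=1,      # default is an assumption
        help="number of Unity processes to run in parallel (over episodes)",
    )
    parser.add_argument(
        "--exp-name", required=True,                     # assumed to be mandatory here
        help="experiment name; selects the output folder for metrics and plots",
    )
    parser.add_argument(
        "--timescale", type=float, default=1.0,          # 1.0 corresponds to real time
        help="Unity time multiplier; values above 1 shorten wall-clock time",
    )
    return parser


# Parse the same values as the typical invocation shown earlier.
args = build_parser().parse_args(
    ["--parallelism", "4", "--exp-name", "cd_benchmark_g4o_sem", "--timescale", "4"]
)
```

Note that argparse exposes `--exp-name` as the attribute `args.exp_name`, with the hyphen converted to an underscore.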
