MIPI - Physical Interface for MIDI Files

MIPI

This package implements 3 standalone nodes that work together to control a UR3e to play the piano. Below is an overview of the nodes and their functionality, as well as some details on the code structure and behaviour.

Main Page Overview

RoboticsStudio2-JAMC Repository

Node and Library Descriptions

UI - Human 2 Robot Interaction

UI - Usage Guide

The PianoUI serves as the primary interaction layer for the JAMC system. It is implemented as a dual-inheritance Hybrid Node, combining the event-driven nature of Qt 5/6 with the distributed messaging of ROS 2 Humble.

Layout & User Experience

The interface is designed with a high-contrast "Dark Mode" aesthetic (Visual Studio Code style) to ensure readability in lab environments. The layout is divided into four functional zones:

  • Vision Module: A centralized 400x300 viewport displaying the live camera feed from the UR3e.
  • Configuration Zone: Includes a dynamic instrument/channel selector (generated at runtime), a MIDI file browser, and a speed/velocity slider.
  • Playback Controls: Standard transport controls (Play/Pause, Direction Toggle) and a track progress slider.
  • Status Bar: Real-time feedback on system readiness, current playback speed, and song time.

Technical Implementation

Hybrid Execution Model: The class inherits from both QWidget and rclcpp::Node. Spinning the ROS 2 executor on the Qt main thread would block the event loop and freeze the UI, so the node is designed to handle its ROS callbacks asynchronously.

Thread-Safe Image Processing: One of the most critical features of the UI is the image_callback. Incoming camera frames from the /camera/camera/color/image_raw topic are processed in the ROS background thread and safely handed over to the Qt thread:

  • Conversion: Raw sensor_msgs data is mapped to QImage::Format_RGB888.
  • Transformation: Images are scaled using Qt::SmoothTransformation to maintain visual quality.
  • Invocation: The UI update is triggered via QMetaObject::invokeMethod using a Qt::QueuedConnection, ensuring the GUI remains responsive and thread-safe.
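The handoff in the last step can be illustrated without Qt or ROS. With a queued connection, the call is posted to the receiving thread's event loop and executed there, so the widget is only ever touched by the GUI thread. The sketch below is a standard-library analogue of that idea, not the PianoUI code: EventQueue, post, drain and run_handoff_demo are hypothetical names standing in for the Qt event loop and the image callback.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for a GUI event loop: any thread may post work,
// but only the thread that calls drain() ever executes it.
class EventQueue {
public:
    // Called from a background thread (e.g. an image callback).
    void post(std::function<void()> task) {
        std::lock_guard<std::mutex> lock(mutex_);
        tasks_.push(std::move(task));
    }

    // Called from the owning (GUI) thread. Runs every pending task in
    // posting order and returns how many were executed.
    int drain() {
        int executed = 0;
        for (;;) {
            std::function<void()> task;
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (tasks_.empty()) break;
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so a task may post again
            ++executed;
        }
        return executed;
    }

private:
    std::mutex mutex_;
    std::queue<std::function<void()>> tasks_;
};

// Simulates a camera thread producing frame_count frames. Each frame is
// handed to the owner thread as a queued task that updates last_frame_id.
int run_handoff_demo(int frame_count) {
    EventQueue queue;
    int last_frame_id = -1;  // only written by tasks executed in drain()

    std::thread camera([&queue, &last_frame_id, frame_count] {
        for (int id = 0; id < frame_count; ++id) {
            queue.post([&last_frame_id, id] { last_frame_id = id; });
        }
    });
    camera.join();  // producer finished; all tasks are queued
    queue.drain();  // the "GUI thread" applies the updates in order
    return last_frame_id;
}
```

Calling run_handoff_demo(5) queues five updates on the worker thread and applies them on the calling thread. In the real UI, the Qt event loop plays the role of drain().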

Dynamic Control Logic: The UI maintains internal state machines for:

  • Playback State: Tracks is_playing to toggle between ▶ (Play) and ▐▐ (Pause) icons.
  • Directionality: Manages the toggle between Forward and Reverse trajectories.
  • Resource Management: The force_pause_and_reset() method ensures that the robot comes to a safe halt before new MIDI data is loaded into the controller.
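As a rough illustration of the state described above, the following sketch models the playback flags as a small struct. Only is_playing and force_pause_and_reset() are names taken from this page; Direction, progress, toggle_play, toggle_direction and icon are hypothetical, and the choice to reset progress but keep the direction is an assumption.

```cpp
#include <string>

// Hypothetical playback direction flag for the Forward/Reverse toggle.
enum class Direction { Forward, Reverse };

// Illustrative model of the UI's playback state, not the PianoUI class.
struct PlaybackState {
    bool is_playing = false;
    Direction direction = Direction::Forward;
    double progress = 0.0;  // assumed: fraction of the song played, 0..1

    // Flips between playing and paused.
    void toggle_play() { is_playing = !is_playing; }

    // Flips between Forward and Reverse.
    void toggle_direction() {
        direction = (direction == Direction::Forward) ? Direction::Reverse
                                                      : Direction::Forward;
    }

    // Icon the button shows for the current state, as described above:
    // the pause icon while playing, the play icon while paused.
    std::string icon() const { return is_playing ? "▐▐" : "▶"; }

    // Stops playback and rewinds before new MIDI data is loaded.
    // Assumption: direction is left untouched by a reset.
    void force_pause_and_reset() {
        is_playing = false;
        progress = 0.0;
    }
};
```

In the real node, force_pause_and_reset() would also need to tell the controller to halt, presumably through the playback service; that call is omitted here.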

ROS 2 Interface Mapping

UI Element            | ROS 2 Object      | Interface Type
Camera View           | _camera_sub       | sensor_msgs::msg::Image
Play/Pause Button     | playback_client   | jamc::srv::Func
Forward/Rewind Button | playback_client   | jamc::srv::Func
Debug Button          | playback_client   | jamc::srv::Func
Speed Slider          | time_scale_client | jamc::srv::TimeScale
File/Track Selector   | channel_client    | jamc::srv::Load

MIDI Processing - Parsing and converting MIDI files into a format our system can read

MIDI Processing - Usage Guide

MidiProcessor is a standalone C++ utility class designed for reading, processing, filtering, and serializing MIDI data into a custom JSON-based .mipi format. It uses the midifile library for MIDI parsing and nlohmann::json for data storage and retrieval.

The class extracts musical information from MIDI files on a per-channel basis, including instruments, notes, note timings, note durations, and song length. It also converts MIDI note data into a simplified keyboard mapping system suitable for playback, visualization, or hardware interaction.

Features

  • Loads and parses standard MIDI files
  • Extracts:
    • MIDI channels
    • Instrument/program assignments
    • Note pitches
    • Note timestamps
    • Note durations
    • Song duration
  • Applies preprocessing filters to simplify note data
  • Maps notes to a constrained keyboard range
  • Saves processed data into reusable .mipi JSON files
  • Reloads previously processed .mipi files
  • Provides getter methods for all processed data structures

Pipeline

The processMidiFile() function performs the complete processing workflow in the following order:

  1. Open and analyze the MIDI file
  2. Extract instruments and channels
  3. Extract notes, timestamps, and durations
  4. Determine total song duration
  5. Apply note filtering:

    5.1. Chord filtering

    5.2. Trill filtering

    5.3. Overlapping note filtering

    5.4. Trim overlapping note durations

  6. Assign notes to keyboard ranges
  7. Save processed data to a .mipi file

Filtering

The class includes several preprocessing filters to simplify dense MIDI data:

Chord Filter

  • Detects notes played simultaneously
  • Keeps either the highest or lowest note depending on instrument type
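A minimal sketch of this kind of chord reduction is shown below. It is not the MidiProcessor implementation: the Note struct, the filter_chords name, the keep_highest flag and the onset tolerance are all assumptions made for illustration. Notes whose onsets fall within the tolerance of the first note of a group are treated as one chord.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical note record used only for this illustration.
struct Note {
    int pitch;        // MIDI note number
    double start;     // onset in seconds
    double duration;  // length in seconds
};

// Collapses each group of simultaneous notes to a single note.
// keep_highest selects the top note of the chord, otherwise the bottom.
std::vector<Note> filter_chords(std::vector<Note> notes, bool keep_highest,
                                double tolerance = 1e-3) {
    std::stable_sort(notes.begin(), notes.end(),
                     [](const Note& a, const Note& b) { return a.start < b.start; });

    std::vector<Note> result;
    double chord_start = 0.0;  // onset of the first note in the current chord
    for (const Note& note : notes) {
        if (!result.empty() && note.start - chord_start <= tolerance) {
            // Same chord as the last kept note: keep whichever pitch wins.
            const bool wins = keep_highest ? note.pitch > result.back().pitch
                                           : note.pitch < result.back().pitch;
            if (wins) result.back() = note;
        } else {
            result.push_back(note);
            chord_start = note.start;
        }
    }
    return result;
}
```

For a C major triad at time 0 followed by a single D at time 1, keep_highest = true leaves G and D, while keep_highest = false leaves C and D.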

Trill Filter

  • Removes rapidly repeated notes occurring within a minimum time gap
  • Preserves only the first note
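One possible reading of this rule is sketched below, under stated assumptions: the input is already sorted by onset, each note is compared with the note immediately before it (whether or not that note was kept), and pitch is ignored. A rapid run therefore collapses to its first note. The Note struct, filter_trills and min_gap are hypothetical names, and the real filter may define "repeated" differently.

```cpp
#include <vector>

// Hypothetical note record used only for this illustration.
struct Note {
    int pitch;        // MIDI note number
    double start;     // onset in seconds
    double duration;  // length in seconds
};

// Drops every note that starts less than min_gap seconds after the note
// before it. Assumes notes are sorted by onset.
std::vector<Note> filter_trills(const std::vector<Note>& notes, double min_gap) {
    std::vector<Note> result;
    bool have_previous = false;
    double previous_start = 0.0;
    for (const Note& note : notes) {
        if (!have_previous || note.start - previous_start >= min_gap) {
            result.push_back(note);  // first note of a run, or an isolated note
        }
        previous_start = note.start;  // compare against the raw input, not the output
        have_previous = true;
    }
    return result;
}
```

With a 0.1 s gap, a C-D-C-D trill at 50 ms spacing followed by a note one second in reduces to the first C and the later note.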

Overlapping Note Filter

  • Removes notes completely contained within the duration of another note
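Containment can be checked with a single sweep once the notes are sorted by onset. The sketch below is an illustration under assumptions, not the MidiProcessor code: Note and filter_contained are hypothetical names, and a note counts as contained when another note starts no later and ends no earlier.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical note record used only for this illustration.
struct Note {
    int pitch;        // MIDI note number
    double start;     // onset in seconds
    double duration;  // length in seconds
};

// Removes every note whose time span lies entirely inside another note's span.
std::vector<Note> filter_contained(std::vector<Note> notes) {
    // Sort by onset; for equal onsets put the longer note first so that it
    // is the one that survives.
    std::sort(notes.begin(), notes.end(), [](const Note& a, const Note& b) {
        if (a.start != b.start) return a.start < b.start;
        return a.start + a.duration > b.start + b.duration;
    });

    std::vector<Note> result;
    double max_end = 0.0;  // latest end time among the kept notes
    for (const Note& note : notes) {
        const double end = note.start + note.duration;
        if (!result.empty() && end <= max_end) continue;  // fully covered
        result.push_back(note);
        max_end = end;
    }
    return result;
}
```

A note that merely overlaps the tail of an earlier note is kept here; shortening such overlaps is the job of the duration trimming step.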

Duration Trimming

  • Prevents note durations from overlapping subsequent notes
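Trimming can be sketched as clamping each duration to the gap before the next onset. This assumes the notes are already sorted by onset and monophonic after the earlier filters; Note and trim_durations are hypothetical names, not the MidiProcessor API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical note record used only for this illustration.
struct Note {
    int pitch;        // MIDI note number
    double start;     // onset in seconds
    double duration;  // length in seconds
};

// Shortens each note so it ends no later than the next note begins.
// Assumes notes are sorted by onset.
void trim_durations(std::vector<Note>& notes) {
    for (std::size_t i = 0; i + 1 < notes.size(); ++i) {
        const double gap = notes[i + 1].start - notes[i].start;
        if (notes[i].duration > gap) notes[i].duration = gap;
    }
}
```

The last note is never trimmed because nothing follows it.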

Keyboard Mapping System

After filtering, notes are mapped onto a 37-key keyboard range centered around the average pitch of each channel. Notes outside the valid range are octave-shifted until they fit within bounds.
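The mapping can be sketched as follows. This is an illustration under assumptions rather than the MidiProcessor code: map_to_keyboard and kKeyCount are hypothetical names, the centre is the rounded mean pitch, the window spans 18 semitones either side of it, and the result is a key index from 0 (lowest key) to 36.

```cpp
#include <cmath>
#include <numeric>
#include <vector>

constexpr int kKeyCount = 37;  // size of the playable keyboard window

// Maps MIDI pitches of one channel onto key indices 0..36.
std::vector<int> map_to_keyboard(const std::vector<int>& pitches) {
    if (pitches.empty()) return {};

    // Centre the window on the (rounded) average pitch of the channel.
    const double mean =
        std::accumulate(pitches.begin(), pitches.end(), 0.0) / pitches.size();
    const int center = static_cast<int>(std::lround(mean));
    const int low = center - kKeyCount / 2;  // 18 semitones below the centre
    const int high = low + kKeyCount - 1;    // 18 semitones above the centre

    std::vector<int> keys;
    keys.reserve(pitches.size());
    for (int pitch : pitches) {
        // Shift by whole octaves until the note fits. The window is wider
        // than one octave, so both loops terminate inside the range.
        while (pitch < low) pitch += 12;
        while (pitch > high) pitch -= 12;
        keys.push_back(pitch - low);
    }
    return keys;
}
```

For a channel averaging middle C (pitch 60), the window covers pitches 42 to 78, so pitch 100 is shifted down two octaves to 76 and pitch 20 is shifted up two octaves to 44.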

Controller - Control Engine for UR3e Piano Playing

Controller - Usage Guide

The Controller class in the Control namespace is responsible for all direct ROS 2 interaction with the robot's UR_DRIVER and MoveIt Servo libraries. It exposes no public members and must therefore be controlled through its ROS 2 interfaces (see the Usage Guide above).

Behaviour

Before starting the node, make sure you have followed the Installation, Physical Robot Setup and Software Setup sections of the Usage Guide. Without that infrastructure running, the node cannot function. When the node is spun up in the ROS 2 environment, the following happens:

  1. The node will enter a startup loop where it will:
    • Attempt to read /joint_states message from the robot
    • Attempt to send direct JointJog Trajectory messages to the robot to bring it to the starting position
    • For best results, ensure the robot is in the default "Home" position before starting the node
  2. Once startup is complete, the node enters the main loop, where it waits for a song to be loaded and the play button to be pressed. In this state, the node will:
    • Read the /key_positions topic to get the current position of the keyboard keys
    • Expose the 4 control services mentioned in the Usage Guide for manipulating song playback
    • Once the play button is pressed:
      • The node will read the MIDI data from the loaded song file
      • Iterate through the note array and:
        • For each note, attempt to align the robot end effector with the key position
        • If the key is not visible, it will travel the length of the keyboard to find it
        • Once aligned, it will wait until the key is due to be pressed (based on the timing of the song), and then press the key once that time has elapsed
      • It will iterate through the entire song until it is done
  3. On shutdown, the robot will attempt to return to the "Home" position, using the same method as the startup process, before all nodes are shut down.
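The timing part of the playback loop can be sketched as computing how long to wait before each key press. Everything in this example is an assumption made for illustration: TimedNote and waits_between_presses are hypothetical names, and the time scale is taken to be a speed multiplier (values above 1 play faster). The actual semantics of the TimeScale service are defined in the source code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record: which key to press and when, in song time.
struct TimedNote {
    int key;       // index into the detected key positions
    double start;  // onset in seconds of song time
};

// Returns, for each note, the wall-clock seconds to wait after the previous
// press (or after playback start for the first note) before pressing it.
// Assumption: time_scale > 1 plays faster, time_scale < 1 plays slower.
std::vector<double> waits_between_presses(const std::vector<TimedNote>& notes,
                                          double time_scale) {
    assert(time_scale > 0.0);
    std::vector<double> waits;
    waits.reserve(notes.size());
    double previous_start = 0.0;
    for (const TimedNote& note : notes) {
        waits.push_back((note.start - previous_start) / time_scale);
        previous_start = note.start;
    }
    return waits;
}
```

In the real controller, the alignment move happens during this wait, so the time spent travelling to the key has to be subtracted from it.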

Perception & Computer Vision - Lets the robot see

Perception - Usage Guide

The vision code utilises the Intel RealSense camera to detect the two AprilTags placed on either side of the piano, along with each black and white key on the keyboard. Every key is individually segmented, allowing its midpoint to be marked and displayed. At present, the keys are mapped from the right to the left side of the keyboard. The next step is to identify the coloured dots on the keys, which will allow each key to be recognised more accurately and its position mapped using these coloured stickers.

External Libraries - JSON, MIDI Parsing, etc

Acknowledgements of External Library Usage:

  • Qt 5/6: Used for the UI implementation
  • rclcpp: The C++ client library for ROS 2
  • OpenCV: Utilized for image processing tasks in the perception module
  • midifile: A C++ library for parsing MIDI files
  • nlohmann/json: A popular C++ library for JSON parsing and serialisation

Code Behaviour and Usage

Build Instructions

To build the package, please refer to BUILD.

Launch Files - Node Startup and Execution (Usage)

.launch.py

- This launch file starts all 3 nodes together and is the recommended way to start the system. It also starts all dependencies, such as the UR_driver, MoveIt Servo, and the RealSense camera program.

_test.launch.py

- Starts the Controller node on its own, without any dependencies.

_test.launch.py

- Starts the UI node on its own, without any dependencies.

Refer to the source code for detailed topic and service names, and to adjust parameters as needed for your environment.