UniUSNet

A Promptable Framework for Universal Ultrasound Disease Prediction and Tissue Segmentation

Medical AI, Deep Learning, Ultrasound Imaging

Project Leaders

Zehui Lin

Faculty of Applied Sciences
Macao Polytechnic University


Co-Authors:

Zhuoneng Zhang, Xindi Hu, Zhifan Gao, Xin Yang,
Yue Sun, Dong Ni, Tao Tan

Abstract

Ultrasound is widely used in clinical practice due to its affordability, portability, and safety. However, current AI research often overlooks combined disease prediction and tissue segmentation. We propose UniUSNet, a universal framework for ultrasound image classification and segmentation. This model handles various ultrasound types, anatomical positions, and input formats, excelling in both segmentation and classification tasks.

Trained on a comprehensive dataset with over 9.7K annotations from 7 distinct anatomical positions, our model matches state-of-the-art performance and surpasses single-dataset and ablated models. Zero-shot and fine-tuning experiments show strong generalization and adaptability with minimal fine-tuning.

Key Features

  • Universal Framework - Handles both disease prediction and tissue segmentation tasks
  • Multi-Anatomical Support - Works across 7 different anatomical positions
  • Versatile Input Handling - Supports various ultrasound types and input formats
  • State-of-the-Art Performance - Matches or exceeds current best methods
  • Strong Generalization - Excellent zero-shot and fine-tuning capabilities
  • Parameter Efficient - 66% fewer parameters than some comparable models

The BroadUS-9.7K Dataset

The BroadUS-9.7K dataset contains 9.7K annotations of 6.9K ultrasound images from 7 different anatomical positions.

  • 9.7K+ annotations
  • 6.9K ultrasound images
  • 7 anatomical positions

Dataset Breakdown

The dataset includes diverse annotations covering various imaging natures, anatomical positions, tasks (segmentation and classification), and input types. Note that breast images can contain both segmentation and classification labels, and images with segmentation labels can form three different input types.
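
The annotation count exceeds the image count because one image can carry more than one label. The following is a minimal sketch of that bookkeeping, not the dataset's actual schema: the `Annotation` class, its field names, and the toy records are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One labelled view of an ultrasound image (field names are illustrative)."""
    image_id: str
    position: str  # anatomical position, e.g. "breast"
    task: str      # "segmentation" or "classification"


def summarize(annotations):
    """Return (annotation count, distinct image count, per-task counts)."""
    images = {a.image_id for a in annotations}
    per_task = Counter(a.task for a in annotations)
    return len(annotations), len(images), dict(per_task)


# Toy records: image "b1" carries both a mask and a class label,
# so it contributes two annotations but only one image.
toy = [
    Annotation("b1", "breast", "segmentation"),
    Annotation("b1", "breast", "classification"),
    Annotation("b2", "breast", "segmentation"),
]
n_annotations, n_images, per_task = summarize(toy)
print(n_annotations, n_images, per_task)
```

Counting annotations and images separately, as here, is one way to see how a breast image with both label kinds contributes twice to the annotation total.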

Architecture

The architecture of UniUSNet is a general encoder-decoder model that uses prompts to simultaneously handle multiple ultrasound tasks like segmentation and classification.

Encoder

The encoder extracts hierarchical features from ultrasound images, capturing both low-level details and high-level semantic information essential for medical image analysis.

Task-Specific Decoders

Task-specific decoders are enhanced by four types of prompts—nature, position, task, and type—added to each transformer layer via prompt projection embedding.
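
One possible reading of this prompt mechanism is sketched below in plain Python. It is an illustration under stated assumptions, not the UniUSNet implementation: the vocabularies (apart from "breast", "segmentation", and "classification", which the text names), the random projection weights, the feature size, and the choice to sum the four projected prompts and add the result to every token are all hypothetical.

```python
import random

# Hypothetical vocabularies. Only "breast", "segmentation" and "classification"
# are named in the text; every other entry is a placeholder.
VOCAB = {
    "nature": ["nature_a", "nature_b"],
    "position": ["breast"] + [f"position_{i}" for i in range(2, 8)],  # 7 positions
    "task": ["segmentation", "classification"],
    "type": ["type_1", "type_2", "type_3"],  # three input types
}


def one_hot(kind, value):
    """Encode one prompt value as a one-hot list over its vocabulary."""
    return [1.0 if v == value else 0.0 for v in VOCAB[kind]]


def make_projection(in_dim, out_dim, rng):
    """Random weight matrix standing in for a learned linear projection."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def project(weights, vector):
    """Matrix-vector product: maps a one-hot prompt to a feature-sized vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def add_prompts(tokens, prompts, projections):
    """Add the summed prompt embedding to every token of one layer's features."""
    total = [0.0] * len(tokens[0])
    for kind, value in prompts.items():
        emb = project(projections[kind], one_hot(kind, value))
        total = [t + e for t, e in zip(total, emb)]
    return [[x + p for x, p in zip(tok, total)] for tok in tokens]


rng = random.Random(0)
feature_dim = 4  # assumed feature size, chosen only to keep the example small
projections = {k: make_projection(len(v), feature_dim, rng) for k, v in VOCAB.items()}
prompts = {"nature": "nature_a", "position": "breast",
           "task": "segmentation", "type": "type_1"}
tokens = [[0.0] * feature_dim for _ in range(3)]  # stand-in for layer features
conditioned = add_prompts(tokens, prompts, projections)
print(len(conditioned), len(conditioned[0]))
```

In a real decoder the same conditioning step would be repeated at each transformer layer, with the projections learned rather than drawn at random.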

Prompt Engineering

The innovative prompting mechanism boosts the model's versatility and performance, enabling it to adapt to different ultrasound imaging conditions and analysis requirements without extensive retraining.

Results

R1: Overall Performance Comparison

SAM's official weights perform poorly in zero-shot inference (37.12%) due to the domain gap between natural and medical images. SAMUS improves performance (80.65%) but doesn't surpass the Single model. Our automatic prompt model, with 66% fewer parameters, achieves similar segmentation results (80.01%). Ablation studies reveal that UniUSNet (79.89%) outperforms both the ablation version (78.46%) and Single (78.43%) models, proving the effectiveness of prompts.
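
The ablation gaps can be read directly off the reported scores. The short calculation below uses only the three percentages quoted above; the dictionary keys are labels chosen for this example.

```python
# Scores as reported in the paragraph above, in percent.
scores = {
    "UniUSNet": 79.89,
    "ablation": 78.46,  # the ablation version of UniUSNet
    "Single": 78.43,
}


def gap(a, b):
    """Difference between two reported scores, in percentage points."""
    return round(scores[a] - scores[b], 2)


prompt_gain = gap("UniUSNet", "ablation")  # gain over the ablation version
single_gain = gap("UniUSNet", "Single")    # gain over the Single model
print(prompt_gain, single_gain)  # prints: 1.43 1.46
```

Both gaps are roughly one and a half percentage points, which is the margin the ablation comparison rests on.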

R2: Segmentation Results

Segmentation results reveal that UniUSNet outperforms SAM and other models by effectively using nature and position prompts for deeper task understanding. Visual comparisons show more accurate tissue boundaries and lesion delineation.

R3: Feature Distribution Analysis

t-SNE visualization of feature distributions across BUS-BRA, BUSIS, and UDIAT datasets shows that the Single model has a clear domain shift, while UniUSNet w/o prompt reduces this shift, indicating better domain adaptation. Prompts further minimize the domain offset.

R4: Adapter Performance

On the BUSI dataset, both UniUSNet w/o prompt and UniUSNet outperform the Single model, demonstrating better generalization and the effectiveness of prompts. In addition, the Adapter setup, which requires only minimal fine-tuning, surpasses the Scratch setup, showing that our model adapts efficiently to new datasets.
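
To illustrate why an adapter-style setup involves little fine-tuning, the sketch below compares how many weights each setup would update. It rests on assumptions throughout: the parameter names and sizes are invented, and it treats "Scratch" as training every weight and "Adapter" as training only adapter weights, which the text does not spell out.

```python
# Hypothetical parameter inventory: name -> number of weights.
# Neither the names nor the sizes come from the UniUSNet code base.
PARAMS = {
    "encoder.block1.weight": 40_000,
    "encoder.block2.weight": 60_000,
    "decoder.seg.weight": 20_000,
    "decoder.seg.adapter.weight": 1_500,
    "decoder.cls.weight": 10_000,
    "decoder.cls.adapter.weight": 500,
}


def trainable_names(params, setup):
    """Names of the parameters that a given setup would update."""
    if setup == "scratch":
        return sorted(params)  # assumed: every weight is trained
    if setup == "adapter":
        return sorted(n for n in params if ".adapter." in n)  # assumed: adapters only
    raise ValueError(f"unknown setup: {setup!r}")


def trainable_fraction(params, setup):
    """Share of all weights that the setup would update."""
    total = sum(params.values())
    chosen = sum(params[n] for n in trainable_names(params, setup))
    return chosen / total


for setup in ("scratch", "adapter"):
    print(setup, f"{trainable_fraction(PARAMS, setup):.1%}")
```

With these made-up sizes the adapter setup touches under 2% of the weights; the real proportion depends on the actual model.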

Resources

Additional Resources

We provide a detailed data processing method for the BroadUS-9.7K dataset, as well as data demos for checking whether the data format is prepared properly and for quickly starting experiments or inferences. Pretrained models can be downloaded from our GitHub repository.

Citation

If you find this work useful for your research, please cite:

@inproceedings{lin2024uniusnet,
  title={UniUSNet: A Promptable Framework for Universal Ultrasound Disease Prediction and Tissue Segmentation},
  author={Lin, Zehui and Zhang, Zhuoneng and Hu, Xindi and Gao, Zhifan and Yang, Xin and Sun, Yue and Ni, Dong and Tan, Tao},
  booktitle={2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)},
  pages={3501--3504},
  year={2024},
  organization={IEEE}
}

Acknowledgements

This work was supported by the Science and Technology Development Fund of Macao (grants 0021/2022/AGJ and 0041/2023/RIB2).
