Microsoft’s Phi-3.5 series unveils triple threat

DATE POSTED: August 21, 2024

Microsoft is stepping up its game in the AI world with the new Phi-3.5 series, offering three cutting-edge models designed for different tasks. These models are not just powerful but also versatile, making it easier for developers to tackle everything from basic coding to complex problem-solving and even visual tasks. Whether you're working with limited resources or need advanced artificial intelligence capabilities, the Phi-3.5 models have something to offer; here is a quick look at each of them.

Breaking down Microsoft’s Phi-3.5 models

Microsoft’s latest release, the Phi-3.5 series, introduces three advanced AI models: Phi-3.5-mini-instruct, Phi-3.5-MoE-instruct, and Phi-3.5-vision-instruct. Each model is crafted to address specific needs, from basic reasoning to advanced multimodal tasks.

All three Microsoft Phi-3.5 models are available under the MIT license, which allows developers to use, modify, and distribute the models with minimal restrictions. This open-source approach supports widespread adoption and fosters innovation across various applications and research domains.

Phi-3.5 Mini Instruct: Efficient and compact

The Microsoft Phi-3.5 Mini Instruct model is designed to perform exceptionally well in environments with limited computational resources. With 3.8 billion parameters, it is tailored for tasks that require strong reasoning capabilities but do not demand extensive computational power. It was trained on 3.4 trillion tokens using 512 H100-80G GPUs over 10 days.

Key features:

  • Parameters: 3.8 billion
  • Context length: 128k tokens
  • Primary use cases: Code generation, mathematical problem solving, logic-based reasoning
  • Performance: Despite its smaller size, it demonstrates competitive performance in multilingual and multi-turn conversational tasks. It excels in benchmarks such as RepoQA, which measures long-context code understanding, surpassing other similarly-sized models like Llama-3.1-8B-instruct.

Phi-3.5 Mini Instruct’s efficient design allows it to deliver robust performance while being mindful of resource constraints. This makes it suitable for deployment in scenarios where computational resources are limited but high performance is still required.
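
For a sense of how the model is typically run, here is a minimal text-generation sketch using the Hugging Face transformers library. The repo ID microsoft/Phi-3.5-mini-instruct, the prompt, and the generation settings are illustrative assumptions, not an official recipe; check the model card for exact requirements.

# Minimal sketch: ask Phi-3.5-mini-instruct for a short coding answer.
# Repo ID and prompt are assumptions for illustration; older transformers
# versions may additionally need trust_remote_code=True.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3.5-mini-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # smaller memory footprint on supported GPUs
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Write a Python function that checks whether a number is prime."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))

The same pattern works for the other two Phi-3.5 models, with the vision variant additionally taking image inputs through its processor.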

Phi-3.5 MoE: Mixture of experts architecture

The Microsoft Phi-3.5 MoE (Mixture of Experts) model represents a sophisticated approach to AI architecture by combining multiple specialized models into one. It features a unique design where different “experts” are activated depending on the task, optimizing performance across various domains. It was trained on 4.9 trillion tokens with 512 H100-80G GPUs over 23 days.

Key features:

  • Parameters: 42 billion total, with 6.6 billion active during operation
  • Context length: 128k tokens
  • Primary use cases: Complex reasoning tasks, code understanding, multilingual language comprehension
  • Performance: The MoE model performs exceptionally well in code and math tasks and exhibits strong multilingual understanding. It frequently outperforms larger models in specific benchmarks, including a notable edge over GPT-4o mini in the 5-shot MMLU (Massive Multitask Language Understanding) test.

The Phi-3.5 MoE architecture enhances scalability and efficiency by activating only a subset of parameters relevant to a given task. This enables the model to handle a wide range of applications while maintaining high performance across different languages and subjects.
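
To make the “only a subset of parameters” idea concrete, below is a toy sketch of top-k expert routing, the general mechanism behind mixture-of-experts layers. It illustrates the technique in general, not Microsoft’s actual implementation; all layer sizes, the expert count, and k are arbitrary placeholders.

# Toy mixture-of-experts layer: a gate picks the top-k experts per token,
# so only a fraction of the total parameters run for any given input.
# Illustrative only; sizes and expert count are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=512, num_experts=16, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, num_experts)  # router over experts
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d_model, 4 * d_model),
                           nn.GELU(),
                           nn.Linear(4 * d_model, d_model))
             for _ in range(num_experts)]
        )

    def forward(self, x):                              # x: (tokens, d_model)
        scores = self.gate(x)                          # (tokens, num_experts)
        weights, picked = scores.topk(self.k, dim=-1)  # keep only top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in picked[:, slot].unique().tolist():
                mask = picked[:, slot] == e            # tokens routed to expert e in this slot
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

layer = ToyMoELayer()
tokens = torch.randn(8, 512)
print(layer(tokens).shape)  # torch.Size([8, 512]); only 2 of 16 experts ran per token

Because each token touches only a couple of experts, the per-token compute stays close to that of a much smaller dense model even though the total parameter count is large, which is the trade-off Phi-3.5 MoE exploits.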

Phi-3.5 Vision Instruct: Advanced multimodal capabilities

The Microsoft Phi-3.5 Vision Instruct model is designed to handle both text and image data, making it a powerful tool for multimodal AI tasks. It integrates advanced image processing with textual understanding, supporting a variety of complex visual and textual analysis tasks. It was trained on 500 billion tokens using 256 A100-80G GPUs over 6 days.

Key features:

  • Parameters: 4.15 billion
  • Context length: 128k tokens
  • Primary use cases: Image understanding, optical character recognition (OCR), chart and table comprehension, video summarization
  • Performance: Trained on a combination of synthetic and filtered publicly available datasets, the Vision Instruct model excels in handling complex, multi-frame visual tasks and provides comprehensive analysis of visual and textual information.

Phi-3.5 Vision Instruct’s ability to process and integrate both text and images makes it highly versatile for applications requiring detailed visual analysis. This capability is particularly valuable for tasks involving diverse data types and formats.
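
As a rough sketch of how such a multimodal model can be called with the Hugging Face transformers library, the snippet below feeds one image and a question to the model. The repo ID microsoft/Phi-3.5-vision-instruct, the image-tag prompt layout, and the sample image URL are assumptions for illustration and should be checked against the model card.

# Rough sketch: ask Phi-3.5-vision-instruct a question about a single image.
# Repo ID, prompt layout, and the image URL are illustrative assumptions;
# consult the model card for the exact chat/image template before relying on this.
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Phi-3.5-vision-instruct"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, device_map="auto", torch_dtype="auto"
)

image = Image.open(requests.get("https://example.com/chart.png", stream=True).raw)  # placeholder URL
prompt = "<|user|>\n<|image_1|>\nSummarize the trend shown in this chart.<|end|>\n<|assistant|>\n"

inputs = processor(text=prompt, images=[image], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=200)
print(processor.batch_decode(output_ids[:, inputs["input_ids"].shape[-1]:],
                             skip_special_tokens=True)[0])

The same call pattern extends to multi-image inputs such as chart comparisons or frame-by-frame video summarization, which is where the model's 128k-token context becomes useful.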

The Phi-3.5 Vision Instruct model is also accessible through Azure AI Studio.